CN113986624A - In-system verification of interconnects by error injection and measurement - Google Patents

In-system verification of interconnects by error injection and measurement

Info

Publication number: CN113986624A
Application number: CN202011566305.0A
Authority: CN (China)
Prior art keywords: error, flit, error injection, injection, errors
Other languages: Chinese (zh)
Inventor: D. Das Sharma
Current assignee: Intel Corp
Original assignee: Intel Corp
Application filed by Intel Corp
Legal status: Pending

Classifications

    • H04L1/242: Testing correct operation by comparing a transmitted test signal with a locally generated replica
    • G06F11/263: Generation of test inputs, e.g. test vectors, patterns or sequences; with adaptation of the tested hardware for testability with external testers
    • G06F11/2215: Detection or location of defective computer hardware by testing during standby operation or during idle time, using arrangements specific to the hardware being tested, to test error correction or detection circuits
    • G06F11/1004: Adding special bits or symbols to the coded information, e.g. parity check, to protect a block of data words, e.g. CRC or checksum
    • G06F11/221: Detection or location of defective computer hardware by testing during standby operation or during idle time, using arrangements specific to the hardware being tested, to test buses, lines or interfaces, e.g. stuck-at or open line faults
    • G06F11/2247: Verification or detection of system hardware configuration
    • G06F11/3027: Monitoring arrangements specially adapted to the computing system component being monitored, where the component is a bus
    • G06F13/1684: Details of memory controller using multiple buses
    • G06F13/4027: Coupling between buses using bus bridges
    • G06F13/4068: Device-to-bus coupling, electrical coupling
    • G06F13/4282: Bus transfer protocol, e.g. handshake; synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • H04L1/0041: Forward error control, arrangements at the transmitter end
    • H04L1/0045: Forward error control, arrangements at the receiver end
    • H04L1/0061: Forward error control, error detection codes
    • H04L1/24: Testing correct operation
    • G06F13/4295: Bus transfer protocol on a serial bus using an embedded synchronisation
    • G06F2213/0026: PCI express
    • G06F2213/0042: Universal serial bus [USB]


Abstract

Systems and devices may include an error injection register containing error injection parameter information; error injection logic circuitry to read the error injection parameter information from the error injection register and inject an error into a flow control unit (flit); and protocol stack circuitry to transmit the flit, including the error, over a multi-lane link. The injected errors may be detected by the receiver and used to test and characterize various aspects of the link, such as bit error rate, error correction codes, cyclic redundancy checks, replay capability, error logging, and other characteristics of the link.

Description

In-system verification of interconnects by error injection and measurement
Cross Reference to Related Applications
In accordance with 35 U.S.C. § 119(e), the present application claims the benefit of U.S. Provisional Patent Application No. 63/057,168, entitled "IN-SYSTEM VERIFICATION OF INTERCONNECTS BY ERROR INJECTION AND MEASUREMENT," filed July 27, 2020, which is incorporated herein by reference in its entirety.
Background
For each generation of PCIe, the serial interconnect is expected to operate with a Bit Error Rate (BER) of 10^-12 per lane across the link. As the number of link lanes increases, BER is affected by crosstalk, intersymbol interference (ISI), and channel loss caused by sockets, vias, boards, connectors, add-in cards (AIC), and the like. With the deployment of PAM-4 encoding for next-generation data rates (e.g., PCIe Gen 6 at 64GT/s and next-generation CXL and UPI data rates), the target BER is as high as 10^-6.
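As a rough, illustrative calculation (not taken from the disclosure), the gap between these two BER targets can be made concrete: at 64GT/s per lane on an assumed x16 link, a per-lane BER of 10^-12 corresponds to roughly one raw bit error per second across the link, whereas 10^-6 corresponds to on the order of a million raw bit errors per second, which is why FEC, CRC, and replay mechanisms become necessary at PAM-4 rates.

    #include <stdio.h>

    /* Back-of-envelope sketch (illustrative assumptions: 64 GT/s per lane, x16 link):
     * expected raw bit-error arrival rate as a function of per-lane BER. */
    int main(void) {
        const double bits_per_sec_per_lane = 64e9;
        const int lanes = 16;
        const double ber[] = { 1e-12, 1e-6 };

        for (int i = 0; i < 2; i++) {
            double errors_per_sec = bits_per_sec_per_lane * lanes * ber[i];
            printf("per-lane BER %.0e -> ~%.2e raw bit errors/s on the link\n",
                   ber[i], errors_per_sec);
        }
        return 0;
    }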
Drawings
FIG. 1 illustrates an embodiment of a block diagram of a computing system including a multicore processor.
Fig. 2A-2B are simplified block diagrams of an example link including one or more retimers, according to embodiments of the present disclosure.
Fig. 3 is a schematic diagram of a common physical layer (common PHY) supporting multiple interconnect protocols, according to an embodiment of the disclosure.
Fig. 4A-4B are schematic diagrams illustrating exemplary circuitry and logic within a protocol stack, including a flit error counter and error injection circuitry, according to embodiments of the present disclosure.
FIG. 5 is a process flow diagram for injecting errors into flits according to an embodiment of the disclosure.
FIG. 6 is a process flow diagram for injecting errors into flits according to an embodiment of the disclosure.
Fig. 7 is a process flow diagram for a transmitter side protocol stack injecting errors into an ordered set in accordance with an embodiment of the disclosure.
Figure 8 is a process flow diagram for a receiver-side protocol stack injecting errors into an ordered set in accordance with an embodiment of the disclosure.
Fig. 9 is a process flow diagram for performing delay measurements according to an embodiment of the disclosure.
FIG. 10 illustrates an embodiment of a computing system including an interconnect architecture.
FIG. 11 illustrates an embodiment of an interconnect architecture including a layered stack.
FIG. 12 illustrates an embodiment of a request or packet to be generated or received within an interconnect fabric.
Fig. 13 illustrates an embodiment of a transmitter and receiver pair for an interconnect architecture.
FIG. 14 illustrates another embodiment of a block diagram of a computing system including a processor.
FIG. 15 illustrates an embodiment of a block of a computing system including multiple processor sockets.
The figures are not drawn to scale.
Detailed Description
In the following description, numerous specific details are set forth, such as examples of specific types of processors and system configurations, specific hardware structures, specific architectural and microarchitectural details, specific register configurations, specific instruction types, specific system components, specific measurements/heights, specific processor pipeline stages and operations, etc., in order to provide a thorough understanding of the present disclosure. It will be apparent, however, to one skilled in the art that these specific details need not be employed to practice the present disclosure. In other instances, well known components or methods, such as specific and alternative processor architectures, specific logic circuits/code for described algorithms, specific firmware code, specific interconnect operations, specific logic configurations, specific manufacturing techniques and materials, specific compiler implementations, specific expressions of algorithms in code, specific power down and gating techniques/logic, and other specific operational details of computer systems have not been described in detail in order to avoid unnecessarily obscuring the present disclosure.
Although the following embodiments may be described with reference to energy conservation and efficiency in a particular integrated circuit (e.g., in a computing platform or microprocessor), other embodiments may be applicable to other types of integrated circuits and logic devices. Similar techniques and teachings of the embodiments described herein may be applied to other types of circuits or semiconductor devices that may also benefit from better energy efficiency and energy conservation. For example, the disclosed embodiments are not limited to desktop computer systems or Ultrabooks™, and may also be used in other devices such as handheld devices, tablet computers, other thin and lightweight notebooks, system on a chip (SOC) devices, and embedded applications. Some examples of handheld devices include cellular phones, Internet protocol devices, digital cameras, Personal Digital Assistants (PDAs), and handheld PCs. Embedded applications typically include microcontrollers, Digital Signal Processors (DSPs), systems on a chip, network computers (NetPCs), set-top boxes, network hubs, Wide Area Network (WAN) switches, or any other system that can perform the functions and operations described below. Furthermore, the apparatus, methods, and systems described herein are not limited to physical computing devices, but may also relate to software optimization for energy conservation and efficiency. As will become apparent in the following description, embodiments of the methods, apparatus, and systems described herein (whether with reference to hardware, firmware, software, or a combination thereof) are vital to a "green technology" future balanced with performance considerations.
As computing systems evolve, the components therein become more complex. As a result, the complexity of the interconnect architecture used to couple and communicate between components is also increasing to ensure that bandwidth requirements are met for optimal component operation. Furthermore, different market segments require different aspects of the interconnect architecture to meet the needs of the market. For example, servers require higher performance, and mobile ecosystems can sometimes sacrifice overall performance to save power. However, the only purpose of most architectures is to provide the maximum possible performance and maximize power savings. In the following, a number of interconnects are discussed that would potentially benefit from aspects of the present disclosure described herein.
Referring to FIG. 1, an embodiment of a block diagram of a computing system including a multicore processor is depicted. Processor 100 includes any processor or processing device, such as a microprocessor, embedded processor, Digital Signal Processor (DSP), network processor, hand-held processor, application processor, co-processor, system on a chip (SOC), or other device that executes code. In one embodiment, processor 100 includes at least two cores — cores 101 and 102, which may include asymmetric cores or symmetric cores (the illustrated embodiment). However, processor 100 may include any number of processing elements that may be symmetric or asymmetric.
In one embodiment, a processing element refers to hardware or logic that supports software threads. Examples of hardware processing elements include: a thread unit, a thread slot, a thread, a processing unit, a context unit, a logical processor, a hardware thread, a core, and/or any other element capable of maintaining a processor state (e.g., an execution state or an architectural state). In other words, a processing element, in one embodiment, refers to any hardware capable of being independently associated with code, such as a software thread, operating system, application, or other code. A physical processor (or processor socket) generally refers to an integrated circuit, which may include any number of other processing elements, such as cores or hardware threads.
A core generally refers to logic located on an integrated circuit capable of maintaining an independent architectural state, where each independently maintained architectural state is associated with at least some dedicated execution resources. In contrast to cores, a hardware thread generally refers to any logic located on an integrated circuit capable of maintaining an independent architectural state, wherein the independently maintained architectural states share access to execution resources. It can be seen that the boundaries between the nomenclature of hardware threads and cores overlap when some resources are shared while others are dedicated to architectural state. However, operating systems often view cores and hardware threads as separate logical processors, where the operating system is able to schedule operations on each logical processor separately.
As shown in FIG. 1, physical processor 100 includes two cores — cores 101 and 102. Here, cores 101 and 102 are considered symmetric cores, i.e., cores having the same configuration, functional units, and/or logic. In another embodiment, core 101 comprises an out-of-order processor core and core 102 comprises an in-order processor core. However, cores 101 and 102 may be individually selected from any type of core, such as a native core, a software management core, a core adapted to execute a native Instruction Set Architecture (ISA), a core adapted to execute a translated Instruction Set Architecture (ISA), a co-designed core, or other known cores. In a heterogeneous core environment (i.e., asymmetric core), some form of translation (e.g., binary translation) may be used to schedule or execute code on one or both cores. For further discussion, the functional units shown in core 101 are described in more detail below, as the units in core 102 operate in a similar manner in the illustrated embodiment.
As shown, core 101 includes two hardware threads 101a and 101b, which may also be referred to as hardware thread slots 101a and 101 b. Thus, in one embodiment, a software entity such as an operating system potentially views processor 100 as four separate processors, i.e., four logical processors or processing elements capable of executing four software threads simultaneously. As described above, a first thread is associated with architecture state registers 101a, a second thread is associated with architecture state registers 101b, a third thread may be associated with architecture state registers 102a, and a fourth thread may be associated with architecture state registers 102 b. Here, as described above, each architecture state register (101a, 101b, 102a, and 102b) may be referred to as a processing element, a thread slot, or a thread unit. As shown, architecture state registers 101a are replicated in architecture state registers 101b, thus enabling storage of individual architecture states/contexts for logical processor 101a and logical processor 101 b. Other smaller resources, such as instruction pointers and rename logic in allocator and renamer block 130, may also be replicated for threads 101a and 101b in core 101. Some resources, such as reorder buffers in reorder/retirement unit 135, ILTB 120, load/store buffers, and queues may be shared through partitioning. Other resources (e.g., general purpose internal registers, page table base registers, lower level data caches and portions of data TLB 115, execution unit 140, and out-of-order unit 135) may be fully shared.
Processor 100 typically includes other resources, which may be fully shared, shared through partitioning, or dedicated by/to processing elements. In FIG. 1, an embodiment of a purely exemplary processor with illustrative logical units/resources of the processor is shown. Note that a processor may include or omit any of these functional units, as well as include any other known functional units, logic, or firmware not shown. As shown, core 101 includes a simplified, representative out-of-order (OOO) processor core. In-order processors may be utilized in different embodiments. The OOO core includes: a branch target buffer 120 for predicting branches to be executed/taken; and an instruction translation buffer (I-TLB)120 to store address translation entries for instructions.
Core 101 also includes a decode module 125 coupled to fetch unit 120 to decode the fetch elements. In one embodiment, the fetch logic includes respective sequencers associated with the thread slots 101a, 101b, respectively. Typically, core 101 is associated with a first ISA that defines/specifies instructions executable on processor 100. Typically, machine code instructions that are part of the first ISA include portions of instructions (referred to as opcodes) that reference/specify instructions or operations to be performed. Decode logic 125 includes circuitry that recognizes these instructions from their opcodes and passes the decoded instructions into the pipeline for processing as defined by the first ISA. For example, as discussed in more detail below, in one embodiment, decoder 125 includes logic designed or adapted to recognize specific instructions (e.g., transaction instructions). As a result of the recognition by the decoder 125, the architecture or core 101 takes certain predefined actions to perform the task associated with the appropriate instruction. It is important to note that any of the tasks, blocks, operations, and methods described herein may be performed in response to a single or multiple instructions; some of which may be new or old instructions. Note that in one embodiment, decoder 126 recognizes the same ISA (or a subset thereof). Alternatively, in a heterogeneous core environment, decoder 126 recognizes a second ISA (either a subset of the first ISA, or a different ISA).
In one example, allocator and renamer block 130 includes an allocator to reserve resources, such as register files, to store instruction processing results. However, threads 101a and 101b are potentially capable of out-of-order execution, with allocator and renamer block 130 also reserving other resources, such as reorder buffers to track instruction results. Unit 130 may also include a register renamer to rename program/instruction reference registers to other registers internal to processor 100. Reorder/retirement unit 135 includes components such as the reorder buffers described above, load buffers, and store buffers to support out-of-order execution, and later to sequentially retire instructions that are executed out-of-order.
In one embodiment, scheduler and execution unit block 140 includes a scheduler unit to schedule instructions/operations on the execution units. For example, floating point instructions are scheduled on a port of an execution unit having an available floating point execution unit. Register files associated with the execution units are also included to store instruction processing results. Exemplary execution units include floating point execution units, integer execution units, jump execution units, load execution units, store execution units, and other known execution units.
A lower level data cache and data translation buffer (D-TLB)150 is coupled to execution unit 140. The data cache will store recently used/operated on elements (e.g., data operands), which may be held in a memory coherency state. The D-TLB will store the most recent virtual/linear to physical address translations. As a particular example, a processor may include a page table structure to divide physical memory into a plurality of virtual pages.
Here, cores 101 and 102 share access to higher level or more distant caches, such as a second level cache associated with on-chip interface 110. Note that higher level or more distant refers to cache levels increasing or getting farther away from the execution unit(s). In one embodiment, the higher level cache is a last level data cache (the last level cache in the memory hierarchy on processor 100), such as a second or third level data cache. However, the higher level cache is not so limited, as it may be associated with or include an instruction cache. A trace cache (a type of instruction cache) may instead be coupled after decoder 125 to store recently decoded traces. Here, an instruction potentially refers to a macro-instruction (i.e., a general-purpose instruction recognized by a decoder) that may be decoded into a plurality of micro-instructions (micro-operations).
In the depicted configuration, the processor 100 also includes an on-chip interface module 110. Historically, the memory controller, described in more detail below, has been included in a computing system external to processor 100. In this scenario, on-chip interface 110 is used to communicate with devices external to processor 100, such as system memory 175, a chipset (typically including a memory controller hub connected to memory 175 and an I/O controller hub connected to peripherals), a memory controller hub, a Northbridge, or other integrated circuit. And in this scenario, bus 105 may include any known interconnect, such as a multi-drop bus, a point-to-point interconnect, a serial interconnect, a parallel bus, a coherent (e.g., cache coherent) bus, a layered protocol architecture, a differential bus, and a GTL bus.
Memory 175 may be dedicated to processor 100 or shared with other devices in the system. Common examples of types of memory 175 include DRAM, SRAM, non-volatile memory (NV memory), and other known storage devices. Note that device 180 may include a graphics accelerator, a processor or card coupled to a memory controller hub, a data storage device coupled to an I/O controller hub, a wireless transceiver, a flash memory device, an audio controller, a network controller, or other known devices.
More recently, however, as more logic and devices are integrated on a single die, such as an SOC, each of these devices may be incorporated on processor 100. For example, in one embodiment, the memory controller hub is on the same package and/or die as processor 100. Here, a portion of the core (an on-core portion) 110 includes one or more controllers for interfacing with other devices, such as memory 175 or graphics device 180. A configuration that includes interconnects and controllers for interfacing with such devices is often referred to as an on-core (or un-core) configuration. By way of example, the on-chip interface 110 includes a ring interconnect for on-chip communications and a high-speed serial point-to-point link 105 for off-chip communications. However, in an SOC environment, even more devices (e.g., network interfaces, coprocessors, memory 175, graphics processor 180, and any other known computer device/interface) may be integrated on a single die or integrated circuit to provide a small form factor as well as high functionality and low power consumption.
In one embodiment, processor 100 is capable of executing compiler, optimization, and/or translator code 177 to compile, translate, and/or optimize application code 176 to support or interface with the apparatus and methods described herein. A compiler typically includes a program or collection of programs to convert source text/code into target text/code. Generally, compiling program/application code with a compiler is performed in multiple stages and passes to convert high-level programming language code to low-level machine or assembly language code. However, a single pass compiler may still be used for simple compilation. The compiler may utilize any known compilation technique and perform any known compiler operation, such as lexical analysis, preprocessing, parsing, semantic analysis, code generation, transcoding, and code optimization.
Larger compilers typically include multiple phases, but most commonly, these phases are contained in two general phases: (1) a front-end, i.e., where syntactic processing, semantic processing, and some transformation/optimization may generally be done, and (2) a back-end, i.e., where analysis, transformation, optimization, and code generation are generally done. Some compilers refer to a middle end, which illustrates the blurring of the delineation between the front-end and back-end of a compiler. As a result, references to the insertion, association, generation, or other operations of the compiler may occur in any of the above-described stages or passes of the compiler as well as any other known stages or passes. As an illustrative example, a compiler may insert an operation, call, function, etc. in one or more compilation stages, such as inserting a call/operation in a front-end stage of compilation and then converting the call/operation to lower level code in a conversion stage. Note that during dynamic compilation, compiler code or dynamic optimization code may insert such operations/calls and optimize the code for execution at runtime. As a particular illustrative example, binary code (compiled code) may be dynamically optimized at runtime. Here, the program code may include dynamic optimization code, binary code, or a combination thereof.
Similar to a compiler, a translator (e.g., a binary translator) statically or dynamically translates code to optimize and/or translate the code. Thus, reference to execution of code, application code, program code, or other software environment may refer to: (1) dynamically or statically executing a compiler program, optimizing a code optimizer or translator to compile program code, maintaining software structure, performing other operations, optimizing code or translating code; (2) executing main program code including operations/calls, such as optimized/compiled application code; (3) executing other program code, such as libraries associated with the main program code, to maintain software structure, to perform other software-related operations, or to optimize code; or (4) combinations thereof.
As data rates increase in interconnects such as PCIe/CXL/UPI, verifying link performance and RAS capabilities at the system level is becoming a significant challenge. The lack of architected mechanisms creates further complexity, as systems are composed of components from multiple vendors. This can affect functional correctness and expected link-level performance, which becomes an even greater challenge at higher data rates encoded with PAM-4 and with the resulting recovery mechanisms, such as Forward Error Correction (FEC) and link-level retry (LLR), needed to handle the high BER produced by PAM-4.
The present disclosure describes error injection and delay measurement mechanisms to validate interconnected components, including links and retimers, and to help characterize links in an architected, automated, and standardized manner in all implementations using the same link protocol.
Fig. 2A-B illustrate a sample multi-lane link with or without retimers. If one or more retimers are present, each link segment is electrically independent, and errors can accumulate independently in each receiver. Thus, with one retimer, errors may be introduced in the receiver of the retimer or the receiver of the port. The retimer operates on a per-lane basis and therefore does not correct or detect any errors in flits, which span all lanes in the link. Although shown as including retimers, it is to be understood that the use of a retimer is implementation specific.
Fig. 2A is a schematic diagram illustrating a sample topology 200 having two retimers 204 and 206 between an upstream component downstream port 202 and a downstream component upstream port 208, according to an embodiment of the present disclosure. The upstream component downstream port 202 may be a port for a PCIe-based device, such as a CPU or other device capable of generating and sending data packets across a data link compliant with the PCIe protocol. Downstream component upstream port 208 may be a port for a peripheral component that may receive data packets from a link that conforms to the PCIe protocol. It should be understood that upstream component downstream port 202 and downstream component upstream port 208 may send and receive data packets across PCIe links (shown as PCIe links 210 a-c).
Topology 200 may include one or more retimers 204 and 206. Retimers 204 and 206 may function as signal repeaters operating at the physical layer to fine-tune the signal from upstream component 202 and/or downstream component upstream port 208. A retimer may use Continuous Time Linear Equalization (CTLE), Decision Feedback Equalization (DFE), and transmit finite impulse response equalization (Tx FIR EQ, or simply TxEQ). The retimer is transparent to the data link and transaction layers, but implements the complete physical layer.
The multi-lane PCIe link is divided into three link segments (LS) 210a, 210b, and 210c in each direction. Upstream component downstream port 202 may be coupled to retimer 1 (204) via multi-lane PCIe link segment 210a. Retimer 1 (204) may be coupled to retimer 2 (206) via link segment 210b. And retimer 2 (206) may be coupled to downstream component upstream port 208 through link segment 210c.
The components may also be coupled by sideband links. Upstream component downstream port 202 may be coupled to retimer 1 (204) via sideband link 212a. Retimer 1 (204) may be coupled to retimer 2 (206) via sideband link 212b. And retimer 2 (206) may be coupled to downstream component upstream port 208 via sideband link 212c.
The main function of a retimer (buffer) device is signal retiming. These functions are performed by retimers 204 and 206. The particular retimer device circuitry will depend on the PHY used for the link. In general, the retimer circuit is configured to recover the incoming signal and retransmit using a local clock and a new transmit equalization circuit, and well-known circuits such as a phase locked loop may be employed for this purpose. The retimer may also include transmitter and receiver circuitry including one or more amplifier circuits, as well as various types of well-known signal conditioning circuitry for increasing the drive level of a received signal. Such retimer circuits are well known to those skilled in the art of high speed interconnects and, therefore, further details are not shown or discussed herein.
Each retimer 204 and 206 may have an upstream path and a downstream path. In some implementations, the retimer may include two dummy ports, and the dummy ports may dynamically determine their respective downstream/upstream directions. In addition, retimers 204 and 206 may support operating modes including a forwarding mode and an execution mode. In some cases, retimers 204 and 206 may decode data received on a sub-link and re-encode data forwarded downstream on another of its sub-links. In this way, the retimer may capture the received bitstream before regenerating and retransmitting the bitstream to another device or even another retimer (or redriver or repeater). In some cases, a retimer may modify some values in the data it receives, for example, when processing and forwarding ordered set data. Further, the retimer may potentially support any width option as its maximum width, such as a set of width options defined by a specification such as PCIe.
As the data rates of serial interconnects (e.g., PCIe, UPI, USB, etc.) increase, retimers are increasingly used to extend channel range. Multiple retimers may be cascaded to achieve even longer channel ranges. It is expected that as signal speeds increase, the channel range generally decreases. Thus, as interconnect technologies accelerate, the use of retimers may become more prevalent. For example, as PCIe Gen-4 at 16GT/s is adopted in place of PCIe Gen-3 (8GT/s), the use of retimers may increase in PCIe interconnects, as may be the case in other interconnects as speeds increase.
In one implementation, a generic BGA (ball grid array) footprint may be defined for a PCI Express Gen-4(16GT/s) based retimer. Such a design may address at least some of the exemplary deficiencies found in conventional PCIe Gen-3(8GT/s) retimer devices, as well as some of the problems that arise when PCIe Gen-4 is employed. Furthermore, an increase in the number and volume of retimer vendors is expected for PCIe Gen-4. The interconnect length achievable in Gen-4 is significantly reduced due to signal loss at double data rate (from 8GT/s to 16 GT/s). In this and other exemplary interconnect technologies, retimers may thus increase utility as data rates increase, as they may be used to significantly increase channel lengths that would otherwise be constrained by the increased data rates, such as in PCIe Gen 5 and Gen 6 and beyond.
Although the retimer is shown as being separate from the upstream and downstream components, the retimer may be part of the upstream or downstream component, on board with the upstream or downstream component, or on package with the downstream component.
The upstream component downstream port 202 may access a storage element 222, such as flash memory, cache, or other memory device. Retimer 1 (204) may optionally include a similar storage element 224. Retimer 2 (206) may optionally include a similar storage element 226. The downstream component upstream port 208 may optionally include a similar storage element 228.
Fig. 2B is a schematic diagram of a connection system 250 showing an in-band upstream port and retimer configuration, according to an embodiment of the present disclosure. As shown in fig. 2A, upstream component downstream port 202 may be coupled to downstream component upstream port 208 by links 210a-c extended by two retimers 204, 206. In this example, the downstream port 202 may be provided with a retimer configuration register address/data register 252 that uses fields of the enhanced SKP OS to hold data to be sent to one of the two retimers in a configuration access command. One or more bits of the SKP OS may include a command code, data, or address for use at a configuration register (e.g., 256, 258) of a retimer (e.g., 204, 206, respectively) to read data from or write data to registers 256, 258. The retimer may respond to the transmitted configuration access command by encoding response data in a subsequent instance of the enhanced SKP OS. Data encoded by the retimer (e.g., 204, 206) may be extracted at the downstream port and recorded in a retimer configuration data return register (e.g., 254). The registers (e.g., 252, 254) maintained at the upstream component downstream port 202 may be written to and read from by system software and/or other components of the system, allowing (indirect) access to the retimer registers: one register (e.g., 252) that conveys address/data/commands to the retimer, and a second register (e.g., 254) that stores the response returned from the retimer. In other implementations, these registers (e.g., 260) may be maintained at the downstream component upstream port 208 instead of or in addition to maintaining the registers at the upstream component downstream port 202, among other examples.
Continuing with the example of fig. 2B, in conjunction with a mechanism for providing in-band access to retimer registers, the retimer may have architected registers that can be addressed with well-defined bits and characteristics. In this example, the enhanced SKP OS is defined/modified as a periodic pattern generated by the physical layer to carry commands/information from "Retimer Config Reg Addr/Data" (e.g., 252) to the retimer, to carry back a response from the retimer to be loaded into "Retimer Config Data Return" (e.g., 254), and to allocate some bits for a CRC to protect the data. For example, in PCIe, this may include enhancing existing SKP ordered sets (e.g., with CSR accesses and CSR returns (CRC-protected bits)). In addition, a flow may be defined to ensure that commands/information are guaranteed to be delivered to the retimer and that the corresponding responses are returned. The physical layer mechanism may be enhanced to also include notifications from the retimer (in addition to responses) when certain services are needed, as well as other exemplary features.
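To make the indirect register-access flow above more concrete, the sketch below shows how system software might drive it through the two downstream-port registers. The register layout, command encoding, and ready flag are illustrative assumptions only; the disclosure defines the mechanism (command/address/data carried in the enhanced SKP OS, CRC-protected response captured in the data-return register) but not these specific encodings.

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical view of the two downstream-port registers used for indirect
     * retimer access: element 252 (command/address/data out) and element 254
     * (response captured from the enhanced SKP OS). Field positions are assumed. */
    typedef struct {
        volatile uint32_t retimer_cfg_addr_data;    /* element 252 */
        volatile uint32_t retimer_cfg_data_return;  /* element 254 */
    } dsp_retimer_regs_t;

    #define RT_CMD_READ   0x1u          /* assumed command encoding */
    #define RT_RSP_VALID  (1u << 31)    /* assumed "response captured" flag */

    /* Issue a read of one retimer register; port hardware is assumed to place the
     * command in the next enhanced SKP OS and to latch the CRC-checked response. */
    static bool retimer_reg_read(dsp_retimer_regs_t *regs, uint8_t retimer_id,
                                 uint16_t reg_addr, uint32_t *value_out)
    {
        regs->retimer_cfg_addr_data =
            (RT_CMD_READ << 28) | ((uint32_t)(retimer_id & 0xF) << 24) | reg_addr;

        for (int spins = 0; spins < 1000000; spins++) {   /* crude poll */
            uint32_t rsp = regs->retimer_cfg_data_return;
            if (rsp & RT_RSP_VALID) {
                *value_out = rsp & ~RT_RSP_VALID;
                return true;
            }
        }
        return false;   /* no response observed in subsequent enhanced SKP OSes */
    }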
PCIe Gen 6 at 64.0GT/s (PCI Express generation 6), CXL 3.0 at 64.0GT/s (Compute Express Link generation 3), and CPU-CPU symmetric coherent links such as UPI (Ultra Path Interconnect) at frequencies above 32.0GT/s (e.g., 48.0GT/s, 56.0GT/s, or 64.0GT/s) are examples of interconnects that require FEC to work in conjunction with CRC. In a SoC, it is highly desirable that the same PHY have multi-protocol capability and function as PCIe/CXL/UPI depending on the device connected as the link partner.
In embodiments of the present disclosure, multiple protocols (e.g., PCIe, CXL, UPI) may share a common PHY. However, each protocol may have different delay tolerance and bandwidth requirements. For example, PCIe may be more tolerant of latency increases than CXL. The CPU-CPU symmetric cache coherency link (e.g., UPI) is most sensitive to latency increases.
Links such as PCIe and CXL may be divided into smaller independent sub-links. For example, an x16 PCIe/CXL link may be divided into up to 8 independent links, each x2. Symmetric cache coherency links may not support partitioning at this level. Due to differences in latency characteristics and partitioning support, and due to differences in the underlying protocols, these links may use different flow control unit (flit) sizes and flit arrangements even though they may share the same physical layer.
Fig. 3 is a schematic diagram of a common physical layer (common PHY)300 for supporting multiple interconnect protocols, according to an embodiment of the disclosure. PHY is an abbreviation for "physical layer" and is an electronic circuit that can implement the physical layer functions of the OSI model.
Fig. 3 illustrates an exemplary common PHY 300 (analog PHY and logical PHY) with PAM-4 encoding at higher data rates, which may support multiple protocols (e.g., PCIe, CXL, UPI, Cache Coherent Interconnect for Accelerators (CCIX), Open Coherent Accelerator Processor Interface (CAPI), etc.) operating at different data rates. Both the analog PHY 302 and the logical PHY 304 are common to each protocol supported. Analog PHY 302 may support a multi-lane link (e.g., an x16 PCIe link), with 48GT/s and 56GT/s PAM-4 for other interconnect protocols.
The logical PHY 304 may include a TX logic sub-block 306 and an RX logic sub-block 308. TX logic sub-block 306 may include logic to prepare data streams for transmission across a link. For example, TX logic sub-block 306 may include an idle flit generator 312 to generate idle flits. The flit size may be determined based on the protocol, bandwidth, operating conditions, and the like. Cyclic Redundancy Check (CRC) code generator 314 may include one or more CRC code generators and a rolling CRC code generator for generating CRC codes. A CRC code is an error detection code for detecting accidental alteration of data. In an embodiment, CRC code generator 314 may be bypassed while maintaining clock integrity. TX logic sub-block 306 may also include a Forward Error Correction (FEC) encoder 316 to encode data with an Error Correction Code (ECC). The FEC encoder 316 may also be bypassed without compromising clock integrity. Other logic elements may also be present in TX logic sub-block 306, such as lane inversion 318, LFSR 320, symbol alignment 322, and so forth. Since all protocols are flit-based, the logical PHY may also include a common retry buffer 340.
The logical PHY may also include an RX logic sub-block 308. The RX logic sub-block 308 may include an FEC decoder/bypass 332, a CRC decoder/bypass 334, and an error reporting element 336. The FEC decoder 332 may decode the ECC bits in a received data block and perform error correction. CRC decode logic 334 may check for uncorrectable errors and report them to error reporting element 336. Retry buffer 340 may be used to signal a retry of a data block having an uncorrectable error. Other logic elements may also be present in RX logic sub-block 308, such as lane inversion 330, LFSR 328, elastic/drift buffer 328, symbol alignment 324, and so forth.
Logical PHY 304 may also include a static multiplexer (not shown) to select between the different protocol stacks supported by PHY 300. The use of static MUXs facilitates reuse of logic elements (including what is traditionally a significant portion of link layer functionality such as CRC and Retry), and may also yield area/power efficiency in addition to pin efficiency and flexible I/O support (the ability to select between different protocols depending on system configuration). The static multiplexer may direct data to the appropriate physical and logical elements and to the appropriate CRC encoder/decoder and FEC encoder/decoder based on the flit size associated with the protocol used.
With the use of the common PHY 300 (analog PHY 302 plus logical PHY 304), the flit size, FEC, and CRC may differ between different protocols and operating conditions. Any additional logic that facilitates a common PHY costs less than replicating the logical PHY stack multiple times for each protocol. Instead, the data may be steered electrically to the appropriate encoder/decoder based on the protocol in use, which is initially set during link initialization.
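A minimal sketch of what this static selection implies in software terms is shown below: the shared CRC/FEC blocks are bound once, at link initialization, to the flit size and encoder behavior of whichever protocol was negotiated. The function names, the 256-byte flit size, and the bypass convention are assumptions for illustration, not definitions from the disclosure.

    #include <stddef.h>
    #include <stdint.h>

    /* One-time protocol binding selected during link initialization; NULL hooks
     * model the CRC/FEC bypass paths mentioned above. Names are illustrative. */
    typedef void (*crc_fn)(uint8_t *flit, size_t len);
    typedef void (*fec_fn)(uint8_t *flit, size_t len);

    struct phy_binding {
        size_t flit_bytes;   /* flit size negotiated for the protocol */
        crc_fn crc_encode;   /* NULL means CRC generator bypassed */
        fec_fn fec_encode;   /* NULL means FEC encoder bypassed */
    };

    static struct phy_binding g_binding;   /* set once at link initialization */

    void phy_bind_protocol(size_t flit_bytes, crc_fn crc, fec_fn fec)
    {
        g_binding.flit_bytes = flit_bytes;   /* e.g. 256 bytes for a flit-mode link */
        g_binding.crc_encode = crc;
        g_binding.fec_encode = fec;
    }

    void phy_tx_encode(uint8_t *flit)
    {
        if (g_binding.crc_encode) g_binding.crc_encode(flit, g_binding.flit_bytes);
        if (g_binding.fec_encode) g_binding.fec_encode(flit, g_binding.flit_bytes);
    }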
The present disclosure addresses different aspects in testing and characterizing links using the following mechanisms:
error injection at the flit level at the transmit side;
error injection for replay at the receiving side;
error injection at the transmitter and/or receiver at the order set level;
Ack/Nak/replay latency measurement, which covers the round trip of transmission and reception on the link.
The advantages of the present disclosure will be apparent to those skilled in the art. Advantages include, but are not limited to, testing and characterizing links within a system in an architected and automated manner that will help define links with time-to-market advantages. These techniques and logic circuits may also be deployed in a server to help identify root causes and debug failed components.
Figs. 4A-B are schematic diagrams illustrating exemplary circuitry and logic within a protocol stack including an error logging mechanism according to embodiments of the disclosure. Figs. 4A-B illustrate an exemplary microarchitecture that implements the physical, link, and transaction layers of a serial interconnect, such as PCIe/CXL/UPI. The present disclosure describes various detection and recording mechanisms, as shown in figs. 4A-B.
Fig. 4A illustrates logic circuitry of a protocol stack 400, showing some elements of the transmitter side of the microarchitecture. Protocol stack 400 may include a Transaction Layer (TL) queue 408a, a no-operation transaction layer packet (NOP TLP) generator 410, and a TX retry buffer 416. The TL queue 408a may include logic for storing or buffering outbound transaction layer information, payload data, control data, etc. of outbound packets. NOP TLP generator 410 may generate NOP TLPs, which may be included in NOP flits sent by the transmitter across the link. A NOP flit can be considered a flit that does not contain transaction layer packets. In some cases, a NOP flit may carry no DLLP payload (i.e., all 0s in the DLLP payload); this may be referred to as an idle flit. In some cases, NOP flits (especially idle flits) can be sent for use by the receiver to check for errors and increase the likelihood of correcting retried packets. Multiplexer 412 may multiplex information from TL queue 408a with information from NOP TLP generator 410.
TX retry buffer 416 may be used to temporarily store packets (TLP payloads) for retransmission if an error occurs during a previous flit or a current flit. In some implementations, such as those with no available configuration register space, a portion of TX retry buffer 416 may instead be used to store error information. The DLLP processor 438 (shown in some elements of the receiver-side protocol stack 450) may provide an ACK/NACK response to the retry buffer 416 to either cause a TLP in the retry buffer 416 to be resent or cleared. The DLLP processor 438 may use the information about the error in the flit to cause a new flit to be sent across the link.
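A minimal sketch of the ACK/NAK interaction with the retry buffer described above is given below; the sequence-number width, buffer depth, and 256-byte flit size are assumptions made only for illustration.

    #include <stdint.h>

    /* Transmitted flit payloads are held until acknowledged; a NAK causes replay
     * from the buffer. Depth, sequence width, and flit size are assumed here. */
    #define RETRY_DEPTH 64

    struct retry_buffer {
        uint8_t  flits[RETRY_DEPTH][256];
        uint16_t head_seq;   /* oldest unacknowledged sequence number */
        uint16_t tail_seq;   /* next sequence number to allocate */
    };

    /* ACK: release everything up to and including acked_seq. */
    void retry_ack(struct retry_buffer *rb, uint16_t acked_seq)
    {
        rb->head_seq = (uint16_t)(acked_seq + 1);
    }

    /* NAK: replay every still-buffered flit from the NAKed sequence number on. */
    void retry_nak(struct retry_buffer *rb, uint16_t naked_seq,
                   void (*resend)(const uint8_t *flit))
    {
        for (uint16_t s = naked_seq; s != rb->tail_seq; s = (uint16_t)(s + 1))
            resend(rb->flits[s % RETRY_DEPTH]);
    }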
Protocol stack 400 may include a Data Link Layer Packet (DLLP) generator 420 to generate DLLP information for the packet. DLLP information may be enhanced to a TLP by multiplexing/merging 422. The output of the TX retry buffer may be multiplexed with the output of multiplexer 412 by multiplexer 414. The output of multiplexer 414 may be multiplexed/merged with either all zeros 418, which may be used for error checking as described later, or with the results of DLLP generator 420.
Protocol stack 400 may also include a Cyclic Redundancy Check (CRC) code generator 424 that may generate a CRC for the outbound flit. The CRC code may be multiplexed/merged 426 with the outbound flit. As described above, Forward Error Correction (FEC) generator 428 may add Error Correction Codes (ECC). The ECCs may be interleaved across each flit channel using three sets of ECCs.
Ordered Set (OS) generator 430 may provide ordered sets as flit payloads. For example, OS generator 430 may insert a SKiP (SKP) OS into the flit stream. The SKP OS may be used to indicate that the next flit is an all-zero flit, as described below.
The flits may be transmitted from a PHY output 434, which PHY output 434 may include an Analog Front End (AFE), scrambling operations, sequencing, and so forth.
Errors may be injected into flits by logic circuitry that is part of the transmitter-side protocol stack 400, such as the physical layer flit error injection circuit 474. The error may include a bit flip (e.g., from 0 to 1 or from 1 to 0) or another type of error across the flit.
The flit error injection circuitry 474 can include hardware circuitry to flip bit values in flits, data blocks, or other information structures. Flit error injection 474 may inject errors based on several factors stored in or controlled by configuration registers, such as a Transmitter (TX) flit error register and trigger element 470. The flit error injection circuit 474 can inject errors into the flits after the ECC has been calculated for the payload. In this way, FEC can be used to identify the errors on the receiver side. Similarly, the CRC is calculated and encoded prior to error injection.
Table 1 describes the configuration register and the bits that cause one or more errors to be injected into transmitted flits when enabled (bit 0). Point A in fig. 4A provides a trigger in the TX flit error register and trigger element 470, according to the type of flit being sent (bits 10 and 9 in Table 1), to perform an action depending on the flit type. Once an error needs to be injected, an error of the corresponding magnitude is injected into the designated byte by the flit error injection logic element 474, as determined by the TX flit error register and trigger element 470. The flit error injection logic element 474 may include logic circuitry for introducing or injecting errors into the flit. For example, the flit error injection logic element 474 may cause a bit flip of one or more bits of the flit payload. Notably, flit error injection logic element 474 resides after FEC generator 428 has computed the error correction codes for the flit payload (including one or both of the TLP or the DLLP).
Table 1.TX configuration registers and description
[Table 1 is provided as an image in the original publication and is not reproduced here.]
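The key ordering point above, injecting the error only after the CRC and ECC have been computed over the clean flit, can be sketched as follows. The offset/magnitude semantics (flip error_magnitude consecutive bits starting at byte error_offset) and the helper names in the trailing comment are assumptions for illustration.

    #include <stddef.h>
    #include <stdint.h>

    /* Corrupt an already-encoded flit so that the receiver's FEC/CRC logic is
     * exercised on real errors. Semantics of offset/magnitude are assumed. */
    void flit_inject_error(uint8_t *flit, size_t flit_len,
                           size_t error_offset, unsigned error_magnitude)
    {
        for (unsigned i = 0; i < error_magnitude; i++) {
            size_t bit = error_offset * 8 + i;
            if (bit / 8 < flit_len)
                flit[bit / 8] ^= (uint8_t)(1u << (bit % 8));   /* flip one bit */
        }
    }

    /* Assumed transmitter ordering (names illustrative):
     *   build_flit(); crc_encode(); fec_encode(); flit_inject_error(); phy_transmit();
     */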
Fig. 5 is a process flow diagram 500 for injecting errors into flits according to an embodiment of the disclosure. In some embodiments, the error injection mechanism is activated when the link is operating in flit mode. During link initialization, negotiation, handshaking, etc., the link partners may enable flit mode. If both link partners support flit mode, they may turn it on. Flit mode can follow the required operational features of PCIe 6.0 or later versions, as well as optional features, based on implementation choices. The link partners generally operate in flit mode until the link is reset, reinitialized, or renegotiated. In some cases, flit mode can be dynamically turned on and off using, for example, power states, quality of service, or other forms of operation that can trigger link retraining; however, these may result in re-initialization and/or re-negotiation of link parameters.
First, a link partner, such as a host device or an endpoint device, may include protocol stack circuitry to activate or determine that error injection is activated (502). For example, information in the TX flit error register (or other configuration register that can control error injection) can indicate that error injection has been activated. The TX flit error register may also provide other types of information as shown in table 1.
In an embodiment, an error is injected into a flit at the start of a new flit or series of flits (504). Further, even if errors are injected into a series of multiple consecutive flits, the receiver side counts them as a single error occurrence. This condition may be referred to as the consecutive flit rule.
The protocol stack circuitry may determine various error injection parameters from the TX flit error register, including a flit type (506) and other error parameters (508). The flit types can include payload flits, idle flits, No Operation (NOP) flits, Ordered Set (OS) flits, or OS structures, among others. Other error parameters may include the number of errors to inject, the interval between occurrences in terms of the number of flits, the type of error injected (e.g., correctable, uncorrectable, a combination, etc.), consecutive error injections, the error offset, the error magnitude, and so forth. Table 1 provides more details.
The error may be contained in the flit (510). The error may include a bit flip or other type of error. For example, after the flit is built (and in some cases, after the error correction code is calculated for the flit payload), an error may be introduced or injected into the flit by logic circuitry that flips one or more bits in the flit payload.
After injecting the error into the flit, the protocol stack circuitry may update configuration register state information to reflect that the error injection has completed (512).
The following is an exemplary algorithm for injecting errors into flits. It should be understood that the following algorithm is for purposes of illustration, and not limitation. Other processes and implementations may be selected to achieve similar results.
1. If error injection is activated (i.e., Enable, bit 0 == 1b), go to step 2; otherwise, stay in step 1 until error injection is activated.
2. Some variables may be initialized:
num_errors_injected = 0, consecutive_Flit_inject = 0, distance_inject_Flits = 0, fec_group = 0.
3. If the next group of bits scheduled to be sent is a new flit, go to step 4; otherwise, stay in step 3 until the next group of bits scheduled to be sent is a new flit.
4. If (consecutive_Flit_inject > 0) begin // this flit needs an injection because of the consecutive-injection requirement:
a. consecutive_Flit_inject--;
b. go to step 8;
end.
5. If ((num_errors_injected >= Number of errors injected (bits [5:1])) && (num_errors_injected > 0)),
set the error injection status (bits 31:30) to 10b and go to step 1 // error injection complete.
6. If (distance_inject_Flits > 0) begin // too close to the last flit injected with an error:
a. distance_inject_Flits--;
b. go to step 3;
end.
7. If (((Injection on flit type (bits [10:9])) == 00b) ||
(the new flit to be sent is a NOP flit and (Injection on flit type == 01b or 11b)) ||
(the new flit to be sent is a payload flit and (Injection on flit type == 01b or 10b)))
begin // correct flit type for injecting an error:
a. if (num_errors_injected == 0), set the error injection status (bits 31:30) to 01b;
b. num_errors_injected++;
c. distance_inject_Flits = Interval between occurrences in terms of the number of flits (bits [8:6]);
d. if (distance_inject_Flits == 0), distance_inject_Flits = a random number between 1 and 127;
e. consecutive_Flit_inject = Consecutive error injection (bits [14:13]);
f. if (consecutive_Flit_inject == 0), consecutive_Flit_inject = a random number between 1 and 10;
g. consecutive_Flit_inject--;
end;
otherwise, go to step 3.
8. If ((Type of error injected (bits [12:11]) == 11b) ||
((Type of error injected == 00b) && (1-bit random number generator == 0))),
go to step 10 // uncorrectable error injection.
9. // Correctable error injection:
a. error_magnitude = Error magnitude (bits [31:24]);
b. if (error_magnitude == 0), error_magnitude = a random number between 1 and 255;
c. inject an error of magnitude error_magnitude at byte position (Error offset, bits [21:15]) in FEC group fec_group;
d. fec_group = (fec_group + 1) mod 3;
e. if ((Type of error injected (bits [12:11]) == 10b) && (fec_group != 0)), go to step 9a;
f. go to step 3.
10. // Uncorrectable error injection:
a. error_locn = 0;
b. error_magnitude = Error magnitude (bits [31:24]);
c. if (error_magnitude == 0), error_magnitude = a random number between min(error_locn, 1) and 255 // this allows byte 0 to carry no error, for a more random effect;
d. inject an error of magnitude error_magnitude at byte position error_locn;
e. error_locn += Error offset (bits [21:15]);
f. if (error_locn > flit size), go to step 3;
g. go to step 10b.
The num_errors_injected and consecutive_Flit_inject variables may also be added to the status register to signal completion and the exact current state. When software writes to any bit in this register, the "error injection status" bits are set to 00b.
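The following C sketch renders the scheduling portion of the algorithm above (roughly steps 3 through 8) under stated assumptions: corrupt_flit(), next_flit_kind(), and set_inject_status() are hypothetical hooks into the transmit pipeline, and an error count of zero is treated as "inject until disabled," per the parameter description that follows. It is a sketch of the control flow, not a normative implementation.

#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative parameter block decoded from the TX flit error register.
 * Field meanings follow the algorithm text above; exact positions are
 * defined in Table 1. */
struct tx_err_cfg {
    bool    enable;            /* bit 0: error injection enabled            */
    uint8_t num_errors;        /* bits [5:1]: 0 treated as "until disabled" */
    uint8_t interval;          /* bits [8:6]: flits between injections      */
    uint8_t flit_type_sel;     /* bits [10:9]: which flit types to target   */
    uint8_t error_type;        /* bits [12:11]: correctable/uncorrectable   */
    uint8_t consecutive;       /* bits [14:13]: consecutive flits to hit    */
    uint8_t error_offset;      /* bits [21:15]: byte offset / stride        */
    uint8_t error_magnitude;   /* 0 = randomized, non-zero = fixed pattern  */
};

/* Hypothetical hooks into the transmit pipeline. */
enum flit_kind { FLIT_NOP, FLIT_PAYLOAD, FLIT_OTHER };
extern enum flit_kind next_flit_kind(void);
extern void corrupt_flit(uint8_t error_type, uint8_t offset, uint8_t magnitude);
extern void set_inject_status(uint8_t status_2b);

static bool type_matches(const struct tx_err_cfg *c, enum flit_kind k)
{
    if (c->flit_type_sel == 0x0)                     /* 00b: any flit       */
        return true;
    if (k == FLIT_NOP && (c->flit_type_sel == 0x1 || c->flit_type_sel == 0x3))
        return true;                                 /* NOP on 01b or 11b   */
    if (k == FLIT_PAYLOAD && (c->flit_type_sel == 0x1 || c->flit_type_sel == 0x2))
        return true;                                 /* payload on 01b/10b  */
    return false;
}

/* Called once for every flit about to be transmitted (steps 3-8). */
void tx_error_inject_tick(const struct tx_err_cfg *c)
{
    static unsigned num_injected, consecutive_left, distance_left;

    if (!c->enable)
        return;

    if (consecutive_left > 0) {                      /* step 4: consecutive */
        consecutive_left--;
        corrupt_flit(c->error_type, c->error_offset, c->error_magnitude);
        return;
    }
    if (c->num_errors != 0 && num_injected >= c->num_errors) {
        set_inject_status(0x2);                      /* step 5: done (10b)  */
        return;
    }
    if (distance_left > 0) {                         /* step 6: spacing     */
        distance_left--;
        return;
    }
    if (!type_matches(c, next_flit_kind()))          /* step 7: type gate   */
        return;

    if (num_injected == 0)
        set_inject_status(0x1);                      /* in progress (01b)   */
    num_injected++;
    distance_left    = c->interval    ? c->interval    : (rand() % 127) + 1;
    consecutive_left = c->consecutive ? c->consecutive : (rand() % 10) + 1;
    consecutive_left--;                              /* this flit counts    */

    /* Steps 8-10 (byte placement and FEC-group handling) are assumed to
     * live inside corrupt_flit(). */
    corrupt_flit(c->error_type, c->error_offset, c->error_magnitude);
}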
In general, determining error parameters for a flit can include determining one or more of: the number of injected errors, the interval between error occurrences, the type of error injected, consecutive error injections, error offsets or error magnitudes.
The number of errors to inject may indicate that errors are injected until error injection is disabled, or may specify a fixed number of errors. The interval between error occurrences may be expressed as the number of flits between flits that carry errors. For example, the interval may be a random number of flits between 1 and 127, without repeating an interval until all random values have been used, or it may be a specified number of flits (keeping the consecutive flit rule in mind). The error type may include a random choice between correctable and uncorrectable errors, a correctable error in one FEC group, correctable errors in all three FEC groups, or an uncorrectable error.
Consecutive error injection may include injecting errors into a number of consecutive flits (including just one flit). For example, the consecutive error injection setting may result in error injection into a single flit, into two consecutive flits, into three consecutive flits, or into a random number of consecutive flits.
The error offset may be different for correctable errors and uncorrectable errors. For correctable errors, the error offset may determine the byte offset of the error injection. For uncorrectable errors, the error offset may determine the distance between subsequent erroneous bytes.
The error magnitude may be a random non-zero value or an exact error magnitude. The error magnitude defines the magnitude of the injected error, for example, the bits flipped in a single flit.
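To make the distinction between the two placements concrete, the sketch below mirrors steps 9 and 10 of the algorithm above: a correctable injection touches one byte within a targeted FEC group, while an uncorrectable injection walks across the flit with a stride equal to the error offset. The fec_group_byte() mapping and the randomization details are assumptions for illustration.

#include <stdint.h>
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical mapping from (FEC group, byte offset) to a position in the
 * serialized flit. */
extern size_t fec_group_byte(unsigned fec_group, unsigned byte_offset);

/* Correctable: corrupt one byte inside the targeted FEC group (step 9). */
static void inject_correctable(uint8_t *flit, unsigned fec_group,
                               unsigned byte_offset, uint8_t magnitude)
{
    if (magnitude == 0)
        magnitude = (uint8_t)((rand() % 255) + 1);   /* random non-zero */
    flit[fec_group_byte(fec_group, byte_offset)] ^= magnitude;
}

/* Uncorrectable: corrupt bytes 0, stride, 2*stride, ... across the flit so
 * that more symbols are hit than a single FEC group can correct (step 10). */
static void inject_uncorrectable(uint8_t *flit, size_t flit_len,
                                 unsigned stride, uint8_t magnitude)
{
    if (stride == 0)
        stride = 1;
    for (size_t locn = 0; locn < flit_len; locn += stride) {
        uint8_t m = magnitude;
        if (m == 0)    /* randomized: byte 0 may receive no error (step 10c) */
            m = (uint8_t)(locn == 0 ? rand() % 256 : (rand() % 255) + 1);
        flit[locn] ^= m;
    }
}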
Turning to the receiver side, fig. 4B illustrates the logic circuitry of a protocol stack 440 on the receiver side of the micro-architecture. The PHY input 442 of the receiver port may receive the flits. The PHY input 442 may include an AFE, descrambling operations, and other operations found in PHY inputs. The flits can be demultiplexed by a demultiplexer 444 into OS flits, used by OS check 446, or payload flits. The payload flits may undergo error detection by the FEC logic 448, which uses the ECC to identify and correct errors within the flit. The results of the error detection may be recorded in an error log 442, as described in more detail below. The flit may also be CRC checked by CRC logic 450. CRC logic 450 may use functionality found in error check logic 446 to detect and log errors, as described below. Error check logic 446 may also use information stored in RX replay buffer 444 (if present) to identify the bit positions of uncorrectable errors. The flits are split 460, and the ACK/NACK is provided by the DLLP processor 438 to the TX retry buffer 416. The TLP processor 448 may send the payload to the transaction layer queue 408b.
In some embodiments, the error check logic 446 may also provide a Bit Error Rate (BER) based on the errors accumulated by one or more counters (errors reported by the FEC correction logic 448 and fed to the CRC/sequence-number check logic 450) and on the number of bits received, as determined by a flit counter. In PCIe, as in other interconnects (e.g., CXL, UPI, etc.), flits contain a fixed number of bits, meaning that the total number of bits can be determined by counting the number of flits received. The error count may be compared to the total number of bits to calculate the BER.
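A minimal sketch of that BER estimate, assuming the two error counters named here are available, is shown below.

#include <stdint.h>

/* Sketch of the receiver-side BER estimate: flits have a fixed size in flit
 * mode, so the total number of bits is flits_received * bits_per_flit. The
 * two counter inputs are assumptions based on the description above. */
static double estimate_ber(uint64_t fec_corrected_errors,
                           uint64_t crc_detected_errors,
                           uint64_t flits_received,
                           unsigned bits_per_flit)
{
    uint64_t total_bits = flits_received * (uint64_t)bits_per_flit;
    if (total_bits == 0)
        return 0.0;
    return (double)(fec_corrected_errors + crc_detected_errors) /
           (double)total_bits;
}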
Protocol stack circuitry 440 may include logic and registers for injecting errors into received flits. Table 2 describes the configuration register and the bits that, when enabled (bit 0), cause one or more errors to be injected for received flits. Point B in fig. 4B provides a trigger according to the flit type received (bit 14) to act on the flit type. RX flit error logic element 480 determines whether an error (NAK) needs to be injected and injects it as shown in fig. 4B. RX flit error logic element 480 may include hardware circuitry and registers, as well as other information, to trigger the injection of a NAK for a received flit.
TABLE 2 configuration register for error injection at port receiver
Fig. 6 is a process flow diagram 600 for injecting errors into flits according to an embodiment of the disclosure. In some embodiments, the error injection mechanism is active when the link is operating in flit mode. The link partners may enable flit mode during link initialization, negotiation, handshaking, and so forth. If each link partner supports flit mode, the link partners may turn on flit mode. Flit mode may follow the required operational features of PCIe 6.0 or higher versions, as well as optional features, based on implementation choices.
First, a link partner, such as a host device or an endpoint device, may include protocol stack circuitry to activate error injection or determine that error injection is activated (602). For example, information in the RX flit error register (or other configuration register that can control error injection) can indicate that error injection has been activated. The RX flit error register may also provide other types of information as shown in table 2.
In an embodiment, an error is injected when a new flit or series of flits is received (604). Further, even if errors are injected for a series of multiple consecutive flits, they are counted as a single error occurrence at the receiver side. This condition may be referred to as the consecutive flit rule.
The protocol stack circuitry may determine various error injection parameters from the RX flit error register, including a flit type (606) and other error parameters (608). The flit type can include a payload flit or another type of flit. Other error parameters may include the number of errors to inject, the interval between occurrences in terms of the number of flits, the type of error injected (e.g., correctable, uncorrectable, a combination, etc.), consecutive error injections, the error offset, the error magnitude, and so forth. Table 2 provides more details.
The error may be contained in the flit (610). The error may include NAK or other type of negative acknowledgement or error injected at the receiver side. In some embodiments, an injected error (e.g., NAK) may be used to trigger or force replay of a received flit.
After injecting the error into the flit, the protocol stack circuitry may update configuration register state information to reflect that the error injection is complete (612).
The injected NAK may trigger a replay of the flit. Replay may be used to test and characterize replay mechanisms and determine the delay of replaying a flit.
The configuration register for error injection at the port receiver may include a trigger (e.g., an enable bit) for turning the error injection mechanism on and off. The configuration register also provides an indication of various error injection parameters. For example, the configuration register may include information indicating the number of errors to inject. Errors may be injected continuously until error injection is turned off, or, in some embodiments, a predetermined number of errors may be specified using information in the register.
Another parameter may include the interval between error occurrences, which may also be indicated by information in a register. Errors may be separated by the number of flits between flits into which artificial NAKs are injected. The interval may be a random number of flits or a specified number of flits.
Another parameter that may be indicated by the configuration register is simply forcing the flit to be replayed. When set, the configuration register information causes the receiver to force a replay only; the replay then proceeds from the same sequence number after the specified interval, and subsequent NAKs may be offset by the same amount.
Another error injection parameter may indicate whether multiple replay requests for the same sequence number are included. If enabled, two, three, or four NAKs may be injected for the same sequence number, separated by the flit interval specified elsewhere in the register.
Another parameter may include consecutive error injection. With consecutive error injection, errors can be injected into a number of consecutive flits: a single flit, two consecutive flits, three consecutive flits, or a random number of consecutive flits between 1 and 10. The error offset represents the byte offset of the error injection for correctable errors, or the distance between subsequent erroneous bytes for uncorrectable errors.
The error magnitude may be a random non-zero value or a predetermined error magnitude.
The following algorithm describes an exemplary mechanism for injecting errors in received flits at the receiver side protocol stack circuitry.
1. If error injection is enabled (i.e., Enable, bit 0 == 1b), go to step 2; otherwise, stay in step 1.
2. num_errors_injected = 0, consecutive_Flit_inject = 0, distance_inject_Flits = 0, seq_num_inject = 0, num_repeat_errors_seqno = 0.
3. If the next group of bits scheduled to be received is a new flit, go to step 4; otherwise, stay in step 3.
4. If (consecutive_Flit_inject > 0) begin // this flit needs an injection because of the consecutive-injection requirement:
a. consecutive_Flit_inject--;
b. go to step 9;
end.
5. If ((num_errors_injected >= Number of errors injected (bits [5:1])) && (num_errors_injected > 0) && (num_repeat_errors_seqno == 0)),
set the error injection status (bits 31:30) to 10b and go to step 1 // error injection complete.
6. If ((seq_num_inject > 0) && (received sequence number == seq_num_inject)) begin // repeatedly inject a NAK for the same sequence number:
a. seq_num_inject--;
b. num_errors_injected++;
c. if (Force replay-only flit (bit 15) == 1b) begin:
i. generate a NAK for the received flit, resulting in a replay request for flit number seq_num_inject;
ii. if (Interval between occurrences of artificial NAK (bits [13:6]) > 0), schedule a go-back-n replay from seq_num_inject after the number of flits indicated in bits [13:6];
end;
otherwise, generate a NAK for the received flit, resulting in a go-back-n replay request from flit number seq_num_inject;
d. go to step 3;
end.
7. If (distance_inject_Flits > 0) begin // too close to the last flit injected with an error:
a. distance_inject_Flits--;
b. go to step 3;
end.
8. If (((the received flit type is a payload flit) || (Injection on flit type (bit 14) == 0b)) && (num_errors_injected < Number of errors injected))
begin // correct flit type for injecting an error:
a. if (num_errors_injected == 0), set the error injection status (bits 21:20) to 01b;
b. num_errors_injected++;
c. distance_inject_Flits = Interval between occurrences of artificial NAK in terms of the number of flits (bits [13:6]);
d. if (distance_inject_Flits == 0), distance_inject_Flits = a random number between 1 and 127;
e. consecutive_Flit_inject = Consecutive error injection (bits [19:18]);
f. if (consecutive_Flit_inject == 0), consecutive_Flit_inject = a random number between 1 and 10;
g. consecutive_Flit_inject--;
end;
otherwise, go to step 3.
9. // Inject an error on the receive side:
a. if ((Multiple replay requests for the same sequence number (bits [17:16]) != 00b) && (seq_num_inject == 0)) begin:
i. seq_num_inject = Multiple replay requests for the same sequence number, minus 1;
ii. num_repeat_errors_seqno = the sequence number of the received flit;
end;
b. num_errors_injected++;
c. if (Force replay-only flit (bit 15) == 1b) begin:
i. generate a NAK for the received flit, resulting in a replay request for flit number seq_num_inject;
ii. if (Interval between occurrences of artificial NAK (bits [13:6]) > 0), schedule a go-back-n replay from seq_num_inject after the number of flits indicated in bits [13:6];
end;
otherwise, generate a NAK for the received flit, resulting in a go-back-n replay request from flit number seq_num_inject;
d. go to step 3.
The num_errors_injected and consecutive_Flit_inject variables may also be added to the status register to signal completion and the exact current state. When software writes to any bit in the register, the "error injection status" bits are set to 00b.
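The following sketch illustrates the receiver-side injection itself: a received flit that is actually good is reported as bad so that the link partner replays it, and the same sequence number can be NAKed again when the replayed copy arrives. The send_nak() hook and the state structure are assumptions, and the scheduling decisions (interval, error count, flit-type gate) are assumed to have been made by the caller, as in the algorithm above.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical hook: report the flit carrying this sequence number as bad,
 * which causes the link partner to replay from it (go-back-n). */
extern void send_nak(uint16_t replay_from_seqno);

struct rx_nak_state {
    bool     active;          /* repeat-NAK sequence in progress         */
    uint16_t target_seqno;    /* sequence number being NAKed repeatedly  */
    unsigned repeats_left;    /* additional NAKs still to be injected    */
};

/* Called only for flits the scheduling logic has selected for injection. */
static void rx_inject_nak(struct rx_nak_state *s,
                          uint16_t rx_seqno, unsigned repeat_count)
{
    if (s->active && rx_seqno == s->target_seqno && s->repeats_left > 0) {
        /* The replayed copy of an already-NAKed flit: NAK it once more. */
        s->repeats_left--;
        if (s->repeats_left == 0)
            s->active = false;
        send_nak(s->target_seqno);
        return;
    }

    /* First injection for this flit: remember the sequence number so the
     * replayed copy can be NAKed again if "multiple replay requests for
     * the same sequence number" is configured. */
    s->active       = (repeat_count > 1);
    s->target_seqno = rx_seqno;
    s->repeats_left = repeat_count ? repeat_count - 1 : 0;
    send_nak(rx_seqno);
}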
Returning to fig. 4A and 4B, embodiments of the present disclosure may include injecting errors into transmitted and received Ordered Sets (OS). On the transmitter side of the protocol stack, errors may be injected into the OS based on OS triggers in the OS error injection register and trigger logic 476. The OS error injection register and trigger logic 476 may include registers and other information to activate/trigger and configure error injection into the OS on the transmit side. The OS error injection register and trigger logic 476 may cause OS error injection circuit 478 to insert an error into the bytes of the generated OS. OS error injection circuit 478 may include hardware circuitry to cause bit flips in the OS bytes generated by OS generation circuit 430. The flit or OS series is then sent from the transmitter side of the protocol stack circuitry. At the receiver, the OS error injection register and trigger logic element 476 may cause OS check circuitry 446 to treat the received OS as having failed.
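A minimal sketch of the transmit-side OS corruption performed by an element such as OS error injection circuit 478 might look as follows; the function and parameter names are illustrative assumptions.

#include <stdint.h>
#include <stddef.h>

/* Sketch of transmit-side ordered-set corruption: the bytes produced by the
 * OS generator are modified before transmission. */
static void tx_inject_os_error(uint8_t *os_bytes, size_t os_len,
                               size_t byte_index, uint8_t flip_mask)
{
    if (byte_index < os_len && flip_mask != 0)
        os_bytes[byte_index] ^= flip_mask;   /* one or more bit flips */
}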
Table 3 describes the configuration register and the bits that, when enabled (bit 0), cause one or more errors to be injected into transmitted and/or received ordered sets. The OS generation logic 430 and OS check logic 446 in figs. 4A and 4B, respectively, evaluate the trigger input provided by the OS error injection register and trigger logic 476 to determine when an error is to be injected. If the error injection is to be applied to a transmitted ordered set, the OS error injection circuit 478 is instructed to perform the injection. The trigger may include an enabled OS error injection bit (1b). The error injection may include one or more bit flips in the ordered set. If the injection is to be applied to a received ordered set, the receive-side OS check circuitry 446 is instructed to treat the ordered set as having failed (e.g., by forcing a replay or by injecting a NAK). The error injection mechanism may be conditioned on the type of ordered set, the LTSSM state in which injection is required (one bit per state), or other factors.
TABLE 3 error injection in ordered sets of Transmit and receive
Fig. 7 is a process flow diagram 700 for a transmitter-side protocol stack to inject errors into an ordered set in accordance with an embodiment of the disclosure. First, the error injection feature may be enabled by an enable bit in the OS error injection register and trigger logic element 476 (702). At the start of a new OS (704), the OS generator logic may inject an error into the ordered set based on a number of parameters identified by the OS error injection register and trigger logic element 476. Error injection in the OS may be based on the OS type or the Link Training and Status State Machine (LTSSM) state of the link (706). The error parameters may be determined from the OS error injection register and trigger logic 476 (708). An error may be injected into the OS by error injection logic 478 (710). The error injection status may then be updated (712).
Fig. 8 is a process flow diagram 800 for a receiver-side protocol stack to inject errors into an ordered set in accordance with an embodiment of the disclosure. First, the error injection feature may be enabled by an enable bit in the OS error injection register and trigger logic element 476 (802). Upon receiving an OS, the receiver may treat the received OS as having failed based on a number of parameters identified by the OS error injection register and trigger logic 476. The treatment of the OS may be based on the OS type or the Link Training and Status State Machine (LTSSM) state of the link (806). The error parameters may be determined from the OS error injection register and trigger logic 476 (808). The error register may cause the OS check logic to treat the incoming OS or OS sequence as a failed OS (810). The error injection status may then be updated (812).
Examples of error parameters for an OS include the number of lanes on which errors are injected, the direction (TX side or RX side), the number of occurrences, the interval between occurrences in terms of the number of OSs, the OS type (including any OS, Training Sequence (TS) 0, TS1, TS2, SKP OS, Electrical Idle Exit OS (EIEOS), or Electrical Idle OS (EIOS)), the byte injected, or the LTSSM state.
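The following structure is an illustrative, non-authoritative grouping of the OS error injection parameters listed above; field widths and encodings are assumptions rather than the layout defined in Table 3.

#include <stdint.h>
#include <stdbool.h>

/* Illustrative grouping of ordered-set error injection parameters. */
enum os_type { OS_ANY, OS_TS0, OS_TS1, OS_TS2, OS_SKP, OS_EIEOS, OS_EIOS };

struct os_err_cfg {
    bool         enable;            /* OS error injection enabled             */
    uint32_t     lane_mask;         /* lanes on which errors are injected     */
    bool         rx_side;           /* false = TX bit flips, true = RX "fail" */
    uint8_t      num_occurrences;   /* how many ordered sets to affect        */
    uint8_t      interval_os;       /* spacing between occurrences, in OSs    */
    enum os_type type;              /* which ordered-set type to target       */
    uint8_t      byte_index;        /* byte within the OS to corrupt          */
    uint32_t     ltssm_state_mask;  /* one bit per LTSSM state where active   */
};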
Error injection may be used to test and characterize various components of the link. For example, error injection may test forward error correction mechanisms, cyclic redundancy check mechanisms, acknowledgment/negative acknowledgment mechanisms, replay functions, error handling, error rates, link stability, recovery and re-initialization timing and sequencing, and the like. In addition, error injection allows delay to be tested and characterized at the link level. Delay measurement is described in more detail below.
Delay measurement at the link level:
Delay measurement at the link level may include counting the period from the time a flit is scheduled to be sent to the time the Ack/Nak comes back, ignoring any SKP OS received (or sent). If the Ack applies to a subsequent flit, the time is adjusted accordingly by accounting for the time at which that sequence number was transmitted (adding time on the TX side). Nak replay delay can be tracked by counting the time from when the Nak is scheduled to be sent to the time the replay of that sequence number arrives, ignoring any SKP OS received.
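A minimal sketch of that count, assuming a free-running time source and a separate accumulator for SKP OS time, is shown below.

#include <stdint.h>

/* Sketch of the link-level delay count: the counter runs from the time a
 * flit is scheduled for transmission until its Ack/Nak returns, and any SKP
 * OS time is subtracted so it does not inflate the measurement. The time
 * source and its units are assumptions. */
struct delay_sample {
    uint64_t start_time;   /* when the flit was scheduled for transmission */
    uint64_t skp_time;     /* accumulated SKP OS time to be ignored        */
};

static uint64_t delay_on_ack(const struct delay_sample *s, uint64_t ack_time)
{
    return (ack_time - s->start_time) - s->skp_time;
}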
Turning to fig. 4A-4B, protocol stacks 400 and 440 may include delay measurement register and trigger 472 and delay measurement logic 473. The delay measurement register and trigger 472 may include information for activating, disabling, and configuring delay measurement parameters. The delay measurement register and trigger 472 may also include register information for reporting the status of the delay measurement based on the control register settings, parameters, and so forth. The reporting register may be located in the same register set as the delay measurement configuration information and trigger, or in a separate register set. Tables 4 and 5 provide the configuration register settings used for control and the measurement results reported in the status.
TABLE 4 control registers for delay measurement and LTSSM state transition of flits
TABLE 5 status register reports delay measurements based on control register settings
The delay measurement register and trigger 472 may include an information storage device such as a memory, register, latch, or other logic to hold information used to activate, configure, execute, and report delay measurements. Delay measurement logic 473 may include hardware or software logic to receive the information and calculate the delay. For example, delay measurement logic 473 may receive information from a transmitter-side protocol stack circuit element, at the link layer or elsewhere, indicating the transmission of flits. Delay measurement logic 473 may also receive information from CRC check circuit 450 or other circuit elements of receiver-side protocol stack 440 that provide Ack/Nak information for received flits.
Fig. 9 is a process flow diagram 900 for performing delay measurements according to an embodiment of the disclosure. Delay tracking may be turned on and off by setting the register information contained in the delay measurement register and trigger 472 (902).
When the delay measurement feature is turned on, the delay measurement logic 473 may use the information in the delay measurement register to determine how to calculate the delay based on the configured parameters (908). In some embodiments, the delay tracking mechanism may include different modes as specified by the register information. For example, in a first mode, sequence numbers may be tracked one at a time. In another mode, multiple sequence numbers may be tracked (one sequence number and the subsequent sequence numbers started, after the delay associated with any SKP OS is removed). Similarly, in some embodiments, different types of flits (payload as well as NOP) can be tracked, while in other embodiments only one flit type, such as payload flits, is tracked.
One or more flits can be transmitted to a receiver by the transmitter-side protocol stack circuitry (904). The protocol stack circuitry may receive an Ack/Nak indicating successful or unsuccessful completion of the transmitted flit (906). Based on the delay measurement parameters identified from register 472, delay measurement logic 473 may track flit transmit information (910) and flit receive and Ack/Nak information (912). Delay measurement logic 473 may determine or calculate the delay based on the flit transmit information and the flit receive information. An example is provided below:
If more than one flit is being tracked, the first flit is tracked initially, and tracking then continues for additional flits until the flit sent just before the SKP OS (N flits in total).
On the receiving side, flits carrying an Ack/Nak sequence number greater than or equal to the starting tracked flit number are tracked. Tracking continues (measuring the period from transmission until the corresponding Ack/Nak is received) until a flit is received whose Ack/Nak sequence number is greater than or equal to the ending tracked transmit flit number, ignoring any SKP OS received in between. Thus:
S = the start time of the first tracked flit sent;
m = the time per flit;
n = the starting sequence number of the tracked flits;
n' = the ending sequence number of the tracked flits;
N = the total number of flits tracked (N = n' - n + 1).
If R = the time at which the flit carrying Ack sequence number i is received:
start time for flit i = S + m·(i - n);
delay for flit i = R - S - m·(i - n).
The delay is added to a running delay count, and the received flit count is incremented. The average delay can then be calculated from these two values (the accumulated delay and the number of flits tracked).
During replay, Ack or Nak counting is suspended to account for retransmission of one or more flits with a particular sequence number.
The above mechanism can be extended to measure the delay of recovery from errors, of link retraining, or of L0p state transitions.
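As a rough illustration of the calculation above, the sketch below accumulates per-flit delays from the definitions S, m, and n and the Ack arrival time R; the structure and function names are assumptions, and the (i - n) offset reflects that S is the start time of the first tracked flit.

#include <stdint.h>

/* Sketch of the per-flit delay accumulation using the definitions above:
 * S = start time of the first tracked flit, m = time per flit, n = starting
 * sequence number, R = time the Ack with sequence number i is received.
 * The flit with sequence number i therefore started at S + m*(i - n). */
struct delay_accum {
    uint64_t total_delay;     /* sum of per-flit delays     */
    uint64_t flits_counted;   /* number of flits accounted  */
};

static void account_ack(struct delay_accum *a, uint64_t R,
                        uint64_t S, uint64_t m, uint32_t n, uint32_t i)
{
    uint64_t start_i = S + m * (uint64_t)(i - n);   /* send time of flit i */
    a->total_delay  += R - start_i;                 /* delay for flit i    */
    a->flits_counted++;
}

static double average_delay(const struct delay_accum *a)
{
    return a->flits_counted
         ? (double)a->total_delay / (double)a->flits_counted
         : 0.0;
}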
Referring to fig. 10, an embodiment of a fabric comprised of point-to-point links interconnecting a set of components is shown. The system 1000 includes a processor 1005 and a system memory 1010 coupled to a controller hub 1015. Processor 1005 includes any processing element, such as a microprocessor, host processor, embedded processor, co-processor, or other processor. The processor 1005 is coupled to the controller hub 1015 via a Front Side Bus (FSB) 1006. In one embodiment, the FSB 1006 is a serial point-to-point interconnect as described below. In another embodiment, link 1006 comprises a serial differential interconnect architecture that conforms to different interconnect standards.
The system memory 1010 includes any memory device, such as Random Access Memory (RAM), non-volatile (NV) memory, or other memory accessible by devices in the system 1000. The system memory 1010 is coupled to the controller hub 1015 through a memory interface 1016. Examples of memory interfaces include Double Data Rate (DDR) memory interfaces, dual-channel DDR memory interfaces, and dynamic RAM (DRAM) memory interfaces.
In one embodiment, the controller hub 1015 is a root hub, root complex, or root controller in a Peripheral Component Interconnect Express (PCIe) interconnect hierarchy. Examples of controller hub 1015 include a chipset, a Memory Controller Hub (MCH), a northbridge, an Interconnect Controller Hub (ICH), a southbridge, and a root port controller/hub. Often, the term "chipset" refers to two physically separate controller hubs, i.e., a Memory Controller Hub (MCH) coupled to an Interconnect Controller Hub (ICH). Note that current systems often include the MCH integrated with the processor 1005, while the controller 1015 communicates with I/O devices in a manner similar to that described below. In some embodiments, peer-to-peer routing is optionally supported through the root complex 1015.
Here, controller hub 1015 is coupled to switch/bridge 1020 via serial link 1019. Input/output modules 1017 and 1021 (which may also be referred to as interfaces/ports 1017 and 1021) include/implement a layered protocol stack to provide communication between controller hub 1015 and switch 1020. In one embodiment, multiple devices can be coupled to the switch 1020.
The switch/bridge 1020 routes packets/messages from the devices 1025 upstream (i.e., up the hierarchy towards the root complex) to the controller hub 1015, and downstream (i.e., down the hierarchy away from the root port controller) from the processor 1005 or system memory 1010 to the devices 1025. In one embodiment, switch 1020 is referred to as a logical assembly of multiple virtual PCI-to-PCI bridge devices. Device 1025 includes any internal or external device or component to be coupled to an electronic system, such as an I/O device, Network Interface Controller (NIC), add-in card, audio processor, network processor, hard drive, storage device, CD/DVD ROM, monitor, printer, mouse, keyboard, router, portable storage device, firewire device, Universal Serial Bus (USB) device, scanner, and other input/output devices. Often in PCIe vernacular, such a device is referred to as an endpoint. Although not specifically shown, device 1025 may include a PCIe to PCI/PCI-X bridge to support legacy or other versions of PCI devices. Endpoint devices in PCIe are typically classified as legacy, PCIe, or root complex integrated endpoints.
Graphics accelerator 1030 is also coupled to controller hub 1015 via serial link 1032. In one embodiment, graphics accelerator 1030 is coupled to an MCH, which is coupled to an ICH. Switch 1020 and corresponding I/O device 1025 are then coupled to the ICH. I/O modules 1031 and 1018 also serve to implement a layered protocol stack to communicate between graphics accelerator 1030 and controller hub 1015. Similar to the MCH discussion above, a graphics controller or the graphics accelerator 1030 itself may be integrated within processor 1005.
Turning to fig. 11, an embodiment of a layered protocol stack is shown. Layered protocol stack 1100 includes any form of layered communication stack, such as a Quick Path Interconnect (QPI) stack, a PCIe stack, a next generation high performance computing interconnect stack, or other layered stack. Although the following direct discussion with reference to fig. 10-15 is related to a PCIe stack, the same concepts may be applied to other interconnect stacks. In one embodiment, protocol stack 1100 is a PCIe protocol stack that includes a transaction layer 1105, a link layer 1110, and a physical layer 1120. Interfaces (e.g., interfaces 1017, 1018, 1021, 1022, 1026, and 1031 in fig. 10) may be represented as a communication protocol stack 1100. The representation as a communication protocol stack may also be referred to as a module or interface implementing/comprising a protocol stack.
PCI Express uses packets to communicate information between components. Packets are formed in the transaction layer 1105 and the data link layer 1110 to carry information from the sending component to the receiving component. As the transmitted packets flow through the other layers, they are extended with additional information needed to process the packets at those layers. The reverse process occurs at the receiving side, and packets are converted from their physical layer 1120 representation to the data link layer 1110 representation and finally (for transaction layer packets) to a form that can be processed by the transaction layer 1105 of the receiving device.
Transaction layer
In one embodiment, transaction layer 1105 is used to provide an interface between the processing cores of the device and the interconnect fabric, such as data link layer 1110 and physical layer 1120. In this regard, the primary responsibility of the transaction layer 1105 is to assemble and disassemble packets (i.e., transaction layer packets or TLPs). Transaction layer 1105 generally manages credit-based flow control for TLPs. PCIe implements split transactions, i.e., transactions with requests and responses separated in time, allowing the link to carry other traffic while the target device collects the response data.
In addition, PCIe utilizes credit-based flow control. In this scheme, the device advertises an initial credit amount for each receive buffer in the transaction layer 1105. An external device at the opposite end of the link (e.g., controller hub 1015 in figure 10) counts the number of credits consumed by each TLP. If the transaction does not exceed the credit limit, the transaction may be transmitted. After receiving the response, the credits are restored. The advantage of the credit scheme is that the delay in credit return does not affect performance as long as no credit limit is encountered.
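A minimal sketch of this credit accounting, with illustrative names, is shown below.

#include <stdint.h>
#include <stdbool.h>

/* Minimal sketch of credit-based flow control: the receiver advertises an
 * initial credit count per buffer, the transmitter deducts credits per TLP
 * and transmits only while credits remain, and credits are restored once
 * the receiver has processed the TLP. */
struct credit_pool {
    uint32_t credits;   /* credits currently available for this buffer */
};

static bool try_consume(struct credit_pool *p, uint32_t tlp_credits)
{
    if (p->credits < tlp_credits)
        return false;             /* would exceed the credit limit: hold TLP */
    p->credits -= tlp_credits;
    return true;                  /* TLP may be transmitted */
}

static void restore_credits(struct credit_pool *p, uint32_t tlp_credits)
{
    p->credits += tlp_credits;    /* receiver has freed the buffer space */
}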
In one embodiment, the four transaction address spaces include a configuration address space, a memory address space, an input/output address space, and a message address space. The memory space transaction includes one or more of a read request and a write request to transfer data to/from a memory-mapped location. In one embodiment, memory space transactions can use two different address formats, e.g., a short address format (e.g., a 32-bit address) or a long address format (e.g., a 64-bit address). The configuration space transaction is used to access a configuration space of the PCIe device. Transactions to the configuration space include read requests and write requests. Message space transactions (or simply messages) are defined to support in-band communication between PCIe agents.
Thus, in one embodiment, the transaction layer 1105 assembles the packet header/payload 1106. The format of the current packet header/payload may be found in the PCIe specification at the PCIe specification web site.
Referring quickly to FIG. 12, an embodiment of a PCIe transaction descriptor is shown. In one embodiment, transaction descriptor 1200 is a mechanism for carrying transaction information. In this regard, the transaction descriptor 1200 supports the identification of transactions in the system. Other potential uses include tracking modifications to the default transaction order and association of transactions with channels.
Transaction descriptor 1200 includes a global identifier field 1202, an attribute field 1204, and a channel identifier field 1206. In the illustrated example, a global identifier field 1202 is depicted that includes a local transaction identifier field 1208 and a source identifier field 1210. In one embodiment, the global transaction identifier 1202 is unique to all outstanding requests.
According to one implementation, the local transaction identifier field 1208 is a field generated by the requesting agent and is unique to all outstanding requests that require the requesting agent to complete. Further, in this example, the source identifier 1210 uniquely identifies the requestor agent within the PCIe hierarchy. Thus, the local transaction identifier 1208 field, along with the source ID 1210, provides global identification of the transaction within the hierarchy domain.
The attributes field 1204 specifies the nature and relationship of the transaction. In this regard, the attribute field 1204 is potentially used to provide additional information that allows modification of the default handling of the transaction. In one embodiment, attribute fields 1204 include a priority field 1212, a reserved field 1214, an ordering field 1216, and a no snoop field 1218. Here, the priority subfield 1212 may be modified by the initiator to assign a priority to the transaction. Reserved attributes field 1214 is reserved for future or vendor defined use. The possible usage model using priority or security attributes may be implemented using reserved attribute fields.
In this example, ordering attributes field 1216 is used to provide optional information conveying the type of ordering that may modify the default ordering rules. According to one exemplary implementation, an ordering attribute of "0" denotes that the default ordering rules are to apply, while an ordering attribute of "1" denotes relaxed ordering, where writes may pass writes in the same direction and read completions may pass writes in the same direction. The no-snoop attribute field 1218 is used to determine whether the transaction is snooped. As shown, channel ID field 1206 identifies the channel associated with the transaction.
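For illustration only, the descriptor fields described above might be grouped as follows; the field widths are assumptions for the sketch, not the widths defined by the specification.

#include <stdint.h>

/* Illustrative grouping of the transaction descriptor fields described
 * above (FIG. 12). */
struct transaction_descriptor {
    /* global identifier field 1202 */
    uint16_t local_transaction_id;   /* unique per outstanding request (1208)    */
    uint16_t source_id;              /* requester ID within the hierarchy (1210) */

    /* attributes field 1204 */
    uint8_t  priority;               /* priority subfield (1212)                 */
    uint8_t  reserved;               /* reserved for future/vendor use (1214)    */
    uint8_t  ordering;               /* 0 = default, 1 = relaxed ordering (1216) */
    uint8_t  no_snoop;               /* whether the transaction is snooped (1218)*/

    uint8_t  channel_id;             /* associated channel (1206)                */
};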
Link layer
Link layer 1110, also referred to as data link layer 1110, acts as an intermediate stage between transaction layer 1105 and physical layer 1120. In one embodiment, it is the responsibility of the data link layer 1110 to provide a reliable mechanism for exchanging Transaction Layer Packets (TLPs) between the two components of the link. One side of the data link layer 1110 accepts a TLP assembled by the transaction layer 1105, applies the packet sequence identifier 1111 (i.e., identification number or packet number), computes and applies an error detection code (i.e., CRC 1112), and submits the modified TLP to the physical layer 1120 for cross-physical transmission to an external device.
Physical layer
In one embodiment, the physical layer 1120 includes a logical sub-block 1121 and an electrical sub-block 1122 to physically transmit packets to an external device. Here, logical sub-block 1121 is responsible for the "digital" functions of physical layer 1120. In this regard, the logical sub-block includes a transmit section to prepare outgoing information for transmission by physical sub-block 1122 and a receiver section to identify and prepare received information before passing it to link layer 1110.
Physical block 1122 includes a transmitter and a receiver. Logical sub-block 1121 supplies symbols to the transmitter, which serializes and transmits the symbols to an external device. The receiver is supplied with the serialized symbols from the external device and converts the received signals into a bit stream. The bit stream is deserialized and supplied to logical sub-block 1121. In one embodiment, an 8b/10b transmission code is employed, in which ten-bit symbols are transmitted/received. Here, special symbols are used to frame a packet with frames 1123. In addition, in one example, the receiver also provides a symbol clock recovered from the incoming serial stream.
As described above, while the transaction layer 1105, link layer 1110, and physical layer 1120 are discussed with reference to a particular embodiment of a PCIe protocol stack, a layered protocol stack is not so limited. In fact, any layered protocol may be included/implemented. As an example, a port/interface represented as a layered protocol includes: (1) a first layer to assemble packets, i.e., a transaction layer; (2) a second layer to sequence packets, i.e., a link layer; and (3) a third layer to transmit the packets, i.e., a physical layer. As a specific example, a Common Standard Interface (CSI) layered protocol is used.
Referring next to FIG. 13, an embodiment of a PCIe serial point-to-point fabric is shown. Although an embodiment of a PCIe serial point-to-point link is shown, the serial point-to-point link is not so limited as it includes any transmission path for transmitting serial data. In the illustrated embodiment, the primary PCIe link includes two low voltage differential drive signal pairs: a transmit pair 1306/1311 and a receive pair 1312/1307. Thus, device 1305 includes transmit logic 1306 for transmitting data to device 1310 and receive logic 1307 for receiving data from device 1310. In other words, the PCIe link includes two transmit paths (i.e., paths 1316 and 1317) and two receive paths (i.e., paths 1318 and 1319) therein.
A transmit path refers to any path for transmitting data, such as a transmission line, copper wire, optical fiber, wireless communication channel, infrared communication link, or other communication path. The connection between two devices (e.g., device 1305 and device 1310) is referred to as a link (e.g., link 1315). A link may support one lane-each lane represents a set of differential signal pairs (one pair for transmission and one pair for reception). To extend bandwidth, a link may aggregate multiple lanes represented by xN, where N is any supported link width, e.g., 1, 2, 4, 8, 12, 16, 32, 64 or wider.
A differential pair refers to two transmission paths, such as lines 1316 and 1317, for transmitting differential signals. By way of example, line 1317 is driven from a high logic level to a low logic level, i.e., a falling edge, when line 1316 switches from a low voltage level to a high voltage level, i.e., a rising edge. Differential signals may exhibit better electrical characteristics, such as better signal integrity, i.e., cross-coupling, voltage overshoot/undershoot, ringing, and the like. This may provide a better timing window and thus a faster transmission frequency.
Note that the apparatus, methods, and systems described above may be implemented in any electronic device or system as described above. As specific illustrations, the figures below provide exemplary systems for utilizing the present disclosure as described herein. As the systems below are described in more detail, a number of different interconnects from the discussion above are disclosed, described, and revisited. And as is readily apparent, the advances described above may be applied to any of those interconnects, fabrics, or architectures.
Turning to FIG. 14, a block diagram of an exemplary computing system formed with a processor that includes execution units to execute an instruction, where one or more of the interconnects implement one or more features in accordance with the present disclosure, is shown. System 1400 includes a component, such as a processor 1402, to employ execution units including logic to perform algorithms for processing data, in accordance with the present disclosure, such as in the embodiments described herein. System 1400 is representative of processing systems based on the PENTIUM III, PENTIUM 4, Xeon, Itanium, XScale and/or StrongARM microprocessors available from Intel Corporation of Santa Clara, California, although other systems (including PCs having other microprocessors, engineering workstations, set-top boxes, and the like) may also be used. In one embodiment, sample system 1400 executes a version of the WINDOWS operating system available from Microsoft Corporation of Redmond, Washington, although other operating systems (UNIX and Linux, for example), embedded software, and/or graphical user interfaces may also be used. Thus, embodiments of the present disclosure are not limited to any specific combination of hardware circuitry and software.
Embodiments are not limited to computer systems. Alternative embodiments of the present disclosure may be used in other devices, such as handheld devices and embedded applications. Some examples of handheld devices include cellular telephones, internet protocol devices, digital cameras, Personal Digital Assistants (PDAs), and handheld PCs. The embedded application may include a microcontroller, a Digital Signal Processor (DSP), a system on a chip, a network computer (NetPC), a set-top box, a network hub, a Wide Area Network (WAN) switch, or any other system that may execute one or more instructions in accordance with at least one embodiment.
In the illustrated embodiment, processor 1402 includes one or more execution units 1408 to implement an algorithm that is to perform at least one instruction. One embodiment may be described in the context of a single-processor desktop or server system, but alternative embodiments may be included in a multiprocessor system. System 1400 is an example of a "hub" system architecture. The computer system 1400 includes a processor 1402 to process data signals. The processor 1402, as one illustrative example, includes a Complex Instruction Set Computer (CISC) microprocessor, a Reduced Instruction Set Computing (RISC) microprocessor, a Very Long Instruction Word (VLIW) microprocessor, a processor implementing a combination of instruction sets, or any other processor device (e.g., a digital signal processor). The processor 1402 is coupled to a processor bus 1410, which transmits data signals between the processor 1402 and other components in the system 1400. The elements of the system 1400 (e.g., graphics accelerator 1412, memory controller hub 1416, memory 1420, I/O controller hub 1424, wireless transceiver 1426, flash BIOS 1428, network controller 1434, audio controller 1436, serial expansion port 1438, I/O controller 1440, etc.) perform their conventional functions well known to those skilled in the art.
In one embodiment, processor 1402 includes a level one (L1) internal cache memory 1404. Depending on the architecture, processor 1402 may have a single internal cache or multiple levels of internal cache. Other embodiments include a combination of both internal and external caches, depending on the particular implementation and requirements. Register file 1406 is used to store different types of data in various registers, including integer registers, floating point registers, vector registers, bank registers, shadow registers, checkpoint registers, status registers, and instruction pointer registers.
Execution unit 1408, which includes logic to perform integer and floating point operations, is also located in processor 1402. In one embodiment, the processor 1402 includes microcode (ucode) ROM to store microcode that, when executed, will execute certain macroinstructions or algorithms to process complex scenarios. Here, the microcode may be updateable to handle logical errors/repairs for the processor 1402. For one embodiment, the execution unit 1408 includes logic to process the packed instruction set 1409. By including the packed instruction set 1409 in the instruction set of the general purpose processor 1402, and by executing the circuitry associated with the instructions, the packed data in the general purpose processor 1402 can be used to perform operations used by many multimedia applications. Thus, many multimedia applications are accelerated and executed more efficiently by using the full width of the processor data bus to perform operations on packed data. This potentially eliminates the need to transfer a small unit of data across the data bus of the processor to perform one or more operations, one data element at a time.
Alternative embodiments of the execution unit 1408 may also be used in microcontrollers, embedded processors, graphics devices, DSPs, and other types of logic circuitry. System 1400 includes a memory 1420. Memory 1420 includes Dynamic Random Access Memory (DRAM) devices, Static Random Access Memory (SRAM) devices, flash memory devices, or other memory devices. Memory 1420 stores instructions and/or data represented by data signals to be executed by processor 1402.
Note that any of the foregoing features or aspects of the present disclosure may be used on one or more of the interconnects shown in fig. 14. For example, an on-die interconnect (ODI), not shown, for coupling internal units of processor 1402 implements one or more aspects of the above disclosure. Alternatively, the present disclosure is associated with: a processor bus 1410 (e.g., an Intel Quick Path Interconnect (QPI) or other known high performance computing interconnect), a high bandwidth memory path 1418 to memory 1420, point-to-point links to the graphics accelerator 1412 (e.g., peripheral component interconnect express (PCIe) compatible fabric), a controller hub interconnect 1422, I/O or other interconnects (e.g., USB, PCI, PCIe) for coupling other illustrated components. Some examples of such components include an audio controller 1436, a firmware hub (flash BIOS)1428, a wireless transceiver 1426, a data storage device 1424, a conventional I/O controller 1410 containing a user input and keyboard interface 1442, a serial expansion port 1438 (e.g., Universal Serial Bus (USB)), and a network controller 1434. The data storage device 1424 may include a hard disk drive, a floppy disk drive, a CD-ROM device, a flash memory device, or other mass storage device.
Referring now to fig. 15, shown is a block diagram of a second system 1500 in accordance with an embodiment of the present disclosure. As shown in fig. 15, multiprocessor system 1500 is a point-to-point interconnect system, and includes a first processor 1570 and a second processor 1580 coupled via a point-to-point interconnect 1550. Each of processors 1570 and 1580 may be some version of a processor. In one embodiment, 1552 and 1554 are part of a serial point-to-point coherent interconnect structure, such as the Quick Path Interconnect (QPI) architecture of Intel. As a result, the present disclosure may be implemented within QPI architecture.
Although only two processors 1570, 1580 are shown, it is to be understood that the scope of the present disclosure is not so limited. In other embodiments, one or more additional processors may be present in a given processor.
Processors 1570 and 1580 are shown including integrated memory controller units 1572 and 1582, respectively. Processor 1570 also includes as part of its bus controller units point-to-point (P-P) interfaces 1576 and 1578; similarly, the second processor 1580 includes P-P interfaces 1586 and 1588. Processors 1570, 1580 may exchange information via a point-to-point (P-P) interface 1550 using P-P interface circuits 1578, 1588. As shown in fig. 15, IMCs 1572 and 1582 couple the processors to respective memories, namely a memory 1532 and a memory 1534, which may be portions of main memory locally attached to the respective processors.
Processors 1570, 1580 each exchange information with a chipset 1590 via respective P-P interfaces 1552, 1554 using point-to-point interface circuits 1576, 1594, 1586, 1598. Chipset 1590 also exchanges information with a high-performance graphics circuit 1538 along a high-performance graphics interconnect 1539 via interface circuit 1592.
A shared cache (not shown) may be included in either processor or external to both processors; it may also be connected with the processors via a P-P interconnect so that if the processors are placed in a low power mode, local cache information for either or both processors may be stored in the shared cache.
Chipset 1590 may be coupled to a first bus 1516 via an interface 1596. In one embodiment, first bus 1516 may be a Peripheral Component Interconnect (PCI) bus, or a bus such as a PCI Express bus or another third generation I/O interconnect bus, although the scope of the present disclosure is not so limited.
As shown in fig. 15, various I/O devices 1514 are coupled to first bus 1516, along with a bus bridge 1518 that couples first bus 1516 to a second bus 1520. In one embodiment, second bus 1520 comprises a Low Pin Count (LPC) bus. Various devices are coupled to second bus 1520 including, for example, a keyboard and/or mouse 1522, communication devices 1527, and a storage unit 1528 (e.g., a disk drive or other mass storage device, which in one embodiment typically includes instructions/code and data 1530). Further, an audio I/O 1524 is shown coupled to second bus 1520. Note that other architectures are possible, where the included components and interconnect architectures vary. For example, instead of the point-to-point architecture of FIG. 15, a system may implement a multi-drop bus or other such architecture.
Many different use cases may be implemented using various inertial and environmental sensors present in the platform. These use cases may enable advanced computing operations including perceptual computing, and may also enhance power management/battery life, safety, and system responsiveness.
For example, with respect to power management/battery life issues, based at least in part on information from an ambient light sensor, an ambient light condition in the platform location is determined and the intensity of the display is controlled accordingly. Thus, power consumption for operating the display is reduced in certain lighting conditions.
With respect to security operations, based on contextual information (e.g., location information) obtained from sensors, it may be determined whether to allow a user to access certain security documents. For example, a user may be allowed access to such documents at a workplace or home location. However, when the platform is located in a public place, the user will be prevented from accessing such documents. In one embodiment, the determination is based on location information determined, for example, via a GPS sensor or camera recognition of landmarks. Other security operations may include providing pairing of devices within close proximity of each other, e.g., a portable platform and a user's desktop computer, mobile phone, etc., as described herein. In some implementations, when the devices are so paired, some sharing is achieved via near field communication. However, such sharing may be disabled when the device is beyond a certain range. Further, when pairing a platform as described herein with a smartphone, an alarm may be configured to trigger when the devices move more than a predetermined distance from each other in a public place. Conversely, when the paired devices are in a secure location, such as a workplace or home location, the devices may exceed the predetermined limit without triggering such an alarm.
Sensor information may also be used to enhance responsiveness. For example, sensors may be enabled to run at a relatively low frequency even when the platform is in a low power state. Accordingly, any change in the position of the platform is determined, such as by inertial sensors, GPS sensors, or the like. If no such change has been registered, a faster connection to a previous wireless hub (e.g., a Wi-Fi access point or similar wireless enabler) occurs, because there is no need to scan for available wireless network resources in this case. Thus, a higher level of responsiveness is achieved when waking from a low power state.
It should be understood that many other use cases may be enabled using sensor information obtained via integrated sensors within a platform as described herein, and the above examples are for illustration purposes only. Using the systems described herein, perceptual computing systems may allow for the addition of alternative forms of input, including gesture recognition, and enable the system to sense user operation and intent.
In some embodiments, there may be one or more infrared or other thermal sensing elements, or any other element for sensing the presence or movement of a user. Such sensing elements may include a plurality of different elements that work together, sequentially, or both. For example, the sensing elements include elements that provide initial sensing, such as light or sound projection, and then gesture detection is sensed by, for example, an ultrasonic time-of-flight camera or a pattern light camera.
Also in some embodiments, the system includes a light generator to generate the illumination line. In some embodiments, the line provides a visual cue about a virtual boundary (i.e., an imaginary or virtual location in space) where the user's action through or breaching the virtual boundary or plane is interpreted as an intent to engage with the computing system. In some embodiments, the illumination lines may change color when the computing system transitions to a different state with respect to the user. The illumination lines may be used to provide visual cues to the user of virtual boundaries in space, and may be used by the system to determine state transitions of the computer relative to the user, including determining when the user wishes to engage with the computer.
In some embodiments, the computer senses the user position and operates to interpret movement of the user's hand through the virtual boundary as a gesture indicating that the user intends to engage with the computer. In some embodiments, the light generated by the light generator may change as the user passes through a virtual line or plane, thereby providing visual feedback to the user that the user has entered an area for providing gestures to provide input to the computer.
The display screen may provide a visual indication of a state transition of the user's computing system. In some embodiments, a first screen is provided in a first state in which the system senses the presence of a user, for example by using one or more sensing elements.
In some implementations, the system senses the user identity, for example, through facial recognition. Here, the transition to the second screen may be provided in a second state in which the computing system has recognized the user identity, wherein the second screen provides visual feedback to the user that the user has transitioned to the new state. The transition to the third screen may occur in a third state where the user has confirmed that the user is recognized.
In some embodiments, the computing system may use a translation mechanism to determine the location of the virtual boundary of the user, where the location of the virtual boundary may vary by user and context. The computing system may generate light (e.g., illumination lines) to indicate virtual boundaries for interfacing with the system. In some embodiments, the computing system may be in a wait state and light may be produced in a first color. The computing system may detect whether the user has crossed the virtual boundary, for example, by sensing the presence and motion of the user using sensing elements.
In some embodiments, if it has been detected that the user has crossed the virtual boundary (e.g., the user's hand is closer to the computing system than the virtual boundary line), the computing system may transition to a state for receiving gesture input from the user, where the mechanism to indicate the transition may include indicating that the virtual boundary changes to a second color of light.
In some embodiments, the computing system may then determine whether gesture motion is detected. If gesture motion is detected, the computing system may perform a gesture recognition process, which may include using data from a gesture database, which may reside in memory of the computing device or may be otherwise accessed by the computing device.
If a user's gesture is recognized, the computing system may perform a function in response to the input and, if the user is within the virtual boundary, return to receiving additional gestures. In some embodiments, if the gesture is not recognized, the computing system may transition to an error state, wherein the mechanism to indicate the error state may include indicating that the virtual boundary changes to a third color of light, with the system returning to receive additional gestures if the user is within the virtual boundary for engaging with the computing system.
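Purely as an illustration, and not as part of the claimed subject matter, the engagement flow described above can be summarized as a small state machine. The states, boundary colors, and recognizer inputs in the following C sketch are hypothetical placeholders.

```c
#include <stdbool.h>

/* Hypothetical states and boundary colors for the engagement flow. */
enum engage_state { WAITING, RECEIVING_GESTURES, ERROR_STATE };
enum boundary_color { COLOR_FIRST, COLOR_SECOND, COLOR_THIRD };

struct engage_ctx {
    enum engage_state state;
    enum boundary_color color;  /* color of the projected boundary line */
};

/* Called on each sensor update; 'crossed' means the user's hand is nearer than
 * the virtual boundary, 'recognized' means the gesture matched the database. */
static void update_engagement(struct engage_ctx *ctx, bool crossed,
                              bool gesture_seen, bool recognized)
{
    if (!crossed) {                      /* user withdrew: back to waiting */
        ctx->state = WAITING;
        ctx->color = COLOR_FIRST;
        return;
    }
    if (ctx->state == WAITING) {         /* boundary crossed: accept gestures */
        ctx->state = RECEIVING_GESTURES;
        ctx->color = COLOR_SECOND;
    }
    if (gesture_seen) {
        if (recognized) {
            /* perform the mapped function, then keep receiving gestures */
            ctx->state = RECEIVING_GESTURES;
            ctx->color = COLOR_SECOND;
        } else {                         /* unrecognized gesture: error state */
            ctx->state = ERROR_STATE;
            ctx->color = COLOR_THIRD;
        }
    }
}
```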
As described above, in other embodiments, the system may be configured as a convertible tablet system that may be used in at least two different modes, a tablet mode and a notebook mode. The convertible system may have two panels, namely a display panel and a base panel, such that in the tablet mode the two panels are arranged stacked on top of each other. In the tablet mode, with the display panel facing outward, touch screen functionality as found on conventional tablet computers may be provided. In the notebook mode, the two panels may be arranged in an open clamshell configuration.
In various embodiments, the accelerometer may be a 3-axis accelerometer having a data rate of at least 50 Hz. A gyroscope, which may be a 3-axis gyroscope, may also be included. In addition, an electronic compass/magnetometer may be present. Also, one or more proximity sensors may be provided (e.g., lid open, to sense when a person is in proximity to (or not in proximity to) the system and to adjust power/performance to extend battery life). The sensor fusion capability of some OSs, including the accelerometer, gyroscope, and compass, may provide enhanced features. In addition, via a sensor hub having a Real Time Clock (RTC), a wake-from-sensors mechanism may be realized to receive sensor input while the remainder of the system is in a low power state.
In some embodiments, an internal lid/display open switch or sensor indicates when the lid is closed/open and can be used to place the system in "connected standby" or automatically wake up from the "connected standby" state. Other system sensors may include ACPI sensors for internal processor, memory, and skin temperature monitoring to enable modification of processor and system operating states based on sensed parameters.
In an embodiment, the OS may be a Windows® 8 OS with connected standby enabled (also referred to herein as Win8 CS). Windows 8 Connected Standby, or another OS with a similar state, may provide very low ultra idle power via the platform described herein to enable applications to remain connected, e.g., to a cloud-based location, at very low power consumption. The platform can support three power states, namely screen on (normal); connected standby (by default an "off" state); and shutdown (zero watts of power consumption). Thus, in the connected standby state, the platform is logically on (at a minimum power level) even though the screen is off. On such a platform, power management may be made transparent to applications and constant connectivity may be maintained, in part due to offload technology that enables the lowest-power-consuming component to perform an operation.
In one example, the PCIe physical layer may be used to support multiple different protocols. Accordingly, a particular training state in the PCIe LTSSM may be used for protocol negotiation between devices on the link. As described above, the protocol determination may occur even before the link is trained to an active state (e.g., L0) at the lowest supported data rate (e.g., the PCIe Gen 1 data rate). In one example, the PCIe Config state may be used. Indeed, modified PCIe training sets (e.g., modified TS1 and TS2 ordered sets) may be used to negotiate the protocol via the PCIe LTSSM after link width negotiation and, at least in part, in parallel with the lane numbering performed during the Config state. The protocol stack may include circuitry to support multiple protocols, such as PCIe and CXL.
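For illustration only, the negotiation decision can be sketched in C as below. The structure fields, enumeration values, and vendor-ID check are assumptions for this sketch and do not reflect the actual modified TS1/TS2 ordered set layout defined by the PCIe specification.

```c
#include <stdint.h>

/* Hypothetical protocol identifiers advertised during the Config state. */
enum alt_protocol {
    ALT_PROTO_PCIE = 0x0,
    ALT_PROTO_CXL  = 0x1,
};

/* Simplified stand-in for a modified TS1/TS2 ordered set: only the fields
 * needed to illustrate protocol negotiation are modeled here. */
struct modified_ts {
    uint8_t  usage;        /* marks the training set as carrying alternate-protocol info */
    uint8_t  protocol_id;  /* protocol the sender is willing to run on the link          */
    uint16_t vendor_id;    /* vendor defining the alternate protocol                     */
};

/* During Config, each side advertises the protocols it supports; the link
 * falls back to plain PCIe if no common alternate protocol is found. */
static enum alt_protocol negotiate_protocol(const struct modified_ts *local,
                                            const struct modified_ts *remote)
{
    if (local->protocol_id == ALT_PROTO_CXL &&
        remote->protocol_id == ALT_PROTO_CXL &&
        local->vendor_id == remote->vendor_id)
        return ALT_PROTO_CXL;
    return ALT_PROTO_PCIE;  /* default: train as a standard PCIe link */
}
```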
While the present disclosure has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the present disclosure.
A design may go through various stages, from creation to simulation to fabrication. The data representing the design may represent the design in a variety of ways. First, as is useful in simulations, the hardware may be represented using a hardware description language or another functional description language. Additionally, a circuit level model with logic and/or transistor gates may be generated at some stages of the design process. In addition, most designs, at some stage, reach a level of data representing the physical location of various devices in the hardware model. In the case where conventional semiconductor fabrication techniques are used, the data representing the hardware model may be the data specifying the presence or absence of various features on different mask layers for masks used to produce the integrated circuit. In any representation of the design, the data may be stored in any form of a machine-readable medium. A memory or magnetic or optical storage device, such as a disk, may be a machine-readable medium for storing information which is transmitted via optical or electrical waves which are modulated or otherwise generated to transmit such information. When an electrical carrier wave indicating or carrying the code or design is transmitted, to the extent that copying, buffering, or re-transmission of the electrical signal is performed, a new copy is made. Thus, a communication provider or a network provider may store, at least temporarily, an article embodying techniques of embodiments of the present disclosure, e.g., information encoded as a carrier wave, on a tangible, machine-readable medium.
A module, as used herein, refers to any combination of hardware, software, and/or firmware. As an example, a module includes hardware (e.g., a microcontroller) associated with a non-transitory medium to store code adapted to be executed by the microcontroller. Thus, in one embodiment, reference to a module refers to hardware specifically configured to identify and/or execute code to be stored on non-transitory media. Furthermore, in another implementation, the use of a module refers to a non-transitory medium including code that is specifically adapted to be executed by a microcontroller to perform predetermined operations. And as may be inferred, in yet another embodiment, the term "module" (in this example) may refer to a combination of a microcontroller and a non-transitory medium. In general, module boundaries that are shown as separate will typically vary and may overlap. For example, the first and second modules may share hardware, software, firmware, or a combination thereof, while potentially retaining some independent hardware, software, or firmware. In one embodiment, use of the term "logic" includes hardware such as transistors, registers, or other hardware such as programmable logic devices.
In one embodiment, use of the phrases "for" or "configured to" refer to devices, hardware, logic, or elements arranged, put together, manufactured, offered for sale, imported, and/or designed to perform a specified or determined task. In this example, an inoperative device or element thereof, if designed, coupled, and/or interconnected to perform the specified task, is still "configured to" perform the specified task. As merely an illustrative example, a logic gate may provide a 0 or a 1 during operation. But logic gates "configured to" provide an enable signal to the clock do not include every potential logic gate that may provide a 1 or a 0. Instead, the logic gates are coupled in such a way that the 1 or 0 output enables the clock during operation. It is again noted that the use of the term "configured to" does not require operation, but rather focuses on the underlying state of the device, hardware, and/or element in which the device, hardware, and/or element is designed to perform a particular task while the device, hardware, and/or element is operating.
Furthermore, in one embodiment, use of the phrases "capable of/for" and/or "operable to" refer to some devices, logic, hardware, and/or elements designed in a manner to enable the device, logic, hardware, and/or elements in a specified manner. As described above, in one embodiment, the use of "for," "capable of," or "operable to" refers to a potential state of a device, logic, hardware, and/or element, wherein the device, logic, hardware, and/or element is not operational, but is designed in a manner that enables the device in a specified manner.
As used herein, a value includes any known representation of a number, a state, a logic state, or a binary logic state. Typically, the use of logic levels, logic values, or logical values, also referred to as 1's and 0's, simply represents binary logic states. For example, a 1 represents a high logic level and a 0 represents a low logic level. In one embodiment, a memory cell, such as a transistor or flash memory cell, can hold a single logic value or multiple logic values. However, other representations of values in computer systems have been used. For example, the decimal number ten may also be represented as the binary value 1010 and the hexadecimal letter A. Accordingly, a value includes any representation of information that can be stored in a computer system.
Further, a state may be represented by a value or a portion of a value. As an example, a first value (e.g., a logical 1) may represent a default or initial state; while a second value (e.g., a logical 0) may represent a non-default state. Additionally, in one embodiment, the terms "reset" and "set" refer to a default value or state and an updated value or state, respectively. For example, the default value may include a high logic value, i.e., reset; and the update value may comprise a low logic value, i.e. a setting. Note that any combination of values may be used to represent any number of states.
The embodiments of methods, hardware, software, firmware, or code set forth above may be implemented via instructions or code stored on a machine-accessible, machine-readable, computer-accessible, or computer-readable medium that are executable by a processing element. A non-transitory machine-accessible/readable medium includes any mechanism that provides (i.e., stores and/or transmits) information in a form readable by a machine, such as a computer or electronic system. For example, a non-transitory machine-accessible medium includes random access memory (RAM), such as static RAM (SRAM) or dynamic RAM (DRAM); ROM; magnetic or optical storage media; flash memory devices; electrical storage devices; optical storage devices; acoustical storage devices; other forms of storage devices for holding information received from transitory (propagated) signals (e.g., carrier waves, infrared signals, digital signals); etc., which are to be distinguished from the non-transitory media that may receive information therefrom.
Instructions used to program logic to perform embodiments of the present disclosure may be stored within a memory of the system, such as DRAM, cache, flash memory, or another storage device. Furthermore, the instructions can be distributed via a network or by way of other computer-readable media. Thus, a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), including, but not limited to, floppy diskettes, optical disks, compact disc read-only memory (CD-ROM), magneto-optical disks, read-only memory (ROM), random access memory (RAM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic or optical cards, flash memory, or tangible machine-readable storage used in the transmission of information over the Internet via electrical, optical, acoustical, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Accordingly, a computer-readable medium includes any type of tangible machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (e.g., a computer).
Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In the foregoing specification, a detailed description has been given with reference to specific exemplary embodiments. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the disclosure as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. Moreover, the foregoing use of embodiment and other exemplary language does not necessarily refer to the same embodiment or the same example, but may refer to different and distinct embodiments, as well as potentially the same embodiment.
Various aspects and combinations of the embodiments are described above, some of which are represented by the following examples:
example 1 is an apparatus, comprising: an error injection register including error injection parameter information; and error injection logic circuitry to: reading error injection parameter information from an error injection register and injecting an error into a flow control unit (flit); and protocol stack circuitry to transmit flits including errors over the multi-channel link.
Example 2 may include the subject matter of example 1, wherein the error injection parameter information comprises information to activate or deactivate the error injection logic circuit.
Example 3 may include the subject matter of any of examples 1 or 2, wherein the error injection parameter information includes an indication of a number of errors to be injected into the flit; and error injection logic circuitry to inject the number of errors into the flit based on an indication of a number of errors to be injected in the error injection parameter information.
Example 4 may include the subject matter of any one of examples 1-3, wherein:
the error injection parameter information includes an indication of an interval, in terms of a number of flits, between flits that include an injected error; and error injection logic circuitry injects errors into the flits based on the interval.
Example 5 may include the subject matter of any one of examples 1-4, wherein: the error injection parameter information includes an indication of a flit type to which the error is to be injected; and error injection logic circuitry injects errors into the flits based on the flit type.
Example 6 may include the subject matter of example 5, wherein the indication of the flit type in the error injection parameter information comprises an indication that an error is to be injected into any non-idle flit, a payload flit, a No Operation (NOP) flit, or any type of flit.
Example 7 may include the subject matter of any one of examples 1-5, wherein: the error injection parameter information includes an indication of a type of error to be injected into the flit; and error injection logic circuitry to inject a type of error into the flit based on the indication of the type of error to be injected in the error injection parameter information.
Example 8 may include the subject matter of example 7, wherein the indication of the type of error may indicate that the error may be a correctable error or an uncorrectable error.
Example 9 may include the subject matter of any of examples 1-8, wherein the indication of the type of error can indicate that the error is to be a correctable error in one forward error correction group or a correctable error in three forward error correction groups.
Example 10 may include the subject matter of any of examples 1-9, wherein the indication of the type of error can indicate that the error is to be randomly selected between a correctable error and an uncorrectable error.
Example 11 may include the subject matter of any of examples 1-10, wherein the error injection parameter information includes an indication of a magnitude of an error to be injected into the flit; and error injection logic circuitry to inject an error having a magnitude to be injected in the error injection parameter information into the flit based on an indication of the magnitude of the error.
Example 12 may include the subject matter of any one of examples 1-11, further comprising a port to receive a flit; wherein: the error injection parameter information includes an indication to inject a Negative Acknowledgement (NAK) into the received flit; and error injection logic injects NAKs into the received flits.
Example 13 may include the subject matter of any one of examples 1-12, further comprising a port to receive a flit; wherein: the error injection parameter information includes an indication of a replay of the received flit; and protocol stack circuitry causes the flit to be retransmitted.
Example 14 may include the subject matter of any one of examples 1-13, further comprising: an ordered set error register comprising ordered set error injection parameter information; an ordered set error injection circuit for: determining activated ordered set error injection, reading ordered set error injection parameter information, and injecting errors into the ordered set; and protocol stack circuitry to transmit the ordered set with the error over the multi-channel link.
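The error injection parameters enumerated in examples 1 through 11 can be pictured, purely as a sketch, as fields of a control register consumed by the injection logic. All field names, field widths, and the assumed 256-byte flit size below are hypothetical and not taken from the disclosure.

```c
#include <stdint.h>

/* Hypothetical layout of an error injection register. */
struct err_inject_reg {
    uint8_t  enable;        /* activate/deactivate the error injection logic    */
    uint8_t  num_errors;    /* number of bit errors to inject per flit          */
    uint16_t flit_interval; /* gap, in flits, between flits carrying errors     */
    uint8_t  flit_type;     /* 0: any non-idle, 1: payload, 2: NOP, 3: any flit */
    uint8_t  error_type;    /* 0: correctable, 1: uncorrectable, 2: random      */
    uint16_t magnitude;     /* e.g., spread/burst length of the injected error  */
    uint16_t offset;        /* starting bit offset within the flit              */
};

#define FLIT_BYTES 256  /* assumed flit size, for illustration only */

/* Flip 'num_errors' bits starting at 'offset'; a real implementation would
 * place errors per error_type (e.g., within one or three FEC groups). */
static void inject_errors(uint8_t flit[FLIT_BYTES], const struct err_inject_reg *r)
{
    if (!r->enable)
        return;
    uint32_t step = r->magnitude ? r->magnitude : 1;  /* avoid re-flipping the same bit */
    for (uint8_t i = 0; i < r->num_errors; i++) {
        uint32_t bit = (r->offset + i * step) % (FLIT_BYTES * 8u);
        flit[bit / 8] ^= (uint8_t)(1u << (bit % 8));  /* corrupt one bit of the flit */
    }
}
```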
Example 15 is a method, comprising: encoding, by protocol stack circuitry, a flow control unit (flit); encoding, by an error injection circuit, an error into the flit based on an error injection parameter stored in an error injection register; and transmitting the flit with the error over a multi-channel link.
Example 16 may include the subject matter of example 15, further comprising: calculating an error correction code for the flit prior to encoding the error into the flit; and encoding the flit with an error correction code prior to encoding the error into the flit.
Example 17 may include the subject matter of any one of examples 15 or 16, further comprising: calculating a cyclic redundancy check code of the flit before encoding the error into the flit; and encoding the flit with a cyclic redundancy check code of the flit before encoding the error into the flit.
Example 18 may include the subject matter of any one of examples 15-17, further comprising: determining error injection parameters from an error injection parameter register; errors are encoded into the flits based on error injection parameters.
Example 19 may include the subject matter of example 18, wherein the error injection parameters include one or more of: error magnitude, number of errors, flit spacing between error occurrences, flit type, type of error being encoded, or error offset.
Example 20 may include the subject matter of any one of examples 15-19, further comprising: receiving a negative acknowledgement message for a previously transmitted flit; and replaying previously transmitted flits.
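As a sketch of the ordering implied by examples 15 through 19, the snippet below reuses struct err_inject_reg, FLIT_BYTES, and inject_errors() from the previous sketch; the byte-wise XOR stands in for a real CRC, and FEC encoding is only marked by a comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy check code standing in for the link's real CRC; a real implementation
 * would use the CRC polynomial and FEC code defined for the link. */
static uint8_t toy_checksum(const uint8_t *data, size_t len)
{
    uint8_t c = 0;
    for (size_t i = 0; i < len; i++)
        c ^= data[i];
    return c;
}

/* Examples 15-19: protect the flit first (check code, then FEC symbols), and
 * inject the error afterward so that the receiver's detection and correction
 * logic, not the transmitter's encoders, sees the corruption. */
static void prepare_flit_for_error_test(uint8_t flit[FLIT_BYTES],
                                        const struct err_inject_reg *r)
{
    flit[FLIT_BYTES - 1] = toy_checksum(flit, FLIT_BYTES - 1); /* 1. check code   */
    /* 2. FEC check symbols would be computed and appended here                   */
    inject_errors(flit, r);                                    /* 3. corrupt bits */
    /* 4. hand the flit to the protocol stack for transmission over the link      */
}
```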
Example 21 is a system, comprising: a host device comprising a transmitter for transmitting a flow control unit (flit) over a multi-channel link, the host device comprising: an error injection register including error injection parameter information; and error injection logic circuitry to: reading error injection parameter information from an error injection register and injecting an error into the flit; and a protocol stack circuit for transmitting flits including errors over the multi-channel link; an endpoint device comprising a receiver port for receiving a flit, the receiver port comprising: error detection circuitry for detecting errors in the flit, and an error log for storing information relating to the errors in the flit.
Example 22 may include the subject matter of example 21, further comprising protocol stack circuitry to: calculating an error correction code for the flit prior to encoding the error into the flit; and encoding the flit with an error correction code prior to encoding the error into the flit.
Example 23 may include the subject matter of any one of examples 21 or 22, further comprising protocol stack circuitry to: calculating a cyclic redundancy check code of the flit before encoding the error into the flit; and encoding the flit with a cyclic redundancy check code of the flit before encoding the error into the flit.
Example 24 may include the subject matter of any one of examples 21-23, the error injection circuitry to: determining error injection parameters from an error injection parameter register; and encoding the error into the flit based on the error injection parameters.
Example 25 may include the subject matter of example 24, wherein the error injection parameters include one or more of: error magnitude, number of errors, flit spacing between error occurrences, flit type, error type encoded, or error offset.
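On the receive side of the system of examples 21 through 25, the following sketch shows one way an error log could be kept for validation software to read back and compare against the errors the transmitter was programmed to inject. The record fields and log depth are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical per-event record that the endpoint might store when a
 * corrupted flit is detected. */
struct flit_error_record {
    uint64_t flit_seq;      /* sequence number of the offending flit     */
    uint8_t  corrected;     /* 1 if FEC corrected the error, 0 otherwise */
    uint8_t  fec_groups;    /* number of FEC groups that saw errors      */
    uint16_t crc_status;    /* 0 = CRC passed, nonzero = CRC failed      */
};

#define ERR_LOG_DEPTH 64

struct err_log {
    struct flit_error_record entries[ERR_LOG_DEPTH];
    uint32_t head;  /* next slot to write; wraps around */
};

/* Record a detected error; validation software later reads the log and
 * correlates it with the injection parameters programmed at the transmitter. */
static void log_flit_error(struct err_log *log, const struct flit_error_record *rec)
{
    log->entries[log->head % ERR_LOG_DEPTH] = *rec;
    log->head++;
}
```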

Claims (25)

1. An apparatus, comprising:
an error injection register including error injection parameter information; and
error injection logic circuitry to:
reading error injection parameter information from the error injection register, and
injecting an error into a flow control unit (flit); and
protocol stack circuitry to transmit the flit including the error over a multi-channel link.
2. The apparatus of claim 1, wherein the error injection parameter information comprises information for activating or deactivating the error injection logic circuit.
3. The apparatus of any one of claims 1 or 2, wherein:
the error injection parameter information comprises an indication of a number of errors to be injected into the flit; and
the error injection logic circuitry injects the number of errors into the flit based on an indication of a number of errors to inject in the error injection parameter information.
4. The apparatus of any one of claims 1-3, wherein:
the error injection parameter information includes an indication of an interval, in terms of a number of flits, between flits that are to include an injected error; and
the error injection logic circuit injects an error into the flit based on the interval.
5. The apparatus of any one of claims 1-4, wherein:
the error injection parameter information comprises an indication of a flit type to which the error is to be injected; and
the error injection logic circuit injects an error into a flit based on the flit type.
6. The apparatus of claim 5, wherein the indication of a flit type in the error injection parameter information comprises an indication that an error is to be injected into any non-idle flit, a payload flit, a No Operation (NOP) flit, or any type of flit.
7. The apparatus of any one of claims 1-5, wherein:
the error injection parameter information comprises an indication of a type of error to be injected into the flit; and
the error injection logic circuitry injects the type of error into the flit based on the indication of the type of error to be injected in the error injection parameter information.
8. The apparatus of claim 7, wherein the indication of the type of error can indicate that the error can be a correctable error or an uncorrectable error.
9. The apparatus according to any of claims 1-8, wherein the indication of the type of error can indicate that the error is to be a correctable error in one forward error correction group or a correctable error in three forward error correction groups.
10. The apparatus of any of claims 1-9, wherein the indication of the type of error is capable of indicating that the error is to be randomly selected between a correctable error and an uncorrectable error.
11. The apparatus of any one of claims 1-10, wherein:
the error injection parameter information comprises an indication of a magnitude of an error to be injected into the flit; and
the error injection logic circuitry injects an error having the magnitude into the flit based on the indication of the magnitude of the error to be injected in the error injection parameter information.
12. The apparatus of any one of claims 1-11, further comprising a port for receiving a flit;
wherein:
the error injection parameter information comprises an indication to inject a Negative Acknowledgement (NAK) into a received flit; and
the error injection logic injects NAKs into the received flits.
13. The apparatus of any one of claims 1-12, further comprising a port for receiving a flit;
wherein:
the error injection parameter information comprises an indication of a replay of a received flit; and
the protocol stack circuitry causes the flit to be retransmitted.
14. The apparatus of any of claims 1-13, further comprising:
an ordered set error register comprising ordered set error injection parameter information;
an ordered set error injection circuit for:
determining that ordered set error injection has been activated,
reading the ordered set error injection parameter information, and
injecting errors into the ordered set; and
the protocol stack circuitry to transmit the ordered set with the error over a multi-channel link.
15. A method, comprising:
encoding a flow control unit (flit) by protocol stack circuitry;
encoding, by an error injection circuit, an error into the flit based on an error injection parameter stored in an error injection register;
transmitting the flit with the error over a multi-channel link.
16. The method of claim 15, further comprising:
calculating an error correction code for the flit prior to encoding the error into the flit; and
encoding the flit with an error correction code prior to encoding the error into the flit.
17. The method of any of claims 15 or 16, further comprising:
calculating a cyclic redundancy check code for the flit prior to encoding the error into the flit; and
encoding the flit with a cyclic redundancy check code of the flit prior to encoding the error into the flit.
18. The method according to any one of claims 15-17, further comprising:
determining error injection parameters from an error injection parameter register;
encoding the error into the flit based on the error injection parameters.
19. The method of claim 18, wherein the error injection parameters include one or more of: error magnitude, number of errors, flit spacing between error occurrences, flit type, type of error being encoded, or error offset.
20. The method according to any one of claims 15-19, further comprising:
receiving a negative acknowledgement message for a previously transmitted flit; and
replaying the previously transmitted flit.
21. A system, comprising:
a host device comprising a transmitter for transmitting a flow control unit (flit) over a multi-channel link, the host device comprising:
an error injection register including error injection parameter information; and
error injection logic circuitry to:
reading error injection parameter information from the error injection register, and
injecting an error into the flit; and
protocol stack circuitry to transmit the flit including the error over a multi-channel link; an endpoint device comprising a receiver port for receiving the flit, the receiver port comprising:
an error detection circuit for detecting the error in the flit, and
an error log for storing information about the error in the flit.
22. The system of claim 21, further comprising protocol stack circuitry to:
calculating an error correction code for the flit prior to encoding the error into the flit; and
encoding the flit with the error correction code prior to encoding the error into the flit.
23. The system of any of claims 21 or 22, further comprising protocol stack circuitry to:
calculating a cyclic redundancy check code for the flit prior to encoding the error into the flit; and
encoding the flit with a cyclic redundancy check code of the flit prior to encoding the error into the flit.
24. The system of any of claims 21-23, the error injection circuitry to:
determining error injection parameters from an error injection parameter register;
encoding the error into the flit based on the error injection parameters.
25. The system of claim 24, wherein the error injection parameters include one or more of: error magnitude, number of errors, flit spacing between error occurrences, flit type, error type encoded, or error offset.
CN202011566305.0A 2020-07-27 2020-12-25 In-system verification of interconnects by error injection and measurement Pending CN113986624A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202063057168P 2020-07-27 2020-07-27
US63/057,168 2020-07-27
US17/115,168 US20210089418A1 (en) 2020-07-27 2020-12-08 In-system validation of interconnects by error injection and measurement
US17/115,168 2020-12-08

Publications (1)

Publication Number Publication Date
CN113986624A true CN113986624A (en) 2022-01-28

Family

ID=74879922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011566305.0A Pending CN113986624A (en) 2020-07-27 2020-12-25 In-system verification of interconnects by error injection and measurement

Country Status (3)

Country Link
US (1) US20210089418A1 (en)
EP (1) EP3945688A1 (en)
CN (1) CN113986624A (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11397701B2 (en) * 2019-04-30 2022-07-26 Intel Corporation Retimer mechanisms for in-band link management
US11449403B2 (en) * 2019-10-09 2022-09-20 Honeywell International Inc. Apparatus and method for diagnosing faults in a fieldbus interface module
GB2596872B (en) * 2020-07-10 2022-12-14 Graphcore Ltd Handling injected instructions in a processor
US11836059B1 (en) 2020-12-14 2023-12-05 Sanblaze Technology, Inc. System and method for testing non-volatile memory express storage devices
US20220391524A1 (en) * 2021-06-07 2022-12-08 Infineon Technologies Ag Interconnection of protected information between components
US11461205B1 (en) * 2021-08-24 2022-10-04 Nxp B.V. Error management system for system-on-chip
CN113726609B (en) * 2021-08-31 2022-09-20 北京百度网讯科技有限公司 System test method, apparatus, electronic device, and medium

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7010607B1 (en) * 1999-09-15 2006-03-07 Hewlett-Packard Development Company, L.P. Method for training a communication link between ports to correct for errors
US9432298B1 (en) * 2011-12-09 2016-08-30 P4tents1, LLC System, method, and computer program product for improving memory systems
US9384108B2 (en) * 2012-12-04 2016-07-05 International Business Machines Corporation Functional built-in self test for a chip
US11009550B2 (en) * 2013-02-21 2021-05-18 Advantest Corporation Test architecture with an FPGA based test board to simulate a DUT or end-point
GB2520268A (en) * 2013-11-13 2015-05-20 Ibm Injecting lost packets and protocol errors in a simulation environment
US9692589B2 (en) * 2015-07-17 2017-06-27 Intel Corporation Redriver link testing
US10139445B2 (en) * 2016-09-30 2018-11-27 Intel Corporation High speed I/O pinless structural testing
US10261880B1 (en) * 2016-12-19 2019-04-16 Amazon Technologies, Inc. Error generation using a computer add-in card
US11093673B2 (en) * 2016-12-22 2021-08-17 Synopsys, Inc. Three-dimensional NoC reliability evaluation
US10625752B2 (en) * 2017-12-12 2020-04-21 Qualcomm Incorporated System and method for online functional testing for error-correcting code function
US10853212B2 (en) * 2018-01-08 2020-12-01 Intel Corporation Cross-talk generation in a multi-lane link during lane testing
US11094392B2 (en) * 2018-10-15 2021-08-17 Texas Instruments Incorporated Testing of fault detection circuit
US11323348B2 (en) * 2019-05-17 2022-05-03 Citrix Systems, Inc. API dependency error and latency injection
US20200366573A1 (en) * 2019-05-17 2020-11-19 Citrix Systems, Inc. Systems and methods for visualizing dependency experiments
US20210050941A1 (en) * 2020-07-06 2021-02-18 Intel Corporation Characterizing and margining multi-voltage signal encoding for interconnects

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117498991A (en) * 2023-11-14 2024-02-02 无锡众星微系统技术有限公司 Testability fault injection method and device based on retransmission function prototype device
CN117498991B (en) * 2023-11-14 2024-05-28 无锡众星微系统技术有限公司 Testability fault injection method and device based on retransmission function prototype device

Also Published As

Publication number Publication date
US20210089418A1 (en) 2021-03-25
EP3945688A1 (en) 2022-02-02

Similar Documents

Publication Publication Date Title
US11740958B2 (en) Multi-protocol support on common physical layer
US11595318B2 (en) Ordered sets for high-speed interconnects
US11397701B2 (en) Retimer mechanisms for in-band link management
US20210050941A1 (en) Characterizing and margining multi-voltage signal encoding for interconnects
CN113986624A (en) In-system verification of interconnects by error injection and measurement
JP6251806B2 (en) Apparatus, method, program, system, and computer-readable storage medium
US11637657B2 (en) Low-latency forward error correction for high-speed serial links
US11886312B2 (en) Characterizing error correlation based on error logging for computer buses
KR20210065834A (en) Partial link width states for bidirectional multilane links
US20210013999A1 (en) Latency-Optimized Mechanisms for Handling Errors or Mis-Routed Packets for Computer Buses
JP2019192287A (en) Apparatus, method, program, system, and computer readable storage medium
JP2018049658A (en) Devices, methods and systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination