CN107636620B - Link retraining based on runtime performance characteristics - Google Patents

Publication number
CN107636620B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201580045878.XA
Other languages
Chinese (zh)
Other versions
CN107636620A (en)
Inventor
J·贝尔特
D·M·赫勒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3027 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/20 Cooling means
    • G06F1/206 Cooling means comprising thermal management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206 Monitoring of events, devices or parameters that trigger a change in power modality
    • G06F1/3228 Monitoring task completion, e.g. by use of idle timers, stop commands or wait commands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/3296 Power saving characterised by the action undertaken by lowering the supply or operating voltage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/349 Performance evaluation by tracing or monitoring for interfaces, buses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3058 Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Systems and methods may provide for monitoring one or more runtime performance characteristics of a link and determining a condition of the link based on at least one of the one or more runtime performance characteristics. In addition, retraining of the link may be automatically scheduled based on the condition of the link. In one example, scheduling the retraining of the link further comprises setting one or more retraining parameters.

Description

Link retraining based on runtime performance characteristics
Cross reference to related applications
This application claims priority to U.S. non-provisional patent application No. 14/497,499, filed on September 26, 2014.
Technical Field
Embodiments generally relate to link management. More particularly, embodiments relate to link retraining based on runtime performance characteristics.
Background
A communication bus may be used to transfer information between components of a computing system in a wide variety of settings. Bus training may include configuring transmitters, receivers, and other bus components to have the proper voltage and/or timing settings so that data is properly detected and interpreted at the receiving end of the bus. Typically, the bus may be trained only once in a manufacturing environment (e.g., a factory) before the computing system is shipped to the retailer and/or end user. However, the operating environment of the bus may differ significantly from the manufacturing environment in terms of thermal conditions, power consumption, and so forth, particularly as the components degrade over time. This mismatch between the training conditions of the manufacturing environment and the operating conditions of the operating environment may therefore result in the conventional bus operating in a suboptimal configuration.
Drawings
Various advantages of the embodiments will become apparent to those skilled in the art from a reading of the following specification and appended claims, and from a reference to the following drawings, in which:
FIG. 1 is a block diagram of an example of a retraining approach according to an embodiment;
FIG. 2 is a flow diagram of an example of a method of managing a link in accordance with one embodiment;
FIG. 3 is a flow diagram of an example of a method of automatically scheduling retraining of a link in accordance with an embodiment;
FIG. 4 is a block diagram of an example of a link manager according to an embodiment;
FIG. 5 is a block diagram of an example of a processor according to an embodiment; and
FIG. 6 is a block diagram of an example of a computing system in accordance with an embodiment.
Detailed Description
Turning now to FIG. 1, a retraining approach is illustrated in which a link 10 enables the transfer of information between a first system component 12 and a second system component 14. The link 10 may be physically attached to the system components 12, 14, or may be part of a discrete/remote subsystem that is logically reachable/programmable by the system components 12, 14. The link 10 may include, for example, a memory bus, a processor-to-processor bus, an input/output (IO) bus (e.g., serial, parallel), and so forth. For example, the link 10 may support communication between the system components 12, 14 via PCI-e (Peripheral Component Interconnect Express, e.g., PCI Express x16 Graphics 150W-ATX Specification 1.0, PCI Special Interest Group), MIPI (Mobile Industry Processor Interface), or another suitable protocol. Accordingly, the system components 12, 14 may include, for example, processors (e.g., graphics, host, and/or central processing units/CPUs), chipsets, IO modules, memory devices and/or controllers, network controllers, and/or other subsystems.
In the illustrated example, link manager 19 monitors one or more runtime performance characteristics 16 in real-time, such as, for example, error status (e.g., correctable, uncorrectable), bandwidth status, retransmission status (e.g., packet loss), power consumption status, and/or thermal status information for link 10, and uses characteristics 16 to determine a retraining policy 18 for link 10. Thus, rather than training the link 10 only once in a manufacturing environment, the illustrated approach enables the link 10 to be selectively retrained throughout the life of the link 10. Such an approach may enable the link 10 to automatically adapt to degradation of components over time and other operational considerations in order to maintain an optimal performance profile.
Turning now to FIG. 2, a method 20 of managing a link is shown. The method 20 may be implemented as a module or related component in a set of logic instructions stored in a machine- or computer-readable storage medium such as random access memory (RAM), read only memory (ROM), programmable ROM (PROM), firmware, flash memory, etc., in configurable logic such as, for example, programmable logic arrays (PLAs), field programmable gate arrays (FPGAs), or complex programmable logic devices (CPLDs), in fixed-functionality hardware logic using circuit technology such as, for example, application specific integrated circuit (ASIC), complementary metal oxide semiconductor (CMOS), or transistor-transistor logic (TTL) technology, or any combination thereof. For example, computer program code to carry out the operations shown in method 20 may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, ACPI Source Language (ASL), or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
Illustrated processing block 22 performs initial link training. Initial link training may occur during an initial boot sequence of the system containing the link, and may include, for example, iteratively wave shaping, equalizing, and/or establishing an operating margin for the link with respect to voltage, timing, and so forth. Thus, the transmitter associated with the link may be configured at block 22 with the appropriate parameters to drive digital voltages to levels, and at moments in time, that produce detectable voltage transitions and minimize errors at the receiver. Of particular note, however, is that because initial link training may occur outside the actual operating environment, the initial training parameters may be sub-optimal due to changes in thermal conditions, degradation, humidity, noise sources, mechanical shock, etc., as well as inaccurate initial training. Indeed, these changes may even be seasonal or location dependent (e.g., a student moving a notebook computer from school to home; a server in a high-density environment that is trained during a cold boot on a cold day but operates on a hot day, etc.).
Accordingly, the illustrated block 24 provides for monitoring one or more runtime performance characteristics of the link. As already noted, the runtime performance characteristics may include, for example, error status (e.g., correctable, uncorrectable), bandwidth status, retransmission status (e.g., packet loss), signal pair margin status, power consumption status, and/or thermal status information for the link and/or system. Monitoring may thus include automatically accessing error status registers, link status registers, etc., as well as detecting various runtime events and/or interrupts. The condition of the link may be determined at block 26 based on at least one of the one or more runtime performance characteristics. Thus, block 26 may include quantifying and/or classifying the number of uncorrectable errors, link bandwidth, packet loss, amount of power consumed by the link component, operating temperature of the link component, and so forth. Block 28 is shown to automatically schedule retraining of the link based on the condition of the link. As will be discussed in more detail, block 28 may use link conditions to set retraining parameters to optimize, for example, operating margins, power consumption, etc., depending on the context. Further, the retraining parameters may be designed to address specific runtime performance characteristics associated with a change in the condition of the link. In one example, the hardware monitors runtime performance characteristics and sets a "degradation flag" for the system software to consume/interpret/trigger retraining. Other distributions of the functionality of the method 20 between hardware and software may also be used depending on the context.
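The hardware/software split described above, in which hardware sets a "degradation flag" and system software consumes it, might look roughly like the following. This is a minimal sketch under stated assumptions, not the patented implementation: the `UNCORRECTABLE_BIT` mask, the `LinkState` class, and both function names are hypothetical, and a real system would read an actual error status register rather than an integer argument.

```python
UNCORRECTABLE_BIT = 0x1  # hypothetical bit in an error status register


class LinkState:
    """State shared between the simulated hardware monitor and system software."""

    def __init__(self):
        self.degradation_flag = False
        self.retraining_scheduled = False


def hardware_monitor(state, error_status_register):
    """Blocks 24 and 26: inspect an error status value and set the flag."""
    if error_status_register & UNCORRECTABLE_BIT:
        state.degradation_flag = True


def software_poll(state):
    """Block 28: consume the flag and schedule retraining if it was set."""
    if state.degradation_flag:
        state.degradation_flag = False  # the flag is cleared once acted upon
        state.retraining_scheduled = True
    return state.retraining_scheduled
```

Calling `hardware_monitor(state, 0x1)` followed by `software_poll(state)` marks the link for retraining; with a status value of `0x0` nothing is scheduled.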
Fig. 3 illustrates a method 30 of automatically scheduling retraining of a link. The method 30 may readily replace the block 28 (fig. 2) already discussed. The method 30 may also be implemented as a module or related component in a set of logic instructions stored in a machine or computer readable storage medium such as RAM, ROM, PROM, firmware, flash memory, etc., in configurable logic such as, for example, PLA, FPGA, CPLD, in fixed functionality hardware logic using circuit technology such as, for example, ASIC, CMOS or TTL technology, or any combination thereof.
Block 32 is shown determining whether the condition of the link satisfies a degradation condition. Thus, block 32 may include determining whether the error (e.g., correctable, uncorrectable) rate or number thereof has reached a particular threshold, whether the link bandwidth has dropped below a particular threshold, whether packet loss has increased significantly, whether the link component is consuming an abnormally high amount of power (e.g., a peripheral card is near the end of life), whether the link component is overheating, and so forth. If the degradation condition is met, the illustrated block 34 provides for configuring one or more retraining parameters to increase an operating margin of the link, where increasing the operating margin may result in greater immunity to noise.
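The checks listed for block 32 can be illustrated with a simple threshold predicate. The sketch below is only an illustration: the field names, the `Thresholds` defaults, and the use of a fixed packet-loss ratio as a stand-in for "increased significantly" are all assumptions, since the description does not specify concrete limits.

```python
from dataclasses import dataclass


@dataclass
class LinkSample:
    """One snapshot of runtime characteristics (field names are hypothetical)."""
    error_count: int
    bandwidth_gbps: float
    packet_loss_ratio: float
    power_watts: float
    temperature_c: float


@dataclass
class Thresholds:
    """Illustrative limits; real values would be platform specific."""
    max_errors: int = 10
    min_bandwidth_gbps: float = 8.0
    max_packet_loss_ratio: float = 0.01
    max_power_watts: float = 25.0
    max_temperature_c: float = 85.0


def degradation_reasons(sample, limits):
    """Return the names of the block 32 checks that the sample fails."""
    reasons = []
    if sample.error_count >= limits.max_errors:
        reasons.append("errors")
    if sample.bandwidth_gbps < limits.min_bandwidth_gbps:
        reasons.append("bandwidth")
    if sample.packet_loss_ratio > limits.max_packet_loss_ratio:
        reasons.append("packet_loss")
    if sample.power_watts > limits.max_power_watts:
        reasons.append("power")
    if sample.temperature_c > limits.max_temperature_c:
        reasons.append("thermal")
    return reasons


def is_degraded(sample, limits=None):
    """The degradation condition is met if any individual check fails."""
    return bool(degradation_reasons(sample, limits or Thresholds()))
```

Returning the list of failed checks, rather than a bare boolean, would let a scheduler tailor the retraining parameters to the specific characteristic that changed.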
The operating margin may be a voltage margin, timing margin, or the like, that defines a waveform for signals transmitted on the link. For example, if the expected voltage is 0V-0.4V for "low" and 4.6V-5V for "high" and the initial training of the link establishes a voltage margin of 0.1V (i.e., the receiver observes signals at 0.3V and 4.7V), block 34 may include increasing the voltage margin to 0.2V (i.e., the receiver observes signals at 0.2V and 4.8V), which may effectively tighten the constraints placed on the transmitter, reduce the likelihood of link transmission errors, increase noise immunity, and improve reliability. In another example, if the initial training of the link establishes a timing margin of 100ns, block 34 may include increasing the timing margin to 200ns, which may also tighten the constraints placed on the transmitter, reduce the likelihood of link transmission errors, increase jitter immunity, and improve reliability. The values provided herein are merely to facilitate discussion and may vary depending on the context. Further, other types of retraining parameters and/or operating margins may be used. Once the retraining parameters have been configured, the illustrated block 36 selects the next opportunity for retraining the link (e.g., the next boot sequence, or nearly instantaneously without requiring a system reset) and schedules the link to be retrained at the selected opportunity.
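The voltage-margin arithmetic in the example above can be checked with a short sketch. The function name and the choice of integer millivolts (to avoid floating-point rounding) are assumptions made for illustration; the values are the ones given in the text.

```python
def observed_levels_mv(low_max_mv, high_min_mv, margin_mv):
    """Return the (low, high) signal levels the receiver observes.

    low_max_mv is the top of the expected "low" range and high_min_mv is the
    bottom of the expected "high" range. The margin moves each observed level
    further inside its range, away from the boundary.
    """
    return (low_max_mv - margin_mv, high_min_mv + margin_mv)


# Expected ranges from the text: 0V-0.4V for "low", 4.6V-5V for "high".
initial = observed_levels_mv(400, 4600, 100)    # 0.1V margin
retrained = observed_levels_mv(400, 4600, 200)  # 0.2V margin
```

With these inputs, `initial` is `(300, 4700)` and `retrained` is `(200, 4800)`, matching the 0.3V/4.7V and 0.2V/4.8V figures in the description.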
If it is determined at block 32 that the condition of the link does not satisfy the degradation condition, illustrated block 38 determines whether the link may be optimized for power. In this regard, a link that is operating properly may present an opportunity to relax the operating margin and reduce power consumption. Thus, block 38 may include determining whether link degradation has not been detected for a certain amount of time. If so, illustrated block 40 configures one or more retraining parameters to allow and/or trigger a reduction in the operating margin of the link. For example, if the initial training of the link established a voltage margin of 0.2V, block 40 may include reducing the voltage margin to 0.1V, which may effectively relax the constraints placed on the transmitter and enable lower power operation.
Accordingly, the illustrated block 42 initiates one or more power reduction operations with respect to the link, wherein the reduced operating margin may facilitate and/or enable the power reduction operations. Block 42 may include, for example, bypassing and/or powering down link components such as amplifiers, control loops, and the like. Further, power reduction operations may occur during training/equalization (e.g., setting retraining parameters to meet lower power targets established at runtime) as well as during runtime operation. Additionally, if degradation of the link is subsequently detected, any power reduction operations initiated at block 42 may be reversed/cancelled at block 34. As already noted, block 36 may provide for selecting a next opportunity for retraining the link and scheduling the link to be retrained at the selected opportunity.
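The two branches of method 30 can be summarized in a small decision function. This is a sketch under several assumptions that do not come from the description: the `RetrainPlan` type, the fixed `step_mv` adjustment, the `min_margin_mv` floor, the one-hour quiet period, and the choice to schedule nothing when neither branch applies.

```python
from dataclasses import dataclass


@dataclass
class RetrainPlan:
    margin_mv: int         # operating margin to use when retraining
    power_reduction: bool  # whether power reduction operations are active
    retrain_at: str        # the selected retraining opportunity


def plan_retraining(degraded, quiet_seconds, margin_mv, step_mv=100,
                    min_margin_mv=100, quiet_required=3600,
                    next_opportunity="next boot"):
    """Sketch of blocks 32-42. Returns None when no retraining is scheduled."""
    if degraded:
        # Block 34: widen the margin and cancel any earlier power reduction.
        return RetrainPlan(margin_mv + step_mv, False, next_opportunity)
    # Block 38: optimize for power only after a degradation-free period,
    # and never drop below the assumed minimum margin.
    can_optimize = (quiet_seconds >= quiet_required
                    and margin_mv - step_mv >= min_margin_mv)
    if can_optimize:
        # Blocks 40 and 42: relax the margin and start power reduction.
        return RetrainPlan(margin_mv - step_mv, True, next_opportunity)
    return None
```

For example, `plan_retraining(False, 7200, 200)` relaxes a 0.2V margin to 0.1V and enables power reduction, while `plan_retraining(True, 0, 100)` widens a 0.1V margin to 0.2V and disables it.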
FIG. 4 shows a link manager 44 (44a-44c). Link manager 44 may be a device (e.g., comprising fixed-functionality hardware logic, configurable logic, logic instructions, or any combination thereof), an operating system driver, and/or system firmware (e.g., basic input/output system/BIOS) that implements one or more aspects of the method 20 (FIG. 2) and/or the method 30 (FIG. 3), already discussed. In the illustrated example, performance monitor 44a monitors one or more runtime performance characteristics of the link. Degradation detector 44b may be coupled to performance monitor 44a, where the illustrated degradation detector 44b determines a condition of the link based on at least one of the one or more runtime performance characteristics. In addition, a training scheduler 44c coupled to the degradation detector 44b may automatically schedule retraining of the link based on the condition of the link.
In one example, the training scheduler 44c includes a parameter optimizer 46 (46a, 46b) to set one or more retraining parameters. Thus, the parameter optimizer 46 may include a margin component 46a to configure at least one of the one or more retraining parameters to increase the operating margin of the link if the condition satisfies the degradation condition. Alternatively, the margin component 46a may configure at least one of the one or more retraining parameters to allow the operating margin to be reduced if the condition does not satisfy the degradation condition. In such a case, power component 46b may initiate one or more power reduction operations with respect to the link.
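The coupling between the three components of link manager 44 can be sketched as plain classes. Everything below is hypothetical scaffolding for illustration: the class and method names, the single error-count threshold, and the replayed list of samples are assumptions, and only the margin-increase path of the parameter optimizer is modeled.

```python
class PerformanceMonitor:
    """Stand-in for performance monitor 44a: replays recorded error counts."""

    def __init__(self, error_counts):
        self._counts = iter(error_counts)

    def read(self):
        return next(self._counts)


class DegradationDetector:
    """Stand-in for degradation detector 44b: one error-count threshold."""

    def __init__(self, max_errors):
        self.max_errors = max_errors

    def is_degraded(self, error_count):
        return error_count > self.max_errors


class TrainingScheduler:
    """Stand-in for training scheduler 44c with a trivial margin component."""

    def __init__(self, margin_mv, step_mv=100):
        self.margin_mv = margin_mv
        self.step_mv = step_mv
        self.pending = []  # margins queued for the next retraining opportunity

    def schedule(self, degraded):
        if degraded:
            self.margin_mv += self.step_mv
            self.pending.append(self.margin_mv)


class LinkManager:
    """Wires monitor -> detector -> scheduler in the order described."""

    def __init__(self, monitor, detector, scheduler):
        self.monitor = monitor
        self.detector = detector
        self.scheduler = scheduler

    def step(self):
        degraded = self.detector.is_degraded(self.monitor.read())
        self.scheduler.schedule(degraded)
        return degraded
```

Keeping the three roles as separate objects mirrors the note that the same functionality may be distributed differently between hardware, drivers, and firmware.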
Fig. 5 illustrates a processor core 200 in accordance with one implementation. Processor core 200 may be a core for any type of processor, such as a microprocessor, an embedded processor, a Digital Signal Processor (DSP), a network processor, or other device to execute code. Although only one processor core 200 is shown in fig. 5, a processing element may alternatively include more than one processor core 200 shown in fig. 5. Processor core 200 may be a single-threaded core, or for at least one embodiment, processor core 200 may be multithreaded in that it may include more than one hardware thread context (or "logical processor") per core.
FIG. 5 also shows a memory 270 coupled to processor core 200. Memory 270 may be any of a wide variety of memories (including various layers of a memory hierarchy) known or otherwise available to those skilled in the art. Memory 270 may include one or more instructions of code 213 to be executed by processor core 200, where code 213 may implement the link manager 44 (FIG. 4), the method 20 (FIG. 2), and/or the method 30 (FIG. 3), already discussed. Processor core 200 follows a program sequence of instructions indicated by code 213. Each instruction may enter a front-end portion 210 and be processed by one or more decoders 220. Decoder 220 may generate as its output a micro-operation, such as a fixed-width micro-operation in a predefined format, or may generate other instructions, microinstructions, or control signals that reflect the original code instruction. The illustrated front end 210 also includes register renaming logic 225 and scheduling logic 230, which generally allocate resources and queue the operations corresponding to the converted instructions for execution.
Processor core 200 is shown to include execution logic 250, execution logic 250 having a set of execution units 255-1 through 255-N. Some embodiments may include multiple execution units dedicated to a particular function or set of functions. Other embodiments may include only one execution unit or one execution unit capable of performing a particular function. The illustrated execution logic 250 performs the operations specified by the code instructions.
After completing execution of the operation specified by the code instruction, back-end logic 260 retires the instruction of code 213. In one embodiment, processor core 200 allows out-of-order execution but requires in-order retirement of instructions. Retirement logic 265 may take various forms known to those skilled in the art (e.g., a reorder buffer or the like). In this manner, processor core 200 is transformed during execution of code 213 at least in terms of the outputs generated by the decoders, hardware registers and tables utilized by register renaming logic 225, and any registers (not shown) modified by execution logic 250.
Although not shown in FIG. 5, a processing element may include other elements on a chip with processor core 200. For example, a processing element may include memory control logic along with processor core 200. The processing element may include I/O control logic and/or may include I/O control logic integrated with memory control logic. The processing element may also include one or more caches.
Referring now to FIG. 6, a block diagram of a computing system 1000 (e.g., server, blade, desktop, notebook, tablet, convertible tablet, smartphone, mobile Internet device/MID, wearable computer, media player, etc.) is shown in accordance with an embodiment. Shown in FIG. 6 is a multiprocessor system 1000 that includes a first processing element 1070 and a second processing element 1080. Although two processing elements 1070 and 1080 are shown, it is to be understood that an embodiment of system 1000 may also include only one such processing element.
System 1000 is illustrated as a point-to-point interconnect system in which a first processing element 1070 and a second processing element 1080 are coupled via a point-to-point interconnect 1050. It should be understood that any or all of the interconnects shown in figure 6 may be implemented as a multi-drop bus rather than a point-to-point interconnect.
As shown in FIG. 6, each of processing elements 1070 and 1080 may be a multicore processor, including first and second processor cores (i.e., processor cores 1074a and 1074b, and processor cores 1084a and 1084b). Such cores 1074a, 1074b, 1084a, 1084b may be configured to execute instruction code in a manner similar to that discussed above in connection with FIG. 5.
Each processing element 1070, 1080 may include at least one shared cache 1896a, 1896b. The shared caches 1896a, 1896b may store data (e.g., instructions) that are utilized by one or more components of the processors, such as the cores 1074a, 1074b and 1084a, 1084b, respectively. For example, the shared caches 1896a, 1896b may locally cache data stored in the memories 1032, 1034 for faster access by components of the processors. In one or more embodiments, the shared caches 1896a, 1896b may include one or more mid-level caches, such as level 2 (L2), level 3 (L3), level 4 (L4), or other levels of cache, a last level cache (LLC), and/or combinations thereof.
Although shown with only two processing elements 1070, 1080, it is to be understood that the scope of the embodiments is not so limited. In other embodiments, one or more additional processing elements may be present in a given processor. Alternatively, one or more of the processing elements 1070, 1080 may be an element other than a processor, such as an accelerator or a field programmable gate array. For example, the additional processing elements may include additional processors that are the same as the first processor 1070, additional processors that are heterogeneous or asymmetric to the first processor 1070, accelerators (such as, for example, graphics accelerators or digital signal processing (DSP) units), field programmable gate arrays, or any other processing element. A variety of differences may exist between the processing elements 1070, 1080 across a spectrum of metrics of merit, including architectural, microarchitectural, thermal, and power consumption characteristics, and the like. These differences may effectively manifest themselves as asymmetry and heterogeneity among the processing elements 1070, 1080. For at least one embodiment, the various processing elements 1070, 1080 may reside in the same die package.
The first processing element 1070 may further include memory controller logic (MC) 1072 and point-to-point (P-P) interfaces 1076 and 1078. Similarly, second processing element 1080 may include an MC 1082 and P-P interfaces 1086 and 1088. As shown in FIG. 6, MCs 1072 and 1082 couple the processors to respective memories, namely a memory 1032 and a memory 1034, which may be portions of main memory locally attached to the respective processors. Although the MCs 1072 and 1082 are shown as integrated within processing elements 1070, 1080, in alternative embodiments the MC logic may be discrete logic external to processing elements 1070, 1080 rather than integrated within them.
First processing element 1070 and second processing element 1080 may be coupled to an I/O subsystem 1090 via P-P interconnects 1076, 1086, respectively. As shown in FIG. 6, I/O subsystem 1090 includes P-P interfaces 1094 and 1098. Further, the I/O subsystem 1090 includes an interface 1092 to couple the I/O subsystem 1090 with a high performance graphics engine 1038. In one embodiment, a bus 1049 may be used to couple graphics engine 1038 to I/O subsystem 1090. Alternatively, a point-to-point interconnect may couple these components.
In turn, I/O subsystem 1090 may be coupled to a first bus 1016 via an interconnect 1096. In one embodiment, first bus 1016 may be a Peripheral Component Interconnect (PCI) bus, or a bus such as a PCI express bus, or another third generation I/O interconnect bus, although the scope of the embodiments is not so limited.
As shown in FIG. 6, various I/O devices 1014 (such as cameras and sensors) may be coupled to first bus 1016, along with a bus bridge 1018 that may couple first bus 1016 to a second bus 1020. In one embodiment, second bus 1020 may be a low pin count (LPC) bus. In one embodiment, various devices may be coupled to second bus 1020 including, for example, a keyboard/mouse 1012, communication devices 1026, and a data storage unit 1019, such as a disk drive or other mass storage device, which may include code 1030. The illustrated code 1030 may implement the link manager 44 (FIG. 4), the method 20 (FIG. 2), and/or the method 30 (FIG. 3), already discussed, and may be similar to the code 213 (FIG. 5), already discussed. Thus, the code 1030 may be used to automatically schedule retraining of any of the interconnects and/or communication buses shown in FIG. 6. Further, an audio I/O 1024 may be coupled to second bus 1020, and a battery 1010 may supply power to computing system 1000.
Note that other embodiments are contemplated. For example, instead of the point-to-point architecture of fig. 6, a system may implement a multi-drop bus or another such communication topology. Also, more or fewer integrated chips than shown in FIG. 6 may alternatively be used to divide the elements of FIG. 6.
Additional Notes and Examples:
Example 1 may include a system to perform link-based operations, the system comprising: a host processor, a subsystem, a link coupled to the host processor and the subsystem, and a link manager coupled to one or more of the host processor, the subsystem, or the link. The link manager may include: a performance monitor to monitor one or more runtime performance characteristics of the link; a degradation detector coupled to the performance monitor, the degradation detector to determine a condition of the link based on at least one of the one or more runtime performance characteristics; and a training scheduler coupled to the degradation detector, the training scheduler to automatically schedule retraining of the link based on the condition of the link.
Example 2 may include the system of example 1, wherein the training scheduler includes a parameter optimizer to set one or more retraining parameters.
Example 3 may include the system of example 2, wherein the parameter optimizer includes a margin component to configure at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
Example 4 may include the system of example 2, wherein the parameter optimizer includes: a margin component to configure at least one of the one or more retraining parameters to allow a reduction in an operating margin for the link if the condition does not satisfy a degradation condition; and a power component to initiate one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
Example 5 may include the system of any one of examples 1 to 4, wherein the one or more runtime performance characteristics are to include one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
Example 6 may include the system of any of examples 1 to 4, wherein the link includes one or more of a memory bus, a processor-to-processor bus, or an input/output bus.
Example 7 may include a method of managing a link, the method including monitoring one or more runtime performance characteristics of the link, determining a condition of the link based on at least one of the one or more runtime performance characteristics, and automatically scheduling retraining of the link based on the condition of the link.
Example 8 may include the method of example 7, wherein scheduling the retraining of the link further comprises setting one or more retraining parameters.
Example 9 may include the method of example 8, wherein setting the one or more retraining parameters includes configuring at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
Example 10 may include the method of example 8, wherein setting the one or more retraining parameters includes: configuring at least one of the one or more retraining parameters to allow a reduction in an operating margin of the link if the condition does not satisfy a degradation condition; and initiating one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
Example 11 may include the method of any one of examples 7 to 10, wherein the one or more runtime performance characteristics include one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
Example 12 may include the method of any one of examples 7 to 10, wherein one or more of a memory bus, a processor-to-processor bus, or an input/output bus is scheduled for retraining.
Example 13 may include at least one computer-readable storage medium comprising a set of instructions that, when executed by a computing system, cause the computing system to monitor one or more runtime performance characteristics of a link, determine a condition of the link based on at least one of the one or more runtime performance characteristics, and automatically schedule retraining of the link based on the condition of the link.
Example 14 may include the at least one computer-readable storage medium of example 13, wherein the instructions, when executed, cause the computing system to set one or more retraining parameters.
Example 15 may include the at least one computer-readable storage medium of example 14, wherein the instructions, when executed, cause the computing system to configure at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
Example 16 may include the at least one computer-readable storage medium of example 14, wherein the instructions, when executed, cause the computing system to: configure at least one of the one or more retraining parameters to allow a reduction in an operating margin of the link if the condition does not satisfy a degradation condition; and initiate one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
Example 17 may include the at least one computer-readable storage medium of any one of examples 13 to 16, wherein the one or more runtime performance characteristics are to include one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
Example 18 may include the at least one computer-readable storage medium of any one of examples 13 to 16, wherein one or more of a memory bus, a processor-to-processor bus, or an input/output bus is to be scheduled for retraining.
Example 19 may include an apparatus to manage a link, comprising: a performance monitor to monitor one or more runtime performance characteristics of the link; a degradation detector coupled to the performance monitor, the degradation detector to determine a condition of the link based on at least one of the one or more runtime performance characteristics; and a training scheduler coupled to the degradation detector, the training scheduler to automatically schedule retraining of the link based on the condition of the link.
Example 20 may include the apparatus of example 19, wherein the training scheduler includes a parameter optimizer to set one or more retraining parameters.
Example 21 may include the apparatus of example 20, wherein the parameter optimizer includes a margin component to configure at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
Example 22 may include the apparatus of example 20, wherein the parameter optimizer comprises: a margin component to configure at least one of the one or more retraining parameters to allow a reduction in an operating margin for the link if the condition does not satisfy a degradation condition; and a power component to initiate one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
Example 23 may include the apparatus of any one of examples 19 to 22, wherein the one or more runtime performance characteristics are to include one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
Example 24 may include the apparatus of any one of examples 19 to 22, wherein one or more of a memory bus, a processor-to-processor bus, or an input/output bus is to be scheduled for retraining.
Example 25 may include an apparatus to manage a link, comprising means for performing the method of any of examples 7 to 12.
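The monitor, detect, and schedule flow recited in examples 1, 7, 13 and 19 can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in chosen for illustration: the `LinkSample` fields, the threshold values, and the choice to schedule retraining only when the link is degraded are assumptions, not details taken from the embodiments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LinkSample:
    """One runtime observation of a link (field names are hypothetical)."""
    error_count: int        # errors seen in the sampling window
    retransmissions: int    # retransmissions seen in the sampling window
    temperature_c: float    # thermal state of the link


@dataclass
class DegradationDetector:
    # Hypothetical thresholds; the source does not give concrete values.
    max_errors: int = 10
    max_retransmissions: int = 5
    max_temperature_c: float = 85.0

    def is_degraded(self, sample: LinkSample) -> bool:
        """Determine the condition of the link from its runtime characteristics."""
        return (sample.error_count > self.max_errors
                or sample.retransmissions > self.max_retransmissions
                or sample.temperature_c > self.max_temperature_c)


@dataclass
class TrainingScheduler:
    pending: List[str] = field(default_factory=list)

    def schedule(self, link_id: str) -> None:
        """Queue a link for retraining; a link is queued at most once."""
        if link_id not in self.pending:
            self.pending.append(link_id)


@dataclass
class LinkManager:
    detector: DegradationDetector = field(default_factory=DegradationDetector)
    scheduler: TrainingScheduler = field(default_factory=TrainingScheduler)

    def observe(self, link_id: str, sample: LinkSample) -> bool:
        """Feed one monitored sample through detection and scheduling."""
        degraded = self.detector.is_degraded(sample)
        if degraded:
            # Assumption: only degraded links are queued in this sketch.
            self.scheduler.schedule(link_id)
        return degraded
```

In this sketch a caller would invoke `LinkManager().observe("mem0", LinkSample(50, 0, 40.0))` for each sampling window and drain `scheduler.pending` at the next retraining opportunity.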
Thus, the techniques described herein may detect an abnormal event on a link and/or degradation of a link and trigger retraining of that link at the next opportunity. The link may take into account the event that triggered the retraining to better optimize parameters and/or the retraining approach.
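As a rough illustration of how the triggering event might shape the retraining parameters, the following Python sketch maps a trigger to a plan. The `Trigger` values, the `RetrainingPlan` fields, and the specific margin adjustments are hypothetical; the examples above state only that the operating margin may be increased when a degradation condition is satisfied, and that a margin reduction and power reduction operations may follow when it is not.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    """Hypothetical reasons a retraining was scheduled."""
    ERRORS = "errors"
    THERMAL = "thermal"
    NONE = "none"   # the link does not satisfy the degradation condition


@dataclass(frozen=True)
class RetrainingPlan:
    margin_delta: int    # hypothetical units: positive widens the operating margin
    reduce_power: bool   # whether to initiate power reduction operations
    full_retrain: bool   # hypothetical: full versus abbreviated retraining


def plan_retraining(trigger: Trigger) -> RetrainingPlan:
    """Choose retraining parameters from the event that triggered retraining."""
    if trigger is Trigger.NONE:
        # Healthy link: allow a smaller operating margin and save power.
        return RetrainingPlan(margin_delta=-1, reduce_power=True, full_retrain=False)
    if trigger is Trigger.THERMAL:
        # Assumed policy: a modest margin increase for thermal drift.
        return RetrainingPlan(margin_delta=1, reduce_power=False, full_retrain=False)
    # Assumed policy: errors call for the largest margin and a full retrain.
    return RetrainingPlan(margin_delta=2, reduce_power=False, full_retrain=True)
```

Keeping the policy in one pure function makes the mapping from trigger to parameters easy to inspect and to replace with platform-specific tuning.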
Embodiments are applicable for use with all types of semiconductor integrated circuit ("IC") chips. Examples of such IC chips include, but are not limited to, processors, controllers, chipset components, Programmable Logic Arrays (PLAs), memory chips, network chips, systems on chip (SoCs), SSD/NAND controller ASICs, and the like. Additionally, in some of the figures, signal conductor lines are represented with lines. Some may be drawn differently to indicate more constituent signal paths, have a number label to indicate the number of constituent signal paths, and/or have arrows at one or more ends to indicate the primary direction of information flow. However, this should not be interpreted in a limiting manner. Rather, such added detail may be used in connection with one or more exemplary embodiments to facilitate easier understanding of the circuitry. Any represented signal lines, whether or not having additional information, may actually comprise one or more signals that may propagate in multiple directions and may be implemented using any suitable type of signal scheme, e.g., digital or analog lines implemented with differential pairs, fiber optic lines, and/or single-ended lines.
Example sizes/models/values/ranges may have been given, although embodiments are not limited to the same. As manufacturing techniques (e.g., photolithography) mature over time, it is expected that devices of smaller size could be manufactured. In addition, well-known power/ground connections to IC chips and other components may or may not be shown within the figures, for simplicity of illustration and discussion, and so as not to obscure certain aspects of the embodiments. Further, arrangements may be shown in block diagram form in order to avoid obscuring the embodiments, and also in view of the fact that specifics with respect to implementation of such block diagram arrangements are highly dependent upon the computing system within which the embodiment is to be implemented, i.e., such specifics should be well within the purview of one skilled in the art. Where specific details (e.g., circuits) are set forth in order to describe example embodiments, it should be apparent to one skilled in the art that embodiments can be practiced without, or with variation of, these specific details. The description is thus to be regarded as illustrative instead of limiting.
The term "coupled" may be used herein to refer to any type of relationship (direct or indirect) between the components in question, and may apply to electrical, mechanical, fluid, optical, electromagnetic, electromechanical, or other connections. In addition, the terms "first," "second," and the like may be used herein only to facilitate discussion, and do not carry any particular temporal or chronological significance unless otherwise indicated.
As used in this application and in the claims, a list of items joined by the term "one or more of" can mean any combination of the listed items. For example, the phrase "one or more of A, B or C" may mean A; B; C; A and B; A and C; B and C; or A, B and C.
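The combinations covered by such a phrase are the non-empty subsets of the listed items. The short Python sketch below enumerates them for a three-item list; the function name `one_or_more_of` is an illustrative choice and not a term from the claims.

```python
from itertools import combinations


def one_or_more_of(items):
    """Return every non-empty combination of the listed items, smallest first."""
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


for combo in one_or_more_of(["A", "B", "C"]):
    print(" and ".join(combo))
```

For three items this yields seven combinations, matching the seven alternatives listed in the text.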
Those skilled in the art will appreciate from the foregoing description that the broad techniques of the embodiments can be implemented in a variety of forms. Therefore, while the embodiments have been described in connection with particular examples thereof, the true scope of the embodiments should not be so limited since other modifications will become apparent to the skilled practitioner upon a study of the drawings, the specification, and the following claims.

Claims (25)

1. A system for performing link-based operations, comprising:
a host processor;
a subsystem;
a link coupled to the host processor and the subsystem; and
a link manager coupled to one or more of the host processor, the subsystem, or the link, the link manager comprising:
a performance monitor to monitor one or more runtime performance characteristics of the link;
a degradation detector coupled to the performance monitor, the degradation detector to determine a condition of the link based on at least one of the one or more runtime performance characteristics; and
a training scheduler coupled to the degradation detector, the training scheduler to automatically schedule retraining of the link based on the condition of the link.
2. The system of claim 1, wherein the training scheduler comprises a parameter optimizer to set one or more retraining parameters.
3. The system of claim 2, wherein the parameter optimizer includes a margin component to configure at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
4. The system of claim 2, wherein the parameter optimizer comprises:
a margin component to configure at least one of the one or more retraining parameters to allow a reduction in an operating margin for the link if the condition does not satisfy a degradation condition; and
a power component to initiate one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
5. The system of any of claims 1-4, wherein the one or more runtime performance characteristics are to include one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
6. The system of any of claims 1-4, wherein the link comprises one or more of a memory bus, a processor-to-processor bus, or an input/output bus.
7. A method of managing a link, comprising:
monitoring one or more runtime performance characteristics of the link;
determining a condition of the link based on at least one of the one or more runtime performance characteristics; and
automatically scheduling retraining of the link based on the condition of the link.
8. The method of claim 7, wherein scheduling the retraining of the link further comprises setting one or more retraining parameters.
9. The method of claim 8, wherein setting the one or more retraining parameters comprises configuring at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
10. The method of claim 8, wherein setting the one or more retraining parameters comprises:
configuring at least one of the one or more retraining parameters to allow a reduction in an operating margin of the link if the condition does not satisfy a degradation condition; and
initiating one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
11. The method of any of claims 7 to 10, wherein the one or more runtime performance characteristics include one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
12. The method of any of claims 7 to 10, wherein one or more of a memory bus, a processor-to-processor bus, or an input/output bus is scheduled for retraining.
13. An apparatus to manage a link, comprising:
a performance monitor to monitor one or more runtime performance characteristics of the link;
a degradation detector coupled to the performance monitor, the degradation detector to determine a condition of the link based on at least one of the one or more runtime performance characteristics; and
a training scheduler coupled to the degradation detector, the training scheduler to automatically schedule retraining of the link based on the condition of the link.
14. The apparatus of claim 13, wherein the training scheduler comprises a parameter optimizer to set one or more retraining parameters.
15. The apparatus of claim 14, wherein the parameter optimizer comprises a margin component to configure at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
16. The apparatus of claim 14, wherein the parameter optimizer comprises:
a margin component to configure at least one of the one or more retraining parameters to allow a reduction in an operating margin for the link if the condition does not satisfy a degradation condition; and
a power component to initiate one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
17. The apparatus of any of claims 13 to 16, wherein the one or more runtime performance characteristics are to include one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
18. The apparatus of any of claims 13 to 16, wherein one or more of a memory bus, a processor-to-processor bus, or an input/output bus is to be scheduled for retraining.
19. An apparatus to manage a link, comprising:
means for monitoring one or more runtime performance characteristics of the link;
means for determining a condition of the link based on at least one of the one or more runtime performance characteristics; and
means for automatically scheduling a retraining of the link based on the condition of the link.
20. The apparatus of claim 19, wherein the means for scheduling the retraining of the link further comprises means for setting one or more retraining parameters.
21. The apparatus of claim 20, wherein the means for setting the one or more retraining parameters comprises means for configuring at least one of the one or more retraining parameters to increase an operating margin of the link if the condition satisfies a degradation condition.
22. The apparatus of claim 20, wherein the means for setting the one or more retraining parameters comprises:
means for configuring at least one of the one or more retraining parameters to allow a reduction in an operating margin of the link if the condition does not satisfy a degradation condition; and
means for initiating one or more power reduction operations with respect to the link if the condition does not satisfy the degradation condition.
23. The apparatus of any of claims 19 to 22, wherein the one or more runtime performance characteristics comprise one or more of an error state, a bandwidth state, a retransmission state, a power consumption state, or a thermal state.
24. The apparatus of any of claims 19 to 22, wherein one or more of a memory bus, a processor-to-processor bus, or an input/output bus is to be scheduled for retraining.
25. A computer-readable medium having instructions stored thereon that, when executed by a computing device, cause the computing device to perform the method of any of claims 7-12.
CN201580045878.XA 2014-09-26 2015-08-25 Link retraining based on runtime performance characteristics Active CN107636620B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/497499 2014-09-26
US14/497,499 US9626270B2 (en) 2014-09-26 2014-09-26 Link retraining based on runtime performance characteristics
PCT/US2015/046686 WO2016048525A1 (en) 2014-09-26 2015-08-25 Link retraining based on runtime performance characteristics

Publications (2)

Publication Number Publication Date
CN107636620A CN107636620A (en) 2018-01-26
CN107636620B CN107636620B (en) 2021-01-05

Family

ID=55581748

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580045878.XA Active CN107636620B (en) 2014-09-26 2015-08-25 Link retraining based on runtime performance characteristics

Country Status (4)

Country Link
US (1) US9626270B2 (en)
EP (1) EP3198454B1 (en)
CN (1) CN107636620B (en)
WO (1) WO2016048525A1 (en)

Also Published As

Publication number Publication date
WO2016048525A1 (en) 2016-03-31
US20160092335A1 (en) 2016-03-31
CN107636620A (en) 2018-01-26
EP3198454A4 (en) 2018-06-06
EP3198454B1 (en) 2019-07-17
EP3198454A1 (en) 2017-08-02
US9626270B2 (en) 2017-04-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant