CN115878540B - PCIe device link training management method, management device and server - Google Patents

Publication number
CN115878540B
CN115878540B CN202310061710.4A CN202310061710A CN115878540B CN 115878540 B CN115878540 B CN 115878540B CN 202310061710 A CN202310061710 A CN 202310061710A CN 115878540 B CN115878540 B CN 115878540B
Authority
CN
China
Prior art keywords
link
pcie
preset
training
initial value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310061710.4A
Other languages
Chinese (zh)
Other versions
CN115878540A (en
Inventor
管彦广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310061710.4A priority Critical patent/CN115878540B/en
Publication of CN115878540A publication Critical patent/CN115878540A/en
Application granted granted Critical
Publication of CN115878540B publication Critical patent/CN115878540B/en
Priority to PCT/CN2023/121333 priority patent/WO2024152604A1/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computer And Data Communications (AREA)
  • Communication Control (AREA)

Abstract

The invention relates to a PCIe device link training management method, a management device, a server, a computer device and a storage medium. The training management method comprises a link training failure repair step, which comprises: acquiring a default preset initial value Px as the kth generation initial value for link training of the PCIe device; judging whether the vendor ID and the device ID of the PCIe device can be read normally; if they cannot be read normally, judging that the PCIe link training has failed; acquiring a preset initial value Py as a new kth generation initial value for link retraining; after the link retraining of the PCIe device, judging again, after a first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally; and if they can be read normally, judging that the PCIe link training has succeeded. This technical solution addresses link training failures of key PCIe devices on current server motherboards.

Description

PCIe device link training management method, management device and server
Technical Field
The invention relates to the technical field of link training, and in particular to a PCIe device link training management method, a management device and a server.
Background
On a server or storage server based on the x86 platform, the CPU exchanges data with peripheral devices (such as a BMC chip, a network card chip, an NTB, a PCIe switch, or an expansion add-in card) over the PCIe high-speed bus. Peripheral devices such as the NTB and the PCIe switch are usually key devices soldered onto the motherboard and are called on-board devices; network cards, FC cards, SAS cards and the like are attached through expansion slots and are called add-in devices.
Because an on-board device is soldered onto the motherboard and its PCIe routing is fixed on the PCB, the signal integrity (SI) of the PCIe link is normally consistent from one motherboard to the next. However, when servers are produced in large numbers, some motherboards with poor consistency inevitably appear; the on-board devices themselves may also vary. As a result, the key devices on individual boards fail PCIe link training because of poor PCIe link SI performance.
PCIe link training uses PCIe equalization, in which equalization (EQ) adjustment starts from a PCIe Tx Preset initial value. Link training from PCIe gen1 to gen3, from gen3 to gen4, and from gen4 to gen5 uses the transmit-side link equalization parameter preset values gen3 Tx Preset, gen4 Tx Preset and gen5 Tx Preset, respectively; each stage is fine-tuned starting from its Tx Preset initial value.
At present, the Intel platform performs 3 equalization adjustments in the gen3 stage and more than 10 in the gen4 and gen5 stages. Typically, when the gen3 initial value is set unsuitably, training from gen1 to gen3 cannot complete and the link training fails.
Disclosure of Invention
To solve the above technical problem, the invention provides a PCIe device link training management method, a management device and a server, which address the failure of link training of key PCIe devices on current server motherboards.
To achieve the above object, the invention provides a PCIe device link training management method for managing, through PCIe equalization, the link training of key PCIe devices on a server motherboard. The training management method comprises a link training failure repair step, which comprises:
acquiring a default preset initial value Px from a preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol, and using it as the kth generation initial value for link training of the PCIe device; the kth generation initial value is the initial value of the transmit-side link equalization parameter used when equalization adjustment is performed for the kth generation PCIe protocol, and k is not less than 2; the values in the preset value set are arranged in numerical order;
after link training of the PCIe device with the kth generation initial value, judging whether the vendor ID and the device ID of the PCIe device can be read normally;
if they cannot be read normally, judging that the PCIe link training has failed;
acquiring a preset initial value Py from the preset value set according to a preset value selection rule, and using it as a new kth generation initial value for link retraining of the PCIe device;
after the link retraining of the PCIe device, judging again, after a first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally;
if the vendor ID and the device ID can be read normally after the link retraining, judging that the PCIe link training has succeeded.
Further, after judging again, after the first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally, the link training failure repair step further comprises:
if the vendor ID and the device ID of the PCIe device still cannot be read normally after the link retraining, judging that the PCIe link training has failed;
taking values in turn from the preset value set as new kth generation initial values, in the preset-value increasing direction and/or the preset-value decreasing direction given by the preset value selection rule, and retraining the link of the PCIe device with each value until the PCIe link training succeeds.
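The repair step described above can be sketched as a short loop. This is a minimal illustration only, not the patented firmware: the names `candidate_order` and `repair_link_training`, the callbacks `train`, `ids_readable` and `wait`, and the alternating increasing/decreasing traversal are all assumptions made for the example; the text only requires that values be taken in the increasing and/or decreasing direction.

```python
from typing import Callable, Iterator, Optional, Sequence


def candidate_order(presets: Sequence[int], default: int) -> Iterator[int]:
    """Yield the default preset first, then its neighbours, alternating
    between the increasing and the decreasing direction (assumed rule)."""
    start = presets.index(default)
    yield presets[start]
    for step in range(1, len(presets)):
        hi, lo = start + step, start - step
        if hi < len(presets):
            yield presets[hi]
        if lo >= 0:
            yield presets[lo]


def repair_link_training(
    presets: Sequence[int],
    default: int,
    train: Callable[[int], None],       # hypothetical: program preset and (re)train
    ids_readable: Callable[[], bool],   # hypothetical: vendor/device ID check
    wait: Callable[[], None] = lambda: None,  # hypothetical: the first waiting time
) -> Optional[int]:
    """Return the preset with which training succeeded, or None if all failed."""
    for preset in candidate_order(presets, default):
        train(preset)
        wait()  # simplification: this sketch waits after every attempt
        if ids_readable():
            return preset
    return None  # caller would log the failure and set a failure flag
```

With ten presets numbered 0 to 9 and a default of 4, this sketch tries 4, 5, 3, 6, 2, 7, 1, 8, 0, 9 in that order and stops at the first value for which the IDs become readable.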
Further, the training management method comprises a link stability checking step; the link stability checking step comprises a link activity management step, which comprises:
after judging that the PCIe link training has succeeded, judging whether the link active flag bit is 1;
if the link active flag bit is not 1, retraining the link up to a preset number of retraining retries; for each retraining, setting the link retraining flag bit to 1 and, after a second preset time interval, judging again whether the link active flag bit is 1.
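A possible shape for this retry loop is sketched below. It assumes that the link active flag corresponds to the Data Link Layer Link Active bit (bit 13) of the PCIe Link Status register and that the retraining flag corresponds to the Retrain Link bit (bit 5) of the Link Control register; the text does not name the registers. The register accessors are hypothetical callbacks, and `ensure_link_active` is an illustrative name.

```python
import time
from typing import Callable

LNKSTA_DLL_LINK_ACTIVE = 1 << 13  # Link Status: Data Link Layer Link Active
LNKCTL_RETRAIN_LINK = 1 << 5      # Link Control: Retrain Link


def ensure_link_active(
    read_lnksta: Callable[[], int],      # hypothetical register accessors
    read_lnkctl: Callable[[], int],
    write_lnkctl: Callable[[int], None],
    max_retries: int,
    interval_s: float = 0.0,             # the second preset time interval
) -> int:
    """Return the number of retrains used, or -1 if the link never came up."""
    for attempt in range(max_retries + 1):
        if read_lnksta() & LNKSTA_DLL_LINK_ACTIVE:
            return attempt
        if attempt == max_retries:
            break
        # Request a retrain, then wait before checking the flag again.
        write_lnkctl(read_lnkctl() | LNKCTL_RETRAIN_LINK)
        time.sleep(interval_s)
    return -1
```

Returning the retry count rather than a boolean lets the caller record how many retrains were needed.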
Further, the link stability checking step comprises a link state management step, which comprises:
when the link active flag bit is 1, judging whether the link speed in the link state information is lower than an expected speed value; if it is lower, judging that a link speed reduction has occurred; the expected speed value matches the minimum specification of the maximum link speed in the link capability register;
when the link active flag bit is 1, judging whether the link width (lane count) in the link state information is smaller than an expected width value; if it is smaller, judging that a link lane reduction has occurred; the expected width value matches the minimum specification of the maximum link width in the link capability register.
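The two comparisons can be illustrated with the standard PCIe register layout, in which both Link Capabilities and Link Status carry a speed code in bits 3:0 and a lane count in bits 9:4. The sketch below assumes the expected values are taken directly from Link Capabilities; the function names and the sample register values in the usage note are hypothetical.

```python
from typing import Dict, Tuple


def decode_speed_width(reg: int) -> Tuple[int, int]:
    """Return (speed code, lane count) from Link Capabilities or Link Status."""
    return reg & 0xF, (reg >> 4) & 0x3F


def check_link_state(lnkcap: int, lnksta: int) -> Dict[str, bool]:
    """Compare the negotiated link against the capability maximums."""
    max_speed, max_width = decode_speed_width(lnkcap)
    cur_speed, cur_width = decode_speed_width(lnksta)
    return {
        "speed_reduced": cur_speed < max_speed,   # link speed reduction
        "lanes_reduced": cur_width < max_width,   # link lane reduction
    }
```

For example, a capability value of 0x0103 (gen3, x16) against a status value of 0x2083 (gen3, x8) reports a lane reduction but no speed reduction.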
Further, the link state management step further comprises:
when a link speed reduction and/or a link lane reduction occurs, performing a link disable/enable operation up to a preset number of disable retries; for each disable/enable operation, setting the disable bit in the link control register, keeping the link disabled for a preset operation time interval before re-enabling it, and re-reading the link state information after a second waiting time to judge whether the link state is normal.
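The disable/enable sequence might look like the following sketch. It assumes the disable bit is the Link Disable bit (bit 4) of the PCIe Link Control register; the accessors `read_lnkctl` and `write_lnkctl`, the `link_ok` check and the name `bounce_link` are hypothetical and stand in for platform-specific firmware routines.

```python
import time
from typing import Callable

LNKCTL_LINK_DISABLE = 1 << 4  # Link Control: Link Disable


def bounce_link(
    read_lnkctl: Callable[[], int],
    write_lnkctl: Callable[[int], None],
    link_ok: Callable[[], bool],   # hypothetical: re-reads and checks link state
    max_retries: int,
    hold_s: float = 0.0,           # preset operation time interval
    settle_s: float = 0.0,         # second waiting time
) -> bool:
    """Disable and re-enable the link until its state is normal or retries run out."""
    for _ in range(max_retries):
        write_lnkctl(read_lnkctl() | LNKCTL_LINK_DISABLE)            # disable
        time.sleep(hold_s)
        write_lnkctl(read_lnkctl() & ~LNKCTL_LINK_DISABLE & 0xFFFF)  # re-enable
        time.sleep(settle_s)
        if link_ok():
            return True
    return False  # caller would log the retry count and a failure flag
```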
Further, the link activity management step further comprises:
if the link active flag bit is 1, printing the preset number of retraining retries and the link training negotiation result information to a log, and writing them to a CPLD register; the link training negotiation result information includes the new kth generation initial value with which the link retraining succeeded.
Further, the link state management step further comprises:
when the link state information is normal, recording a log and writing a link training success flag to a CPLD register.
Further, after judging whether the vendor ID and the device ID of the PCIe device can be read normally, the training management method further comprises:
if the vendor ID and the device ID of the PCIe device can be read normally, judging that the PCIe link training has succeeded, and recording a log.
Further, after the link has been retrained up to the preset number of retraining retries, the link activity management step further comprises:
when the link has been retrained the preset number of times and the link active flag bit is still not 1, printing the preset number of retraining retries and a link training failure flag to a log, and writing them to a CPLD register.
Further, after the link disable/enable operation has been performed up to the preset number of disable retries, the link state management step further comprises:
when the link disable/enable operation has been performed the preset number of times and a link speed reduction and/or a link lane reduction occurred on every retry, printing the preset number of disable retries and a link training failure flag to a log, and writing them to a CPLD register.
Further, after taking values in turn from the preset value set of the transmit-side link equalization parameter as new kth generation initial values, in the preset-value increasing direction and/or the preset-value decreasing direction given by the preset value selection rule, the training management method further comprises:
when every value in the preset value set has been tried in turn as a new kth generation initial value and every link retraining has failed, recording a log and writing a link training failure flag to a CPLD (complex programmable logic device) register.
Further, before the default preset initial value Px is acquired from the preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol and used as the kth generation initial value, the training management method further comprises:
acquiring the preset value set of the transmit-side link equalization parameter according to the PCIe signal parameter characteristics of the server motherboard and the large-scale sample test requirements of the server motherboard; the preset value set includes the default preset initial value Px.
Further, after the default preset initial value Px is acquired from the preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol and used as the kth generation initial value, the training management method further comprises:
performing PCIe link training on the PCIe root port of the server CPU with the default preset initial value Px as the kth generation initial value.
Further, acquiring the default preset initial value Px from the preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol, and using it as the kth generation initial value for link training of the PCIe device, specifically comprises:
acquiring the default preset initial value Px from the preset value set as the kth generation initial value, where the default preset initial value Px is used for link training of the CPU root port corresponding to the link on which the PCIe device is located.
Further, judging whether the vendor ID and the device ID of the PCIe device can be read normally after link training of the PCIe device with the kth generation initial value comprises:
reading, through the basic input/output system (BIOS) firmware, the vendor ID and the device ID of the PCIe device and judging whether they can be read normally, thereby judging whether the PCIe link training between the PCIe bus and the CPU root port has succeeded.
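One way to express the ID check is sketched below. In PCI configuration space the vendor ID occupies the low 16 bits and the device ID the high 16 bits of the first dword, and a read that reaches no device returns all ones. Treating "read normally" as "not all ones, and optionally equal to the expected IDs" is an assumption of this sketch, as are the function name and the sample IDs in the usage note.

```python
from typing import Optional, Tuple


def ids_read_normally(
    config_dword0: int,
    expected: Optional[Tuple[int, int]] = None,  # hypothetical (vendor, device) pair
) -> bool:
    """Decide from config-space dword 0 whether the device answered."""
    vendor_id = config_dword0 & 0xFFFF
    device_id = (config_dword0 >> 16) & 0xFFFF
    if vendor_id == 0xFFFF:
        return False  # all ones: no device responded on this link
    if expected is not None:
        return (vendor_id, device_id) == expected
    return True
```

For instance, a raw value of 0xFFFFFFFF counts as a failed read, while 0x12345678 decodes to vendor 0x5678 and device 0x1234 and counts as a normal read.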
Further, before the default preset initial value Px is acquired from the preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol, used as the kth generation initial value, and link training is performed on the PCIe device, the training management method further comprises:
setting, through a register, the preset value set of the transmit-side link equalization parameter, where the preset value set is used for link training of the CPU root port corresponding to the link on which the PCIe device is located.
The invention also provides a PCIe device link training management device for managing, through PCIe equalization, the link training of key PCIe devices on a server motherboard. The management device comprises a link training failure repair unit, which comprises:
a link training unit, configured to acquire a default preset initial value Px from a preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol, and to use it as the kth generation initial value for link training of the PCIe device; the kth generation initial value is the initial value of the transmit-side link equalization parameter used when equalization adjustment is performed for the kth generation PCIe protocol, and k is not less than 2; the values in the preset value set are arranged in numerical order;
a link training judging unit, configured to judge, after link training of the PCIe device with the kth generation initial value, whether the vendor ID and the device ID of the PCIe device can be read normally;
a training failure judging unit, configured to judge that the PCIe link training has failed when the vendor ID and the device ID cannot be read normally;
a link retraining unit, configured to acquire a preset initial value Py from the preset value set according to a preset value selection rule, and to use it as a new kth generation initial value for link retraining of the PCIe device;
a link retraining judging unit, configured to judge again, after a first waiting time following the link retraining of the PCIe device, whether the vendor ID and the device ID of the PCIe device can be read normally;
a link retraining success judging unit, configured to judge that the PCIe link training has succeeded when the vendor ID and the device ID can be read normally after the link retraining.
The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring a default preset initial value Px from a preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol, and using it as the kth generation initial value for link training of the PCIe device; the kth generation initial value is the initial value of the transmit-side link equalization parameter used when equalization adjustment is performed for the kth generation PCIe protocol, and k is not less than 2; the values in the preset value set are arranged in numerical order;
after link training of the PCIe device with the kth generation initial value, judging whether the vendor ID and the device ID of the PCIe device can be read normally;
if they cannot be read normally, judging that the PCIe link training has failed;
acquiring a preset initial value Py from the preset value set according to a preset value selection rule, and using it as a new kth generation initial value for link retraining of the PCIe device;
after the link retraining of the PCIe device, judging again, after a first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally;
if the vendor ID and the device ID can be read normally after the link retraining, judging that the PCIe link training has succeeded.
The invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps:
acquiring a default preset initial value Px from a preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol, and using it as the kth generation initial value for link training of the PCIe device; the kth generation initial value is the initial value of the transmit-side link equalization parameter used when equalization adjustment is performed for the kth generation PCIe protocol, and k is not less than 2; the values in the preset value set are arranged in numerical order;
after link training of the PCIe device with the kth generation initial value, judging whether the vendor ID and the device ID of the PCIe device can be read normally;
if they cannot be read normally, judging that the PCIe link training has failed;
acquiring a preset initial value Py from the preset value set according to a preset value selection rule, and using it as a new kth generation initial value for link retraining of the PCIe device;
after the link retraining of the PCIe device, judging again, after a first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally;
if the vendor ID and the device ID can be read normally after the link retraining, judging that the PCIe link training has succeeded.
The invention further provides a server comprising a key PCIe device arranged on the motherboard, where link training of the PCIe device is managed by the above PCIe device link training management method.
Compared with the prior art, the technical solution provided by the invention has the following technical effects:
the PCIe device link training management method manages, through PCIe equalization, the link training of key PCIe devices on a server motherboard;
the training management method comprises a link training failure repair step, which comprises:
first, acquiring a default preset initial value Px as the kth generation initial value for link training of the PCIe device;
judging whether the vendor ID and the device ID of the PCIe device can be read normally; if they cannot be read normally, judging that the PCIe link training has failed;
then acquiring a preset initial value Py as a new kth generation initial value for link retraining of the PCIe device;
then judging again whether the vendor ID and the device ID of the PCIe device can be read normally; if they can be read normally after the link retraining, judging that the PCIe link training has succeeded;
a preset value set of the transmit-side link equalization parameter is configured in advance for the kth generation PCIe protocol; the default preset initial value Px can be acquired from this set, and the preset initial value Py is acquired from it according to the preset value selection rule;
therefore, once the vendor ID and the device ID can be read normally, the PCIe link training can be judged successful, which solves the link training failures of key PCIe devices on current server motherboards.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of the 10 sets of Tx Preset values used in link training in the prior art;
FIG. 2 is a flowchart of a PCIe device link training management method according to the first embodiment of the present invention;
FIG. 3 is a schematic overall flowchart of a link training management method according to an embodiment of the present invention;
FIG. 4 is a structural block diagram of a PCIe device link training management device according to the second embodiment of the present invention;
FIG. 5 is an internal structure diagram of a computer device according to the second embodiment of the present invention.
Description of the embodiments
As shown in FIG. 1, in the prior art, 10 sets of Tx Preset values (i.e., transmit-side link equalization parameter preset values) are available in each of the gen3, gen4 and gen5 stages of PCIe link training.
A default preset value can be set during the BIOS (i.e., BIOS firmware) start-up phase (for example, the default preset value of the Intel xx platform is P7); however, actual testing found that the P7 preset value in the gen3 stage does not meet the SI parameter requirements of the current motherboard PCB and key PCIe devices;
testing of multiple samples of the current motherboard found that the SI parameter requirements can be met with Tx Preset = P4 in the gen3 stage; however, Tx Preset = P4 cannot meet the SI parameter requirements of every motherboard, and when the SI parameters of an individual motherboard and its key PCIe devices deviate, link training in the gen3 stage still fails.
Therefore, the invention provides a PCIe device link training management method, a management device and a server to solve the above problems.
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one:
As shown in FIG. 2, an embodiment of the present invention provides a PCIe device link training management method for managing, through PCIe equalization, the link training of key PCIe devices on a server motherboard. The training management method comprises a link training failure repair step, which comprises:
S2, acquiring a default preset initial value Px from a preset value set of the transmit-side link equalization parameter configured for the kth generation PCIe protocol, and using it as the kth generation initial value for link training of the PCIe device; the kth generation initial value is the initial value of the transmit-side link equalization parameter used when equalization adjustment is performed for the kth generation PCIe protocol, and k is not less than 2; the values in the preset value set are arranged in numerical order;
S3, after link training of the PCIe device with the kth generation initial value, judging whether the vendor ID and the device ID of the PCIe device can be read normally;
S31, if they cannot be read normally, judging that the PCIe link training has failed;
S4, acquiring a preset initial value Py from the preset value set according to a preset value selection rule, and using it as a new kth generation initial value for link retraining of the PCIe device;
S5, after the link retraining of the PCIe device, judging again, after a first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally;
S51, if the vendor ID and the device ID can be read normally after the link retraining, judging that the PCIe link training has succeeded.
In a specific embodiment, the training management method comprises a link training failure repair step, which comprises:
first, acquiring a default preset initial value Px as the kth generation initial value for link training of the PCIe device;
judging whether the vendor ID and the device ID of the PCIe device can be read normally; if they cannot be read normally, judging that the PCIe link training has failed;
then acquiring a preset initial value Py as a new kth generation initial value for link retraining of the PCIe device;
then judging again whether the vendor ID and the device ID of the PCIe device can be read normally; if they can be read normally after the link retraining, judging that the PCIe link training has succeeded;
a preset value set of the transmit-side link equalization parameter is configured in advance for the kth generation PCIe protocol; the default preset initial value Px can be acquired from this set, and the preset initial value Py is acquired from it according to the preset value selection rule;
therefore, once the vendor ID and the device ID can be read normally, the PCIe link training can be judged successful, which solves the link training failures of key PCIe devices on current server motherboards.
In practice, PCIe stands for PCI Express (Peripheral Component Interconnect Express), a high-speed serial computer expansion bus standard.
PCIe Tx Preset denotes a preset value of the PCIe transmit-side link equalization parameter. PCIe SI denotes the PCIe signal parameters (signal integrity).
PCIe gen3, gen4 and gen5 denote PCIe protocol generations and are also used to denote the speed reached after PCIe link negotiation; gen1 through gen5 support 2.5 GT/s, 5.0 GT/s, 8.0 GT/s, 16.0 GT/s and 32.0 GT/s, respectively.
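The generation-to-rate correspondence above can be captured in a small lookup table. This is only an illustrative sketch: the names `GEN_RATE_GTS`, `rate_for_generation` and `has_tx_preset_stage` are hypothetical, and the second helper simply restates that the Background section names Tx Preset values for gen3, gen4 and gen5.

```python
# Transfer rate in GT/s for each PCIe generation, as listed in the text.
GEN_RATE_GTS = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0}


def rate_for_generation(gen: int) -> float:
    """Return the per-lane transfer rate in GT/s for a PCIe generation."""
    if gen not in GEN_RATE_GTS:
        raise ValueError(f"unsupported PCIe generation: {gen}")
    return GEN_RATE_GTS[gen]


def has_tx_preset_stage(gen: int) -> bool:
    """True for the generations whose training uses a Tx Preset initial value."""
    return gen in (3, 4, 5)
```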
As shown in fig. 3, in a practical embodiment, the link training management method mainly includes the following stages:
stage 1, PCIe link training is initiated during BIOS startup; before the link training, the gen3 Tx Preset initial value of the CPU RootPort (i.e., the root port) corresponding to the link on which the on-board critical PCIe device sits is written into a register;
the gen3 Tx Preset initial value is a preset initial value determined on the hardware side from the PCIe SI parameter characteristics of the server motherboard and from tests on a large number of samples (for example, gen3 Tx Preset = P4);
stage 2, when the corresponding CPU PCIe root port is trained, PCIe link training is performed using the gen3 Tx Preset initial value;
stage 3, the BIOS determines whether the PCIe link training between the PCIe bus and the CPU RootPort is successful by checking whether the VID (i.e., vendor ID) and DID (i.e., device ID) of the on-board critical PCIe device can be read normally.
When the VID and the DID of the on-board critical PCIe device can be read normally, the link training is judged successful and a log is recorded; the BIOS continues to run;
when the VID and the DID of the on-board critical PCIe device cannot be read, the PCIe Gen3 Tx Preset initial value of the CPU RootPort corresponding to the link on which the on-board critical PCIe device sits is adjusted and the link is retried; the method for adjusting the PCIe Gen3 Tx Preset initial value is detailed in stage 5.
After the PCIe Gen3 Tx Preset initial value is modified, a link retraining operation is initiated on the root port connected to the critical PCIe device; then, after a wait of 15 ms, whether the VID and the DID of the critical PCIe device can be read normally is checked;
when the VID and the DID of the on-board critical PCIe device can be read normally, the link training is judged successful, a log is recorded, and the BIOS continues to run;
otherwise, stage 4 and stage 5 are repeated;
stage 5, the PCIe Gen3 Tx Preset initial value is adjusted as follows:
starting from the default preset initial value P4, the initial value is first moved one step in the increasing direction (for example, PCIe Gen3 Tx Preset = P5), and PCIe link retraining is then initiated;
if the link training is judged to have failed, the initial value is next moved one step in the decreasing direction (for example, PCIe Gen3 Tx Preset = P3), and PCIe link retraining is initiated again;
this continues until the presets in the two directions reach P0 and P9, respectively.
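The order in which stage 5 visits the presets can be sketched as a small generator. This is only an illustration under one reading of the text: the two directions are assumed to alternate strictly (one step up, one step down), and the function name `preset_retry_order` and its parameters are hypothetical, not taken from the patent.

```python
def preset_retry_order(default=4, lowest=0, highest=9):
    """Yield the preset indices to try after the default preset has failed.

    Assumption: the increasing and decreasing directions alternate,
    starting with the increasing one, until both bounds are reached.
    """
    up, down = default + 1, default - 1
    while up <= highest or down >= lowest:
        if up <= highest:
            yield up      # one step in the increasing direction
            up += 1
        if down >= lowest:
            yield down    # one step in the decreasing direction
            down -= 1


# With the default P4 and the bounds P0 and P9 named in the text:
order = list(preset_retry_order())
```

With these defaults the sketch visits P5, P3, P6, P2, P7, P1, P8, P0 and finally P9, so every preset between P0 and P9 is tried at most once.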
In summary, without changing the PCIe link training algorithm on the CPU side, the BIOS adaptively adjusts the gen3-stage Tx Preset initial value according to the link training result and then restarts PCIe link training, which solves the startup failure of critical motherboard PCIe devices with poor SI consistency.
In a preferred embodiment, after S3, the training management method further includes:
S32, if the vendor ID and the device ID of the PCIe device can be read normally, judging that the PCIe link training is successful, and recording a log.
In a practical embodiment, when the VID and the DID of the on-board critical PCIe device can be read normally, the link training is judged successful and a log is recorded; the BIOS continues to run.
In a preferred embodiment, after S5, the link training failure repair step further includes:
S521, if the vendor ID and the device ID of the PCIe device cannot be read normally after the link retraining, judging that the PCIe link training has failed;
S522, sequentially taking values from the preset value set of transmitter link equalization parameters as new kth-generation initial values, in the preset-value increasing direction and/or the preset-value decreasing direction of the preset value-selection rule, and performing link retraining on the PCIe device with each of them until the PCIe link training is successful.
In a specific embodiment, the data in the preset value set of transmitter link equalization parameters are arranged in order of value; a value can be taken from this set as the kth-generation initial value according to the preset value-selection rule, so that link training or link retraining can be performed on the PCIe device.
In a practical embodiment, the link training management method may perform the adaptive adjustment and the link training cyclically:
after the PCIe Gen3 Tx Preset initial value is modified, a link retraining operation is initiated on the root port connected to the critical PCIe device; then, after a wait of 15 ms, whether the VID and the DID of the critical PCIe device can be read normally is checked;
when the VID and the DID of the on-board critical PCIe device can be read normally, the link training is judged successful, a log is recorded, and the BIOS continues to run;
otherwise, stage 4 and stage 5 are repeated;
stage 5, the PCIe Gen3 Tx Preset initial value is adjusted as follows:
starting from the default preset initial value P4, the initial value is first moved one step in the increasing direction (for example, PCIe Gen3 Tx Preset = P5), and PCIe link retraining is then initiated;
if the link training is judged to have failed, the initial value is next moved one step in the decreasing direction (for example, PCIe Gen3 Tx Preset = P3), and PCIe link retraining is initiated again;
this continues until the presets in the two directions reach P0 and P9, respectively.
Cyclic adaptive adjustment combined with cyclic link training thus raises the likelihood that link training eventually succeeds.
In addition, if the link training still fails, no further retry is performed: a log is recorded, a link training failure flag is written to the CPLD, and the critical PCIe device link training flow is exited.
In a preferred embodiment, after S522, the training management method further includes:
S523, after the values in the preset value set of transmitter link equalization parameters have been traversed in sequence as new kth-generation initial values, in the preset-value increasing direction and/or the preset-value decreasing direction of the preset value-selection rule, and every link retraining has failed, recording a log and writing a link training failure flag into a CPLD register.
In a practical embodiment, in stage 5 of the link training management method, the method further includes:
if the link training still fails, no further retry is performed: a log is recorded, a link training failure flag is written to the CPLD, and the critical PCIe device link training flow is exited.
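The retry flow of stages 3 to 5, including the exit on exhaustion, can be sketched as follows. This is a simulation under stated assumptions, not firmware: the callbacks `retrain` and `ids_readable`, the `log` list and the `cpld` dictionary with its key `link_training_failed` are hypothetical stand-ins for the register accesses, the BIOS log and the CPLD register described in the text. Only the 15 ms wait comes from the document.

```python
import time


def train_with_fallback(presets, retrain, ids_readable, log, cpld,
                        wait_s=0.015, sleep=time.sleep):
    """Try each preset in order; return the first preset that trains, else None."""
    for attempt, preset in enumerate(presets):
        retrain(preset)      # write the preset, then start link (re)training
        sleep(wait_s)        # the text waits 15 ms before checking VID/DID
        if ids_readable():
            log.append(f"link training succeeded with P{preset} "
                       f"after {attempt} retries")
            return preset
    # Every preset failed: record it and stop retrying.
    log.append("link training failed for every preset")
    cpld["link_training_failed"] = 1   # hypothetical flag name
    return None


# Simulated board on which only P6 trains (made-up behaviour for the demo).
state = {"preset": None}
log, cpld = [], {}
result = train_with_fallback(
    presets=[4, 5, 3, 6, 2, 7, 1, 8, 0, 9],
    retrain=lambda p: state.update(preset=p),
    ids_readable=lambda: state["preset"] == 6,
    log=log, cpld=cpld, sleep=lambda s: None,
)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; real firmware would use its own delay service instead.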
In a preferred embodiment, the training management method further comprises a link stability checking step; the link stability checking step includes a link liveness management step including:
S6, after the PCIe link training is judged successful, judging whether the link liveness flag bit is 1;
S61, if the link liveness flag bit is not 1, performing link retraining in sequence up to a preset number of retraining retries; at each link retraining, setting the link retraining flag bit to 1, and judging again, after a second preset time interval, whether the link liveness flag bit is 1.
In a specific embodiment, to ensure that the link liveness and the link state of the critical PCIe device meet expectations after link training, the link liveness and the link state are each checked during the link training process;
when abnormal link liveness, a link speed reduction or a link lane reduction is detected, repeated PCIe link retraining or PCIe link disable/enable operations are initiated accordingly, which further improves the stability of the critical PCIe device link.
In a practical embodiment, the link stability checking step in the link training management method includes a link liveness management step, which includes:
stage 6, after the critical PCIe device link has been trained successfully, the link liveness is checked: the link liveness flag bit of the link_status register (i.e., the link status register) of the root port is checked for the value 1 to determine whether the connection between the root port and the device below it is normal.
If the link liveness flag bit is 1, the BIOS prints the number of retries and the negotiation result to the log and writes them into a CPLD register;
if the link liveness flag bit is not 1, 1 is written to the link retraining flag bit of the link_control register (i.e., the link control register) of the root port, and after a wait of 15 ms the link liveness flag bit is checked again;
if the link liveness flag bit is still not 1, three link retraining retries are initiated.
In a preferred embodiment, the link liveness management step further comprises:
S62, if the link liveness flag bit is 1, printing the preset number of retraining retries and the link training negotiation result information to a log, and writing them into a CPLD register; the link training negotiation result information comprises the new kth-generation initial value with which the link retraining succeeded.
In a practical embodiment, in the link liveness management step of stage 6, if the link liveness flag bit is 1, the BIOS prints the number of retries and the negotiation result to the log and writes them into the CPLD register.
In a preferred embodiment, after S61, the link liveness management step further comprises:
S63, when link retraining has been performed in sequence for the preset number of retraining retries and the link liveness flag bit is still not 1, printing the preset number of retraining retries and a link training failure flag to a log, and writing them into a CPLD register.
In a practical embodiment, the link liveness management step of stage 6 further includes:
if the link liveness flag bit is still not 1 after three retries, the BIOS prints the number of retries and the link training failure flag to the log, writes them into the CPLD register, and exits the critical PCIe device link training flow.
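The liveness check of stage 6 can be sketched as below. Several points are assumptions rather than statements from the patent: the "link liveness flag bit" is taken to be the Data Link Layer Link Active bit (bit 13) of the PCIe Link Status register, the "link retraining flag bit" is taken to be the Retrain Link bit (bit 5) of the Link Control register, and the retry budget is read as at most three retrain requests in total. The `port` object and the `FakePort` test double are hypothetical.

```python
import time

DLL_LINK_ACTIVE = 1 << 13   # Link Status: Data Link Layer Link Active
RETRAIN_LINK = 1 << 5       # Link Control: Retrain Link


def ensure_link_active(port, max_retries=3, wait_s=0.015, sleep=time.sleep):
    """Return (active, retries_used) after at most max_retries retrain requests."""
    retries = 0
    while not (port.read_link_status() & DLL_LINK_ACTIVE):
        if retries == max_retries:
            return False, retries          # give up: caller logs the failure
        port.write_link_control(port.read_link_control() | RETRAIN_LINK)
        retries += 1
        sleep(wait_s)                      # 15 ms wait from the text
    return True, retries


class FakePort:
    """Test double: the link becomes active after `needed` retrain requests."""

    def __init__(self, needed):
        self.needed, self.requests, self.control = needed, 0, 0

    def read_link_status(self):
        return DLL_LINK_ACTIVE if self.requests >= self.needed else 0

    def read_link_control(self):
        return self.control

    def write_link_control(self, value):
        if value & RETRAIN_LINK:
            self.requests += 1             # Retrain Link is not stored as state


recovered = ensure_link_active(FakePort(needed=2), sleep=lambda s: None)
gave_up = ensure_link_active(FakePort(needed=5), sleep=lambda s: None)
```

The returned retry count is what the text says should be printed to the log and written to the CPLD register alongside either the negotiation result or the failure flag.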
In a preferred embodiment, the link stability checking step further comprises a link state management step comprising:
S71, when the link liveness flag bit is 1, judging whether the link speed in the link state information is smaller than a speed expected value; if it is smaller, judging that a link speed reduction has occurred; wherein the speed expected value corresponds to the lowest of the maximum link speeds specified in the link capability registers;
S72, when the link liveness flag bit is 1, judging whether the link bandwidth in the link state information is smaller than a bandwidth expected value; if it is smaller, judging that a link lane reduction has occurred; wherein the bandwidth expected value corresponds to the lowest of the maximum link bandwidths specified in the link capability registers.
In a practical embodiment, the link stability checking step in the link training management method includes a link state management step including:
stage 7, once the link liveness of the critical PCIe device is normal, the link state is checked, mainly the link speed and the link lane width, as follows:
whether the negotiated width and rate in the link state of the PCIe device are equal to the expected values is checked; the expected values are the smaller of the maximum link speeds and the smaller of the maximum link widths reported in the link capability registers of the devices upstream and downstream of the link on which the PCIe device sits;
if the negotiated link speed of the PCIe device is less than the expected value, a link speed reduction is considered to have occurred;
if the negotiated link width of the PCIe device is less than the expected value, a link lane reduction is considered to have occurred.
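The comparison in stage 7 can be sketched as follows. The field layout used here (speed in bits 3:0 and width in bits 9:4 of both the Link Capabilities and the Link Status registers) is the standard PCIe capability layout and is an assumption about which registers the text means; the function names and the example register values are hypothetical.

```python
def speed_field(reg: int) -> int:
    """Bits 3:0: Max Link Speed (capabilities) or Current Link Speed (status)."""
    return reg & 0xF


def width_field(reg: int) -> int:
    """Bits 9:4: Max Link Width (capabilities) or Negotiated Link Width (status)."""
    return (reg >> 4) & 0x3F


def check_link_state(link_status, cap_upstream, cap_downstream):
    """Compare the negotiated link against the best both ends support."""
    expected_speed = min(speed_field(cap_upstream), speed_field(cap_downstream))
    expected_width = min(width_field(cap_upstream), width_field(cap_downstream))
    return {
        "speed_reduced": speed_field(link_status) < expected_speed,
        "lanes_reduced": width_field(link_status) < expected_width,
    }


# Example values only: a Gen4 x16 root port, a Gen3 x8 device,
# and a link that came up at Gen3 x4.
root_cap = (16 << 4) | 4
dev_cap = (8 << 4) | 3
status = (4 << 4) | 3
verdict = check_link_state(status, root_cap, dev_cap)
```

In this example the expected link is Gen3 x8, so the negotiated Gen3 x4 link counts as a lane reduction but not as a speed reduction.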
In a preferred embodiment, the link state management step further comprises:
S73, when the link state information is normal, recording a log, and writing a link training success flag into a CPLD register.
In a practical embodiment, in the link state management step of stage 7, if the link state is found to be normal (for example, the negotiated width and rate are equal to the expected values), a log is recorded, a link training success flag is written into the CPLD register, and the critical PCIe device link training flow is exited.
In a preferred embodiment, the link state management step further comprises:
S81, when a link speed reduction and/or a link lane reduction occurs, performing link disable/enable operations in sequence up to a preset number of disable repetitions; for each link disable/enable operation, setting the disable bit in the link control register, separating the disable and the enable by a preset operation time interval, and re-reading the link state information after a second waiting time to judge whether the link state is normal.
In a practical embodiment, the link state management step further comprises:
stage 8, when a link speed reduction or a link lane reduction occurs, link disable and link enable are initiated up to 3 times:
one disable/enable cycle is carried out by setting the disable bit (i.e., the link disable bit) of the link_control register (i.e., the link control register) in the RootPort configuration space; the interval between disable and enable is set to 200 ms, and the link state value in the configuration space is read again after a wait of 100 ms.
In a preferred embodiment, after S81, the link state management step further includes:
S82, when link disable/enable operations have been performed in sequence for the preset number of disable repetitions and a link speed reduction and/or a link lane reduction still occurs on every retry, printing the preset number of disable repetitions and a link training failure flag to a log, and writing them into a CPLD register.
In a practical embodiment, the link state management step in stage 8 further comprises:
if a link speed reduction or a link lane reduction still occurs after 3 link disable/enable retries, the BIOS prints the number of retries and the link training failure flag to the log, writes them into the CPLD register, and exits the critical PCIe device link training flow.
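The disable/enable recovery of stage 8 can be sketched as below. The 200 ms and 100 ms intervals and the limit of 3 attempts come from the text; treating the "disable bit" as the Link Disable bit (bit 4) of the PCIe Link Control register is an assumption, and the `port` accessor, the `state_ok` callback and the `FakePort` test double are hypothetical.

```python
import time

LINK_DISABLE = 1 << 4   # Link Control: Link Disable


def recover_by_disable_enable(port, state_ok, max_retries=3,
                              off_s=0.200, settle_s=0.100, sleep=time.sleep):
    """Toggle the link up to max_retries times; return (recovered, attempts)."""
    for attempt in range(1, max_retries + 1):
        ctrl = port.read_link_control()
        port.write_link_control(ctrl | LINK_DISABLE)    # disable the link
        sleep(off_s)                                    # 200 ms between disable and enable
        port.write_link_control(ctrl & ~LINK_DISABLE)   # enable it again
        sleep(settle_s)                                 # 100 ms before re-reading the state
        if state_ok():
            return True, attempt
    return False, max_retries


class FakePort:
    """Test double that counts completed disable-then-enable cycles."""

    def __init__(self):
        self.control, self.cycles = 0, 0

    def read_link_control(self):
        return self.control

    def write_link_control(self, value):
        if (self.control & LINK_DISABLE) and not (value & LINK_DISABLE):
            self.cycles += 1
        self.control = value


port = FakePort()
outcome = recover_by_disable_enable(port, lambda: port.cycles >= 2,
                                    sleep=lambda s: None)
```

Here `state_ok` would be the speed and width comparison from stage 7; on a `(False, 3)` result the caller logs the retry count with the failure flag and leaves the training flow.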
In a preferred embodiment, before S2, the training management method further includes:
S1, acquiring the preset value set of transmitter link equalization parameters according to the PCIe signal parameter characteristics of the server motherboard and the large-scale sample test requirements of the server motherboard; the preset value set of transmitter link equalization parameters comprises the default preset initial value Px.
In a preferred embodiment, after S2, the training management method further includes:
taking the default preset initial value Px as the kth-generation initial value, and performing PCIe link training on the PCIe root port of the server CPU.
In a preferred embodiment, before S2, the training management method further includes:
setting, through a register, the preset value set of transmitter link equalization parameters, which is used for link training of the CPU root port corresponding to the link on which the PCIe device sits.
In a practical embodiment, the link training management method comprises the following stages:
stage 1, PCIe link training is initiated during BIOS startup; before the link training, the gen3 Tx Preset initial value of the CPU RootPort (i.e., the root port) corresponding to the link on which the on-board critical PCIe device sits is written into a register;
the gen3 Tx Preset initial value is a preset initial value determined on the hardware side from the PCIe SI parameter characteristics of the server motherboard and from tests on a large number of samples (for example, gen3 Tx Preset = P4).
Stage 2, when the corresponding CPU PCIe root port is trained, PCIe link training is performed using the gen3 Tx Preset initial value.
In a preferred embodiment, S2 specifically includes:
acquiring the default preset initial value Px from the preset value set of transmitter link equalization parameters, and taking it as the kth-generation initial value for link training of the CPU root port corresponding to the link on which the PCIe device sits.
In a preferred embodiment, S3 specifically includes:
determining, through the basic input/output system (BIOS) firmware, whether the vendor ID and the device ID of the PCIe device can be read normally, and thereby judging whether the PCIe link training between the PCIe bus and the CPU root port is successful.
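The text does not define what "read normally" means. One common reading, assumed here, is that a configuration read to an absent or untrained device returns all ones, so a vendor ID or device ID of 0xFFFF indicates failure. The sketch below applies that assumption to the 32-bit value at configuration offset 0x00, where the vendor ID occupies bits 15:0 and the device ID bits 31:16; the function name and the sample values are illustrative.

```python
INVALID_ID = 0xFFFF   # all ones: typical result when no device responds


def ids_readable(config_dword: int) -> bool:
    """Return True if both IDs in the first configuration dword look valid."""
    vid = config_dword & 0xFFFF            # vendor ID, bits 15:0
    did = (config_dword >> 16) & 0xFFFF    # device ID, bits 31:16
    return vid != INVALID_ID and did != INVALID_ID


present = ids_readable(0x15228086)   # example value: VID 0x8086, DID 0x1522
absent = ids_readable(0xFFFFFFFF)    # nothing answered the read
```

A stricter implementation might also compare the IDs against the values expected for the specific on-board device; the patent text leaves this open.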
In addition, the link training management method further comprises the following stages:
stage 3, the BIOS determines whether the PCIe link training between the PCIe bus and the CPU RootPort is successful by checking whether the VID (i.e., vendor ID) and DID (i.e., device ID) of the on-board critical PCIe device can be read normally.
When the VID and the DID of the on-board critical PCIe device can be read normally, the link training is judged successful and a log is recorded; the BIOS continues to run.
In a practical embodiment, the training management method further includes:
stage 1, PCIe link training is initiated during BIOS startup; before the link training, the gen3 Tx Preset initial value of the CPU RootPort (i.e., the root port) corresponding to the link on which the on-board critical PCIe device sits is written into a register;
the gen3 Tx Preset initial value is a preset initial value determined on the hardware side from the PCIe SI parameter characteristics of the server motherboard and from tests on a large number of samples (for example, gen3 Tx Preset = P4);
stage 2, when the corresponding CPU PCIe root port is trained, PCIe link training is performed using the gen3 Tx Preset initial value.
In summary, the PCIe device link training management method provided by the invention has the following advantages:
1) The BIOS adaptively adjusts the gen3 Tx Preset initial value used for PCIe link training in the startup stage, which solves the startup failure of critical motherboard PCIe devices with poor SI consistency.
2) The link liveness and the link state are also checked; when abnormal link liveness, a link speed reduction or a link lane reduction is detected, repeated PCIe link retraining or PCIe link disable/enable operations are initiated accordingly, which further improves the stability of the critical PCIe device link.
It should be noted that, although the steps in the flowchart are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least a portion of the steps in the flowcharts may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order in which the sub-steps or stages are performed is not necessarily sequential, and may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps or other steps.
Embodiment two:
As shown in fig. 4, the embodiment of the present invention further provides a PCIe device link training management apparatus, configured to perform link training management on a critical PCIe device of a server motherboard by means of PCIe equalization technology, where the training management apparatus includes a link training failure repair unit, which includes:
a link training unit, configured to acquire a default preset initial value Px from a preset value set of transmitter link equalization parameters defined for the kth-generation PCIe protocol, and to take the default preset initial value Px as the kth-generation initial value for performing link training on the PCIe device; the kth-generation initial value is the initial value of the transmitter link equalization parameter used when equalization adjustment is performed for the kth-generation PCIe protocol, and k is not less than 2; the data in the preset value set of transmitter link equalization parameters are arranged in order of value;
a link training judging unit, configured to judge whether the vendor ID and the device ID of the PCIe device can be read normally after link training has been performed on the PCIe device with the kth-generation initial value;
a training failure judging unit, configured to judge that the PCIe link training has failed when the vendor ID and the device ID cannot be read normally;
a link retraining unit, configured to acquire a preset initial value Py from the preset value set of transmitter link equalization parameters according to a preset value-selection rule, and to take the preset initial value Py as a new kth-generation initial value for performing link retraining on the PCIe device;
a link retraining judging unit, configured to judge, after a first waiting time following the link retraining of the PCIe device, whether the vendor ID and the device ID of the PCIe device can be read normally;
and a link retraining success judging unit, configured to judge that the PCIe link training is successful when the vendor ID and the device ID can be read normally after the link retraining.
In a preferred embodiment, the link training failure repair unit further includes:
a link retraining failure judging unit, configured to judge that the PCIe link training has failed if the vendor ID and the device ID of the PCIe device cannot be read normally after the link retraining;
a cyclic link retraining unit, configured to sequentially take values from the preset value set of transmitter link equalization parameters as new kth-generation initial values, in the preset-value increasing direction and/or the preset-value decreasing direction of the preset value-selection rule, and to perform link retraining on the PCIe device with each of them until the PCIe link training is successful.
In a preferred embodiment, the training management device further comprises a link stability checking unit; the link stability checking unit includes a link liveness management unit including:
a link liveness flag bit judging unit, configured to judge whether the link liveness flag bit is 1 after the PCIe link training is judged successful;
a link liveness recovery unit, configured to perform link retraining in sequence up to a preset number of retraining retries if the link liveness flag bit is not 1; at each link retraining, the link retraining flag bit is set to 1, and whether the link liveness flag bit is 1 is judged again after a second preset time interval.
In a preferred embodiment, the training management device further comprises a link state management unit comprising:
a link disabling operation unit, configured to perform link disable/enable operations in sequence up to a preset number of disable repetitions when a link speed reduction and/or a link lane reduction occurs; for each link disable/enable operation, the disable bit in the link control register is set, the disable and the enable are separated by a preset operation time interval, and the link state information is re-read after a second waiting time to judge whether the link state is normal.
In a preferred embodiment, the link stability check unit further comprises a link state management unit comprising:
a link speed reduction judging unit, configured to judge whether the link speed in the link state information is smaller than a speed expected value when the link liveness flag bit is 1; if it is smaller, a link speed reduction is judged to have occurred; wherein the speed expected value corresponds to the lowest of the maximum link speeds specified in the link capability registers;
a link lane reduction judging unit, configured to judge whether the link bandwidth in the link state information is smaller than a bandwidth expected value when the link liveness flag bit is 1; if it is smaller, a link lane reduction is judged to have occurred; wherein the bandwidth expected value corresponds to the lowest of the maximum link bandwidths specified in the link capability registers.
In a preferred embodiment, the link retraining success determination unit is further configured to:
if the link liveness flag bit is 1, printing the preset number of retraining retries and the link training negotiation result information to a log, and writing them into a CPLD register; the link training negotiation result information comprises the new kth-generation initial value with which the link retraining succeeded.
In a preferred embodiment, the link retraining success determination unit is further configured to:
when the link state information is normal, recording a log, and writing a link training success mark into a CPLD register.
In a preferred embodiment, the link retraining success determination unit is further configured to:
if the vendor ID and the device ID of the PCIe device can be read normally, judging that the PCIe link training is successful, and recording logs.
In a preferred embodiment, the link retraining failure determination unit is further configured to:
when link retraining has been performed in sequence for the preset number of retraining retries and the link liveness flag bit is still not 1, printing the preset number of retraining retries and a link training failure flag to a log, and writing them into a CPLD register.
In a preferred embodiment, the link retraining failure determination unit is further configured to:
when the link disabling operation is sequentially carried out according to the preset disabling repetition times and the link speed reduction and/or the link lane reduction occur in each retry, the preset disabling repetition times and the link training failure mark are printed into a log and written into a CPLD register.
In a preferred embodiment, the link retraining failure determination unit is further configured to:
after the values in the preset value set of transmitter link equalization parameters have been traversed in sequence as new kth-generation initial values, in the preset-value increasing direction and/or the preset-value decreasing direction of the preset value-selection rule, and every link retraining has failed, recording a log and writing a link training failure flag into a CPLD (complex programmable logic device) register.
In a preferred embodiment, the training management apparatus further comprises:
the preset value set acquisition unit is used for acquiring the preset value set of the link equalization parameter of the transmitting end according to the PCIe signal parameter characteristics of the server main board and the large-scale sample test requirements of the server main board; the set of preset values of the link equalization parameters of the transmitting end comprises the default preset initial value Px.
In a preferred embodiment, the link training unit is further configured to:
take the default preset initial value Px as the kth-generation initial value, and perform PCIe link training on the PCIe root port of the server CPU.
For specific limitations of the above apparatus, reference may be made to the limitations of the method described above, which are not repeated here.
Each of the modules in the above apparatus may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware, or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
The computer device may be a terminal, as shown in fig. 5, which includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It is to be understood that the structures shown in the above figures are merely block diagrams of some of the structures associated with the present invention and are not limiting of the computer devices to which the present invention may be applied, and that a particular computer device may include more or less components than those shown, or may combine some of the components, or have a different arrangement of components.
Implementation of all or part of the flow in the above-described embodiment methods may be accomplished by a computer program that instructs related hardware, and the computer program may be stored in a non-volatile computer readable storage medium, and the computer program may include the flow in the above-described embodiment methods when executed.
Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (17)

1. A PCIe device link training management method, characterized in that it is used for performing link training management on key PCIe devices of a server mainboard through PCIe equalization technology, and comprises the following steps:
acquiring a default preset initial value Px from a transmitting-end link equalization parameter preset value set configured for the kth generation PCIe protocol, and taking the default preset initial value Px as the kth generation initial value for performing link training on the PCIe device; wherein the kth generation initial value is the initial value of the transmitting-end link equalization parameter when equalization adjustment is performed for the kth generation PCIe protocol, and k is not less than 2; and the data in the transmitting-end link equalization parameter preset value set are arranged in order of numerical value;
after link training is performed on the PCIe device using the kth generation initial value, judging whether the vendor ID and the device ID of the PCIe device can be read normally;
if they cannot be read normally, judging that the PCIe link training has failed;
acquiring a preset initial value Py from the transmitting-end link equalization parameter preset value set according to a preset value-taking rule, and taking the preset initial value Py as a new kth generation initial value for performing link retraining on the PCIe device;
after link retraining is performed on the PCIe device, judging again, after a first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally;
if the vendor ID and the device ID can be read normally after the link retraining, judging that the PCIe link training is successful;
wherein the training management method further comprises a link stability checking step; the link stability checking step includes a link liveness management step, which includes:
after judging that the PCIe link training is successful, judging whether the link activity flag bit is 1;
if the link activity flag bit is not 1, performing link retraining in sequence up to a preset number of retraining retries; at each link retraining, setting a link retraining flag bit to 1, and judging again, after a second preset time interval, whether the link activity flag bit is 1;
wherein the link stability checking step further includes a link state management step, which includes:
when the link activity flag bit is 1, judging whether the link speed in the link state information is less than a speed expected value; if so, judging that link speed reduction has occurred; wherein the speed expected value matches the minimum specification of the maximum link speed in a link capability register;
when the link activity flag bit is 1, judging whether the link bandwidth in the link state information is less than a bandwidth expected value; if so, judging that link lane reduction has occurred; wherein the bandwidth expected value matches the minimum specification of the maximum link bandwidth in the link capability register;
the link state management step further includes:
when link speed reduction and/or link lane reduction occurs, performing link disable/enable operations in sequence up to a preset number of disable retries; during each link disable/enable operation, setting the disable bit in a link control register, setting the interval between disabling and enabling according to a preset operation time interval, and re-reading the link state information after a second waiting time to judge whether the link state is normal.
2. The PCIe device link training management method according to claim 1, wherein after judging again, after the first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally, the link training failure repair step further comprises:
if the vendor ID and the device ID of the PCIe device cannot be read normally after the link retraining, judging that the PCIe link training has failed;
taking values in sequence from the transmitting-end link equalization parameter preset value set, according to a preset value increasing direction and/or a preset value decreasing direction in the preset value-taking rule, as new kth generation initial values, and performing link retraining on the PCIe device with each value in turn until the PCIe link training is successful.
3. The PCIe device link training management method of claim 1, wherein the link liveness management step further comprises:
if the link activity flag bit is 1, printing the preset number of retraining retries and link training negotiation result information to a log, and writing the log into a CPLD register; wherein the link training negotiation result information comprises the new kth generation initial value with which the link retraining succeeded.
4. The PCIe device link training management method of claim 1, wherein the link state management step further comprises:
when the link state information is normal, recording a log, and writing a link training success mark into a CPLD register.
5. The PCIe device link training management method according to claim 1, wherein after judging whether the vendor ID, device ID of the PCIe device can be read normally, the training management method further comprises:
if the vendor ID and the device ID of the PCIe device can be read normally, judging that the PCIe link training is successful, and recording logs.
6. The PCIe device link training management method of claim 1, wherein after performing link retraining in sequence up to the preset number of retraining retries, the link liveness management step further comprises:
when link retraining has been performed in sequence up to the preset number of retraining retries and the link activity flag bit is still not 1, printing the preset number of retraining retries and a link training failure flag to a log, and writing the log into a CPLD register.
7. The PCIe device link training management method of claim 1, wherein after performing link disable/enable operations in sequence up to the preset number of disable retries, the link state management step further comprises:
when link disable/enable operations have been performed in sequence up to the preset number of disable retries and link speed reduction and/or link lane reduction occurs on every retry, printing the preset number of disable retries and a link training failure flag to a log, and writing the log into a CPLD register.
8. The PCIe device link training management method according to claim 2, wherein after taking values in sequence from the transmitting-end link equalization parameter preset value set as new kth generation initial values according to the preset value increasing direction and/or the preset value decreasing direction in the preset value-taking rule, the training management method further comprises:
after all values in the transmitting-end link equalization parameter preset value set have been traversed in sequence as new kth generation initial values according to the preset value increasing direction and/or the preset value decreasing direction in the preset value-taking rule, and every link retraining has failed, recording a log and writing a link training failure flag into a CPLD (complex programmable logic device) register.
9. The PCIe device link training management method according to claim 2, wherein before acquiring the default preset initial value Px as the kth generation initial value from the transmitting-end link equalization parameter preset value set configured for the kth generation PCIe protocol, the training management method further comprises:
obtaining the transmitting-end link equalization parameter preset value set according to the PCIe signal parameter characteristics of the server mainboard and the large-scale sample test requirements of the server mainboard; wherein the transmitting-end link equalization parameter preset value set comprises the default preset initial value Px.
10. The PCIe device link training management method according to claim 9, wherein after acquiring the default preset initial value Px as the kth generation initial value from the transmitting-end link equalization parameter preset value set configured for the kth generation PCIe protocol, the training management method further comprises:
taking the default preset initial value Px as the kth generation initial value, and performing PCIe link training on a PCIe root port of the server CPU.
11. The PCIe device link training management method according to claim 1, wherein acquiring the default preset initial value Px from the transmitting-end link equalization parameter preset value set configured for the kth generation PCIe protocol as the kth generation initial value for performing link training on the PCIe device specifically comprises:
acquiring the default preset initial value Px from the transmitting-end link equalization parameter preset value set as the kth generation initial value, for performing link training on the CPU root port corresponding to the link on which the PCIe device is located.
12. The PCIe device link training management method according to claim 11, wherein judging whether the vendor ID and the device ID of the PCIe device can be read normally after link training is performed on the PCIe device using the kth generation initial value comprises:
reading, through basic input/output system (BIOS) firmware, the vendor ID and the device ID of the PCIe device and judging whether they can be read normally, so as to judge whether PCIe link training between the PCIe bus and the CPU root port is successful.
13. The PCIe device link training management method according to claim 12, wherein before acquiring the default preset initial value Px from the transmitting-end link equalization parameter preset value set configured for the kth generation PCIe protocol as the kth generation initial value for performing link training on the PCIe device, the training management method further comprises:
setting, through a register, the transmitting-end link equalization parameter preset value set used for performing link training on the CPU root port corresponding to the link on which the PCIe device is located.
14. A PCIe device link training management device, characterized in that it is used for performing link training management on key PCIe devices of a server mainboard through PCIe equalization technology, the training management device comprising a link training failure repair unit, which includes:
a link training unit, configured to acquire a default preset initial value Px from a transmitting-end link equalization parameter preset value set configured for the kth generation PCIe protocol, and to take the default preset initial value Px as the kth generation initial value for performing link training on the PCIe device; wherein the kth generation initial value is the initial value of the transmitting-end link equalization parameter when equalization adjustment is performed for the kth generation PCIe protocol, and k is not less than 2; and the data in the transmitting-end link equalization parameter preset value set are arranged in order of numerical value;
a link training judging unit, configured to judge whether the vendor ID and the device ID of the PCIe device can be read normally after link training is performed on the PCIe device using the kth generation initial value;
a training failure judging unit, configured to judge that the PCIe link training has failed when the vendor ID and the device ID cannot be read normally;
a link retraining unit, configured to acquire a preset initial value Py from the transmitting-end link equalization parameter preset value set according to a preset value-taking rule, and to take the preset initial value Py as a new kth generation initial value for performing link retraining on the PCIe device;
a link retraining judging unit, configured to judge again, after a first waiting time, whether the vendor ID and the device ID of the PCIe device can be read normally after link retraining is performed on the PCIe device;
a link retraining success judging unit, configured to judge that the PCIe link training is successful when the vendor ID and the device ID can be read normally after the link retraining;
wherein the training management device further comprises a link stability checking unit; the link stability checking unit includes a link liveness management unit configured to:
after judging that the PCIe link training is successful, judge whether the link activity flag bit is 1;
if the link activity flag bit is not 1, perform link retraining in sequence up to a preset number of retraining retries; at each link retraining, set a link retraining flag bit to 1, and judge again, after a second preset time interval, whether the link activity flag bit is 1;
wherein the link stability checking unit further includes a link state management unit, which includes:
a link speed reduction judging unit, configured to judge, when the link activity flag bit is 1, whether the link speed in the link state information is less than a speed expected value, and if so, to judge that link speed reduction has occurred; wherein the speed expected value matches the minimum specification of the maximum link speed in a link capability register;
a link lane reduction judging unit, configured to judge, when the link activity flag bit is 1, whether the link bandwidth in the link state information is less than a bandwidth expected value, and if so, to judge that link lane reduction has occurred; wherein the bandwidth expected value matches the minimum specification of the maximum link bandwidth in the link capability register;
the link state management unit further includes:
a link disable/enable operation unit, configured to perform link disable/enable operations in sequence up to a preset number of disable retries when link speed reduction and/or link lane reduction occurs; and, during each link disable/enable operation, to set the disable bit in a link control register, set the interval between disabling and enabling according to a preset operation time interval, and re-read the link state information after a second waiting time to judge whether the link state is normal.
15. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of the PCIe device link training management method of any one of claims 1-13.
16. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the PCIe device link training management method of any one of claims 1-13.
17. A server, comprising a critical PCIe device disposed on a motherboard, wherein the PCIe device implements link training management by the PCIe device link training management method according to any one of claims 1-13.
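The control flow recited in claims 1, 2 and 8 can be illustrated with a minimal Python sketch. It is not the patented implementation: `FakePort`, `candidate_order`, `train_with_presets` and `check_stability` are hypothetical names, the port object only simulates register behaviour, and the ordering "Px first, then larger presets ascending, then smaller presets descending" is one assumed reading of the "increasing and/or decreasing direction" rule.

```python
import time
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class FakePort:
    """Hypothetical stand-in for root-port register access (not from the patent)."""
    good_presets: Set[int]      # presets for which vendor/device ID become readable
    active_after: int = 0       # retrains needed before the link activity bit reads 1
    degraded_cycles: int = 0    # disable/enable cycles needed before full speed/width
    max_speed: int = 4          # models the link capability register
    max_width: int = 16
    preset: Optional[int] = None
    retrains: int = 0
    cycles: int = 0

    def train(self, preset: int) -> None:
        self.preset = preset

    def ids_readable(self) -> bool:
        return self.preset in self.good_presets

    def retrain(self) -> None:
        self.retrains += 1      # models setting the link retraining flag bit to 1

    def link_active(self) -> bool:
        return self.retrains >= self.active_after

    def disable_enable(self) -> None:
        self.cycles += 1        # models setting, then clearing, the disable bit

    def status(self) -> Tuple[int, int]:
        if self.cycles >= self.degraded_cycles:
            return self.max_speed, self.max_width
        return self.max_speed - 1, self.max_width // 2


def candidate_order(presets: List[int], px: int) -> List[int]:
    """Assumed rule: default Px, then increasing direction, then decreasing direction."""
    ordered = sorted(presets)
    i = ordered.index(px)
    return [px] + ordered[i + 1:] + ordered[:i][::-1]


def train_with_presets(port: FakePort, presets: List[int], px: int,
                       wait_s: float = 0.0) -> Optional[int]:
    """Return the first preset after which the IDs are readable, else None."""
    for preset in candidate_order(presets, px):
        port.train(preset)
        time.sleep(wait_s)      # stands in for the first waiting time
        if port.ids_readable():
            return preset
    return None                 # every preset failed (the claim 8 case)


def check_stability(port: FakePort, max_retrains: int, max_cycles: int) -> str:
    """Liveness check with bounded retrains, then speed/width check with bounded cycles."""
    tries = 0
    while not port.link_active():
        if tries == max_retrains:
            return "liveness-failed"
        port.retrain()
        tries += 1
    for attempt in range(max_cycles + 1):
        speed, width = port.status()
        if speed >= port.max_speed and width >= port.max_width:
            return "ok"
        if attempt < max_cycles:
            port.disable_enable()
    return "degraded"
```

In firmware the simulated methods would be replaced by configuration-space reads and writes, and each outcome would be logged and written to the CPLD register as claims 3 to 8 describe.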
CN202310061710.4A 2023-01-19 2023-01-19 PCIe device link training management method, management device and server Active CN115878540B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310061710.4A CN115878540B (en) 2023-01-19 2023-01-19 PCIe device link training management method, management device and server
PCT/CN2023/121333 WO2024152604A1 (en) 2023-01-19 2023-09-26 Link training management method and management apparatus for pcie device, and server

Publications (2)

Publication Number Publication Date
CN115878540A (en) 2023-03-31
CN115878540B (en) 2023-06-13

Family

ID=85758702

Country Status (2)

Country Link
CN (1) CN115878540B (en)
WO (1) WO2024152604A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115878540B (en) * 2023-01-19 2023-06-13 苏州浪潮智能科技有限公司 PCIe device link training management method, management device and server
CN118012812B (en) * 2024-04-10 2024-06-18 芯瞳半导体技术(山东)有限公司 PCIE link training method and device, electronic equipment and computer storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105706388A (en) * 2013-12-06 2016-06-22 英特尔公司 Lane error detection and lane removal mechanism of reduce the probability of data corruption
CN105814828A (en) * 2013-12-06 2016-07-27 英特尔公司 Efficient link layer retry protocol utilizing implicit acknowledgements
CN108780436A (en) * 2016-03-23 2018-11-09 高通股份有限公司 Link-speeds control system for power optimization
CN115061962A (en) * 2022-04-28 2022-09-16 苏州浪潮智能科技有限公司 Method, system, storage medium and equipment for managing peripheral transmission rate

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377547B (en) * 2019-06-28 2021-03-12 苏州浪潮智能科技有限公司 Method and device for realizing driver parameter self-adaption in PCIE4.0 link
CN111538539B (en) * 2020-04-23 2022-07-22 苏州浪潮智能科技有限公司 Storage system starting method and device and computer readable storage medium
US11226919B1 (en) * 2020-06-23 2022-01-18 Amazon Technologies, Inc. Communication link recovery
CN114416636A (en) * 2021-12-17 2022-04-29 飞腾信息技术有限公司 PCIE equipment link rate matching method, system on chip and computer equipment
CN114816885A (en) * 2022-05-27 2022-07-29 苏州浪潮智能科技有限公司 Method, device, equipment and medium for automatically adjusting balance value of sending end
CN115048235B (en) * 2022-06-14 2023-05-23 北京百度网讯科技有限公司 Configuration method, device, equipment and medium of link parameters
CN115878540B (en) * 2023-01-19 2023-06-13 苏州浪潮智能科技有限公司 PCIe device link training management method, management device and server

Also Published As

Publication number Publication date
CN115878540A (en) 2023-03-31
WO2024152604A1 (en) 2024-07-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant