CN116955001A - PCIE equipment information acquisition method, system, device and medium - Google Patents

PCIE equipment information acquisition method, system, device and medium

Info

Publication number
CN116955001A
Authority
CN
China
Prior art keywords
pcie
information
equipment
mctp
topology
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310833631.0A
Other languages
Chinese (zh)
Inventor
周志超
李政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310833631.0A
Publication of CN116955001A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/24 Negotiation of communication capabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/142 Reconfiguring to eliminate the error
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1443 Transmit or communication errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • G06F11/221 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test buses, lines or interfaces, e.g. stuck-at or open line faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention provides a method, system, device and medium for acquiring PCIE device information. The method comprises: recording whether each known PCIE device supports the MCTP protocol and using this record as the full PCIE topology; collecting device information for all external PCIE devices of the current server; sending MCTP packets to each external PCIE device in a polling manner according to the information on all external PCIE devices of the current server, and identifying, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices on which information acquisition can be performed and the PCIE devices on which it is not performed; applying pause control to the PCIE devices on which information acquisition is not performed; generating PCIE topology information dedicated to MCTP polling; and acquiring device information from the PCIE devices over the MCTP protocol according to the PCIE topology information dedicated to MCTP polling. The invention effectively avoids machine downtime caused by disabled PCIE devices repeatedly cycling between power-saving and wake-up states, and ensures stable operation of the customer's business.

Description

PCIE equipment information acquisition method, system, device and medium
Technical Field
The invention relates to the technical field of computers, in particular to a method, a system, a device and a medium for acquiring PCIE equipment information.
Background
Currently, the BMC of a server can acquire information about PCIE devices over SMBus/I2C, but I2C is very slow and the information obtainable through it is limited. For example, the status, SN and MAC address of a network card, and some information about a RAID card, cannot be obtained over I2C. Some newer server platforms have therefore started to use MCTP over PCIe to acquire PCIE device information. Compared with traditional BMC management over SMBus/I2C, MCTP (Management Component Transport Protocol) provides more comprehensive management functions. MCTP over PCIe allows the PCIE devices in a system to be managed quickly and improves the reliability of system management, and it offers a high bus speed and strong interference immunity.
However, in some industries, such as finance and telecommunications, certain PCIE devices are not needed for the business at hand. To save power and avoid misoperation, such unused PCIE devices are generally disabled in the operating system. Disabling these PCIE add-in cards lets them enter a power-saving mode under the control of the operating system and drivers.
However, the MCTP packet that the BMC sends to obtain device information is a generic packet that is sent and received according to the PCIE topology; it cannot skip a particular PCIE slot or target a specific slot or device. Every 2 seconds the BMC therefore wakes the disabled devices over MCTP; they return to the normal state and, once the requested information has been read, re-enter power-saving mode. This cycle repeats indefinitely. With an unstable device, if the PCIE device fails to wake up on some occasion, CPU communication times out, the CPU hangs, and the server crashes. In addition, continuously waking the devices wastes software performance and energy and increases the operational risk to the BMC from frequent accesses.
Disclosure of Invention
In view of the above problems, the invention aims to provide a method, system, device and medium for acquiring PCIE device information. For PCIE devices that do not support the MCTP protocol, and for PCIE devices that the customer has disabled because they serve no business purpose, no information is acquired over MCTP, so the BMC no longer needs to poll them for device information, which effectively reduces the burden on the BMC.
To achieve this aim, the invention adopts the following technical solution:
In a first aspect, the invention discloses a method for acquiring PCIE device information, comprising:
recording whether each known PCIE device supports the MCTP protocol, and storing this record as the full PCIE topology;
collecting device information for all external PCIE devices of the current server;
sending MCTP packets to each external PCIE device in a polling manner according to the information on all external PCIE devices of the current server, and identifying, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices on which information acquisition can be performed and the PCIE devices on which it is not performed;
applying pause control to the PCIE devices on which information acquisition is not performed;
based on the information on all external PCIE devices, generating and storing PCIE topology information dedicated to MCTP polling from the device information of the PCIE devices on which information acquisition can be performed;
and acquiring device information from the PCIE devices over the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
Further, recording whether each known PCIE device supports the MCTP protocol and storing this record as the full PCIE topology comprises:
acquiring information on whether each known PCIE device supports the MCTP protocol;
and generating a BDF list from the acquired information, using the BDF list as the full PCIE topology, and storing it in the shared memory of the BMC.
Further, collecting device information for all external PCIE devices of the current server comprises:
during the power-on self-test of the server, collecting device information for all external PCIE devices of the current server through the BIOS;
and storing the collected device information, as the full information, in the shared memory of the BMC.
Further, sending MCTP packets to each external PCIE device in a polling manner according to the information on all external PCIE devices of the current server, and identifying, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices on which information acquisition can be performed and the PCIE devices on which it is not performed, comprises:
according to the full information, sending MCTP packets to the PCIE devices through the BMC every 2 seconds;
and identifying, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices that support MCTP polling, PCIE devices that do not support MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer.
Further, identifying, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices that support MCTP polling, PCIE devices that do not support MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer comprises:
taking the device information of any PCIE device from the full information and marking that device as the PCIE device to be identified;
looking up the record corresponding to the PCIE device to be identified in the full PCIE topology to determine whether it supports the MCTP protocol;
if the record indicates that the MCTP protocol is not supported, the PCIE device to be identified does not support the MCTP protocol and is recorded as a PCIE device on which information acquisition is not performed;
if the record indicates that the MCTP protocol is supported, the PCIE device to be identified supports the MCTP protocol, and its responses are obtained; if the PCIE device receives the MCTP packets three times in succession, it is recorded as a PCIE device on which information acquisition is performed; if the PCIE device fails to receive the MCTP packets three times in succession, it is a faulty PCIE device and is recorded as a PCIE device on which information acquisition is not performed; if the PCIE device returns register information to the BMC after receiving the MCTP packets, it is a PCIE device disabled by the customer and is recorded as a PCIE device on which information acquisition is not performed.
Further, applying pause control to the PCIE devices on which information acquisition is not performed comprises:
the BMC, by sending an IPMI command, triggers a GPIO signal to the CPLD, and the CPLD then triggers a logic signal that controls the E-fuse switch of each PCIE device on which information acquisition is not performed, so that the device stops working.
Further, generating and storing PCIE topology information dedicated to MCTP polling, based on the information on all external PCIE devices and from the device information of the PCIE devices on which information acquisition can be performed, comprises:
generating a list of all external PCIE devices based on the full information;
deleting from that list the PCIE devices that do not support MCTP polling, the faulty PCIE devices and the PCIE devices disabled by the customer, and using the remaining list as the PCIE topology information dedicated to MCTP polling;
and storing the PCIE topology information dedicated to MCTP polling in the shared memory of the BMC.
In a second aspect, the invention also discloses a PCIE device information acquisition system, comprising:
a full PCIE topology recording module, configured to record whether each known PCIE device supports the MCTP protocol and to store this record as the full PCIE topology;
a device information collection module, configured to collect device information for all external PCIE devices of the current server;
a device identification module, configured to send MCTP packets to each external PCIE device in a polling manner according to the information on all external PCIE devices of the current server, and to identify, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices on which information acquisition can be performed and the PCIE devices on which it is not performed;
a pause control module, configured to apply pause control to the PCIE devices on which information acquisition is not performed;
a polling topology generation module, configured to generate and store, based on the information on all external PCIE devices, PCIE topology information dedicated to MCTP polling from the device information of the PCIE devices on which information acquisition can be performed;
and an information acquisition module, configured to acquire device information from the PCIE devices over the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
Further, the full PCIE topology recording module is specifically configured to: acquire information on whether each known PCIE device supports the MCTP protocol; and generate a BDF list from the acquired information, use the BDF list as the full PCIE topology, and store it in the shared memory of the BMC.
Further, the device information collection module is specifically configured to: during the power-on self-test of the server, collect device information for all external PCIE devices of the current server through the BIOS; and store the collected device information, as the full information, in the shared memory of the BMC.
Further, the device identification module is specifically configured to: according to the full information, send MCTP packets to the PCIE devices through the BMC every 2 seconds; and identify, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices that support MCTP polling, PCIE devices that do not support MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer.
Further, the pause control module is specifically configured to: have the BMC, by sending an IPMI command, trigger a GPIO signal to the CPLD, whereupon the CPLD triggers a logic signal that controls the E-fuse switch of each PCIE device on which information acquisition is not performed, so that the device stops working.
Further, the polling topology generation module is specifically configured to: generate a list of all external PCIE devices based on the full information; delete from that list the PCIE devices that do not support MCTP polling, the faulty PCIE devices and the PCIE devices disabled by the customer, and use the remaining list as the PCIE topology information dedicated to MCTP polling; and store the PCIE topology information dedicated to MCTP polling in the shared memory of the BMC.
In a third aspect, the invention also discloses a PCIE device information acquisition apparatus, comprising:
a memory, configured to store a PCIE device information acquisition program;
and a processor, configured to implement the steps of the PCIE device information acquisition method when executing the PCIE device information acquisition program.
In a fourth aspect, the invention further discloses a readable storage medium storing a PCIE device information acquisition program which, when executed by a processor, implements the steps of the PCIE device information acquisition method according to any one of the preceding claims.
Compared with the prior art, the invention has the following beneficial effects. The invention discloses a method, system, device and readable storage medium for acquiring PCIE device information in which a BDF list of MCTP support is written into the BMC to record which devices support the MCTP protocol and which do not. After the machine powers on, the BIOS initializes the external plug-in devices during POST, collects their information and sends it to the BMC. After receiving this information, the BMC initially still sends MCTP packets to the devices according to the original PCIE topology and checks which devices respond and which do not. The response-packet return values of the devices are then compared with the BDF list to identify the devices that do not support the MCTP protocol; these devices are removed from the BDF list, a new PCIE topology is generated, and the BMC thereafter sends MCTP packets only according to the new PCIE topology. The invention effectively solves the problem of IERR machine downtime caused by disabled PCIE devices repeatedly cycling between power-saving and wake-up states, and ensures stable operation of the customer's business. In addition, the invention can effectively identify faulty PCIE devices, which helps improve the stability, reliability, maintainability and product competitiveness of the server.
It can be seen that the present invention has outstanding substantial features and significant advances over the prior art, as well as the benefits of its implementation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method for obtaining PCIE device information in an embodiment of the present invention.
Fig. 2 is a system configuration diagram of a PCIE device information obtaining system according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a PCIE device information obtaining apparatus according to an embodiment of the present invention.
In the figures: 1, full PCIE topology recording module; 2, device information collection module; 3, device identification module; 4, pause control module; 5, polling topology generation module; 6, information acquisition module; 101, processor; 102, memory; 103, input interface; 104, output interface; 105, communication unit; 106, keyboard; 107, display; 108, mouse.
Detailed Description
The invention provides a method for acquiring PCIE device information. In the related art, the MCTP packet that the BMC sends to obtain device information is a generic packet that is sent and received according to the PCIE topology; it cannot skip a particular PCIE slot or target a specific slot or device. Every 2 seconds the BMC wakes the disabled devices over MCTP; they return to the normal state and, once the requested information has been read, re-enter power-saving mode. This cycle repeats indefinitely. With an unstable device, if the PCIE device fails to wake up on some occasion, CPU communication times out, the CPU hangs, and the server crashes. In addition, continuously waking the devices wastes software performance and energy and increases the operational risk to the BMC from frequent accesses.
In the method provided by the invention, whether each known PCIE device supports the MCTP protocol is first recorded as the full PCIE topology, and device information for all external PCIE devices of the current server is collected. Next, according to the information on all external PCIE devices of the current server, MCTP packets are sent to each external PCIE device in a polling manner, and the PCIE devices on which information acquisition can be performed and those on which it is not performed are identified according to the full PCIE topology and the responses of the PCIE devices. Pause control is then applied to the PCIE devices on which information acquisition is not performed, and, based on the information on all external PCIE devices, PCIE topology information dedicated to MCTP polling is generated from the device information of the PCIE devices on which information acquisition can be performed. Finally, device information is acquired from the PCIE devices over the MCTP protocol according to the PCIE topology information dedicated to MCTP polling. The BMC thus sends MCTP packets only according to a PCIE topology that excludes the PCIE devices on which information acquisition is not performed, which effectively avoids IERR machine downtime caused by disabled PCIE devices repeatedly cycling between power-saving and wake-up states and ensures stable operation of the customer's business.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
The following explains key terms appearing in the present invention.
BMC: Baseboard Management Controller, the controller used for remote management of the server.
IPMI: Intelligent Platform Management Interface, an industry standard for managing the peripheral devices used in Intel-architecture enterprise systems.
CPLD: Complex Programmable Logic Device.
EEPROM: Electrically Erasable Programmable Read-Only Memory.
GPIO: General Purpose Input/Output.
VRD: Voltage Regulator Down, a voltage regulator.
E-FUSE: electronic fuse.
MCTP: Management Component Transport Protocol.
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to Fig. 1, this embodiment provides a method for acquiring PCIE device information, comprising the following steps:
S1: Record whether each known PCIE device supports the MCTP protocol, and store this record as the full PCIE topology.
In a specific embodiment, information on whether each known PCIE device supports the MCTP protocol is first acquired; a BDF list is then generated from the acquired information, used as the full PCIE topology, and stored in the shared memory of the BMC.
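As a rough illustration of S1, the sketch below models the BDF list as a mapping from a PCI bus/device/function address to an MCTP-support flag. The `Bdf` class, the `bb:dd.f` text form and the sample records in `KNOWN_DEVICES` are assumptions made for this example; the description does not specify how the list is laid out in the BMC's shared memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bdf:
    """A PCI bus/device/function address (hypothetical helper type)."""
    bus: int
    device: int
    function: int

    @classmethod
    def parse(cls, text: str) -> "Bdf":
        # Accepts the common hexadecimal "bb:dd.f" notation, e.g. "3b:00.0".
        bus, rest = text.split(":")
        device, function = rest.split(".")
        return cls(int(bus, 16), int(device, 16), int(function, 16))

    def __str__(self) -> str:
        return f"{self.bus:02x}:{self.device:02x}.{self.function:x}"


# Made-up sample records: address -> "supports MCTP".
KNOWN_DEVICES = {"3b:00.0": True, "5e:00.0": False, "86:00.1": True}


def build_full_topology(records: dict) -> dict:
    """Build the 'full PCIE topology' as a BDF-keyed support table."""
    return {Bdf.parse(addr): supported for addr, supported in records.items()}


full_topology = build_full_topology(KNOWN_DEVICES)
```

Keying the table by a hashable address type keeps the later lookup in S3 a single dictionary access per device.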
S2: Collect device information for all external PCIE devices of the current server.
In a specific embodiment, during the power-on self-test (POST) of the server, device information for all external PCIE devices of the current server is collected through the BIOS. Once collection is complete, the collected device information is stored, as the full information, in the shared memory of the BMC.
S3: According to the information on all external PCIE devices of the current server, send MCTP packets to each external PCIE device in a polling manner, and identify, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices on which information acquisition can be performed and the PCIE devices on which it is not performed.
In a specific embodiment, this step comprises:
According to the full information, the BMC sends MCTP packets to the PCIE devices every 2 seconds. That is, once every 2 seconds the BMC broadcasts MCTP packets to all external PCIE devices of the server, regardless of whether a given PCIE device supports the MCTP protocol or has been disabled by the customer.
According to the full PCIE topology and the responses of the PCIE devices, PCIE devices that support MCTP polling, PCIE devices that do not support MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer are identified.
The purpose of this step is to distinguish these four kinds of device, to record the PCIE devices that support MCTP polling as PCIE devices on which information acquisition can be performed, and to record the PCIE devices that do not support MCTP polling, the faulty PCIE devices and the PCIE devices disabled by the customer as PCIE devices on which information acquisition is not performed.
The identification process for a PCIE device is as follows:
First, the device information of any PCIE device is taken from the full information, and that device is marked as the PCIE device to be identified.
The record corresponding to the PCIE device to be identified is looked up in the full PCIE topology to determine whether the device supports the MCTP protocol.
If the record indicates that the MCTP protocol is not supported, the PCIE device to be identified does not support the MCTP protocol and is recorded as a PCIE device on which information acquisition is not performed. To save resources, the device is no longer polled in subsequent MCTP polling.
If the record indicates that the MCTP protocol is supported, the PCIE device to be identified supports the MCTP protocol, and its responses are obtained. Faulty PCIE devices and PCIE devices disabled by the customer are identified from these responses.
To avoid false alarms caused by jitter and similar factors, the identification result is confirmed from the responses to three broadcasts. Specifically, if the PCIE device responds to the MCTP packets three times in succession, returning a response packet each time, it is recorded as a PCIE device on which information acquisition is performed. If the PCIE device fails to receive the MCTP packets three times in succession, it is a faulty PCIE device and is recorded as a PCIE device on which information acquisition is not performed; for a faulty PCIE device an SEL alarm is raised in real time to alert the user, and to save resources the device is no longer polled in subsequent MCTP polling. If the PCIE device returns register information to the BMC after receiving the MCTP packets, it is a PCIE device disabled by the customer and is recorded as a PCIE device on which information acquisition is not performed; to save resources, this device too is no longer polled in subsequent MCTP polling.
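The identification rules above can be sketched as a small classifier. This is only an illustration under stated assumptions: the `Reply` and `DeviceClass` enums and the `classify` function are hypothetical names, and requiring three identical consecutive replies before declaring a device customer-disabled is an assumption (the text states the three-round rule explicitly only for the responsive and faulty cases).

```python
from enum import Enum


class Reply(Enum):
    RESPONSE_PACKET = "response packet"  # normal MCTP response
    REGISTER_INFO = "register info"      # device returned register information
    NONE = "no reply"                    # nothing came back


class DeviceClass(Enum):
    POLLABLE = "information acquisition performed"
    UNSUPPORTED = "does not support MCTP"
    FAULTY = "faulty"
    DISABLED = "disabled by customer"
    UNDECIDED = "needs more polling rounds"


CONFIRM_ROUNDS = 3  # three broadcasts, to filter out jitter

_OUTCOME = {
    Reply.RESPONSE_PACKET: DeviceClass.POLLABLE,
    Reply.NONE: DeviceClass.FAULTY,
    Reply.REGISTER_INFO: DeviceClass.DISABLED,
}


def classify(supports_mctp: bool, replies: list) -> DeviceClass:
    """Classify one device from its full-topology record and reply history."""
    if not supports_mctp:
        return DeviceClass.UNSUPPORTED
    recent = replies[-CONFIRM_ROUNDS:]
    # Decide only once the last three rounds agree with each other.
    if len(recent) < CONFIRM_ROUNDS or len(set(recent)) != 1:
        return DeviceClass.UNDECIDED
    return _OUTCOME[recent[0]]
```

Only `DeviceClass.POLLABLE` devices would stay in the dedicated polling topology; a `FAULTY` result is also the point at which the SEL alarm described above would be raised.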
S4: Apply pause control to the PCIE devices on which information acquisition is not performed.
In a specific embodiment, the BMC, by sending an IPMI command, triggers a GPIO signal to the CPLD, and the CPLD then triggers a logic signal that controls the E-fuse switch of the PCIE device. The unused add-in device thus stops working under the operating system, which prevents the PCIE device from continuously switching between power-saving and wake-up states.
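The control chain in S4 (IPMI command, GPIO signal, CPLD logic signal, E-fuse switch) can be modelled in software to make the order of hand-offs concrete. The sketch below is a pure simulation: the classes `Bmc`, `Cpld` and `EFuse`, the method names and the per-slot indexing are all hypothetical, and no real IPMI, GPIO or CPLD interface is used.

```python
class EFuse:
    """Simulated E-fuse switch for one PCIE slot."""

    def __init__(self) -> None:
        self.closed = True  # True means the slot is powered

    def open(self) -> None:
        self.closed = False  # cut power so the device stops working


class Cpld:
    """Simulated CPLD that maps a GPIO event to an E-fuse logic signal."""

    def __init__(self, efuses: dict) -> None:
        self.efuses = efuses

    def on_gpio(self, slot: int) -> None:
        # The CPLD drives the logic signal for the selected slot.
        self.efuses[slot].open()


class Bmc:
    """Simulated BMC that turns an IPMI pause request into a GPIO event."""

    def __init__(self, cpld: Cpld) -> None:
        self.cpld = cpld

    def handle_ipmi_pause(self, slot: int) -> None:
        # In hardware this would toggle a GPIO line wired to the CPLD.
        self.cpld.on_gpio(slot)


efuses = {1: EFuse(), 2: EFuse()}
bmc = Bmc(Cpld(efuses))
bmc.handle_ipmi_pause(2)  # pause the device in hypothetical slot 2
```

After the call, only the E-fuse for slot 2 is open; slot 1 is unaffected.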
S5: Based on the information on all external PCIE devices, generate and store PCIE topology information dedicated to MCTP polling from the device information of the PCIE devices on which information acquisition can be performed.
In a specific embodiment, a list of all external PCIE devices is first generated from the full information; the PCIE devices that do not support MCTP polling, the faulty PCIE devices and the PCIE devices disabled by the customer are then deleted from that list, and the remaining list is used as the PCIE topology information dedicated to MCTP polling; finally, the PCIE topology information dedicated to MCTP polling is stored in the shared memory of the BMC.
The PCIE topology information dedicated to MCTP polling may be stored in a data table that records the device name, DI, polling time, and device status information.
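The filtering in step S5 and the four table columns can be sketched as follows. The record layout is an assumption: the text names the columns but not their types, "DI" is kept as an opaque string because the text does not expand it, and treating a device with no recorded state as excluded is a choice made for this sketch only.

```python
from dataclasses import dataclass
from typing import Dict, List

# States that remove a device from the polling topology (labels are hypothetical).
EXCLUDED = {"unsupported", "faulty", "disabled"}


@dataclass
class PollEntry:
    name: str            # device name
    di: str              # the "DI" column, kept opaque as in the text
    polling_time: float  # hypothetical: timestamp of the last poll, 0.0 if never
    status: str          # device status information


def build_polling_topology(all_devices: List[Dict[str, str]],
                           states: Dict[str, str]) -> List[PollEntry]:
    """Delete excluded devices from the full list; the rest is the polling topology."""
    topology = []
    for dev in all_devices:
        # Assumption: a device without a recorded state is not polled.
        state = states.get(dev["di"], "unsupported")
        if state in EXCLUDED:
            continue
        topology.append(PollEntry(dev["name"], dev["di"], 0.0, state))
    return topology
```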
S6: Acquire the device information of the PCIE devices by using the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
In a specific embodiment, the BMC broadcasts an MCTP packet every 2 seconds to the PCIE devices recorded in the PCIE topology information dedicated to MCTP polling, and each PCIE device returns its device information to the BMC after receiving the MCTP packet.
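The 2-second polling of step S6 can be sketched as a loop over the polling topology. The `query` callback stands in for the actual MCTP exchange, which the text does not detail; its signature, the `rounds` bound and the injectable `sleep` are assumptions introduced so the sketch can be run and tested without hardware.

```python
import time
from typing import Callable, Dict, Iterable, Optional

POLL_INTERVAL_S = 2.0  # from the text: one broadcast every 2 seconds


def poll_rounds(devices: Iterable[str],
                query: Callable[[str], Optional[dict]],
                rounds: int,
                sleep: Callable[[float], None] = time.sleep) -> Dict[str, dict]:
    """Query every device in the polling topology once per round.

    Returns the most recent device information received from each device.
    """
    devices = list(devices)
    latest: Dict[str, dict] = {}
    for i in range(rounds):
        for dev in devices:
            info = query(dev)  # hypothetical stand-in for the MCTP request/response
            if info is not None:
                latest[dev] = info
        if i < rounds - 1:
            sleep(POLL_INTERVAL_S)
    return latest
```

Because only devices in the dedicated topology are passed in, unsupported, faulty and disabled devices are never queried, which is the resource saving the method aims for.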
Therefore, the invention provides a PCIE device information acquisition method in which a BDF list is written in the BMC to record which devices support the MCTP protocol and which do not. After the machine is powered on, the BIOS initializes the plug-in devices during POST, collects their information and sends it to the BMC. After receiving it, the BMC still sends MCTP protocol packets to the devices according to the original PCIE topology and checks which devices respond and which do not. The devices are then compared against the BDF list according to the returned response packets, the devices that do not support the MCTP protocol are identified and removed from the BDF list, and a new PCIE topology is generated. The BMC subsequently sends MCTP protocol packets only according to the new PCIE topology, which effectively avoids the problem of a disabled PCIE device repeatedly entering the energy-saving wake-up state and causing an IERR machine crash, and ensures stable operation of customer services.
Referring to fig. 2, the invention also discloses a PCIE device information acquisition system, including: a full PCIE topology recording module 1, a device information collection module 2, a device identification module 3, a pause control module 4, a polling topology generation module 5 and an information acquisition module 6.
The full PCIE topology recording module 1 is used for recording whether each known PCIE device supports the MCTP protocol, taking this information as the full PCIE topology, and storing it.
In a specific embodiment, the full PCIE topology recording module 1 is specifically configured to: acquire information on whether each known PCIE device supports the MCTP protocol; and generate a BDF list according to the acquired information, take the BDF list as the full PCIE topology, and store it in the shared memory of the BMC.
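A BDF (bus, device, function) list of this kind can be sketched as a mapping from a parsed BDF to a support flag. The textual form `bb:dd.f` (hexadecimal bus and device, function 0 to 7) is the common PCI notation; how the BMC actually encodes the list in shared memory is not stated in the text, so the input dictionary and the function names here are assumptions.

```python
import re
from typing import Dict, NamedTuple


class Bdf(NamedTuple):
    bus: int       # 0x00-0xFF
    device: int    # 0x00-0x1F
    function: int  # 0-7


_BDF_RE = re.compile(r"^([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$")


def parse_bdf(text: str) -> Bdf:
    """Parse 'bb:dd.f' into its three numeric fields."""
    m = _BDF_RE.match(text)
    if m is None:
        raise ValueError(f"not a BDF string: {text!r}")
    bus, dev, fn = int(m.group(1), 16), int(m.group(2), 16), int(m.group(3))
    if dev > 0x1F:  # the device number is a 5-bit field
        raise ValueError(f"device number out of range: {text!r}")
    return Bdf(bus, dev, fn)


def build_bdf_list(known: Dict[str, bool]) -> Dict[Bdf, bool]:
    """Map each known device's BDF to whether it supports the MCTP protocol."""
    return {parse_bdf(k): v for k, v in known.items()}
```

A lookup such as `build_bdf_list({"3b:00.0": True})[parse_bdf("3b:00.0")]` then answers the support question that the identification step asks for each device.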
The device information collection module 2 is used for collecting device information of all external PCIE devices of the current server.
In a specific embodiment, the device information collection module 2 is specifically configured to: collect, through the BIOS, device information of all external PCIE devices of the current server during the power-on self-test of the server; and take the collected device information as the full information and store it in the shared memory of the BMC.
The device identification module 3 is used for sending MCTP protocol packets to each external PCIE device in a polling manner according to the information of all external PCIE devices of the current server, and identifying, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices capable of performing the information acquisition operation and the PCIE devices that do not perform it.
In a specific embodiment, the device identification module 3 is specifically configured to: send, according to the full information, an MCTP protocol packet to the PCIE devices through the BMC every 2 seconds; and identify, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices supporting MCTP polling, PCIE devices not supporting MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer.
The pause control module 4 is used for performing pause control on PCIE devices that do not perform the information acquisition operation.
In a specific embodiment, the pause control module 4 is specifically configured to: have the BMC send an IPMI instruction that triggers a GPIO signal to the CPLD, after which the CPLD triggers a logic signal to control the E-fuse switch of each PCIE device that does not perform the information acquisition operation, so that the device stops working.
The polling topology generation module 5 is used for generating, based on the information of all external PCIE devices and according to the device information of the PCIE devices capable of performing the information acquisition operation, PCIE topology information dedicated to MCTP polling, and storing it.
In a specific embodiment, the polling topology generation module 5 is specifically configured to: generate a list of all external PCIE devices based on the full information; delete PCIE devices that do not support MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer from that list, and take the remaining list as the PCIE topology information dedicated to MCTP polling; and store the PCIE topology information dedicated to MCTP polling in the shared memory of the BMC.
The information acquisition module 6 is used for acquiring the device information of the PCIE devices by using the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
In a specific embodiment, the information acquisition module 6 is specifically configured to: have the BMC broadcast an MCTP packet every 2 seconds to the PCIE devices recorded in the PCIE topology information dedicated to MCTP polling, after which each PCIE device returns its device information to the BMC upon receiving the MCTP packet.
Therefore, the invention provides a PCIE device information acquisition system in which the BMC sends MCTP protocol packets only according to a PCIE topology that excludes the PCIE devices that do not perform the information acquisition operation. This effectively avoids the problem of a disabled PCIE device repeatedly entering the energy-saving wake-up state and causing an IERR machine crash, and ensures stable operation of customer services.
Referring to fig. 3, the invention also discloses a PCIE device information acquisition apparatus, including a processor 101 and a memory 102; the processor 101 executes the PCIE device information acquisition program stored in the memory to implement the following steps:
1. Record whether each known PCIE device supports the MCTP protocol, take this information as the full PCIE topology, and store it.
2. Collect the device information of all external PCIE devices of the current server.
3. According to the information of all external PCIE devices of the current server, send MCTP protocol packets to each external PCIE device in a polling manner, and identify, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices capable of performing the information acquisition operation and the PCIE devices that do not perform it.
4. Perform pause control on PCIE devices that do not perform the information acquisition operation.
5. Based on the information of all external PCIE devices, generate PCIE topology information dedicated to MCTP polling according to the device information of the PCIE devices capable of performing the information acquisition operation, and store it.
6. Acquire the device information of the PCIE devices by using the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
The PCIE device information acquisition apparatus provided in this embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like.
The processor 101 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 101 may be implemented in at least one hardware form among a digital signal processor (DSP), a field-programmable gate array (FPGA) and a programmable logic array (PLA). The processor 101 may also include a main processor and a coprocessor: the main processor, also referred to as the central processing unit (CPU), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 101 may be integrated with a graphics processing unit (GPU), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 101 may also include an artificial intelligence (AI) processor for handling computing operations related to machine learning.
The memory 102 may include one or more computer-readable storage media, which may be non-transitory. The memory 102 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 102 is at least configured to store a computer program which, after being loaded and executed by the processor 101, implements the relevant steps of the PCIE device information acquisition method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 102 may include an operating system, data and the like, and the storage may be transient or permanent. The operating system may include Windows, Unix, Linux, among others. The data may include, but is not limited to, the data involved in the above PCIE device information acquisition method.
Further, the PCIE device information acquisition apparatus in this embodiment may further include:
An input interface 103 for acquiring an externally imported PCIE device information acquisition program and storing it in the memory 102, and also for acquiring various instructions and parameters transmitted by an external terminal device and passing them to the processor 101, so that the processor 101 performs the corresponding processing using those instructions and parameters. In this embodiment, the input interface 103 may specifically include, but is not limited to, a USB interface, a serial interface, a voice input interface, a fingerprint input interface, a hard disk reading interface, and the like.
An output interface 104 for outputting various data generated by the processor 101 to a terminal device connected to it, so that other terminal devices connected to the output interface 104 can acquire the data generated by the processor 101. In this embodiment, the output interface 104 may specifically include, but is not limited to, a USB interface, a serial interface, and the like.
A communication unit 105 for establishing a remote communication connection between the server operation service optimizing configuration device and an external server, so that the PCIE device information acquisition apparatus can mount an image file in the external server. In this embodiment, the communication unit 105 may specifically include, but is not limited to, a remote communication unit based on a wireless communication technology or a wired communication technology.
A keyboard 106 for acquiring, in real time, various parameter data or instructions entered by the user by pressing the keys.
A display 107 for displaying, in real time, information related to the running PCIE device information acquisition process.
A mouse 108, which may be used to assist the user in entering data and to simplify the user's operation.
The invention also discloses a readable storage medium, which includes random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. The readable storage medium stores a PCIE device information acquisition program which, when executed by a processor, implements the following steps:
1. Record whether each known PCIE device supports the MCTP protocol, take this information as the full PCIE topology, and store it.
2. Collect the device information of all external PCIE devices of the current server.
3. According to the information of all external PCIE devices of the current server, send MCTP protocol packets to each external PCIE device in a polling manner, and identify, according to the full PCIE topology and the responses of the PCIE devices, the PCIE devices capable of performing the information acquisition operation and the PCIE devices that do not perform it.
4. Perform pause control on PCIE devices that do not perform the information acquisition operation.
5. Based on the information of all external PCIE devices, generate PCIE topology information dedicated to MCTP polling according to the device information of the PCIE devices capable of performing the information acquisition operation, and store it.
6. Acquire the device information of the PCIE devices by using the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
In summary, for PCIE devices that do not support the MCTP protocol and for PCIE devices that the customer has disabled because the service does not use them, the BMC cannot acquire information through the MCTP protocol, so under the invention the BMC no longer polls these devices for device information, which effectively reduces the load on the BMC.
In this specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to each other. Since the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant points, refer to the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division into units is merely a logical functional division, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, system or unit, and may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit.
Similarly, each processing unit in the embodiments of the present invention may be integrated in one functional module, or each processing unit may exist alone physically, or two or more processing units may be integrated in one functional module.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The method, the system, the device and the readable storage medium for obtaining the PCIE device information provided by the present invention are described in detail above. The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present invention and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.

Claims (10)

1. A PCIE device information acquisition method, characterized by comprising the following steps:
recording whether each known PCIE device supports the MCTP protocol, taking this information as a full PCIE topology, and storing it;
collecting device information of all external PCIE devices of a current server;
sending MCTP protocol packets to each external PCIE device in a polling manner according to the information of all external PCIE devices of the current server, and identifying, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices capable of performing an information acquisition operation and PCIE devices that do not perform the information acquisition operation;
performing pause control on the PCIE devices that do not perform the information acquisition operation;
generating, based on the information of all external PCIE devices and according to the device information of the PCIE devices capable of performing the information acquisition operation, PCIE topology information dedicated to MCTP polling, and storing it;
and acquiring the device information of the PCIE devices by using the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
2. The PCIE device information acquisition method of claim 1, wherein recording whether each known PCIE device supports the MCTP protocol, taking this information as a full PCIE topology, and storing it includes:
acquiring information on whether each known PCIE device supports the MCTP protocol;
and generating a BDF list according to the acquired information, taking the BDF list as the full PCIE topology, and storing the BDF list in a shared memory of the BMC.
3. The PCIE device information acquisition method of claim 2, wherein collecting device information of all external PCIE devices of the current server includes:
collecting, through the BIOS, device information of all external PCIE devices of the current server during the power-on self-test of the server;
and taking the collected device information as full information and storing it in the shared memory of the BMC.
4. The PCIE device information acquisition method of claim 3, wherein sending MCTP protocol packets to each external PCIE device in a polling manner according to the information of all external PCIE devices of the current server, and identifying, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices capable of performing the information acquisition operation and PCIE devices that do not perform the information acquisition operation includes:
sending, according to the full information, an MCTP protocol packet to the PCIE devices through the BMC every 2 seconds;
and identifying, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices supporting MCTP polling, PCIE devices not supporting MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer.
5. The PCIE device information acquisition method of claim 4, wherein identifying, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices supporting MCTP polling, PCIE devices not supporting MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer includes:
acquiring the device information of any PCIE device from the full information, and marking that device as the PCIE device to be identified;
searching the full PCIE topology for the record corresponding to the PCIE device to be identified, so as to determine whether the PCIE device to be identified supports the MCTP protocol;
if the record indicates that the MCTP protocol is not supported, the PCIE device to be identified does not support the MCTP protocol and is recorded as a PCIE device that does not perform the information acquisition operation;
if the record indicates that the MCTP protocol is supported, the PCIE device to be identified supports the MCTP protocol, and the response of the PCIE device is obtained;
if the PCIE device receives the MCTP protocol packet three times in succession, the PCIE device is recorded as a PCIE device that performs the information acquisition operation;
if the PCIE device fails to receive the MCTP protocol packet three times in succession, the PCIE device is a faulty PCIE device and is recorded as a PCIE device that does not perform the information acquisition operation;
if the PCIE device returns register information to the BMC after receiving the MCTP protocol packet, the PCIE device is a PCIE device disabled by the customer and is recorded as a PCIE device that does not perform the information acquisition operation.
6. The PCIE device information acquisition method of claim 1, wherein performing pause control on the PCIE devices that do not perform the information acquisition operation includes:
sending, by the BMC, an IPMI instruction that triggers a GPIO signal to the CPLD, after which the CPLD triggers a logic signal to control the E-fuse switch of each PCIE device that does not perform the information acquisition operation, so that the device stops working.
7. The PCIE device information acquisition method of claim 5, wherein generating, based on the information of all external PCIE devices and according to the device information of the PCIE devices capable of performing the information acquisition operation, PCIE topology information dedicated to MCTP polling, and storing it includes:
generating a list of all external PCIE devices based on the full information;
deleting PCIE devices that do not support MCTP polling, faulty PCIE devices and PCIE devices disabled by the customer from the list of all external PCIE devices, and taking the remaining list as the PCIE topology information dedicated to MCTP polling;
and storing the PCIE topology information dedicated to MCTP polling in the shared memory of the BMC.
8. A PCIE device information acquisition system, characterized by comprising:
a full PCIE topology recording module, used for recording whether each known PCIE device supports the MCTP protocol, taking this information as a full PCIE topology, and storing it;
a device information collection module, used for collecting device information of all external PCIE devices of a current server;
a device identification module, used for sending MCTP protocol packets to each external PCIE device in a polling manner according to the information of all external PCIE devices of the current server, and identifying, according to the full PCIE topology and the responses of the PCIE devices, PCIE devices capable of performing an information acquisition operation and PCIE devices that do not perform the information acquisition operation;
a pause control module, used for performing pause control on the PCIE devices that do not perform the information acquisition operation;
a polling topology generation module, used for generating, based on the information of all external PCIE devices and according to the device information of the PCIE devices capable of performing the information acquisition operation, PCIE topology information dedicated to MCTP polling, and storing it;
and an information acquisition module, used for acquiring the device information of the PCIE devices by using the MCTP protocol according to the PCIE topology information dedicated to MCTP polling.
9. A PCIE device information acquisition apparatus, characterized by comprising:
a memory, used for storing a PCIE device information acquisition program;
a processor, configured to implement the steps of the PCIE device information acquisition method according to any one of claims 1 to 7 when executing the PCIE device information acquisition program.
10. A readable storage medium, characterized in that the readable storage medium stores a PCIE device information acquisition program which, when executed by a processor, implements the steps of the PCIE device information acquisition method according to any one of claims 1 to 7.
CN202310833631.0A 2023-07-07 2023-07-07 PCIE equipment information acquisition method, system, device and medium Pending CN116955001A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310833631.0A CN116955001A (en) 2023-07-07 2023-07-07 PCIE equipment information acquisition method, system, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310833631.0A CN116955001A (en) 2023-07-07 2023-07-07 PCIE equipment information acquisition method, system, device and medium

Publications (1)

Publication Number Publication Date
CN116955001A true CN116955001A (en) 2023-10-27

Family

ID=88450372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310833631.0A Pending CN116955001A (en) 2023-07-07 2023-07-07 PCIE equipment information acquisition method, system, device and medium

Country Status (1)

Country Link
CN (1) CN116955001A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination