CN111190848B - Method and device for server to read GPU - Google Patents
Method and device for server to read GPU
- Publication number
- CN111190848B
- Authority
- CN
- China
- Prior art keywords
- gpu
- information
- bmc
- hardware
- pcie
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4022—Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2213/00—Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F2213/0026—PCI express
Abstract
The invention discloses a method and a device for a server to read a GPU. The method comprises the following steps: the BIOS synchronizes a first part of the information of each identified PCIE device to the BMC; the BMC determines, from the first part of the information synchronized by the BIOS, whether the PCIE device is a GPU and, if so, maps the first part of the information to the hardware position of the corresponding GPU, so as to locate that GPU and read a second part of its information; the first part and the second part of the information are then integrated and displayed. With this technical scheme, the complete GPU information can be displayed intuitively against the corresponding silk-screen position.
Description
Technical Field
The invention relates to the technical field of servers, in particular to a method and a device for reading a GPU by a server.
Background
The current SMBUS access mode obtains only limited information about the graphics processor: the temperature, device identification code, manufacturer identification code, sub-device identification code and sub-manufacturer identification code. The BIOS (Basic Input Output System) can obtain only the device identification code, manufacturer identification code, sub-device identification code, sub-manufacturer identification code, type, model, link rate and link width of the graphics processor through the PCIE protocol.
In the prior art, the graphics processor driver is installed through the OS, and detailed information about the graphics processor can then be obtained in the SMBUS in-band mode, but only as a large volume of command-line output.
In the prior art, GPU addresses are fixed. To use multiple GPUs at the same time, the hardware design must place the GPU devices on different I2C channels, or expand the same I2C channel through a chip to distinguish them. The position information on the hardware is therefore fixed and needs a corresponding silk-screen label, and the BMC can obtain the I2C channel position of each GPU through SMBUS access and map it to the silk screen. However, not all of the GPU information can be obtained in this way.
The BIOS identifies GPUs in the order of the PCIE interfaces, so it has no complete correspondence with the hardware positions, and the correspondence seen by the BIOS becomes uncertain when the PCIE cable link mode changes.
The SMBUS in-band mode is not convenient enough: the information is returned only as a large amount of command-line data, the desired information cannot be located quickly, and the real-time state of a given GPU cannot be monitored and displayed intuitively. Moreover, once the GPU slots are not fully populated, the order in which GPUs are displayed in the system differs from the silk-screen order.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a method and a device for reading a GPU by a server, which can intuitively display complete GPU information at a corresponding silk-screen position.
The technical scheme of the invention is realized as follows:
According to an aspect of the present invention, there is provided a method for a server to read a GPU, including:
the BIOS synchronizes a first part of the information of each identified PCIE device to the BMC;
the BMC determines, from the first part of the information synchronized by the BIOS, whether the PCIE device is a GPU;
if so, the BMC maps the first part of the information to the hardware position of the corresponding GPU, so as to locate that GPU and read its second part of the information;
the first part and the second part of the information are integrated and displayed.
According to an embodiment of the present invention, mapping the first part of the information to the hardware position of the corresponding GPU includes: the BMC matches the first part of the information of the PCIE device identified by the BIOS against the correspondence between PCIE and hardware position, so as to obtain the hardware position of the PCIE device.
According to an embodiment of the present invention, reading the second part of the information includes: reading the second part of the information inside the corresponding GPU in OOB mode through the I2C channel.
According to an embodiment of the present invention, the method for reading the GPU by the server further includes: updating the firmware version information and the serial number of the GPU when the server is started each time; and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
According to an embodiment of the present invention, the first partial information includes: at least one of vendor information, type, model, link rate, link width; the second part of information includes: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
According to another aspect of the present invention, there is provided an apparatus for a server to read a GPU, including:
a BIOS module, configured to identify the first part of the information of the PCIE device and synchronize it to the BMC;
a BMC, configured to determine, from the first part of the information synchronized by the BIOS, whether the PCIE device is a GPU,
wherein, if so, the BMC maps the first part of the information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU and reads its second part of the information;
and a display module, configured to display the first part and the second part of the information after integration by the BMC.
According to the embodiment of the invention, the BMC performs matching according to the corresponding relation between the PCIE and the hardware position and the first part of information of the PCIE device identified by the BIOS so as to obtain the hardware position of the PCIE device.
According to the embodiment of the invention, the BMC reads the second part of information inside the corresponding GPU in an OOB mode through the I2C channel.
According to an embodiment of the invention, the BMC is also for: updating the firmware version information and the serial number of the GPU when the server is started each time; and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
According to an embodiment of the present invention, the first partial information includes: at least one of vendor information, type, model, link rate, link width; the second part of information includes: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
According to this technical scheme, the BMC presents the complete GPU information: the BIOS obtains part of the GPU information, and the BMC integrates it, so that the actual hardware position of the GPU (such as the hardware silk-screen position) is mapped more intuitively to the GPU information and displayed to the customer. Therefore, when operation and maintenance personnel or users want to check the model of the GPU at a given hardware silk-screen position, they can view it directly on the web page of the BMC, which greatly improves the completeness of the displayed GPU information and reduces the technical cost and time cost of checking GPU information through multiple operations.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for a server to read a GPU according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for a server to read a GPU in accordance with a specific embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.
FIG. 1 is a flowchart of a method for a server to read a GPU according to an embodiment of the present invention. As shown in FIG. 1, the method for a server to read a GPU may include the following steps:
S11, the BIOS synchronizes the first part of the information of the identified PCIE device to the BMC. In one embodiment, the first part of the information includes at least one of vendor information, type, model, link rate and link width.
S12, the BMC determines, from the first part of the information synchronized by the BIOS, whether the PCIE device is a GPU.
S13, if so, the BMC maps the first part of the information to the hardware position of the corresponding GPU, so as to locate that GPU and read its second part of the information.
In one embodiment, the second part of the information includes at least one of firmware version information, production time, serial number, power consumption and maximum operating temperature. In one embodiment, the BMC locates the corresponding GPU through an I2C channel. In one embodiment, the BMC reads the second part of the information inside the corresponding GPU in OOB (Out of Band) mode.
S14, the first part and the second part of the information are integrated and displayed, for example on a web page.
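Steps S11 to S14 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the names `FirstPartInfo`, `is_gpu` and `read_gpus` are hypothetical, the out-of-band read is injected as a callable, and treating PCI base class 0x03 (display controller) as the GPU test is an assumption, since the text does not state which class codes are used.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Tuple

Bdf = Tuple[int, int, int]  # (bus, device, function)


@dataclass(frozen=True)
class FirstPartInfo:
    """Hypothetical container for the fields the BIOS syncs to the BMC (S11)."""
    bdf: Bdf
    base_class: int
    sub_class: int
    vendor: str
    model: str
    link_rate: str
    link_width: str


def is_gpu(info: FirstPartInfo) -> bool:
    # Assumption: GPUs report PCI base class 0x03 (display controller).
    return info.base_class == 0x03


def read_gpus(
    synced: List[FirstPartInfo],
    position_of: Dict[Bdf, str],
    read_second_part: Callable[[str], Dict[str, str]],
) -> List[Dict[str, object]]:
    """Filter GPUs, map them to positions, read extra data and merge it."""
    records = []
    for info in synced:
        if not is_gpu(info):                    # S12: skip non-GPU devices
            continue
        position = position_of.get(info.bdf)    # S13: map to hardware position
        if position is None:
            continue                            # unknown slot, nothing to show
        second = read_second_part(position)     # S13: out-of-band read
        # S14: integrate both parts into one record keyed by position
        records.append({"position": position, **asdict(info), **second})
    return sorted(records, key=lambda r: str(r["position"]))
```

Passing the out-of-band reader in as a parameter keeps the flow testable without hardware; a BMC implementation would supply a function that actually talks to the GPU.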
According to this technical scheme, the BMC presents the complete GPU information: the BIOS obtains part of the GPU information, and the BMC integrates it, so that the actual hardware position of the GPU (such as the hardware silk-screen position) is mapped more intuitively to the GPU information and displayed to the customer. Therefore, when operation and maintenance personnel or users want to check the model of the GPU at a given hardware silk-screen position, they can view it directly on the web page of the BMC, which greatly improves the completeness of the displayed GPU information and reduces the technical cost and time cost of checking GPU information through multiple operations.
In some embodiments, the firmware version information and the serial number of the GPU may be updated each time the server is powered on; and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
FIG. 2 is a flowchart of a method for a server to read a GPU according to a specific embodiment of the present invention. As shown in FIG. 2, in this embodiment the BMC presents the complete GPU information. After power-on, the BMC uses SMBUS to identify whether a GPU device is inserted in each hardware slot and determines the position of the GPU on the hardware. After the BIOS completes device initialization, it captures the vendor information, type, model, link rate and link width of all PCIE devices and sends them to the BMC in a predefined IPMI SUGON OEM command format. The BMC builds a correspondence table between PCIE and the hardware silk screen in advance according to the hardware design, matches the BUS/DEVICE/FUNCTION information of each PCIE device identified by the BIOS against this table one by one to obtain the position of every PCIE device on the hardware, and determines which PCIE devices are GPU devices according to BaseClass and SubClass. The SMBUS identification result and the PCIE identification result are then combined to determine the hardware link information of each GPU. In the cyclic monitoring stage, the BMC opens the hardware I2C bus channels in turn according to the hardware link information and reads more complete GPU information in OOB mode, such as firmware version information, production time, serial number, power consumption and maximum operating temperature. This information is combined, integrated, rendered in Chinese and English, and displayed on a web interface.
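The correspondence table and the combination of the two identification results might look like the sketch below. The table contents, the `SlotEntry` type and the `hardware_link` function are hypothetical, and the rule used to combine the results (a link is reported only when the PCIE lookup and the SMBUS presence check agree) is an assumption, because the text does not say how the two results are merged. Base class 0x03 is the PCI code for display controllers.

```python
from typing import Dict, NamedTuple, Optional, Set, Tuple

Bdf = Tuple[int, int, int]  # (bus, device, function)


class SlotEntry(NamedTuple):
    silkscreen: str   # label printed on the board
    i2c_channel: int  # I2C channel that reaches this slot


# Hypothetical table; a real one is derived from the board's hardware design.
PCIE_TO_SLOT: Dict[Bdf, SlotEntry] = {
    (0x3B, 0x00, 0x0): SlotEntry("GPU0", 0),
    (0x5E, 0x00, 0x0): SlotEntry("GPU1", 1),
    (0x86, 0x00, 0x0): SlotEntry("GPU2", 2),
}

DISPLAY_BASE_CLASS = 0x03  # PCI base class for display controllers


def hardware_link(
    bdf: Bdf, base_class: int, smbus_present: Set[str]
) -> Optional[SlotEntry]:
    """Return the slot entry only if PCIE and SMBUS identification agree."""
    entry = PCIE_TO_SLOT.get(bdf)
    if entry is None or base_class != DISPLAY_BASE_CLASS:
        return None  # unknown BDF, or the device is not a GPU
    if entry.silkscreen not in smbus_present:
        return None  # SMBUS did not detect a card in that slot
    return entry
```

For example, `hardware_link((0x3B, 0, 0), 0x03, {"GPU0"})` yields the `GPU0` entry with its I2C channel, which is the hardware link information the monitoring loop needs.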
In addition, the firmware version information and the serial number of the GPU are updated once at every startup. GPU devices do not support hot plugging, so the server must be powered off to replace a GPU device; after replacement, the GPU is identified again at the first power-on. Because the identification process is repeated, the GPU display is not affected even if a GPU of a different model is installed or the GPU is inserted in a different position, and the GPU information remains correct and complete.
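One simple way to obtain this behaviour is to rebuild the GPU inventory from scratch at every power-on instead of patching the previous one, so that a replaced or moved card cannot leave a stale entry behind. The `GpuInventory` class below is a hypothetical sketch of that idea, not a description of the actual BMC firmware.

```python
from typing import Dict


class GpuInventory:
    """Hypothetical per-position GPU inventory kept by the BMC."""

    def __init__(self) -> None:
        self._by_position: Dict[str, Dict[str, str]] = {}

    def on_power_on(self, detected: Dict[str, Dict[str, str]]) -> None:
        # Discard the old view entirely and copy in what was detected now,
        # so firmware version, serial number and position are always fresh.
        self._by_position = {pos: dict(info) for pos, info in detected.items()}

    def snapshot(self) -> Dict[str, Dict[str, str]]:
        # Return a copy so callers cannot mutate the inventory.
        return {pos: dict(info) for pos, info in self._by_position.items()}
```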
More specifically, referring to FIG. 2, after the BMC recognizes the first power-on, it begins to identify the position of each GPU's PCIE interface on the hardware, and records the BUS/DEVICE/FUNCTION of the PCIE interface in one-to-one correspondence with the hardware silk-screen information. Each time the BIOS identifies PCIE device information during startup, the BMC and the BIOS exchange data through the IPMI OEM command, and the acquired information is synchronized to the BMC. From the information identified by the BIOS, the BMC determines through BaseClass and SubClass whether a device is a GPU device, maps it to the actual PCIE interface position on the hardware through BUS/DEVICE/FUNCTION, switches the I2C channel to reach each GPU device according to the previously recorded position information, reads the information inside the GPU in OOB mode, and displays it on the web page.
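Because the GPUs share a fixed address, the BMC has to open one I2C channel at a time before reading. The sketch below simulates one pass of that loop with a stand-in switch; `FakeI2cMux` and `poll_once` are hypothetical names, and no real I2C library or GPU register layout is implied.

```python
from typing import Dict, Optional


class FakeI2cMux:
    """Stand-in for an I2C switch: only one downstream channel is open at a time."""

    def __init__(self, devices: Dict[int, Dict[str, str]]) -> None:
        self._devices = devices
        self._open: Optional[int] = None

    def select(self, channel: int) -> None:
        # Opening a channel implicitly closes the previously open one.
        self._open = channel

    def read(self) -> Dict[str, str]:
        if self._open is None:
            raise RuntimeError("no channel selected")
        return dict(self._devices[self._open])


def poll_once(mux: FakeI2cMux, channels: Dict[str, int]) -> Dict[str, Dict[str, str]]:
    """Visit every known GPU position once, switching the channel before each read."""
    results: Dict[str, Dict[str, str]] = {}
    for position, channel in sorted(channels.items()):
        mux.select(channel)
        results[position] = mux.read()
    return results
```

Calling `poll_once` repeatedly on a timer would give the cyclic monitoring described in the embodiment.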
With continued reference to FIG. 2, each time the server is started or restarted, the BIOS identifies the BUS/DEVICE/FUNCTION information of the PCIE link on which each PCIE device is located, and obtains the base class and sub-class of the PCIE device according to the PCIE standard protocol in order to determine the type of the device. The PCIE identification information is transmitted to the BMC using the IPMI SUGON OEM CMD. The BMC builds the overall correspondence between PCIE and the hardware silk screen in advance according to the hardware design of the server, and obtains the position of the PCIE interface by looping over this correspondence table with the BusNum, DevNum and FunNum sent by the BIOS. Whether the PCIE device is a GPU device is determined according to BaseClass and SubClass. When a GPU device is identified, the I2C link information of the hardware on which the GPU is located is determined from the hardware silk-screen position, the BMC interacts with the GPU in OOB mode, and the detailed information of the GPU is obtained. All of the information is then integrated and displayed on the web page, where the detailed GPU information can be viewed.
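The text names the fields carried by the OEM command (BusNum, DevNum, FunNum, BaseClass, SubClass) but does not disclose its byte layout. Purely for illustration, the sketch below assumes a hypothetical layout of five unsigned bytes per device, in that order, and decodes a payload into records; the real IPMI SUGON OEM format may differ in every respect.

```python
import struct
from typing import List, NamedTuple


class PcieRecord(NamedTuple):
    bus_num: int
    dev_num: int
    fun_num: int
    base_class: int
    sub_class: int


# Hypothetical layout: five unsigned bytes per device, no header or checksum.
_RECORD = struct.Struct("5B")


def decode_records(payload: bytes) -> List[PcieRecord]:
    """Split an OEM payload into fixed-size per-device records."""
    if len(payload) % _RECORD.size != 0:
        raise ValueError("payload length is not a multiple of the record size")
    return [PcieRecord(*fields) for fields in _RECORD.iter_unpack(payload)]
```

Each decoded record then provides the key for the correspondence-table lookup and the class fields for the GPU check.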
In summary, according to the method provided by the invention, the BMC obtains the silk-screen position of the GPU on the hardware through I2C, the BIOS obtains part of the GPU information, and the BMC reprocesses and completes the GPU information currently obtainable by the BIOS, so that the actual position of the GPU on the hardware is mapped more intuitively to the GPU information and presented to the customer.
When operation and maintenance personnel or users want to check the model of the GPU at a given hardware silk-screen position, they can view it directly on the web page of the BMC, which greatly improves the completeness of the displayed GPU information and reduces the technical cost and time cost of checking GPU information through multiple operations. Meanwhile, the real-time temperature and power consumption of the GPU can be monitored and the fans controlled in time for heat dissipation, thereby improving the durability of the GPU in use.
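The text does not describe a fan policy, so the following is only one possible illustration of using the monitored temperature together with the maximum operating temperature read from the GPU. The function name, the 40 °C ramp start and the 30% minimum duty are all hypothetical values chosen for the example.

```python
def fan_duty_percent(
    temp_c: float,
    max_temp_c: float,
    ramp_start_c: float = 40.0,
    min_duty: int = 30,
) -> int:
    """Ramp fan duty linearly from min_duty at ramp_start_c to 100% at max_temp_c."""
    if max_temp_c <= ramp_start_c:
        raise ValueError("max_temp_c must be above ramp_start_c")
    if temp_c <= ramp_start_c:
        return min_duty          # cool enough: run at the floor
    if temp_c >= max_temp_c:
        return 100               # at or above the limit: full speed
    fraction = (temp_c - ramp_start_c) / (max_temp_c - ramp_start_c)
    return round(min_duty + fraction * (100 - min_duty))
```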
According to an embodiment of the present invention, there is also provided an apparatus for reading a GPU by a server, including:
a BMC, configured to identify and record the PCIE interface positions of the GPUs;
a BIOS module, configured to identify the first part of the information of the PCIE device and synchronize it to the BMC,
wherein the BMC determines, from the first part of the information synchronized by the BIOS, whether the PCIE device is a GPU,
and, if so, the BMC maps the first part of the information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU and reads its second part of the information;
and a display module, configured to display the first part and the second part of the information after integration by the BMC.
According to an embodiment of the invention, the BMC locates to the corresponding GPU through the I2C channel.
According to the embodiment of the invention, the BMC reads the second part of information inside the corresponding GPU in an OOB mode.
According to an embodiment of the invention, the BMC is also for: updating the firmware version information and the serial number of the GPU when the server is started each time; and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
According to an embodiment of the present invention, the first partial information includes: at least one of a link speed and a link bandwidth;
the second part of information includes: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (8)
1. A method for a server to read a GPU, comprising:
the BMC identifies whether the GPU equipment is inserted into the hardware slot position or not in an SMBUS mode, and determines the position of the GPU on hardware;
the BIOS synchronizes the first part of information of the identified PCIE equipment to the BMC;
the BMC judges whether the PCIE device is a GPU or not through the first partial information synchronized by the BIOS,
when the judgment result is yes, the BMC corresponds the first part information to the hardware position of the corresponding GPU so as to locate the corresponding GPU and read the second part information of the corresponding GPU, wherein the BMC matches the first part information of the PCIE equipment identified by the BIOS according to the corresponding relation between the PCIE and the hardware position, determines the hardware link information of the GPU by combining the SMBUS identification result and the PCIE identification result, circularly monitors, and circularly opens the I2C BUS channel of the hardware according to the hardware link information;
and integrating and displaying the first part of information and the second part of information.
2. The method of the server to read the GPU according to claim 1, wherein reading the second partial information comprises:
and reading the second part of information inside the corresponding GPU in an OOB mode through an I2C channel.
3. The method of the server to read the GPU according to claim 1, further comprising:
updating the firmware version information and the serial number of the GPU when the server is started each time;
and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
4. The method for a server to read a GPU according to any one of claims 1-3, wherein:
the first part of information comprises: at least one of vendor information, type, model, link rate, link width;
the second partial information includes: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
5. An apparatus for a server to read a GPU, comprising:
the SMBUS module is used for identifying whether the GPU equipment is inserted into the hardware slot position or not and determining the position of the GPU on the hardware;
a BIOS module for identifying the first part of information of the PCIE device and synchronizing the first part of information to the BMC,
a BMC for judging whether the PCIE device is a GPU or not through the first partial information synchronized by the BIOS,
when the judging result is yes, the BMC corresponds the first part information to the PCIE interface position of the corresponding GPU, the BMC locates to the corresponding GPU and reads the second part information of the corresponding GPU, wherein the BMC matches the first part information of the PCIE equipment identified by the BIOS according to the corresponding relation between the PCIE and the hardware position, the BMC combines the SMBUS identification result and the PCIE identification result to determine hardware link information of the GPU, and circularly monitors the hardware link information, and circularly opens an I2C BUS channel of hardware according to the hardware link information;
and the display module is used for displaying the first part of information and the second part of information after the BMC is integrated.
6. The device for reading GPUs by the server according to claim 5, wherein the BMC reads the second portion of information inside the corresponding GPU in an OOB manner through an I2C channel.
7. The apparatus of the server to read the GPU of claim 5, wherein the BMC is further configured to:
updating the firmware version information and the serial number of the GPU when the server is started each time;
and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
8. The apparatus for a server to read a GPU according to any one of claims 5-7, wherein:
the first part of information comprises: at least one of vendor information, type, model, link rate, link width;
the second partial information includes: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911333280.7A CN111190848B (en) | 2019-12-23 | 2019-12-23 | Method and device for server to read GPU |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911333280.7A CN111190848B (en) | 2019-12-23 | 2019-12-23 | Method and device for server to read GPU |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111190848A (en) | 2020-05-22 |
CN111190848B (en) | 2023-09-15 |
Family
ID=70705863
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911333280.7A Active CN111190848B (en) | 2019-12-23 | 2019-12-23 | Method and device for server to read GPU |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111190848B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114968862B (en) * | 2022-08-01 | 2022-11-11 | 摩尔线程智能科技(北京)有限责任公司 | Graphics processor management method, apparatus and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104090631A (en) * | 2013-04-01 | 2014-10-08 | 鸿富锦精密工业(深圳)有限公司 | PCI (Peripheral Component Interconnect) device and electronic device with PCI interface |
CN108268361A (en) * | 2018-01-23 | 2018-07-10 | 郑州云海信息技术有限公司 | A kind of method, system, device and the storage medium of BMC monitoring GPU |
CN108776595A (en) * | 2018-06-11 | 2018-11-09 | 郑州云海信息技术有限公司 | A kind of recognition methods, device, equipment and the medium of the video card of GPU servers |
CN109828798A (en) * | 2019-01-31 | 2019-05-31 | 郑州云海信息技术有限公司 | A method of PCIE silk-screen information is sent to BMC |
Also Published As
Publication number | Publication date |
---|---|
CN111190848A (en) | 2020-05-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6895532B2 (en) | Wireless server diagnostic system and method | |
US7013385B2 (en) | Remotely controlled boot settings in a server blade environment | |
US20030105904A1 (en) | Monitoring insertion/removal of server blades in a data processing system | |
CN100472460C (en) | Detection and display method and device for computer self-test information | |
US20070076006A1 (en) | Detection of displays for information handling system | |
US20100306357A1 (en) | Server, computer system, and method for monitoring computer system | |
EP2472402A1 (en) | Remote management systems and methods for mapping operating system and management controller located in a server | |
CN109828798A (en) | A method of PCIE silk-screen information is sent to BMC | |
US20080059626A1 (en) | Method for display of blade video location and status information | |
CN111190848B (en) | Method and device for server to read GPU | |
CN112269584A (en) | PCIe Switch firmware updating method, device, electronic equipment and medium | |
CN112069766A (en) | Method and device for reducing cables of hard disk backboard in server | |
CN111382027A (en) | BMC IP obtaining method and device and cabinet type server | |
US11308002B2 (en) | Systems and methods for detecting expected user intervention across multiple blades during a keyboard, video, and mouse (KVM) session | |
CN106528226B (en) | Installation method and device of operating system | |
CN113190395B (en) | State monitoring method and device | |
US7114067B2 (en) | Method of efficiently detecting whether a device is connected to an information processing system by detecting short circuits to predetermined signal lines of an IDE interface | |
US20110292591A1 (en) | Expanding Functionality Of One Or More Hard Drive Bays In A Computing System | |
CN116627729A (en) | External connection cable, external connection cable in-place detection device, startup self-checking method and system | |
CN116303200A (en) | PCIE equipment positioning management method, system, terminal and storage medium | |
CN113849267A (en) | Virtual display method, system, terminal and storage medium for display card | |
CN113626278B (en) | Hardware topology generation method and related equipment thereof | |
EP1750196A2 (en) | Computer system and interface card module thereof | |
CN114253573A (en) | PCIe device firmware batch upgrading method, system, terminal and storage medium | |
US7009380B2 (en) | Interface device for product testing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||