CN111190848B - Method and device for server to read GPU - Google Patents

Method and device for server to read GPU

Info

Publication number
CN111190848B
Authority
CN
China
Prior art keywords
gpu
information
bmc
hardware
pcie
Prior art date
Legal status
Active
Application number
CN201911333280.7A
Other languages
Chinese (zh)
Other versions
CN111190848A (en)
Inventor
梁晨光
黄洪
宋军
Current Assignee
Dawning Information Industry Co Ltd
Original Assignee
Dawning Information Industry Co Ltd
Priority date
Filing date
Publication date
Application filed by Dawning Information Industry Co Ltd
Priority to CN201911333280.7A
Publication of CN111190848A
Application granted
Publication of CN111190848B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4022 Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026 PCI express

Abstract

The invention discloses a method and a device for a server to read a GPU. The method comprises the following steps: the BIOS synchronizes first partial information of each identified PCIE device to the BMC; the BMC judges from the first partial information synchronized by the BIOS whether the PCIE device is a GPU and, if so, maps the first partial information to the hardware position of the corresponding GPU so as to locate that GPU and read its second partial information; the first partial information and the second partial information are then integrated and displayed. With this technical scheme, the complete information of the GPU at each silk-screen position can be displayed intuitively.

Description

Method and device for server to read GPU
Technical Field
The invention relates to the technical field of servers, in particular to a method and a device for reading a GPU by a server.
Background
In the current SMBUS access mode, only limited information about the graphics processor can be obtained: its temperature, device identification code, manufacturer identification code, sub-device identification code and sub-manufacturer identification code. Through the PCIE protocol, the BIOS (Basic Input Output System) can obtain only the device identification code, manufacturer identification code, sub-device identification code, sub-manufacturer identification code, type, model, link rate and link width of the graphics processor.
In the prior art, the driver of the graphics processor is installed in the OS, and detailed information about the graphics processor can then be obtained in the SMBUS in-band mode, but only as a large volume of command-line output.
In the prior art, the addresses of GPUs are fixed. To allow multiple GPUs to be used simultaneously, the hardware design must place the GPU devices on different I2C channels, or distinguish them through an expansion chip on the same I2C channel. The position information on the hardware is therefore fixed and must be marked by silk-screen printing, and the BMC can obtain the I2C channel position of each GPU by SMBUS access and match it to the silk-screen marking. However, not all of the GPU information can be obtained in this way.
The BIOS identifies GPUs in the order of the PCIE interfaces. Because the hardware positions have a complete correspondence, the correspondence seen by the BIOS does not become uncertain when the PCIE cable link mode changes.
The SMBUS in-band mode is not convenient enough: the information is returned only as a large volume of command-line data, the desired information cannot be located quickly, and the real-time status of a given GPU cannot be monitored and displayed intuitively. Moreover, when the GPU slots are not fully populated, the order in which GPUs are displayed in the system differs from the silk-screen order.
Disclosure of Invention
In view of the problems in the related art, the invention provides a method and a device for a server to read a GPU, which can intuitively display the complete GPU information at the corresponding silk-screen position.
The technical scheme of the invention is realized as follows:
According to an aspect of the present invention, there is provided a method for a server to read a GPU, including:
the BIOS synchronizes first partial information of the identified PCIE device to the BMC;
the BMC determines, from the first partial information synchronized by the BIOS, whether the PCIE device is a GPU;
when the determination result is yes, the BMC maps the first partial information to the hardware position of the corresponding GPU so as to locate the corresponding GPU and read second partial information of the corresponding GPU;
the first partial information and the second partial information are integrated and displayed.
According to an embodiment of the present invention, mapping the first partial information to the hardware position of the corresponding GPU includes: the BMC matches the first partial information of the PCIE device identified by the BIOS against the correspondence between PCIE and hardware position, so as to obtain the hardware position of the PCIE device.
According to an embodiment of the present invention, reading the second partial information includes: reading the second partial information inside the corresponding GPU in OOB mode through the I2C channel.
According to an embodiment of the present invention, the method further includes: updating the firmware version information and the serial number of the GPU each time the server is started; and re-identifying the hardware position of the GPU each time the GPU is powered on and started.
According to an embodiment of the present invention, the first partial information includes at least one of vendor information, type, model, link rate and link width; the second partial information includes at least one of firmware version information, production time, serial number, power consumption and maximum operating temperature.
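As an illustration of where link rate and link width can come from: on a PCIE device these two values are encoded in the 16-bit Link Status register of the PCI Express capability, with the current link speed in bits 3:0 and the negotiated link width in bits 9:4. The sketch below decodes such a value. The patent does not say how the BIOS reads or encodes these fields, so the function name and the returned dictionary are assumptions made for illustration only.

```python
# Speed codes used in the PCIe Link Status register (bits 3:0), in GT/s.
LINK_SPEED_GT_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0}


def decode_link_status(link_status: int) -> dict:
    """Split a 16-bit Link Status value into link rate and link width.

    The function name and result layout are hypothetical; only the bit
    positions come from the PCI Express specification.
    """
    speed_code = link_status & 0xF        # bits 3:0, current link speed
    width = (link_status >> 4) & 0x3F     # bits 9:4, negotiated link width
    return {
        "link_rate_gt_s": LINK_SPEED_GT_S.get(speed_code),  # None if unknown
        "link_width": f"x{width}",
    }


if __name__ == "__main__":
    # 0x0103: speed code 3, width 16
    print(decode_link_status(0x0103))
```

A register value of 0x0103 decodes to a link rate of 8.0 GT/s and a link width of x16.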
According to another aspect of the present invention, there is provided a device for a server to read a GPU, including:
a BIOS module, configured to identify first partial information of a PCIE device and synchronize the first partial information to the BMC;
a BMC, configured to judge, from the first partial information synchronized by the BIOS, whether the PCIE device is a GPU,
wherein, when the judgment result is yes, the BMC maps the first partial information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU and reads second partial information of the corresponding GPU;
and a display module, configured to display the first partial information and the second partial information integrated by the BMC.
According to an embodiment of the invention, the BMC matches the first partial information of the PCIE device identified by the BIOS against the correspondence between PCIE and hardware position, so as to obtain the hardware position of the PCIE device.
According to an embodiment of the invention, the BMC reads the second partial information inside the corresponding GPU in OOB mode through the I2C channel.
According to an embodiment of the invention, the BMC is further configured to: update the firmware version information and the serial number of the GPU each time the server is started; and re-identify the hardware position of the GPU each time the GPU is powered on and started.
According to an embodiment of the present invention, the first partial information includes at least one of vendor information, type, model, link rate and link width; the second partial information includes at least one of firmware version information, production time, serial number, power consumption and maximum operating temperature.
According to this technical scheme, the BMC presents the complete information of the GPU: the BIOS obtains part of the GPU information, the BMC integrates the partial GPU information obtained by the BIOS, and the actual hardware position of the GPU (for example, its silk-screen position) is matched to the GPU information and displayed to the user. When operation and maintenance personnel or a user want to check the model of the GPU at a given silk-screen position, they can do so directly on the web page of the BMC. This greatly improves the completeness of the displayed GPU information and reduces the technical and time cost of gathering GPU information through multiple operations.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for a server to read a GPU according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for a server to read a GPU in accordance with a specific embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.
FIG. 1 is a flowchart of a method for a server to read a GPU according to an embodiment of the present invention. As shown in FIG. 1, the method may include the following steps:
S11: the BIOS synchronizes first partial information of the identified PCIE device to the BMC. In one embodiment, the first partial information includes at least one of vendor information, type, model, link rate and link width.
S12: the BMC judges, from the first partial information synchronized by the BIOS, whether the PCIE device is a GPU.
S13: when the judgment result is yes, the BMC maps the first partial information to the hardware position of the corresponding GPU so as to locate the corresponding GPU and read second partial information of the corresponding GPU.
In one embodiment, the second partial information includes at least one of firmware version information, production time, serial number, power consumption and maximum operating temperature. In one embodiment, the BMC locates the corresponding GPU through an I2C channel. In one embodiment, the BMC reads the second partial information inside the corresponding GPU in OOB (Out Of Band) mode.
S14: the first partial information and the second partial information are integrated and displayed, for example on a web page.
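Steps S11 to S14 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout, the `slot_by_bdf` table and the `read_second_info` callback are hypothetical names, and the out-of-band read is replaced by a plain function so the example runs without hardware.

```python
from typing import Callable

DISPLAY_BASE_CLASS = 0x03  # PCI base class code for display controllers


def read_gpus(bios_records: list[dict],
              slot_by_bdf: dict[str, str],
              read_second_info: Callable[[str], dict]) -> list[dict]:
    """Merge BIOS-provided and BMC-read information for every GPU."""
    merged = []
    for record in bios_records:                        # S11: data synced by the BIOS
        if record["base_class"] != DISPLAY_BASE_CLASS:  # S12: is it a GPU?
            continue
        slot = slot_by_bdf.get(record["bdf"])           # S13: map to hardware position
        if slot is None:
            continue                                    # unknown position, skip
        second = read_second_info(slot)                 # S13: read second partial info
        merged.append({"slot": slot, **record["info"], **second})  # S14: integrate
    return merged


if __name__ == "__main__":
    records = [
        {"bdf": "3b:00.0", "base_class": 0x03, "info": {"model": "ExampleGPU"}},
        {"bdf": "18:00.0", "base_class": 0x02, "info": {"model": "ExampleNIC"}},
    ]
    table = {"3b:00.0": "GPU0"}
    for row in read_gpus(records, table, lambda slot: {"serial": "SN-" + slot}):
        print(row)
```

Keeping the out-of-band read behind a callback is a design choice of this sketch; it lets the merge logic be exercised separately from any I2C access.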
According to this technical scheme, the BMC presents the complete information of the GPU: the BIOS obtains part of the GPU information, the BMC integrates the partial GPU information obtained by the BIOS, and the actual hardware position of the GPU (for example, its silk-screen position) is matched to the GPU information and displayed to the user. When operation and maintenance personnel or a user want to check the model of the GPU at a given silk-screen position, they can do so directly on the web page of the BMC. This greatly improves the completeness of the displayed GPU information and reduces the technical and time cost of gathering GPU information through multiple operations.
In some embodiments, the firmware version information and the serial number of the GPU may be updated each time the server is powered on, and the hardware position of the GPU may be re-identified each time the GPU is powered on and started.
FIG. 2 is a flowchart of a method for a server to read a GPU in accordance with a specific embodiment of the present invention. As shown in FIG. 2, in this embodiment the BMC serves to present the complete GPU information. After power-on, the BMC identifies by SMBUS whether a GPU device is inserted into each hardware slot and determines the position of the GPU on the hardware. After the BIOS completes device initialization, it collects the vendor information, type, model, link rate and link width of all PCIE devices and sends them to the BMC in a predefined IPMI SUGON OEM command format. The BMC builds a table of correspondence between PCIE and the hardware silk-screen markings in advance according to the hardware design, matches the BUS/DEVICE/FUNCTION information of each PCIE device identified by the BIOS against this table one by one to obtain the position of every PCIE device on the hardware, and determines which PCIE devices are GPU devices according to BaseClass and subClass. The SMBUS identification result and the PCIE identification result are combined to determine the hardware link information of the GPU. The BMC then monitors cyclically: according to the hardware link information, it opens the hardware I2C BUS channels in turn and reads more complete GPU information in OOB mode, such as firmware version information, production time, serial number, power consumption and maximum working temperature. This information is combined and integrated, converted into Chinese and English, and displayed on a web interface.
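The table lookup and the combination of the two identification results can be sketched as below. All table contents, slot names and channel numbers are invented for illustration; the patent does not list them. The class check relies on the standard PCI class codes, where base class 0x03 is a display controller and sub-classes 0x00 and 0x02 are VGA-compatible and 3D controllers; which sub-classes the patented BMC accepts is an assumption here.

```python
# Hypothetical board description: (bus, device, function) -> silk-screen label.
PCIE_TO_SILKSCREEN = {
    (0x3B, 0, 0): "GPU0",
    (0x5E, 0, 0): "GPU1",
    (0x86, 0, 0): "GPU2",
}
# Hypothetical wiring: silk-screen label -> I2C channel number.
SLOT_TO_I2C_CHANNEL = {"GPU0": 0, "GPU1": 1, "GPU2": 2}

DISPLAY_BASE_CLASS = 0x03      # PCI base class: display controller
GPU_SUB_CLASSES = {0x00, 0x02}  # VGA-compatible controller, 3D controller


def gpu_link_info(pcie_devices, smbus_present_slots):
    """Return {silk-screen label: I2C channel} for confirmed GPUs.

    A slot is kept only if the PCIE data says it holds a GPU and the
    SMBUS presence scan also reported a card in that slot.
    """
    links = {}
    for dev in pcie_devices:
        slot = PCIE_TO_SILKSCREEN.get((dev["bus"], dev["dev"], dev["fn"]))
        if slot is None:
            continue  # device is not on a slot described by the table
        if dev["base_class"] != DISPLAY_BASE_CLASS:
            continue
        if dev["sub_class"] not in GPU_SUB_CLASSES:
            continue
        if slot in smbus_present_slots:
            links[slot] = SLOT_TO_I2C_CHANNEL[slot]
    return links


if __name__ == "__main__":
    devices = [
        {"bus": 0x3B, "dev": 0, "fn": 0, "base_class": 0x03, "sub_class": 0x02},
        {"bus": 0x5E, "dev": 0, "fn": 0, "base_class": 0x02, "sub_class": 0x00},
    ]
    print(gpu_link_info(devices, {"GPU0", "GPU1"}))
```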
In addition, the firmware version information and the serial number of the GPU are updated each time the GPU is started. GPU devices do not support hot plug, so the server must be powered off to replace a GPU device, and the GPU is identified again the first time the server is powered on after the replacement. Because the identification process is repeated, the display is unaffected even if a GPU of a different model is installed or the GPU is inserted in a different position, and the correctness and completeness of the GPU information are ensured.
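One way to picture this per-boot refresh is a cache that is rebuilt from every new scan, with the differences reported so that a swapped or moved card is noticed. The sketch below is only an illustration; the function name and the inventory layout (slot label mapped to a dictionary with serial number and firmware version) are assumptions, not part of the patent.

```python
def refresh_on_boot(previous: dict, scanned: dict):
    """Replace the cached inventory with this boot's scan.

    Returns the new inventory and a dict of changed slots, each mapped
    to an (old, new) pair; None marks an empty slot.
    """
    changes = {}
    for slot in sorted(set(previous) | set(scanned)):
        old, new = previous.get(slot), scanned.get(slot)
        if old != new:
            changes[slot] = (old, new)
    # The scan always wins: stale entries never survive a reboot.
    return dict(scanned), changes


if __name__ == "__main__":
    before = {"GPU0": {"serial": "A", "firmware": "1.0"}}
    after = {"GPU1": {"serial": "A", "firmware": "1.0"}}
    inventory, changes = refresh_on_boot(before, after)
    print(inventory)
    print(changes)
```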
More specifically, referring to FIG. 2, after the BMC detects the first boot, it begins to identify the positions of the GPUs' PCIE interfaces on the hardware and records the BUS/DEVICE/FUNCTION of each PCIE interface in one-to-one correspondence with the hardware silk-screen information. Each time the BIOS identifies the PCIE device information during startup, the BMC and the BIOS exchange data through an IPMI OEM command, and the acquired information is synchronized to the BMC. From the information identified by the BIOS, the BMC determines through BaseClass and subClass whether a device is a GPU device, maps it through BUS/DEVICE/FUNCTION to the actual PCIE interface position on the hardware, switches the I2C channel according to the previously recorded position information to locate each GPU device, reads the information inside the GPU in OOB mode, and displays it on the web page.
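The channel-switching loop can be sketched with a simulated multiplexer, so that the example runs without hardware. Everything here is hypothetical: the `FakeI2cMux` class, its methods and the field names stand in for real SMBUS/I2C transactions and vendor-specific GPU registers, which the patent does not detail.

```python
class FakeI2cMux:
    """In-memory stand-in for an I2C multiplexer with one GPU per channel."""

    def __init__(self, gpus_by_channel: dict):
        self._gpus = gpus_by_channel
        self.active = None  # no channel selected yet

    def select(self, channel: int) -> None:
        if channel not in self._gpus:
            raise ValueError(f"no device behind channel {channel}")
        self.active = channel

    def read(self, field: str):
        if self.active is None:
            raise RuntimeError("select a channel before reading")
        return self._gpus[self.active][field]


def poll_gpus(mux: FakeI2cMux, link_info: dict, fields: tuple) -> dict:
    """Open each GPU's channel in turn and read the requested fields."""
    result = {}
    for slot, channel in sorted(link_info.items()):
        mux.select(channel)  # only one channel is open at a time
        result[slot] = {name: mux.read(name) for name in fields}
    return result


if __name__ == "__main__":
    mux = FakeI2cMux({
        0: {"serial": "SN-0001", "firmware": "1.2", "power_w": 210},
        1: {"serial": "SN-0002", "firmware": "1.2", "power_w": 185},
    })
    print(poll_gpus(mux, {"GPU0": 0, "GPU1": 1}, ("serial", "power_w")))
```

Because GPUs share a fixed address, selecting exactly one channel before each read is what keeps the responses of different cards apart in this sketch.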
With continued reference to FIG. 2, each time the server is started or restarted, the BIOS identifies the BUS/DEVICE/FUNCTION information of the PCIE link on which each PCIE device is located, and obtains the base-class and sub-class information of the PCIE device according to the PCIE standard protocol so as to determine the type of the PCIE device. The PCIE identification information is transmitted to the BMC by the IPMI SUGON OEM CMD. The BMC builds the overall correspondence between PCIE and the hardware silk-screen markings in advance according to the hardware design of the server; by looping over this correspondence table with the BusNum, devNum and funNum sent by the BIOS, it obtains the position of the PCIE interface. Whether the PCIE device is a GPU device is judged according to BaseClass and subClass. When a GPU device is identified, the I2C link information of the hardware where the GPU is located is determined from the hardware silk-screen position, and the BMC interacts with the GPU in OOB mode to obtain its detailed information. All the information is integrated and displayed on the web page, where the detailed information of the GPU can be viewed.
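The patent does not disclose the byte layout of the OEM command, so the following sketch invents one purely to show how such a payload could carry the five values per device: a one-byte device count followed by five one-byte fields per device. The layout, the field order and both function names are hypothetical.

```python
import struct

# Hypothetical record layout: five unsigned bytes per PCIE device.
FIELDS = ("bus", "dev", "fn", "base_class", "sub_class")
RECORD = struct.Struct("<5B")


def pack_devices(devices: list) -> bytes:
    """BIOS side of the sketch: serialize the device list."""
    payload = bytes([len(devices)])
    for device in devices:
        payload += RECORD.pack(*(device[name] for name in FIELDS))
    return payload


def parse_devices(payload: bytes) -> list:
    """BMC side of the sketch: validate the length and decode each record."""
    if not payload:
        raise ValueError("empty payload")
    count, body = payload[0], payload[1:]
    if len(body) != count * RECORD.size:
        raise ValueError("payload length does not match device count")
    return [
        dict(zip(FIELDS, RECORD.unpack_from(body, index * RECORD.size)))
        for index in range(count)
    ]


if __name__ == "__main__":
    sent = [{"bus": 0x3B, "dev": 0, "fn": 0, "base_class": 0x03, "sub_class": 0x02}]
    wire = pack_devices(sent)
    print(wire.hex())
    print(parse_devices(wire))
```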
In summary, with the method provided by the invention, the BMC obtains the silk-screen position of the GPU on the hardware through I2C, the BIOS obtains part of the GPU information, and the BMC further processes and completes the GPU information currently available from the BIOS, so that the actual position of the GPU on the hardware is matched to the GPU information and presented to the user.
When operation and maintenance personnel or users want to check the model of the GPU at a given silk-screen position, they can do so directly on the web page of the BMC, which greatly improves the completeness of the displayed GPU information and reduces the technical and time cost of gathering GPU information through multiple operations. In addition, the real-time temperature and power consumption of the GPU can be monitored, and the fans can be controlled in time to dissipate heat, which improves the durability of the GPU in use.
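As a rough illustration of temperature-driven fan control, the sketch below maps the hottest GPU temperature to a fan duty cycle with a linear ramp. The patent gives no control law, so the ramp, the threshold values and both function names are assumptions chosen only to make the idea concrete.

```python
def fan_duty_percent(temp_c: float, ramp_start_c: float,
                     max_operating_c: float, min_duty: int = 30) -> int:
    """Linear ramp from min_duty at ramp_start_c to 100 at max_operating_c."""
    if temp_c <= ramp_start_c:
        return min_duty
    if temp_c >= max_operating_c:
        return 100
    fraction = (temp_c - ramp_start_c) / (max_operating_c - ramp_start_c)
    return round(min_duty + fraction * (100 - min_duty))


def chassis_fan_duty(gpu_temps_c: dict, ramp_start_c: float = 55.0,
                     max_operating_c: float = 95.0) -> int:
    """Drive the shared fans from the hottest GPU; idle duty if none present."""
    return max(
        (fan_duty_percent(t, ramp_start_c, max_operating_c)
         for t in gpu_temps_c.values()),
        default=30,
    )


if __name__ == "__main__":
    print(chassis_fan_duty({"GPU0": 50.0, "GPU1": 75.0}))
```

In this sketch the maximum operating temperature read from the GPU would set the top of the ramp, which is one plausible use of that field.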
According to an embodiment of the present invention, there is also provided a device for a server to read a GPU, including:
a BMC, configured to identify and record the PCIE interface positions of the GPUs;
a BIOS module, configured to identify first partial information of a PCIE device and synchronize the first partial information to the BMC,
wherein the BMC judges, from the first partial information synchronized by the BIOS, whether the PCIE device is a GPU,
and when the judgment result is yes, the BMC maps the first partial information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU and reads second partial information of the corresponding GPU;
and a display module, configured to display the first partial information and the second partial information integrated by the BMC.
According to an embodiment of the invention, the BMC locates the corresponding GPU through the I2C channel.
According to an embodiment of the invention, the BMC reads the second partial information inside the corresponding GPU in OOB mode.
According to an embodiment of the invention, the BMC is further configured to: update the firmware version information and the serial number of the GPU each time the server is started; and re-identify the hardware position of the GPU each time the GPU is powered on and started.
According to an embodiment of the present invention, the first partial information includes at least one of a link speed and a link bandwidth;
the second partial information includes at least one of firmware version information, production time, serial number, power consumption and maximum operating temperature.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims (8)

1. A method for a server to read a GPU, comprising:
the BMC identifies, by SMBUS, whether a GPU device is inserted into the hardware slot, and determines the position of the GPU on the hardware;
the BIOS synchronizes first partial information of the identified PCIE device to the BMC;
the BMC judges, from the first partial information synchronized by the BIOS, whether the PCIE device is a GPU,
when the judgment result is yes, the BMC maps the first partial information to the hardware position of the corresponding GPU so as to locate the corresponding GPU and read second partial information of the corresponding GPU, wherein the BMC matches the first partial information of the PCIE device identified by the BIOS against the correspondence between PCIE and hardware position, determines hardware link information of the GPU by combining the SMBUS identification result and the PCIE identification result, and monitors cyclically, opening the I2C BUS channels of the hardware in turn according to the hardware link information;
and the first partial information and the second partial information are integrated and displayed.
2. The method of the server to read the GPU according to claim 1, wherein reading the second partial information comprises:
and reading the second part of information inside the corresponding GPU in an OOB mode through an I2C channel.
3. The method of the server to read the GPU according to claim 1, further comprising:
updating the firmware version information and the serial number of the GPU when the server is started each time;
and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
4. A method for a server to read a GPU according to any one of claims 1-3,
the first part of information comprises: at least one of vendor information, type, model, link rate, link width;
the second partial information includes: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
5. An apparatus for a server to read a GPU, comprising:
an SMBUS module, configured to identify whether a GPU device is inserted into the hardware slot and to determine the position of the GPU on the hardware;
a BIOS module, configured to identify first partial information of a PCIE device and synchronize the first partial information to the BMC,
a BMC, configured to judge, from the first partial information synchronized by the BIOS, whether the PCIE device is a GPU,
when the judgment result is yes, the BMC maps the first partial information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU and reads second partial information of the corresponding GPU, wherein the BMC matches the first partial information of the PCIE device identified by the BIOS against the correspondence between PCIE and hardware position, determines hardware link information of the GPU by combining the SMBUS identification result and the PCIE identification result, and monitors cyclically, opening the I2C BUS channels of the hardware in turn according to the hardware link information;
and a display module, configured to display the first partial information and the second partial information integrated by the BMC.
6. The device for reading GPUs by the server according to claim 5, wherein the BMC reads the second portion of information inside the corresponding GPU in an OOB manner through an I2C channel.
7. The apparatus of the server to read the GPU of claim 5, wherein the BMC is further configured to:
updating the firmware version information and the serial number of the GPU when the server is started each time;
and re-identifying the hardware position of the GPU when the GPU is powered on and started each time.
8. The apparatus for reading GPU by a server according to any one of claims 5-7,
the first part of information comprises: at least one of vendor information, type, model, link rate, link width;
the second partial information includes: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
CN201911333280.7A 2019-12-23 2019-12-23 Method and device for server to read GPU Active CN111190848B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911333280.7A CN111190848B (en) 2019-12-23 2019-12-23 Method and device for server to read GPU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911333280.7A CN111190848B (en) 2019-12-23 2019-12-23 Method and device for server to read GPU

Publications (2)

Publication Number Publication Date
CN111190848A CN111190848A (en) 2020-05-22
CN111190848B CN111190848B (en) 2023-09-15

Family

ID=70705863

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911333280.7A Active CN111190848B (en) 2019-12-23 2019-12-23 Method and device for server to read GPU

Country Status (1)

Country Link
CN (1) CN111190848B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114968862B (en) * 2022-08-01 2022-11-11 摩尔线程智能科技(北京)有限责任公司 Graphics processor management method, apparatus and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104090631A (en) * 2013-04-01 2014-10-08 鸿富锦精密工业(深圳)有限公司 PCI (Peripheral Component Interconnect) device and electronic device with PCI interface
CN108268361A (en) * 2018-01-23 2018-07-10 郑州云海信息技术有限公司 A kind of method, system, device and the storage medium of BMC monitoring GPU
CN108776595A (en) * 2018-06-11 2018-11-09 郑州云海信息技术有限公司 A kind of recognition methods, device, equipment and the medium of the video card of GPU servers
CN109828798A (en) * 2019-01-31 2019-05-31 郑州云海信息技术有限公司 A method of PCIE silk-screen information is sent to BMC

Also Published As

Publication number Publication date
CN111190848A (en) 2020-05-22

Similar Documents

Publication Publication Date Title
US6895532B2 (en) Wireless server diagnostic system and method
US7013385B2 (en) Remotely controlled boot settings in a server blade environment
US20030105904A1 (en) Monitoring insertion/removal of server blades in a data processing system
CN100472460C (en) Detection and display method and device for computer self-test information
US20070076006A1 (en) Detection of displays for information handling system
US20100306357A1 (en) Server, computer system, and method for monitoring computer system
EP2472402A1 (en) Remote management systems and methods for mapping operating system and management controller located in a server
CN109828798A (en) A method of PCIE silk-screen information is sent to BMC
US20080059626A1 (en) Method for display of blade video location and status information
CN111190848B (en) Method and device for server to read GPU
CN112269584A (en) PCIe Switch firmware updating method, device, electronic equipment and medium
CN112069766A (en) Method and device for reducing cables of hard disk backboard in server
CN111382027A (en) BMC IP obtaining method and device and cabinet type server
US11308002B2 (en) Systems and methods for detecting expected user intervention across multiple blades during a keyboard, video, and mouse (KVM) session
CN106528226B (en) Installation method and device of operating system
CN113190395B (en) State monitoring method and device
US7114067B2 (en) Method of efficiently detecting whether a device is connected to an information processing system by detecting short circuits to predetermined signal lines of an IDE interface
US20110292591A1 (en) Expanding Functionality Of One Or More Hard Drive Bays In A Computing System
CN116627729A (en) External connection cable, external connection cable in-place detection device, startup self-checking method and system
CN116303200A (en) PCIE equipment positioning management method, system, terminal and storage medium
CN113849267A (en) Virtual display method, system, terminal and storage medium for display card
CN113626278B (en) Hardware topology generation method and related equipment thereof
EP1750196A2 (en) Computer system and interface card module thereof
CN114253573A (en) PCIe device firmware batch upgrading method, system, terminal and storage medium
US7009380B2 (en) Interface device for product testing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant