CN111190848A - Method and device for reading GPU (graphics processing Unit) by server - Google Patents
Method and device for reading GPU (graphics processing unit) by server
- Publication number
- CN111190848A
- Authority
- CN
- China
- Prior art keywords
- gpu
- information
- bmc
- server
- pcie
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4022—Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2213/00—Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F2213/0026—PCI express
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
Abstract
The invention discloses a method and a device for a server to read a GPU (graphics processing unit). The method comprises the following steps: the BIOS synchronizes a first part of the information of each identified PCIE device to the BMC; the BMC uses the first part of information synchronized by the BIOS to judge whether the PCIE device is a GPU and, if so, maps the first part of information to the hardware position of the corresponding GPU, so as to locate that GPU and read a second part of information from it; and the first part of information and the second part of information are integrated and displayed. With this technical scheme, the complete GPU information can be displayed intuitively against the corresponding silk-screen position.
Description
Technical Field
The invention relates to the technical field of servers, and in particular to a method and a device for a server to read a GPU (graphics processing unit).
Background
Currently, the graphics processor information that can be acquired through SMBUS access is limited: only the temperature, device identification code, vendor identification code, sub-device identification code and sub-vendor identification code of the graphics processor can be obtained. Through the PCIE protocol, the BIOS (Basic Input/Output System) can obtain only the device identification code, vendor identification code, sub-device identification code, sub-vendor identification code, type, model, link rate and link width of the graphics processor.
In the prior art, the graphics processor driver is installed under the operating system, and detailed information about the graphics processor can then be obtained in the SMBUS in-band mode, but the information is only returned in bulk on the command line.
In the prior art, the addresses of GPUs are fixed. When a hardware design uses multiple GPUs simultaneously, the GPU devices must be placed on different I2C channels, or be distinguished through an expander chip on the same I2C channel. The position of each GPU on the hardware is therefore fixed and is marked by a corresponding silk-screen label, so the BMC can obtain the I2C channel position of each GPU through SMBUS access and match it to the silk-screen label. However, not all GPU information can be obtained in this way.
The BIOS identifies GPUs in the order of the PCIE interfaces, so there is no complete correspondence with hardware positions, and the correspondence itself changes unpredictably with different PCIE cable link arrangements.
The SMBUS in-band mode is inconvenient: information is returned only as a large amount of command-line output, the desired item cannot be located quickly, and the real-time state of a given GPU cannot be monitored and displayed intuitively. Moreover, when the GPU slots are not fully populated, the order in which GPUs are displayed in the system can differ from the silk-screen order.
Disclosure of Invention
In view of the problems in the related art, the invention provides a method and a device for a server to read a GPU, which can intuitively display the complete GPU information at the corresponding silk-screen position.
The technical scheme of the invention is realized as follows:
according to one aspect of the present invention, there is provided a method for a server to read a GPU, comprising:
the BIOS synchronizes a first part of information of the identified PCIE device to the BMC;
the BMC judges, from the first part of information synchronized by the BIOS, whether the PCIE device is a GPU;
when the judgment result is yes, the BMC maps the first part of information to the hardware position of the corresponding GPU, so as to locate the corresponding GPU and read a second part of information from it;
and the first part of information and the second part of information are integrated and displayed.
According to an embodiment of the present invention, the BMC mapping the first part of information to the hardware position of the corresponding GPU so as to locate the corresponding GPU includes: the BMC matches the first part of information of the PCIE device identified by the BIOS against the correspondence between PCIE and hardware positions, so as to obtain the hardware position of the PCIE device.
According to an embodiment of the present invention, reading the second part of information includes: reading the second part of information inside the corresponding GPU out of band (OOB) through an I2C channel.
According to an embodiment of the present invention, the method for the server to read the GPU further comprises: updating the firmware version information and serial number of the GPU each time the server is started; and re-identifying the hardware position of the GPU each time the GPU is powered on.
According to an embodiment of the present invention, the first part of information includes: at least one of vendor information, type, model, link rate, link width; the second part of information comprises: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
According to another aspect of the present invention, there is provided an apparatus for a server to read a GPU, including:
a BIOS module for identifying a first part of information of the PCIE device and synchronizing the first part of information to the BMC;
the BMC, which judges, from the first part of information synchronized by the BIOS, whether the PCIE device is a GPU,
and, when the judgment result is yes, maps the first part of information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU, and reads a second part of information from it;
and a display module for displaying the first part of information and the second part of information after integration by the BMC.
According to an embodiment of the present invention, the BMC matches the first part of information of the PCIE device identified by the BIOS against the correspondence between PCIE and hardware positions, so as to obtain the hardware position of the PCIE device.
According to an embodiment of the present invention, the BMC reads the second part of information inside the corresponding GPU out of band (OOB) through the I2C channel.
According to an embodiment of the present invention, the BMC is further configured to: update the firmware version information and serial number of the GPU each time the server is started; and re-identify the hardware position of the GPU each time the GPU is powered on.
According to an embodiment of the present invention, the first part of information includes: at least one of vendor information, type, model, link rate, link width; the second part of information comprises: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
In this technical scheme, the BMC presents the complete GPU information. The BIOS obtains part of the GPU information, and the BMC completes it, maps the actual hardware position of the GPU (for example, its silk-screen position) to the GPU information, and presents the result to the client. When operation and maintenance personnel or users want to check the model of a GPU at a given silk-screen position, they can see it directly on the BMC web page. This greatly improves the completeness of the displayed GPU information and reduces the technical and time cost of checking GPU information through multiple operations.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for a server to read a GPU according to an embodiment of the invention;
FIG. 2 is a flowchart of a method for a server to read a GPU according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
Fig. 1 is a flowchart of a method for a server to read a GPU according to an embodiment of the present invention. As shown in fig. 1, the method for the server to read the GPU of the present invention may include the following steps:
S11: The BIOS synchronizes the first part of information of the identified PCIE device to the BMC. In one embodiment, the first part of information comprises: vendor information, type, model, link rate, and link width.
S12: The BMC determines whether the PCIE device is a GPU according to the first part of information synchronized by the BIOS.
S13: When the judgment result is yes, the BMC maps the first part of information to the hardware position of the corresponding GPU, so as to locate the corresponding GPU and read the second part of information from it.
In one embodiment, the second part of information comprises: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature. In one embodiment, the BMC locates the corresponding GPU via the I2C channel. In one embodiment, the BMC reads the second part of information inside the corresponding GPU out of band (OOB).
S14: The first part of information and the second part of information are integrated and displayed, for example, on the web side.
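Steps S11 to S14 can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions, not the patented implementation: the record fields (`bdf`, `type`, `vendor`), the `slot_of` and `read_second_part` callbacks, and all sample values are hypothetical names introduced here.

```python
from typing import Callable, Dict, List

# Hypothetical shape of the "first part" record that the BIOS syncs to the BMC (S11).
FirstPart = Dict[str, object]


def collect_gpu_info(
    bios_records: List[FirstPart],
    slot_of: Callable[[FirstPart], str],
    read_second_part: Callable[[str], Dict[str, object]],
) -> List[Dict[str, object]]:
    """Run S12 to S14 on records already synchronized by the BIOS (S11)."""
    rows = []
    for rec in bios_records:
        # S12: keep only devices that the first-part information marks as GPUs.
        if rec.get("type") != "GPU":
            continue
        # S13: map the record to a hardware position, then read the second part.
        slot = slot_of(rec)
        second = read_second_part(slot)
        # S14: integrate both parts into one row for display.
        rows.append({"slot": slot, **rec, **second})
    return rows


if __name__ == "__main__":
    records = [
        {"bdf": "3b:00.0", "type": "GPU", "vendor": "VendorA", "link_width": 16},
        {"bdf": "5e:00.0", "type": "NIC", "vendor": "VendorB", "link_width": 8},
    ]
    table = {"3b:00.0": "GPU0"}  # hypothetical PCIE-to-silk-screen mapping
    for row in collect_gpu_info(
        records,
        slot_of=lambda r: table[r["bdf"]],
        read_second_part=lambda slot: {"serial": "SN-EXAMPLE", "firmware": "1.0"},
    ):
        print(row)
```

Passing the position lookup and the out-of-band reader in as callbacks keeps the control flow separate from board-specific details, which the patent leaves to the hardware design.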
In this technical scheme, the BMC presents the complete GPU information. The BIOS obtains part of the GPU information, and the BMC completes it, maps the actual hardware position of the GPU (for example, its silk-screen position) to the GPU information, and presents the result to the client. When operation and maintenance personnel or users want to check the model of a GPU at a given silk-screen position, they can see it directly on the BMC web page. This greatly improves the completeness of the displayed GPU information and reduces the technical and time cost of checking GPU information through multiple operations.
In some embodiments, the firmware version information and serial number of the GPU may be updated each time the server is powered on, and the hardware position of the GPU may be re-identified each time the GPU is powered on.
Fig. 2 is a flowchart of a method for a server to read a GPU according to an embodiment of the present invention. As shown in Fig. 2, in this embodiment the BMC presents the complete GPU information. After power-on, the BMC uses SMBUS to identify whether a GPU device is inserted in each hardware slot and determines the position of each GPU on the hardware. After the BIOS completes device initialization, it captures the vendor information, type, model, link rate and link width of all PCIE devices and sends this information to the BMC in a predefined IPMI SUGON OEM command format. The BMC establishes, in advance and according to the hardware design, a table mapping PCIE to the hardware silk screen. It matches the BUS/DEVICE/FUNCTION information of each PCIE device identified by the BIOS against this table one by one to obtain the hardware position of every PCIE device, and it determines which PCIE devices are GPUs according to BaseClass and SubClass. The SMBUS identification result and the PCIE identification result are then combined to determine the hardware link information of each GPU. In a monitoring loop, the BMC opens the hardware I2C bus channels in turn according to the hardware link information and reads more complete GPU information out of band (OOB), such as firmware version information, production time, serial number, power consumption and maximum working temperature. All of the information is then integrated, rendered in Chinese and English, and displayed on the web interface.
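The table matching and the BaseClass/SubClass check described above might look like the following sketch. The table contents and slot names are hypothetical and would come from the board design. The class codes are the standard PCI ones (base class 0x03 is a display controller, with sub-class 0x00 for VGA-compatible and 0x02 for 3D controllers), but treating exactly these two pairs as "GPU" is an assumption made here, not something the patent specifies.

```python
from typing import Dict, Optional, Tuple

BDF = Tuple[int, int, int]  # (bus, device, function)

# Hypothetical board-specific table; real values come from the hardware design.
SILKSCREEN_BY_BDF: Dict[BDF, str] = {
    (0x3B, 0x00, 0x0): "GPU0",
    (0x5E, 0x00, 0x0): "GPU1",
    (0x86, 0x00, 0x0): "GPU2",
}

# Assumption: only display-controller class codes for VGA and 3D count as GPUs.
GPU_CLASS_CODES = {(0x03, 0x00), (0x03, 0x02)}


def is_gpu(base_class: int, sub_class: int) -> bool:
    """Classify a PCIE device from its BaseClass and SubClass."""
    return (base_class, sub_class) in GPU_CLASS_CODES


def locate(bdf: BDF) -> Optional[str]:
    """Return the silk-screen label for a BDF, or None if the table has no entry."""
    return SILKSCREEN_BY_BDF.get(bdf)


if __name__ == "__main__":
    device = {"bdf": (0x5E, 0x00, 0x0), "base_class": 0x03, "sub_class": 0x02}
    if is_gpu(device["base_class"], device["sub_class"]):
        print(locate(device["bdf"]))
```

A dictionary keyed by the BDF tuple replaces the one-by-one comparison with a direct lookup; the result is the same as looping over the table.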
In addition, the firmware version information and serial number of the GPU are updated once at each boot. GPU devices do not support hot plug, so the server must be powered off to replace one, and the GPU is re-identified at the first power-on after replacement. Because the identification process runs again, the display is unaffected even if a GPU of a different model is installed or a GPU is moved to a different slot, and the accuracy and completeness of the GPU information are guaranteed.
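One way to read this paragraph is that the BMC rebuilds its GPU inventory from scratch at every boot instead of patching the previous one. The sketch below illustrates that idea only; the function name, the tuple layout, and the sample serial numbers are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical detection result: (silk-screen slot, serial number, firmware version)
Detected = Tuple[str, str, str]


def rebuild_inventory(detected: Iterable[Detected]) -> Dict[str, Dict[str, str]]:
    """Return a fresh slot -> info map; nothing is carried over from earlier boots."""
    return {
        slot: {"serial": serial, "firmware": firmware}
        for slot, serial, firmware in detected
    }


if __name__ == "__main__":
    first_boot = rebuild_inventory([("GPU0", "SN-A", "1.0"), ("GPU1", "SN-B", "1.0")])
    # The server is powered off, the card in GPU1 is removed, and GPU0 is swapped.
    second_boot = rebuild_inventory([("GPU0", "SN-C", "2.1")])
    print(first_boot)
    print(second_boot)
```

Because no state survives between calls, a removed or relocated card cannot leave a stale entry behind.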
More specifically, referring to Fig. 2, after the first power-on the BMC identifies which PCIE interface on the hardware each GPU actually occupies and records the PCIE BUS/DEVICE/FUNCTION values in one-to-one correspondence with the hardware silk-screen information. Each time the server starts and the BIOS identifies the PCIE device information, an IPMI OEM command is used for data exchange between the BIOS and the BMC, and the acquired information is synchronized to the BMC. From the information identified by the BIOS, the BMC determines through BaseClass and SubClass whether a device is a GPU, maps it through BUS/DEVICE/FUNCTION to the PCIE interface position on the actual hardware, switches the I2C channel to reach each GPU device according to the previously recorded position information, reads the information inside the GPU out of band (OOB), and displays it on the web side.
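Since every GPU answers at the same fixed address, selecting the I2C channel is what decides which card a read reaches. The following sketch simulates that with an in-memory stand-in for the switch. `FakeI2CMux`, its methods, the field names, and the channel numbers are all hypothetical; no real I2C library or GPU register layout is implied.

```python
from typing import Dict, Iterable, Optional


class FakeI2CMux:
    """In-memory stand-in for a hardware I2C switch; one channel is open at a time."""

    def __init__(self, devices_by_channel: Dict[int, Dict[str, str]]) -> None:
        self._devices = devices_by_channel
        self._open: Optional[int] = None

    def select(self, channel: int) -> None:
        """Open one channel; the previously open channel is implicitly closed."""
        self._open = channel

    def read(self, field: str) -> str:
        """Read a field from whichever device sits behind the open channel."""
        if self._open is None:
            raise RuntimeError("no channel selected")
        return self._devices[self._open][field]


def poll_gpus(
    mux: FakeI2CMux, channel_by_slot: Dict[str, int], fields: Iterable[str]
) -> Dict[str, Dict[str, str]]:
    """Visit each recorded slot in turn and read the requested fields."""
    fields = list(fields)
    result = {}
    for slot, channel in channel_by_slot.items():
        mux.select(channel)  # switch the channel before touching the device
        result[slot] = {name: mux.read(name) for name in fields}
    return result


if __name__ == "__main__":
    mux = FakeI2CMux({
        0: {"serial": "SN-A", "firmware": "1.0"},
        1: {"serial": "SN-B", "firmware": "1.2"},
    })
    print(poll_gpus(mux, {"GPU0": 0, "GPU1": 1}, ["serial", "firmware"]))
```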
As shown in Fig. 2, each time the server is powered on or restarted, the BIOS identifies the BUS/DEVICE/FUNCTION information of the PCIE link on which each PCIE device sits and, following the standard PCIE protocol, obtains the BaseClass/SubClass information of the device, which determines the device type. The PCIE identification information is transmitted to the BMC using the IPMI SUGON OEM CMD. The BMC establishes, in advance and according to the hardware design of the server, a complete correspondence between PCIE and the hardware silk screen, and it obtains the position of each PCIE interface by looping over the entries of this correspondence table with the BusNum, DevNum and FunNum sent by the BIOS. Whether a PCIE device is a GPU is judged from BaseClass and SubClass. When a GPU device is identified as present, the I2C link of the hardware on which the GPU sits can be determined from its silk-screen position, the BMC interacts with the GPU out of band (OOB), and the detailed GPU information is obtained. All of the information is then integrated and displayed on the web side, where the detailed GPU information can be viewed.
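The patent does not disclose the byte layout of the IPMI SUGON OEM command, so the decoder below works on a purely hypothetical payload: a sequence of 5-byte records holding BusNum, DevNum, FunNum, BaseClass and SubClass, one byte each. It is meant only to show how the BMC side could turn such a payload into structured records.

```python
import struct
from typing import List, NamedTuple


class PcieRecord(NamedTuple):
    bus: int
    dev: int
    func: int
    base_class: int
    sub_class: int


# Hypothetical layout: five unsigned bytes per record. The real OEM format is not
# given in the source and will differ.
RECORD = struct.Struct("5B")


def decode_payload(payload: bytes) -> List[PcieRecord]:
    """Split a payload into fixed-size records and name their fields."""
    if len(payload) % RECORD.size != 0:
        raise ValueError("payload length is not a multiple of the record size")
    return [PcieRecord(*fields) for fields in RECORD.iter_unpack(payload)]


if __name__ == "__main__":
    payload = bytes([0x3B, 0x00, 0x00, 0x03, 0x02,
                     0x5E, 0x00, 0x00, 0x02, 0x00])
    for record in decode_payload(payload):
        print(record)
```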
In summary, according to the method provided by the present invention, the BMC can obtain the silk-screen position of the GPU on the hardware through I2C, the BIOS can obtain part of the GPU information, and the BMC can reprocess and complete the GPU information currently available to the BIOS, so as to map the actual position of the GPU on the hardware to the GPU information more intuitively and present it to the client.
When operation and maintenance personnel or users want to check the model of a GPU at a given silk-screen position, they can see it directly on the BMC web page, which greatly improves the completeness of the displayed GPU information and reduces the technical and time cost of checking GPU information through multiple operations. In addition, the temperature and power consumption of the GPU can be monitored in real time, so that the fans can adjust heat dissipation promptly, which improves the continuity of GPU operation.
According to an embodiment of the present invention, there is also provided an apparatus for a server to read a GPU, including:
the BMC, which identifies and records the PCIE interface position of each GPU;
a BIOS module for identifying a first part of information of the PCIE device and synchronizing the first part of information to the BMC,
wherein the BMC judges, from the first part of information synchronized by the BIOS, whether the PCIE device is a GPU,
and, when the judgment result is yes, maps the first part of information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU, and reads a second part of information from it;
and a display module for displaying the first part of information and the second part of information after integration by the BMC.
According to an embodiment of the present invention, the BMC locates the corresponding GPU through the I2C channel.
According to an embodiment of the present invention, the BMC reads the second part of information inside the corresponding GPU out of band (OOB).
According to an embodiment of the present invention, the BMC is further configured to: update the firmware version information and serial number of the GPU each time the server is started; and re-identify the hardware position of the GPU each time the GPU is powered on.
According to an embodiment of the present invention, the first part of information includes: at least one of a link speed and a link bandwidth;
the second part of information comprises: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (10)
1. A method for a server to read a GPU, comprising:
the BIOS synchronizes the first part of information of the PCIE equipment to the BMC;
the BMC determines whether the PCIE device is a GPU through the first part of information synchronized by the BIOS,
when the judgment result is yes, the BMC corresponds the first part of information to the hardware position of the corresponding GPU so as to position the corresponding GPU and read the second part of information of the corresponding GPU;
integrating and displaying the first part of information and the second part of information.
2. The method of claim 1, wherein the BMC associates the first portion of information with a hardware location of the respective GPU to locate the respective GPU comprises:
and the BMC performs matching according to the corresponding relation between the PCIE and the hardware position and the first part of information of the PCIE equipment identified by the BIOS so as to obtain the hardware position of the PCIE equipment.
3. A method for a server to read a GPU as claimed in claim 1, wherein reading the second part of information comprises:
reading the second part of information inside the corresponding GPU in an OOB mode through an I2C channel.
4. The method for reading the GPU by the server according to claim 1, further comprising:
updating the firmware version information and the serial number of the GPU when the server is started up every time;
and re-identifying the hardware position of the GPU when the GPU is powered on and started every time.
5. The method for a server to read a GPU according to any one of claims 1-4, wherein
the first part of information comprises: at least one of vendor information, type, model, link rate, link width;
the second part of information comprises: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
6. An apparatus for a server to read a GPU, comprising:
a BIOS module for identifying a first portion of information of a PCIE device and synchronizing the first portion of information to the BMC,
the BMC is used for judging whether the PCIE equipment is a GPU or not through the first part of information synchronized by the BIOS,
when the judgment result is yes, the BMC maps the first part of information to the PCIE interface position of the corresponding GPU, locates the corresponding GPU, and reads the second part of information of the corresponding GPU;
and the display module is used for displaying the first part of information and the second part of information after the BMC integration.
7. The device for reading the GPU by the server according to claim 6, wherein the BMC performs matching according to the first part of information of the PCIE device identified by the BIOS according to a correspondence between PCIE and hardware locations, so as to obtain the hardware location of the PCIE device.
8. An apparatus for reading GPU as claimed in claim 6, wherein the BMC is configured to read the second part of information inside the corresponding GPU in an OOB manner through an I2C channel.
9. The device for reading a GPU from a server of claim 6, wherein the BMC is further configured to:
updating the firmware version information and the serial number of the GPU when the server is started up every time;
and re-identifying the hardware position of the GPU when the GPU is powered on and started every time.
10. The device for a server to read a GPU according to any one of claims 6-9, wherein
the first part of information comprises: at least one of vendor information, type, model, link rate, link width;
the second part of information comprises: at least one of firmware version information, production time, serial number, power consumption, and maximum operating temperature.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911333280.7A CN111190848B (en) | 2019-12-23 | 2019-12-23 | Method and device for server to read GPU |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911333280.7A CN111190848B (en) | 2019-12-23 | 2019-12-23 | Method and device for server to read GPU |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111190848A (en) | 2020-05-22 |
CN111190848B (en) | 2023-09-15 |
Family
ID=70705863
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911333280.7A Active CN111190848B (en) | 2019-12-23 | 2019-12-23 | Method and device for server to read GPU |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111190848B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114968862A (en) * | 2022-08-01 | 2022-08-30 | 摩尔线程智能科技(北京)有限责任公司 | Graphics processor management method, apparatus, and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104090631A (en) * | 2013-04-01 | 2014-10-08 | 鸿富锦精密工业(深圳)有限公司 | PCI (Peripheral Component Interconnect) device and electronic device with PCI interface |
CN108268361A (en) * | 2018-01-23 | 2018-07-10 | 郑州云海信息技术有限公司 | A kind of method, system, device and the storage medium of BMC monitoring GPU |
CN108776595A (en) * | 2018-06-11 | 2018-11-09 | 郑州云海信息技术有限公司 | A kind of recognition methods, device, equipment and the medium of the video card of GPU servers |
CN109828798A (en) * | 2019-01-31 | 2019-05-31 | 郑州云海信息技术有限公司 | A method of PCIE silk-screen information is sent to BMC |
Also Published As
Publication number | Publication date |
---|---|
CN111190848B (en) | 2023-09-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7013385B2 (en) | Remotely controlled boot settings in a server blade environment | |
US6968414B2 (en) | Monitoring insertion/removal of server blades in a data processing system | |
US6895532B2 (en) | Wireless server diagnostic system and method | |
US20070076006A1 (en) | Detection of displays for information handling system | |
CN1908902A (en) | System of managing peripheral interfaces in ipmi architecture and method thereof | |
CN110489367B (en) | Method and system for flexibly allocating and easily managing backplane by CPLD (complex programmable logic device) | |
US20070168763A1 (en) | System and method for auxiliary channel error messaging | |
CN109828798A (en) | A method of PCIE silk-screen information is sent to BMC | |
CN112269584A (en) | PCIe Switch firmware updating method, device, electronic equipment and medium | |
US20080059626A1 (en) | Method for display of blade video location and status information | |
CN114116378A (en) | Method, system, terminal and storage medium for acquiring PCIe device temperature | |
US11308002B2 (en) | Systems and methods for detecting expected user intervention across multiple blades during a keyboard, video, and mouse (KVM) session | |
CN111382027A (en) | BMC IP obtaining method and device and cabinet type server | |
CN111190848B (en) | Method and device for server to read GPU | |
CN106528226B (en) | Installation method and device of operating system | |
CN113190395B (en) | State monitoring method and device | |
US8554974B2 (en) | Expanding functionality of one or more hard drive bays in a computing system | |
US6807629B1 (en) | Apparatus and method for accessing POST 80h codes via a computer port | |
CN116627729A (en) | External connection cable, external connection cable in-place detection device, startup self-checking method and system | |
CN113010122A (en) | Image forming apparatus monitoring apparatus, method, system, and storage medium | |
CN116303200A (en) | PCIE equipment positioning management method, system, terminal and storage medium | |
CN115098342A (en) | System log collection method, system, terminal and storage medium | |
CN102096621A (en) | Computer and notebook computer | |
CN113849267A (en) | Virtual display method, system, terminal and storage medium for display card | |
EP1469391A1 (en) | Remote controlled data processing system via a network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||