CN112255939A - Independent monitoring device and method for MXM display unit - Google Patents

Info

Publication number
CN112255939A
CN112255939A CN202011000861.1A CN202011000861A CN112255939A CN 112255939 A CN112255939 A CN 112255939A CN 202011000861 A CN202011000861 A CN 202011000861A CN 112255939 A CN112255939 A CN 112255939A
Authority
CN
China
Prior art keywords
display unit
mxm
mxm display
controller
monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011000861.1A
Other languages
Chinese (zh)
Inventor
邱旭伟
李志刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 52 Research Institute
Original Assignee
CETC 52 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 52 Research Institute
Priority to CN202011000861.1A
Publication of CN112255939A

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00: Programme-control systems
    • G05B19/02: Programme-control systems electric
    • G05B19/04: Programme control other than numerical control, i.e. in sequence controllers or logic controllers
    • G05B19/042: Programme control other than numerical control, i.e. in sequence controllers or logic controllers using digital processors
    • G05B19/0423: Input/output
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01D: MEASURING NOT SPECIALLY ADAPTED FOR A SPECIFIC VARIABLE; ARRANGEMENTS FOR MEASURING TWO OR MORE VARIABLES NOT COVERED IN A SINGLE OTHER SUBCLASS; TARIFF METERING APPARATUS; MEASURING OR TESTING NOT OTHERWISE PROVIDED FOR
    • G01D21/00: Measuring or testing not otherwise provided for
    • G01D21/02: Measuring two or more variables by means not covered by a single other subclass
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00: Program-control systems
    • G05B2219/20: PC systems
    • G05B2219/25: PC structure of the system
    • G05B2219/25257: Microcontroller

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The invention discloses an independent monitoring device and method for an MXM display unit, used to monitor the running state of the MXM display unit. Through the design of a peripheral monitoring circuit for the MXM display unit, the state of the MXM display unit and the working state of the peripheral sensors can be monitored independently and in real time, which solves the problem that the state of the MXM display unit cannot be monitored after the unit fails. In addition, the manufacturer information of the MXM display unit is read automatically during its initialization stage, and the monitoring is configured adaptively according to that information. This resolves the problems caused by inconsistent SMBus protocols among MXM display units from different manufacturers, achieves unified monitoring of such units, and thereby greatly improves the maintainability, reliability and universality of the MXM display unit.

Description

Independent monitoring device and method for MXM display unit
Technical Field
The application belongs to the technical field of computers, and particularly relates to an independent monitoring device and method for an MXM display unit.
Background
Mobile PCI Express Module (MXM) is a graphics module specification based on the PCI Express (PCI-E) bus and led by NVIDIA. It is mainly used on mobile platforms such as notebook computers. In products that use this specification, the display chip is not soldered directly onto the mainboard; instead the mainboard has a separate graphics card slot, similar to a desktop computer, so that the user can replace the graphics card and maintenance is more convenient.
The MXM display unit is a graphics card module based on the MXM interface and provides the display output function of a graphics card. Because it is small and compact, the MXM display unit is widely integrated in computers built on architectures such as PXI (PCI eXtensions for Instrumentation), CPCI (CompactPCI) and VPX, where it is used for desktop display and computation. As a core display component, its state monitoring and operational reliability are very important.
To monitor the state of the MXM display unit, the upper computer generally reads the state directly over the unit's own SMBus (System Management Bus). However, the existing method of monitoring the MXM display unit over the SMBus has the following main disadvantages:
1) after the MXM display unit fails, its SMBus also stops working, so monitoring fails and the state of the MXM display unit can no longer be obtained;
2) MXM display units from different manufacturers use different SMBus protocols, so after a unit is replaced with one from another manufacturer, the existing monitoring method stops working for some MXM display units.
Disclosure of Invention
The application aims to provide an independent monitoring device and method for an MXM display unit, which are not only suitable for monitoring different MXM display units, but also can continue to monitor when the MXM display unit fails.
In order to achieve the purpose, the technical scheme adopted by the application is as follows:
an independent monitoring device for an MXM display unit, used to monitor the running state of the MXM display unit, the device comprising a controller, a power supply module, a temperature and humidity sensor and a current/voltage sensor, wherein:
the power supply module is used for supplying power to the MXM display unit;
the temperature and humidity sensor is used for monitoring the temperature and humidity of the peripheral environment of the MXM display unit and feeding back the temperature and humidity to the controller;
one end of the current/voltage sensor is connected to the power supply module and the other end is connected to the controller, and the sensor is used for monitoring, in real time, the current and voltage of the link in which the MXM display unit is located and feeding them back to the controller;
the controller is connected to the MXM display unit through the MXM connector and is used for: calling, according to the basic information of the MXM display unit, the preset SMBus protocol corresponding to that basic information to establish SMBus communication with the MXM display unit; reading the operation parameters and alarm signals of the MXM display unit over the SMBus; receiving the temperature and humidity and the current and voltage fed back by the temperature and humidity sensor and the current/voltage sensor; and judging whether the operation state of the MXM display unit is a normal working state or a fault state.
Several alternatives are provided below. They do not further limit the general solution above but are merely additions or preferences; absent technical or logical contradictions, each alternative may be combined with the general solution individually, or several alternatives may be combined with one another.
Preferably, the independent monitoring device of the MXM display unit further includes a rear-plug external connector, which is connected to the MXM connector and to the controller, and which supplies power to the power supply module and the controller.
Preferably, the controller is further configured to report the read operation parameters of the MXM display unit to an upper computer through the rear-plug external connector, and report an alarm signal of the MXM display unit and the fault state to the upper computer when the operation state of the MXM display unit is judged to be the fault state;
the controller is further configured to receive, through the rear-plug external connector, a control command issued by the upper computer and to execute a legal (that is, permitted) action, the legal actions comprising: resetting the MXM display unit, powering the MXM display unit on and off, and controlling the GPU main frequency of the MXM display unit.
Preferably, the independent monitoring device of the MXM display unit further includes a cooling fan, and the cooling fan is connected to the controller, and is configured to adjust a rotation speed according to an instruction of the controller, and feed back a current rotation speed to the controller.
Preferably, the fault state includes: overcurrent in the link in which the MXM display unit is located, overvoltage in that link, over-temperature of the MXM display unit, short circuit of the MXM display unit and open circuit of the MXM display unit.
The present application further provides an independent monitoring method for an MXM display unit, configured to monitor an operating state of the MXM display unit, where the independent monitoring method for the MXM display unit is implemented in the controller, and includes:
calling, according to the basic information of the MXM display unit, the preset SMBus protocol corresponding to that basic information to establish SMBus communication with the MXM display unit, wherein the basic information comprises the manufacturer, model and serial number;
receiving the operation parameters and alarm signals uploaded by the MXM display unit, receiving the temperature and humidity of the environment surrounding the MXM display unit fed back by the temperature and humidity sensor, and receiving the current and voltage of the link in which the MXM display unit is located fed back by the current/voltage sensor;
and judging whether the operation state of the MXM display unit is a normal working state or a fault state according to the alarm signals, the operation parameters, the temperature and humidity, and the current and voltage.
Preferably, the method for independently monitoring the MXM display unit further includes:
reporting the read operation parameters of the MXM display unit to an upper computer through a rear-plug external connector, and reporting an alarm signal of the MXM display unit and the fault state to the upper computer when the operation state of the MXM display unit is judged to be the fault state;
and receiving a control command issued by the upper computer through the rear plug-in external connector to execute legal action.
Preferably, the legal action includes: resetting the MXM display unit, powering the MXM display unit on and off, and controlling the GPU main frequency of the MXM display unit.
Preferably, the fault state includes: overcurrent in the link in which the MXM display unit is located, overvoltage in that link, over-temperature of the MXM display unit, short circuit of the MXM display unit and open circuit of the MXM display unit.
With the independent monitoring device and method for the MXM display unit provided by the application, the design of a peripheral monitoring circuit allows the state of the MXM display unit and the working state of the peripheral sensors to be monitored independently and in real time, which solves the problem that the state of the MXM display unit cannot be monitored after the unit fails. In addition, the manufacturer information of the MXM display unit is read automatically during its initialization stage, and the monitoring is configured adaptively according to that information. This resolves the problems caused by inconsistent SMBus protocols among MXM display units from different manufacturers, achieves unified monitoring of such units, and thereby greatly improves the maintainability, reliability and universality of the MXM display unit.
Drawings
Fig. 1 is a schematic structural diagram of an MXM display unit independent monitoring apparatus according to the present application;
Fig. 2 is a flowchart of an embodiment of an MXM display unit independent monitoring method according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It will be understood that when an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
The MXM display unit in this embodiment refers to a graphics card module based on the MXM interface; it is used in mobile computing equipment such as notebook computers and is small and low in power consumption. In the prior art, the MXM display unit is monitored directly over the SMBus, but this approach has several drawbacks.
In the prior art, the fault state of the MXM display unit cannot be reported over the SMBus once the unit has failed. To solve this problem, the present application designs an independent monitoring device that feeds back the state of the MXM display unit in real time. The device also resolves the inconsistency among the SMBus protocols used by MXM display units from different manufacturers, so that monitoring of the MXM display unit becomes universal.
As shown in Fig. 1, this embodiment provides an independent monitoring device for an MXM display unit, used to monitor the operation state of the MXM display unit in real time, where the operation state is either a normal working state or a fault state.
The independent monitoring device of the MXM display unit in this embodiment includes a controller, a power supply module, a temperature and humidity sensor and a current/voltage sensor, wherein:
The power supply module is used for supplying power to the MXM display unit.
The temperature and humidity sensor is used for monitoring the temperature and humidity of the environment surrounding the MXM display unit and feeding them back to the controller. The surrounding environment here means the ambient environment outside but close to the MXM display unit; for example, the temperature and humidity sensor is mounted against the MXM display unit.
One end of the current/voltage sensor is connected to the power supply module and the other end is connected to the controller. The sensor monitors, in real time, the current and voltage of the link in which the MXM display unit is located and feeds them back to the controller.
The controller is connected to the MXM display unit through the MXM connector. According to the basic information of the MXM display unit, it calls the preset SMBus protocol corresponding to that basic information to establish SMBus communication with the MXM display unit, reads the operation parameters and alarm signals of the MXM display unit over the SMBus, receives the temperature and humidity and the current and voltage fed back by the temperature and humidity sensor and the current/voltage sensor, and judges whether the operation state of the MXM display unit is a normal working state or a fault state.
It should be noted that the alarm signals in this embodiment include, but are not limited to, a voltage abnormality signal, an over-temperature signal and an overload signal, and the operation parameters include, but are not limited to, the GPU main frequency, the video memory occupancy and the GPU utilization. The alarm signals and operation parameters reported by an MXM display unit depend on its basic information and are generally determined by the factory configuration. The fields output by different MXM display units differ slightly but generally include the main fields above. The present application obtains the operation parameters and alarm signals in order to monitor the operation state of the MXM display unit; the corresponding fault state can be derived directly from the alarm signals, and the operation parameters and alarm signals are not required to contain any particular fields.
During interaction between the controller and the MXM display unit, the operation parameters are uploaded to the controller in real time. The alarm signal may likewise be uploaded in real time, or the MXM display unit may upload it only when an operating abnormality occurs. That is, in the operation parameters and alarm signal read by the controller each time, the alarm signal may be an empty field or a field indicating that no abnormality has occurred; when an abnormality occurs, the corresponding abnormal signal is output together with the information that triggered it, such as the temperature, current or voltage.
The MXM display unit contains internal sensors that detect its temperature, current, voltage and so on. When the MXM display unit is in a normal working state, the operation parameters detected by these internal sensors and the alarm signal are output normally, and the values of the external sensors serve as an auxiliary reference for judging the operation state of the MXM display unit. When the MXM display unit is in a fault state, the internally detected operation parameters and the alarm signal cannot be output normally, and the values of the external sensors serve as the main reference for judging the current state of the MXM display unit, so that the fault can still be located.
The controller monitors the MXM display unit comprehensively and in real time using both the information fed back by the MXM display unit and the information fed back by the external sensors, which improves the reliability of state monitoring. After the MXM display unit fails, its fault state can still be fed back by the external sensors, which avoids the situation in which the fault state cannot be monitored over the SMBus because the unit itself has failed.
With the combined internal and external monitoring of this embodiment, the fault states that can be monitored include overcurrent in the link in which the MXM display unit is located, overvoltage in that link, over-temperature of the MXM display unit, short circuit of the MXM display unit and open circuit of the MXM display unit, so that comprehensive and reliable operation monitoring of the MXM display unit is achieved.
In order to adapt to various types of MXM display units, the power supply capability of the power supply module in this embodiment is designed for the maximum requirement; for example, it outputs three different voltages: 12 V, 5 V and 3.3 V. The power supply module also has an automatic protection function to ensure the safety of the circuit.
To facilitate information exchange between the independent monitoring device and the outside, in one embodiment the independent monitoring device of the MXM display unit further includes a rear-plug external connector. The rear-plug external connector is connected to the MXM connector and to the controller, and it supplies power to the power supply module and to the controller.
The controller of this embodiment may be a single-chip microcomputer, an ARM processor or another processor. The controller is not powered by the power supply module; that is, it has an independent power supply, so that it can monitor the startup and initialization of the MXM display unit. Further, the controller is connected to the power supply module and switches it on and off, controlling the power-on and power-off sequencing of the MXM display unit and thereby protecting the unit.
The rear-plug external connector is connected to the MXM connector through the PCIE and Display buses, and is connected to the controller through the I2C bus; it also supplies power to the controller. Rear-plug external connectors include, but are not limited to, VPX and CPCI connectors.
In one embodiment, the controller is further configured to report, through the rear-plug external connector, the operation parameters read from the MXM display unit to the upper computer, and, when the operation state of the MXM display unit is judged to be a fault state, to report the alarm signal of the MXM display unit and the fault state to the upper computer.
It should be noted that reporting to the upper computer makes it convenient to observe the operation state of the MXM display unit directly, and an alarm is generated when the MXM display unit fails so that the fault can be handled in time. The content and timing of the reports are not limited to the above; for example, the operation parameters and alarm signals that have been read may be reported to the upper computer periodically, and the fault state may be reported immediately when the operation state of the MXM display unit is a fault state.
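The reporting policy described above (periodic reports in normal operation, immediate reports on a fault) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Reporter` class, the `send` callback standing in for the link to the upper computer, the one-second period and the message layout do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Reporter:
    """Decides when the controller sends data to the upper computer."""
    send: Callable[[dict], None]   # hypothetical transport to the upper computer
    period_s: float = 1.0          # assumed reporting interval
    _last_sent: Optional[float] = field(default=None, init=False)

    def update(self, now: float, params: dict, alarms: list,
               fault: Optional[str]) -> bool:
        """Send a report if one is due; return True when something was sent."""
        if fault is not None:
            # A fault state is reported immediately, together with the alarms.
            self.send({"type": "fault", "fault": fault,
                       "alarms": alarms, "params": params})
            self._last_sent = now
            return True
        if self._last_sent is None or now - self._last_sent >= self.period_s:
            # Normal operation: parameters and alarms go out on a timer.
            self.send({"type": "periodic", "alarms": alarms, "params": params})
            self._last_sent = now
            return True
        return False
```

In this sketch the caller supplies the current time, which keeps the policy separate from any particular clock or bus driver.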
Based on the rear-plug external connector, the controller is also used for receiving control commands issued by the upper computer through the rear-plug external connector and executing legal actions. A control command may be generated and issued at any time. For example, if the MXM display unit must stop working while it is in a normal working state, the upper computer issues a power-off command to the controller; because the controller is electrically connected to the power supply module, it switches the power supply module off on receiving the command. If the MXM display unit needs to be reset while it is in a fault state, the upper computer issues a reset command to the controller, and on receiving it the controller resets the MXM display unit over the SMBus via the MXM connector.
This embodiment can therefore issue control commands at any time, causing the controller to execute legal actions such as resetting the MXM display unit, powering it on and off, and controlling its GPU main frequency, thereby protecting the operation of the MXM display unit.
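One way to read "legal action" is as a whitelist: the controller executes only the listed actions and rejects anything else. The sketch below shows that idea in Python. The command format (a dictionary with an `action` key), the `Board` stand-in and the `mhz` field are hypothetical; the application does not define the command encoding.

```python
from enum import Enum


class Action(Enum):
    RESET = "reset"
    POWER_ON = "power_on"
    POWER_OFF = "power_off"
    SET_GPU_FREQUENCY = "set_gpu_frequency"


class Board:
    """Stand-in for the hardware the controller drives (hypothetical)."""
    def __init__(self):
        self.powered = True
        self.gpu_mhz = None
        self.resets = 0


def execute(board: Board, command: dict) -> bool:
    """Run one command from the upper computer; return False if rejected."""
    try:
        action = Action(command.get("action"))
    except ValueError:
        return False               # not a listed action: reject the command
    if action is Action.RESET:
        board.resets += 1          # real hardware: reset over the SMBus
    elif action is Action.POWER_ON:
        board.powered = True       # real hardware: enable the power supply module
    elif action is Action.POWER_OFF:
        board.powered = False      # real hardware: disable the power supply module
    else:                          # Action.SET_GPU_FREQUENCY
        mhz = command.get("mhz")
        if not isinstance(mhz, int) or mhz <= 0:
            return False           # missing or invalid frequency value
        board.gpu_mhz = mhz
    return True
```

Rejecting unknown actions by default means a malformed or unexpected command cannot change the power state of the unit.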
In order to improve the reliability of the operation of the MXM display unit and better reflect the state of the MXM display unit, in an embodiment, the MXM display unit independent monitoring apparatus further includes a cooling fan, and the cooling fan is connected to the controller and is configured to adjust the rotation speed according to an instruction of the controller and feed back the current rotation speed to the controller.
Based on the alarm signal, the operation parameters, the temperature and humidity, and the current and voltage, combined with the rotation speed of the cooling fan, the controller can judge comprehensively whether the MXM display unit has a fault, which further improves the reliability of fault judgment. The cooling fan can also be used to lower the temperature when the MXM display unit is over temperature.
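As an illustration of how a controller might both command the fan and use the fed-back speed as evidence, the following Python sketch maps temperature to a duty cycle with a linear ramp and checks the measured speed against the expected one. The ramp shape, the 40 °C and 80 °C limits, the 6000 rpm maximum and the 25 % tolerance are all assumed values chosen for the example; the application specifies none of them.

```python
def fan_duty(temp_c: float, low_c: float = 40.0, high_c: float = 80.0,
             min_duty: float = 0.2) -> float:
    """Map temperature to a fan duty cycle in [min_duty, 1.0] (linear ramp)."""
    if temp_c <= low_c:
        return min_duty
    if temp_c >= high_c:
        return 1.0
    span = (temp_c - low_c) / (high_c - low_c)
    return min_duty + span * (1.0 - min_duty)


def fan_ok(duty: float, measured_rpm: float, max_rpm: float = 6000.0,
           tolerance: float = 0.25) -> bool:
    """Check that the fed-back speed is close to what the duty cycle implies."""
    expected = duty * max_rpm
    return abs(measured_rpm - expected) <= tolerance * max_rpm
```

A fan that reports a speed far from the commanded one is itself a monitoring finding, separate from any fault of the MXM display unit.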
The independent monitoring device of this embodiment does not depend on the computer's processor, the MXM display unit, the BIOS or the operating system in order to work, and is therefore highly independent. It also gathers rich monitoring information about the MXM display unit and combines internal and external data, which effectively improves the reliability of the monitoring process.
In another embodiment, there is also provided an independent monitoring method of an MXM display unit for monitoring an operation state of the MXM display unit, the independent monitoring method of the MXM display unit being implemented in the controller, comprising the steps of:
Step S1: according to the basic information of the MXM display unit, call the preset SMBus protocol corresponding to that basic information to establish SMBus communication with the MXM display unit, wherein the basic information comprises the manufacturer, model and serial number.
Step S2: receive the operation parameters and alarm signals uploaded by the MXM display unit, receive the temperature and humidity of the environment surrounding the MXM display unit fed back by the temperature and humidity sensor, and receive the current and voltage of the link in which the MXM display unit is located fed back by the current/voltage sensor.
Step S3: judge whether the operation state of the MXM display unit is a normal working state or a fault state according to the alarm signals, the operation parameters, the temperature and humidity, and the current and voltage.
The independent monitoring method of this embodiment adapts to the SMBus protocols of MXM display units from different manufacturers and is therefore highly universal. Manufacturers verified by testing include Nvidia, AMD and Jingjia Micro; MXM display units from other manufacturers can also be supported once their SMBus protocol has been preloaded.
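The selection of a preloaded protocol by manufacturer can be pictured as a lookup table keyed on the basic information. The Python sketch below is only a data-structure illustration: `SmbusProfile`, `preload`, `select_profile` and `UnsupportedVendor` are hypothetical names, and the profiles carry no real addresses or command codes because the application does not disclose any vendor's protocol details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SmbusProfile:
    name: str
    address: Optional[int] = None  # left unset: real values are vendor-specific


class UnsupportedVendor(LookupError):
    """Raised when no protocol is preloaded; a caller would raise an alarm."""


# Entries mirror the manufacturers named in the text; contents are placeholders.
_PROFILES = {
    "nvidia": SmbusProfile("nvidia"),
    "amd": SmbusProfile("amd"),
    "jingjia micro": SmbusProfile("jingjia micro"),
}


def preload(vendor: str, profile: SmbusProfile) -> None:
    """Register a protocol profile for an additional manufacturer."""
    _PROFILES[vendor.strip().lower()] = profile


def select_profile(basic_info: dict) -> SmbusProfile:
    """Pick the profile matching the manufacturer field of the basic information."""
    vendor = str(basic_info.get("manufacturer", "")).strip().lower()
    try:
        return _PROFILES[vendor]
    except KeyError:
        raise UnsupportedVendor(vendor) from None
```

Keying only on the manufacturer is a simplification; a real table might also need the model, since the basic information includes it.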
The alarm signals in this embodiment include, but are not limited to, a voltage abnormality signal, an over-temperature signal and an overload signal, and the operation parameters include, but are not limited to, the GPU main frequency, the video memory occupancy and the GPU utilization. The fault states include: overcurrent in the link in which the MXM display unit is located, overvoltage in that link, over-temperature of the MXM display unit, short circuit of the MXM display unit and open circuit of the MXM display unit.
Whether the MXM display unit is in a fault state may be judged from the operation parameters and alarm signals using existing judgment logic, or using a correspondence defined by the user as required. For example: if the alarm signal includes an over-temperature signal, the MXM display unit is judged to be in the over-temperature fault state; if the MXM display unit outputs no data to the controller within a certain time and the current/voltage sensor detects no current in the link, the MXM display unit is judged to be in a fault state, namely open circuit; if the MXM display unit outputs no data to the controller within a certain time and the current/voltage sensor detects a sudden increase in the current in the link, the MXM display unit is judged to be in a fault state, namely short circuit.
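The three example rules above can be written down as a small function. This is a sketch under stated assumptions rather than the applicant's judgment logic: the two-second silence timeout, the 3 A nominal current, the factor of two used to model a "sudden increase", the near-zero threshold and the string labels are all illustrative values.

```python
from typing import Optional


def classify(alarms: set, silent_s: float, current_a: float,
             timeout_s: float = 2.0, nominal_a: float = 3.0,
             surge_factor: float = 2.0, zero_a: float = 0.05) -> Optional[str]:
    """Return the matched fault name, or None if no example rule matches."""
    if "over_temperature" in alarms:
        return "over_temperature"
    if silent_s >= timeout_s:
        # The unit has stopped sending data: rely on the external sensor.
        if current_a <= zero_a:
            return "open_circuit"      # no current in the link
        if current_a >= surge_factor * nominal_a:
            return "short_circuit"     # current far above nominal
    return None
```

A `None` result only means that none of these three rules fired; a complete rule set would also cover the overcurrent and overvoltage states.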
When both the alarm signal and the external sensor data are available, a comprehensive judgment is made with the alarm signal as the primary basis and the external sensor data as the auxiliary basis. For example, if the alarm signal is an over-temperature signal while the temperature fed back by the external temperature and humidity sensor is normal, the operation state of the MXM display unit is still judged to be a fault state, namely over-temperature of the MXM display unit. When only the external sensor data is available, the operation state of the MXM display unit is directly judged to be a fault state, and the specific fault state is determined from the external sensor data.
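The precedence described here (alarm signal first, external sensor data second) can be sketched as follows. The function name, the tuple return value and the `"unknown"` label are hypothetical. The branch for a responsive unit with no alarm but an abnormal external reading is not specified in the text and is marked as an assumption in the code.

```python
from typing import Optional, Tuple


def judge(alarm: Optional[str], internal_alive: bool,
          external_fault: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (state, fault), giving the alarm signal precedence."""
    if internal_alive:
        if alarm is not None:
            return ("fault", alarm)           # alarm is the primary evidence
        if external_fault is not None:
            # Assumption: the text does not cover this case explicitly.
            return ("fault", external_fault)
        return ("normal", None)
    # Only the external sensors respond: the unit is treated as failed and
    # the specific fault is taken from the external data.
    return ("fault", external_fault or "unknown")
```

With these rules an over-temperature alarm yields a fault verdict even when the external temperature reading is normal, matching the example in the paragraph above.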
The fault state fed back at each point in the monitoring process may comprise a single fault or several faults. Moreover, the present application is not limited to fault monitoring of the MXM display unit; it can also monitor the device itself, for example the operation of the various sensors, to ensure the reliability of monitoring.
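Because one report may carry several faults at once, a bit-flag type is a natural representation. The Python sketch below uses the standard `enum.Flag` class with the five fault states named in the text; the `Fault` and `describe` names are illustrative and not taken from the application.

```python
from enum import Flag, auto


class Fault(Flag):
    NONE = 0
    OVERCURRENT = auto()
    OVERVOLTAGE = auto()
    OVERTEMPERATURE = auto()
    SHORT_CIRCUIT = auto()
    OPEN_CIRCUIT = auto()


def describe(faults: Fault) -> list:
    """List the names of the individual faults set in a combined value."""
    return [f.name for f in Fault if f.value and f in faults]
```

For example, `Fault.OVERCURRENT | Fault.OVERTEMPERATURE` represents a report containing two simultaneous faults, and `describe` expands it back into the individual names.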
To further strengthen monitoring and control of the MXM display unit, in an embodiment, the method for independently monitoring the MXM display unit further includes:
reporting the read operation parameters of the MXM display unit to an upper computer through the rear-plug external connector, and reporting an alarm signal of the MXM display unit and the fault state to the upper computer when the operation state of the MXM display unit is judged to be the fault state; and receiving a control command issued by the upper computer through the rear plug-in external connector to execute legal action.
It should be noted that the independent monitoring device and the independent monitoring method for the MXM display unit provided in the present application correspond to each other; details of one may be used to supplement the other and are not repeated here.
In another embodiment, as shown in Fig. 2, the independent monitoring method for the MXM display unit is split into two parts of execution logic, and MXM display unit monitoring software and upper computer management software are designed to run the respective parts. The MXM display unit monitoring software is deployed on the controller, and the upper computer management software is deployed on the upper computer. The upper computer may be a PC or a computer card (the computer card and the graphics card being connected via the PCIE and I2C buses).
1) MXM display unit monitoring software
The MXM display unit monitoring software is deployed on a controller of an independent monitoring device of the MXM display unit and mainly provides functions of display card initialization detection, power-on and power-off control, real-time state monitoring and reporting, exception handling and the like. The software runs as follows:
a. After power-on, the controller is powered first and runs an initialization program. Initialization includes voltage and current detection and detection of the cooling fan and each sensor. After initialization completes, the power supply module, the MXM display unit and the cooling fan are enabled; if initialization fails, an alarm is generated.
b. After the MXM display unit is powered on, it is allocated system resources through the PCIE interface (as a PCIE device, the MXM display unit must obtain hardware resources such as memory and IO from the computer through the PCIE interface so that the computer can identify it; its functions then become available once the display card driver is installed under the operating system), and the controller starts to monitor the status of the display card in real time.
c. After the display card driver is installed under the operating system, the upper computer management software is started. The upper computer management software generates an alarm if it detects that no display card is present. If a display card is detected, it automatically calls the display card driver interface to read the basic information of the MXM display unit, including the manufacturer, model and serial number; the basic information is read over the PCIE bus and sent down to the controller over the I2C bus.
d. The MXM display unit monitoring software calls the SMBus protocol that corresponds to the basic information to communicate with the MXM display unit, which realizes adaptive monitoring of the display card. If no corresponding SMBus protocol exists, an alarm is generated.
e. The controller reports the monitoring information to the upper computer management software at intervals over the I2C bus, which realizes real-time status reporting. The controller can also receive control commands from the management software to execute legal actions.
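Step d above amounts to a lookup from the basic information to a preset protocol. A minimal sketch, assuming a table keyed by manufacturer and model: the vendor names, profile names and the functions select_protocol and start_monitoring are placeholders chosen for illustration, not real products or APIs.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table of preset SMBus protocol profiles, keyed by
# (manufacturer, model). Real entries would come from vendor documentation.
PRESET_PROTOCOLS: Dict[Tuple[str, str], str] = {
    ("VendorA", "ModelX"): "smbus_profile_a",
    ("VendorB", "ModelY"): "smbus_profile_b",
}


def select_protocol(basic_info: dict) -> Optional[str]:
    """Return the preset profile matching the basic information, or None."""
    key = (basic_info.get("manufacturer"), basic_info.get("model"))
    return PRESET_PROTOCOLS.get(key)


def start_monitoring(basic_info: dict) -> dict:
    """Pick a protocol for the unit; raise an alarm if none is preset."""
    profile = select_protocol(basic_info)
    if profile is None:
        return {"alarm": "no matching SMBus protocol"}
    return {"profile": profile}
```

Keeping the mapping in a table means that support for a new display card is added by registering one more entry rather than changing the monitoring logic.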
2) Upper computer management software
The upper computer management software is deployed on the computing card of the equipment (the computing card and the display card are connected via PCIE and an I2C bus). Its main functions are monitoring the running state of the MXM display unit, providing a human-computer interaction interface and storing the reported log. The software operation flow is as follows:
a. When the upper computer management software is opened, it automatically detects the display card driver and sends the basic information of the display card to the controller, so that the controller (MXM display unit monitoring software) can match the SMBus protocol of the MXM display unit.
b. After the protocol is matched successfully, the upper computer management software continuously receives the monitoring information (operation parameters, alarm signals and the like) uploaded periodically by the controller, and displays it on the human-computer interaction interface in real time.
c. The human-computer interaction interface of the upper computer management software is refreshed once per second, updating the following parameters: display card manufacturer, model, serial number, display card power consumption, display card temperature, cooling fan speed, GPU main frequency, video memory main frequency, GPU core temperature, video memory occupancy, GPU utilization, PCIe bus speed and the like. Parameters shown on the interface can be added or removed according to actual display requirements.
d. On receiving a fault state reported by the controller, the upper computer management software automatically classifies it and generates a fault code, fault reason and fault grade, highlights them on the home page, and sends a control command according to the fault grade. For example, when the display card temperature or power consumption exceeds a set threshold, it reduces the GPU operating frequency, switches the GPU performance mode, increases the cooling fan speed, and the like.
As an example of this automatic classification: if the uploaded fault state is over-temperature of the MXM display unit and the alarm signal carries a temperature reading, then when the temperature exceeds a certain value the GPU may be overheating and the fault grade is judged to be warning; if the temperature exceeds the maximum value, the fault grade becomes error; and if the temperature stays at or above the maximum value for a long time, the fault grade becomes critical. The fault code, fault reason and fault grade can be adjusted according to actual requirements.
e. At intervals, the upper computer management software packages the data in the buffer and stores it on the hard disk of the equipment to form a log.
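The fault grading in step d can be illustrated with a short sketch. The function name grade_over_temperature and all of its parameters are hypothetical. The application does not specify the warning threshold, the maximum temperature or what counts as "a long time", so they are passed in as assumptions rather than fixed values.

```python
from typing import Optional


def grade_over_temperature(temp_c: float,
                           warn_c: float,
                           max_c: float,
                           seconds_at_or_above_max: float,
                           sustained_s: float) -> Optional[str]:
    """Map an over-temperature reading to a fault grade.

    warn_c, max_c and sustained_s are assumed, configurable thresholds.
    Returns "warning", "error", "critical", or None when no grade applies.
    """
    # At or above the maximum for a long time: critical.
    if temp_c >= max_c and seconds_at_or_above_max >= sustained_s:
        return "critical"
    # Above the maximum value: error.
    if temp_c > max_c:
        return "error"
    # Above the warning value: the GPU may be overheating.
    if temp_c > warn_c:
        return "warning"
    return None
```

Checking the most severe condition first ensures that a reading that satisfies several conditions receives the highest applicable grade.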
It should be noted that the execution logic of the MXM display unit monitoring software and the upper computer management software provided above is only one optional execution logic provided in the present application. In this embodiment, not only can the MXM display unit be monitored, but the monitoring information and fault information can also be displayed as prompts, so that a manager can obtain fault information in time and handle it.
In other embodiments, steps may be added to or removed from the execution logic according to actual needs, for example adding an alarm sound so that the manager learns of fault information quickly, or removing the log generation operation.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application; their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (9)

1. An independent monitoring device for an MXM display unit, used for monitoring the running state of the MXM display unit, characterized in that the independent monitoring device comprises a controller, a power supply module, a temperature and humidity sensor and a current/voltage sensor, wherein:
the power supply module is used for supplying power to the MXM display unit;
the temperature and humidity sensor is used for monitoring the temperature and humidity of the peripheral environment of the MXM display unit and feeding back the temperature and humidity to the controller;
one end of the current/voltage sensor is connected with the power supply module, and the other end of the current/voltage sensor is connected to the controller, and the current/voltage sensor is used for monitoring the current and voltage of the link where the MXM display unit is located in real time and feeding them back to the controller;
the controller is connected with the MXM display unit through the MXM connector and is used for calling, according to the basic information of the MXM display unit, a preset SMBus protocol corresponding to the basic information to establish SMBus communication with the MXM display unit, reading the operation parameters and the alarm signal of the MXM display unit through the SMBus, receiving the temperature and humidity and the current and voltage fed back by the temperature and humidity sensor and the current/voltage sensor, and judging whether the operation state of the MXM display unit is a normal working state or a fault state.
2. The independent monitoring device for an MXM display unit of claim 1, further comprising a rear-plug external connector, wherein the rear-plug external connector is connected to the MXM connector and the controller, respectively, and the rear-plug external connector supplies power to the power supply module and the controller, respectively.
3. The independent monitoring device for an MXM display unit of claim 2, wherein the controller is further configured to report the read operation parameters of the MXM display unit to an upper computer through the rear-plug external connector, and to report the alarm signal of the MXM display unit and the fault state to the upper computer when the operation state of the MXM display unit is judged to be the fault state;
and to receive a control command issued by the upper computer through the rear-plug external connector to execute a legal action, the legal action comprising: controlling the MXM display unit to reset, controlling the MXM display unit to power on and power off, and controlling the GPU main frequency of the MXM display unit.
4. The standalone monitoring device for an MXM display unit of claim 1, further comprising a cooling fan connected to the controller for adjusting a rotation speed according to a command of the controller and feeding back a current rotation speed to the controller.
5. The independent monitoring device of the MXM display unit of claim 1, wherein the fault condition includes: overcurrent of a link where the MXM display unit is located, overvoltage of a link where the MXM display unit is located, overtemperature of the MXM display unit, short circuit of the MXM display unit and disconnection of the MXM display unit.
6. An independent monitoring method for an MXM display unit, which is used for monitoring the running state of the MXM display unit, and is characterized in that the independent monitoring method for the MXM display unit is implemented in the controller, and comprises the following steps:
calling a preset SMBus bus protocol corresponding to the basic information to establish SMBus bus communication with the MXM display unit according to the basic information of the MXM display unit, wherein the basic information comprises a manufacturer, a model and a serial number;
receiving operation parameters and alarm signals uploaded by the MXM display unit, receiving the temperature and humidity of the peripheral environment of the MXM display unit fed back by a temperature and humidity sensor, and receiving the current and voltage of the link where the MXM display unit is located fed back by a current/voltage sensor;
and judging whether the operation state of the MXM display unit is a normal working state or a fault state according to the alarm signal, the operation parameters, the temperature and humidity, and the current and voltage.
7. The method of independent monitoring of an MXM display unit of claim 6, further comprising:
reporting the read operation parameters of the MXM display unit to an upper computer through a rear-plug external connector, and reporting an alarm signal of the MXM display unit and the fault state to the upper computer when the operation state of the MXM display unit is judged to be the fault state;
and receiving a control command issued by the upper computer through the rear-plug external connector to execute a legal action.
8. The method for independent monitoring of an MXM display unit of claim 7, wherein the legal action comprises: controlling the MXM display unit to reset, controlling the MXM display unit to power on and power off, and controlling the GPU main frequency of the MXM display unit.
9. The method of independently monitoring the MXM display unit of claim 6, wherein the fault condition includes: overcurrent of a link where the MXM display unit is located, overvoltage of a link where the MXM display unit is located, overtemperature of the MXM display unit, short circuit of the MXM display unit and disconnection of the MXM display unit.
CN202011000861.1A 2020-09-22 2020-09-22 Independent monitoring device and method for MXM display unit Pending CN112255939A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011000861.1A CN112255939A (en) 2020-09-22 2020-09-22 Independent monitoring device and method for MXM display unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011000861.1A CN112255939A (en) 2020-09-22 2020-09-22 Independent monitoring device and method for MXM display unit

Publications (1)

Publication Number Publication Date
CN112255939A true CN112255939A (en) 2021-01-22

Family

ID=74232769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011000861.1A Pending CN112255939A (en) 2020-09-22 2020-09-22 Independent monitoring device and method for MXM display unit

Country Status (1)

Country Link
CN (1) CN112255939A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101582046A (en) * 2009-06-26 2009-11-18 浪潮电子信息产业股份有限公司 High-available system state monitoring, forcasting and intelligent management method
CN102368224A (en) * 2011-06-29 2012-03-07 奇智软件(北京)有限公司 Processing method and device for hardware detection
CN205139904U (en) * 2015-11-23 2016-04-06 常州信息职业技术学院 Computer operational monitoring system
CN107395463A (en) * 2017-09-05 2017-11-24 合肥爱吾宠科技有限公司 Computer hardware operational factor network monitoring system
KR20170142818A (en) * 2016-06-20 2017-12-28 비씨카드(주) Method for controlling operation of display card and display card
CN108572903A (en) * 2017-03-10 2018-09-25 艾维克科技股份有限公司 Wireless monitoring device for display card
US20180341300A1 (en) * 2017-05-23 2018-11-29 Evga Corporation Wireless graphics card monitoring device
CN209496369U (en) * 2018-10-30 2019-10-15 亳州学院 A kind of computer fault alarm system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210122
