CN107145428A - Server and server monitoring method
- Publication number
- CN107145428A (application number CN201710381927.8A)
- Authority
- CN
- China
- Prior art keywords
- bmc
- control element
- logic control
- signal
- server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3055—Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3058—Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3089—Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
- G06F11/3093—Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a server and a server monitoring method. When the BMC works normally, the BMC sends a heartbeat signal to the logic control element; when the logic control element detects the heartbeat signal, it forwards the monitoring signal sent by the monitored unit directly to the BMC. When the BMC works abnormally, i.e. when the logic control element cannot detect the heartbeat signal, the logic control element itself checks the state of the monitored unit; when the level of a monitoring signal differs from its normal level, the logic control element records the number of that monitoring signal in RAM and, once the BMC's heartbeat signal is detected again, sends the recorded number to the BMC through the log output signal. With the invention, the monitoring-signal changes that occur while the BMC is failed or resetting can still be read, which improves the stability and reliability of the server.
Description
Technical field
The present invention relates to servers and to server monitoring.
Background
In server design, an out-of-band management system is needed to monitor indicators of the server such as power consumption, voltage, fans and power on/off state. The management system uses dedicated management controllers, which are generally divided by function into two kinds: the BMC (baseboard management controller), which monitors the server mainboard, and the SMC (system management controller), which monitors the whole server system. To ensure the reliability of the server, the SMC usually adopts a redundant design: two SMCs are configured, and when either SMC fails the other keeps the management system working normally.
The BMC, however, generally does not adopt a redundant design. Fig. 1 shows the common connection in existing designs: only one BMC is integrated on each mainboard or node, and it monitors that mainboard or node. The monitored units send their monitoring signals directly to the BMC. The monitored units are usually devices such as a power module, a CPU or a PCH. A monitoring signal can be, for example, a signal indicating the power supply status or a CPU overheat signal. Thus, when the system power state, temperature or another quantity goes out of limits, the BMC can detect and record the event and carry out the next operation.
This creates a server reliability problem: while the BMC is failed or resetting, the working health state of the mainboard or node (power consumption, voltage, whether there is error information, etc.) cannot be monitored.
CPLD: Complex Programmable Logic Device.
FPGA: Field-Programmable Gate Array.
Summary of the invention
The present invention addresses the technical problem that the working health state of the mainboard or node (power consumption, voltage, whether there is error information, etc.) cannot be monitored when the BMC fails. To this end, the present invention provides a server and a server monitoring method with the advantage that, when the BMC fails or resets, the monitoring-signal changes that occur during the failure or reset can still be read, which improves the stability and reliability of the server.
To achieve this, the present invention adopts the following technical solution.
A server, comprising:
a BMC, connected to a logic control element, which sends a heartbeat signal to the logic control element and receives the monitoring signal and the log output signal sent by the logic control element;
a logic control element, connected to a monitored unit, which receives the monitoring signal sent by the monitored unit;
a monitored unit, which sends the monitoring signal to the logic control element.
Preferably, the logic control element is either a CPLD or an FPGA.
Preferably, the monitored unit is one or more of a power module, a CPU, a PCH, a network chip and a system power supply.
For the power module, what is monitored is whether the voltage output is present and whether its amplitude is normal; for the CPU, whether an error is reported; for the PCH, whether an error is reported; for the network chip, whether the network link is connected; for the system power supply, whether there is an error and whether the output is normal. (The system power supply is the module that converts the 220 V supply into DC power; the power module converts the DC power produced by the system power supply into the various supplies the board needs.)
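The comparison of each monitoring signal against its normal level can be sketched as follows. This is only an illustrative Python model, not the patent's implementation: the signal numbers, the `MonitorSignal` type, the `abnormal_numbers` helper and the assumed polarities (which level counts as "normal") are all hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorSignal:
    number: int        # number the logic control element would record
    source: str        # monitored unit that drives the signal
    meaning: str       # what the signal indicates
    normal_level: int  # assumed logic level (0 or 1) when healthy


# Hypothetical signal table: numbers and polarities are illustrative only.
SIGNALS = [
    MonitorSignal(0, "power module", "voltage output present and in range", 1),
    MonitorSignal(1, "CPU", "error reported", 0),
    MonitorSignal(2, "PCH", "error reported", 0),
    MonitorSignal(3, "network chip", "link connected", 1),
    MonitorSignal(4, "system power supply", "output normal", 1),
]


def abnormal_numbers(levels):
    """Return the numbers of all signals whose sampled level differs
    from the normal level listed in SIGNALS."""
    return [s.number for s in SIGNALS if levels[s.number] != s.normal_level]


# The CPU reports an error (signal 1 high) and the network link is down
# (signal 3 low); the other signals are at their normal levels.
sample = {0: 1, 1: 1, 2: 0, 3: 0, 4: 1}
print(abnormal_numbers(sample))  # [1, 3]
```

Keeping the normal level next to each signal number means the abnormality check is a single comparison per signal, which matches the level-versus-normal-level test described in the method.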
A server monitoring method, comprising the following steps:
When the BMC works normally, the BMC sends a heartbeat signal to the logic control element. When the logic control element detects the heartbeat signal, it forwards the monitoring signal sent by the monitored unit directly to the BMC.
When the BMC works abnormally, i.e. when the logic control element cannot detect the heartbeat signal, the logic control element checks the state of the monitored unit. When the level of a monitoring signal differs from its normal level, the logic control element records the number of that monitoring signal in RAM; after the BMC's heartbeat signal is detected again, it sends the recorded number of the monitoring signal to the BMC through the log output signal.
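The two branches of the method above can be modelled behaviourally. The following Python sketch makes several assumptions that the text does not state: the class and method names are hypothetical, the logic is evaluated once per sampling cycle, each abnormal signal number is recorded at most once per outage, and the RAM is cleared after its contents have been sent to the BMC.

```python
class LogicControlElement:
    """Behavioural model of the logic control element (illustrative only)."""

    def __init__(self, normal_levels):
        self.normal_levels = dict(normal_levels)  # signal number -> normal level
        self.ram = []  # numbers of abnormal signals, in order of detection

    def step(self, heartbeat_present, levels):
        """Process one sampling cycle.

        Returns (forwarded, log): `forwarded` is the set of monitoring
        signals passed straight to the BMC (None while the BMC is down),
        and `log` is the list of recorded numbers sent on the log output
        signal.
        """
        if heartbeat_present:
            # BMC alive: forward signals and flush anything recorded earlier.
            log, self.ram = self.ram, []
            return dict(levels), log
        # BMC down: take over recording of abnormal signal numbers.
        for number, level in sorted(levels.items()):
            if level != self.normal_levels[number] and number not in self.ram:
                self.ram.append(number)
        return None, []


lce = LogicControlElement({0: 1, 1: 0})
print(lce.step(True, {0: 1, 1: 0}))   # ({0: 1, 1: 0}, [])
print(lce.step(False, {0: 0, 1: 0}))  # (None, [])
print(lce.step(False, {0: 0, 1: 1}))  # (None, [])
print(lce.step(True, {0: 1, 1: 0}))   # ({0: 1, 1: 0}, [0, 1])
```

In the last call the heartbeat has returned, so the numbers recorded during the outage (0, then 1) are delivered to the BMC together with the live monitoring signals.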
Beneficial effects of the present invention: the invention improves on existing designs, in which the working health state of the mainboard or node (power consumption, voltage, whether there is error information, etc.) cannot be monitored while the BMC is failed or resetting. The CPLD judges the working state of the BMC and decides whether to take over the task of recording the monitoring information. With this technique, the monitoring-signal changes that occur while the BMC is failed or resetting can still be read, which improves the stability and reliability of the server.
Brief description of the drawings
Fig. 1 is a circuit connection diagram of a prior-art server.
Fig. 2 is a circuit connection diagram of the server of the present embodiment.
Detailed description of the embodiment
The invention is further described below with reference to the accompanying drawings and an embodiment.
As shown in Fig. 2, a server comprises: a BMC, connected to a CPLD, which sends a heartbeat signal to the CPLD and receives the monitoring signal and the log output signal sent by the CPLD; the CPLD, connected to the monitored units, which receives the monitoring signals sent by the monitored units; and the monitored units, which send monitoring signals to the CPLD. The monitored units include a power module, a CPU and a PCH.
The server monitoring method comprises the following steps:
When the BMC works normally, the BMC sends a heartbeat signal to the CPLD. When the CPLD detects the heartbeat signal, it forwards the monitoring signals sent by the monitored units directly to the BMC.
When the BMC works abnormally, i.e. when the CPLD cannot detect the heartbeat signal, the CPLD checks the state of the monitored units. When the level of a monitoring signal differs from its normal level, the CPLD records the number of that monitoring signal in RAM; after the BMC's heartbeat signal is detected again, it sends the recorded number of the monitoring signal to the BMC through the log output signal.
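The embodiment does not say how the CPLD decides that the heartbeat is missing. One common approach, shown here purely as an assumption, is to treat the heartbeat as a toggling line and to declare the BMC down when no edge has been seen for a fixed number of clock cycles. The Python model below is a behavioural sketch rather than HDL; `HeartbeatDetector`, `tick` and `timeout_cycles` are hypothetical names.

```python
class HeartbeatDetector:
    """Edge-timeout heartbeat detector (behavioural sketch, not HDL)."""

    def __init__(self, timeout_cycles):
        self.timeout_cycles = timeout_cycles  # assumed, design-specific value
        self.last_level = None                # previous sample of the line
        self.idle_cycles = 0                  # cycles since the last edge

    def tick(self, level):
        """Sample the heartbeat line once per clock cycle.

        Returns True while the BMC is considered alive, i.e. while fewer
        than `timeout_cycles` cycles have passed since the last edge.
        """
        if self.last_level is not None and level != self.last_level:
            self.idle_cycles = 0       # edge seen: restart the timeout
        else:
            self.idle_cycles += 1      # no edge in this cycle
        self.last_level = level
        return self.idle_cycles < self.timeout_cycles


det = HeartbeatDetector(timeout_cycles=3)
samples = [0, 1, 0, 0, 0, 0, 1]
print([det.tick(s) for s in samples])
# [True, True, True, True, True, False, True]
```

The line stays low for three cycles without an edge, so the detector reports the BMC as down on the sixth sample and as alive again as soon as the next edge arrives. The returned flag could serve as the heartbeat-present input that selects between forwarding and recording.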
Although the specific embodiment of the present invention has been described above with reference to the accompanying drawings, it does not limit the scope of protection of the invention. Those of ordinary skill in the art should understand that, on the basis of the technical solution of the present invention, the various modifications or variations that those skilled in the art can make without creative effort remain within the scope of protection of the invention.
Claims (4)
1. A server, characterized in that it comprises:
a BMC, connected to a logic control element, which sends a heartbeat signal to the logic control element and receives the monitoring signal and the log output signal sent by the logic control element;
a logic control element, connected to a monitored unit, which receives the monitoring signal sent by the monitored unit;
a monitored unit, which sends the monitoring signal to the logic control element.
2. The server according to claim 1, characterized in that the logic control element is either a CPLD or an FPGA.
3. The server according to claim 1, characterized in that the monitored unit is one or more of a power module, a CPU, a PCH, a network chip and a system power supply.
4. A server monitoring method according to claims 1-4, comprising the following steps:
when the BMC works normally, the BMC sends a heartbeat signal to the logic control element; when the logic control element detects the heartbeat signal, it forwards the monitoring signal sent by the monitored unit directly to the BMC;
when the BMC works abnormally, i.e. when the logic control element cannot detect the heartbeat signal, the logic control element checks the state of the monitored unit; when the level of a monitoring signal differs from its normal level, the logic control element records the number of that monitoring signal in RAM and, after the BMC's heartbeat signal is detected again, sends the recorded number of the monitoring signal to the BMC through the log output signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710381927.8A CN107145428A (en) | 2017-05-26 | 2017-05-26 | A kind of server and server monitoring method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710381927.8A CN107145428A (en) | 2017-05-26 | 2017-05-26 | A kind of server and server monitoring method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107145428A true CN107145428A (en) | 2017-09-08 |
Family
ID=59780272
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710381927.8A Pending CN107145428A (en) | 2017-05-26 | 2017-05-26 | A kind of server and server monitoring method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107145428A (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107608925A (en) * | 2017-10-09 | 2018-01-19 | 郑州云海信息技术有限公司 | A kind of Server Extension card information acquisition methods and device |
CN107783788A (en) * | 2017-10-26 | 2018-03-09 | 英业达科技有限公司 | The method started shooting after detection means and detection before start |
CN107797880A (en) * | 2017-11-29 | 2018-03-13 | 济南浪潮高新科技投资发展有限公司 | A kind of method for improving server master board BMC reliabilities |
CN108038019A (en) * | 2017-12-25 | 2018-05-15 | 曙光信息产业(北京)有限公司 | A kind of automatically restoring fault method and system of baseboard management controller |
CN108170546A (en) * | 2017-12-15 | 2018-06-15 | 山东超越数控电子股份有限公司 | A kind of repositioning method based on EC |
CN108255646A (en) * | 2018-01-17 | 2018-07-06 | 重庆大学 | A kind of self-healing method of industrial control program failure based on heartbeat detection |
CN108762142A (en) * | 2018-05-24 | 2018-11-06 | 新华三技术有限公司 | A kind of communication equipment and its processing method |
CN108919935A (en) * | 2018-07-12 | 2018-11-30 | 浪潮电子信息产业股份有限公司 | Monitoring method, device and equipment for power supply on server mainboard |
CN109723666A (en) * | 2018-11-26 | 2019-05-07 | 曙光信息产业股份有限公司 | Fan control device and method |
CN109826822A (en) * | 2019-04-11 | 2019-05-31 | 苏州浪潮智能科技有限公司 | A kind of control method for fan and relevant apparatus |
CN109882440A (en) * | 2019-04-16 | 2019-06-14 | 苏州浪潮智能科技有限公司 | A kind of fan rotation speed control apparatus and control method |
CN110244630A (en) * | 2019-06-21 | 2019-09-17 | 深圳市三旺通信股份有限公司 | Serial server based on programmable logic device online acquisition serial interface signal |
CN110245106A (en) * | 2019-06-21 | 2019-09-17 | 深圳市三旺通信股份有限公司 | The serial server of SCM Based online acquisition serial interface signal |
CN110262341A (en) * | 2019-06-21 | 2019-09-20 | 深圳市三旺通信股份有限公司 | The CAN server of SCM Based online acquisition CAN interface signal |
CN110262342A (en) * | 2019-06-21 | 2019-09-20 | 深圳市三旺通信股份有限公司 | CAN server based on programmable logic device online acquisition CAN signal |
CN110502377A (en) * | 2019-08-08 | 2019-11-26 | 苏州浪潮智能科技有限公司 | It is a kind of that test method is restarted based on CPLD |
CN110597745A (en) * | 2019-09-20 | 2019-12-20 | 苏州浪潮智能科技有限公司 | Method and device for realizing multi-master multi-slave I2C communication of switch system |
TWI684859B (en) * | 2018-01-12 | 2020-02-11 | 廣達電腦股份有限公司 | Method for remote system recovery |
CN111639005A (en) * | 2020-05-19 | 2020-09-08 | 成都市爱科科技实业有限公司 | Independent monitoring system and method for server state |
CN113064664A (en) * | 2021-03-02 | 2021-07-02 | 凌华科技(中国)有限公司 | Control method and device, complex programmable logic device and server |
CN113064479A (en) * | 2021-03-03 | 2021-07-02 | 山东英信计算机技术有限公司 | Power supply redundancy control system, method and medium of GPU server |
CN114691408A (en) * | 2022-04-18 | 2022-07-01 | 苏州浪潮智能科技有限公司 | Fault detection device for substrate management controller |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080313312A1 (en) * | 2006-12-06 | 2008-12-18 | David Flynn | Apparatus, system, and method for a reconfigurable baseboard management controller |
CN103835972A (en) * | 2012-11-20 | 2014-06-04 | 英业达科技有限公司 | Fan rotating speed control system and method for control rotating speed of fan |
CN104063300A (en) * | 2014-01-18 | 2014-09-24 | 浪潮电子信息产业股份有限公司 | Acquisition device based on FPGA (Field Programmable Gate Array) for monitoring information of high-end multi-channel server |
CN105117317A (en) * | 2015-08-17 | 2015-12-02 | 浪潮(北京)电子信息产业有限公司 | Method and device for monitoring server performance |
CN105808398A (en) * | 2016-03-08 | 2016-07-27 | 浪潮电子信息产业股份有限公司 | Method for rapidly analyzing and positioning hardware abnormity |
-
2017
- 2017-05-26 CN CN201710381927.8A patent/CN107145428A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080313312A1 (en) * | 2006-12-06 | 2008-12-18 | David Flynn | Apparatus, system, and method for a reconfigurable baseboard management controller |
CN103835972A (en) * | 2012-11-20 | 2014-06-04 | 英业达科技有限公司 | Fan rotating speed control system and method for control rotating speed of fan |
CN104063300A (en) * | 2014-01-18 | 2014-09-24 | 浪潮电子信息产业股份有限公司 | Acquisition device based on FPGA (Field Programmable Gate Array) for monitoring information of high-end multi-channel server |
CN105117317A (en) * | 2015-08-17 | 2015-12-02 | 浪潮(北京)电子信息产业有限公司 | Method and device for monitoring server performance |
CN105808398A (en) * | 2016-03-08 | 2016-07-27 | 浪潮电子信息产业股份有限公司 | Method for rapidly analyzing and positioning hardware abnormity |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107608925A (en) * | 2017-10-09 | 2018-01-19 | 郑州云海信息技术有限公司 | A kind of Server Extension card information acquisition methods and device |
CN107783788A (en) * | 2017-10-26 | 2018-03-09 | 英业达科技有限公司 | The method started shooting after detection means and detection before start |
CN107797880A (en) * | 2017-11-29 | 2018-03-13 | 济南浪潮高新科技投资发展有限公司 | A kind of method for improving server master board BMC reliabilities |
CN108170546A (en) * | 2017-12-15 | 2018-06-15 | 山东超越数控电子股份有限公司 | A kind of repositioning method based on EC |
CN108038019A (en) * | 2017-12-25 | 2018-05-15 | 曙光信息产业(北京)有限公司 | A kind of automatically restoring fault method and system of baseboard management controller |
CN108038019B (en) * | 2017-12-25 | 2021-06-11 | 曙光信息产业(北京)有限公司 | Automatic fault recovery method and system for substrate management controller |
TWI684859B (en) * | 2018-01-12 | 2020-02-11 | 廣達電腦股份有限公司 | Method for remote system recovery |
US10846160B2 (en) | 2018-01-12 | 2020-11-24 | Quanta Computer Inc. | System and method for remote system recovery |
CN108255646A (en) * | 2018-01-17 | 2018-07-06 | 重庆大学 | A kind of self-healing method of industrial control program failure based on heartbeat detection |
CN108255646B (en) * | 2018-01-17 | 2022-02-01 | 重庆大学 | Industrial control application program fault self-recovery method based on heartbeat detection |
CN108762142A (en) * | 2018-05-24 | 2018-11-06 | 新华三技术有限公司 | A kind of communication equipment and its processing method |
CN108919935A (en) * | 2018-07-12 | 2018-11-30 | 浪潮电子信息产业股份有限公司 | Monitoring method, device and equipment for power supply on server mainboard |
CN109723666A (en) * | 2018-11-26 | 2019-05-07 | 曙光信息产业股份有限公司 | Fan control device and method |
CN109826822B (en) * | 2019-04-11 | 2021-06-29 | 苏州浪潮智能科技有限公司 | Fan control method and related device |
CN109826822A (en) * | 2019-04-11 | 2019-05-31 | 苏州浪潮智能科技有限公司 | A kind of control method for fan and relevant apparatus |
CN109882440A (en) * | 2019-04-16 | 2019-06-14 | 苏州浪潮智能科技有限公司 | A kind of fan rotation speed control apparatus and control method |
CN110262341A (en) * | 2019-06-21 | 2019-09-20 | 深圳市三旺通信股份有限公司 | The CAN server of SCM Based online acquisition CAN interface signal |
CN110262342A (en) * | 2019-06-21 | 2019-09-20 | 深圳市三旺通信股份有限公司 | CAN server based on programmable logic device online acquisition CAN signal |
CN110245106A (en) * | 2019-06-21 | 2019-09-17 | 深圳市三旺通信股份有限公司 | The serial server of SCM Based online acquisition serial interface signal |
CN110244630A (en) * | 2019-06-21 | 2019-09-17 | 深圳市三旺通信股份有限公司 | Serial server based on programmable logic device online acquisition serial interface signal |
CN110502377A (en) * | 2019-08-08 | 2019-11-26 | 苏州浪潮智能科技有限公司 | It is a kind of that test method is restarted based on CPLD |
CN110502377B (en) * | 2019-08-08 | 2021-04-27 | 苏州浪潮智能科技有限公司 | Restarting test method based on CPLD |
CN110597745A (en) * | 2019-09-20 | 2019-12-20 | 苏州浪潮智能科技有限公司 | Method and device for realizing multi-master multi-slave I2C communication of switch system |
CN111639005A (en) * | 2020-05-19 | 2020-09-08 | 成都市爱科科技实业有限公司 | Independent monitoring system and method for server state |
CN113064664A (en) * | 2021-03-02 | 2021-07-02 | 凌华科技(中国)有限公司 | Control method and device, complex programmable logic device and server |
CN113064479A (en) * | 2021-03-03 | 2021-07-02 | 山东英信计算机技术有限公司 | Power supply redundancy control system, method and medium of GPU server |
WO2022183877A1 (en) * | 2021-03-03 | 2022-09-09 | 山东英信计算机技术有限公司 | Power redundancy control system and method for gpu server, and medium |
CN114691408A (en) * | 2022-04-18 | 2022-07-01 | 苏州浪潮智能科技有限公司 | Fault detection device for substrate management controller |
CN114691408B (en) * | 2022-04-18 | 2024-07-02 | 苏州浪潮智能科技有限公司 | Fault detection device of substrate management controller |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170908 |