CN108287778A - Method for testing the reliability of server ambient temperature monitoring - Google Patents

Info

Publication number
CN108287778A
Authority
CN
China
Prior art keywords
temperature
read
service device
detection service
monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810037212.5A
Other languages
Chinese (zh)
Inventor
岳远斌
孙心
孙一心
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201810037212.5A
Publication of CN108287778A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3058 Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored

Abstract

An embodiment of the invention discloses a method for testing the reliability of server ambient temperature monitoring, comprising: reading the displayed temperature recorded by the BMC chip; reading the register information of the temperature sensor to obtain the actual temperature; comparing the displayed temperature with the actual temperature; and repeating the reading of the displayed temperature and the actual temperature at a fixed interval and comparing them. Comparative analysis of the recorded temperature information shows the long-term operating stability of the temperature sensor. The script can also be run in different external environments to simulate different data center environments and test the stability of the server's ambient temperature monitoring. Because the script is executed in-band, temperature monitoring stability under different external environments can be tested conveniently, data center environments can be simulated more directly, potential server risks can be identified in advance, and the machine failure rate is substantially reduced.

Description

Method for testing the reliability of server ambient temperature monitoring
Technical field
The present invention relates to the field of server testing.
Background
With the arrival of the era of big data, cloud computing, and artificial intelligence, Internet traffic and data volume have grown sharply, and the computing load has grown with them. As data volume increases, servers are deployed in ever greater numbers and at ever higher density, and the workload on each machine keeps rising. Core components generate more heat and the temperature inside the machine rises. The temperature that core components can tolerate is limited: if they run for long periods in a hot environment, their performance degrades and their service life shortens, which in turn shortens the service life of the server. During server operation, an excessively high ambient temperature places the whole server in a hot environment, and the temperature of core components such as the CPU rises even further. When the CPU temperature reaches a certain level, the server throttles its frequency, which seriously affects computing performance. If the temperature continues to rise, the server shuts down abnormally, which interrupts the customer's services and causes data loss, with losses that cannot be estimated. This places higher demands on the reliability of server ambient temperature monitoring and of the temperature monitoring devices.
In server systems, a BMC (Baseboard Management Controller) is usually used to monitor and manage the health of the mainboard. Important mainboard parameters such as voltage, temperature, and power consumption are all monitored and recorded by the BMC. The server's ambient temperature monitoring chain consists of two parts: first, the temperature sensor and its external transistor; second, the BMC chip. The temperature sensor first collects temperature information from its own internal transistor and from the external transistor and stores it in separate registers. The BMC chip then reads the register information from the temperature sensor over the I2C bus and converts it internally into a reading in degrees Celsius, thereby collecting and monitoring the mainboard temperature information. Based on the collected temperature information, the BMC adjusts the fan speed according to the fan control strategy to ensure proper heat dissipation. Raising the fan speed, however, increases the vibration and the power consumption of the whole machine, which customers' data centers cannot accept. The stability and reliability of ambient temperature monitoring must therefore be guaranteed, and testing the reliability of server ambient temperature monitoring is of great significance.
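The conversion step that the BMC performs on the raw register value can be illustrated with a minimal sketch. The patent does not specify the register format, so the code assumes an 8-bit register with a resolution of 1 °C per bit and two's-complement encoding for negative values; the function name `register_to_celsius` is hypothetical. A real sensor may use a different width, offset, or resolution, so its datasheet takes precedence.

```python
def register_to_celsius(raw_hex: str) -> int:
    """Convert a raw sensor register value (hex string) to degrees Celsius.

    Assumption (not stated in the patent): the register is 8 bits wide,
    holds 1 degree C per bit, and encodes negatives in two's complement.
    """
    value = int(raw_hex, 16)  # accepts "1f", "1F" and "0x1f"
    if not 0 <= value <= 0xFF:
        raise ValueError(f"not an 8-bit register value: {raw_hex!r}")
    # Bit 7 set means a negative temperature under the two's-complement assumption.
    return value - 0x100 if value & 0x80 else value


print(register_to_celsius("1f"))  # prints 31
```

Under these assumptions the register value 1f corresponds to 31 °C, which matches the conversion reported in the embodiment.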
Summary of the invention
The present invention addresses the technical problem of testing the reliability of server ambient temperature monitoring. To this end, the invention provides a method for testing the reliability of server ambient temperature monitoring, which has the advantage of making it easy to test the long-term stability of ambient temperature monitoring in different external environments.
To achieve the above object, the invention adopts the following technical solution.
A method for testing the reliability of server ambient temperature monitoring comprises the following steps:
reading the displayed temperature recorded by the BMC chip; reading the register information of the temperature sensor to obtain the actual temperature; comparing the displayed temperature with the actual temperature; and repeating the reading of the displayed temperature and the actual temperature at a fixed interval and comparing them.
Beneficial effects of the invention: comparative analysis of the recorded temperature information shows the long-term operating stability of the temperature sensor. The script can also be run in different external environments to simulate different data center environments and test the stability of the server's ambient temperature monitoring.
Because the script is executed in-band, temperature monitoring stability under different external environments can be tested conveniently and data center environments can be simulated more directly. Potential server risks are identified in advance and the machine failure rate is substantially reduced, which also improves customer satisfaction and product competitiveness.
Detailed description
The invention is further described below with reference to an embodiment.
In the method for testing the reliability of server ambient temperature monitoring, the temperature recorded and displayed by the BMC chip is captured through the Intelligent Platform Management Interface (IPMI). The actual temperature value in the register of the temperature sensor is also read through IPMI. The displayed temperature and the actual temperature that are read are output to the same file. The temperature information is read in a loop every 10 seconds. After 4 read cycles, the results are output to the file.
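The read loop of this embodiment can be sketched as follows. The patent does not give the script itself, so everything named here is an assumption for illustration: `collect_samples`, `format_log`, and the two reader callables are hypothetical, and in practice the readers would wrap whatever in-band IPMI commands the platform provides. The register value is converted with a plain hexadecimal-to-integer conversion, which assumes 1 °C per bit.

```python
import time
from typing import Callable, List, NamedTuple


class Sample(NamedTuple):
    index: int          # serial number of the read cycle
    displayed_c: int    # temperature shown by the BMC, in degrees C
    register_hex: str   # raw sensor register value as a hex string
    actual_c: int       # register value converted to degrees C


def collect_samples(
    read_displayed: Callable[[], int],   # hypothetical reader for the BMC value
    read_register: Callable[[], str],    # hypothetical reader for the sensor register
    cycles: int = 4,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Sample]:
    """Read both temperatures once per cycle, waiting interval_s between cycles."""
    samples = []
    for index in range(1, cycles + 1):
        displayed_c = read_displayed()
        register_hex = read_register()
        samples.append(Sample(index, displayed_c, register_hex, int(register_hex, 16)))
        if index < cycles:  # no need to wait after the final read
            sleep(interval_s)
    return samples


def format_log(samples: List[Sample]) -> str:
    """Render all samples as tab-separated text so both values land in one file."""
    header = "No.\tBMC displayed (C)\tRegister (hex)\tActual (C)"
    rows = [f"{s.index}\t{s.displayed_c}\t{s.register_hex}\t{s.actual_c}" for s in samples]
    return "\n".join([header] + rows) + "\n"
```

Passing the readers and the `sleep` function as parameters is a design choice made for this sketch: it lets the loop be exercised with fake readers and no real delay, and the text returned by `format_log` can then be written to a single file for comparison.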
Table 1. Data reading records
No.   BMC chip displayed temperature   Temperature sensor register value (hex)   Converted actual temperature
1     31 °C                            1f                                        31 °C
2     31 °C                            1f                                        31 °C
3     30 °C                            1f                                        31 °C
4     31 °C                            1f                                        31 °C
Table 1 shows the results of the 4 read cycles. In these 4 results, the temperature recorded in the BMC chip is about 31 °C, and the value recorded in the temperature sensor register is hexadecimal 1f, which converts to 31 °C. Within the allowed error range, the ambient temperature monitoring can therefore be judged to be stable and reliable.
Although specific embodiments of the invention have been described above, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that they can make on the basis of the technical solution of the invention without creative effort still fall within the scope of protection of the invention.
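The judgment applied to Table 1 can be expressed as a small check. The patent refers only to "the allowed error range" without giving a number, so the 1 °C tolerance below is an assumption chosen for illustration, as are the names `max_deviation` and `monitoring_is_reliable`. The data are the four rows of Table 1.

```python
from typing import Iterable, Tuple

# (BMC displayed temperature in degrees C, sensor register value in hex), from Table 1
TABLE_1 = [(31, "1f"), (31, "1f"), (30, "1f"), (31, "1f")]


def max_deviation(rows: Iterable[Tuple[int, str]]) -> int:
    """Largest absolute difference between the displayed and converted temperatures."""
    return max(abs(displayed - int(raw, 16)) for displayed, raw in rows)


def monitoring_is_reliable(rows: Iterable[Tuple[int, str]], tolerance_c: int = 1) -> bool:
    """True when every reading stays within tolerance_c (an assumed threshold)."""
    return max_deviation(rows) <= tolerance_c


print(max_deviation(TABLE_1))           # prints 1
print(monitoring_is_reliable(TABLE_1))  # prints True
```

The only deviation in Table 1 is the third reading, where the BMC shows 30 °C against a converted 31 °C, so the result depends entirely on the tolerance chosen: with a tolerance of 0 °C the same data would fail the check.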

Claims (4)

1. A method for testing the reliability of server ambient temperature monitoring, characterized by comprising the following steps:
reading the displayed temperature recorded by the BMC chip; reading the register information of the temperature sensor to obtain the actual temperature; comparing the displayed temperature with the actual temperature; and repeating the reading of the displayed temperature and the actual temperature at a fixed interval and comparing them.
2. The method for testing the reliability of server ambient temperature monitoring according to claim 1, characterized in that the reading is performed by an in-band IPMI script.
3. The method for testing the reliability of server ambient temperature monitoring according to claim 1, characterized in that the displayed temperature and the actual temperature that are read are output to the same file for comparison.
4. The method for testing the reliability of server ambient temperature monitoring according to claim 1, characterized in that the displayed temperature and the actual temperature are read repeatedly every 10 seconds.
CN201810037212.5A 2018-01-15 2018-01-15 Method for testing the reliability of server ambient temperature monitoring Pending CN108287778A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810037212.5A CN108287778A (en) 2018-01-15 2018-01-15 Method for testing the reliability of server ambient temperature monitoring

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810037212.5A CN108287778A (en) 2018-01-15 2018-01-15 Method for testing the reliability of server ambient temperature monitoring

Publications (1)

Publication Number Publication Date
CN108287778A (en) 2018-07-17

Family

ID=62835243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810037212.5A Pending CN108287778A (en) 2018-01-15 2018-01-15 Method for testing the reliability of server ambient temperature monitoring

Country Status (1)

Country Link
CN (1) CN108287778A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1889049A (en) * 2005-07-02 2007-01-03 鸿富锦精密工业(深圳)有限公司 System and method for storing temperature under magnetic disk operating system
CN1954353A (en) * 2004-03-17 2007-04-25 罗姆股份有限公司 Gamma correction circuit and display having same
CN101276307A (en) * 2007-03-27 2008-10-01 鸿富锦精密工业(深圳)有限公司 Super I/O test method
CN102308292A (en) * 2008-12-31 2012-01-04 英特尔公司 Dynamic updating of thresholds in accordance with operating conditions
CN202732402U (en) * 2011-12-31 2013-02-13 曙光信息产业股份有限公司 Fan control system and blade server cabinet
CN106919519A (en) * 2017-01-22 2017-07-04 郑州云海信息技术有限公司 A kind of method for designing of automatic difference NVME HD vendors

Similar Documents

Publication Publication Date Title
Noureddine et al. A preliminary study of the impact of software engineering on greenit
CN102411526B (en) Test method of mainboard of blade server
US10552761B2 (en) Non-intrusive fine-grained power monitoring of datacenters
US10048995B1 (en) Methods and apparatus for improved fault analysis
US20200218613A1 (en) System and Method for Information Handling System Boot Status and Error Data Capture and Analysis
TW201743210A (en) Fan failure detection and reporting
TW200403563A (en) Method and system to implement a system event log for system manageability
CN104298583B (en) Mainboard management system and method based on baseboard management controller
US10863653B2 (en) Thermal testing system and method of thermal testing
US8526259B2 (en) Disk drive with state-information data buffer
US20150370619A1 (en) Management system for managing computer system and management method thereof
CN208140901U (en) A kind of server power supply real time monitoring apparatus
Tang et al. NIPD: Non-intrusive power disaggregation in legacy datacenters
CN103729279A (en) Hard disk temperature detecting system
US7725285B2 (en) Method and apparatus for determining whether components are not present in a computer system
US11640377B2 (en) Event-based generation of context-aware telemetry reports
CN108845909A (en) A kind of BMC method for testing pressure parallel based on Python
CN108287778A (en) Method for testing the reliability of server ambient temperature monitoring
TWI611290B (en) Method for monitoring server racks
CN110674044B (en) Coverage rate acquisition method, system, equipment and medium for function automation test
CN108052436A (en) Method, apparatus, equipment and the storage medium of management and control are carried out to FPGA boards
US20200409813A1 (en) System and Method to Derive Health Information for a General Purpose Processing Unit Through Aggregation of Board Parameters
CN113900718B (en) Decoupling method, system and device for BMC and BIOS asset information
CN108880916B (en) IIC bus-based fault positioning method and system
CN104252400A (en) Multi-node management system and method for data center

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180717