CN109408312B - Server operating temperature test system and equipment - Google Patents

Server operating temperature test system and equipment

Publication number
CN109408312B
Authority
CN
China
Prior art keywords
test
server
temperature
cpu
tested
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811293038.7A
Other languages
Chinese (zh)
Other versions
CN109408312A (en
Inventor
徐伟超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811293038.7A
Publication of CN109408312A
Application granted
Publication of CN109408312B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273 Test methods
    • G06F11/2268 Logging of test results

Abstract

The invention provides a server operating temperature test system and equipment. A temperature test program is executed, according to preset temperature test conditions, through a test client installed on the server to be tested, and the temperature test process data, the temperature test data of the server to be tested and the operating data of the temperature test unit are shown by a display module. The system is applied to whole-machine system testing in the development stage of a server project. It monitors the temperature and heat dissipation of the server processor under load in real time and, when the processor temperature becomes abnormal or an alarm is raised, records the server state data at that moment so that developers can analyze it and tune the heat dissipation strategy of the whole server. The time spent on manual operation and intervention is greatly reduced. Testers can test the processor temperature state under different server configurations and adjust the configuration to meet users' requirements for server performance.

Description

Server operating temperature test system and equipment
Technical Field
The invention relates to the field of server temperature testing, in particular to a system and equipment for testing the operating temperature of a server.
Background
With the continuous development of IT technology and the arrival of the big data era, both traditional information services and increasingly powerful cloud computing services place ever higher demands on server stability. As high-performance servers pursue performance, their power consumption and heat output under load rise sharply. In the development stage of a server project, developers therefore invest considerable time and effort in the heat dissipation design, improving and optimizing the server's heat dissipation while reducing its power consumption, so that the power consumption of the whole server is kept as low as possible while adequate heat dissipation is still guaranteed. How to test whether the processor temperature stays within a preset range while the server is processing data, and how to adjust the server configuration based on the heat dissipation state and the processor temperature state so as to meet users' requirements for server performance, is the technical problem to be solved.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a server operating temperature test system, which comprises: a temperature test unit and a test client; the test client is used for being installed on a server to be tested;
the temperature test unit includes: a plurality of communication interfaces, a data receiving module, a display module and a test control module;
the communication interface is connected with a network port of the server to be tested through a network cable;
the data receiving module and the display module are respectively connected with the test control module, and the test control module receives a temperature test control instruction and a temperature test preset test condition input by a tester through the data receiving module;
the test control module is used for executing a temperature test program based on a test client installed on the server to be tested according to a temperature test control instruction input by a tester and a temperature test preset condition, and displaying temperature test process data, temperature test data of the server to be tested and temperature test unit operation data through the display module.
Preferably, the temperature test unit further comprises: a temperature test program configuration module;
the temperature test program configuration module is used for configuring a temperature alarm test program for the temperature test of the server;
the test control module configures a temperature alarm test program into a server to be tested and executes the temperature alarm test program, and a test client records the temperature information of CPU operation through a server BMC log;
the test control module judges whether the temperature of the CPU exceeds a threshold value or not through the CPU running temperature information recorded by the server BMC log, sends an alarm log when the temperature of the CPU exceeds the threshold value, and records the information conditions of the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole power consumption and the server BMC monitoring value of the current CPU.
Preferably, the temperature test program configuration module is further used for configuring a program that reduces the frequency of the CPU of the server to be tested when the temperature of the CPU exceeds the threshold and that acquires the CPU state in real time;
the test control module configures the test program configured by the temperature test program configuration module into the server to be tested, so that when the temperature of the CPU of the server to be tested exceeds the threshold, the CPU frequency is reduced and the running state and temperature information of the CPU are obtained;
the test control module is also used for reducing the frequency of the CPU of the server to be tested, acquiring the CPU state information in real time and judging whether the temperature of the CPU still exceeds the threshold; when it does, the test control module sends an alarm log indicating that the temperature of the frequency-reduced CPU is too high, and records the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values of the frequency-reduced CPU.
Preferably, the test control module is further used for recording the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values of the current CPU when, during the test, no processor temperature alarm occurs in either the log of the server to be tested or the server BMC log;
when, during the test, the server BMC log records a CPU over-temperature alarm but no alarm appears in the log of the server to be tested, the test control module continues for a preset duration to record the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values of the CPU;
the exception entries in the BMC log are saved and then cleared in preparation for the next monitoring cycle.
Preferably, the test control module is further used for continuing, for a preset duration, to record the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values of the CPU when, during the test, the server BMC log records a processor over-temperature alarm and the log of the server to be tested records a CPU frequency reduction alarm.
Preferably, the data receiving module is further configured to provide operation ports for temperature test task information of all servers to be tested in the system, and a user operates the temperature test task information of the servers to be tested through a checking, modifying, deleting and adding operation mode provided by the operation ports;
the data receiving module is also used for providing an operation port for the test task information of the servers to be tested that corresponds to each tester in the system, and each tester operates the temperature test task information of the servers to be tested assigned to that tester through the viewing, modifying, deleting and adding operations provided by the operation port;
the data receiving module is also used for providing an operation port for each test script in the system, and a user operates each test script through the viewing, modifying, deleting and adding operations provided by the operation port;
the data receiving module is also used for setting a task list in which the test task information of all the servers to be tested is configured, and a user obtains the test execution progress of each task through the task list;
the data receiving module is also used for setting a tester list in which all tester information is configured, and a tester obtains the test execution progress through the tester list;
the data receiving module is also used for setting a test script list in which all test script information is configured, and a user obtains the state information of each test script through the test script list.
Preferably, the test control module is further configured to configure a test item interface in the test system, where the test item interface displays each test item in a tree form;
a test item adding port is set in the test item interface, and a tester calls the test item adding port to add a test item on the test item interface; the test item adding port includes: a test item code input port, a test item name input port, a test item start time input port, a test item end time input port, a test item remark input port, a test item submit key, a test item reset key and a test item delay port;
the test control module is also used for configuring, on the test item interface, an editing port for the added test items, so that a user can edit the added test items through the editing port;
the test control module is also used for configuring, on the test item interface, a task item information input port for each test item, operation ports for modifying, deleting and adding tester information, operation ports for modifying, deleting and adding test scripts, an operation port for setting the execution order of the test scripts, an operation port for setting execution associations between the test scripts, and an operation port for creating testers.
Preferably, the temperature test unit further comprises: an alarm prompt module and a test result pushing module;
the alarm prompt module is used for sending an alarm prompt to a tester, a server maintainer and a system administrator by short message, through a client browser interface, or by e-mail when a test script associated with a test item enters an infinite loop, or runs for too long, or does not exit after it has finished running, or when the preset next test script is not executed after the current test script has finished running;
the test result pushing module is used for pushing the test process log and the test result report in a short message mode, a client browser interface or an electronic mail mode.
An apparatus having a server operating temperature test system, comprising:
a memory for storing a computer program and the server operating temperature test system;
and a processor for executing the computer program and the server operating temperature test system, so as to implement the server operating temperature test system.
According to the technical scheme, the invention has the following advantages:
The system is applied to whole-machine system testing in the development stage of a server project. It monitors the temperature and heat dissipation of the server processor under load in real time and, when the processor temperature becomes abnormal or an alarm is raised, records the server state data at that moment so that developers can analyze it and tune the heat dissipation strategy of the whole server. The time spent on manual operation and intervention is greatly reduced. Testers can test the processor temperature state under different server configurations and adjust the configuration to meet users' requirements for server performance.
Drawings
In order to illustrate the technical solution of the present invention more clearly, the drawings used in the description are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a server operating temperature test system;
FIG. 2 is a schematic diagram of an embodiment of a server operating temperature testing system.
Detailed Description
The invention provides a server operation temperature test system, as shown in fig. 1, comprising: a temperature test unit 8 and a test client 6; the test client 6 is used for being installed on a server 7 to be tested;
the temperature test unit 8 includes: a plurality of communication interfaces 1, a data receiving module 3, a display module 4 and a test control module 5; the communication interface 1 is connected with a network port of the server 7 to be tested through a network cable; the data receiving module 3 and the display module 4 are respectively connected with the test control module 5, and the test control module 5 receives, through the data receiving module 3, a temperature test control instruction and preset temperature test conditions input by a tester; the test control module 5 is used for executing a temperature test program through the test client 6 installed on the server 7 to be tested according to the temperature test control instruction and the preset temperature test conditions input by the tester, and for displaying the temperature test process data, the temperature test data of the server 7 to be tested and the operating data of the temperature test unit 8 through the display module 4.
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions of the present invention will be clearly and completely described below with reference to specific embodiments and drawings. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the scope of protection of this patent.
In the embodiment of the present invention, as shown in fig. 2, the temperature testing unit 8 further includes: a temperature test program configuration module 2;
the temperature test program configuration module 2 is used for configuring a temperature alarm test program for the temperature test of the server; the test control module 5 configures a temperature alarm test program into the server 7 to be tested and executes the temperature alarm test program, and the test client 6 records the temperature information of the CPU operation through a server BMC log; the test control module 5 judges whether the temperature of the CPU exceeds a threshold value through the CPU running temperature information recorded by the server BMC log, sends an alarm log when the temperature of the CPU exceeds the threshold value, and records the information conditions of the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring value of the current CPU.
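The threshold judgment described here can be sketched in shell as follows. This is a minimal sketch, not the patented implementation: the helper names `sdr_reading` and `over_threshold`, the threshold value in the usage note, and the assumed sensor-line layout (`name | id | status | entity | reading`, as commonly printed by `ipmitool sdr elist`) are illustrative assumptions, and the exact layout can vary between BMC firmware versions.

```shell
#!/bin/bash
# Hypothetical helper: print the numeric reading from one sensor line of the
# assumed form "CPU1 Temp | 01h | ok | 3.1 | 45 degrees C".
sdr_reading() {
    printf '%s\n' "$1" | awk -F'|' '{ split($5, f, " "); print f[1] }'
}

# Hypothetical helper: succeed (exit status 0) only when the integer reading
# is strictly above the integer threshold.
over_threshold() {
    [ "$1" -gt "$2" ]
}
```

A caller could, for example, pass each CPU temperature line to `sdr_reading` and append the fan speed and power readings to a log whenever `over_threshold "$reading" 95` succeeds; the value 95 is only a placeholder for the tester's preset threshold.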
Of course, the recorded information is not limited to the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values of the current CPU; it can also include information on the memory, mainboard, hard disk and other components, so that a comprehensive test can be performed to meet the configuration requirements.
The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. Various features are described as modules, units or components that may be implemented together in an integrated logic device or separately as discrete but interoperable logic devices or other hardware devices. In some cases, various features of an electronic circuit may be implemented as one or more integrated circuit devices, such as an integrated circuit chip or chipset.
In the embodiment provided by the invention, the temperature test program configuration module 2 is further used for configuring a program that reduces the frequency of the CPU of the server 7 to be tested when the temperature of the CPU exceeds the threshold and that acquires the CPU state in real time;
the test control module 5 configures the test program configured by the temperature test program configuration module 2 into the server 7 to be tested, so that when the temperature of the CPU of the server 7 to be tested exceeds the threshold, the CPU frequency is reduced and the running state and temperature information of the CPU are obtained;
the test control module 5 is also used for reducing the frequency of the CPU of the server 7 to be tested, acquiring the CPU state information in real time and judging whether the temperature of the CPU still exceeds the threshold; when it does, the test control module 5 sends an alarm log and records the current main frequency, external frequency, front-end bus frequency, frequency multiplication coefficient, fan rotating speed, whole machine power consumption and server BMC monitoring values.
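One way to capture the CPU frequency state mentioned above is sketched below. It is an illustration under stated assumptions, not the method of the patent: it assumes a Linux system whose `/proc/cpuinfo` exposes `cpu MHz` lines (typical on x86), and the function names `cpu_mhz` and `below_floor` and the idea of a frequency floor are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the "cpu MHz" value of every logical CPU listed
# in a cpuinfo-style file (defaults to /proc/cpuinfo).
cpu_mhz() {
    awk -F':' '/^cpu MHz/ { gsub(/[ \t]/, "", $2); print $2 }' "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: succeed when at least one core runs below the given
# floor in MHz, which this sketch treats as a sign of frequency reduction.
# Usage: below_floor FLOOR_MHZ [CPUINFO_FILE]
below_floor() {
    cpu_mhz "$2" | awk -v floor="$1" '$1 + 0 < floor + 0 { found = 1 } END { exit !found }'
}
```

Running `cpu_mhz >> cpu_state.log` at the moment an alarm is detected would record the per-core frequency next to the fan and power readings; the log file name is likewise a placeholder.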
It can be understood that, when the processor temperature is abnormal, the log of the server 7 to be tested contains an alarm entry reporting that the processor has been throttled (its frequency reduced), and the server BMC log contains a critical alarm entry reporting that the CPU temperature is too high. The two logs have different trigger conditions: the trigger temperature of the throttling alarm in the log of the server 7 to be tested is higher than that of the over-temperature alarm in the server BMC log, so different abnormality judgments need to be made for the two. The following script fragment collects both indicators:
#!/bin/bash
Cur_Dir=$(cd "$(dirname "$0")"; pwd)
process=$1
function get_status()
{
    # flag_os: system-log entries about CPU temperature throttling (frequency reduction)
    flag_os=$(cat /var/log/messages | grep throttled | grep temperature)
    # flag_bmc: CPU over-temperature alarm entries in the server BMC event log
    flag_bmc=$(ipmitool sel elist | grep -i CPU | grep -i hot)
}
In the embodiment provided by the invention, the test control module 5 is also used for recording the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values of the current CPU when, during the test, no processor temperature alarm occurs in either the log of the server 7 to be tested or the server BMC log;
when, during the test, the server BMC log records a CPU over-temperature alarm but no alarm appears in the log of the server 7 to be tested, the test control module 5 continues for a preset duration to record the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values of the CPU.
The test control module 5 is also used for continuing to record the main frequency, the external frequency, the front-end bus frequency, the frequency multiplication coefficient, the fan rotating speed, the whole machine power consumption and the server BMC monitoring values when, during the test, the server BMC log records a processor over-temperature alarm and the log of the server 7 to be tested records a CPU frequency reduction alarm.
Specifically, abnormal log data are captured and recorded according to the temperature condition obtained, so as to obtain the real-time server state at the moment the alarm occurs, including the processor temperature, the fan speed of the whole server, the power consumption of the whole machine and the server BMC monitoring values.
This part is subdivided into three cases:
1. No processor temperature alarm appears in either the log of the server 7 to be tested or the server BMC log.
2. The server BMC log contains a critical alarm entry for excessive processor temperature, and no alarm appears in the log of the server 7 to be tested.
3. The server BMC log contains a critical alarm entry for excessive processor temperature, and the log of the server 7 to be tested contains an alarm entry for processor frequency reduction.
function mon()
{
    get_status
    Time=$(date +%D_%T)
    # Neither the log of the server 7 to be tested nor the server BMC log has a
    # processor temperature alarm: take no action beyond refreshing the flags
    if [ -z "$flag_os" ] && [ -z "$flag_bmc" ]; then
        get_status
    fi
    # The OS log reports CPU throttling but the BMC event log has no CPU
    # over-temperature entry ("bmc not logged" in the message below).
    # Note: the prose description of case 2 states the opposite condition
    # (BMC alarm present, OS log absent); this test follows the original script.
    if [ -n "$flag_os" ] && [ -z "$flag_bmc" ]; then
        Time_os=$(cat /var/log/messages | grep throttled | grep temperature | head -1 | awk '{print $1,$2,$3}')
        echo "CPU Core temperature clock throttled(bmc not logged) Time in OS messages is ======${Time_os}======" >> "$Cur_Dir/fail_monitor.log"
        echo "The local time is ===========${Time}============"
        for i in {1..10}    # record BMC SDR readings (fan and temperature status) for the next 10 s
        do
            ipmitool sdr elist >> "$Cur_Dir/fail_monitor.log"
            sleep 1
        done
    fi
    # The server BMC log has a critical processor over-temperature alarm and the
    # log of the server 7 to be tested has a processor frequency reduction alarm
    if [ -n "$flag_os" ] && [ -n "$flag_bmc" ]; then
        Time_bmc=$(ipmitool sel elist | grep -i CPU | grep -i hot | head -1 | awk '{print $3,$5}')
        echo "CPU Core temperature clock throttled(bmc logged) Time is ======${Time_bmc}======" >> "$Cur_Dir/fail_monitor.log"
        echo "The local time is ===========${Time}============"
        for i in {1..10}    # record BMC SDR readings (fan and temperature status) for the next 10 s
        do
            ipmitool sdr elist >> "$Cur_Dir/fail_monitor.log"
            sleep 1
        done
    fi
    ipmitool sel elist >> "$Cur_Dir/bmc.log"
    ipmitool sel clear    # save, then clear, the exception entries in the BMC log for the next cycle
    cat /var/log/messages >> "$Cur_Dir/messages"
    cat /dev/null > /var/log/messages    # save, then clear, the OS messages log for the next cycle
}
function get_process()    # check whether the whole-machine test process is still running
{
    flag=$(ps -A | grep "$process")
}
get_process
while [ -n "$flag" ]    # keep monitoring for as long as the test process has not finished
do
    mon
    get_process
done
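The script tests two independent indicators, so four combinations are possible, while the description names three cases. The sketch below makes the mapping explicit so that it can be checked without a BMC or a live system log. The function name `classify_case` and the labels it prints are hypothetical and do not appear in the original script.

```shell
#!/bin/bash
# Hypothetical helper: map the two indicators to a label.
#   $1 = OS-log throttling entries (empty string if none)
#   $2 = BMC over-temperature entries (empty string if none)
classify_case() {
    os=$1
    bmc=$2
    if [ -z "$os" ] && [ -z "$bmc" ]; then
        echo none        # case 1 in the description: no alarm in either log
    elif [ -z "$os" ]; then
        echo bmc_only    # case 2 in the description: BMC alarm, no OS entry
    elif [ -z "$bmc" ]; then
        echo os_only     # the "bmc not logged" branch of the script
    else
        echo both        # case 3 in the description: both logs report an alarm
    fi
}
```

For example, `classify_case "$flag_os" "$flag_bmc"` could be called right after `get_status`, and its result used in a single `case` statement in place of the separate `if` blocks.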
In the embodiment provided by the invention, the data receiving module 3 is also used for providing operation ports of the temperature test task information of all the servers 7 to be tested in the system, and a user operates the temperature test task information of the servers 7 to be tested through a checking, modifying, deleting and adding operation mode provided by the operation ports;
the data receiving module 3 is also used for providing an operation port for the test task information of the servers 7 to be tested that corresponds to each tester in the system, and each tester operates the temperature test task information of the servers 7 to be tested assigned to that tester through the viewing, modifying, deleting and adding operations provided by the operation port;
the data receiving module 3 is also used for providing an operation port for each test script in the system, and a user operates each test script through the viewing, modifying, deleting and adding operations provided by the operation port;
the data receiving module 3 is also used for setting a task list in which the test task information of all the servers 7 to be tested is configured, and a user obtains the test execution progress of each task through the task list;
the data receiving module 3 is also used for setting a tester list in which all tester information is configured, and a tester obtains the test execution progress through the tester list;
the data receiving module 3 is also used for setting a test script list in which all test script information is configured, and a user obtains the state information of each test script through the test script list.
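The viewing, modifying, deleting and adding operations on the task list can be illustrated with a flat text file. This is only a sketch under assumptions: the record layout `id|server|tester|status`, the file name `tasks.db` and all function names are hypothetical, and the patent does not specify how the lists are stored.

```shell
#!/bin/bash
# Hypothetical task store: one record per line, "id|server|tester|status".
TASK_FILE=${TASK_FILE:-./tasks.db}

# add_task ID SERVER TESTER: append a new task in the "pending" state.
add_task() {
    printf '%s|%s|%s|pending\n' "$1" "$2" "$3" >> "$TASK_FILE"
}

# view_tasks [TESTER]: print all tasks, or only those of one tester.
view_tasks() {
    awk -F'|' -v t="$1" 't == "" || $3 == t' "$TASK_FILE"
}

# modify_status ID STATUS: rewrite the status field of one task.
modify_status() {
    awk -F'|' -v OFS='|' -v id="$1" -v s="$2" \
        '$1 == id { $4 = s } { print }' "$TASK_FILE" > "$TASK_FILE.tmp" &&
        mv "$TASK_FILE.tmp" "$TASK_FILE"
}

# delete_task ID: drop one task from the list.
delete_task() {
    awk -F'|' -v id="$1" '$1 != id' "$TASK_FILE" > "$TASK_FILE.tmp" &&
        mv "$TASK_FILE.tmp" "$TASK_FILE"
}
```

Calling `view_tasks alice` would show one tester's tasks with their current status, which corresponds to the per-tester view of the test execution progress described above; the tester name is a placeholder.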
If implemented in hardware, the invention relates to an apparatus, which may be, for example, a processor or an integrated circuit device, such as an integrated circuit chip or chipset. Alternatively or additionally, if implemented in software or firmware, the techniques may implement a data storage medium readable at least in part by a computer, comprising instructions that when executed cause a processor to perform one or more of the above-described methods. For example, a computer-readable data storage medium may store instructions that are executed, such as by a processor.
In the embodiment provided by the invention, the test control module 5 is further used for configuring a test item interface in the test system, wherein the test item interface displays the test items in a tree form;
a test item adding port is set in the test item interface, and a tester calls the test item adding port to add a test item on the test item interface; the test item adding port includes: a test item code input port, a test item name input port, a test item start time input port, a test item end time input port, a test item remark input port, a test item submit key, a test item reset key and a test item delay port;
the test control module 5 is also used for configuring, on the test item interface, an editing port for the added test items, so that a user can edit the added test items through the editing port;
the test control module 5 is also used for configuring, on the test item interface, a task item information input port for each test item, operation ports for modifying, deleting and adding tester information, operation ports for modifying, deleting and adding test scripts, an operation port for setting the execution order of the test scripts, an operation port for setting execution associations between the test scripts, and an operation port for creating testers.
The temperature test unit 8 further includes: an alarm prompt module 11 and a test result pushing module 12. The alarm prompt module 11 is used for sending an alarm prompt to a tester, a server maintainer and a system administrator by short message, through a client browser interface, or by e-mail when a test script associated with a test item enters an infinite loop, or runs for too long, or does not exit after it has finished running, or when the preset next test script is not executed after the current test script has finished running. The test result pushing module 12 is used for pushing the test process log and the test result report by short message, through a client browser interface, or by e-mail.
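Detecting a test script that loops forever or runs for too long can be sketched with the GNU coreutils `timeout` command, which exits with status 124 when the time limit expires. The function name `run_with_limit` and the status words it prints are hypothetical, and delivering the alarm by short message, browser interface or e-mail is left out of the sketch.

```shell
#!/bin/bash
# Hypothetical helper: run a command with a time limit and print a status word.
# Usage: run_with_limit LIMIT COMMAND [ARGS...]
# The command's own standard output is redirected to standard error so that
# only the status word is written to standard output.
run_with_limit() {
    limit=$1
    shift
    timeout "$limit" "$@" >&2
    rc=$?
    case $rc in
        0)   echo finished ;;       # the command exited normally in time
        124) echo timed_out ;;      # timeout stopped the command at the limit
        *)   echo "failed:$rc" ;;   # the command itself reported an error
    esac
}
```

A caller could raise the alarm prompt whenever the printed word is `timed_out`, for example after `run_with_limit 3600 ./thermal_test.sh`; the script name and the one-hour limit are placeholders.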
The server operation temperature testing system can enable a tester to quickly and efficiently test the temperature state of the servers in batches, improve the testing efficiency, reduce the uncertainty caused by human factors and save the cost.
The invention also provides a device with a server operation temperature test system, which comprises:
a memory for storing a computer program and the server operating temperature test system;
and a processor for executing the computer program and the server operating temperature test system, so as to implement the server operating temperature test system.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. A server operating temperature test system, comprising: a temperature test unit and a test client; the test client is configured to be installed on a server to be tested;
the temperature test unit includes: a plurality of communication interfaces, a data receiving module, a display module, a temperature test program configuration module and a test control module;
each communication interface is connected to a network port of a server to be tested through a network cable;
the data receiving module and the display module are each connected to the test control module, and the test control module receives, through the data receiving module, a temperature test control instruction and preset temperature test conditions input by a tester;
the test control module is configured to execute a temperature test program, based on the test client installed on the server to be tested, according to the temperature test control instruction and the preset temperature test conditions input by the tester, and to display temperature test process data, temperature test data of the server to be tested and operation data of the temperature test unit through the display module;
the temperature test program configuration module is configured to configure a temperature alarm test program for the temperature test of the server;
the test control module deploys the temperature alarm test program to the server to be tested and executes it, and the test client records the CPU operating temperature information through the server BMC log;
the test control module determines, from the CPU operating temperature information recorded in the server BMC log, whether the CPU temperature exceeds a threshold, issues an alarm log when the CPU temperature exceeds the threshold, and records the current CPU's main frequency, external frequency, front-side bus frequency, frequency multiplier, fan speed, whole-machine power consumption and server BMC monitoring values;
the temperature test program configuration module is further configured to configure a program that reduces the CPU frequency of the server to be tested when its CPU temperature exceeds the threshold and that acquires the CPU state in real time;
the test control module deploys the test program configured by the temperature test program configuration module to the server to be tested and, when the CPU temperature of the server to be tested exceeds the threshold, reduces the CPU frequency and obtains the running state and temperature information of the CPU;
the test control module is further configured to reduce the CPU frequency of the server to be tested, acquire the CPU state information in real time, determine whether the CPU temperature exceeds the threshold, issue an alarm log indicating that the frequency-reduced CPU is over temperature when the CPU temperature exceeds the threshold, and record the frequency-reduced CPU's current main frequency, external frequency, front-side bus frequency, frequency multiplier, fan speed, whole-machine power consumption and server BMC monitoring values.
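The threshold check in claim 1 can be read as one check cycle: read the CPU state, raise an alarm and record the values if the temperature is over the threshold, reduce the frequency, then check again. The sketch below is only an illustration under assumptions: `CpuSnapshot`, its field names and units, and the two injected callables `read_snapshot` and `reduce_frequency` are hypothetical, and no real BMC or CPU interface is called.

```python
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


@dataclass
class CpuSnapshot:
    """Values the claim asks to record; field names and units are assumed."""
    temperature_c: float
    main_freq_mhz: float
    external_freq_mhz: float
    fsb_freq_mhz: float
    multiplier: float
    fan_rpm: int
    power_w: float
    bmc_values: Dict[str, float] = field(default_factory=dict)


def temperature_check(read_snapshot: Callable[[], CpuSnapshot],
                      reduce_frequency: Callable[[], None],
                      threshold_c: float) -> List[dict]:
    """Run one check cycle and return the alarm log entries it produced."""
    alarm_log: List[dict] = []
    snapshot = read_snapshot()
    if snapshot.temperature_c <= threshold_c:
        return alarm_log  # within limits: no alarm in this cycle
    # Over the threshold: alarm and record the current CPU values.
    alarm_log.append({"event": "cpu_over_temperature", **asdict(snapshot)})
    # Reduce the CPU frequency, then read the state again.
    reduce_frequency()
    snapshot = read_snapshot()
    if snapshot.temperature_c > threshold_c:
        # Still too hot after frequency reduction: second alarm with new values.
        alarm_log.append({"event": "throttled_cpu_over_temperature", **asdict(snapshot)})
    return alarm_log
```

Passing the reader and the frequency-reduction step in as callables keeps the sketch independent of any particular BMC protocol or operating system tool.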
2. The server operating temperature test system of claim 1, wherein
the test control module is further configured to record the current CPU's main frequency, external frequency, front-side bus frequency, frequency multiplier, fan speed, whole-machine power consumption and server BMC monitoring values when, during the test, no processor temperature alarm appears in either the log of the server to be tested or the server BMC log;
when, during the test, the server BMC log records a CPU over-temperature alarm log but no alarm log appears in the log of the server to be tested, the test continues for a preset duration, and the CPU's main frequency, external frequency, front-side bus frequency, frequency multiplier, fan speed, whole-machine power consumption and server BMC monitoring values are recorded;
the exception log in the BMC log is saved and then cleared before the next cycle.
3. The server operating temperature test system of claim 2, wherein
the test control module is further configured to continue recording the main frequency, external frequency, front-side bus frequency, frequency multiplier, fan speed, whole-machine power consumption and server BMC monitoring values when, during the test, the server BMC log records a processor over-temperature alarm log and the log of the server to be tested records a CPU frequency-reduction alarm log.
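The log cases of claims 2 and 3 amount to a small decision table over two observations: whether the server BMC log holds a CPU over-temperature alarm, and whether the log of the server to be tested holds an alarm. A minimal sketch follows; the `Action` names, the boolean inputs and the `archive_and_clear` helper are assumptions for illustration, and the combination the claims do not describe is labelled as such rather than guessed.

```python
from enum import Enum
from typing import List


class Action(Enum):
    RECORD = "record the CPU and BMC values"
    RECORD_AFTER_PRESET_DURATION = "continue for the preset duration, then record"
    RECORD_CONTINUOUSLY = "keep recording the CPU and BMC values"
    NOT_COVERED = "combination not described in the claims"


def choose_action(bmc_over_temp_alarm: bool, server_log_alarm: bool) -> Action:
    """Map the two log observations to the action named in claims 2 and 3.

    For the claim 3 case, server_log_alarm stands for the CPU
    frequency-reduction alarm in the log of the server to be tested.
    """
    if not bmc_over_temp_alarm and not server_log_alarm:
        return Action.RECORD                        # claim 2, first case
    if bmc_over_temp_alarm and not server_log_alarm:
        return Action.RECORD_AFTER_PRESET_DURATION  # claim 2, second case
    if bmc_over_temp_alarm and server_log_alarm:
        return Action.RECORD_CONTINUOUSLY           # claim 3
    return Action.NOT_COVERED


def archive_and_clear(bmc_exception_log: List[str]) -> List[str]:
    """Save the BMC exception entries, then empty the log for the next cycle."""
    saved = list(bmc_exception_log)
    bmc_exception_log.clear()
    return saved
```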
4. The server operating temperature test system of claim 1, wherein
the data receiving module is further configured to provide an operation port for the temperature test task information of all servers to be tested in the system, through which a user views, modifies, deletes and adds the temperature test task information of the servers to be tested;
the data receiving module is further configured to provide an operation port for the test task information of the servers to be tested that correspond to each tester in the system;
the data receiving module is further configured to provide an operation port for each test script in the system, through which a user views, modifies, deletes and adds test scripts;
the data receiving module is further configured to set up a task list in which the test task information of all servers to be tested is configured, so that a user obtains the test execution progress of each task through the task list;
the data receiving module is further configured to set up a tester list in which all tester information is configured, so that a tester obtains the test execution progress through the tester list;
the data receiving module is further configured to set up a test script list in which all test script information is configured, so that a user obtains the status of each test script through the test script list.
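The task list of claim 4, with its view, modify, delete and add operations and its per-task progress, can be pictured as a small in-memory registry. This is a sketch under assumptions: the `TaskInfo` fields, the notion of progress as finished scripts over total scripts, and all method names are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, fields
from typing import Dict


@dataclass
class TaskInfo:
    """Test task of one server to be tested (hypothetical fields)."""
    server: str
    tester: str
    scripts_total: int
    scripts_done: int = 0

    @property
    def progress(self) -> float:
        """Fraction of the task's scripts that have finished."""
        return self.scripts_done / self.scripts_total if self.scripts_total else 0.0


class TaskList:
    """In-memory task list offering add, view, modify and delete."""

    def __init__(self) -> None:
        self._tasks: Dict[str, TaskInfo] = {}

    def add(self, task_id: str, task: TaskInfo) -> None:
        if task_id in self._tasks:
            raise KeyError(f"task {task_id!r} already exists")
        self._tasks[task_id] = task

    def view(self, task_id: str) -> TaskInfo:
        return self._tasks[task_id]

    def modify(self, task_id: str, **changes) -> None:
        task = self._tasks[task_id]
        allowed = {f.name for f in fields(task)}
        for name, value in changes.items():
            if name not in allowed:
                raise AttributeError(f"unknown task field {name!r}")
            setattr(task, name, value)

    def delete(self, task_id: str) -> None:
        del self._tasks[task_id]

    def progress(self) -> Dict[str, float]:
        """Execution progress of every task, keyed by task id."""
        return {task_id: task.progress for task_id, task in self._tasks.items()}

    def progress_for(self, tester: str) -> Dict[str, float]:
        """Execution progress of the tasks assigned to one tester."""
        return {task_id: task.progress
                for task_id, task in self._tasks.items() if task.tester == tester}
```

The tester list and the test script list of the claim could follow the same pattern with their own record types.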
5. The server operating temperature test system of claim 1, wherein
the test control module is further configured to configure a test item interface in the test system, the test item interface displaying the test items in a tree structure;
a test item adding port is provided in the test item interface, and a tester invokes the test item adding port to add a test item on the test item interface; adding a test item involves: a test item code input port, a test item name input port, a test item start time input port, a test item end time input port, a test item remarks input port, a test item submit key, a test item reset key and a test item delay port;
the test control module is further configured to configure, on the test item interface, an editing port for the added test items, through which a user edits the added test items;
the test control module is further configured to configure, on the test item interface for each test item, a task item information input port, operation ports for modifying, deleting and adding tester information, operation ports for modifying, deleting and adding test scripts, an operation port for the execution order of the test scripts, an operation port for the execution dependencies of the test scripts, and an operation port for setting up testers.
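The tree display of test items in claim 5 can be sketched with a recursive record. The class `ItemNode`, its fields (mirroring the code, name, start time, end time and remarks inputs named in the claim) and the indented text rendering are assumptions for illustration; the patent does not say how the interface draws the tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ItemNode:
    """One test item; the fields mirror the input ports named in the claim."""
    code: str
    name: str
    start: str = ""
    end: str = ""
    remarks: str = ""
    children: List["ItemNode"] = field(default_factory=list)

    def add(self, child: "ItemNode") -> "ItemNode":
        """Attach a sub-item and return it so that calls can be chained."""
        self.children.append(child)
        return child


def render(node: ItemNode, depth: int = 0) -> List[str]:
    """Return the tree as text lines, indenting two spaces per level."""
    lines = ["  " * depth + f"{node.code} {node.name}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines
```

Editing an added item then reduces to changing the fields of the matching node, and reordering scripts to reordering a node's `children` list.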
6. The server operating temperature test system of claim 1, wherein
the temperature test unit further includes: an alarm prompt module and a test result pushing module;
the alarm prompt module is configured to send an alarm prompt to the tester, the server maintainer and the system administrator by short message, through a client browser interface, or by e-mail when a test script of a test project enters an infinite loop, when a test script runs for too long, when the current test script is not executed, or when a preset next test script is not executed after the current test script has finished running;
the test result pushing module is configured to push the test process log and the test result report by short message, through a client browser interface, or by e-mail.
7. A device having a server operating temperature test system, comprising:
a memory for storing a computer program and the server operating temperature test system;
a processor for executing the computer program and the server operating temperature test system, so as to implement the server operating temperature test system according to any one of claims 1 to 6.
CN201811293038.7A 2018-11-01 2018-11-01 Server operating temperature test system and equipment Active CN109408312B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811293038.7A CN109408312B (en) 2018-11-01 2018-11-01 Server operating temperature test system and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811293038.7A CN109408312B (en) 2018-11-01 2018-11-01 Server operating temperature test system and equipment

Publications (2)

Publication Number Publication Date
CN109408312A (en) 2019-03-01
CN109408312B (en) 2021-10-29

Family

ID=65470851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811293038.7A Active CN109408312B (en) 2018-11-01 2018-11-01 Server operating temperature test system and equipment

Country Status (1)

Country Link
CN (1) CN109408312B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111124784A (en) * 2019-12-20 2020-05-08 浪潮商用机器有限公司 Method, device and equipment for testing temperature alarm function of server
CN114356057A (en) * 2021-12-30 2022-04-15 浙江大华技术股份有限公司 Method, device and equipment for controlling heat dissipation of PCIe card and storage medium
CN116820197B (en) * 2023-06-27 2024-04-12 深圳小非牛科技有限公司 Software testing technology platform based on big data interconnection

Citations (3)

Publication number Priority date Publication date Assignee Title
CN106815115A (en) * 2017-01-13 2017-06-09 郑州云海信息技术有限公司 A kind of operation condition of server monitoring system
CN107590037A (en) * 2017-08-29 2018-01-16 郑州云海信息技术有限公司 A kind of method that EDPP tests are carried out to server GPU
CN108574600A (en) * 2018-03-20 2018-09-25 北京航空航天大学 The QoS guarantee method of the power consumption and resource contention Collaborative Control of cloud computing server

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10541892B2 (en) * 2016-01-13 2020-01-21 Ricoh Company, Ltd. System and method for monitoring, sensing and analytics of collaboration devices
US10708155B2 (en) * 2016-06-03 2020-07-07 Guavus, Inc. Systems and methods for managing network operations

Non-Patent Citations (2)

Title
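Leakage and Temperature Aware Server Control for Improving Energy Efficiency in Data Centers; Marina Zapater et al.; ACM; 20130318; pp. 1-4 *
Greenhouse environment monitoring and temperature prediction system based on the WeChat platform; Ren Yanzhao (任延昭) et al.; Transactions of the Chinese Society for Agricultural Machinery (农业机械学报); 20171231; pp. 302-307 *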

Also Published As

Publication number Publication date
CN109408312A (en) 2019-03-01

Similar Documents

Publication Publication Date Title
CN109408312B (en) Server operating temperature test system and equipment
US9954727B2 (en) Automatic debug information collection
CN102141942B (en) A kind of monitoring and protection method of equipment and device
US10698788B2 (en) Method for monitoring server, and monitoring device and monitoring system using the same
US8627147B2 (en) Method and computer program product for system tuning based on performance measurements and historical problem data and system thereof
US8683268B2 (en) Key based cluster log coalescing
CN104079434A (en) Device and method for managing physical devices in cloud computing system
CN110851320A (en) Server downtime supervision method, system, terminal and storage medium
CN112596568B (en) Method, system, device and medium for reading error information of voltage regulator
CN103577298A (en) Baseboard management controller monitoring system and method
WO2023115999A1 (en) Device state monitoring method, apparatus, and device, and computer-readable storage medium
CN109254922A (en) A kind of automated testing method and device of server B MC Redfish function
CN103257922B (en) A kind of method of quick test BIOS and OS interface code reliability
CN113608964A (en) Cluster automation monitoring method and device, electronic equipment and storage medium
CN111625386A (en) Monitoring method and device for power-on overtime of system equipment
CN103178977A (en) Computer system and starting-up management method of same
CN111176736A (en) Server mainboard power-on and power-off test method and system
CN112650674A (en) Method for remotely acquiring and debugging webpage log, computer equipment and storage medium
CN104394023B (en) Failure collection method and system for network insertion terminal
CN114116276A (en) BMC hang-up self-recovery method, system, terminal and storage medium
CN109920130B (en) Monitoring method, monitoring device, electronic equipment and computer readable storage medium
CN112003727A (en) Multi-node server power supply testing method, system, terminal and storage medium
CN114449628B (en) Log data processing method, electronic device and medium thereof
CN114328103A (en) Method, system and related equipment for OpenBMC monitoring and management of discrete sensor
CN116582422A (en) Network card exception handling method, network card exception handling system and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant