CN117667591A - Remote management system based on supercomputer laboratory and management method thereof - Google Patents

Remote management system based on supercomputer laboratory and management method thereof

Publication number
CN117667591A
Authority
CN
China
Prior art keywords
laboratory
supercomputer
module
abnormal
different areas
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311674568.7A
Other languages
Chinese (zh)
Inventor
曹汉华
张焕平
黄莹莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xinhua College
Original Assignee
Guangzhou Xinhua College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Xinhua College
Priority to CN202311674568.7A
Publication of CN117667591A
Current legal status: Pending

Landscapes

  • Automatic Analysis And Handling Materials Therefor (AREA)

Abstract

The invention discloses a remote management system based on a supercomputer laboratory and a management method thereof, and relates to the field of computer management. The system comprises a supercomputing cluster management module, a real-time monitoring module and a remote management module. The invention addresses the problem that an excessively high temperature in an existing machine room can slow down the computers and cause them to fail. Through targeted adjustment, the invention ensures the accuracy of environmental parameter adjustment, improves the accuracy of remote management, reduces the need for manual intervention and improves efficiency. By regulating the environmental factors in the supercomputer laboratory, it keeps the working environment of each device in a normal state, so that an excessively high ambient temperature neither degrades the operation of the devices nor causes equipment failure.

Description

Remote management system based on supercomputer laboratory and management method thereof
Technical Field
The invention relates to the technical field of computer management, in particular to a remote management system based on a supercomputer laboratory and a management method thereof.
Background
Currently, with the rapid development of supercomputers, their supporting management facilities are becoming more complete and the number of supporting management systems keeps growing; supercomputer administrators manage the supercomputers through whichever supporting facilities are required.
Patent CN202310437784.3 discloses a remote management system based on a supercomputer laboratory and a management method thereof. In that patent, the supercomputer cluster management system comprises a node management module, a restart setting module, a command terminal module and an AI node module, and the machine room environment management system comprises a gateway management module, an environmental control management module, a video monitoring module and a bastion host module. That system enables remote monitoring and management of the supercomputer, which effectively reduces the time administrators spend travelling between the office and the laboratory to handle laboratory problems, improves laboratory management efficiency, and lets administrators analyze problems in the related systems promptly so that faults can be resolved in time.
In practical use of that patent, however, the environmental parameters of the machine room greatly affect the operation of the computers: an excessively high temperature not only slows the computers down but can also cause them to fail. The prior system therefore does not meet existing requirements, and a remote management system based on a supercomputer laboratory and a management method thereof are proposed.
Disclosure of Invention
The invention aims to provide a remote management system based on a supercomputer laboratory and a management method thereof. Through targeted adjustment, the invention ensures the accuracy of environmental parameter adjustment, improves the accuracy of remote management, reduces the need for manual intervention and improves efficiency. It keeps the working environment of each device in the supercomputer laboratory in a normal state, prevents an excessively high ambient temperature from degrading the operation of the devices or causing equipment failure, and thereby solves the problems described in the background.
To achieve the above purpose, the invention provides the following technical solution: a supercomputer laboratory-based remote management system, comprising:
a supercomputing cluster management module, configured to handle command input and data exchange, to provide temporary storage and transfer of data when a plurality of nodes exchange data, to manage restarts of the plurality of nodes in each infrastructure of the supercomputer laboratory, and to provide a corresponding operation platform for the command terminals and manage the AI nodes;
a real-time monitoring module, configured to divide the supercomputer laboratory into different areas, to monitor, collect and transmit in real time the environmental parameters of each device in the different areas, to judge whether the environmental parameters in the supercomputer laboratory are abnormal, and to adjust the operating parameters of the temperature-control ventilation elements in the different areas according to the judgment result; and
a remote management module, configured to receive the real-time environmental parameters and anomaly results of each device in the supercomputer laboratory, to analyze the received data, to issue corresponding adjustment commands according to the analysis results, and to control the operation of each device and of the temperature-control ventilation elements in the different areas according to the adjustment commands.
Preferably, the supercomputing cluster management module specifically comprises:
a node management module, configured to handle command input and data exchange, to classify the data entering the system, to provide temporary storage and transfer of data when a plurality of nodes exchange data, and to retrieve the data when it is required;
a restart setting module, configured to manage restarts of the plurality of nodes in each infrastructure of the supercomputer laboratory;
a command terminal module, configured to provide a corresponding operation platform for the command terminals in each infrastructure of the supercomputer laboratory; and
an AI node module, configured to manage the AI nodes in each infrastructure of the supercomputer laboratory.
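The source does not say how the node management module stores or classifies data. As a minimal sketch only, assuming a per-destination, per-category FIFO queue, the temporary storage and later retrieval could look like this (`NodeBuffer`, `stage` and `retrieve` are hypothetical names, not terms from the patent):

```python
from collections import defaultdict, deque


class NodeBuffer:
    """Hypothetical temporary store for data exchanged between nodes."""

    def __init__(self):
        # One FIFO queue per (destination node, data category) pair.
        self._queues = defaultdict(deque)

    def stage(self, dest_node, category, payload):
        """Classify incoming data and hold it until the destination asks for it."""
        self._queues[(dest_node, category)].append(payload)

    def retrieve(self, dest_node, category):
        """Hand over the oldest staged item, or None if nothing is waiting."""
        queue = self._queues.get((dest_node, category))
        return queue.popleft() if queue else None


buffer = NodeBuffer()
buffer.stage("node-02", "job-output", b"result-1")
buffer.stage("node-02", "job-output", b"result-2")
first = buffer.retrieve("node-02", "job-output")
```

Items come back in the order they were staged, and a request for a node or category with nothing staged returns `None` rather than raising.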
Preferably, the real-time monitoring module specifically comprises:
an area monitoring module, configured to divide the supercomputer laboratory into different areas and to monitor, collect and transmit in real time the environmental parameters of each device in the different areas, wherein the environmental parameters comprise ambient temperature, humidity and air quality coefficient;
an abnormality judgment module, configured to judge, from the environmental parameter data monitored and collected in real time, whether the collected environmental parameters exceed a preset threshold, the current environmental parameters in the supercomputer laboratory being abnormal when the threshold is exceeded;
an operation parameter adjustment module, configured to adjust the operating parameters of the temperature-control ventilation elements in the different areas of the supercomputer laboratory according to the abnormal environmental parameters identified by the abnormality judgment module; and
an alarm module, configured to judge, from the parameter adjustments made to the temperature-control ventilation elements, whether the number of anomalies in the supercomputer laboratory exceeds a threshold, and, if it does, to trigger an alarm to the management end or automatically stop the operation of the equipment in the relevant area.
Preferably, the area monitoring module comprises:
an area dividing module, configured to divide the supercomputer laboratory into different areas;
a real-time monitoring module, configured to continuously monitor the environmental parameters of each device in the different areas of the supercomputer laboratory and to update the monitored environmental parameter data in real time; and
a sensing acquisition module, configured to collect and transmit in real time the monitored environmental parameters of each device in the different areas of the supercomputer laboratory and to convert them into a machine-readable format.
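The machine-readable format is not specified in the source. A minimal sketch, assuming JSON as the wire format and using hypothetical field names for the three environmental parameters listed above:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Reading:
    """One sample from one device; all field names are illustrative assumptions."""
    zone: str
    device: str
    temperature_c: float
    humidity_pct: float
    air_quality: float


def encode(reading: Reading) -> str:
    """Convert a reading to a JSON string for transmission."""
    return json.dumps(asdict(reading), sort_keys=True)


def decode(message: str) -> Reading:
    """Rebuild the reading on the receiving side."""
    return Reading(**json.loads(message))


sample = Reading("zone-A", "rack-07", 24.5, 41.0, 0.8)
message = encode(sample)
```

Tagging every reading with its zone lets the receiving side judge and adjust each area separately.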
Preferably, the abnormality judgment module comprises:
an analysis module, configured to acquire the real-time data transmitted by the area monitoring module and to analyze the environmental parameter conditions of the supercomputer laboratory from the acquired data;
a comparison module, configured to compare the analyzed environmental parameter results with the preset environmental parameter thresholds and to judge from the comparison whether the real-time environmental parameters are abnormal; and
a sending module, configured to transmit the judgment of the comparison module to the remote management module.
Preferably, the judging flow of the abnormality judgment module specifically comprises:
presetting the thresholds of the environmental parameters, acquiring the real-time data transmitted by the area monitoring module, and analyzing the acquired data;
comparing the analyzed environmental parameters with the preset environmental parameter thresholds and judging whether an abnormal condition exists; and
if an abnormality exists, transmitting it to the remote management module, whereupon the remote management module sends an instruction, according to the abnormal state, to the temperature-control ventilation element in the abnormal area of the supercomputer laboratory, and the temperature-control ventilation element receives the instruction and makes the corresponding adjustment.
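The comparison step of this flow can be sketched as follows. The threshold values, parameter names and the helper names `find_anomalies` and `judge_zones` are illustrative assumptions; the source only states that a parameter is abnormal when it exceeds its preset threshold.

```python
# Illustrative upper limits; the patent does not give concrete values.
THRESHOLDS = {"temperature_c": 27.0, "humidity_pct": 60.0, "air_quality": 1.0}


def find_anomalies(reading, thresholds=THRESHOLDS):
    """Return the parameters whose value exceeds its preset threshold."""
    return {
        name: value
        for name, value in reading.items()
        if name in thresholds and value > thresholds[name]
    }


def judge_zones(readings_by_zone):
    """Build the report sent to remote management: abnormal zones only."""
    report = {}
    for zone, reading in readings_by_zone.items():
        anomalies = find_anomalies(reading)
        if anomalies:
            report[zone] = anomalies
    return report


readings = {
    "zone-A": {"temperature_c": 24.0, "humidity_pct": 45.0, "air_quality": 0.6},
    "zone-B": {"temperature_c": 31.5, "humidity_pct": 45.0, "air_quality": 0.6},
}
report = judge_zones(readings)
```

Only zones with at least one parameter over its limit appear in the report, so the remote management side can target the abnormal area directly.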
Preferably, the operation of the alarm module specifically comprises:
presetting adjustment-count thresholds for the temperature-control ventilation elements in the different areas of the supercomputer laboratory, and receiving the adjustment counts of those elements;
judging, from the number of parameter adjustments made to the temperature-control ventilation elements, whether the preset adjustment-count threshold is exceeded, and, if it is, determining that the environmental parameters in the areas concerned are abnormal and cannot be regulated; and
triggering an alarm signal and automatically stopping the operation of the equipment in the relevant area.
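The counting logic of the alarm module can be sketched as below, assuming one counter per area and a per-area limit. `AdjustmentAlarm` and its method names are hypothetical, and the source does not say whether or when the counter is reset, so no reset is modelled.

```python
from collections import Counter


class AdjustmentAlarm:
    """Tracks how often each area's ventilation element has been adjusted."""

    def __init__(self, max_adjustments):
        self.max_adjustments = max_adjustments  # area -> preset count threshold
        self.counts = Counter()

    def record_adjustment(self, zone):
        """Count one adjustment; return True once the area's threshold is exceeded."""
        self.counts[zone] += 1
        return self.counts[zone] > self.max_adjustments[zone]


alarm = AdjustmentAlarm({"zone-B": 3})
results = [alarm.record_adjustment("zone-B") for _ in range(4)]
```

With a threshold of 3, the first three adjustments pass and the fourth returns `True`, which is the point at which the alarm signal and the equipment stop would be triggered.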
Preferably, the monitoring flow of the real-time monitoring module specifically comprises:
first, dividing the supercomputer laboratory into different areas, and monitoring and collecting in real time the environmental parameters of each divided area;
next, judging from the environmental parameter data monitored and collected in real time whether the environmental parameters exceed a preset threshold, and, if they do, determining that the current environmental parameters in the supercomputer laboratory are abnormal; and
finally, adjusting the operating parameters of the temperature-control ventilation elements in the different areas of the supercomputer laboratory according to the abnormal conditions, and, depending on the number of anomalies, triggering an alarm to the management end or automatically stopping the operation of the equipment in the relevant area.
Preferably, the remote management module specifically comprises:
a data receiving module, configured to receive the real-time environmental parameters and anomaly analysis results of each device in the supercomputer laboratory;
a data analysis module, configured to analyze the received data, to determine the abnormal environmental parameter conditions of each device in the supercomputer laboratory, and to issue corresponding adjustment commands according to those conditions; and
a remote control module, configured to control and manage the running conditions of each device in the different areas of the supercomputer laboratory.
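How the data analysis module turns an anomaly report into adjustment commands is left open by the source. A minimal sketch, assuming a fixed lookup from the abnormal parameter to an action on the area's temperature-control ventilation element (the action names, `AdjustCommand` and `make_commands` are all hypothetical):

```python
from dataclasses import dataclass

# Assumed mapping from abnormal parameter to corrective action.
ACTIONS = {
    "temperature_c": "lower_temperature_setpoint",
    "humidity_pct": "raise_airflow",
    "air_quality": "raise_airflow",
}


@dataclass(frozen=True)
class AdjustCommand:
    zone: str
    parameter: str
    action: str


def make_commands(report):
    """Turn {zone: {parameter: value}} into one command per abnormal parameter."""
    commands = []
    for zone in sorted(report):
        for parameter in sorted(report[zone]):
            commands.append(AdjustCommand(zone, parameter, ACTIONS[parameter]))
    return commands


commands = make_commands({"zone-B": {"temperature_c": 31.5, "air_quality": 1.4}})
```

Sorting zones and parameters keeps the command order deterministic, which makes the remote control side easier to log and audit.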
The invention also provides a management method of the supercomputer laboratory-based remote management system, comprising the following steps:
Step one: dividing the supercomputer laboratory into different areas, and presetting the environmental parameter thresholds and the adjustment-count thresholds of the temperature-control ventilation elements in the different areas.
Step two: acquiring real-time data on the environmental parameters in the different areas of the supercomputer laboratory, analyzing the acquired data, comparing the analyzed environmental parameters with the preset environmental parameter thresholds, and judging whether an abnormal condition exists.
Step three: if an abnormality exists, transmitting it to the remote management module, whereupon the remote management module sends an instruction, according to the abnormal state, to the temperature-control ventilation element in the abnormal area, and the temperature-control ventilation element receives the instruction and makes the corresponding adjustment.
Step four: each time the temperature-control ventilation element is adjusted, recording the adjustment and judging from the adjustment count whether the preset adjustment-count threshold is exceeded.
Step five: if the preset adjustment-count threshold is exceeded, determining that the environmental parameters in the areas concerned are abnormal and cannot be regulated, triggering an alarm signal, and automatically stopping the operation of the equipment in the relevant area.
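Steps two to five can be traced for a single area with a short simulation. This is a sketch under stated assumptions: one parameter (temperature), an illustrative threshold, and a hypothetical `manage_zone` function that returns a log of actions instead of driving real equipment.

```python
def manage_zone(samples, threshold, max_adjustments):
    """Apply steps two to five to a sequence of temperature samples for one area."""
    adjustments = 0
    log = []
    for value in samples:
        if value <= threshold:
            # Step two: the sample is within the preset threshold.
            log.append("ok")
            continue
        # Step three: an abnormality exists, so the element is adjusted.
        log.append("adjust")
        # Step four: record the adjustment and check the count.
        adjustments += 1
        if adjustments > max_adjustments:
            # Step five: regulation has failed; alarm and stop the equipment.
            log.append("alarm_and_stop")
            break
    return log


log = manage_zone(
    [25.0, 29.0, 30.0, 26.0, 31.0, 32.0], threshold=27.0, max_adjustments=3
)
```

In this run the fourth over-threshold sample pushes the adjustment count past the limit, so the log ends with `alarm_and_stop` and no further samples are processed.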
Compared with the prior art, the invention has the following beneficial effects:
The invention divides the supercomputer laboratory into different areas, so that environmental parameters can be adjusted according to the conditions in each area. This ensures the accuracy of environmental parameter adjustment, improves the accuracy of remote management, reduces the need for manual intervention and improves efficiency. By regulating the environmental factors in the supercomputer laboratory, the working environment of each device is kept in a normal state, so that an excessively high ambient temperature neither degrades the operation of the devices nor causes equipment failure.
Drawings
FIG. 1 is a schematic block diagram of a supercomputer laboratory-based remote management system according to the present invention;
FIG. 2 is a diagram of a management method of a supercomputer laboratory-based remote management system according to the present invention.
Detailed Description
The embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
To solve the problem that the environmental parameters of an existing machine room greatly affect the operation of the computers, with an excessively high temperature slowing the computers down and causing them to fail, and with reference to FIGS. 1-2, the present embodiment provides the following technical solution:
A remote management system based on a supercomputer laboratory comprises:
a supercomputing cluster management module, configured to handle command input and data exchange, to provide temporary storage and transfer of data when a plurality of nodes exchange data, to manage restarts of the plurality of nodes in each infrastructure of the supercomputer laboratory, and to provide a corresponding operation platform for the command terminals and manage the AI nodes;
a real-time monitoring module, configured to divide the supercomputer laboratory into different areas, to monitor, collect and transmit in real time the environmental parameters of each device in the different areas, to judge whether the environmental parameters in the supercomputer laboratory are abnormal, and to adjust the operating parameters of the temperature-control ventilation elements in the different areas according to the judgment result. Monitoring the laboratory area by area allows its environmental parameters to be collected more accurately and then regulated, so that an excessively high ambient temperature neither degrades the operation of the devices nor causes equipment failure; and
a remote management module, configured to receive the real-time environmental parameters and anomaly results of each device in the supercomputer laboratory, to analyze the received data, to issue corresponding adjustment commands according to the analysis results, and to control the operation of each device and of the temperature-control ventilation elements in the different areas according to the adjustment commands. The environmental parameters in the supercomputer laboratory are thus managed and adjusted remotely, without manual real-time monitoring and adjustment, which reduces labor and improves management.
The supercomputing cluster management module specifically comprises:
a node management module, configured to handle command input and data exchange, to classify the data entering the system, to provide temporary storage and transfer of data when a plurality of nodes exchange data, and to retrieve the data when it is required;
a restart setting module, configured to manage restarts of the plurality of nodes in each infrastructure of the supercomputer laboratory;
a command terminal module, configured to provide a corresponding operation platform for the command terminals in each infrastructure of the supercomputer laboratory; and
an AI node module, configured to manage the AI nodes in each infrastructure of the supercomputer laboratory.
The real-time monitoring module specifically comprises:
an area monitoring module, configured to divide the supercomputer laboratory into different areas and to monitor, collect and transmit in real time the environmental parameters of each device in the different areas, wherein the environmental parameters comprise ambient temperature, humidity and air quality coefficient;
an abnormality judgment module, configured to judge, from the environmental parameter data monitored and collected in real time, whether the collected environmental parameters exceed a preset threshold, the current environmental parameters in the supercomputer laboratory being abnormal when the threshold is exceeded;
an operation parameter adjustment module, configured to adjust the operating parameters of the temperature-control ventilation elements in the different areas of the supercomputer laboratory according to the abnormal environmental parameters identified by the abnormality judgment module; and
an alarm module, configured to judge, from the parameter adjustments made to the temperature-control ventilation elements, whether the number of anomalies in the supercomputer laboratory exceeds a threshold, and, if it does, to trigger an alarm to the management end or automatically stop the operation of the equipment in the relevant area, so that staff can maintain the equipment in the supercomputer laboratory in time, damage to the equipment is avoided, and its normal operation is ensured.
The area monitoring module comprises:
an area dividing module, configured to divide the supercomputer laboratory into different areas. Monitoring and collecting the environmental parameters of each area separately ensures accurate collection, and adjusting each area according to its own environmental parameters ensures accurate adjustment; this targeted adjustment also improves the accuracy of remote management, reduces the need for manual intervention and improves efficiency;
a real-time monitoring module, configured to continuously monitor the environmental parameters of each device in the different areas of the supercomputer laboratory and to update the monitored environmental parameter data in real time; and
a sensing acquisition module, configured to collect and transmit in real time the monitored environmental parameters of each device in the different areas of the supercomputer laboratory and to convert them into a machine-readable format.
The abnormality judgment module comprises:
an analysis module, configured to acquire the real-time data transmitted by the area monitoring module and to analyze the environmental parameter conditions of the supercomputer laboratory from the acquired data;
a comparison module, configured to compare the analyzed environmental parameter results with the preset environmental parameter thresholds and to judge from the comparison whether the real-time environmental parameters are abnormal; and
a sending module, configured to transmit the judgment of the comparison module to the remote management module.
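The source does not describe what the analysis module computes before the comparison. One possible reading, offered only as an assumption, is a per-area average of each parameter across the devices in that area. The function name `summarize_by_zone` and the sample tuple layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_zone(samples):
    """samples: iterable of (zone, parameter, value); returns zone -> parameter -> mean."""
    grouped = defaultdict(lambda: defaultdict(list))
    for zone, parameter, value in samples:
        grouped[zone][parameter].append(value)
    return {
        zone: {parameter: mean(values) for parameter, values in params.items()}
        for zone, params in grouped.items()
    }


summary = summarize_by_zone([
    ("zone-A", "temperature_c", 24.0),
    ("zone-A", "temperature_c", 26.0),
    ("zone-B", "temperature_c", 31.0),
])
```

Averaging smooths out a single noisy sensor; a design that must react to any one hot device would compare the per-area maximum against the threshold instead.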
The judging flow of the abnormality judgment module specifically comprises:
presetting the thresholds of the environmental parameters, acquiring the real-time data transmitted by the area monitoring module, and analyzing the acquired data;
comparing the analyzed environmental parameters with the preset environmental parameter thresholds and judging whether an abnormal condition exists; and
if an abnormality exists, transmitting it to the remote management module, whereupon the remote management module sends an instruction, according to the abnormal state, to the temperature-control ventilation element in the abnormal area of the supercomputer laboratory, and the temperature-control ventilation element receives the instruction and makes the corresponding adjustment.
The operation of the alarm module specifically comprises:
presetting adjustment-count thresholds for the temperature-control ventilation elements in the different areas of the supercomputer laboratory, and receiving the adjustment counts of those elements;
judging, from the number of parameter adjustments made to the temperature-control ventilation elements, whether the preset adjustment-count threshold is exceeded, and, if it is, determining that the environmental parameters in the areas concerned are abnormal and cannot be regulated; and
triggering an alarm signal and automatically stopping the operation of the equipment in the relevant area. Reacting immediately when regulation fails, by raising an alarm and stopping the equipment, improves the environmental safety of the supercomputer laboratory and the operating stability of the equipment.
The monitoring flow of the real-time monitoring module specifically comprises:
first, dividing the supercomputer laboratory into different areas, and monitoring and collecting in real time the environmental parameters of each divided area;
next, judging from the environmental parameter data monitored and collected in real time whether the environmental parameters exceed a preset threshold, and, if they do, determining that the current environmental parameters in the supercomputer laboratory are abnormal; and
finally, adjusting the operating parameters of the temperature-control ventilation elements in the different areas of the supercomputer laboratory according to the abnormal conditions, and, depending on the number of anomalies, triggering an alarm to the management end or automatically stopping the operation of the equipment in the relevant area.
The remote management module specifically comprises:
a data receiving module, configured to receive the real-time environmental parameters and anomaly analysis results of each device in the supercomputer laboratory;
a data analysis module, configured to analyze the received data, to determine the abnormal environmental parameter conditions of each device in the supercomputer laboratory, and to issue corresponding adjustment commands according to those conditions; and
a remote control module, configured to control and manage the running conditions of each device in the different areas of the supercomputer laboratory.
The invention also provides a management method of the supercomputer laboratory-based remote management system, comprising the following steps:
Step one: dividing the supercomputer laboratory into different areas, and presetting the environmental parameter thresholds and the adjustment-count thresholds of the temperature-control ventilation elements in the different areas. Monitoring the laboratory area by area allows its environmental parameters to be collected more accurately and then regulated.
Step two: acquiring real-time data on the environmental parameters in the different areas of the supercomputer laboratory, analyzing the acquired data, comparing the analyzed environmental parameters with the preset environmental parameter thresholds, and judging whether an abnormal condition exists. Real-time monitoring and judgment allow the environmental parameters in the laboratory to be returned quickly to a normal state, which safeguards the operation of every device.
Step three: if an abnormality exists, transmitting it to the remote management module, whereupon the remote management module sends an instruction, according to the abnormal state, to the temperature-control ventilation element in the abnormal area, and the temperature-control ventilation element receives the instruction and makes the corresponding adjustment. Regulating the environmental factors in this way keeps the working environment of each device in a normal state, so that an excessively high ambient temperature neither degrades the operation of the devices nor causes equipment failure.
Step four: each time the temperature-control ventilation element is adjusted, recording the adjustment and judging from the adjustment count whether the preset adjustment-count threshold is exceeded. The environmental parameters in the supercomputer laboratory are thus managed and adjusted remotely, without manual real-time monitoring and adjustment, which reduces labor and improves management.
Step five: if the preset adjustment-count threshold is exceeded, determining that the environmental parameters in the areas concerned are abnormal and cannot be regulated, triggering an alarm signal, and automatically stopping the operation of the equipment in the relevant area. Raising the alarm and stopping the equipment when regulation fails lets staff maintain the equipment in time, ensures its normal operation and prevents it from being damaged.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A supercomputer laboratory-based remote management system, comprising:
the super computing cluster management module is used for inputting commands and exchanging data, providing temporary storage and transfer of data when a plurality of nodes exchange data, simultaneously carrying out restarting management on the plurality of nodes in each infrastructure of the super computer laboratory, and providing a corresponding operation platform and management AI nodes for the command terminal;
the real-time monitoring module is used for dividing the supercomputer laboratory into different areas, carrying out real-time monitoring, acquisition and transmission on environment parameters of each device in the different areas of the supercomputer laboratory, judging whether the environment parameters in the supercomputer laboratory are abnormal or not, and correspondingly adjusting the operation parameters of the temperature control ventilation elements in the different areas of the supercomputer laboratory according to the judgment result;
and the remote management module is used for receiving real-time environment parameters and abnormal results of all the devices in the supercomputer laboratory, analyzing the received data, making corresponding adjustment commands according to the analysis results, and controlling the operation of all the devices and the temperature control ventilation elements in different areas of the supercomputer laboratory according to the adjustment commands.
2. A supercomputer laboratory-based remote management system as in claim 1, wherein: the super computing cluster management module specifically comprises:
the node management module is used for inputting commands and exchanging data, classifying the data entering the system, providing temporary storage and transfer of the data when a plurality of nodes exchange the data, and completing data retrieval when the data are required;
the restarting setting module is used for carrying out restarting management on a plurality of nodes in each infrastructure of the supercomputer laboratory;
the command terminal module is used for providing corresponding operation platforms for command terminals in various infrastructures of the supercomputer laboratory;
and the AI node module is used for managing AI nodes in each infrastructure of the supercomputer laboratory.
3. A supercomputer laboratory-based remote management system as in claim 1, wherein: the real-time monitoring module specifically comprises:
the area monitoring module is used for dividing the super computer laboratory into different areas and monitoring, collecting and transmitting environment parameters of each device in the different areas of the super computer laboratory in real time, wherein the environment parameters comprise environment temperature, humidity and air quality coefficient;
the abnormality judging module is used for judging whether the acquired environmental parameters exceed a preset threshold according to the environmental parameter data monitored and acquired in real time, and if the acquired environmental parameters exceed the preset threshold, the current environmental parameters in the supercomputer laboratory are abnormal;
the operation parameter adjusting module is used for correspondingly adjusting the operation parameters of the temperature control ventilation element in different areas of the supercomputer laboratory according to the abnormal environment parameters judged by the abnormality judging module;
and the alarm module is used for judging whether the abnormal times of the supercomputer laboratory exceeds a threshold value according to the parameter adjustment condition of the temperature control ventilation element, and triggering an alarm to a management end or automatically stopping the operation of equipment in a related area if the abnormal times of the supercomputer laboratory exceed the threshold value.
4. A supercomputer laboratory-based remote management system as in claim 3, wherein: the area monitoring module comprises:
the region dividing module is used for dividing the supercomputer laboratory into different regions;
the real-time monitoring module is used for continuously monitoring the environmental parameters of each device in different areas in the supercomputer laboratory and updating the monitored environmental parameter data in real time;
and the sensing acquisition module is used for acquiring and transmitting the environmental parameters of each device in different areas of the monitored supercomputer laboratory in real time and converting the environmental parameters into a machine-readable language.
5. A supercomputer laboratory-based remote management system as in claim 3, wherein: the abnormality judgment module includes:
the analysis module is used for acquiring the real-time data transmitted by the area monitoring module and analyzing the environmental parameter condition of the supercomputer laboratory according to the acquired data;
the comparison module is used for comparing the analyzed environmental parameter result with a preset environmental parameter threshold value and judging whether the real-time environmental parameter is abnormal or not according to the comparison result;
and the sending module is used for transmitting the result judged by the comparison module to the remote management module.
6. A supercomputer laboratory-based remote management system as recited in claim 5, wherein: the judging flow of the abnormality judging module specifically comprises:
presetting a threshold value of an environmental parameter, acquiring real-time data transmitted by an area monitoring module, and analyzing the acquired data;
comparing the analyzed environmental parameters with preset environmental parameter thresholds, and judging whether abnormal conditions exist or not;
if the abnormality exists, the abnormality is transmitted to a remote management module, the remote management module transmits an instruction to a temperature control ventilation element in an abnormal area of the supercomputer laboratory according to the abnormal state, and the temperature control ventilation element receives the corresponding instruction and makes corresponding adjustment.
7. A supercomputer laboratory-based remote management system as in claim 3, wherein: the alarm module specifically comprises:
presetting adjustment frequency thresholds of temperature control ventilation elements in different areas of a supercomputer laboratory, and receiving the adjustment frequency of the temperature control ventilation elements in different areas of the supercomputer laboratory;
judging whether the number of parameter adjustments of the temperature control ventilation element exceeds the preset threshold value, and if the number exceeds the threshold value, judging that the environmental parameters in different areas of the supercomputer laboratory are abnormal and cannot be regulated;
triggering an alarm signal and automatically stopping operation of the device in the relevant area.
8. A supercomputer laboratory-based remote management system as recited in claim 4, wherein: the monitoring flow of the real-time monitoring module specifically comprises the following steps:
firstly, dividing a supercomputer laboratory into different areas, and monitoring and collecting environmental parameters of each divided area in the supercomputer laboratory in real time;
then, judging whether the environmental parameters exceed a preset threshold value according to the environmental parameter data monitored and collected in real time, and if so, judging that the environmental parameters in the current supercomputer laboratory are abnormal;
and finally, correspondingly adjusting the operation parameters of the temperature control ventilation elements in different areas of the supercomputer laboratory according to the abnormal conditions, and triggering an alarm to a management end or automatically stopping the operation of equipment in the related area according to the abnormal times.
9. A supercomputer laboratory-based remote management system as in claim 1, wherein: the remote management module specifically comprises:
the data receiving module is used for receiving real-time environment parameters and abnormal analysis results of all the devices in the super computer laboratory;
the data analysis module is used for analyzing the received data, determining the abnormal conditions of the environmental parameters of each device in the super computer laboratory and making corresponding adjustment commands according to the abnormal conditions;
and the remote control module is used for controlling and managing the running conditions of all the devices in different areas of the supercomputer laboratory.
10. The method for managing a supercomputer laboratory-based remote management system as recited in claim 9, wherein: the method comprises the following steps:
step one: dividing a supercomputer laboratory into different areas, and presetting threshold values of environmental parameters and adjustment frequency threshold values of temperature control ventilation elements in different areas of the supercomputer laboratory;
step two: acquiring real-time data of environmental parameters in different areas of a supercomputer laboratory, analyzing the acquired data, comparing the analyzed environmental parameters with a preset environmental parameter threshold value, and judging whether an abnormal condition exists or not;
step three: if the abnormality exists, the abnormality is transmitted to a remote management module, the remote management module transmits an instruction to a temperature control ventilation element in an abnormal area of the supercomputer laboratory according to the abnormal state, and the temperature control ventilation element receives the corresponding instruction and makes corresponding adjustment;
step four: when the temperature control ventilation element is adjusted, the adjustment record is recorded, and whether the adjustment times exceeds a preset adjustment times threshold value is judged according to the adjustment times;
step five: if the preset adjustment frequency threshold value is exceeded, judging that the environmental parameters in different areas of the supercomputer laboratory are abnormal and cannot be regulated, triggering an alarm signal and automatically stopping the operation of equipment in the relevant area.
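Steps one to three of the claimed method (per-area thresholds, comparison, and issuing an adjustment instruction) can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `Reading`, `THRESHOLDS`, `abnormal_parameters` and `adjustment_commands` are hypothetical, and the numeric limits are placeholders, since the source gives no threshold values. The three parameters follow claim 3 (environment temperature, humidity and air quality coefficient).

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One real-time sample for an area (hypothetical structure)."""
    area: str
    temperature_c: float
    humidity_pct: float
    air_quality_index: float


# Hypothetical upper limits; the source does not give numeric thresholds.
THRESHOLDS = {
    "temperature_c": 27.0,
    "humidity_pct": 60.0,
    "air_quality_index": 100.0,
}


def abnormal_parameters(reading: Reading) -> list[str]:
    """Return the names of parameters that exceed their preset threshold."""
    return [
        name
        for name, limit in THRESHOLDS.items()
        if getattr(reading, name) > limit
    ]


def adjustment_commands(readings: list[Reading]) -> dict[str, list[str]]:
    """Map each abnormal area to the parameters its ventilation element should correct."""
    commands = {}
    for reading in readings:
        exceeded = abnormal_parameters(reading)
        if exceeded:
            commands[reading.area] = exceeded
    return commands


readings = [
    Reading("area-1", 24.5, 45.0, 40.0),
    Reading("area-2", 31.0, 65.0, 40.0),
]
commands = adjustment_commands(readings)
print(commands)  # {'area-2': ['temperature_c', 'humidity_pct']}
```

Only areas with at least one exceeded parameter receive a command, which mirrors the targeted, per-area adjustment described in the claims. Each issued command would then feed the adjustment counter of steps four and five.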
CN202311674568.7A 2023-12-05 2023-12-05 Remote management system based on supercomputer laboratory and management method thereof Pending CN117667591A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311674568.7A CN117667591A (en) 2023-12-05 2023-12-05 Remote management system based on supercomputer laboratory and management method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311674568.7A CN117667591A (en) 2023-12-05 2023-12-05 Remote management system based on supercomputer laboratory and management method thereof

Publications (1)

Publication Number Publication Date
CN117667591A true CN117667591A (en) 2024-03-08

Family

ID=90065777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311674568.7A Pending CN117667591A (en) 2023-12-05 2023-12-05 Remote management system based on supercomputer laboratory and management method thereof

Country Status (1)

Country Link
CN (1) CN117667591A (en)

Similar Documents

Publication Publication Date Title
CN107888441B (en) Network traffic baseline self-learning self-adaption method
CN112039075B (en) Converter station multidimensional data analysis and monitoring system
CN112421774B (en) Monitoring system of reactive power compensation equipment of power distribution network
CN115657631B (en) Intelligent monitoring system for industrial control equipment operation field environment
CN117176560B (en) Monitoring equipment supervision system and method based on Internet of things
CN107302264A (en) A kind of substation secondary automation equipment stable operation management-control method
CN116664113A (en) Intelligent safety supervision system for electric power metering standardized operation
CN111488258A (en) System for analyzing and early warning software and hardware running state
CN117477774A (en) Intelligent early warning system and method for multifunctional power distribution cabinet
CN115480542A (en) Production line running state and related process data acquisition system
CN117391675B (en) Data center infrastructure operation and maintenance management method
CN117667591A (en) Remote management system based on supercomputer laboratory and management method thereof
CN109035746B (en) Function judgment method and system for centralized meter reading terminal
CN107482783B (en) Comprehensive intelligent system for monitoring and controlling service power
CN212645787U (en) Computer lab power environmental monitoring system
CN113965529A (en) LonWorks communication control method and system based on priority queue
KR20230081759A (en) Integrated management systme for clean room management based on artifical intelligence and method thereof
CN113379082A (en) Cloud intelligent monitoring energy-saving and operation and maintenance management platform for clean industrial environment and equipment
CN112532434A (en) Intelligent monitoring system for network messages of transformer substation
CN110751814A (en) Electrical fire monitoring system for rail transit and early warning analysis method thereof
CN117950364B (en) Intelligent on-site equipment control system
CN117171590B (en) Intelligent driving optimization method and system for motor
KR101412384B1 (en) Optimum management system of constant temperature and constant humidity by preceding diagnosis
CN117493129B (en) Operating power monitoring system of computer control equipment
CN115983806A (en) Smart station room background management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination