CN115576736A - Refined intelligent monitoring method for data center - Google Patents

Publication number
CN115576736A
Authority
CN
China
Prior art keywords
fault
data
hardware
working group
data center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211562132.4A
Other languages
Chinese (zh)
Inventor
李超成
高鸿波
刘毅
康凯
刘大维
高雷
饶智斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tongniu Information Technology Co ltd
Original Assignee
Beijing Tongniu Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tongniu Information Technology Co ltd
Priority to CN202211562132.4A
Publication of CN115576736A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703: Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766: Error or fault reporting or storing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703: Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079: Root cause analysis, i.e. error or fault diagnosis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703: Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793: Remedial or corrective actions
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q: SELECTING
    • H04Q9/00: Arrangements in telecontrol or telemetry systems for selectively calling a substation from a main station, in which substation desired apparatus is selected for applying a control signal thereto or for obtaining measured values therefrom
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q: SELECTING
    • H04Q2209/00: Arrangements in telecontrol or telemetry systems
    • H04Q2209/70: Arrangements in the main station, i.e. central controller

Abstract

The invention relates to the technical field of data security, and in particular to a refined intelligent monitoring method for a data center. The method comprises: establishing a database, acquiring fault data, and recording the fault data in the database; dividing the data center into working groups, and acquiring temperature and humidity data and hardware telemetry data for each working group and for each device; generating a fault event from the fault data, and locating the faulty working group and faulty device that produced the fault event according to the fault event information; checking the faulty working group and faulty device for hardware faults, and judging from their temperature and humidity data and hardware telemetry data whether the fault is a hardware fault; when the fault is determined to be a hardware fault, generating repair-request information and reporting it automatically; and when the hardware of the faulty working group and faulty device is running normally, checking the faulty system and repairing it automatically.

Description

Refined intelligent monitoring method for data center
Technical Field
The invention relates to the technical field of data security, in particular to a refined intelligent monitoring method for a data center.
Background
A data center is a globally collaborative network of devices used to deliver, accelerate, present, compute, and store data over Internet infrastructure. In the future, data centers will become competitive assets for enterprises, and business models will change accordingly. As data center applications have spread, artificial intelligence, network security, and related technologies have emerged in succession, bringing more users to network and mobile applications. As the number of computers and the volume of data grow, people can also improve their own abilities through continuous learning and accumulation, which is an important marker of the move into the information age. Data centers are one of the more popular research topics in computing, so the related technology is relatively mature. Data center network topologies mainly include Tree, Fat-Tree, BCube, and FiConn; they mainly adopt modular, hierarchical, and flattened design ideas together with virtualized partition management, dividing thousands of devices into units and managing them unit by unit. A data center is made up of a large amount of computer hardware, and a hardware problem can prevent some functions from working or from working properly.
To keep a data center operating safely, most data centers are monitored by video monitoring and software monitoring. However, existing monitoring approaches suffer from delayed messages and untimely handling: when a hardware problem occurs in the data center and the faulty hardware is not powered off in time, it may overheat, catch fire, or explode while still powered.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a refined intelligent monitoring method for a data center that is highly intelligent, can quickly locate abnormal equipment, repairs it automatically and reports it in time, and improves maintenance efficiency and operational stability.
In order to solve the technical problems, the invention adopts the following technical scheme:
A refined intelligent monitoring method for a data center comprises:
establishing a database, collecting information data from the logs and operation records of the data center, entering the collected data into the database, periodically scanning each device of the data center to acquire fault data, and recording the fault data in the database;
placing the equipment of the data center together according to function and purpose, dividing the data center into a plurality of working groups, acquiring temperature and humidity data and hardware telemetry data for each working group and for each device, and supplying power to each working group separately;
generating a fault event from the fault data, and locating the faulty working group and faulty device that produced the fault event according to the fault event information;
checking the faulty working group and faulty device for hardware faults, and judging from their temperature and humidity data and hardware telemetry data whether the fault is a hardware fault;
when the fault is determined to be a hardware fault, powering off the faulty working group and faulty device, generating maintenance information, and reporting it automatically;
when the hardware of the faulty working group and faulty device is running normally, checking the faulty system and repairing it automatically.
Further, the method includes a log analysis tool, and the logs recorded in the database are analyzed with the log analysis tool.
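The patent does not name a database engine, a log format, or a specific log analysis tool. As a minimal sketch only, the following assumes a hypothetical log line format (`<timestamp> <device_id> <LEVEL> <message>`), treats `ERROR` lines as fault data, and uses an in-memory SQLite table named `fault_data`; all of these names are illustrative assumptions, not part of the disclosure.

```python
import re
import sqlite3

# Hypothetical log format: "<timestamp> <device_id> <LEVEL> <message>"
LOG_LINE = re.compile(r"^(\S+) (\S+) (INFO|WARN|ERROR) (.*)$")


def open_database() -> sqlite3.Connection:
    """Create an in-memory database with a table for fault data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fault_data (ts TEXT, device_id TEXT, level TEXT, message TEXT)"
    )
    return conn


def record_faults(conn: sqlite3.Connection, log_lines) -> int:
    """Parse log lines, store ERROR entries as fault data, return rows stored."""
    rows = []
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the assumed format are skipped.
        if match and match.group(3) == "ERROR":
            rows.append(match.groups())
    conn.executemany("INSERT INTO fault_data VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = open_database()
sample = [
    "2022-12-07T10:00:00 srv-a01 INFO heartbeat ok",
    "2022-12-07T10:00:05 srv-a02 ERROR disk controller timeout",
    "malformed line",
]
stored = record_faults(conn, sample)
faulty = [row[0] for row in conn.execute("SELECT device_id FROM fault_data")]
```

In this sketch only the `srv-a02` line is stored, so later steps can query the table to decide which devices need a fault event.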
Further, the step of grouping the data center specifically includes:
dividing the data center into a plurality of working groups according to the cabinet layout of the data center, and supplying power to each working group separately;
carrying out video monitoring of the data center and acquiring a monitoring picture;
and splitting the monitoring picture according to the division of the working groups, and displaying it visually.
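The grouping step above can be sketched as follows. The patent says only that working groups follow the cabinet layout and are powered separately; the rule used here (the first character of a cabinet label such as "A01" identifies the group) and the feed names are hypothetical choices for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    device_id: str
    cabinet: str  # hypothetical cabinet label, e.g. "A01" (row A, cabinet 01)


def group_by_cabinet_row(devices):
    """Assign devices to working groups keyed by the cabinet row letter."""
    groups = defaultdict(list)
    for device in devices:
        groups[device.cabinet[0]].append(device.device_id)
    return dict(groups)


def power_feeds(groups):
    """Give each working group its own (hypothetical) power feed name."""
    return {name: f"feed-{name}" for name in sorted(groups)}


devices = [
    Device("srv-a01", "A01"),
    Device("srv-a02", "A02"),
    Device("srv-b01", "B01"),
]
groups = group_by_cabinet_row(devices)
feeds = power_feeds(groups)
```

Keeping one feed per group is what later allows a single faulty group to be powered off without affecting the others.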
Further, a high-definition camera is used for video monitoring of the data center and acquisition of the monitoring picture, and a video picture divider and a large-screen display device are used to split the monitoring picture according to the division of the working groups and display it visually.
Further, the hardware telemetry data includes data streams generated by the CPU, the memory, and the PCIe interface.
Further, the temperature and humidity data and hardware telemetry data of each working group and of each device are acquired with temperature and humidity sensors and registers, the registers being used to monitor cache, CPU frequency, memory bandwidth, and input/output access.
The invention also provides a refined intelligent monitoring device for a data center, which comprises:
a database module, used for establishing a database, collecting information data from the logs and operation records of the data center, entering the collected data into the database, periodically scanning each device of the data center to acquire fault data, and recording the fault data in the database;
a data center grouping and operating-data monitoring module, used for dividing the data center into a plurality of working groups, acquiring temperature and humidity data and hardware telemetry data for each working group and for each device, and supplying power to each working group separately;
a fault locating module, used for generating a fault event from the fault data and locating the faulty working group and faulty device that produced the fault event according to the fault event information;
a hardware troubleshooting module, used for checking the faulty working group and faulty device for hardware faults and judging from their temperature and humidity data and hardware telemetry data whether the fault is a hardware fault;
a hardware fault operation module, used for powering off the faulty working group and faulty device when the fault is determined to be a hardware fault, generating maintenance information, and reporting it automatically;
and a system fault operation module, used for checking the faulty system and repairing it automatically when the hardware of the faulty working group and faulty device is running normally.
The invention also provides computer equipment which comprises a memory and a processor, wherein the memory is stored with computer readable instructions, and the processor executes the computer readable instructions to realize the steps of the data center refined intelligent monitoring method.
The invention also provides a computer readable storage medium, wherein computer readable instructions are stored on the computer readable storage medium, and when the computer readable instructions are executed by a processor, the steps of the data center refined intelligent monitoring method are realized.
The invention has the following beneficial effects. In actual use, a database is established; information data from the data center's logs and operation records is collected and entered into the database; and each device of the data center is scanned periodically to acquire fault data, which is recorded in the database. The data center is then divided into a plurality of working groups, temperature and humidity data and hardware telemetry data are acquired for each working group and for each device, and power is supplied to each working group separately. A fault event is generated from the fault data, and the faulty working group and faulty device that produced it are located according to the fault event information. The faulty working group and faulty device are then checked for hardware faults: all application services depend on the normal operation of the physical hardware, and a hardware problem prevents some functions from working or from working properly. When the fault is determined to be a hardware fault, the faulty working group and faulty device are powered off, maintenance information is generated, and the fault is reported automatically; when the hardware of the faulty working group and faulty device is running normally, the faulty system is checked and repaired automatically. The refined intelligent monitoring method is highly intelligent, can quickly locate abnormal equipment, repairs it automatically and reports it in time, and improves maintenance efficiency and operational stability.
Drawings
FIG. 1 is a block flow diagram of the present invention;
FIG. 2 is a schematic structural diagram of a data center refined intelligent monitoring device according to the present invention;
FIG. 3 is a schematic diagram of the structure of the computer device of the present invention;
FIG. 4 is a schematic diagram of the monitoring pictures grouped by working group according to the present invention.
Detailed Description
In order to facilitate understanding of those skilled in the art, the present invention is further described below with reference to the following examples and the accompanying drawings, which are not intended to limit the present invention.
As shown in FIG. 1, a refined intelligent monitoring method for a data center includes:
establishing a database, collecting information data from the logs and operation records of the data center, entering the collected data into the database, periodically scanning each device of the data center to acquire fault data, and recording the fault data in the database;
dividing the data center into a plurality of working groups, and acquiring temperature and humidity data and hardware telemetry data for each working group and for each device;
generating a fault event from the fault data, and locating the faulty working group and faulty device that produced the fault event according to the fault event information;
checking the faulty working group and faulty device for hardware faults; because all application services depend on the normal operation of the physical hardware, and a hardware problem prevents some functions from working or from working properly, whether a hardware fault exists is judged from the temperature and humidity data and hardware telemetry data of the faulty working group and faulty device;
when the fault is determined to be a hardware fault, generating repair-request information and reporting it automatically;
when the hardware of the faulty working group and faulty device is running normally, checking the faulty system and repairing it automatically.
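The locate-then-dispatch part of the steps above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `FaultEvent` fields, the `device_to_group` lookup table, and the returned message strings are all hypothetical, and the hardware check is passed in as a callable because the patent leaves its exact form open.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FaultEvent:
    device_id: str
    group: str
    description: str


def locate(fault_record: dict, device_to_group: Dict[str, str]) -> FaultEvent:
    """Turn a stored fault record into an event tagged with its working group."""
    device_id = fault_record["device_id"]
    return FaultEvent(device_id, device_to_group[device_id], fault_record["message"])


def handle(event: FaultEvent, is_hardware_fault: Callable[[FaultEvent], bool]) -> str:
    """Dispatch: hardware faults are reported, other faults are auto-repaired."""
    if is_hardware_fault(event):
        return f"REPAIR REQUEST: {event.group}/{event.device_id}: {event.description}"
    return f"AUTO-REPAIR: {event.group}/{event.device_id}"


device_to_group = {"srv-a02": "A", "srv-b01": "B"}
event = locate({"device_id": "srv-a02", "message": "disk controller timeout"},
               device_to_group)
hardware_result = handle(event, lambda e: True)
software_result = handle(event, lambda e: False)
```

Separating `locate` from `handle` mirrors the order of the method: the fault is first tied to a working group and device, and only then classified and acted on.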
When the database is established, it can run on a dedicated server. When a dedicated server is used, care should be taken not to open files or email attachments from strangers or of unknown origin, not to click through Office macro prompts, and not to double-click files with .js or .vbs extensions; in addition, security signature databases such as antivirus definitions should be updated promptly, and important data and files should be backed up off-site periodically. When the data center is grouped, grouping is based mainly on where each working group is placed; therefore, when the working groups are laid out and set up, the components of the same working group are placed together as far as possible to ease later maintenance and use. Information data from the data center's logs and operation records is collected and entered into the database, and every device of the data center is scanned periodically to acquire fault data, so that all working groups and devices are monitored at the system level. By collecting this information data together with the temperature and humidity data and hardware telemetry data of each working group and device, the equipment is monitored while it runs; when abnormal operation is found, it can be repaired automatically, with manual intervention where needed.
The specific working principle and implementation of the invention are as follows. In operation, the data center is grouped according to function, purpose, and placement; fault events are generated from the fault data collected in the database; the faulty working group and faulty device that produced a fault event are located according to the fault event information; and the cause of the fault event is judged to be either a hardware fault or a system fault. When a system fault occurs, the faulty system can be checked and repaired automatically. When a hardware fault occurs, the faulty working group cannot be restored to operation by system repair. When the temperature and humidity data and hardware telemetry data of the faulty working group reach the set thresholds, the faulty device is considered to be at risk of fire and explosion; when the hardware telemetry data is abnormal, the faulty device is considered to be operating abnormally and unable to work normally. Powering off the faulty working group and faulty device in advance prevents the faulty device from catching fire or exploding while powered; cutting off their power supply at this point effectively prevents fire and explosion accidents, and the fault is reported automatically so that maintenance personnel are contacted to carry out the repair.
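The threshold logic described above can be sketched as follows. The patent refers to "set thresholds" without giving values, so the limits `MAX_TEMPERATURE_C` and `MAX_HUMIDITY_PCT`, the `Reading` fields, and the action names are hypothetical placeholders chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    temperature_c: float
    humidity_pct: float
    telemetry_ok: bool  # whether the hardware telemetry looks normal


# Hypothetical limits; the patent does not state threshold values.
MAX_TEMPERATURE_C = 45.0
MAX_HUMIDITY_PCT = 80.0


def classify(reading: Reading) -> str:
    """Return "hardware" if a threshold is reached or telemetry is abnormal."""
    over_threshold = (
        reading.temperature_c >= MAX_TEMPERATURE_C
        or reading.humidity_pct >= MAX_HUMIDITY_PCT
    )
    if over_threshold or not reading.telemetry_ok:
        return "hardware"
    return "system"


def actions(reading: Reading) -> list:
    """Map the classification to the follow-up steps named in the description."""
    if classify(reading) == "hardware":
        return ["power_off", "report_for_maintenance"]
    return ["inspect_system", "auto_repair"]


hot = Reading(temperature_c=52.0, humidity_pct=40.0, telemetry_ok=True)
normal = Reading(temperature_c=28.0, humidity_pct=45.0, telemetry_ok=True)
```

Here `actions(hot)` yields the power-off and reporting steps, while `actions(normal)` yields system inspection and automatic repair.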
In this embodiment, a high-definition camera is used for video monitoring of the data center and acquisition of the monitoring picture.
In the above structure, the high-definition camera is installed on the ceiling inside the data center machine room and can cover the whole data center; the camera monitors the data center so that back-end equipment can readily acquire the monitoring picture.
As shown in FIG. 4, in this embodiment, a video picture divider and a large-screen display device are used to split the monitoring picture according to the division of the working groups and display it visually.
In the above structure, the video picture divider splits the high-definition camera's monitoring data into separate pictures, which are then shown on the large-screen display device. With the split pictures, staff can see the specific working condition of each working group at a glance and quickly locate the working group that has a problem, which improves the efficiency of maintenance personnel.
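The patent relies on a hardware video picture divider and does not describe how tiles are arranged. Purely as an illustration of showing one tile per working group on a large screen, the sketch below computes a near-square grid; the function name `tile_layout` and the `(x, y, width, height)` tuple layout are assumptions.

```python
import math


def tile_layout(groups, width, height):
    """Return {group: (x, y, tile_width, tile_height)} on a near-square grid."""
    count = len(groups)
    if count == 0:
        return {}
    cols = math.ceil(math.sqrt(count))
    rows = math.ceil(count / cols)
    tile_w, tile_h = width // cols, height // rows
    return {
        name: ((index % cols) * tile_w, (index // cols) * tile_h, tile_w, tile_h)
        for index, name in enumerate(groups)
    }


layout = tile_layout(["A", "B", "C", "D"], 1920, 1080)
```

With four working groups on a 1920x1080 screen this gives a 2x2 grid of 960x540 tiles, filled left to right and top to bottom.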
In this embodiment, the temperature and humidity data and hardware telemetry data of each working group and of each device are acquired with temperature and humidity sensors and registers, the registers being used to monitor cache, CPU frequency, memory bandwidth, and input/output access.
In the above structure, the temperature and humidity sensors monitor temperature and humidity while the data center is running. There are multiple sensors, installed on the devices of each working group, so that the temperature and humidity of each device can be monitored accurately. The registers monitor operating conditions of the device such as cache, CPU frequency, memory bandwidth, and input/output access. When a problem occurs, it can be detected in time and maintenance personnel contacted for repair.
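The description says data is acquired both per device and per working group but does not say how group-level values are derived. One possible reading, offered only as an assumption, is to summarize the per-device sensor readings of a group, as in this sketch; the summary fields are hypothetical.

```python
from statistics import mean


def summarize_group(device_readings):
    """Summarize {device_id: (temperature_c, humidity_pct)} for one working group."""
    hottest = max(device_readings, key=lambda dev: device_readings[dev][0])
    return {
        "max_temperature_c": device_readings[hottest][0],
        "hottest_device": hottest,
        "mean_humidity_pct": mean(h for _, h in device_readings.values()),
    }


group_a = {"srv-a01": (30.0, 40.0), "srv-a02": (50.0, 60.0)}
summary = summarize_group(group_a)
```

Tracking the hottest device alongside the group maximum keeps the link from a group-level alarm back to the individual device that caused it.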
As shown in FIG. 2, a refined intelligent monitoring device for a data center includes:
a database module, used for establishing a database, collecting information data from the logs and operation records of the data center, entering the collected data into the database, periodically scanning each device of the data center to acquire fault data, and recording the fault data in the database;
a data center grouping and operating-data monitoring module, used for dividing the data center into a plurality of working groups, acquiring temperature and humidity data and hardware telemetry data for each working group and for each device, and supplying power to each working group separately;
a fault locating module, used for generating a fault event from the fault data and locating the faulty working group and faulty device that produced the fault event according to the fault event information;
a hardware troubleshooting module, used for checking the faulty working group and faulty device for hardware faults and judging from their temperature and humidity data and hardware telemetry data whether the fault is a hardware fault;
a hardware fault operation module, used for powering off the faulty working group and faulty device when the fault is determined to be a hardware fault, generating maintenance information, and reporting it automatically;
and a system fault operation module, used for checking the faulty system and repairing it automatically when the hardware of the faulty working group and faulty device is running normally.
In the above structure, the database module collects information data from the data center's logs and operation records and enters it into the database, and periodically scans each device of the data center to acquire fault data, which is also recorded in the database. The data center grouping and operating-data monitoring module divides the data center into a plurality of working groups, acquires temperature and humidity data and hardware telemetry data for each working group and for each device, and supplies power to each working group separately. The fault locating module then generates a fault event from the fault data and locates the faulty working group and faulty device that produced it according to the fault event information. The hardware troubleshooting module checks the faulty working group and faulty device for hardware faults, using the acquired temperature and humidity data and hardware telemetry data to judge whether the fault is a hardware fault. When the fault is determined to be a hardware fault, the hardware fault operation module powers off the faulty working group and faulty device, generates maintenance information, and reports it automatically. When the hardware of the faulty working group and faulty device is running normally, the system fault operation module checks the faulty system and repairs it automatically.
As shown in FIG. 3, the computer device includes a memory, a processor, and a network interface communicatively connected to each other via a system bus. It should be noted that only a computer device having a memory, a processor, and a network interface is shown in fig. 3, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may alternatively be implemented. As will be understood by those skilled in the art, the computer device herein is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and the hardware includes but is not limited to a microprocessor, an application specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The memory includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. The processor may be a central processing unit, controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor is typically used to control the overall operation of the computer device. In this embodiment, the processor is configured to run a program code stored in the memory or process data, for example, run a program code of the data center refined intelligent monitoring method.
All the technical features in the embodiment can be freely combined according to actual needs.
The above embodiments are preferred implementations of the present invention, and other implementations are also included, and any obvious substitutions are within the scope of the present invention without departing from the spirit of the present invention.

Claims (6)

1. A refined intelligent monitoring method for a data center, characterized by comprising the following steps:
establishing a database, collecting information data from the logs and operation records of the data center, entering the collected data into the database, periodically scanning each device of the data center to acquire fault data, and recording the fault data in the database;
placing the equipment of the data center together according to function and purpose, dividing the data center into a plurality of working groups, acquiring temperature and humidity data and hardware telemetry data for each working group and for each device, and supplying power to each working group separately;
generating a fault event from the fault data, and locating the faulty working group and faulty device that produced the fault event according to the fault event information;
checking the faulty working group and faulty device for hardware faults, and judging from their temperature and humidity data and hardware telemetry data whether the fault is a hardware fault;
when the fault is determined to be a hardware fault, generating repair-request information and reporting it automatically;
when the hardware of the faulty working group and faulty device is running normally, checking the faulty system and repairing it automatically.
2. The refined intelligent monitoring method for the data center according to claim 1, characterized in that: the method further comprises a log analysis tool, and the logs recorded in the database are analyzed with the log analysis tool.
3. The refined intelligent monitoring method for the data center according to claim 1, characterized in that: the step of grouping the data center specifically includes:
dividing the data center into a plurality of working groups according to the cabinet layout of the data center, and supplying power to each working group separately;
carrying out video monitoring of the data center and acquiring a monitoring picture;
and splitting the monitoring picture according to the division of the working groups, and displaying it visually.
4. The refined intelligent monitoring method for the data center according to claim 3, characterized in that: a high-definition camera is used for carrying out video monitoring of the data center and acquiring the monitoring picture;
and a video picture divider and a large-screen display device are used for splitting the monitoring picture according to the division of the working groups and displaying it visually.
5. The refined intelligent monitoring method for the data center according to claim 1, characterized in that: the hardware telemetry data includes data streams generated by the CPU, the memory, and the PCIe interface.
6. The refined intelligent monitoring method for the data center according to claim 5, characterized in that: acquiring the temperature and humidity data and hardware telemetry data of each working group and of each device comprises acquiring temperature and humidity sensor data and register data, the registers being used to monitor cache, CPU frequency, memory bandwidth, and input/output access.
CN202211562132.4A 2022-12-07 2022-12-07 Refined intelligent monitoring method for data center Pending CN115576736A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211562132.4A CN115576736A (en) 2022-12-07 2022-12-07 Refined intelligent monitoring method for data center

Publications (1)

Publication Number Publication Date
CN115576736A true CN115576736A (en) 2023-01-06

Family

ID=84590461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211562132.4A Pending CN115576736A (en) 2022-12-07 2022-12-07 Refined intelligent monitoring method for data center

Country Status (1)

Country Link
CN (1) CN115576736A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103684817A (en) * 2012-09-06 2014-03-26 百度在线网络技术(北京)有限公司 Monitoring method and system for data center
CN204204385U (en) * 2014-11-05 2015-03-11 北京百度网讯科技有限公司 Large screen display system
CN107958347A (en) * 2017-12-18 2018-04-24 金税信息技术服务股份有限公司 Intelligent operation management method and system with automatic fault reporting
CN112187514A (en) * 2020-09-02 2021-01-05 上海御威通信科技有限公司 Intelligent operation and maintenance system, method and terminal for data center network equipment
CN115033419A (en) * 2022-08-12 2022-09-09 浩鲸云计算科技股份有限公司 Method and system for realizing hardware fault self-healing

Similar Documents

Publication Publication Date Title
CN107302466B (en) Big data analysis platform and method for dynamic loop monitoring system
CN108667666A (en) A kind of intelligent O&M method and its system based on visualization technique
CN107832196B (en) Monitoring device and monitoring method for abnormal content of real-time log
CN109768889A (en) A kind of visualization safety management wisdom operation platform
CN106936860A (en) A kind of monitoring system and method based on terminal device
CN106936858A (en) A kind of cloud platform monitoring system and method
US10673684B2 (en) System and method for autonomus data center operation and healing
CN108052358B (en) Distributed deployment system and method
CN106936859A (en) A kind of Cloud Server policy deployment system and method
CN110971464A (en) Operation and maintenance automatic system suitable for disaster recovery center
CN104125085A (en) EBS (Enterprise Service Bus) data management and control method and device
CN109698766A (en) The method and system of communication power supply accident analysis
CN105141478A (en) Method for monitoring state of sas card hard disk of linux server
CN110412524A (en) A kind of wind profile radar standard output controller system
CN112799909A (en) Automatic management system and method for server
CN115860729A (en) IT operation and maintenance integrated management system
CN112269673A (en) Intelligent operation and maintenance management system and method for data center
CN103986607A (en) Voice-sound-light alarm monitoring system for intelligent data center
JP4842738B2 (en) Fault management support system and information management method thereof
WO2024008130A1 (en) Faulty hardware processing method, apparatus and system
CN112449019A (en) IMS intelligent Internet of things operation and maintenance management platform
CN115576736A (en) Refined intelligent monitoring method for data center
CN112131090B (en) Service system performance monitoring method, device, equipment and medium
CN113691390A (en) Cloud-end-coordinated edge node alarm system and method
CN115237719A (en) Early warning method and system for reliability of server power supply

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230106
