CN115827038A - Operation and maintenance control method and system applied to data center - Google Patents

Publication number
CN115827038A
Authority
CN
China
Prior art keywords
maintenance
fault
data center
information
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211543582.9A
Other languages
Chinese (zh)
Inventor
朱青山
邓莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Wanxin Information Engineering Co ltd
Original Assignee
Hefei Wanxin Information Engineering Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Wanxin Information Engineering Co ltd filed Critical Hefei Wanxin Information Engineering Co ltd
Priority to CN202211543582.9A priority Critical patent/CN115827038A/en
Publication of CN115827038A publication Critical patent/CN115827038A/en
Withdrawn legal-status Critical Current

Abstract

The invention belongs to the technical field of data center operation and maintenance, and provides an operation and maintenance control method and system applied to a data center. The method comprises the following steps: receiving data center fault information, wherein the data center fault information comprises data equipment information and a fault code; determining, according to the data center fault information, whether an automatic fault repair program exists, and if so, performing automatic operation and maintenance using the automatic fault repair program; when no automatic fault repair program exists, sending the data center fault information to a manual operation and maintenance center; and monitoring the data center room in real time, and when it is detected that operation and maintenance personnel enter the data center room, determining the identity information of the operation and maintenance personnel and the data equipment requiring operation and maintenance, so that the data equipment starts an operation and maintenance operation recording function. In addition, the invention detects the behavior of the operation and maintenance personnel: when the personnel are detected operating data equipment that does not require operation and maintenance, this indicates improper operation, and warning information is generated, thereby realizing supervision of the operation and maintenance personnel.

Description

Operation and maintenance control method and system applied to data center
Technical Field
The invention relates to the technical field of operation and maintenance of data centers, in particular to an operation and maintenance control method and system applied to a data center.
Background
With the rapid development of the electronic information industry, large numbers of servers, network devices and data storage devices stand behind various software application systems as their support. As communication networks grow in scale and the number of network devices increases, the operation and maintenance management and control work of a data center becomes more complicated, and when the data center is abnormal, an engineer is often required to go to the data center to handle the problem. The whole operation and maintenance process is dominated by the engineer, so subjective human factors play a large role, secrets may be leaked during operation and maintenance, and the engineer's work is not supervised. Therefore, it is desirable to provide an operation and maintenance management and control method and system applied to a data center that solves the above problems.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide an operation and maintenance control method and system applied to a data center, so as to solve the problems in the background technology.
The invention is realized as follows: an operation and maintenance management and control method applied to a data center comprises the following steps:
receiving data center fault information, wherein the data center fault information comprises data equipment information and a fault code;
determining, according to the data center fault information, whether an automatic fault repair program exists, and if so, performing automatic operation and maintenance using the automatic fault repair program; if not, executing the next step;
sending the data center fault information to a manual operation and maintenance center;
monitoring the data center room in real time, and when it is detected that operation and maintenance personnel enter the data center room, determining the identity information of the operation and maintenance personnel and the data equipment requiring operation and maintenance, so that the data equipment starts an operation and maintenance operation recording function and operation and maintenance operation information is obtained;
and detecting the behavior of the operation and maintenance personnel, and generating warning information when it is detected that the operation and maintenance personnel are operating data equipment that does not require operation and maintenance.
As a further scheme of the invention: the step of determining whether an automatic fault repairing program exists according to the data center fault information specifically includes:
inputting fault information of a data center into a repair program library for matching, wherein the repair program library comprises all data equipment information, each data equipment information corresponds to a plurality of fault codes, and each fault code corresponds to a fault automatic repair program;
when the fault codes in the data center fault information are successfully matched with the fault codes in the repair program library, judging that an automatic fault repair program exists; otherwise, judging that no automatic fault repairing program exists.
As a further scheme of the invention: the step of sending the data center fault information to the manual operation and maintenance center specifically comprises the following steps:
sending the fault information of the data center to a manual operation and maintenance center;
distributing the operation and maintenance tasks to the operation and maintenance personnel terminal according to the fault codes in the fault information of the data center;
and sending the identity information of the corresponding operation and maintenance personnel to an access control system of the data center room, so that the operation and maintenance personnel can enter the data center room.
As a further scheme of the invention: the step of detecting the behavior of the operation and maintenance personnel specifically comprises the following steps:
analyzing the monitoring image of the data center room to detect whether the stay time of the operation and maintenance personnel in a certain area exceeds a set time length;
and when the stay time exceeds the set time length, determining a distance value between the area and the data equipment requiring operation and maintenance, and when the distance value is greater than a set distance, judging that the operation and maintenance personnel are operating data equipment that does not require operation and maintenance.
As a further scheme of the invention: the method further comprises the following steps:
grouping all the recorded operation and maintenance operation information to obtain a plurality of operation and maintenance operation groups, wherein the similarity of any two operation and maintenance operation information in each operation and maintenance operation group is greater than a set similarity value;
judging the quantity of the operation and maintenance operation information in each operation and maintenance operation group, and marking the operation and maintenance operation group as a repair program development group when the quantity of the operation and maintenance operation information is greater than a set quantity value;
and sending the repair program development group to a program development center.
Another object of the present invention is to provide an operation and maintenance management and control system applied to a data center, where the system includes:
the data center fault information receiving module is used for receiving data center fault information, and the data center fault information comprises data equipment information and fault codes;
the program automatic operation and maintenance module is used for determining, according to the data center fault information, whether an automatic fault repair program exists, and when the automatic fault repair program exists, performing automatic operation and maintenance using the automatic fault repair program; when no automatic fault repair program exists, the steps in the fault information sending module are executed;
the fault information sending module is used for sending the fault information of the data center to the manual operation and maintenance center;
the operation and maintenance operation recording module is used for monitoring the data center room in real time, and when detecting that an operation and maintenance person enters the data center room, the operation and maintenance person identity information and the data equipment needing operation and maintenance are determined, so that the data equipment starts the operation and maintenance operation recording function to obtain the operation and maintenance operation information;
and the operation and maintenance behavior monitoring module is used for detecting the behavior of the operation and maintenance personnel and generating warning information when detecting that the operation and maintenance personnel operate the data equipment which does not need operation and maintenance.
As a further scheme of the invention: the program automatic operation and maintenance module comprises:
the repair program matching unit is used for inputting the data center fault information into a repair program library for matching, wherein the repair program library comprises all data equipment information, each piece of data equipment information corresponds to a plurality of fault codes, and each fault code corresponds to an automatic fault repair program;
the repair program judging unit is used for judging that the automatic fault repair program exists when the fault codes in the data center fault information are successfully matched with the fault codes in the repair program library; otherwise, judging that no fault automatic repair program exists.
As a further scheme of the invention: the fault information sending module comprises:
the fault information sending unit is used for sending the fault information of the data center to the manual operation and maintenance center;
the operation and maintenance task dispatching unit is used for dispatching the operation and maintenance tasks to the operation and maintenance personnel terminal according to the fault codes in the fault information of the data center;
and the access control system authorization unit is used for sending the identity information of the corresponding operation and maintenance personnel to the access control system of the data center room, so that the operation and maintenance personnel can enter the data center room.
As a further scheme of the invention: the operation and maintenance behavior monitoring module comprises:
the monitoring image analysis unit is used for analyzing the monitoring image of the data center room to detect whether the stay time of the operation and maintenance personnel in a certain area exceeds the set time length;
and the distance value judging unit is used for determining, when the stay time exceeds the set time length, the distance value between the area and the data equipment requiring operation and maintenance, and when the distance value is greater than the set distance, judging that the operation and maintenance personnel are operating data equipment that does not require operation and maintenance.
As a further scheme of the invention: the system also comprises an operation and maintenance operation development module, wherein the operation and maintenance operation development module specifically comprises:
the operation and maintenance operation grouping unit is used for grouping all the recorded operation and maintenance operation information to obtain a plurality of operation and maintenance operation groups, and the similarity of any two operation and maintenance operation information in each operation and maintenance operation group is greater than a set similarity value;
the repair program development group unit is used for judging the quantity of the operation and maintenance operation information in each operation and maintenance operation group, and when the quantity of the operation and maintenance operation information is larger than a set quantity value, the operation and maintenance operation group is marked as a repair program development group;
and the program development group sending unit is used for sending the repair program development group to the program development center.
Compared with the prior art, the invention has the beneficial effects that:
the invention can monitor the data center room in real time, and when detecting that an operation and maintenance person enters the data center room, the identity information of the operation and maintenance person and the data equipment needing operation and maintenance are determined, so that the data equipment starts the operation and maintenance operation recording function, obtains the operation and maintenance operation information and is convenient to trace; in addition, the invention also can detect the behaviors of the operation and maintenance personnel, and when the operation and maintenance personnel is detected to operate the data equipment which does not need operation and maintenance, the operation and maintenance personnel is indicated to perform improper operation, warning information is generated, and the monitoring of the operation and maintenance personnel is realized.
Drawings
Fig. 1 is a flowchart of an operation and maintenance management and control method applied to a data center.
Fig. 2 is a flowchart of determining whether an automatic fault repairing program exists according to data center fault information in an operation and maintenance management and control method applied to a data center.
Fig. 3 is a flowchart for sending data center fault information to a manual operation and maintenance center in an operation and maintenance control method applied to a data center.
Fig. 4 is a flowchart for detecting the behavior of operation and maintenance personnel in the operation and maintenance control method applied to the data center.
Fig. 5 is a flowchart for grouping all recorded operation and maintenance operation information in an operation and maintenance management and control method applied to a data center.
Fig. 6 is a schematic structural diagram of an operation and maintenance management and control system applied to a data center.
Fig. 7 is a schematic structural diagram of a program automatic operation and maintenance module in an operation and maintenance management and control system applied to a data center.
Fig. 8 is a schematic structural diagram of a fault information sending module in an operation and maintenance management and control system applied to a data center.
Fig. 9 is a schematic structural diagram of an operation and maintenance behavior monitoring module in an operation and maintenance management and control system applied to a data center.
Fig. 10 is a schematic structural diagram of an operation and maintenance operation development module in an operation and maintenance management and control system applied to a data center.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more clear, the present invention is further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Specific implementations of the present invention are described in detail below with reference to specific embodiments.
As shown in fig. 1, an embodiment of the present invention provides an operation and maintenance management and control method applied to a data center, where the method includes the following steps:
S100, receiving data center fault information, wherein the data center fault information comprises data equipment information and a fault code;
S200, determining, according to the data center fault information, whether an automatic fault repair program exists, and if so, performing automatic operation and maintenance using the automatic fault repair program; if not, executing the next step;
S300, sending the data center fault information to a manual operation and maintenance center;
S400, monitoring the data center room in real time, and when it is detected that operation and maintenance personnel enter the data center room, determining the identity information of the operation and maintenance personnel and the data equipment requiring operation and maintenance, so that the data equipment starts an operation and maintenance operation recording function to obtain operation and maintenance operation information;
S500, detecting the behavior of the operation and maintenance personnel, and generating warning information when it is detected that the operation and maintenance personnel are operating data equipment that does not require operation and maintenance.
It should be noted that, when an abnormality occurs in the data center, an engineer is often required to go to the data center to handle it. The whole operation and maintenance process is dominated by the engineer, so subjective human factors play a large role, secrets may be leaked during operation and maintenance, and the engineer's work is not supervised. Embodiments of the present invention aim to solve these problems.
In the embodiment of the invention, when data center fault information is received, whether an automatic fault repair program exists is determined according to the specific content of the data center fault information. When such a program exists, it is used for automatic operation and maintenance. It is easy to understand that some simple, common faults can be repaired automatically by a preset repair program, which improves the efficiency of operation and maintenance management and control and reduces labor cost. When no corresponding automatic fault repair program exists, the embodiment of the invention sends the data center fault information to a manual operation and maintenance center, and the manual operation and maintenance center arranges for operation and maintenance personnel to go to the data center to handle the fault. In addition, the embodiment of the invention detects the behavior of the operation and maintenance personnel: when the personnel are detected operating data equipment that does not require operation and maintenance, this indicates improper operation, and warning information is generated, so that the operation and maintenance personnel are supervised.
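The branch between automatic repair and manual escalation (S100 to S300) can be illustrated with a minimal Python sketch. The names `FaultInfo`, `handle_fault`, `repair_library` and `send_to_manual_center` are assumptions made for illustration and do not appear in the patent, which does not prescribe any particular data structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class FaultInfo:
    device_id: str   # data equipment information (hypothetical field name)
    fault_code: str  # fault code reported for that equipment


# Hypothetical repair library: (device_id, fault_code) -> repair routine.
RepairProgram = Callable[[FaultInfo], None]


def handle_fault(
    fault: FaultInfo,
    repair_library: Dict[Tuple[str, str], RepairProgram],
    send_to_manual_center: Callable[[FaultInfo], None],
) -> str:
    """Auto-repair when a program exists (S200), otherwise escalate (S300)."""
    program = repair_library.get((fault.device_id, fault.fault_code))
    if program is not None:
        program(fault)            # automatic operation and maintenance
        return "auto_repaired"
    send_to_manual_center(fault)  # hand over to the manual O&M center
    return "escalated"


# Usage: list.append stands in for a real repair routine and a real sender.
repaired = []
escalated = []
library = {("server-01", "E101"): repaired.append}
result_known = handle_fault(FaultInfo("server-01", "E101"), library, escalated.append)
result_unknown = handle_fault(FaultInfo("server-01", "E999"), library, escalated.append)
```

In this sketch a fault with a known code is repaired automatically, while an unknown code is forwarded to the manual center.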
As shown in fig. 2, as a preferred embodiment of the present invention, the step of determining whether there is an automatic failure recovery procedure according to the data center failure information specifically includes:
S201, inputting the data center fault information into a repair program library for matching, wherein the repair program library comprises all data equipment information, each piece of data equipment information corresponds to a plurality of fault codes, and each fault code corresponds to an automatic fault repair program;
S202, when the fault code in the data center fault information is successfully matched with a fault code in the repair program library, judging that an automatic fault repair program exists; otherwise, judging that no automatic fault repair program exists.
In the embodiment of the invention, a repair program library is established in advance. The repair program library comprises all data equipment information, each piece of data equipment information corresponds to a plurality of fault codes, and each fault code corresponds to an automatic fault repair program. The data center fault information is input into the repair program library for matching. When both the data equipment information and the fault code in the data center fault information are consistent with data equipment information and a fault code in the repair program library, the matching is regarded as successful, it is judged that an automatic fault repair program exists, and that program is called directly for automatic operation and maintenance.
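One way to picture the repair program library is as a two-level mapping from data equipment information to fault codes to repair routines. The sketch below is only an illustration under that assumption; the type alias `RepairLibrary`, the function `find_repair_program` and the sample entries are hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical layout: data equipment information -> fault code -> repair routine.
RepairLibrary = Dict[str, Dict[str, Callable[[], str]]]


def find_repair_program(
    library: RepairLibrary, device_info: str, fault_code: str
) -> Optional[Callable[[], str]]:
    """Return a repair routine only if device info and fault code both match."""
    programs_for_device = library.get(device_info)
    if programs_for_device is None:
        return None  # unknown equipment: no automatic repair program
    return programs_for_device.get(fault_code)


def restart_nic() -> str:
    return "nic restarted"


library: RepairLibrary = {
    "switch-A": {"F01": restart_nic},
    "storage-B": {"F07": lambda: "cache flushed"},
}

match = find_repair_program(library, "switch-A", "F01")
no_match = find_repair_program(library, "storage-B", "F01")
```

Here `F01` is matched for `switch-A` but not for `storage-B`, reflecting that the equipment information and the fault code must agree together.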
As shown in fig. 3, as a preferred embodiment of the present invention, the step of sending the data center fault information to the manual operation and maintenance center specifically includes:
S301, sending the data center fault information to a manual operation and maintenance center;
S302, assigning the operation and maintenance task to the terminal of the operation and maintenance personnel according to the fault code in the data center fault information;
S303, sending the identity information of the corresponding operation and maintenance personnel to the access control system of the data center room, so that the operation and maintenance personnel can enter the data center room.
In the embodiment of the invention, after the data center fault information is sent to the manual operation and maintenance center, the center assigns the operation and maintenance task to the terminal of the corresponding operation and maintenance personnel according to the fault code in the data center fault information. It is easy to understand that different operation and maintenance personnel are generally responsible for different fault codes. The identity information of the corresponding personnel must also be sent to the access control system of the data center room, so that the access control system grants them temporary authority to enter the data center room and carry out the operation and maintenance work. That is, in the embodiment of the present invention, even dedicated operation and maintenance personnel cannot enter the data center room at will, which further ensures the security of the data center room.
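Steps S302 and S303 can be sketched as follows. The `AccessControl` class is a stand-in for the room's access control system, and the mapping `responsible` from fault code to personnel is an assumption; the patent does not state how the manual center chooses the person.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AccessControl:
    """Stand-in for the data center room's access control system (hypothetical)."""
    authorized: Set[str] = field(default_factory=set)

    def grant_temporary(self, staff_id: str) -> None:
        self.authorized.add(staff_id)

    def may_enter(self, staff_id: str) -> bool:
        return staff_id in self.authorized


def dispatch_task(
    fault_code: str,
    device_id: str,
    responsible: Dict[str, str],  # fault code -> staff id (assumed mapping)
    tasks: Dict[str, str],        # staff id -> equipment requiring O&M
    access: AccessControl,
) -> str:
    staff_id = responsible[fault_code]  # S302: pick personnel by fault code
    tasks[staff_id] = device_id         # record the assigned task
    access.grant_temporary(staff_id)    # S303: authorize entry to the room
    return staff_id


access = AccessControl()
tasks: Dict[str, str] = {}
assigned = dispatch_task("E999", "rack-3/server-07", {"E999": "alice"}, tasks, access)
```

Only the assigned person gains entry in this sketch; anyone without a current task is still refused by `may_enter`.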
As shown in fig. 4, as a preferred embodiment of the present invention, the step of detecting the behavior of the operation and maintenance personnel specifically includes:
S501, analyzing the monitoring image of the data center room to detect whether the stay time of the operation and maintenance personnel in a certain area exceeds a set time length;
S502, when the stay time exceeds the set time length, determining a distance value between the area and the data equipment requiring operation and maintenance, and judging that the operation and maintenance personnel are operating data equipment that does not require operation and maintenance when the distance value is greater than a set distance.
In the embodiment of the invention, each operation and maintenance task contains data equipment information, so the operation and maintenance personnel can go directly to the corresponding data equipment to carry out the work. If it is detected that the stay time of the operation and maintenance personnel in a certain area exceeds the set time length, which is a fixed value set in advance, this indicates that the personnel are carrying out some activity in that area. A distance value between the area and the data equipment requiring operation and maintenance is then determined. When the distance value is greater than the set distance, which is also a fixed value set in advance, the activity is not operation and maintenance work on the data equipment requiring it, the personnel are likely to be performing improper work, and warning information is generated.
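The two-threshold check of S501 and S502 can be written as a small predicate. This is a sketch under simplifying assumptions: positions are 2D floor coordinates in meters, the area is represented by its center point, and the image analysis that produces the stay time is outside its scope. The function name and parameter names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # floor coordinates in meters (assumed representation)


def is_improper_operation(
    dwell_seconds: float,
    area_center: Point,
    target_device: Point,
    max_dwell_seconds: float,
    max_distance_m: float,
) -> bool:
    """Apply S501 (stay time) and then S502 (distance to the target equipment)."""
    if dwell_seconds <= max_dwell_seconds:
        return False  # S501: the person has not lingered long enough to matter
    distance = math.dist(area_center, target_device)
    return distance > max_distance_m  # S502: lingering far from the target


far_and_long = is_improper_operation(300, (0.0, 0.0), (3.0, 4.0), 120, 2.0)
near_and_long = is_improper_operation(300, (0.0, 0.0), (1.0, 0.0), 120, 2.0)
far_but_brief = is_improper_operation(60, (0.0, 0.0), (3.0, 4.0), 120, 2.0)
```

A warning would be generated only in the first case, where the person stays beyond the set time length at a point 5 m from the equipment with a set distance of 2 m.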
As shown in fig. 5, as a preferred embodiment of the present invention, the method further includes:
S601, grouping all recorded operation and maintenance operation information to obtain a plurality of operation and maintenance operation groups, wherein the similarity of any two pieces of operation and maintenance operation information in each group is greater than a set similarity value;
S602, determining the quantity of operation and maintenance operation information in each operation and maintenance operation group, and marking the group as a repair program development group when the quantity is greater than a set quantity value;
S603, sending the repair program development group to a program development center.
In the embodiment of the invention, the operation and maintenance operation information of the operation and maintenance personnel is further processed. Specifically, all recorded operation and maintenance operation information is grouped to obtain a plurality of operation and maintenance operation groups, wherein the similarity of any two pieces of operation and maintenance operation information in each group is greater than a set similarity value, which is a fixed value set in advance, so all the operation and maintenance operation information in a group is substantially the same. The quantity of operation and maintenance operation information in each group is then determined. When the quantity is greater than a set quantity value, the operation is used at high frequency and is worth developing into an automatic fault repair program, so the group is marked as a repair program development group. Finally, all repair program development groups are sent to a program development center, where engineers develop the programs, so that the repair program library is continuously updated.
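The grouping in S601 and the count threshold in S602 can be sketched as below. The patent does not name a similarity measure or a grouping algorithm, so this example assumes Jaccard similarity over sets of operation steps and a simple greedy pass that admits a record to a group only if it is similar enough to every existing member. All names and sample records are hypothetical.

```python
from typing import FrozenSet, List

Record = FrozenSet[str]  # one O&M session as a set of operation steps (assumed)


def jaccard(a: Record, b: Record) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_records(records: List[Record], min_similarity: float) -> List[List[Record]]:
    """Greedy grouping: every pair inside a group exceeds min_similarity (S601)."""
    groups: List[List[Record]] = []
    for record in records:
        for group in groups:
            if all(jaccard(record, member) > min_similarity for member in group):
                group.append(record)
                break
        else:
            groups.append([record])  # no compatible group: start a new one
    return groups


def development_candidates(groups: List[List[Record]], min_count: int) -> List[List[Record]]:
    """Keep the groups whose size exceeds the set quantity value (S602)."""
    return [group for group in groups if len(group) > min_count]


restart = frozenset({"stop service", "clear cache", "start service"})
restart_with_check = restart | {"check log"}
disk_swap = frozenset({"replace disk", "rebuild raid"})

groups = group_records([restart, restart_with_check, restart, disk_swap], 0.7)
candidates = development_candidates(groups, 2)
```

With a similarity value of 0.7 and a quantity value of 2, the three restart-style records form one group that qualifies for program development, while the single disk replacement record does not.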
As shown in fig. 6, an embodiment of the present invention further provides an operation and maintenance management and control system applied to a data center, where the system includes:
the fault information receiving module 100 is used for receiving data center fault information, wherein the data center fault information comprises data equipment information and a fault code;
the program automatic operation and maintenance module 200 is used for determining, according to the data center fault information, whether an automatic fault repair program exists, and when the automatic fault repair program exists, performing automatic operation and maintenance using the automatic fault repair program; when no automatic fault repair program exists, the steps in the fault information sending module 300 are executed;
the fault information sending module 300 is used for sending the fault information of the data center to the manual operation and maintenance center;
the operation and maintenance operation recording module 400 is configured to monitor the data center room in real time, and when it is detected that an operation and maintenance person enters the data center room, determine identity information of the operation and maintenance person and a data device that needs operation and maintenance, so that the data device starts an operation and maintenance operation recording function to obtain operation and maintenance operation information;
and the operation and maintenance behavior monitoring module 500 is used for detecting the behavior of the operation and maintenance personnel and generating warning information when detecting that the operation and maintenance personnel operate the data equipment which does not need operation and maintenance.
In the embodiment of the invention, when data center fault information is received, whether an automatic fault repair program exists is determined according to the specific content of the data center fault information. When such a program exists, it is used for automatic operation and maintenance. It is easy to understand that some simple, common faults can be repaired automatically by a preset repair program, which improves the efficiency of operation and maintenance management and control and reduces labor cost. When no corresponding automatic fault repair program exists, the data center fault information is sent to the manual operation and maintenance center, which arranges for operation and maintenance personnel to handle the fault. The embodiment of the invention also monitors the data center room in real time through monitoring equipment. When it is detected that operation and maintenance personnel enter the data center room, the identity information of the personnel and the data equipment requiring operation and maintenance are determined. For example, face recognition is performed when a person enters the data center room to determine the identity information of the operation and maintenance personnel, and the data equipment requiring operation and maintenance is then determined by combining the operation and maintenance task that the manual operation and maintenance center assigned to that person. That data equipment is remotely controlled so that it starts the operation and maintenance operation recording function, and the operation and maintenance operation information is obtained and stored for convenient tracing. In addition, the embodiment of the invention detects the behavior of the operation and maintenance personnel: when the personnel are detected operating data equipment that does not require operation and maintenance, this indicates improper operation, and warning information is generated, so that the operation and maintenance personnel are supervised.
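How module 400 might link a recognized person to the equipment that should start recording can be sketched as follows. Face recognition and remote control are outside the scope of this sketch: `staff_id` is assumed to be the output of an identification step, and `Device`, `on_person_entered` and the `tasks` mapping are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Device:
    device_id: str
    recording: bool = False
    log: List[str] = field(default_factory=list)

    def record(self, operation: str) -> None:
        if self.recording:  # operations are kept only while recording is on
            self.log.append(operation)


def on_person_entered(
    staff_id: str,
    tasks: Dict[str, str],       # staff id -> equipment assigned by the manual center
    devices: Dict[str, Device],
) -> Optional[Device]:
    """Look up the person's task and switch on recording for that equipment."""
    device_id = tasks.get(staff_id)
    if device_id is None:
        return None              # no task on file for this person
    device = devices[device_id]
    device.recording = True
    return device


devices = {"server-07": Device("server-07"), "server-08": Device("server-08")}
target = on_person_entered("alice", {"alice": "server-07"}, devices)
devices["server-07"].record("replace power supply")
devices["server-08"].record("unplug cable")  # not recording, so nothing is logged
```

The stored log of the target equipment is what later makes the operation and maintenance work traceable.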
As shown in fig. 7, as a preferred embodiment of the present invention, the program automatic operation and maintenance module 200 includes:
a repair program matching unit 201, configured to input data center fault information into a repair program library for matching, where the repair program library includes all data device information, each data device information corresponds to a plurality of fault codes, and each fault code corresponds to an automatic fault repair program;
a repair program determining unit 202, configured to determine that an automatic repair program for the fault exists when the fault code in the data center fault information is successfully matched with the fault code in the repair program library; otherwise, judging that no fault automatic repair program exists.
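The matching performed by units 201 and 202 can be sketched as a keyed lookup. This is a minimal illustration only: the patent does not specify a data structure, so the dictionary layout, the device names, the fault codes, and the function names below are all assumptions.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical types: the patent names neither a data structure nor an API.
RepairProgram = Callable[[], str]
RepairLibrary = Dict[Tuple[str, str], RepairProgram]  # (device info, fault code) -> program


def restart_service() -> str:
    """Placeholder repair program; a real one would act on the device."""
    return "service restarted"


# Illustrative library: each device has several fault codes,
# and each fault code maps to one automatic repair program.
REPAIR_LIBRARY: RepairLibrary = {
    ("server-01", "E100"): restart_service,
    ("server-01", "E200"): lambda: "cache cleared",
}


def match_repair_program(device: str, fault_code: str,
                         library: RepairLibrary = REPAIR_LIBRARY) -> Optional[RepairProgram]:
    """Role of unit 201: look the fault up in the repair program library."""
    return library.get((device, fault_code))


def handle_fault(device: str, fault_code: str) -> str:
    """Role of unit 202: run the matched program, or fall back to manual handling."""
    program = match_repair_program(device, fault_code)
    if program is None:
        # No match: no automatic repair program exists for this fault.
        return "forward to manual operation and maintenance center"
    return program()
```

Under these assumptions, `handle_fault("server-01", "E100")` runs the preset program, while an unknown fault code falls through to the manual path.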
As shown in fig. 8, as a preferred embodiment of the present invention, the fault information sending module 300 includes:
the fault information sending unit 301 is configured to send data center fault information to the manual operation and maintenance center;
the operation and maintenance task dispatching unit 302 is used for dispatching the operation and maintenance task to the operation and maintenance personnel terminal according to the fault code in the data center fault information;
and the access control system authorization unit 303 is configured to send the identity information of the corresponding operation and maintenance personnel to the access control system of the data center room, so that the operation and maintenance personnel can enter the data center room.
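The three units of the fault information sending module can be sketched as one function that records the fault, assigns a task, and authorizes room access. The classes, the fault-code-to-person mapping, and the `default_person` fallback are hypothetical; the patent does not say how personnel are chosen for a fault code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ManualCenter:
    """Hypothetical stand-in for the manual operation and maintenance center."""
    assignments: Dict[str, str]          # assumed mapping: fault code -> person
    default_person: str = "on-call"      # assumed fallback when no mapping exists
    received: List[dict] = field(default_factory=list)
    tasks: Dict[str, List[dict]] = field(default_factory=dict)


@dataclass
class AccessControl:
    """Hypothetical stand-in for the access control system of the room."""
    authorized: Set[str] = field(default_factory=set)


def send_fault(center: ManualCenter, access: AccessControl, fault: dict) -> str:
    """Forward a fault, dispatch the task, and authorize the assignee."""
    center.received.append(fault)                                   # unit 301
    person = center.assignments.get(fault["fault_code"], center.default_person)
    center.tasks.setdefault(person, []).append(fault)               # unit 302
    access.authorized.add(person)                                   # unit 303
    return person
```

Authorizing only the assigned person ties room access to a specific task, which is what later lets the monitoring step decide which data equipment that person is expected to touch.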
As shown in fig. 9, as a preferred embodiment of the present invention, the operation and maintenance behavior monitoring module 500 includes:
the monitoring image analysis unit 501 is configured to analyze monitoring images of the data center room and to detect when the stay time of an operation and maintenance person in a certain area exceeds a set duration;
a distance value determining unit 502, configured to determine a distance value between the area and the data device that needs to be operated and maintained, and determine that the operation and maintenance worker is operating the data device that does not need to be operated and maintained when the distance value is greater than a set distance.
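The two-stage check made by units 501 and 502 can be sketched as follows. The coordinate representation, the use of Euclidean distance, and the threshold values are assumptions for illustration; the patent only states that a set duration and a set distance exist.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # assumed 2D floor coordinates, in metres


def is_improper_operation(area_center: Point,
                          device_position: Point,
                          dwell_seconds: float,
                          max_dwell_seconds: float = 60.0,
                          max_distance_m: float = 2.0) -> bool:
    """Return True when a person lingers too long, too far from the target device.

    Stage 1 (unit 501): the dwell time in the area must exceed the set duration.
    Stage 2 (unit 502): the area must lie farther from the data equipment
    needing operation and maintenance than the set distance.
    """
    if dwell_seconds <= max_dwell_seconds:
        return False
    return math.dist(area_center, device_position) > max_distance_m
```

A `True` result corresponds to the condition under which warning information is generated.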
As shown in fig. 10, as a preferred embodiment of the present invention, the system further includes an operation and maintenance development module 600, where the operation and maintenance development module 600 specifically includes:
the operation and maintenance operation grouping unit 601 is configured to group all recorded operation and maintenance operation information to obtain a plurality of operation and maintenance operation groups, and the similarity between any two operation and maintenance operation information in each operation and maintenance operation group is greater than a set similarity value;
a repair program development group unit 602, configured to determine the number of operation and maintenance operation information in each operation and maintenance operation group, and when the number of operation and maintenance operation information is greater than a set number value, mark the operation and maintenance operation group as a repair program development group;
a program development group sending unit 603, configured to send the repair program development group to the program development center.
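The grouping and marking performed by units 601 and 602 can be sketched with a greedy pass that only admits a record into a group when it is similar enough to every existing member, which preserves the pairwise similarity condition. The patent does not name a similarity measure or a grouping algorithm; Jaccard similarity over operation steps and the greedy strategy are assumptions, as are the threshold values.

```python
from typing import List, Sequence

Record = Sequence[str]  # assumed form: one record is a sequence of operation steps


def jaccard(a: Record, b: Record) -> float:
    """Jaccard similarity of the step sets of two records."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def group_operations(records: List[Record],
                     min_similarity: float = 0.5) -> List[List[Record]]:
    """Unit 601: greedy grouping; every pair inside a group exceeds the threshold."""
    groups: List[List[Record]] = []
    for record in records:
        for group in groups:
            if all(jaccard(record, member) > min_similarity for member in group):
                group.append(record)
                break
        else:
            # No existing group accepts the record, so it starts a new one.
            groups.append([record])
    return groups


def development_groups(groups: List[List[Record]],
                       min_count: int = 2) -> List[List[Record]]:
    """Unit 602: keep groups whose record count exceeds the set number value."""
    return [group for group in groups if len(group) > min_count]
```

Groups returned by `development_groups` are the ones that would be sent to the program development center, since a frequently repeated manual procedure is a natural candidate for a new automatic repair program.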
The present invention has been described in detail with reference to the preferred embodiments thereof, and it should be understood that the invention is not limited thereto, but is intended to cover modifications, equivalents, and improvements within the spirit and scope of the present invention.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the sequence indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program, which may be stored in a non-volatile computer-readable storage medium and which, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. The operation and maintenance management and control method applied to the data center is characterized by comprising the following steps:
receiving data center fault information, wherein the data center fault information comprises data equipment information and fault codes;
determining whether a fault automatic repair program exists according to the fault information of the data center, and if so, automatically operating and maintaining by using the fault automatic repair program; when not present, executing the next step;
sending the fault information of the data center to a manual operation and maintenance center;
monitoring the data center room in real time, and when it is detected that operation and maintenance personnel enter the data center room, determining the identity information of the operation and maintenance personnel and the data equipment needing operation and maintenance, so that the data equipment starts an operation and maintenance operation recording function to obtain operation and maintenance operation information;
and detecting the behavior of the operation and maintenance personnel, and generating warning information when detecting that the operation and maintenance personnel operate the data equipment which does not need operation and maintenance.
2. The operation and maintenance management and control method applied to the data center according to claim 1, wherein the step of determining whether the automatic fault repairing program exists according to the fault information of the data center specifically comprises:
inputting data center fault information into a repair program library for matching, wherein the repair program library comprises all data equipment information, each data equipment information corresponds to a plurality of fault codes, and each fault code corresponds to a fault automatic repair program;
when the fault codes in the data center fault information are successfully matched with the fault codes in the repair program library, judging that an automatic fault repair program exists; otherwise, judging that no fault automatic repair program exists.
3. The operation and maintenance management and control method applied to the data center according to claim 1, wherein the step of sending the fault information of the data center to the manual operation and maintenance center specifically comprises:
sending the fault information of the data center to a manual operation and maintenance center;
distributing the operation and maintenance tasks to the operation and maintenance personnel terminal according to the fault codes in the fault information of the data center;
and sending the identity information of the corresponding operation and maintenance personnel to an access control system of the data center room, so that the operation and maintenance personnel can enter the data center room.
4. The operation and maintenance management and control method applied to the data center according to claim 1, wherein the step of detecting the behavior of the operation and maintenance personnel specifically comprises:
analyzing the monitoring image of the data center room, and detecting when the stay time of operation and maintenance personnel in a certain area exceeds a set duration;
and determining a distance value between the area and the data equipment needing operation and maintenance, and judging that the operation and maintenance personnel are operating the data equipment not needing operation and maintenance when the distance value is greater than a set distance.
5. The operation and maintenance management and control method applied to the data center according to claim 1, wherein the method further comprises:
grouping all the recorded operation and maintenance operation information to obtain a plurality of operation and maintenance operation groups, wherein the similarity of any two operation and maintenance operation information in each operation and maintenance operation group is greater than a set similarity value;
judging the quantity of the operation and maintenance operation information in each operation and maintenance operation group, and marking the operation and maintenance operation group as a repair program development group when the quantity of the operation and maintenance operation information is greater than a set quantity value;
and sending the repair program development group to a program development center.
6. Operation and maintenance management and control system applied to data center, characterized in that the system includes:
the data center fault information receiving module is used for receiving data center fault information, and the data center fault information comprises data equipment information and fault codes;
the program automatic operation and maintenance module is used for determining whether a fault automatic repair program exists according to the fault information of the data center; when the fault automatic repair program exists, automatic operation and maintenance is carried out by using it; when no fault automatic repair program exists, the steps in the fault information sending module are executed;
the fault information sending module is used for sending the fault information of the data center to the manual operation and maintenance center;
the operation and maintenance operation recording module is used for monitoring the data center room in real time, and when detecting that an operation and maintenance person enters the data center room, the operation and maintenance person identity information and the data equipment needing operation and maintenance are determined, so that the data equipment starts the operation and maintenance operation recording function to obtain the operation and maintenance operation information;
and the operation and maintenance behavior monitoring module is used for detecting the behavior of the operation and maintenance personnel and generating warning information when detecting that the operation and maintenance personnel operate the data equipment which does not need operation and maintenance.
7. The operation and maintenance management and control system applied to the data center according to claim 6, wherein the program automatic operation and maintenance module comprises:
the repair program matching unit is used for inputting data center fault information into a repair program library for matching, wherein the repair program library comprises all data equipment information, each data equipment information corresponds to a plurality of fault codes, and each fault code corresponds to a fault automatic repair program;
the repair program judging unit is used for judging that the fault automatic repair program exists when the fault code in the data center fault information is successfully matched with the fault code in the repair program library; otherwise, judging that no fault automatic repair program exists.
8. The operation and maintenance management and control system applied to the data center according to claim 6, wherein the fault information sending module comprises:
the fault information sending unit is used for sending the fault information of the data center to the manual operation and maintenance center;
the operation and maintenance task dispatching unit is used for dispatching the operation and maintenance tasks to the operation and maintenance personnel terminal according to the fault codes in the fault information of the data center;
and the access control system authorization unit is used for sending the identity information of the corresponding operation and maintenance personnel to the access control system of the data center room, so that the operation and maintenance personnel can enter the data center room.
9. The operation and maintenance management and control system applied to the data center according to claim 6, wherein the operation and maintenance behavior monitoring module comprises:
the monitoring image analysis unit is used for analyzing the monitoring image of the data center room and detecting when the stay time of the operation and maintenance personnel in a certain area exceeds a set duration;
and the distance value judging unit is used for determining the distance value between the area and the data equipment needing operation and maintenance, and when the distance value is greater than the set distance, the operation and maintenance personnel is judged to be operating the data equipment not needing operation and maintenance.
10. The operation and maintenance management and control system applied to the data center according to claim 6, wherein the system further comprises an operation and maintenance operation development module, and the operation and maintenance operation development module specifically comprises:
the operation and maintenance operation grouping unit is used for grouping all the recorded operation and maintenance operation information to obtain a plurality of operation and maintenance operation groups, and the similarity of any two operation and maintenance operation information in each operation and maintenance operation group is greater than a set similarity value;
the repair program development group unit is used for judging the quantity of the operation and maintenance operation information in each operation and maintenance operation group, and when the quantity of the operation and maintenance operation information is larger than a set quantity value, the operation and maintenance operation group is marked as a repair program development group;
and the program development group sending unit is used for sending the repair program development group to the program development center.
CN202211543582.9A 2022-12-03 2022-12-03 Operation and maintenance control method and system applied to data center Withdrawn CN115827038A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211543582.9A CN115827038A (en) 2022-12-03 2022-12-03 Operation and maintenance control method and system applied to data center

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211543582.9A CN115827038A (en) 2022-12-03 2022-12-03 Operation and maintenance control method and system applied to data center

Publications (1)

Publication Number Publication Date
CN115827038A true CN115827038A (en) 2023-03-21

Family

ID=85543897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211543582.9A Withdrawn CN115827038A (en) 2022-12-03 2022-12-03 Operation and maintenance control method and system applied to data center

Country Status (1)

Country Link
CN (1) CN115827038A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116743603A (en) * 2023-08-16 2023-09-12 广州海晟科技有限公司 Safe operation and maintenance method and system for private cloud platform information system
CN116743603B (en) * 2023-08-16 2023-10-20 广州海晟科技有限公司 Safe operation and maintenance method and system for private cloud platform information system

Similar Documents

Publication Publication Date Title
CN109669844B (en) Equipment fault processing method, device, equipment and storage medium
CN116009480B (en) Fault monitoring method, device and equipment of numerical control machine tool and storage medium
CN115827038A (en) Operation and maintenance control method and system applied to data center
CN109656205B (en) Defective product control method and device, electronic device and readable storage medium
CN110378273B (en) Method and device for monitoring operation flow
CN115776438B (en) Industrial control data transmission method and system
CN110597196A (en) Data acquisition system and data acquisition method
CN109059198A (en) Equipment automatic engineering adjustment method, device, system and computer equipment
CN111177488B (en) Metering equipment maintenance processing method and device, computer equipment and storage medium
CN113821242A (en) Intelligent firmware matching method and system
CN111815262B (en) Method for auditing and managing operation in dangerous source
CN113206823A (en) Industrial information safety monitoring method and device, computer equipment and storage medium
CN115953880B (en) Monitoring and early warning system and method for citric acid production
CN103825758A (en) Fault processing method for electric power communication network operation monitoring system
CN113533887B (en) Intelligent debugging method and system for power distribution terminal
CN115202962A (en) Equipment fault rapid diagnosis method and system based on industrial internet platform
CN114219035A (en) Multi-sensor data fusion method and device
CN110806729B (en) Method and system for controlling startup, shutdown, power supply and power failure of production line
CN111141981B (en) Line loss point inspection method and device, computer equipment and storage medium
CN114358627A (en) Electric power operation and maintenance management system and operation and maintenance method thereof
CN113778552A (en) Monitoring parameter modification method and device and computer equipment
CN113886262A (en) Software automation test method and device, computer equipment and storage medium
CN114219434A (en) Construction engineering wisdom supervisory systems
CN112737120A (en) Generation method and device of regional power grid control report and computer equipment
CN117979254A (en) Internet of things networking method and system of gas detector

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20230321