CN117076253B - Multi-dimensional intelligent operation and maintenance system for data center service and facilities - Google Patents

Multi-dimensional intelligent operation and maintenance system for data center service and facilities

Info

Publication number
CN117076253B
Authority
CN
China
Prior art keywords
data
node
module
maintenance
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311109508.0A
Other languages
Chinese (zh)
Other versions
CN117076253A (en)
Inventor
熊芳
张涛
苏镇生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Yiyun Information Technology Co ltd
Original Assignee
Guangzhou Yiyun Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Yiyun Information Technology Co ltd filed Critical Guangzhou Yiyun Information Technology Co ltd
Priority to CN202311109508.0A priority Critical patent/CN117076253B/en
Publication of CN117076253A publication Critical patent/CN117076253A/en
Application granted granted Critical
Publication of CN117076253B publication Critical patent/CN117076253B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3089Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3058Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/20Administration of product repair or maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Mathematical Physics (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of data processing, in particular to a multi-dimensional intelligent operation and maintenance system of a data center service and facilities.

Description

Multi-dimensional intelligent operation and maintenance system for data center service and facilities
Technical Field
The invention relates to the technical field of data processing, in particular to a multidimensional intelligent operation and maintenance system for data center business and facilities.
Background
Data center infrastructure management combines information technology (IT) with equipment management to centrally manage the key equipment of a data center, for example through centralized monitoring and capacity planning. Accurate control of the production process is achieved through software, hardware, sensors and the like.
Chinese patent publication No. CN109492044A discloses an operation and maintenance monitoring system for a large data center, comprising a data acquisition module, a unified management module, a data processing module and a service application module. The data acquisition module acquires, aggregates and stores data information, which comprises state information, operation and maintenance information and processing result information. The unified management module provides unified management of the data information and of the equipment and facilities, and exposes services externally through a WebService interface; the equipment and facilities comprise hardware equipment and middleware software. The data processing module extracts, aggregates and analyzes the data information and returns a processing result. The service application module presents the data information and calls the interface provided by the unified management module to control equipment and support the management of operation and maintenance services. That publication provides an operation and maintenance monitoring system with comprehensive service coverage and a complete technical system. It can thus be seen that the prior art suffers from low operation and maintenance efficiency caused by insufficient control precision over node facilities when managing data center infrastructure.
Disclosure of Invention
Therefore, the invention provides a multi-dimensional intelligent operation and maintenance system for data center services and facilities, which is used for solving the problem of low operation and maintenance efficiency caused by insufficient control precision of node facilities in the management process of data center infrastructure in the prior art.
In order to achieve the above object, the present invention provides a multidimensional intelligent operation and maintenance system for data center service and facilities, comprising:
A plurality of node data processors are arranged on a plurality of node facilities of a data center, and each comprises a data acquisition module for acquiring node monitoring data from a sensor of a single node facility of the data center, a data processing module for processing the node monitoring data, a fault judgment module for judging faults of the single node according to the node monitoring data, and a data sending module for sending the node monitoring data to the cloud server;
The cloud server is connected with each node data processor and comprises a data receiving module for receiving the node monitoring data sent by the data sending module, a data analysis module for analyzing the node monitoring data and a data transmission module for respectively transmitting data to a database and an operation and maintenance center according to the analysis result of the data analysis module;
the database is connected with the cloud server and used for storing the node monitoring data transmitted by the data analysis module;
and the operation and maintenance center is connected with the cloud server and used for receiving the analysis result of the data analysis module so as to operate and maintain the data center.
Further, within a preset monitoring period, the node data processor determines the processing mode in which the data processing module processes the monitoring data of a node facility, according to the result of comparing the importance index of that node facility with a preset importance index. The processing modes comprise a first processing mode, in which the data processing module performs cluster analysis on the monitoring data to remove repeated data so that the data sending module sends the remaining data to the data center, and a second processing mode, in which the data sending module sends the monitoring data directly to the data center;
the preset monitoring period is the period in which the node facility completes execution of an instruction of the data center.
Further, under a preset condition, the data analysis module counts the number of nodes, among the associated node facilities, whose monitoring data does not correspond to the standard data, and compares that number with a preset number of nodes to determine the operation and maintenance mode of the data center. The operation and maintenance modes comprise a first operation and maintenance mode, used when the number of nodes is less than or equal to the preset number of nodes, and a second operation and maintenance mode, used when the number of nodes is greater than the preset number of nodes;
the preset condition is that the data sending module sends the monitoring data to the data analysis module.
Further, in the first operation and maintenance mode, the data analysis module compares the monitoring data of the single node with historical fault data to determine whether the monitoring data is fault data; if so, the data transmission module transmits the fault data of that node, as a fault node, to the operation and maintenance center for operation and maintenance.
Further, the data analysis module analyzes a plurality of fault nodes by taking the node farthest from the data center as an initial analysis node in a second operation and maintenance mode, and determines the data quantity of fault data of the plurality of fault nodes so as to determine the urgency of operation and maintenance according to the comparison result of the data quantity of the fault data and the data quantity of preset fault data.
Further, in the corresponding processing mode, the data analysis module determines whether the data transmission path has an error according to the result of comparing the real-time data transmission amount of the data transmission module with a preset data transmission amount. If the data analysis module determines that the data transmission path has an error, the data is acquired again for fault analysis; if it determines that the data transmission path has no error, the data transmission module generates fault data and transmits the fault data to the operation and maintenance center.
Further, in the second operation and maintenance mode, the data analysis module obtains the interval duration between the current maintenance time point of the facility and its previous maintenance time point, and determines that the urgency needs to be adjusted if the interval duration is longer than a standard interval duration.
Further, in the corresponding operation and maintenance mode, the data analysis module obtains the average failure rate of a plurality of node facilities from the historical data in the database, and determines to optimize the operation and maintenance mode if the average failure rate is greater than a preset average failure rate.
Further, once it has determined to optimize the operation and maintenance mode, the data analysis module calculates the failure rate difference between the average failure rate and the preset average failure rate, and selects an optimization mode according to the result of comparing the failure rate difference with a preset failure rate difference. The optimization modes comprise a first optimization mode, which determines the proportion of data rejected by the cluster analysis of the monitoring data, and a second optimization mode, which corrects the urgency.
Further, in the second optimization mode, the data analysis module examines the reject data that the data receiving module receives after the cluster analysis by the data processing module and determines the proportion of fault data in the reject data, so as to determine whether to add node facilities according to the result of comparing the fault data proportion with a preset fault data proportion; when node facilities are added, they are added by an initial increase number.
Compared with the prior art, the invention has the following beneficial effects. Monitoring the node facilities allows fault analysis to be performed on the monitoring data of each node facility in the data center, which achieves accurate monitoring of the node facilities. The data processing module analyzes and processes the monitoring data of each node facility to obtain the fault data of the corresponding node facility; the fault data is transmitted to the data analysis module of the cloud server for analysis, and whether to perform operation and maintenance is determined according to the analysis result. This reduces the computational load on the cloud server, improves the control precision over the node facilities, and thereby improves operation and maintenance efficiency.
Further, the node data processor determines the importance index of the corresponding node, and the processing mode applied by the data processing module to the monitoring data of the node facility is determined according to that index. This allows the monitoring data of the node facilities to be processed flexibly, further improving the control precision over the node facilities and the operation and maintenance efficiency.
Further, the operation and maintenance mode of the data center is determined according to the number of nodes whose monitoring data does not correspond to the standard data under the preset condition; in the corresponding operation and maintenance mode, either the fault data is transmitted to the operation and maintenance center or the urgency of the fault operation and maintenance of the node facilities is set. This allows the data center to be operated and maintained flexibly, further improving the control precision over the node facilities and the operation and maintenance efficiency.
Further, the historical monitoring data in the database of the data center is analyzed in the corresponding operation and maintenance mode, and an optimization mode for the operation and maintenance mode is determined according to the failure rate of the historical monitoring data. In the corresponding optimization mode, whether to send the reject data to the operation and maintenance center, or to add node facilities so that the business of the data center is monitored to standard, is determined according to the proportion of fault data in the reject data of the data processing module. This further improves the control precision over the node facilities and the operation and maintenance efficiency.
Drawings
FIG. 1 is a schematic diagram of a data center business and facility multidimensional intelligent operation and maintenance system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a node data processor in a data center service and facility multidimensional intelligent operation and maintenance system according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a cloud server in a data center service and facility multidimensional intelligent operation and maintenance system according to an embodiment of the present invention;
In the figures: 1, node data processor; 2, database; 3, cloud server; 4, operation and maintenance center.
Detailed Description
In order that the objects and advantages of the invention will become more apparent, the invention will be further described with reference to the following examples; it should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, and do not indicate or imply that the apparatus or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may, for example, be fixed, detachable or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Referring to fig. 1-3, fig. 1 is a schematic structural diagram of a multi-dimensional intelligent operation and maintenance system for a data center service and a facility according to an embodiment of the present invention; FIG. 2 is a schematic diagram of a node data processor in a data center service and facility multidimensional intelligent operation and maintenance system according to an embodiment of the present invention; fig. 3 is a schematic structural diagram of a cloud server in a data center service and facility multidimensional intelligent operation and maintenance system according to an embodiment of the present invention.
The embodiment of the invention relates to a multidimensional intelligent operation and maintenance system for data center business and facilities, comprising:
A plurality of node data processors are arranged on a plurality of node facilities of a data center, and each comprises a data acquisition module for acquiring node monitoring data from a sensor of a single node facility of the data center, a data processing module for processing the node monitoring data, a fault judgment module for judging faults of the single node according to the node monitoring data, and a data sending module for sending the node monitoring data to the cloud server;
The cloud server is connected with each node data processor and comprises a data receiving module for receiving the node monitoring data sent by the data sending module, a data analysis module for analyzing the node monitoring data and a data transmission module for respectively transmitting data to a database and an operation and maintenance center according to the analysis result of the data analysis module;
the database is connected with the cloud server and used for storing the node monitoring data transmitted by the data analysis module;
and the operation and maintenance center is connected with the cloud server and used for receiving the analysis result of the data analysis module so as to operate and maintain the data center.
Specifically, the node data processor determines a processing mode of the data processing module for processing the monitoring data of the node facilities according to a comparison result of the importance index F of the node facilities of the data center and the preset importance index F0 in a preset monitoring period;
if F is less than or equal to F0, the data processing module determines to process the monitoring data of the node facilities in a first processing mode;
if F is more than F0, the data processing module determines to process the monitoring data of the node facilities in a second processing mode;
The first processing mode satisfies that the data processing module performs cluster analysis on the monitored data to remove repeated data so that the data sending module sends residual data to the data center, and the second processing mode satisfies that the data sending module directly sends the monitored data to the data center.
Specifically, the data processing module calculates the importance index F of the node facility according to the following formula:
F = W / Wz
Where W is the number of node facilities associated with a single node facility in the data center and Wz is the total number of node facilities in the data center.
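The importance index and the mode selection described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the example threshold F0 = 0.3 used in the usage note, and the use of exact-duplicate removal as a stand-in for cluster analysis are all assumptions, since the source gives neither a value for F0 nor a clustering algorithm.

```python
from typing import Hashable, Iterable, List


def importance_index(associated: int, total: int) -> float:
    """Return F = W / Wz for one node facility."""
    if total <= 0:
        raise ValueError("total number of node facilities must be positive")
    return associated / total


def select_processing_mode(f: float, f0: float) -> int:
    """Return 1 (first processing mode) if F <= F0, otherwise 2."""
    return 1 if f <= f0 else 2


def process(readings: Iterable[Hashable], f: float, f0: float) -> List[Hashable]:
    """Apply the selected processing mode to a batch of readings."""
    if select_processing_mode(f, f0) == 1:
        # Assumption: exact-duplicate removal stands in for cluster analysis.
        # dict.fromkeys keeps the first occurrence of each reading, in order.
        return list(dict.fromkeys(readings))
    # Second mode: forward the monitoring data unchanged.
    return list(readings)
```

For example, a facility associated with 3 of 12 facilities has F = 0.25; with the hypothetical F0 = 0.3 it falls in the first processing mode, so repeated readings are dropped before sending.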
Specifically, the data analysis module counts the number U of nodes, which are not corresponding to the monitoring data and the standard data, among the associated node facilities under the preset condition, and compares the number U of nodes with the preset number U0 of nodes so as to determine the operation and maintenance mode of the data center according to the comparison result;
If U is less than or equal to U0, the data analysis module determines that the operation and maintenance mode is a first operation and maintenance mode;
If U is more than U0, the data analysis module determines that the operation and maintenance mode is a second operation and maintenance mode.
In the embodiment of the present invention, the preset condition is that the data sending module sends the monitoring data to the data analysis module, and the preset number of nodes U0 is 1.
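The selection between the two operation and maintenance modes can be sketched as below. The source does not define what it means for monitoring data to "not correspond" to the standard data, so this sketch assumes a simple inequality test per node; the dictionary layout and function names are likewise hypothetical.

```python
from typing import Dict

PRESET_NODE_COUNT = 1  # U0 in the embodiment


def count_mismatched_nodes(monitoring: Dict[str, float],
                           standard: Dict[str, float]) -> int:
    """Count nodes whose monitoring value differs from the standard value.

    Assumption: "does not correspond" is modelled as plain inequality.
    """
    return sum(1 for node, value in monitoring.items()
               if standard.get(node) != value)


def select_om_mode(u: int, u0: int = PRESET_NODE_COUNT) -> int:
    """Return 1 (first O&M mode) if U <= U0, otherwise 2."""
    return 1 if u <= u0 else 2
```

With U0 = 1, a single deviating node is handled in the first mode, while two or more deviating nodes trigger the second mode.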
Specifically, in the first operation and maintenance mode, the data analysis module compares the monitoring data of a single node with historical fault data to determine whether the monitoring data is fault data; if so, the data transmission module transmits the fault data of that node, as a fault node, to the operation and maintenance center for operation and maintenance.
In the embodiment of the invention, if the monitoring data is the same as the historical fault data, the monitoring data is determined to be the fault data.
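The equality test used in the first operation and maintenance mode can be sketched as a set membership check. The record layout (a metric name paired with a value) is a hypothetical choice for illustration; the source only states that monitoring data identical to historical fault data is treated as fault data.

```python
from typing import FrozenSet, Tuple

# Hypothetical record layout: (metric name, measured value).
Record = Tuple[str, float]


def is_fault_data(record: Record, historical_faults: FrozenSet[Record]) -> bool:
    """Return True if the record is identical to a historical fault record."""
    return record in historical_faults
```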
Specifically, the data analysis module analyzes a plurality of fault nodes by taking a node farthest from the data center as an initial analysis node in a second operation and maintenance mode, and determines the data quantity R of fault data of the plurality of fault nodes so as to determine the urgency of operation and maintenance according to the comparison result of the data quantity R of the fault data and the data quantity R0 of preset fault data;
If R is less than or equal to R0, the data analysis module determines that the urgency is a first urgency;
If R > R0, the data analysis module determines the urgency to be a second urgency.
In the embodiment of the invention, the value of the data quantity R0 of the preset fault data is 5% of the data quantity of the historical fault data, the value of the first urgency is 0.75, and the value of the second urgency is 0.80.
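A minimal sketch of the second operation and maintenance mode follows, using the embodiment's values (R0 at 5% of the historical fault data volume, urgencies 0.75 and 0.80). How distance from the data center is measured and the units of data volume are not given in the source; the (node id, distance) tuples and the function names are assumptions.

```python
from typing import List, Tuple

FIRST_URGENCY = 0.75
SECOND_URGENCY = 0.80
R0_PERCENT = 5  # R0 is 5% of the historical fault data volume


def analysis_order(fault_nodes: List[Tuple[str, float]]) -> List[str]:
    """Order fault nodes farthest-first; each item is (node_id, distance)."""
    ranked = sorted(fault_nodes, key=lambda item: item[1], reverse=True)
    return [node_id for node_id, _ in ranked]


def urgency(fault_volume: float, historical_fault_volume: float) -> float:
    """Return the first urgency if R <= R0, otherwise the second urgency."""
    r0 = historical_fault_volume * R0_PERCENT / 100
    return FIRST_URGENCY if fault_volume <= r0 else SECOND_URGENCY
```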
Specifically, the data analysis module determines whether an error exists in the data transmission path according to a comparison result of the real-time data transmission amount E of the data transmission module and the preset data transmission amount E0 in a corresponding processing mode;
if E is less than E0, the data analysis module determines that the data transmission path has errors, and re-acquires data to perform fault analysis;
If E is more than or equal to E0, the data analysis module determines that the data transmission path has no error, and the data transmission module generates fault data and transmits the fault data to the operation and maintenance center.
In the embodiment of the present invention, the preset data transmission amount E0 is a standard value of the corresponding transmission path, and the standard value is not less than 80% of the total capacity of the data transmission path.
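The transmission path check can be sketched as below. The action labels and function names are hypothetical, and the sketch takes E0 at its lower bound of 80% of path capacity, whereas the embodiment only requires E0 to be no less than that.

```python
def minimum_e0(path_capacity: float) -> float:
    """Lower bound for E0: 80% of the total capacity of the path."""
    return path_capacity * 80 / 100


def path_has_error(e: float, e0: float) -> bool:
    """The path is considered faulty when E < E0."""
    return e < e0


def next_action(e: float, e0: float) -> str:
    """Return a (hypothetical) label for the step that follows the check."""
    return "reacquire_data" if path_has_error(e, e0) else "send_fault_data"
```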
Specifically, in the second operation and maintenance mode, the data analysis module obtains the interval duration T between the current maintenance time point of the facility and its previous maintenance time point, and compares T with the standard interval duration T0 to determine whether to adjust the urgency according to the comparison result;
If T is less than or equal to T0, the data analysis module determines that the urgency is not adjusted;
If T is more than T0, the data analysis module determines that the urgency needs to be adjusted;
under the preset interval duration condition, the data analysis module sets the adjusted urgency to Ji × K, where Ji is the i-th urgency and K is the urgency adjustment coefficient.
In the embodiment of the invention, the standard interval duration is 50% of the maintenance period of the current facility, and the value of the urgency adjusting coefficient is 1.2.
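The urgency adjustment can be illustrated with a short sketch using the embodiment's values (T0 at 50% of the maintenance period, K = 1.2). The function and parameter names are assumptions, and the time unit is left to the caller.

```python
URGENCY_ADJUSTMENT_COEFFICIENT = 1.2  # K in the embodiment


def adjusted_urgency(urgency: float, interval: float,
                     maintenance_period: float,
                     k: float = URGENCY_ADJUSTMENT_COEFFICIENT) -> float:
    """Return Ji * K when T > T0, otherwise return Ji unchanged."""
    t0 = maintenance_period * 50 / 100  # standard interval duration T0
    if interval <= t0:
        return urgency
    return urgency * k
```

With these values, the first urgency 0.75 becomes about 0.90 and the second urgency 0.80 becomes about 0.96 once the interval exceeds half the maintenance period.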
Specifically, the data analysis module obtains the average failure rate G of a plurality of node facilities in the historical data in the database under the corresponding operation and maintenance mode, so as to determine whether to optimize the operation and maintenance mode according to the comparison result of the average failure rate G and the preset average failure rate G0;
if G is less than or equal to G0, the data analysis module determines that the operation and maintenance mode is not optimized;
if G is more than G0, the data analysis module determines to optimize the operation and maintenance mode.
In the embodiment of the invention, the value of the preset average fault rate is 5%.
Specifically, the data analysis module calculates a failure rate difference value delta G of the average failure rate G and a preset average failure rate G0 under the condition of determining the optimization of the operation and maintenance mode, so as to determine the optimization of the operation and maintenance mode according to the comparison result of the failure rate difference value delta G and the preset failure rate difference value delta G0;
If ΔG is less than or equal to ΔG0, the data analysis module determines to optimize the operation and maintenance mode in a first optimization mode;
If ΔG > ΔG0, the data analysis module determines to optimize the operation and maintenance mode in a second optimization mode;
The first optimization mode determines the proportion of data rejected by the cluster analysis of the monitoring data, and the second optimization mode corrects the urgency.
In the embodiment of the invention, the value of the difference value of the preset failure rate is 3 percent.
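The two-step decision (whether to optimize, then which optimization mode) can be sketched as follows with the embodiment's thresholds G0 = 5% and ΔG0 = 3%. Rates are expressed as fractions, and returning None when no optimization is needed is a choice made for this sketch only.

```python
from typing import Optional

PRESET_AVERAGE_FAILURE_RATE = 0.05  # G0
PRESET_FAILURE_RATE_DIFFERENCE = 0.03  # delta G0


def select_optimization_mode(g: float,
                             g0: float = PRESET_AVERAGE_FAILURE_RATE,
                             dg0: float = PRESET_FAILURE_RATE_DIFFERENCE
                             ) -> Optional[int]:
    """Return None if G <= G0; else 1 if (G - G0) <= dG0, otherwise 2."""
    if g <= g0:
        return None  # the operation and maintenance mode is not optimized
    delta_g = g - g0
    return 1 if delta_g <= dg0 else 2
```

An average failure rate of 7% gives ΔG = 2% and selects the first optimization mode; a rate of 10% gives ΔG = 5% and selects the second.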
Specifically, the data analysis module receives the reject data after the cluster analysis of the data processing module in a first optimization mode, and determines a fault data proportion B in the reject data, so as to determine whether to send monitoring data of a corresponding node facility to an operation and maintenance center according to a comparison result of the fault data proportion B and a preset fault data proportion B0;
if B is less than or equal to B0, the data analysis module determines that monitoring data of the corresponding node facilities are not transmitted to the operation and maintenance center;
if B is more than B0, the data analysis module determines to send the monitoring data of the corresponding node facilities to the operation and maintenance center.
In the embodiment of the invention, the preset fault data proportion B0 has a value of 50% of the data quantity of the reject data.
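The first optimization mode can be sketched as a proportion check over the rejected records, with B0 = 50% as in the embodiment. Representing each rejected record by a boolean fault flag is an assumption of this sketch; the source does not say how fault records among the reject data are identified.

```python
from typing import Sequence

PRESET_FAULT_PROPORTION = 0.5  # B0


def fault_proportion(reject_is_fault: Sequence[bool]) -> float:
    """Return B, the share of rejected records that are fault data."""
    if not reject_is_fault:
        return 0.0
    return sum(1 for flag in reject_is_fault if flag) / len(reject_is_fault)


def should_send_to_om_center(b: float, b0: float = PRESET_FAULT_PROPORTION) -> bool:
    """Monitoring data is sent to the operation and maintenance center only if B > B0."""
    return b > b0
```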
Specifically, in the second optimization mode, the data analysis module examines the reject data that the data receiving module receives after the cluster analysis by the data processing module, and determines the fault data proportion B in the reject data, so as to determine whether to add node facilities according to the result of comparing B with the preset fault data proportion B0;
if B is less than or equal to B0, the data analysis module determines that node facilities are not added;
if B > B0, the data analysis module determines to add node facilities, the number added being the initial increase number.
In the embodiment of the invention, the value of the initial increase number is 2% of the total number of the original node facilities.
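The node addition rule can be sketched as below. The source gives the 2% figure but not how a fractional result is handled, so rounding up to a whole facility is an assumption here, as are the function name and the B0 = 0.5 default carried over from the first optimization mode.

```python
import math

INITIAL_INCREASE_PERCENT = 2  # 2% of the original number of node facilities


def nodes_to_add(b: float, total_nodes: int, b0: float = 0.5) -> int:
    """Return how many node facilities to add; 0 when B <= B0."""
    if b <= b0:
        return 0
    # Assumption: round up so that a nonzero percentage adds at least one node.
    return math.ceil(total_nodes * INITIAL_INCREASE_PERCENT / 100)
```

For a data center with 200 node facilities and B above B0, this yields 4 additional facilities.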
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing description is only of the preferred embodiments of the invention and is not intended to limit the invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (3)

1. A data center business and facility multidimensional intelligent operation and maintenance system, comprising:
A plurality of node data processors are arranged on a plurality of node facilities of a data center, and each comprises a data acquisition module for acquiring node monitoring data from a sensor of a single node facility of the data center, a data processing module for processing the node monitoring data, a fault judgment module for judging faults of the single node according to the node monitoring data, and a data sending module for sending the node monitoring data to the cloud server;
The cloud server is connected with each node data processor and comprises a data receiving module for receiving the node monitoring data sent by the data sending module, a data analysis module for analyzing the node monitoring data and a data transmission module for respectively transmitting data to a database and an operation and maintenance center according to the analysis result of the data analysis module;
the database is connected with the cloud server and used for storing the node monitoring data transmitted by the data analysis module;
the operation and maintenance center is connected with the cloud server and used for receiving the analysis result of the data analysis module so as to operate and maintain the data center;
under a preset condition, the data analysis module counts the number of nodes whose monitoring data does not correspond to the standard data and compares that number with a preset number of nodes, so as to determine an operation and maintenance mode of the data center according to the comparison result, wherein the operation and maintenance modes comprise a first operation and maintenance mode used when the number of nodes is less than or equal to the preset number of nodes and a second operation and maintenance mode used when the number of nodes is greater than the preset number of nodes;
the preset condition is that the data sending module sends the monitoring data to the data analysis module;
in the second operation and maintenance mode, the data analysis module analyzes the plurality of fault nodes, taking the node farthest from the data center as the initial analysis node, determines the data volume of the fault data of the plurality of fault nodes, and determines the urgency of operation and maintenance according to the result of comparing the data volume of the fault data with a preset fault data volume;
in a corresponding processing mode, the data analysis module determines whether the data transmission path is in error according to the result of comparing the real-time data transmission volume of the data transmission module with a preset data transmission volume; if the data analysis module determines that the data transmission path is in error, the data is acquired again for fault analysis; if the data analysis module determines that the data transmission path is not in error, the data transmission module generates fault data and transmits the fault data to the operation and maintenance center;
in the second operation and maintenance mode, the data analysis module acquires the interval duration between the current maintenance time point of a facility and its previous maintenance time point, and determines that the urgency needs to be adjusted when the interval duration is longer than a standard interval duration;
in a corresponding operation and maintenance mode, the data analysis module acquires the average failure rate of a plurality of node facilities from the historical data in the database, and determines to optimize the operation and maintenance mode when the average failure rate is greater than a preset average failure rate;
upon determining that the operation and maintenance mode is to be optimized, the data analysis module calculates the failure rate difference between the average failure rate and the preset average failure rate, so as to determine an optimization mode of the operation and maintenance mode according to the result of comparing the failure rate difference with a preset failure rate difference, wherein the optimization modes comprise a first optimization mode concerning the rejection proportion after cluster analysis of the monitoring data and a second optimization mode of correcting the urgency;
in the second optimization mode, the data analysis module calculates the rejected data which is subjected to cluster analysis by the data receiving module and is received by the data processing module, determines the proportion of fault data in the rejected data, and determines to add node facilities according to the result of comparing the proportion of fault data with a preset proportion of fault data, the node facilities being added by the initial increment.
2. The system according to claim 1, wherein, within a preset monitoring period, the node data processor determines a processing mode in which the data processing module processes the monitoring data of the node facility according to the result of comparing an importance index of the node facility with a preset importance index of the data center, the processing modes comprising a first processing mode in which the data processing module performs cluster analysis on the monitoring data to reject repeated data so that the data sending module sends the remaining data to the data center, and a second processing mode in which the data sending module sends the monitoring data directly to the data center;
and the preset monitoring period is the time taken by the node facility to finish executing an instruction of the data center.
3. The system according to claim 2, wherein, in the first operation and maintenance mode, the data analysis module compares the monitoring data of a single node with historical fault data to determine whether the monitoring data is fault data, and if so, the data transmission module treats the node as a fault node and transmits its fault data to the operation and maintenance center for operation and maintenance.
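
The mode selection recited in claim 1 (counting the nodes whose monitoring data does not correspond to the standard data and comparing that count with a preset number of nodes) can be sketched in Python as follows. This is only an illustration under stated assumptions: the claims do not define what "corresponding to the standard data" means, so the sketch assumes one scalar reading per node checked against a per-node allowed range. The names `OMMode`, `count_nonconforming`, and `select_mode` are hypothetical.

```python
from enum import Enum


class OMMode(Enum):
    FIRST = 1   # nonconforming node count <= preset node count
    SECOND = 2  # nonconforming node count > preset node count


def count_nonconforming(readings: dict[str, float],
                        standard: dict[str, tuple[float, float]]) -> int:
    """Count nodes whose reading falls outside its (low, high) range.

    The range check is an assumed stand-in for the comparison with
    standard data, which the claims leave unspecified.
    """
    return sum(
        1
        for node, value in readings.items()
        if not (standard[node][0] <= value <= standard[node][1])
    )


def select_mode(nonconforming: int, preset_count: int) -> OMMode:
    """Pick the operation and maintenance mode from the comparison result."""
    return OMMode.FIRST if nonconforming <= preset_count else OMMode.SECOND


if __name__ == "__main__":
    readings = {"n1": 25.0, "n2": 90.0, "n3": 10.0}
    standard = {node: (18.0, 30.0) for node in readings}
    bad = count_nonconforming(readings, standard)
    print(bad, select_mode(bad, preset_count=1))
```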

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311109508.0A CN117076253B (en) 2023-08-30 2023-08-30 Multi-dimensional intelligent operation and maintenance system for data center service and facilities

Publications (2)

Publication Number Publication Date
CN117076253A (en) 2023-11-17
CN117076253B (en) 2024-05-28

Family

ID=88716934

Country Status (1)

Country Link
CN (1) CN117076253B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106452922A (en) * 2016-11-30 2017-02-22 长春工业大学 Data center processing method applied to Internet of things
CN107016057A (en) * 2017-02-28 2017-08-04 北京铁路局 Row control vehicle-mounted ATP equipment integral intelligent O&M method and system
CN108092813A (en) * 2017-12-21 2018-05-29 郑州云海信息技术有限公司 Data center's total management system server hardware Governance framework and implementation method
CN109101400A (en) * 2018-08-16 2018-12-28 郑州云海信息技术有限公司 A kind of monitoring system of cloud computation data center whole machine cabinet server
CN109255531A (en) * 2018-08-28 2019-01-22 中金数据系统有限公司 A kind of data center's intelligence operation management system
CN112165501A (en) * 2020-08-05 2021-01-01 宁夏无线互通信息技术有限公司 Remote operation and maintenance system and method for product analysis based on industrial internet identification
CN112269673A (en) * 2020-11-02 2021-01-26 深圳市巨文科技有限公司 Intelligent operation and maintenance management system and method for data center
WO2021169270A1 (en) * 2020-02-27 2021-09-02 平安科技(深圳)有限公司 Server fault pre-warning method, device, computer apparatus, and storage medium
CN115562933A (en) * 2022-09-21 2023-01-03 新华三技术有限公司 Processing method and device of operation monitoring data, storage medium and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105051698B (en) * 2013-03-28 2018-11-16 瑞典爱立信有限公司 Method and arrangement for fault management in infrastructure, that is, service cloud
US10355966B2 (en) * 2016-03-25 2019-07-16 Advanced Micro Devices, Inc. Managing variations among nodes in parallel system frameworks
JP7234942B2 (en) * 2018-01-19 2023-03-08 日本電気株式会社 Network monitoring system, method and program
US20230161661A1 (en) * 2021-11-22 2023-05-25 Accenture Global Solutions Limited Utilizing topology-centric monitoring to model a system and correlate low level system anomalies and high level system impacts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant