CN117785542A - Intelligent operation and maintenance method, system, equipment and storage medium for data center

Publication number
CN117785542A
Authority
CN
China
Prior art keywords
information
historical
index
error reporting
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410220811.6A
Other languages
Chinese (zh)
Other versions
CN117785542B (en
Inventor
刘文超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yaode Data Service Co ltd
Original Assignee
Shenzhen Yaode Data Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yaode Data Service Co ltd filed Critical Shenzhen Yaode Data Service Co ltd
Priority to CN202410220811.6A priority Critical patent/CN117785542B/en
Priority claimed from CN202410220811.6A external-priority patent/CN117785542B/en
Publication of CN117785542A publication Critical patent/CN117785542A/en
Application granted granted Critical
Publication of CN117785542B publication Critical patent/CN117785542B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The application relates to the technical field of artificial intelligence and provides an intelligent operation and maintenance method, system, equipment and storage medium for a data center. The method comprises the following steps: acquiring historical error reporting statements and the corresponding index judgment information, historical index information and historical fault information; determining, based on a preset reasoning algorithm, the mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement, and thereby obtaining reasoning example information corresponding to each historical error reporting statement; acquiring a real-time error reporting statement, matching it with the historical error reporting statements, and determining the reasoning example information corresponding to the historical error reporting statement with the highest matching degree as target example information; inputting the target example information into a natural language model so that the natural language model completes derivation learning; acquiring real-time index information of the data center, and querying the natural language model according to the index judgment information and the real-time index information to obtain fault indication information and the corresponding derivation process.

Description

Intelligent operation and maintenance method, system, equipment and storage medium for data center
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to an intelligent operation and maintenance method, system, device, and storage medium for a data center.
Background
With the development of big data and artificial intelligence technology, operation and maintenance personnel of a data center can use artificial intelligence to quickly sort through massive amounts of operation and maintenance information from multiple angles and thereby locate faults faster. Sorting through alarm information is an important way for operation and maintenance personnel to understand the operating condition of the data center. However, most alarm messages record only data such as the name of the faulty device, the time of occurrence and the fault phenomenon, without explicitly stating the root cause of the fault. The root cause may even be buried in an alarm storm, which makes root cause analysis more difficult still.
Disclosure of Invention
The application provides an intelligent operation and maintenance method, system, equipment and storage medium for a data center, which are used for analyzing root causes of faults according to alarm information so as to improve the fault analysis efficiency of operation and maintenance personnel.
In a first aspect, an embodiment of the present application provides an intelligent operation and maintenance method for a data center, the method including:
acquiring historical error reporting statements of a data center, and acquiring index judgment information, historical index information and historical fault information corresponding to the historical error reporting statements;
determining, based on a preset reasoning algorithm, the mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement, and obtaining reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement;
acquiring a real-time error reporting statement of the data center, matching the real-time error reporting statement with the historical error reporting statements, and determining the reasoning example information corresponding to the historical error reporting statement with the highest matching degree as target example information;
inputting the target example information into a natural language model so that the natural language model completes derivation learning;
acquiring real-time index information of the data center, and querying the natural language model that has completed derivation learning according to the index judgment information and the real-time index information, to obtain fault indication information and a derivation process of the fault indication information.
In a second aspect, embodiments of the present application provide an intelligent operation and maintenance system for a data center, the intelligent operation and maintenance system comprising:
the information acquisition module is used for acquiring historical error reporting statements of the data center and acquiring index judgment information, historical index information and historical fault information corresponding to the historical error reporting statements;
the mapping reasoning module is used for determining, based on a preset reasoning algorithm, the mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement, and obtaining reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement;
the information matching module is used for acquiring a real-time error reporting statement of the data center, matching the real-time error reporting statement with the historical error reporting statements, and determining the reasoning example information corresponding to the historical error reporting statement with the highest matching degree as target example information;
the model training module is used for inputting the target example information into a natural language model so that the natural language model completes derivation learning;
and the result output module is used for acquiring real-time index information of the data center, and querying the natural language model that has completed derivation learning according to the index judgment information and the real-time index information, to obtain fault indication information and a derivation process of the fault indication information.
In a third aspect, embodiments of the present application provide a computer device comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program and implement the intelligent operation and maintenance method of the data center according to any one of the embodiments of the present application when the computer program is executed.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program, which when executed by a processor, causes the processor to implement a method for intelligent operation and maintenance of a data center according to any one of the embodiments of the present application.
The embodiment of the application provides an intelligent operation and maintenance method for a data center, which comprises the following steps: acquiring historical error reporting statements of a data center, and acquiring index judgment information, historical index information and historical fault information corresponding to the historical error reporting statements; determining, based on a preset reasoning algorithm, the mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement, and obtaining reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement; acquiring a real-time error reporting statement of the data center, matching the real-time error reporting statement with the historical error reporting statements, and determining the reasoning example information corresponding to the historical error reporting statement with the highest matching degree as target example information; inputting the target example information into a natural language model so that the natural language model completes derivation learning; acquiring real-time index information of the data center, and querying the natural language model that has completed derivation learning according to the index judgment information and the real-time index information, to obtain fault indication information and a derivation process of the fault indication information.
In this method, reasoning example information is obtained by step-by-step reasoning, using the preset reasoning algorithm, over the index judgment information, historical index information and historical fault information associated with the historical error reporting statements of the data center. Target example information is then selected from the reasoning example information by matching against the real-time error reporting statement, and is used for prompt engineering of the natural language model. This supplies the natural language model with background knowledge specific to the data center, so that it can better understand the real intent of a question generated from the real-time error reporting statement. As a result, answers are more accurate, the root cause of the fault is output accurately, and the fault analysis efficiency of operation and maintenance personnel is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of an intelligent operation and maintenance method for a data center according to an embodiment of the present application;
fig. 2 is a schematic block diagram of an intelligent operation and maintenance system for a data center according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations.
It is also to be understood that the terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1, fig. 1 illustrates a schematic flowchart of an intelligent operation and maintenance method for a data center according to an embodiment of the present application. As shown in fig. 1, the specific steps of the intelligent operation and maintenance method include: S101-S105.
S101, acquiring historical error reporting statements of a data center, and acquiring index judgment information, historical index information and historical fault information corresponding to the historical error reporting statements.
Illustratively, the historical index information includes historical data for a plurality of indexes, for example: CPU utilization, GPU utilization, memory utilization, disk utilization, network traffic, disk metrics, etc. The index judgment information can be obtained from an operation and maintenance emergency manual, a fault handling scheme, an operation and maintenance management specification and historical work orders, and it records continuous values of the plurality of indexes under different conditions. The historical fault information includes the fault causes that were identified when the historical work orders generated from alarm information were processed. When the index judgment information, the historical index information and the historical fault information are obtained, the actual names of the indexes and of the fault information need to be retained.
By sorting through the historical error reporting statements of the data center, the actual names of the indexes or fault information specific to the data center can be obtained, which better helps the artificial intelligence system acquire information about the data center in the subsequent questioning process.
S102, determining the mapping relation among index judgment information, historical index information and historical fault information corresponding to each historical error reporting statement based on a preset reasoning algorithm, and obtaining reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement.
In some embodiments, the index judgment information includes standard trend curves of a plurality of indexes of the data center, and the historical index information includes historical data of the plurality of indexes of the data center. The index judgment information records continuous values of the plurality of indexes under different conditions, and these values are used to generate the standard trend curves of the plurality of indexes, where a standard trend curve records the change trend of an index under normal conditions. The historical index information consists of continuous values of the plurality of indexes under historical conditions and is used to generate historical trend curves of the plurality of indexes. These curves show how the index data changes under different conditions.
In some embodiments, determining, based on a preset reasoning algorithm, a mapping relationship among index judgment information, historical index information, and historical fault information corresponding to each historical error report statement includes: generating a historical trend curve of the multiple indexes of the data center according to the historical data of the multiple indexes of the data center; comparing the historical trend curve with the standard trend curve to obtain change thresholds of a plurality of indexes of the data center, wherein the change thresholds are values of the historical trend curve deviating from the standard trend curve; acquiring target indexes from the historical fault information, wherein the target indexes are indexes with faults, and the target indexes are one or more of indexes of a data center; based on a preset reasoning algorithm, determining mapping conditions according to the change thresholds of the multiple indexes and the target indexes, and converting the mapping conditions into mapping relations among index judgment information, historical index information and historical fault information.
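The curve comparison described above can be sketched as follows. This is a minimal illustration only: the text says the change threshold is the value by which the historical trend curve deviates from the standard trend curve, but it does not say how that deviation is aggregated over time. The sketch assumes the largest absolute pointwise deviation, and the function name `change_thresholds`, the index names and the sample values are all hypothetical.

```python
def change_thresholds(standard_curves, historical_curves):
    """Return, per index, the largest absolute deviation of the historical
    trend curve from the standard trend curve (an assumed aggregation)."""
    thresholds = {}
    for name, standard in standard_curves.items():
        historical = historical_curves[name]
        if len(historical) != len(standard):
            raise ValueError(f"curves for {name!r} are sampled differently")
        thresholds[name] = max(abs(h - s) for h, s in zip(historical, standard))
    return thresholds


# Hypothetical sample data: three aligned samples per index, in percent.
standard_curves = {
    "cpu_utilization": [40.0, 42.0, 41.0],
    "memory_utilization": [60.0, 60.0, 61.0],
}
historical_curves = {
    "cpu_utilization": [70.0, 92.0, 95.0],
    "memory_utilization": [61.0, 60.0, 62.0],
}

thresholds = change_thresholds(standard_curves, historical_curves)
print(thresholds)  # {'cpu_utilization': 54.0, 'memory_utilization': 1.0}
```

A mean or percentile deviation would fit the description equally well; the maximum is used here only because it keeps the example easy to check by hand.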
The preset reasoning algorithm is a decision tree algorithm. Given the change thresholds of the plurality of indexes and the target indexes, the algorithm starts at the root node and selects the optimal index according to a selection strategy, then divides the change thresholds into a plurality of subsets according to the different values of that index. This process is repeated recursively for each subset until a preset stopping condition is reached, for example a maximum depth limit. The resulting tree relates the change thresholds of the plurality of indexes to the target indexes, which determines the mapping conditions. Because a change threshold is the value by which the historical trend curve deviates from the standard trend curve, the mapping conditions can in turn be converted into the mapping relation among the index judgment information, the historical index information and the historical fault information.
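A small decision tree of this kind might look like the sketch below. The text does not name the selection strategy, so Gini impurity is assumed here; the stopping condition is the maximum depth limit that the text gives as an example. All function names, index names and training rows are hypothetical, and each row holds the change thresholds observed for one historical error reporting statement together with the target index recorded for it.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (index, cut) pair that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for name in rows[0]:
        values = sorted({row[name] for row in rows})
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[name] <= cut]
            right = [l for r, l in zip(rows, labels) if r[name] > cut]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (name, cut), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split until no split helps or max_depth is reached."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority target index
    name, cut = split
    low = [(r, l) for r, l in zip(rows, labels) if r[name] <= cut]
    high = [(r, l) for r, l in zip(rows, labels) if r[name] > cut]
    return {
        "index": name,
        "cut": cut,
        "low": build_tree([r for r, _ in low], [l for _, l in low], depth + 1, max_depth),
        "high": build_tree([r for r, _ in high], [l for _, l in high], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the mapping conditions from the root down to a leaf."""
    while isinstance(tree, dict):
        tree = tree["high"] if row[tree["index"]] > tree["cut"] else tree["low"]
    return tree


# Hypothetical training data: change thresholds and the faulty (target) index.
rows = [
    {"cpu": 54.0, "memory": 1.0},
    {"cpu": 48.0, "memory": 2.0},
    {"cpu": 3.0, "memory": 35.0},
    {"cpu": 2.0, "memory": 40.0},
    {"cpu": 1.0, "memory": 1.5},
    {"cpu": 2.5, "memory": 2.5},
]
labels = ["cpu", "cpu", "memory", "memory", "none", "none"]

tree = build_tree(rows, labels)
print(predict(tree, {"cpu": 60.0, "memory": 1.0}))
```

Each path from the root to a leaf reads as one mapping condition, for example "cpu deviation above the cut implies the cpu index is the faulty one", which is the form that can later be written out as reasoning steps.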
In some embodiments, obtaining the reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement includes: generating a guide question according to the index judgment information and the historical index information; generating a guide answer according to the historical fault information; obtaining reasoning data corresponding to the guide answer according to the mapping relation, the index judgment information and the historical index information; and generating the reasoning example information according to the guide question, the guide answer and the reasoning data.
Illustratively, a change threshold between the historical index information and the index judgment information is first calculated, and a guide question is generated from the change threshold, for example: what system fault may occur when the change threshold of a preset index is within a certain range? A guide answer is then generated according to the historical fault information; the guide answer is the system fault cause manually determined in the historical work order. Finally, according to the mapping relation, the steps of the derivation from the change threshold to the historical fault information are recorded one by one to generate the reasoning example information.
In this way, a standard derivation example is available for the model: few-shot samples containing chain-of-thought reasoning are prepared for the natural language model in the subsequent process, so that the natural language model learns to think step by step, which improves its complex reasoning capability and the accuracy of its output.
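One possible shape for a piece of reasoning example information is sketched below: a guide question built from the change thresholds, numbered reasoning steps, and the guide answer taken from the work order. The field names, the wording of the question and the sample steps are assumptions made for illustration and are not specified in the text.

```python
def build_reasoning_example(thresholds, reasoning_steps, fault_cause):
    """Assemble a guide question, step-by-step reasoning data and a guide answer."""
    findings = " and ".join(
        f"{name} deviates from its standard curve by {value:.1f}"
        for name, value in sorted(thresholds.items())
    )
    return {
        "question": f"When {findings}, what system fault may have occurred?",
        "reasoning": "\n".join(
            f"Step {i}: {step}" for i, step in enumerate(reasoning_steps, start=1)
        ),
        "answer": fault_cause,
    }


# Hypothetical inputs for one historical error reporting statement.
example = build_reasoning_example(
    thresholds={"cpu_utilization": 54.0, "memory_utilization": 1.0},
    reasoning_steps=[
        "cpu_utilization deviates far more than memory_utilization.",
        "A large cpu_utilization deviation maps to the cpu index being faulty.",
        "The work order attributes the cpu fault to a runaway batch job.",
    ],
    fault_cause="CPU overload caused by a runaway batch job",
)
print(example["question"])
print(example["reasoning"])
```

Keeping the three parts as separate fields makes it straightforward to store the example alongside its error reporting statement and to render it into a prompt later.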
S103, acquiring a real-time error reporting statement of the data center, matching the real-time error reporting statement with the historical error reporting statements, and determining the reasoning example information corresponding to the historical error reporting statement with the highest matching degree as target example information.
Illustratively, when an alarm prompt appears in the data center, the text content of the alarm prompt is taken as the real-time error reporting statement. The real-time error reporting statement is matched against the historical error reporting statements to obtain the historical error reporting statement with the highest similarity, and the reasoning example information corresponding to that historical error reporting statement is determined to be the target example information.
In some embodiments, before matching the real-time error reporting statement with the historical error reporting statements, the method further comprises: splitting the historical error reporting statements based on a pre-trained text segmentation model to obtain historical keywords; and vectorizing the historical keywords and the corresponding reasoning example information to obtain first vector data, and storing the first vector data in a database. Therefore, the natural language model can directly read the vectorized data without having to perform data conversion at run time, which improves the operating efficiency of the natural language model.
In some embodiments, matching the real-time error reporting statement with the historical error reporting statements includes: splitting the real-time error reporting statement based on the pre-trained text segmentation model to obtain error reporting keywords; vectorizing the error reporting keywords to obtain second vector data; and matching the second vector data with the first vector data.
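The keyword splitting, vectorization and matching steps can be illustrated with the sketch below. It substitutes much simpler techniques than the text describes: a regular-expression tokenizer stands in for the pre-trained text segmentation model, a bag-of-words count stands in for the vectorization, an in-memory list stands in for the database, and cosine similarity is assumed as the matching degree. All names and sample statements are hypothetical.

```python
import math
import re
from collections import Counter


def keywords(statement):
    """Stand-in for the pre-trained text segmentation model (assumption)."""
    return re.findall(r"[a-z0-9_]+", statement.lower())


def vectorize(statement):
    """Bag-of-words vector: keyword -> count."""
    return Counter(keywords(statement))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_match(realtime_statement, store):
    """Return the stored record whose vector is most similar to the query."""
    query = vectorize(realtime_statement)  # second vector data
    return max(store, key=lambda record: cosine(query, record["vector"]))


# Hypothetical store of first vector data, built ahead of time.
history = {
    "disk /dev/sda1 usage above 95 percent": "example: disk full",
    "cpu load too high on node-3": "example: cpu overload",
    "memory allocation failed on node-7": "example: memory exhaustion",
}
store = [
    {"statement": s, "vector": vectorize(s), "example": e}
    for s, e in history.items()
]

target = best_match("cpu load high on node-9", store)
print(target["example"])  # example: cpu overload
```

Precomputing the vectors for the historical statements means only the real-time statement has to be vectorized when an alarm arrives, which is the efficiency point made in the text.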
S104, inputting the target example information into the natural language model so that the natural language model completes derivation learning.
By using the carefully designed target example information in prompt engineering for the natural language model, the natural language model can be guided to focus on the specific task and helped to better understand and follow the specific context, so that it produces output consistent with user expectations or application requirements and handles these tasks more accurately and efficiently. Meanwhile, the target example information includes the actual names of the indexes or fault information specific to the data center, which supplies the natural language model with background knowledge specific to the data center, so that it can understand more deeply the real intent of the user's question, improving the accuracy of the answer.
S105, acquiring real-time index information of the data center, and querying the natural language model that has completed derivation learning according to the index judgment information and the real-time index information, to obtain fault indication information and a derivation process of the fault indication information.
Illustratively, the natural language model has already acquired background knowledge specific to the data center from the target example information, and has acquired the ability to reason logically from a change threshold to a fault cause. Therefore, after the index judgment information and the real-time index information are input into the natural language model, the change threshold between the index judgment information and the real-time index information is calculated first. The natural language model then deduces the fault cause step by step according to the change threshold and generates the fault indication information, and the steps by which the fault cause was deduced are organized into the derivation process of the fault indication information, so that a user can trace back and check whether the reasoning is correct.
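How the target example information and the real-time index information might be combined into a single question is sketched below. The prompt layout is an assumption, as is the use of the largest absolute deviation for the change threshold; the sketch only builds the prompt text and deliberately does not call any model, since the text does not identify a specific natural language model or interface.

```python
def build_prompt(target_example, standard_curves, realtime_curves):
    """Render a one-shot chain-of-thought prompt for the new case."""
    lines = [
        "Worked example:",
        f"Q: {target_example['question']}",
        target_example["reasoning"],
        f"A: {target_example['answer']}",
        "",
        "New case:",
    ]
    for name in sorted(realtime_curves):
        # Assumed change threshold: largest absolute pointwise deviation.
        deviation = max(
            abs(r - s) for r, s in zip(realtime_curves[name], standard_curves[name])
        )
        lines.append(f"- {name} deviates from its standard curve by up to {deviation:.1f}")
    lines.append("Reason step by step, then state the most likely fault cause.")
    return "\n".join(lines)


# Hypothetical target example and real-time data.
target_example = {
    "question": "When cpu_utilization deviates sharply, what fault may have occurred?",
    "reasoning": "Step 1: The cpu deviation is large.\nStep 2: This maps to a cpu fault.",
    "answer": "CPU overload caused by a runaway batch job",
}
standard_curves = {"cpu_utilization": [40.0, 42.0, 41.0]}
realtime_curves = {"cpu_utilization": [70.0, 92.0, 95.0]}

prompt = build_prompt(target_example, standard_curves, realtime_curves)
print(prompt)
```

Because the worked example ends with explicit steps followed by an answer, a model that imitates it will tend to return its own steps as well, which is what allows the derivation process to be shown to the user.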
In some embodiments, after obtaining the fault indication information and the derivation process of the fault indication information, the method further comprises: matching the fault indication information with the historical fault information, and taking the historical fault information with the highest matching degree with the fault indication information as target fault information; and acquiring fault resolution information corresponding to the target fault information, and outputting the fault resolution information to a user. In this way, the current user can quickly find the historical solution for the fault cause, which improves fault repair efficiency.
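The lookup of a historical solution can be sketched as below. The text does not say how the matching degree between fault indication information and historical fault information is computed; this sketch assumes a plain string similarity from the Python standard library (`difflib.SequenceMatcher`). The record layout and the sample faults and resolutions are hypothetical.

```python
from difflib import SequenceMatcher


def find_resolution(fault_indication, history):
    """Return (target fault, resolution) for the most similar historical fault."""

    def similarity(record):
        return SequenceMatcher(
            None, fault_indication.lower(), record["fault"].lower()
        ).ratio()

    target = max(history, key=similarity)
    return target["fault"], target["resolution"]


# Hypothetical historical fault information with recorded resolutions.
history = [
    {
        "fault": "CPU overload caused by a runaway batch job",
        "resolution": "Stop the batch job and cap its CPU quota",
    },
    {
        "fault": "Disk full on the log volume",
        "resolution": "Rotate logs and extend the volume",
    },
]

fault, resolution = find_resolution(
    "CPU overload from a runaway batch job on node-3", history
)
print(resolution)  # Stop the batch job and cap its CPU quota
```

The same vector matching used for error reporting statements could be reused here instead; character-level similarity is chosen only to keep the example short.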
Referring to fig. 2, fig. 2 is a schematic block diagram of an intelligent operation and maintenance system for a data center according to an embodiment of the present application. The intelligent operation and maintenance system 300 for a data center is used to perform the foregoing intelligent operation and maintenance method for a data center, and can be configured in a server or a terminal device.
The server may be an independent server, a server cluster, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms. The terminal device can be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, a wearable device and the like.
As shown in fig. 2, the intelligent operation and maintenance system 300 for a data center includes: an information acquisition module 301, a mapping reasoning module 302, an information matching module 303, a model training module 304 and a result output module 305.
The information obtaining module 301 is configured to obtain a history error report statement of the data center, and obtain index judgment information, history index information, and history fault information corresponding to the history error report statement.
The mapping reasoning module 302 is configured to determine, based on a preset reasoning algorithm, a mapping relationship between the index judgment information, the historical index information, and the historical fault information corresponding to each historical error reporting statement, and obtain reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information, and the mapping relationship corresponding to each historical error reporting statement.
In some embodiments, the index judgment information includes standard trend curves of a plurality of indexes of the data center, and the historical index information includes historical data of the plurality of indexes of the data center. When determining, based on the preset reasoning algorithm, the mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement, the mapping reasoning module 302 is specifically configured to implement: generating historical trend curves of the plurality of indexes of the data center according to the historical data of the plurality of indexes of the data center; comparing the historical trend curves with the standard trend curves to obtain change thresholds of the plurality of indexes of the data center, wherein a change threshold is the value by which the historical trend curve deviates from the standard trend curve; acquiring target indexes from the historical fault information, wherein the target indexes are the indexes in which a fault occurred and are one or more of the indexes of the data center; and, based on the preset reasoning algorithm, determining mapping conditions according to the change thresholds of the plurality of indexes and the target indexes, and converting the mapping conditions into the mapping relation among the index judgment information, the historical index information and the historical fault information.
In some embodiments, when obtaining the reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement, the mapping reasoning module 302 is specifically configured to implement: generating a guide question according to the index judgment information and the historical index information; generating a guide answer according to the historical fault information; obtaining reasoning data corresponding to the guide answer according to the mapping relation, the index judgment information and the historical index information; and generating the reasoning example information according to the guide question, the guide answer and the reasoning data.
The information matching module 303 is configured to obtain a real-time error reporting statement of the data center, match the real-time error reporting statement with a history error reporting statement, and determine inference example information corresponding to the history error reporting statement with the highest matching degree as target example information.
In some embodiments, the information matching module 303 is further specifically configured to, before being configured to match the real-time error-reporting statement with the historical error-reporting statement, implement: splitting historical error reporting sentences based on a pre-trained text segmentation model to obtain historical keywords; and carrying out vectorization processing on the historical keywords and the corresponding reasoning example information to obtain first vector data, and storing the first vector data in a database.
In some embodiments, when matching the real-time error reporting statement against the historical error reporting statements, the information matching module 303 is specifically configured to: split the real-time error reporting statement based on the pre-trained text segmentation model to obtain error reporting keywords; vectorize the error reporting keywords to obtain second vector data; and match the second vector data against the first vector data.
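The keyword vectorization and matching described above can be sketched as follows. The text relies on a pre-trained text segmentation model and does not name a vectorization or similarity method; this sketch swaps in a regular-expression tokenizer, bag-of-words counts, and cosine similarity as stand-ins. It also indexes only the keyword vectors, keyed by a hypothetical example identifier, and uses an in-memory dictionary in place of the database.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def keywords(statement: str) -> List[str]:
    """Stand-in for the segmentation model: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", statement.lower())


def vectorize(words: List[str]) -> Counter:
    """Bag-of-words vector: token -> occurrence count."""
    return Counter(words)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_match(real_time: str, store: Dict[str, Counter]) -> Tuple[str, float]:
    """Return the stored key whose vector is closest to the real-time statement."""
    query = vectorize(keywords(real_time))
    return max(
        ((key, cosine(query, vec)) for key, vec in store.items()),
        key=lambda pair: pair[1],
    )


# "First vector data": historical statements, keyed by a hypothetical example id.
store = {
    "example_disk": vectorize(keywords("disk write timeout on node")),
    "example_net": vectorize(keywords("network link down on switch")),
}
matched_key, score = best_match("Disk write timeout detected", store)
```

A production system would more likely use dense embeddings and a vector database, but the lookup has the same shape: vectorize once at ingestion, vectorize the query at run time, and take the nearest stored entry.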
The model training module 304 is configured to input the target example information into the natural language model, so that the natural language model completes derivation learning.
The result output module 305 is configured to obtain real-time index information of the data center, and to query the natural language model that has completed derivation learning according to the index judgment information and the real-time index information, so as to obtain fault indication information and a derivation process of the fault indication information.
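One possible reading of these two modules is in-context prompting: the target example is placed ahead of the new question so the model imitates its reasoning pattern. The text does not say whether model weights are updated, so this is an interpretation, not the described implementation. In the sketch below the model is any caller-supplied function from prompt text to reply text; `build_prompt`, `diagnose`, and the prompt wording are hypothetical.

```python
from typing import Callable, Dict


def build_prompt(
    target_example: str, standard_desc: str, real_time: Dict[str, float]
) -> str:
    """Place the worked example first, then ask about the current readings."""
    metrics = ", ".join(f"{k}={v}" for k, v in sorted(real_time.items()))
    return (
        "Worked example:\n" + target_example + "\n\n"
        "Question: Standard behaviour: " + standard_desc + ". "
        "Current readings: " + metrics + ". Which fault is indicated? "
        "Give the reasoning first, then the answer."
    )


def diagnose(
    ask: Callable[[str], str],
    target_example: str,
    standard_desc: str,
    real_time: Dict[str, float],
) -> str:
    """Send the assembled prompt to the model and return its raw reply."""
    return ask(build_prompt(target_example, standard_desc, real_time))
```

Passing the model as a plain callable keeps the sketch independent of any particular model API; the reply is expected to contain both the derivation process and the fault indication.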
After obtaining the fault indication information and the derivation process of the fault indication information, the result output module 305 is further specifically configured to: match the fault indication information against the historical fault information, and take the historical fault information with the highest matching degree as target fault information; and acquire fault resolution information corresponding to the target fault information, and output the fault resolution information to a user.
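The final lookup can be sketched with the standard library. The text does not say how the matching degree is computed, so this sketch assumes a character-level similarity ratio from `difflib.SequenceMatcher`; the resolution table and the function name `resolve` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Tuple


def resolve(fault_indication: str, resolutions: Dict[str, str]) -> Tuple[str, str]:
    """Pick the most similar historical fault and return it with its resolution."""

    def score(historical_fault: str) -> float:
        return SequenceMatcher(
            None, fault_indication.lower(), historical_fault.lower()
        ).ratio()

    target = max(resolutions, key=score)
    return target, resolutions[target]


# Hypothetical table: historical fault information -> fault resolution information.
resolutions = {
    "cooling fan failure": "Replace the fan module",
    "disk array degraded": "Swap the failed disk and rebuild the array",
}
target_fault, fix = resolve("Cooling fan has failed", resolutions)
```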
Embodiments of the present application provide a computer device comprising a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute the computer program and implement the intelligent operation and maintenance method of the data center according to any one of the embodiments of the present application when the computer program is executed.
Embodiments of the present application provide a computer readable storage medium storing a computer program, which when executed by a processor, causes the processor to implement a method for intelligent operation and maintenance of a data center according to any one of the embodiments of the present application.
While the invention has been described with reference to certain preferred embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. An intelligent operation and maintenance method for a data center, the method comprising:
acquiring historical error reporting sentences of a data center, and acquiring index judgment information, historical index information and historical fault information corresponding to the historical error reporting sentences;
determining, based on a preset reasoning algorithm, a mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement, and obtaining reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement;
acquiring real-time error reporting sentences of the data center, matching the real-time error reporting sentences with the historical error reporting sentences, and determining the reasoning example information corresponding to the historical error reporting sentence with the highest matching degree as target example information;
inputting the target example information into a natural language model so that the natural language model completes derivation learning;
acquiring real-time index information of the data center, and querying the natural language model that has completed derivation learning according to the index judgment information and the real-time index information, to obtain fault indication information and a derivation process of the fault indication information.
2. The intelligent operation and maintenance method according to claim 1, wherein the index judgment information includes standard trend curves of a plurality of indexes of the data center, the historical index information includes historical data of the plurality of indexes of the data center, and the determining, based on a preset reasoning algorithm, a mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement includes:
generating a historical trend curve of the multiple indexes of the data center according to the historical data of the multiple indexes of the data center;
comparing the historical trend curve with the standard trend curve to obtain a change threshold value of a plurality of indexes of the data center, wherein the change threshold value is a numerical value of the historical trend curve deviated from the standard trend curve;
acquiring a target index from the historical fault information, wherein the target index is an index with faults, and the target index is one or more of indexes of the data center;
and determining mapping conditions according to the change thresholds of the indexes and the target indexes based on a preset reasoning algorithm, and converting the mapping conditions into mapping relations among the index judgment information, the historical index information and the historical fault information.
3. The intelligent operation and maintenance method according to claim 1, wherein the obtaining the reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement includes:
generating a guidance question according to the index judgment information and the historical index information;
generating a guidance answer according to the historical fault information;
obtaining reasoning data corresponding to the guidance answer according to the mapping relation, the index judgment information and the historical index information;
and generating the reasoning example information according to the guidance question, the guidance answer and the reasoning data.
4. The intelligent operation and maintenance method according to claim 1, wherein before said matching said real-time error-reporting statement with said historical error-reporting statement, said method further comprises:
splitting the historical error report statement based on a pre-trained text segmentation model to obtain a historical keyword;
and carrying out vectorization processing on the historical keywords and the corresponding reasoning example information to obtain first vector data, and storing the first vector data in a database.
5. The intelligent operation and maintenance method according to claim 4, wherein the matching the real-time error reporting statement with the historical error reporting statement comprises:
splitting the real-time error reporting statement based on the pre-trained text segmentation model to obtain error reporting keywords, and carrying out vectorization processing on the error reporting keywords to obtain second vector data;
and matching the second vector data with the first vector data.
6. The intelligent operation and maintenance method according to claim 1, wherein after obtaining the fault indication information and the derivation process of the fault indication information, the method further comprises:
matching the fault indication information with the historical fault information, and taking the historical fault information with the highest matching degree with the fault indication information as target fault information;
and acquiring the fault resolution information corresponding to the target fault information, and outputting the fault resolution information to a user.
7. An intelligent operation and maintenance system for a data center, the intelligent operation and maintenance system comprising:
the information acquisition module is used for acquiring historical error reporting sentences of the data center and acquiring index judgment information, historical index information and historical fault information corresponding to the historical error reporting sentences;
the mapping reasoning module is used for determining, based on a preset reasoning algorithm, a mapping relation among the index judgment information, the historical index information and the historical fault information corresponding to each historical error reporting statement, and obtaining reasoning example information corresponding to each historical error reporting statement according to the index judgment information, the historical index information, the historical fault information and the mapping relation corresponding to each historical error reporting statement;
the information matching module is used for acquiring real-time error reporting sentences of the data center, matching the real-time error reporting sentences with the historical error reporting sentences, and determining reasoning example information corresponding to the historical error reporting sentences with the highest matching degree as target example information;
the model training module is used for inputting the target example information into a natural language model so that the natural language model completes derivation learning;
and the result output module is used for acquiring real-time index information of the data center, and querying the natural language model that has completed derivation learning according to the index judgment information and the real-time index information, to obtain fault indication information and a derivation process of the fault indication information.
8. A computer device, the computer device comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program and implement the intelligent operation and maintenance method of the data center according to any one of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which when executed by a processor causes the processor to implement the intelligent operation and maintenance method of a data center according to any one of claims 1 to 6.
CN202410220811.6A 2024-02-28 Intelligent operation and maintenance method, system, equipment and storage medium for data center Active CN117785542B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410220811.6A CN117785542B (en) 2024-02-28 Intelligent operation and maintenance method, system, equipment and storage medium for data center

Publications (2)

Publication Number Publication Date
CN117785542A CN117785542A (en) 2024-03-29
CN117785542B CN117785542B (en) 2024-05-31

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102243497A (en) * 2011-07-25 2011-11-16 江苏吉美思物联网产业股份有限公司 Networking technology-based remote intelligent analysis service system used for engineering machinery
KR101896973B1 (en) * 2018-01-26 2018-09-10 가천대학교 산학협력단 Natural Language Generating System Using Machine Learning Model, Method and Computer-readable Medium Thereof
CN113313271A (en) * 2021-06-03 2021-08-27 国家电网有限公司客户服务中心 Power system fault repair method and device based on remote customer service
CN114201328A (en) * 2021-12-17 2022-03-18 中国平安财产保险股份有限公司 Fault processing method and device based on artificial intelligence, electronic equipment and medium
CN115660083A (en) * 2022-11-08 2023-01-31 中国联合网络通信集团有限公司 Method and equipment for constructing knowledge graph and asking for answering for transmission network fault
CN117591659A (en) * 2024-01-18 2024-02-23 卓望数码技术(深圳)有限公司 Information processing method, device, equipment and medium based on ChatGLM operation and maintenance scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant