CN117909651A - Fault determination method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117909651A
Authority
CN
China
Prior art keywords
alarm
real
data
parameter
rule
Prior art date
Legal status
Pending
Application number
CN202311862778.9A
Other languages
Chinese (zh)
Inventor
徐哲元
陈安琪
王旭龙
谭胜眉
邱海飞
常理杰
李明
Current Assignee
Caituo Cloud Computing Shanghai Co ltd
Original Assignee
Caituo Cloud Computing Shanghai Co ltd
Application filed by Caituo Cloud Computing Shanghai Co ltd
Priority to CN202311862778.9A
Publication of CN117909651A

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The application provides a fault determination method, a device, electronic equipment and a storage medium, wherein the method comprises the following steps: determining the operation parameters of the target equipment according to the real-time collected data of the target equipment; searching an alarm rule corresponding to the operation parameters, wherein the alarm rule comprises a rule expression determined by a plurality of parameter indexes; and performing fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes to obtain a fault analysis result of the target equipment. In the implementation of the scheme, performing fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes effectively supports fault analysis at a complex service level and mitigates the limitation that the index data of a single measuring point can only describe whether that index is high or low, so that the efficiency of fault analysis on the target equipment can be improved.

Description

Fault determination method and device, electronic equipment and storage medium
Technical Field
The application relates to the technical field of information technology operation and telecommunication intelligent operation and maintenance, in particular to a fault determination method, a device, electronic equipment and a storage medium.
Background
Currently, device faults in a data center are determined using a single index, for example: index data generated by the infrastructure equipment of each data center is collected through an industrial control protocol of a building management system (Building Management System, BMS), then whether the index data exceeds an index threshold is judged, and if so, a message indicating that the index of the infrastructure equipment is too high is sent to terminal equipment so that fault analysis can be performed on the infrastructure equipment. However, in practice, fault analysis of infrastructure equipment performed in this way is inefficient.
Disclosure of Invention
An embodiment of the application aims to provide a fault determination method, a fault determination device, electronic equipment and a storage medium, which are used for solving the problem that the efficiency of fault analysis on infrastructure equipment is low.
The embodiment of the application provides a fault determination method, which comprises the following steps: determining the operation parameters of the target equipment according to the real-time collected data of the target equipment; searching an alarm rule corresponding to the operation parameters, wherein the alarm rule comprises a rule expression determined by a plurality of parameter indexes; and performing fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes to obtain a fault analysis result of the target equipment. In the implementation of the scheme, performing fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes effectively supports fault analysis at a complex service level and mitigates the limitation that the index data of a single measuring point can only describe whether that index is high or low, so that the efficiency of fault analysis on the target equipment can be improved.
Optionally, in an embodiment of the present application, before determining the operation parameter of the target device according to the real-time collected data of the target device, the method further includes: extracting data fields from the real-time acquired data, and searching a filtering expression corresponding to the data fields; the real-time acquisition data is filtered using the filter expression. In the implementation process of the scheme, the filtering expression corresponding to the data field extracted from the real-time acquisition data is searched, and the filtering expression is used for filtering the real-time acquisition data, so that the number of invalid alarm messages is reduced, and the triggering accuracy of the alarm messages is improved.
Optionally, in an embodiment of the present application, determining an operation parameter of the target device according to the real-time collected data of the target device includes: extracting a data field from the real-time acquired data; and searching an enumeration value mapped by the data field in a configuration file, determining the enumeration value as an operation parameter of the target equipment, and storing a mapping relation between the data field and the enumeration value in the configuration file. In the implementation process of the scheme, the enumeration value of the data field mapping extracted from the real-time acquisition data is searched in the configuration file, and is determined as the operation parameter of the target equipment, so that the fields are consistent in the range of Data Center Infrastructure Management (DCIM), and the field names show consistency in fault analysis.
Optionally, in the embodiment of the present application, searching the alarm rule corresponding to the operation parameter includes: searching an alarm template corresponding to the operation parameter in the plurality of alarm templates; and filling the data field into an alarm template corresponding to the operation parameter to obtain an alarm rule corresponding to the operation parameter. In the implementation process of the scheme, the alarm rule corresponding to the operation parameter is obtained by searching the alarm templates corresponding to the operation parameter in the alarm templates and filling the data field into the alarm templates corresponding to the operation parameter, so that each data center can independently configure the alarm rule for monitoring, thereby forming a unified standard of complete and quantized evaluation and effectively supporting the fault analysis of the complex business layer.
Optionally, in an embodiment of the present application, the fault analysis result includes: alarm level; performing fault analysis on the real-time collected data through a rule expression determined by a plurality of parameter indexes, wherein the fault analysis comprises the following steps: splicing a plurality of parameter indexes according to an arithmetic operator, a logic operator, a relational operator and/or a function to obtain a rule expression; and matching the real-time acquired data by using the rule expression to obtain the alarm grade. In the implementation process of the scheme, the rule expressions of splicing the multiple parameter indexes are matched by using arithmetic operators, logical operators, relational operators and/or functions, so that each data center can independently configure the alarm rule for monitoring, form a unified standard of complete and quantifiable evaluation, and effectively support fault analysis of a complex business layer.
Optionally, in an embodiment of the present application, the fault analysis result further includes: an alarm message; performing fault analysis on the real-time collected data through a rule expression determined by a plurality of parameter indexes, and further comprising: and if the alarm level is greater than the level threshold, generating an alarm message according to the operation parameters of the target equipment. In the implementation process of the scheme, the alarm message is generated according to the operation parameters of the target equipment only under the condition that the alarm level is greater than the level threshold value, so that the probability of occurrence of the alarm storm in the complex fault scene is reduced, and the efficiency of carrying out fault analysis and elimination on the target equipment is effectively improved.
Optionally, in an embodiment of the present application, after generating the alarm message according to the operation parameter of the target device, the method further includes: if the alarm mode of the target equipment is a preset mode and the alarm level has changed, sending the alarm message. In the implementation of the scheme, the alarm message is sent when the alarm mode of the target equipment is a preset mode (such as an abnormal mode) and the alarm level has changed, so that a user can receive the alarm message in time, and the timeliness of alarm message triggering and the efficiency of fault removal are effectively improved.
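The two optional conditions above (generate a message only above a level threshold, send it only in a preset mode when the level has changed) can be sketched as follows. This is a minimal illustration, not the application's implementation: the class `AlarmNotifier`, the constants `LEVEL_THRESHOLD` and `PRESET_MODE`, and the message format are all hypothetical.

```python
from typing import Dict, List, Optional

LEVEL_THRESHOLD = 1        # hypothetical level threshold
PRESET_MODE = "abnormal"   # hypothetical preset alarm mode


class AlarmNotifier:
    """Generates alarm messages and sends them only on level changes."""

    def __init__(self) -> None:
        self._last_level: Dict[str, int] = {}  # last seen level per device
        self.sent: List[str] = []              # stands in for a real delivery channel

    def process(self, device: str, mode: str, level: int,
                params: Dict[str, float]) -> Optional[str]:
        previous = self._last_level.get(device)
        self._last_level[device] = level
        # Generate a message only when the level exceeds the threshold.
        if level <= LEVEL_THRESHOLD:
            return None
        message = f"{device}: level {level}, parameters {params}"
        # Send only in the preset mode and only when the level has changed.
        if mode == PRESET_MODE and level != previous:
            self.sent.append(message)
        return message
```

Under these assumptions, repeated samples at an unchanged level still produce a message object but are not sent again, which is one way to limit alarm storms.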
The embodiment of the application also provides a fault determining device, which comprises: the operation parameter determining module is used for determining the operation parameters of the target equipment according to the real-time acquisition data of the target equipment; the alarm rule searching module is used for searching alarm rules corresponding to the operation parameters, and the alarm rules comprise: a rule expression determined by a plurality of parameter indexes; and the analysis result obtaining module is used for carrying out fault analysis on the real-time collected data through the rule expressions determined by the plurality of parameter indexes to obtain a fault analysis result of the target equipment.
Optionally, in an embodiment of the present application, the fault determining apparatus further includes: the acquisition data extraction module is used for extracting data fields from the real-time acquisition data and searching a filtering expression corresponding to the data fields; and the collected data filtering module is used for filtering the real-time collected data by using the filtering expression.
Optionally, in an embodiment of the present application, the operation parameter determining module includes: the data field extraction submodule is used for extracting data fields from real-time acquired data; and the operation parameter determination submodule is used for searching an enumeration value mapped by the data field in the configuration file, determining the enumeration value as the operation parameter of the target equipment, and storing the mapping relation between the data field and the enumeration value in the configuration file.
Optionally, in an embodiment of the present application, the alert rule searching module includes: the alarm template searching sub-module is used for searching an alarm template corresponding to the operation parameter in the plurality of alarm templates; and the alarm template filling sub-module is used for filling the data field into the alarm template corresponding to the operation parameter to obtain the alarm rule corresponding to the operation parameter.
Optionally, in an embodiment of the present application, the fault analysis result includes: alarm level; an analysis result obtaining module, comprising: the parameter index splicing sub-module is used for splicing a plurality of parameter indexes according to an arithmetic operator, a logic operator, a relational operator and/or a function to obtain a rule expression; and the alarm grade obtaining sub-module is used for matching the real-time acquired data by using the rule expression to obtain the alarm grade.
Optionally, in an embodiment of the present application, the fault analysis result further includes: an alarm message; the analysis result obtaining module further comprises: and the alarm message generation sub-module is used for generating an alarm message according to the operation parameters of the target equipment if the alarm level is greater than the level threshold.
Optionally, in an embodiment of the present application, the analysis result obtaining module further includes: and the alarm message sending sub-module is used for sending an alarm message if the alarm mode of the target equipment is a preset mode and the alarm level is changed.
The embodiment of the application also provides electronic equipment, which comprises: a processor and a memory storing machine-readable instructions executable by the processor to perform the method as described above when executed by the processor.
Embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs a method as described above.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application, and therefore should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort to those of ordinary skill in the art.
Fig. 1 is a schematic flow chart of a fault determining method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a device operation parameter index according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a fault determining apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the accompanying drawings in the embodiments of the present application are only for the purpose of illustration and description, and are not intended to limit the scope of the embodiments of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. The flowcharts used in the embodiments of the present application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to or removed from the flow diagrams by those skilled in the art under the direction of the teachings of the embodiments of the present application.
In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Accordingly, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the claimed embodiments of the application, but is merely representative of selected embodiments of the application.
It will be appreciated that "first" and "second" in embodiments of the application are used to distinguish similar objects. It will be appreciated by those skilled in the art that the words "first," "second," etc. do not limit the number or the order of execution, and that objects so labeled are not necessarily different. In the description of the embodiments of the present application, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship. The term "plurality" refers to two or more (including two).
It should be noted that the fault determining method provided by the embodiment of the present application may be executed by an electronic device, where the electronic device refers to a device terminal or a server capable of executing a computer program. The device terminal is, for example, a smart phone, a personal computer, a tablet computer, a personal digital assistant, or a mobile internet appliance. A server refers to a device that provides computing services over a network, such as an x86 server or a non-x86 server, where non-x86 servers include mainframes, minicomputers, and UNIX servers.
Application scenarios to which the fault determination method is applicable are described below, where the application scenarios include, but are not limited to, operational data centers or hosting machine rooms. For example, when a traditional building management system (BMS) is used to manage a data center or a hosting machine room, auxiliary monitoring of the data center or hosting machine room is mostly done from the viewpoint of equipment health; however, a traditional BMS analyzes the index data generated by infrastructure equipment against a single index, and performs fault analysis on the infrastructure using only a simple relational expression over a single measurement point or a virtual measurement point. When a large-scale sudden fault or a multi-fault scenario (such as a mains supply interruption) occurs in a data center, large-scale alarm storms easily occur, and fault analysis of the infrastructure using a simple relational expression over a single measurement point or a virtual measurement point is then inefficient. In this case, the rule expression determined by the plurality of parameter indexes in the fault determination method can be used to perform fault analysis on the real-time collected data, so that each data center can independently configure alarm rules for monitoring, thereby forming a complete, quantifiable, unified evaluation standard, effectively supporting fault analysis at a complex service level, and mitigating the limitation that the index data of a single measuring point can only describe whether that index is high or low, so that the efficiency of fault analysis on the target equipment is improved.
Please refer to fig. 1, which is a flowchart illustrating a fault determining method according to an embodiment of the present application; the embodiment of the application provides a fault determination method, which comprises the following steps:
Step S110: and determining the operation parameters of the target equipment according to the real-time acquisition data of the target equipment.
Target devices (Equipment) are devices that need to be monitored for faults, including but not limited to: servers, lithium iron phosphate batteries, and the like.
An operation parameter refers to variable information that represents the operating condition and state of a target device (Equipment) in an infrastructure (Facility) during operation, for example: temperature, humidity, voltage or current, etc.
Step S120: searching an alarm rule corresponding to the operation parameter, wherein the alarm rule comprises: a rule expression determined by a plurality of parameter indexes.
The alarm rule refers to a decision rule used in an alarm scene to detect whether to trigger an alarm message, where the alarm rule may include: parameter indexes, attribute standards, and/or function operators, etc. These three concepts are explained below.
Please refer to fig. 2, which illustrates a schematic diagram of a device operation parameter index provided by an embodiment of the present application; the parameter index refers to a device operation parameter index predefined according to an operation parameter, that is, an index defined for the real-time collected data monitored for target devices of each brand or model, including but not limited to: basic information (e.g., number, name, or data type of the device), functional description (e.g., ab_line voltage), valid collection value range (e.g., 0 V or between 350 V and 450 V), etc., where the valid range expression may be this == 0 || (this > 350 && this < 450).
The attribute standard refers to a basic physical attribute and/or a business use attribute, etc., determined by the device manufacturer, where basic physical attributes are, for example, rated voltage and rated capacity, and business use attributes are, for example, temperature sensor location and humidity sensor location (cold channel or hot channel, etc.).
The function operator refers to a symbol for operating on parameter indexes and/or attribute standards, and takes two forms, function and operator. Symbols in function form may include: mathematical functions (ABS, MAX, MIN, SUM, AVG, COUNT, SQRT, LOG, etc.), trigonometric functions (SIN, COS, TAN, ATAN), logical decision functions (e.g., IF, ELSE), etc. Symbols in operator form may include: arithmetic operators (+, -, /, %, etc.), relational operators (>, <, >=, <=, ==, !=, etc.), logical operators (e.g., &&, !, ||), etc.
Step S130: and performing fault analysis on the real-time collected data through a rule expression determined by the plurality of parameter indexes to obtain a fault analysis result of the target equipment.
It can be understood that the rule expression determined by the multiple parameter indexes can express any complex rule, the complex rule can perform fault analysis on real-time collected data of target equipment of the data center, and abnormal parameter indexes can be determined from the multiple parameter indexes according to a fault analysis result so as to provide service alarm information for on-duty monitoring personnel of the data center and guide information for removing faults and disposing faults, thereby assisting first-line monitoring and fault removing personnel to find problems, locate problems and solve problems. Of course, the affected service clients can be analyzed according to the fault analysis result and the abnormal parameter index, and the influence notification message can be timely sent to the service clients.
In the implementation of the scheme, the alarm rule formed by the rule expression determined by the plurality of parameter indexes is used to perform fault analysis on the real-time collected data to obtain the fault analysis result of the target equipment, so that each data center can independently configure alarm rules for monitoring, thereby forming a complete, quantifiable, unified evaluation standard, effectively supporting fault analysis at a complex service level, and mitigating the limitation that the index data of a single measuring point can only describe whether that index is high or low, so that the efficiency of fault analysis on the target equipment is improved.
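As an illustration of step S130, the following sketch evaluates rule expressions that combine several parameter indexes and returns the highest matching alarm level. It is an assumption-laden example rather than the application's implementation: the expressions use Python syntax (`and`, `or`) instead of `&&` and `||`, the evaluator supports only a small whitelist of operators and functions, and the names `evaluate`, `alarm_level`, `RULES`, `voltage_a` and `voltage_b` are hypothetical.

```python
import ast
import operator
from typing import Any, Dict, List, Tuple

BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
           ast.Div: operator.truediv, ast.Mod: operator.mod}
CMP_OPS = {ast.Gt: operator.gt, ast.Lt: operator.lt, ast.GtE: operator.ge,
           ast.LtE: operator.le, ast.Eq: operator.eq, ast.NotEq: operator.ne}
FUNCTIONS = {"ABS": abs, "MAX": max, "MIN": min}  # small subset, for illustration


def evaluate(node: ast.AST, env: Dict[str, float]) -> Any:
    """Evaluate a whitelisted expression tree against collected values."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body, env)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]  # a parameter index looked up by name
    if isinstance(node, ast.BinOp):
        return BIN_OPS[type(node.op)](evaluate(node.left, env),
                                      evaluate(node.right, env))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not evaluate(node.operand, env)
    if isinstance(node, ast.BoolOp):
        values = [evaluate(v, env) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare):
        left = evaluate(node.left, env)
        for op, comparator in zip(node.ops, node.comparators):
            right = evaluate(comparator, env)
            if not CMP_OPS[type(op)](left, right):
                return False
            left = right
        return True
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return FUNCTIONS[node.func.id](*[evaluate(a, env) for a in node.args])
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def alarm_level(rules: List[Tuple[int, str]], data: Dict[str, float]) -> int:
    """Return the highest level whose rule expression matches, or 0 if none do."""
    matched = [level for level, expr in rules
               if evaluate(ast.parse(expr, mode="eval"), data)]
    return max(matched, default=0)


# Hypothetical rules for a cabinet fed by two power paths.
RULES = [
    (1, "ABS(voltage_a - voltage_b) > 20"),
    (2, "voltage_a == 0 or voltage_b == 0"),
    (3, "voltage_a == 0 and voltage_b == 0"),
]
```

Walking a parsed syntax tree with a whitelist, rather than calling `eval` on the raw string, keeps user-configured rules from executing arbitrary code; a production rule engine would likely need a fuller grammar.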
As an alternative embodiment of the fault determining method, before determining the operation parameter of the target device according to the real-time collected data of the target device, the method may further include:
Step S101: and extracting data fields from the real-time acquired data, and searching a filtering expression corresponding to the data fields.
The embodiment of step S101 described above is, for example: the real-time collected data can be obtained from the BMS, two types of data fields, namely parameter indexes and attribute standards, are extracted from the real-time collected data, and the filtering expressions corresponding to the data fields are then searched by an executable program compiled or interpreted from a preset programming language. Programming languages that can be used are, for example: C, C++, Java, BASIC, JavaScript, LISP, Shell, Perl, Ruby, Python, PHP, and the like.
Step S102: the real-time acquisition data is filtered using the filter expression.
The embodiment of step S102 described above is, for example: the filter expression may be set according to the specific situation, such as using the filter expression to reject, from the real-time collected data, data whose data field value deviates by more than 50% from the average value of the past 10 minutes. As another example, the filter expression may express the valid range corresponding to each field: if the valid range of the ab_line voltage is 0 V or between 350 V and 450 V, the filter expression this == 0 || (this > 350 && this < 450) may be used to filter out of the real-time collected data any data outside the valid range, so as to reduce invalid alarm messages; here, this is a wildcard for the current acquisition value.
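A valid-range filter of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the `FILTERS` table, the field key `ab_line_voltage`, and the use of Python callables in place of stored filter expressions are hypothetical, and the `this` argument plays the role of the wildcard for the current acquisition value.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical filter table: field name -> predicate over the current value.
# The lambda encodes the valid range "0 V, or between 350 V and 450 V".
FILTERS: Dict[str, Callable[[float], bool]] = {
    "ab_line_voltage": lambda this: this == 0 or (350 < this < 450),
}


def filter_samples(samples: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Keep only samples whose value satisfies the field's filter expression."""
    kept = []
    for field, value in samples:
        predicate = FILTERS.get(field)
        # Fields without a configured filter pass through unchanged.
        if predicate is None or predicate(value):
            kept.append((field, value))
    return kept


samples = [("ab_line_voltage", 0.0), ("ab_line_voltage", 380.0),
           ("ab_line_voltage", 500.0), ("temperature", 25.0)]
valid = filter_samples(samples)  # the 500.0 reading is dropped
```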
Optionally, in practice, before determining the operation parameters of the target device according to the real-time collected data of the target device, the real-time collected data may be filtered according to the attribute standard of the target device, for example: assuming the rated voltage of the target device is 350 V to 390 V, if the monitored real-time collected data is 400 V, the 400 V value may be filtered out or modified to 390 V. Optionally, after filtering the real-time collected data, the missing values in the real-time collected data (i.e., the values left missing after removal) may be filled, for example by filling in the average value of the past 10 minutes, which reduces the influence of missing values on the fault analysis result and effectively improves its accuracy.
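The two options above, modifying an out-of-range reading to the rated bound and filling removed readings with a recent average, can be sketched as follows. The function names `clamp_to_rating` and `fill_missing` are hypothetical, and removed readings are represented as `None` purely for illustration.

```python
from statistics import mean
from typing import List, Optional


def clamp_to_rating(value: float, low: float, high: float) -> float:
    """Modify an out-of-range reading to the nearest rated bound."""
    return max(low, min(high, value))


def fill_missing(values: List[Optional[float]], history: List[float]) -> List[float]:
    """Replace removed readings (None) with the mean of recent history."""
    fallback = mean(history)
    return [fallback if v is None else v for v in values]


# A 400 V reading against a 350 V to 390 V rating is modified to 390 V.
clamped = clamp_to_rating(400.0, 350.0, 390.0)
# The history stands in for the readings of the past 10 minutes.
filled = fill_missing([380.0, None, 382.0], history=[378.0, 380.0, 382.0])
```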
As an alternative embodiment of the above step S110, an embodiment of determining the operation parameter of the target device according to the real-time collected data of the target device may include:
Step S111: the data field is extracted from the real-time acquisition data.
The implementation manner of the step S111 is as follows: the data fields are extracted from the real-time collected data using an executable program compiled or interpreted from a preset programming language, such as: C, C++, Java, BASIC, JavaScript, LISP, Shell, Perl, Ruby, Python, PHP, and the like.
Step S112: and searching an enumeration value mapped by the data field in a configuration file, determining the enumeration value as an operation parameter of the target equipment, and storing a mapping relation between the data field and the enumeration value in the configuration file.
The embodiment of step S112 described above is, for example: because the data field names defined by the BMS at different data centers may be inconsistent, the data field names may be mapped in the configuration file so that the field names defined by the BMS are mapped to standard field names. The executable program searches the configuration file for the enumeration value mapped by the data field and determines the enumeration value as the operation parameter of the target equipment, so that the field names are consistent in fault analysis. As another example, referring to the following table, if the switch state is reported incorrectly at some data centers because of problems in the underlying communication protocol or wiring errors, the real-time collected data can be reverse-mapped according to the table: if the value before mapping is 0, it is reverse-mapped to 1, which represents the switch being closed; similarly, if the value before mapping is 1, it is reverse-mapped to 0, which represents the switch being open.
Value before mapping | Value after mapping | Meaning after mapping
0 | 1 | Switch closed
1 | 0 | Switch open
As another example, in the weak current field, when the data field is of a state quantity type, mapping can be performed according to the enumeration values defined by the default weak current standard in the configuration file. If mapping cannot be performed according to those enumeration values (for example, because of old equipment versions at a data center or BMS limitations), a new field can be manually added to the configuration file and mapping can be performed according to the new field; that is, the enumeration value mapped by the new field is searched in the configuration file and determined as the operation parameter of the target equipment, so that the fields are consistent within the scope of data center infrastructure management (Data Center Infrastructure Management, DCIM) and the field names are consistent in fault analysis.
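The configuration-file lookup of step S112 can be sketched as follows. This is a minimal illustration in which the configuration is an in-memory dictionary rather than a file; the structure of `CONFIG`, the raw field name `SW_STAT`, the standard name `switch_state`, and the function `to_operation_parameter` are all hypothetical.

```python
from typing import Tuple

# Hypothetical configuration content; in practice it would be loaded from a file.
CONFIG = {
    # BMS-defined field name -> standard field name
    "field_names": {"SW_STAT": "switch_state"},
    # standard field name -> {raw value: (mapped value, meaning)};
    # the reverse mapping corrects sites that report the switch state inverted
    "enum_values": {
        "switch_state": {0: (1, "closed"), 1: (0, "open")},
    },
}


def to_operation_parameter(raw_field: str, raw_value: int) -> Tuple[str, int, str]:
    """Map a BMS field and raw value to a standard name, enum value and meaning."""
    name = CONFIG["field_names"].get(raw_field, raw_field)
    mapped, meaning = CONFIG["enum_values"][name][raw_value]
    return name, mapped, meaning


result = to_operation_parameter("SW_STAT", 0)
```

Adding a new field for an older device then amounts to adding one more entry under each key, without changing the lookup code.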
As an alternative implementation manner of the step S120, an implementation manner of searching for the alarm rule corresponding to the operation parameter may include:
Step S121: and searching an alarm template corresponding to the operation parameter in the alarm templates.
The alarm template refers to a template defined in advance for the alarm standards of a data center monitoring and troubleshooting alarm scene, and the alarm template can comprise data fields such as alarm main body, alarm mode and alarm description. The alarm main body may be based on a device type (including basic physical information, description, location, type, brand, model, etc. of the device), location information (including location name, hierarchical relationship, location type, etc.), a device cabinet, a data center, etc.; the alarm mode may include two modes, a message mode and an abnormal mode; and the alarm description may include: equipment fault description, location environment risk description, cabinet power rule description, data center overall fault description, and the like.
An embodiment of step S121 is, for example: the operation parameter is analyzed to obtain an alarm main body, an alarm mode and/or an alarm description, and the plurality of alarm templates are then matched against these to find the alarm template corresponding to the operation parameter. It will be appreciated that the alarm templates may have inheritance relationships between them. For example, when the standards for the same event detection mechanism differ between regions, a national alarm template, a regional alarm template, a campus alarm template and the alarm template of a specific data center can each be inherited and customized in turn; that is, a lower-level alarm template can inherit an upper-level alarm template. When a rule expression is finally instantiated from an alarm template for fault analysis, the rule expression of the lower-level alarm template is used preferentially, yielding an alarm rule expression instance (also called an alarm instance). If the lower-level alarm template has no rule expression, the rule expression of the upper-level alarm template is instantiated instead. For example, alarm instances may be generated one by one for all cabinets of the data center and the cabinet distribution relationships, and the instance content may include single-circuit power loss of a cabinet, dual-circuit power loss of a cabinet, and the like, so that a user can adjust and modify the rule expression according to the actual situation of the data center.
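The inheritance lookup described above can be illustrated with a minimal sketch. The class `AlarmTemplate`, the function `resolve_rule_expression` and the sample rule strings are hypothetical and are used only to show the "lower level first, otherwise upper level" resolution order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmTemplate:
    """Hypothetical alarm template that may inherit from an upper-level template."""
    name: str
    rule_expression: Optional[str] = None
    parent: Optional["AlarmTemplate"] = None


def resolve_rule_expression(template: AlarmTemplate) -> Optional[str]:
    """Return the nearest rule expression, preferring the lowest level."""
    current: Optional[AlarmTemplate] = template
    while current is not None:
        if current.rule_expression is not None:
            return current.rule_expression
        current = current.parent  # fall back to the upper-level template
    return None


# Illustrative hierarchy: national -> regional -> campus -> data center.
national = AlarmTemplate("national", "temperature > 28")
regional = AlarmTemplate("regional", parent=national)
campus = AlarmTemplate("campus", "temperature > 26", parent=regional)
data_center = AlarmTemplate("data_center", parent=campus)
```

Here the data center template defines no rule expression of its own, so the campus expression is used; the regional template falls back to the national one.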
Step S122: fill the data field into the alarm template corresponding to the operation parameter to obtain the alarm rule corresponding to the operation parameter.
An embodiment of step S122 is, for example: the device type, location type, cabinet or data center is filled into the alarm main body field of the alarm template corresponding to the operation parameter; and/or the message mode or abnormal mode is filled into the alarm mode field of that template; and/or the device fault description, location environment risk description, cabinet power rule description or data center overall fault description is filled into the alarm description field of that template, thereby obtaining the alarm rule corresponding to the operation parameter.
It may be understood that the alarm modes include a message mode, in which the system generates and sends an alarm message only when the trigger condition of the alarm message is satisfied for the first time, and does not send the alarm message if the trigger condition is satisfied again later. In message mode, the system treats the alarm message as a notification of a system state change that the user needs to know about, and does not judge whether it indicates an error. This mode suits situations where the environment is unstable and the user needs to be informed of state changes of the system. For example, when air conditioners are switched on and off under group control, the user needs to know that the state of a group-controlled air conditioner has changed, but that state (started or stopped) may change at any time and a stop is not necessarily abnormal (for example, the air conditioner neither cools nor heats when the temperature is suitable); in this case the alarm can be set to message mode.
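A minimal sketch of the message mode behaviour follows. The class `MessageModeNotifier` and the alarm key are hypothetical, and the sketch assumes that "first time" is tracked per alarm key and never reset, which the description does not specify.

```python
class MessageModeNotifier:
    """Decides whether an alarm message is sent in message mode."""

    def __init__(self) -> None:
        # Keys of alarms whose trigger condition has already been met once.
        self._already_sent = set()

    def should_send(self, alarm_key: str, triggered: bool) -> bool:
        """Return True only the first time the trigger condition is satisfied."""
        if triggered and alarm_key not in self._already_sent:
            self._already_sent.add(alarm_key)
            return True
        return False
```

For example, a first `should_send("ac_stopped", True)` call returns `True`, and later calls with the same key return `False`.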
It may be appreciated that the alarm modes also include an abnormal mode, in which the system keeps monitoring a problem after detecting it, and generates and sends an alarm message only when the state changes: when the problem occurs (i.e., an alarm level is obtained in the fault analysis result), when it is upgraded (i.e., the alarm level rises) or when it is downgraded (i.e., the alarm level falls). This effectively helps the user monitor changes in the abnormal state of a problem. For example: when the temperature of the air conditioner rises to 22 ℃, a general alarm message indicating that the temperature is too high can be generated and sent; when it rises to 24 ℃, an upgraded alarm message indicating that the temperature is higher can be generated and sent; when it rises to 28 ℃, a severe alarm message indicating that the temperature is very high can be generated and sent; when it drops below 28 ℃, a downgraded alarm message indicating that the temperature is higher can be generated and sent; when it drops below 24 ℃, a downgraded alarm message indicating that the temperature is too high can be generated and sent; and when it drops below 22 ℃, a general alarm message indicating that the temperature has recovered can be generated and sent.
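The abnormal mode example can be sketched as a small state tracker that emits an event only when the alarm level changes. The thresholds 22, 24 and 28 come from the example above; the level names, the event labels and the functions `level_for_temperature` and `abnormal_mode_events` are assumptions made for illustration.

```python
from typing import List, Tuple

# Level names indexed by level number (illustrative labels).
LEVELS = ["none", "general", "upgraded", "severe"]


def level_for_temperature(temp_c: float) -> int:
    """Map a temperature to an alarm level using the 22/24/28 thresholds."""
    if temp_c >= 28:
        return 3
    if temp_c >= 24:
        return 2
    if temp_c >= 22:
        return 1
    return 0


def abnormal_mode_events(readings: List[float]) -> List[Tuple[str, str]]:
    """Emit (event, level) pairs only when the alarm level changes."""
    events = []
    previous = 0  # start with no alarm
    for temp in readings:
        current = level_for_temperature(temp)
        if current > previous:
            events.append(("raised", LEVELS[current]))
        elif current < previous:
            kind = "recovered" if current == 0 else "lowered"
            events.append((kind, LEVELS[current]))
        previous = current
    return events
```

Readings that stay within the same level produce no event, which is what distinguishes this mode from sending a message on every sample.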
As an optional embodiment of step S130, the fault analysis result may include an alarm level, and performing fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes may include:
Step S131: splice the plurality of parameter indexes according to arithmetic operators, logical operators, relational operators and/or functions to obtain the rule expression.
Step S132: match the real-time collected data using the rule expression to obtain the alarm level.
Embodiments of steps S131 to S132 are, for example: in the condensing pressure scenario of a water-cooled chiller, a plurality of parameter indexes can be spliced using arithmetic operators, logical operators, relational operators and/or functions to obtain an occurrence rule expression such as: (attr_chlw_condenser_press<710)&&(attr_chlw_working_state==1&&(attr_chlw_system_mode==2||attr_chlw_system_mode==3)). This rule expression means that if matching it against the real-time collected data determines that the condensing pressure of the chiller is less than 710, the chiller is in a running state, and the operating mode of the chiller is the mechanical refrigeration mode or the precooling mode, then the alarm level is set to a general alarm indicating that the condensing pressure of the chiller is too low, which can trigger the generation and sending of the corresponding general alarm message. Similarly, a recovery rule expression obtained by splicing a plurality of parameter indexes is, for example: attr_chlw_condensing_press>800||attr_chlw_working_state==0. This rule expression means that if matching it against the real-time collected data determines that the condensing pressure of the chiller is greater than 800, or that the chiller has stopped running, the alarm level is set to a recovery alarm level or to no alarm.
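One possible way to match real-time data against such rule expressions is sketched below. It is not the evaluator of the application: it supports only `&&`, `||` and relational operators (no arithmetic operators or functions), it rewrites the expression into Python syntax and walks the tree produced by the standard `ast` module, and it assumes that the two pressure identifiers in the text refer to the same measuring point, using `attr_chlw_condenser_press` for both rules. The function names and sample values are hypothetical.

```python
import ast
import operator

# Relational operators supported by this sketch.
_COMPARATORS = {
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def _eval(node, data):
    """Recursively evaluate a restricted expression tree against the data."""
    if isinstance(node, ast.BoolOp):
        values = [_eval(v, data) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare):
        left = _eval(node.left, data)
        for op, comparator in zip(node.ops, node.comparators):
            right = _eval(comparator, data)
            if not _COMPARATORS[type(op)](left, right):
                return False
            left = right
        return True
    if isinstance(node, ast.Name):
        return data[node.id]  # parameter index looked up in real-time data
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError("unsupported syntax in rule expression")


def evaluate_rule(expression: str, data: dict) -> bool:
    """Evaluate a rule expression written with && and || against the data."""
    python_expr = expression.replace("&&", " and ").replace("||", " or ")
    tree = ast.parse(python_expr.strip(), mode="eval")
    return bool(_eval(tree.body, data))


OCCURRENCE_RULE = (
    "(attr_chlw_condenser_press<710)&&(attr_chlw_working_state==1&&"
    "(attr_chlw_system_mode==2||attr_chlw_system_mode==3))"
)
RECOVERY_RULE = "attr_chlw_condenser_press>800||attr_chlw_working_state==0"
```

Walking the syntax tree rather than calling `eval` keeps the sketch limited to the operators listed, so an unexpected construct in a rule raises an error instead of being executed.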
As an optional embodiment of step S130, the fault analysis result may further include an alarm message, and performing fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes may further include:
Step S133: if the alarm level is greater than the level threshold, generate an alarm message according to the operation parameter of the target device.
An embodiment of step S133 is described here using the temperature detected by a temperature sensor as an example. Assume that the alarm levels obtained through the rule expression determined by the plurality of parameter indexes fall into three levels: no alarm, general alarm and severe alarm. For example, when the temperature is less than 25 ℃, the alarm level is set to no alarm; when the temperature is greater than 25 ℃ and less than 28 ℃, the alarm level is set to general alarm; and when the temperature is greater than 28 ℃, the alarm level is set to severe alarm. Assuming that the level threshold is no alarm, when the temperature is 27 ℃ the alarm level is general alarm, which is greater than the level threshold, so a general alarm message should be generated according to the operation parameter (i.e., the temperature) of the target device. Similarly, assuming that the level threshold is general alarm, when the temperature is 30 ℃ the alarm level is severe alarm, which is greater than the level threshold, so a severe alarm message should be generated according to the operation parameter (i.e., the temperature) of the target device.
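The threshold comparison in this example can be sketched as follows. The enumeration `AlarmLevel`, the function names and the message format are assumptions for illustration; the text does not say which level applies at exactly 25 ℃ or 28 ℃, so the sketch arbitrarily assigns those boundary values to the lower level.

```python
from enum import IntEnum
from typing import Optional


class AlarmLevel(IntEnum):
    NO_ALARM = 0
    GENERAL = 1
    SEVERE = 2


def classify_temperature(temp_c: float) -> AlarmLevel:
    """Classify a temperature into one of the three example levels."""
    if temp_c > 28:
        return AlarmLevel.SEVERE
    if temp_c > 25:
        return AlarmLevel.GENERAL
    return AlarmLevel.NO_ALARM  # boundary handling is an assumption


def build_alarm_message(temp_c: float, level_threshold: AlarmLevel) -> Optional[str]:
    """Generate a message only if the level exceeds the level threshold."""
    level = classify_temperature(temp_c)
    if level > level_threshold:
        return f"{level.name}: temperature is {temp_c} degrees C"
    return None
```

With a threshold of `NO_ALARM`, 27 ℃ yields a general alarm message; with a threshold of `GENERAL`, the same reading yields no message, while 30 ℃ yields a severe alarm message.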
As an optional embodiment of step S130, after the alarm message is generated according to the operation parameter of the target device, the alarm message may further be sent, which may include:
Step S134: if the alarm mode of the target device is a preset mode and the alarm level has changed, send the alarm message.
An embodiment of step S134 is, for example: if the alarm mode of the target device is the abnormal mode and the alarm level has changed, the alarm message is sent. The alarm level changes, for example, as follows: when the temperature of the air conditioner rises to 22 ℃, a general alarm message indicating that the temperature is too high can be generated and sent; when it rises to 24 ℃, an upgraded alarm message indicating that the temperature is higher can be generated and sent; when it rises to 28 ℃, a severe alarm message indicating that the temperature is very high can be generated and sent; when it drops below 28 ℃, a downgraded alarm message indicating that the temperature is higher can be generated and sent; when it drops below 24 ℃, a downgraded alarm message indicating that the temperature is too high can be generated and sent; and when it drops below 22 ℃, a general alarm message indicating that the temperature has recovered can be generated and sent.
Optionally, after the alarm message generated by the alarm engine is obtained, the troubleshooting guide corresponding to the alarm message can be looked up according to the troubleshooting alarm scenario, to guide front-line monitoring and troubleshooting personnel in handling the fault. For example, an emergency plan (Emergency of Plan, EOP) can be defined in advance for complex troubleshooting alarm scenarios, so that after an alarm message generated by the alarm engine is obtained, the emergency plan corresponding to the alarm message can be found directly. The emergency plan is, for example: merging measuring-point-level alarms into cabinet-level, device-level, location-level or data-center-level alarms, merging repeated alarm messages, and the like.
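As a rough illustration of merging measuring-point-level alarms into cabinet-level alarms and merging repeated alarm messages, the following sketch groups alarms by cabinet and drops exact repeats. The dictionary keys `cabinet`, `point` and `message` and the function `merge_alarms` are hypothetical; the application does not define a data format for alarm messages.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def merge_alarms(alarms: List[dict]) -> Dict[str, List[Tuple[str, str]]]:
    """Group measuring-point alarms by cabinet and drop repeated messages."""
    merged = defaultdict(list)
    for alarm in alarms:
        entry = (alarm["point"], alarm["message"])
        # Skip an alarm already recorded for this cabinet (repeated message).
        if entry not in merged[alarm["cabinet"]]:
            merged[alarm["cabinet"]].append(entry)
    return dict(merged)
```

The same grouping idea could be applied with a device, location or data center key instead of the cabinet key.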
Referring to FIG. 3, a schematic structural diagram of a fault determining apparatus according to an embodiment of the present application is shown. An embodiment of the present application provides a fault determining apparatus 200, which includes:
The operation parameter determining module 210 is configured to determine an operation parameter of the target device according to the real-time collected data of the target device.
The alarm rule searching module 220 is configured to search for an alarm rule corresponding to the operation parameter, where the alarm rule includes a rule expression determined by a plurality of parameter indexes.
The analysis result obtaining module 230 is configured to perform fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes, so as to obtain a fault analysis result of the target device.
Optionally, in an embodiment of the present application, the fault determining apparatus further includes:
The collected data extraction module is configured to extract a data field from the real-time collected data and search for a filtering expression corresponding to the data field.
The collected data filtering module is configured to filter the real-time collected data using the filtering expression.
Optionally, in an embodiment of the present application, the operation parameter determining module includes:
The data field extraction sub-module is configured to extract the data field from the real-time collected data.
The operation parameter determination sub-module is configured to search a configuration file for the enumeration value mapped by the data field and determine the enumeration value as the operation parameter of the target device, where the configuration file stores the mapping relationship between the data field and the enumeration value.
Optionally, in an embodiment of the present application, the alarm rule searching module includes:
The alarm template searching sub-module is configured to search a plurality of alarm templates for the alarm template corresponding to the operation parameter.
The alarm template filling sub-module is configured to fill the data field into the alarm template corresponding to the operation parameter to obtain the alarm rule corresponding to the operation parameter.
Optionally, in an embodiment of the present application, the fault analysis result includes an alarm level, and the analysis result obtaining module includes:
The parameter index splicing sub-module is configured to splice the plurality of parameter indexes according to arithmetic operators, logical operators, relational operators and/or functions to obtain the rule expression.
The alarm level obtaining sub-module is configured to match the real-time collected data using the rule expression to obtain the alarm level.
Optionally, in an embodiment of the present application, the fault analysis result further includes an alarm message, and the analysis result obtaining module further includes:
The alarm message generation sub-module is configured to generate the alarm message according to the operation parameter of the target device if the alarm level is greater than the level threshold.
Optionally, in an embodiment of the present application, the analysis result obtaining module further includes:
The alarm message sending sub-module is configured to send the alarm message if the alarm mode of the target device is a preset mode and the alarm level has changed.
It should be understood that the apparatus corresponds to the fault determination method embodiments described above and can perform the steps involved in those method embodiments. For the specific functions of the apparatus, refer to the description above; detailed descriptions are omitted here where appropriate to avoid repetition. The apparatus includes at least one software functional module that can be stored in memory in the form of software or firmware, or built into the operating system (OS) of the apparatus.
Please refer to fig. 4, which illustrates a schematic structural diagram of an electronic device according to an embodiment of the present application. An electronic device 300 provided in an embodiment of the present application includes: a processor 310 and a memory 320, the memory 320 storing machine-readable instructions executable by the processor 310, which when executed by the processor 310 perform the method as described above.
An embodiment of the present application also provides a computer-readable storage medium 330, on which a computer program is stored; when executed by the processor 310, the computer program performs the method described above. The computer-readable storage medium 330 may be implemented by any type of volatile or nonvolatile memory device or a combination thereof, such as static random access memory (Static Random Access Memory, SRAM for short), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM for short), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM for short), programmable read-only memory (Programmable Read-Only Memory, PROM for short), read-only memory (ROM for short), magnetic memory, flash memory, magnetic disk, or optical disk.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. The apparatus embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
In the embodiments of the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatuses, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
In addition, the functional modules of the embodiments of the present application may be integrated together to form a single part, or the modules may exist separately, or two or more modules may be integrated to form a single part. Furthermore, in the description herein, terms such as "one embodiment," "some embodiments," "examples," "specific examples," and "some examples" mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, such terms do not necessarily refer to the same embodiment or example. The particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided that they do not contradict one another.
The foregoing is merely an optional implementation of the embodiments of the present application, and the scope of the embodiments of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope of the embodiments of the present application shall be covered by the scope of the embodiments of the present application.

Claims (10)

1. A fault determination method, comprising:
Determining the operation parameters of target equipment according to the real-time acquisition data of the target equipment;
searching an alarm rule corresponding to the operation parameter, wherein the alarm rule comprises: a rule expression determined by a plurality of parameter indexes;
and performing fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes to obtain a fault analysis result of the target equipment.
2. The method of claim 1, further comprising, prior to said determining the operating parameters of the target device from the real-time acquisition data of the target device:
Extracting a data field from the real-time acquired data, and searching a filtering expression corresponding to the data field;
and filtering the real-time acquired data by using the filtering expression.
3. The method of claim 1, wherein determining the operating parameter of the target device from the real-time acquisition data of the target device comprises:
extracting a data field from the real-time acquisition data;
And searching an enumeration value mapped by the data field in a configuration file, determining the enumeration value as an operation parameter of the target equipment, wherein the configuration file stores a mapping relation between the data field and the enumeration value.
4. A method according to claim 3, wherein said searching for an alarm rule corresponding to said operating parameter comprises:
searching an alarm template corresponding to the operation parameter in a plurality of alarm templates;
And filling the data field into an alarm template corresponding to the operation parameter to obtain an alarm rule corresponding to the operation parameter.
5. The method of claim 1, wherein the fault analysis results comprise: alarm level; the performing fault analysis on the real-time collected data by using the rule expression determined by the parameter indexes comprises the following steps:
splicing the parameter indexes according to an arithmetic operator, a logic operator, a relational operator and/or a function to obtain the rule expression;
and matching the real-time acquired data by using the rule expression to obtain an alarm grade.
6. The method of claim 5, wherein the fault analysis result further comprises: an alarm message; the performing fault analysis on the real-time collected data by using the rule expression determined by the parameter indexes further includes:
and if the alarm level is greater than a level threshold, generating the alarm message according to the operation parameters of the target equipment.
7. The method of claim 6, further comprising, after said generating said alert message based on an operating parameter of said target device:
And if the alarm mode of the target equipment is a preset mode and the alarm level is changed, sending the alarm message.
8. A fault determination apparatus, comprising:
The operation parameter determining module is used for determining the operation parameters of the target equipment according to the real-time acquisition data of the target equipment;
the alarm rule searching module is used for searching alarm rules corresponding to the operation parameters, and the alarm rules comprise: a rule expression determined by a plurality of parameter indexes;
and the analysis result obtaining module is used for carrying out fault analysis on the real-time collected data through the rule expression determined by the plurality of parameter indexes to obtain a fault analysis result of the target equipment.
9. An electronic device, comprising: a processor and a memory storing machine-readable instructions executable by the processor to perform the method of any one of claims 1 to 7 when executed by the processor.
10. A computer-readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, performs the method according to any of claims 1 to 7.
CN202311862778.9A 2023-12-29 2023-12-29 Fault determination method and device, electronic equipment and storage medium Pending CN117909651A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311862778.9A CN117909651A (en) 2023-12-29 2023-12-29 Fault determination method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311862778.9A CN117909651A (en) 2023-12-29 2023-12-29 Fault determination method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117909651A true CN117909651A (en) 2024-04-19

Family

ID=90681195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311862778.9A Pending CN117909651A (en) 2023-12-29 2023-12-29 Fault determination method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117909651A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination