WO2024012109A1 - Inspection method, alarm method, inspection system, and computer-readable storage medium - Google Patents

Inspection method, alarm method, inspection system, and computer-readable storage medium

Info

Publication number
WO2024012109A1
Authority
WO
WIPO (PCT)
Prior art keywords
inspection
monitoring object
real
event
analysis
Prior art date
Application number
PCT/CN2023/099167
Other languages
French (fr)
Chinese (zh)
Inventor
单童
李学领
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2024012109A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements

Definitions

  • Embodiments of the present application relate to, but are not limited to, the technical field of network operation and maintenance, and particularly relate to an inspection method, an alarm method, an inspection system, and a computer-readable storage medium.
  • Operation and maintenance management essentially refers to the continuous operation and maintenance of software and hardware in a network so that performance reaches an acceptable state in terms of cost, stability and efficiency. It is a long-term, complex and continuous activity. Automatic inspection technology serves operation and maintenance management by relying on a specific intelligent management platform to hand periodic, repetitive and regular work over to tools, thereby improving operation and maintenance efficiency. However, traditional automatic inspection technology still relies on the experience of operators and cannot handle faults in the network in a timely manner, making it difficult to guarantee the accuracy and real-time performance of services.
  • Embodiments of the present application provide an inspection method, an alarm method, an inspection system, and a computer-readable storage medium.
  • In a first aspect, embodiments of the present application provide an inspection method, which includes: obtaining real-time operating data of a monitoring object; analyzing the real-time operating data to obtain an analysis result; inspecting the monitoring object according to the analysis result to obtain an inspection result; and correcting and continuously monitoring the monitoring object based on the inspection result.
  • In a second aspect, embodiments of the present application provide an alarm method, which includes: obtaining an alarm signal of a monitoring object; performing intent recognition on the monitoring object according to the alarm signal to obtain the alarm intent of the monitoring object; inspecting the monitoring object according to the alarm intent to obtain an inspection result; and performing fault determination and correction on the monitoring object based on the inspection result.
  • Embodiments of the present application provide an inspection system, including: an intelligent control center configured to execute the inspection method described in the first aspect and/or the alarm method described in the second aspect; a measurement unit communicatively connected with the intelligent control center and the monitoring object respectively, and configured to obtain the real-time operating data of the monitoring object and send the real-time operating data to the intelligent control center; a control unit communicatively connected with the intelligent control center and configured to receive instructions sent by the intelligent control center, wherein the instructions include inspection instructions and correction instructions; and an execution mechanism communicatively connected with the control unit and the monitoring object respectively, wherein the control unit is also configured to control the execution mechanism to perform corresponding operations on the monitoring object according to the instructions.
  • Embodiments of the present application provide an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of the above methods are implemented.
  • Embodiments of the present application provide a computer-readable storage medium that stores a computer-executable program. The computer-executable program is used to cause a computer to execute the inspection method described in the first aspect and/or the alarm method described in the second aspect.
  • Figure 1 is a main flow chart of an inspection method provided by an embodiment of the present application.
  • Figure 2 is another sub-flow chart of an inspection method provided by an embodiment of the present application.
  • Figure 3 is another sub-flow chart of an inspection method provided by an embodiment of the present application.
  • Figure 4 is another sub-flow chart of an inspection method provided by an embodiment of the present application.
  • Figure 5 is another sub-flow chart of an inspection method provided by an embodiment of the present application.
  • Figure 6 is another sub-flow chart of an inspection method provided by an embodiment of the present application.
  • Figure 7 is a main flow chart of an alarm method provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of a networking structure under a multi-network element topology provided by an embodiment of the present application.
  • Figure 9 is a schematic structural diagram of an inspection system provided by an embodiment of the present application.
  • Figure 10 is a flow chart of an inspection method in which the monitoring object is system status provided by an embodiment of the present application.
  • Figure 11 is a flow chart of an inspection method in which the monitoring object is event/scenario data provided by an embodiment of the present application.
  • Figure 12 is another main flow chart of an alarm method provided by an embodiment of the present application.
  • Figure 13 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Operation and maintenance management essentially refers to the continuous operation and maintenance of software and hardware in a network so that performance reaches an acceptable state in terms of cost, stability and efficiency. It is a long-term, complex and continuous activity. Automatic inspection technology serves operation and maintenance management by relying on a specific intelligent management platform to hand periodic, repetitive and regular work over to tools, thereby improving operation and maintenance efficiency. Although traditional inspection methods can discover and collect data, their processing and analysis capabilities are weak, and they cannot handle complex data well. Moreover, traditional inspection methods are limited in accuracy, real-time performance and universality, and depend heavily on the experience of operators, which makes it difficult to guarantee the stability and security of the business and hinders digital transformation.
  • In addition, the waiting time from the occurrence of a fault to the discovery of the problem depends entirely on the frequency and timing of inspections, which makes operation and maintenance management extremely passive.
  • The traditional inspection function is also limited by the network environment. When the network environment changes, the original inspection function may no longer be applicable.
  • The network environment described in this application refers to the network element topology, not to network bandwidth or network speed.
  • This application therefore improves the traditional inspection method.
  • The inspection method of this application can proactively predict and perceive possible failure points based on the current operating behavior of the system, and initiate inspection actions in a more timely and accurate manner, thereby improving network operation and maintenance awareness.
  • The inspection method of this application can also blur the individual characteristics of network elements, which increases the versatility and portability of the inspection system and makes it easier and faster to apply to similar network elements.
  • An embodiment of the present application provides an inspection method, including but not limited to step S110, step S120, step S130, and step S140.
  • Step S110: obtain real-time operating data of the monitoring object;
  • Step S120: analyze the real-time operating data to obtain an analysis result;
  • Step S130: inspect the monitoring object according to the analysis result to obtain an inspection result;
  • Step S140: correct and continuously monitor the monitoring object based on the inspection result.
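  • As a minimal sketch of how steps S110 to S140 chain into a closed loop, the Python below passes each step in as a callable. The function name `run_inspection_loop`, the `Metrics` type, the string labels, and the fixed `cycles` count are illustrative assumptions, not part of the application, which describes a continuous loop.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]  # hypothetical shape of the real-time operating data


def run_inspection_loop(
    collect: Callable[[], Metrics],              # S110: obtain real-time operating data
    analyze: Callable[[Metrics], str],           # S120: produce an analysis result
    inspect: Callable[[str], str],               # S130: inspect according to the analysis result
    correct_and_monitor: Callable[[str], None],  # S140: act on the inspection result
    cycles: int,
) -> List[str]:
    """Run the S110-S140 pass `cycles` times and return the inspection results."""
    results: List[str] = []
    for _ in range(cycles):
        data = collect()
        analysis = analyze(data)
        result = inspect(analysis)
        correct_and_monitor(result)
        results.append(result)
    return results
```

  • Passing the steps as callables keeps the loop independent of any particular monitoring object, which is one way to read the portability goal stated above.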
  • This application relies on the intelligent operation and maintenance network and proposes an inspection method based on situation prejudgment.
  • The inspection method of this application introduces an automated, intelligent closed-loop architecture. Based on situational awareness and starting from the two dimensions of time and space, it uses information extraction, element analysis and situation prejudgment to accurately predict the development trend of the target and initiate more targeted inspection operations.
  • Situational awareness is an environment-based, dynamic and holistic ability to gain insight into the security risks of monitoring objects. Based on security big data, it acquires, understands and displays the security factors that cause changes in the network situation in a large-scale network environment, and then supports decisions and actions. From a global perspective, it improves the ability to discover, identify, understand, analyze and respond to security threats.
  • The inspection method obtains the real-time operating data of the monitoring object and analyzes it, so as to proactively predict and perceive possible failure points from the current operating behavior of the monitoring object, that is, to obtain the corresponding analysis result. Then, based on the analysis result, the monitoring object is inspected in a timely and accurate manner, and it is corrected and continuously monitored based on the inspection result obtained.
  • This application separates manual operation and maintenance from the inspection actions and introduces an automated closed-loop structure. The shift from passive to active operation and maintenance ensures the accuracy and real-time performance of the business.
  • The monitoring objects include, but are not limited to, systems, services, networks, databases, physical machines and storage. At a finer granularity, they also include the performance and event indicators of each monitoring object, such as network load, business failure rate, business success rate, packet loss rate, alarms, system CPU, memory and disk IO.
  • The real-time operating data of the monitoring object may be the network load at the current time, or the packet loss rate of service transmission at the current time, which is not specifically limited by this application.
  • The inspection method of the present application adopts data adaptation sensing technology. It obtains the real-time operating data of the monitoring object, intelligently determines the abnormal points of the monitoring object from that data, quickly locates them, actively initiates precise inspections, and performs fault determination and automatic repair based on the inspection results. This solves the poor real-time performance and low accuracy of traditional inspection methods.
  • The inspection method of this application also uses event/scenario awareness technology. When the monitoring object is a network event or network scenario, this application can use event/scenario awareness technology to monitor scenario switching or event changes while a single network element runs its services, sense changes in monitoring targets between different network elements, and adjust the inspection mode in a timely manner. This solves the poor scalability and low portability of traditional inspection methods and ensures the universality of the business.
  • By adopting intent-aware technology, the inspection method of this application can receive alarm notifications or inspection requests initiated by monitoring objects, analyze the intent of the monitoring objects, and proactively initiate the precise inspection operations they require. This makes inspection autonomous, reduces the manual operation found in traditional inspection methods, improves inspection efficiency and reduces labor costs.
  • The inspection method of this application is a dynamic, continuous cyclic action. To explain it more clearly, this application breaks the closed-loop process down into a one-way process.
  • This application continuously monitors, inspects and corrects the monitoring objects, and the inspection, locating and fault determination operations form a relatively complex process. The judgment standards also differ, and a large amount of data analysis is required for support.
  • Embodiments of the present application provide a method for obtaining real-time operating data of a monitoring object, including but not limited to steps S210, S220, and S230.
  • Step S210: collect several pieces of event information of the monitoring object;
  • Step S220: disassemble the event information to obtain the machine events corresponding to the event information, and store the machine events in a preset event library, where a machine event is a collection of machine language obtained by disassembling the event information;
  • Step S230: continue to monitor the monitoring object and obtain its real-time operating data.
  • The inspection method of this application can adjust the inspection mode in time according to scenario switching or event changes while the service of a single network element is running, or according to sensed changes in monitoring targets between different network elements, thus solving the poor scalability and portability of traditional inspection methods.
  • For example, for some monitoring objects the focus of the system's operating health is the success rate of message delivery. When the monitoring object is group-sent or push messages, the focus is whether the system indicators are overloaded. When the monitoring object is file messages, which add file read and write operations, occupy more network bandwidth and may even involve database storage, the focus is disk IO, network load, database running status and so on.
  • Figure 11 is a flow chart of an inspection method in which the monitoring object is event/scenario data provided by an embodiment of the present application. When the inspection method of the present application is applied to events or scenarios, a large amount of event information of the monitoring object should be collected before its real-time operating data is obtained. To make this event information easier to read, the collected event information is disassembled into collections of machine language with higher readability, that is, machine events, which correspond one-to-one with the event information. The machine events are then stored in the preset event library.
  • The event library is the cornerstone of the inspection method of this application and the basis of its fuzzy identification capability. Storing machine events in the preset event library is a long-term, continuous process of accumulation in which a large number of different machine events are added. Each machine event stored expands the sample types of the event library. Continuously updating the event library enriches the samples it holds and makes it easier to query the event library based on real-time operating data.
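  • The application does not specify how event information is disassembled or how the event library is organized. The sketch below is one hypothetical reading: event information arrives as a `key=value; key=value` string, a machine event is the normalized set of those tokens, and the library keeps distinct samples per event name. `disassemble`, `EventLibrary`, and the string format are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List


def disassemble(event_info: str) -> FrozenSet[str]:
    """Hypothetical S220 disassembly: split 'key=value' tokens into a normalized set."""
    return frozenset(
        token.strip().lower() for token in event_info.split(";") if token.strip()
    )


class EventLibrary:
    """Hypothetical preset event library: distinct machine-event samples per event name."""

    def __init__(self) -> None:
        self._samples: Dict[str, List[FrozenSet[str]]] = defaultdict(list)

    def store(self, event_name: str, machine_event: FrozenSet[str]) -> None:
        # Only new samples expand the library; exact repeats are ignored.
        if machine_event not in self._samples[event_name]:
            self._samples[event_name].append(machine_event)

    def samples(self, event_name: str) -> List[FrozenSet[str]]:
        return list(self._samples.get(event_name, []))
```

  • Each call to `store` with a previously unseen machine event grows the sample set, mirroring the accumulation process described above.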
  • Embodiments of the present application provide a method for analyzing real-time operating data and obtaining analysis results, including but not limited to step S310 and step S320.
  • Step S310: compare the real-time operating data with the preset initial operating data to obtain a differentiation index;
  • Step S320: compare the differentiation index with the preset differentiation index threshold to obtain an analysis result.
  • Figure 10 is a flow chart of an inspection method for monitoring system status provided by an embodiment of the present application. The inspection method of the present application can be used not only to monitor system status but also to monitor business operation. Taking system status monitoring as an example, initial operating data must be set for the monitoring object before its real-time operating data is obtained.
  • The initial operating data is customized by the operator and can be understood as a benchmark value. It can be the operating state parameters of the system under ideal conditions, or the operating state parameters of the system under critical conditions.
  • The real-time operating data of the monitoring object is then obtained and compared with the initial operating data to obtain the differentiation index.
  • Real-time operating data includes CPU occupancy, disk IO, memory space, file handles, and so on.
  • The differentiation index represents the deviation between the real-time operating data and the initial operating data. After the differentiation index is obtained, it is compared with the preset differentiation index threshold to obtain the analysis result.
  • The analysis result indicates whether the current monitoring object is abnormal.
  • When the differentiation index reflects that the system storage space of the monitoring object is greater than the threshold set by the initial operating data, it is necessary to further check the disk IO of the current monitoring object, the CPU status of each process, the file handle occupancy, and whether there are oversized files.
  • Comparing the differentiation index with the preset differentiation index threshold to obtain the analysis result includes one of the following: when the differentiation index is greater than the differentiation index threshold, the analysis result is that the monitoring object is abnormal; or when the differentiation index is less than the differentiation index threshold, the analysis result is that the monitoring object is normal.
  • The differentiation index reflects the deviation between the real-time operating data and the preset initial operating data. The larger the differentiation index, the further the real-time operating data deviates from the initial operating data.
  • When the differentiation index is greater than the differentiation index threshold, the monitoring object is abnormal at this time; when the differentiation index is less than the differentiation index threshold, the monitoring object is operating normally at this time.
  • For example, when the differentiation index reflects that the system storage space of the monitoring object is greater than the threshold set by the initial operating data, the system storage space of the monitoring object has exceeded the threshold for the files the system can accommodate.
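  • Steps S310 and S320 can be illustrated with a short sketch. The application does not define how the differentiation index is computed, so the relative deviation used here is an assumption, as are the function names and the choice to treat an index exactly equal to its threshold as normal (the text covers only the greater-than and less-than cases).

```python
from typing import Dict, List, Tuple


def differentiation_indices(
    real_time: Dict[str, float], initial: Dict[str, float]
) -> Dict[str, float]:
    """S310: per-metric relative deviation from the initial (benchmark) data.

    Relative deviation is an assumed formula; zero benchmarks are skipped.
    """
    return {
        name: abs(real_time[name] - base) / base
        for name, base in initial.items()
        if name in real_time and base != 0
    }


def analyze(
    indices: Dict[str, float], thresholds: Dict[str, float]
) -> Tuple[str, List[str]]:
    """S320: the object is abnormal if any index exceeds its threshold."""
    exceeded = [name for name, value in indices.items() if value > thresholds[name]]
    return ("abnormal" if exceeded else "normal"), exceeded
```

  • Returning the list of exceeded metrics alongside the verdict gives the later inspection step a starting point for choosing which parameters to check.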
  • Embodiments of the present application provide another method of analyzing real-time operating data and obtaining analysis results, including but not limited to step S410 and step S420.
  • Step S410: perform parameter extraction on the real-time operating data to obtain the first system operating parameters;
  • Step S420: compare the first system operating parameters with the second system operating parameters to obtain an analysis result, where the second system operating parameters are obtained by parameter extraction from the real-time operating data obtained in the previous period.
  • When the monitoring object is an event/scenario in a network environment, the first system operating parameters are the operating parameters of the system under the current event/scenario. After the first system operating parameters are obtained, they are compared with the preset second system operating parameters to obtain the analysis result, which reflects whether the event/scenario of the current monitoring object has changed.
  • The first system operating parameters of the monitoring object differ under different events/scenarios.
  • The first system operating parameters reflect the current state of the event/scenario, and different first system operating parameters correspond to different events/scenarios. When multiple first system operating parameters change, the event/scenario has changed. The first and second system operating parameters are therefore compared to determine whether the event/scenario has changed.
  • The inspection method of the present application collects the real-time operating data of the monitoring object in real time; specifically, it collects the data once per preset period.
  • The period is customized by the operator and can be, for example, 1 second or 1 millisecond.
  • The second system operating parameters are obtained by extracting parameters from the real-time operating data obtained in the previous period.
  • This application compares the first system operating parameters with the second system operating parameters in real time. By checking whether the parameters have changed, it can quickly determine whether the event/scenario has changed within a short period.
  • Comparing the first system operating parameters with the second system operating parameters to obtain the analysis result includes one of the following: when the first and second system operating parameters are different, the analysis result is an event change; or when the first and second system operating parameters are the same, the analysis result is that the event has not changed.
  • The inspection method of this application determines whether the event/scenario has changed by monitoring whether the first system operating parameters change.
  • When the first and second system operating parameters are different, it can be determined that the event/scenario has changed; when they are the same, it can be determined that the event/scenario has not changed.
  • Different events/scenarios correspond to different inspection modes.
  • When the event/scenario changes, the inspection mode should be changed accordingly.
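  • The period-over-period comparison of S410 and S420 can be sketched as a small stateful detector. `EventChangeDetector`, the parameter keys, and the handling of the very first period (reported as unchanged because no previous period exists) are assumptions; the equality test follows the same/different rule stated above.

```python
from typing import Any, Dict, Iterable, Optional


class EventChangeDetector:
    """Compare parameters extracted this period with those of the previous period."""

    def __init__(self, parameter_keys: Iterable[str]) -> None:
        self._keys = tuple(parameter_keys)
        self._previous: Optional[Dict[str, Any]] = None  # second system operating parameters

    def update(self, real_time_data: Dict[str, Any]) -> str:
        # S410: extract the first system operating parameters from this period's data.
        first = {key: real_time_data[key] for key in self._keys}
        second = self._previous
        self._previous = first
        # Assumption: with no previous period there is nothing to compare against.
        if second is None:
            return "event_unchanged"
        # S420: different parameters mean the event/scenario has changed.
        return "event_changed" if first != second else "event_unchanged"
```

  • Exact equality suits categorical parameters such as a message type; continuously varying metrics would likely need a tolerance, which the application does not discuss.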
  • Embodiments of the present application provide a method of inspecting a monitoring object based on analysis results and obtaining inspection results, including but not limited to step S510 and step S520.
  • Step S510: when the analysis result is that the monitoring object is abnormal, obtain the specified parameters of the corresponding monitoring object according to the differentiation index;
  • Step S520: inspect the monitoring object according to the specified parameters to obtain the inspection result.
  • When the differentiation index is greater than the differentiation index threshold and the analysis result is that the monitoring object is abnormal, the specified parameters of the monitoring object are obtained based on the differentiation index.
  • For example, when the differentiation index reflects that the system storage space exceeds a preset threshold, the system storage space of the monitoring object is suspected to be abnormal, and it is necessary to determine more precisely which parameters are causing the abnormality.
  • The specified parameters of the monitoring object can be the system disk IO status, the CPU status of each process, the file handle occupancy, and so on. In other words, the inspection method of this application uses real-time monitoring data to accurately locate suspected abnormal nodes in the system.
  • The monitoring object is then inspected according to the specified parameters, that is, a precise inspection is quickly initiated on the suspected abnormal nodes and the inspection result is obtained, which saves a large amount of inspection resources.
  • The inspection result reflects whether the suspected abnormal node is indeed abnormal, so that the inspection method of the present application can perform the corresponding correction and monitoring operations on the monitoring object based on the inspection result.
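  • A minimal sketch of S510 and S520 follows. The `storage_space` entry mirrors the storage example in the text; the table name, the fallback of inspecting the exceeded metric itself, and the probe callables (which return `True` when a parameter is confirmed abnormal) are hypothetical.

```python
from typing import Callable, Dict, List

# Assumed mapping from an exceeded differentiation index to the parameters to inspect.
SPECIFIED_PARAMETERS: Dict[str, List[str]] = {
    "storage_space": ["disk_io", "process_cpu", "file_handles", "oversized_files"],
}


def specified_parameters(exceeded: List[str]) -> List[str]:
    """S510: collect the parameters to inspect for the exceeded indices, without repeats."""
    selected: List[str] = []
    for name in exceeded:
        for parameter in SPECIFIED_PARAMETERS.get(name, [name]):
            if parameter not in selected:
                selected.append(parameter)
    return selected


def inspect(
    exceeded: List[str], probes: Dict[str, Callable[[], bool]]
) -> Dict[str, bool]:
    """S520: run only the probes for the specified parameters."""
    return {
        parameter: probes[parameter]()
        for parameter in specified_parameters(exceeded)
        if parameter in probes
    }
```

  • Because only the probes tied to the suspected abnormal nodes run, the inspection stays targeted rather than sweeping every parameter.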
  • The method also includes: when the analysis result is that the monitoring object is normal, adjusting the monitoring object according to the differentiation index and the preset baseline adaptive adjustment algorithm.
  • When the differentiation index is less than the differentiation index threshold and the analysis result is that the monitoring object is normal, the first system operating data of the monitoring object is still within the controllable range. To prevent the deviation between the first system operating data and the second system operating data from growing and the working condition of the monitoring object from deteriorating further, the monitoring object is adjusted based on the differentiation index and the preset baseline adaptive adjustment algorithm.
  • The baseline adaptive adjustment algorithm described in this application is not unique.
  • The baseline adaptive adjustment algorithm can be traditional PID, minimum mean square error, a high-order Kalman filter, and so on. In practice, the choice should be made according to the system's actual usage scenario and the degree of deviation shown by the differentiation index. This application does not specifically limit it.
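  • Of the algorithms named above, traditional PID is the simplest to illustrate. The sketch below is a textbook discrete PID controller, not the application's own implementation; treating the signed deviation from the baseline as the error input and the controller output as the size of the adjustment is an assumption, as are the gain values used in any example.

```python
from dataclasses import dataclass


@dataclass
class PIDAdjuster:
    """Textbook discrete PID controller used as a hypothetical baseline adjuster."""

    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0
    previous_error: float = 0.0

    def step(self, error: float, dt: float = 1.0) -> float:
        """Return the adjustment for one sampling period of length `dt`."""
        self.integral += error * dt
        derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

  • The integral term accumulates small deviations over successive periods, so drift that stays below the abnormality threshold can still be counteracted.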
  • Embodiments of the present application provide a method of inspecting a monitoring object based on analysis results and obtaining inspection results, including but not limited to step S610, step S620, and step S630.
  • Step S610: when the analysis result is an event change, perform big data analysis on the real-time operating data to obtain real-time event information;
  • Step S620: query the event library according to the real-time event information to obtain the inspection information corresponding to the real-time event information;
  • Step S630: inspect the monitoring object according to the inspection information to obtain the inspection result.
  • The inspection method of the present application extracts valid event information from the real-time operating data through fuzzy recognition and then determines the current operating event/scenario, that is, obtains the real-time event information.
  • The event library is then queried with the real-time event information. Because the event library has been expanded many times, it contains a large number of event samples, and different events/scenarios correspond to different inspection modes.
  • The inspection mode corresponding to each event/scenario is also recorded in the event library. The inspection method of this application can therefore look up the event/scenario corresponding to the real-time event information in the event library to identify the current event/scenario of the monitoring object, and obtain from the event library the inspection information corresponding to that event/scenario, that is, the inspection mode.
  • The monitoring object is inspected according to the inspection mode to obtain the inspection result, and the corresponding correction and monitoring operations are performed on the monitoring object based on the inspection result.
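  • The lookup in S620 and the inspection in S630 can be sketched as a table query followed by the checks that table prescribes. The two entries reuse the group/push-message and file-message examples given earlier in this description; the dictionary layout, the key and check names, and the probe callables (returning `True` when a fault is found) are assumptions.

```python
from typing import Callable, Dict, List, Optional

# Assumed event library view: event/scenario name -> inspection mode (checks to run).
EVENT_LIBRARY: Dict[str, List[str]] = {
    "group_or_push_message": ["system_indicator_overload"],
    "file_message": ["disk_io", "network_load", "database_status"],
}


def query_inspection_info(real_time_event: str) -> Optional[List[str]]:
    """S620: return the inspection mode recorded for the event, if any."""
    return EVENT_LIBRARY.get(real_time_event)


def inspect_by_event(
    real_time_event: str, probes: Dict[str, Callable[[], bool]]
) -> Dict[str, bool]:
    """S630: run the checks listed in the inspection mode for this event."""
    checks = query_inspection_info(real_time_event)
    if checks is None:
        return {}  # unknown event: nothing recorded in the library yet
    return {name: probes[name]() for name in checks}
```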
  • big data analysis of real-time operating data is performed to obtain real-time event information, including: fuzzy identification of real-time operating data in combination with the event library, and extraction of real-time event information.
  • fuzzy recognition is a method of classifying objects to be recognized into corresponding standard libraries given a standard library. Fuzzy recognition methods are divided into direct methods and indirect methods. This application can combine the event library to perform fuzzy identification on real-time operating data to accurately extract real-time event information.
  • for example, event one has feature A, feature B, feature C, and feature D
  • event two has feature A, feature E, feature F, and feature G.
  • when performing fuzzy recognition of an unknown event, if feature A, feature B, feature C and feature D are identified, the event is directly positioned as event one, which is the direct method; if the unknown event is instead recognized as having feature A, feature E, feature F and feature G, then according to the nearest principle it is indirectly positioned as event two.
  • actual events/scenarios are often more complex, and handling them requires more advanced and accurate fuzzy recognition technology.
  • the expected effect achieved by the inspection method of this application also depends on the depth of the event library and the accuracy of the fuzzy recognition algorithm.
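The direct and indirect methods described above can be illustrated with a small sketch. The application does not specify how "nearest" is measured; Jaccard similarity over feature sets is used here purely as an assumption, and the function names are hypothetical. The two reference events reuse the features A to G from the example.

```python
# Reference events from the example: each event is a set of features.
EVENT_LIBRARY = {
    "event_one": frozenset("ABCD"),
    "event_two": frozenset("AEFG"),
}


def jaccard(a, b):
    """Jaccard similarity of two feature sets (assumed closeness measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recognise(features, library=EVENT_LIBRARY):
    """Return (event_name, method) for an unknown event's observed features."""
    observed = frozenset(features)
    # Direct method: the observed features match a reference event exactly.
    for name, reference in library.items():
        if observed == reference:
            return name, "direct"
    # Indirect method: fall back to the closest reference event.
    best = max(library, key=lambda name: jaccard(observed, library[name]))
    if jaccard(observed, library[best]) == 0.0:
        return None, "unmatched"
    return best, "indirect"
```

A real event library would presumably hold weighted or continuous features rather than plain sets, so the similarity measure is the part most likely to differ in practice.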
  • the correction and continuous monitoring of the monitoring object based on the inspection results include one of the following: when the inspection result is that there is no fault, the monitoring object is continuously monitored; or when the inspection result is that there is a fault, the monitoring object is corrected to obtain the correction result, and corresponding operations are performed on the monitoring object based on the correction result.
  • corresponding operations are performed on the monitoring object according to the correction result, including one of the following: when the correction result is successful, the monitoring object is continuously monitored; or when the correction result is failure, an alarm signal is sent to the monitoring object.
  • when the correction result is successful, it indicates that the abnormal node in the monitoring object has been successfully corrected. To guard against the monitoring object becoming abnormal again in subsequent operation, it still needs to be continuously monitored. When the correction result is failure, automatic correction alone cannot repair the abnormal node in the monitoring object, so an alarm signal needs to be sent to the monitoring object, causing the monitoring object to issue a warning so that operators can manually correct the abnormal node in a timely manner.
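The branching described above (no fault, fault corrected, fault not corrected) reduces to a small decision function. The sketch below is illustrative only; the `Action` enum and the `correct` callback are hypothetical names introduced for the example.

```python
from enum import Enum


class Action(Enum):
    CONTINUE_MONITORING = "continue_monitoring"
    SEND_ALARM = "send_alarm"


def handle_inspection_result(fault_found, correct):
    """Decide the next action.

    fault_found: True if the inspection result reports a fault.
    correct: callable that attempts automatic correction and returns True on success.
    """
    if not fault_found:
        return Action.CONTINUE_MONITORING
    if correct():
        # Correction succeeded; monitoring continues to catch later anomalies.
        return Action.CONTINUE_MONITORING
    # Automatic correction failed; escalate so operators can intervene manually.
    return Action.SEND_ALARM
```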
  • FIG 8 is a schematic diagram of a networking structure under a multi-network element topology provided by an embodiment of the present application; it should be noted that the above-mentioned embodiment is only an event change of a single network element.
  • the network element topology is complex.
  • network element characteristics also need to be considered.
  • this application can keenly sense and quickly capture the switching of monitoring objects through model and scene adaptation.
  • for example, when network element A switches to network element E, this application can quickly sense the change in the network element and adjust the inspection mode in a timely manner, avoiding manual operations by operators and effectively improving the universality and portability of the inspection method.
  • the embodiment of the present application provides an alarm method, including but not limited to step S710, step S720, step S730, and step S740.
  • Step S710 obtain the alarm signal of the monitored object
  • Step S720 perform intent identification on the monitored object based on the alarm signal, and obtain the alarm intent of the monitored object;
  • Step S730 perform inspection on the monitoring object according to the alarm intention, and obtain inspection results
  • Step S740 Perform fault determination and correction on the monitored object based on the inspection results.
  • any of the inspection methods described in the first aspect of this application actively obtains the target information of the monitoring object
  • the alarm method of this application uses intention sensing technology to passively accept alarm notifications or inspection requests initiated by the monitoring object, senses the intention of the monitoring object, and proactively initiates the precise inspection operations the monitoring object requires according to that intention. This automates inspection, reduces the manual operations of traditional inspection methods, effectively improves inspection efficiency, and reduces labor costs.
  • when the abnormal nodes in the monitored object are not corrected successfully, the monitored object actively generates an alarm signal.
  • the inspection method of the present application passively obtains the alarm signal of the monitored object, performs intention recognition on the monitored object according to the alarm signal, and obtains the alarm intention of the monitored object.
  • the alarm signals include but are not limited to business success rate alarms, system overload alarms, link abnormality alarms, service abnormality alarms, database abnormality alarms, etc.
  • the alarm signals reflect the alarm intention of the monitoring object, and intent recognition can be performed on the alarm signal to identify the alarm intention of the monitored object.
  • after obtaining the alarm intention of the monitoring object, the monitoring object needs to be inspected according to the alarm intention to judge again whether the suspected abnormal node of the monitoring object is indeed abnormal, that is, to obtain the inspection result, and fault determination is carried out according to the inspection result. If the inspection result is that the suspected abnormal node of the monitoring object is indeed abnormal, the abnormal node needs to be corrected again, or manual correction is introduced; if the inspection result is that the suspected abnormal node of the monitoring object is not abnormal, the monitoring object is continuously monitored.
  • when the alarm signal is a business success rate alarm, the inspection method of the present application triggers process and log inspection after receiving the alarm signal to further obtain the cause of failure; when the alarm signal is a database abnormality alarm, the inspection method of this application checks the current operating status of the database, storage capacity, disk IO, read and write delay and other information after receiving the alarm signal; when the alarm signal is a link abnormality alarm, the inspection method of this application checks the operating status and network connection of the relevant network elements after receiving the alarm signal.
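The three alarm examples above amount to a mapping from alarm type to the checks an inspection should run. A minimal sketch of that mapping follows; the key and check names are hypothetical labels for the items listed in the text, and `plan_inspection` is an illustrative helper rather than anything defined by the application.

```python
# Alarm type -> checks to run, following the three examples in the text.
ALARM_INSPECTIONS = {
    "business_success_rate": ["process", "logs"],
    "database_abnormality": [
        "database_status",
        "storage_capacity",
        "disk_io",
        "read_write_delay",
    ],
    "link_abnormality": ["network_element_status", "network_connection"],
}


def plan_inspection(alarm_type):
    """Return the list of checks for an alarm type, or raise ValueError if unknown."""
    try:
        return list(ALARM_INSPECTIONS[alarm_type])
    except KeyError:
        raise ValueError(f"no inspection plan for alarm type {alarm_type!r}") from None
```

Raising on an unknown alarm type, instead of silently returning an empty plan, is a design choice made for this sketch so that unrecognized intentions are surfaced rather than ignored.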
  • the monitoring object can also actively send intention signals.
  • the intention signals include designated inspection operations, that is, actively requesting certain types of inspection actions, rather than performing inspections only after receiving alarm signals.
  • FIG. 9 is a schematic structural diagram of an inspection system provided by an embodiment of the present application.
  • the embodiment of the present application provides an inspection system, including: an intelligent control center.
  • the intelligent control center is configured to execute the inspection method of the first aspect of the embodiments of the present application and the alarm method of the second aspect of the present application, for example, to perform the above-described method steps S110 to S140 in Figure 1, method steps S210 to S230 in Figure 2, and method steps S310 to S320 in Figure 3.
  • the measurement unit is communicated with the intelligent control center and the monitoring object respectively.
  • the measurement unit is configured to obtain real-time operating data of the monitoring object and send the real-time operating data to the intelligent control center;
  • the control unit is communicated with the intelligent control center.
  • the control unit is set to receive instructions sent by the intelligent control center; where the instructions include inspection instructions and correction instructions;
  • the execution mechanism is communicatively connected with the control unit and the monitoring object respectively; wherein, the control unit is also configured to control the execution mechanism to perform corresponding operations on the monitoring object according to the instructions.
  • the inspection system collects the operating data of the monitored objects in real time, embeds it in the closed-loop control system, intelligently drives decision-making, initiates precise inspections, and separates manual operations from inspection actions, achieving zero contact and zero waiting; only the necessary repair operations are provided based on the fault results.
  • the inspection system includes an intelligent control center, a measurement unit, a control unit and an actuator.
  • the intelligent control center is configured to collect the performance indicators of monitoring objects and centrally manage full-stack operation and maintenance data.
  • the intelligent control center is set to perform inspection methods or alarm methods;
  • the measurement unit is communicatively connected with the intelligent control center and the monitoring object respectively, and is set to obtain the indicator parameters of the monitoring object, that is, real-time operating data, and send the real-time operating data to the intelligent control center;
  • the control unit is communicatively connected with the intelligent control center and is set to receive instructions sent by the intelligent control center, where the intelligent control center can accurately locate the inspection type required for the monitored object and send the inspection instruction to the control unit.
  • the intelligent control center can also start the system self-healing function after clarifying the fault type of the monitored object based on the inspection results, and send the correction instruction.
  • the execution mechanism is communicated with the control unit and the monitoring object respectively.
  • the control unit is configured to perform inspection operations on the monitoring object according to the inspection instruction, and is also configured to perform correction operations on the monitoring object according to the correction instruction.
  • the intelligent control center first presets the initial operation data before obtaining the real-time operation data of the monitoring object, then obtains the differential index based on the real-time operation data and the initial operation data of the monitoring object, and compares the differential index with the preset differential index threshold.
  • when the obtained analysis result is that the monitoring object is normal, the monitoring object needs to be adjusted according to the differential index and the preset baseline value adaptive adjustment algorithm.
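The comparison of the differential index against its threshold, followed by baseline adjustment in the normal case, can be sketched as below. The application does not give formulas for either step, so both are assumptions: the differential index is taken to be the relative deviation from the baseline, and the adaptive adjustment is taken to be simple exponential smoothing. The names `threshold` and `alpha` and their default values are hypothetical.

```python
def difference_index(real_time, baseline):
    """Relative deviation of a real-time value from its baseline (assumed formula)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return abs(real_time - baseline) / abs(baseline)


def analyse(real_time, baseline, threshold=0.2, alpha=0.1):
    """Return (needs_inspection, new_baseline).

    If the deviation exceeds the threshold, an inspection is requested and the
    baseline is left unchanged. Otherwise the object is treated as normal and
    the baseline drifts towards the observed value (assumed smoothing rule).
    """
    if difference_index(real_time, baseline) > threshold:
        return True, baseline
    return False, (1 - alpha) * baseline + alpha * real_time
```

Leaving the baseline untouched when an anomaly is suspected keeps a faulty reading from being absorbed into the reference value.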
  • the intelligent control center actively initiates a precise inspection instruction to the control unit, specifying the parameters or status of the monitoring object to be inspected.
  • the control unit controls the execution mechanism to initiate an inspection operation to the monitored object according to the precise inspection instruction.
  • when the inspection operation is completed, the monitoring object returns the inspection results to the intelligent control center, and the intelligent control center evaluates the inspection results. If the inspection result is that there is no fault, the intelligent control center continues to monitor the monitoring object. If the inspection result is that there is a fault, the intelligent control center sends a correction instruction to the control unit, so that the control unit controls the execution mechanism to correct the monitoring object according to the correction instruction. When the correction operation is completed, the monitoring object returns the correction result to the intelligent control center. If the correction result is successful, the monitoring object is continuously monitored. If the correction result is failure, the intelligent control center initiates an alarm, prompting manual intervention.
  • before obtaining the real-time operation data of the monitoring object, the intelligent control center needs to collect a large amount of event information of the monitoring object and store the event information in the event library. It then extracts parameters from the real-time operation data of the monitoring object to obtain the first system operating parameters, and judges based on the first system operating parameters whether the event/scene has changed. If the event/scene has not changed, the intelligent control center continues to monitor the monitoring object. If the event/scene has changed, the intelligent control center performs big data analysis on the real-time operating data to obtain real-time event information, and queries the event library based on the real-time event information to obtain the inspection information corresponding to the real-time event information.
  • after obtaining the inspection information, the intelligent control center converts the inspection information into inspection instructions and sends the inspection instructions to the control unit, so that the control unit controls the execution mechanism to inspect the monitored object according to the inspection instructions. After the execution mechanism completes the inspection operation, it returns the inspection results to the intelligent control center.
  • the intelligent control center will collect event/scenario information again and continue to inspect the monitoring object.
  • FIG 12 is another main flow chart of an alarm method provided by an embodiment of the present application; according to another embodiment of the present application, the monitoring object actively initiates an alarm and sends the alarm signal to the intelligent control center, so that the intelligent control center identifies the intention of the monitoring object based on the alarm signal, obtains the alarm intention of the monitoring object, and initiates a precise inspection based on the alarm intention, that is, it sends an inspection instruction to the control unit. After receiving the inspection instruction, the control unit controls the execution mechanism to perform inspection operations on the monitoring object. When the inspection operation is completed, the monitoring object sends the inspection results to the intelligent control center. The intelligent control center also needs to determine and correct the fault of the monitoring object based on the inspection results.
  • the intelligent control center includes a centralized event management module, an intelligent engine module, a big data processing module, an impact analysis module, and an event library module; among which, the centralized event management module is configured to perform event sensing and is also set to manage various parameters in the event library module; the intelligent engine module is set to collect the indicator performance of the monitoring object and to perform intelligent analysis and prediction; the big data processing module is set to extract real-time operating data, and to adjust the monitored object based on the real-time operating data and the preset benchmark value adaptive adjustment algorithm; the impact analysis module is set to predict the impact of faults on the monitored objects; and the centralized event management module, intelligent engine module, big data processing module, impact analysis module and event library module communicate with each other and work together to provide situational awareness.
  • the intelligent control center as the core component of the precise inspection system, consists of a centralized event management module, an intelligent engine module, a big data processing module, an impact analysis module and an event library module. These modules work together , providing complete situational awareness capabilities.
  • the centralized event management module is configured to manage event types, characteristics, parameters, etc. in the event library module;
  • the intelligent engine module is the core control component in the intelligent control center and is configured to collect the indicator performance of monitoring objects and intelligently perceive driving types, such as sensing real-time operating data changes and event/scene switching, so as to conduct intelligent analysis and prediction of real-time operating data;
  • the big data processing module is set to collect real-time operating data, extract operating parameters for different events, different objects, and different scenarios, and perform intelligent fuzzy identification of events/scenes; the big data processing module is also set to adaptively adjust the inspection benchmark value based on the real-time data of the monitored object;
  • the impact analysis module is set to predict the impact of the fault based on the inspection results, adjust the inspection strategy or initiate automatic fault repair in a timely manner, and, if necessary, trigger an alarm and introduce manual intervention.
  • an electronic device including:
  • Programs are stored in memory and the processor executes at least one program to:
  • the processor and memory may be connected via a bus or other means.
  • memory can be used to store non-transitory software instructions and non-transitory instructions.
  • the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device.
  • the memory may include memory located remotely relative to the processor, and that the remote memory may be connected to the processor via a network. Examples of the above-mentioned networks include but are not limited to the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
  • the processor executes non-transient software instructions, instructions and signals stored in the memory to implement various functional applications and data processing, that is, to implement the inspection method of the first embodiment.
  • the non-transient software instructions and instructions required to implement the inspection method of the above embodiment are stored in the memory.
  • the inspection method of the first embodiment of the present application and the alarm method of the second embodiment of the present application are executed, for example, by performing the above-described method steps S110 to S140 in Figure 1, method steps S210 to S230 in Figure 2, method steps S310 to S320 in Figure 3, method steps S410 to S420 in Figure 4, method steps S510 to S520 in Figure 5, method steps S610 to S630 in Figure 6, and method steps S710 to S740 in Figure 7.
  • inventions of the present application provide a computer-readable storage medium.
  • the computer-readable storage medium stores computer-executable signals.
  • the computer-executable signals are used to execute:
  • the above-described method steps S110 to S140 in Figure 1, method steps S210 to S230 in Figure 2, method steps S310 to S320 in Figure 3, method steps S410 to S420 in Figure 4, method steps S510 to S520 in Figure 5, method steps S610 to S630 in Figure 6, and method steps S710 to S740 in Figure 7.
  • the device embodiments described above are only illustrative.
  • the units described as separate components may or may not be physically separated.
  • the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed across multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • Embodiments of the present application include: obtaining real-time operation data of monitoring objects; analyzing the real-time operation data to obtain analysis results; performing inspections on the monitoring objects according to the analysis results to obtain inspection results; and correcting and continuously monitoring the monitoring objects according to the inspection results.
  • this application can analyze the real-time operating data of the current monitoring object to proactively predict and perceive possible failure links based on the operating behavior of the current monitoring object, that is, obtain the analysis results. Then, based on the analysis results, the monitoring objects are inspected in a timely and accurate manner, and the monitoring objects are corrected and continuously monitored based on the obtained inspection results.
  • the transformation from passive operation and maintenance to active operation and maintenance ensures the accuracy and real-time performance of the business.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disk (DVD) or other optical disk storage, magnetic cassettes, tapes, disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody a computer-readable signal, data structure, instruction module, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

Disclosed are an inspection method, an alarm method, an inspection system, and a computer-readable storage medium. The inspection method comprises: acquiring real-time operation data of a monitored object (S110); analyzing the real-time operation data to obtain an analysis result (S120); performing an inspection on the monitored object on the basis of the analysis result to obtain an inspection result (S130); and correcting and continuously monitoring the monitored object on the basis of the inspection result (S140).

Description

Inspection method, alarm method, inspection system and computer-readable storage medium
Cross-reference to related applications
This application is filed based on Chinese patent application No. 202210810434.2, filed on July 11, 2022, and claims priority to that Chinese patent application, the entire content of which is hereby incorporated into this application by reference.
Technical field
Embodiments of the present application relate to, but are not limited to, the technical field of network operation and maintenance, and particularly relate to an inspection method, an alarm method, an inspection system, and a computer-readable storage medium.
Background
Operation and maintenance management essentially refers to the continuous operation and maintenance of the software and hardware performance in a network, so that cost, stability and efficiency reach a mutually acceptable state. Operation and maintenance management is a long-term, complex and continuous activity. Automatic inspection technology addresses it by relying on a specific intelligent management platform and handing periodic, repetitive and regular work over to tools, in order to improve operation and maintenance efficiency. However, traditional automatic inspection technology still relies on the experience of operators and cannot handle faults in the network in a timely manner, making it difficult to guarantee the accuracy and real-time performance of services.
Summary
The following is an overview of the subject matter described in detail herein. This summary is not intended to limit the scope of protection of the claims.
Embodiments of the present application provide an inspection method, an alarm method, an inspection system, and a computer-readable storage medium.
In a first aspect, embodiments of the present application provide an inspection method, including: obtaining real-time operating data of a monitored object; analyzing the real-time operating data to obtain an analysis result; performing inspection on the monitored object according to the analysis result to obtain an inspection result; and correcting and continuously monitoring the monitored object according to the inspection result.
In a second aspect, embodiments of the present application provide an alarm method, including: obtaining an alarm signal of a monitored object; performing intention recognition on the monitored object according to the alarm signal to obtain the alarm intention of the monitored object; performing inspection on the monitored object according to the alarm intention to obtain an inspection result; and performing fault determination and correction on the monitored object according to the inspection result.
In a third aspect, embodiments of the present application provide an inspection system, including: an intelligent control center, the intelligent control center being configured to execute the inspection method described in the first aspect above and/or the alarm method described in the second aspect above; a measurement unit, the measurement unit being communicatively connected with the intelligent control center and the monitored object respectively, and being configured to obtain the real-time operating data of the monitored object and send the real-time operating data to the intelligent control center; a control unit, the control unit being communicatively connected with the intelligent control center and being configured to receive instructions sent by the intelligent control center, wherein the instructions include inspection instructions and correction instructions; and an execution mechanism, the execution mechanism being communicatively connected with the control unit and the monitored object respectively, wherein the control unit is also configured to control the execution mechanism to perform corresponding operations on the monitored object according to the instructions.
In a fourth aspect, embodiments of the present application provide an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the inspection method described in the first aspect above and/or the alarm method described in the second aspect above is implemented.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium that stores a computer-executable program. The computer-executable program is used to cause a computer to execute the inspection method described in the first aspect above and/or the alarm method described in the second aspect above.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Description of the drawings
The drawings are used to provide a further understanding of the technical solution of the present application and constitute a part of the specification. They are used to explain the technical solution of the present application together with the embodiments of the present application and do not constitute a limitation of the technical solution of the present application.
Figure 1 is a main flow chart of an inspection method provided by an embodiment of the present application;
Figure 2 is another sub-flow chart of an inspection method provided by an embodiment of the present application;
Figure 3 is another sub-flow chart of an inspection method provided by an embodiment of the present application;
Figure 4 is another sub-flow chart of an inspection method provided by an embodiment of the present application;
Figure 5 is another sub-flow chart of an inspection method provided by an embodiment of the present application;
Figure 6 is another sub-flow chart of an inspection method provided by an embodiment of the present application;
Figure 7 is a main flow chart of an alarm method provided by an embodiment of the present application;
Figure 8 is a schematic diagram of a networking structure under a multi-network element topology provided by an embodiment of the present application;
Figure 9 is a schematic structural diagram of an inspection system provided by an embodiment of the present application;
Figure 10 is a flow chart of an inspection method in which the monitoring object is system status, provided by an embodiment of the present application;
Figure 11 is a flow chart of an inspection method in which the monitoring object is event/scenario data, provided by an embodiment of the present application;
Figure 12 is another main flow chart of an alarm method provided by an embodiment of the present application;
Figure 13 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
具体实施方式Detailed ways
To make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here serve only to explain the present application and do not limit it.
It should be understood that, in the description of the embodiments of the present application, "multiple" (or "a plurality of") means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including it. Terms such as "first" and "second" are used only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or their order.
Operation and maintenance management essentially means continuously operating and maintaining the software and hardware performance of a network so that cost, stability and efficiency reach a mutually acceptable state. Operation and maintenance management is a long-term, complex and continuous activity. Automatic inspection technology addresses it by relying on a specific intelligent management platform to hand periodic, repetitive and regular work over to tools, thereby improving operation and maintenance efficiency. Traditional inspection methods can discover and collect data, but they are weak at processing and analysis and cannot handle complex data well. They are also limited in accuracy, timeliness and general applicability and depend heavily on operator experience, which makes it difficult to guarantee service stability and security and hinders digital transformation. On the one hand, the waiting time between the occurrence of a fault and its discovery depends entirely on the frequency and timing of inspections, which makes operation and maintenance management highly passive. On the other hand, traditional inspection functions remain constrained by the network environment: when the network environment changes, the original inspection functions may no longer apply. Here, the network environment referred to in this application is not network bandwidth or network speed but the network element topology.
On this basis, the present application improves on the traditional inspection method. Compared with the traditional method, the inspection method of the present application can proactively predict and perceive the links where faults may occur based on the current operating behavior of the system, and initiate inspection actions in a more timely and accurate manner, thereby improving network operation and maintenance awareness. The inspection method of the present application can also abstract away the individual characteristics of network elements, which increases the generality and portability of the inspection system and allows it to be applied to similar network elements more conveniently and quickly.
Referring to Figure 1, in a first aspect, an embodiment of the present application provides an inspection method, including but not limited to steps S110, S120, S130 and S140.
Step S110: obtain real-time operating data of the monitoring object;
Step S120: analyze the real-time operating data to obtain an analysis result;
Step S130: inspect the monitoring object according to the analysis result to obtain an inspection result;
Step S140: correct and continuously monitor the monitoring object according to the inspection result.
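As an illustration only, one pass through steps S110 to S140 can be sketched as a function that chains four callables. The function name `inspection_cycle`, the callable signatures and the returned status strings are assumptions made for this sketch and do not come from the application.

```python
from typing import Callable, Dict


def inspection_cycle(
    acquire: Callable[[], Dict[str, float]],
    analyze: Callable[[Dict[str, float]], str],
    inspect: Callable[[str], bool],
    correct: Callable[[], None],
) -> str:
    """Run one pass of S110-S140 and report which branch was taken."""
    data = acquire()                 # S110: real-time operating data
    analysis = analyze(data)         # S120: analysis result
    fault_found = inspect(analysis)  # S130: inspection result
    if fault_found:                  # S140: correct the monitoring object
        correct()
        return "corrected"
    return "monitoring"              # S140: no fault, keep monitoring
```

In a running system this function would presumably be called repeatedly, since the application describes the method as a continuous closed loop.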
It should be noted that, compared with the traditional inspection method, the present application relies on an intelligent operation and maintenance network and proposes an inspection method based on situation prediction. The inspection method introduces an automated, intelligent closed-loop architecture. It is based on situational awareness and takes the two dimensions of time and space as its starting point. Through information extraction, element analysis and situation prediction, it can accurately predict the development trend of the target and initiate inspection operations in a more targeted manner.
Here, situational awareness is the ability to gain environment-based, dynamic and holistic insight into the security risks of the monitoring object. Based on security big data, situational awareness can acquire, understand and display the security elements that cause changes in the network situation in a large-scale network environment, and then make decisions and take action. From a global perspective, it improves the ability to discover and identify security threats, to understand and analyze them, and to respond to and handle them.
The inspection method of the present application can obtain the real-time operating data of the monitoring object in real time and analyze that data so as to proactively predict and perceive the links where faults may occur based on the current operating behavior of the monitoring object, that is, to obtain the corresponding analysis result. The monitoring object is then inspected promptly and accurately according to the analysis result, and corrected and continuously monitored according to the inspection result. The present application removes manual operation and maintenance from the inspection action and introduces an automated closed-loop structure, shifting from passive to active operation and maintenance, so that the accuracy and timeliness of the service are guaranteed.
In one implementation, the monitoring object includes but is not limited to systems, services, networks, databases, physical machines, storage and the like. At a finer granularity, it also includes the performance and event indicators of each monitoring object, such as network load, service failure rate, service success rate, packet loss rate, alarms, system CPU, memory and disk IO. According to an embodiment of the present application, the real-time operating data of the monitoring object may be the network load at the current time or the packet loss rate of service transmission at the current time; the present application does not specifically limit this.
In one implementation, the inspection method of the present application uses data adaptation sensing technology. It can obtain the real-time operating data of the monitoring object in real time, intelligently determine the abnormal points of the current monitoring object from that data, quickly locate the abnormal points, proactively initiate precise inspection, and perform fault determination and automatic repair according to the inspection result. This solves the poor timeliness and low accuracy of traditional inspection methods. In addition, the inspection method uses event/scenario awareness technology. When the monitoring object is a network event or network scenario, the method can monitor scenario switching or event changes while the service of a single network element is running, sense changes of monitoring targets between different network elements, and adjust the inspection mode in time. This solves the poor scalability and low portability of traditional inspection methods and guarantees the general applicability of the service.
Moreover, manual processing is still an indispensable part of the traditional inspection method: an operator must identify and judge the real-time operating data of the monitoring object and initiate the inspection operation. For this reason, the inspection method of the present application uses intent awareness technology. It can receive alarm notifications or inspection requests initiated by the monitoring object, analyze the intent of the monitoring object, and proactively initiate the precise inspection operation the monitoring object requires. This makes inspection autonomous, reduces the manual operations of the traditional method, effectively improves inspection efficiency and lowers labor costs.
It should be noted that the inspection method of the present application is a dynamic, continuous cyclic action. To explain it more clearly, the present application breaks the closed-loop process down into a one-way flow. In practice, the monitoring object is continuously monitored, inspected and corrected. The inspection locating operation and the fault determination operation are relatively complex processes. Different monitoring objects have different determination criteria, so the present application needs to be supported by a large amount of data analysis.
Referring to Figure 2, in the first aspect, an embodiment of the present application provides a method for obtaining real-time operating data of the monitoring object, including but not limited to steps S210, S220 and S230.
Step S210: collect several pieces of event information of the monitoring object;
Step S220: disassemble the event information to obtain several machine events corresponding to the event information, and store the machine events in a preset event library, wherein a machine event is a collection of several machine-language items obtained by disassembling the event information;
Step S230: continuously monitor the monitoring object and obtain its real-time operating data.
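The application does not specify how event information is disassembled or how the event library is stored. The following is a minimal sketch of steps S210 and S220 under stated assumptions: event information is taken to be a semicolon-separated string, a machine event is modeled as a set of normalized tokens, and `disassemble` and `EventLibrary` are hypothetical names introduced here.

```python
from typing import Dict, FrozenSet


def disassemble(event_info: str) -> FrozenSet[str]:
    """Split raw event information into a set of normalized tokens.

    Assumption: items are separated by ';'. The application does not
    define the actual format.
    """
    return frozenset(
        tok.strip().lower() for tok in event_info.split(";") if tok.strip()
    )


class EventLibrary:
    """In-memory stand-in for the preset event library."""

    def __init__(self) -> None:
        self._events: Dict[str, FrozenSet[str]] = {}

    def store(self, name: str, event_info: str) -> FrozenSet[str]:
        """Disassemble event information and store the machine event (S220)."""
        machine_event = disassemble(event_info)
        self._events[name] = machine_event
        return machine_event

    def get(self, name: str) -> FrozenSet[str]:
        return self._events[name]

    def __len__(self) -> int:
        return len(self._events)
```

Each call to `store` adds one sample to the library, which mirrors the gradual accumulation the application describes for the event library.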
It should be noted that, for the operation of a service system, the aspects of system health that matter differ between events/scenarios. In one implementation, when the scenario or event switches, the inspection mode should be switched, based on the real-time operating data of the monitoring object, to the one corresponding to the current event/scenario. The inspection method of the present application can adjust the inspection mode in time according to scenario switching or event changes while the service of a single network element is running, or by sensing changes of monitoring targets between different network elements. This solves the poor scalability and low portability of traditional inspection methods.
According to an embodiment of the present application, when the monitoring object is point-to-point message transmission, the health concern is the success rate of message delivery; when the monitoring object is group or push messaging, the health concern is whether the system indicators are overloaded; when the monitoring object is file messaging, the health concerns are concentrated on disk IO, network load and database running status, because file messages add file read/write operations, occupy a large share of network bandwidth and may even involve database storage.
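The scenario-to-concern correspondence in this embodiment can be written down as a lookup table. This is a sketch only; the scenario keys and metric names are illustrative identifiers chosen here, not names used by the application.

```python
from typing import Dict, List

# Health concerns per scenario, following the embodiment above.
# All keys and metric names are hypothetical identifiers.
FOCUS_BY_SCENARIO: Dict[str, List[str]] = {
    "point_to_point": ["delivery_success_rate"],
    "group_or_push": ["system_load"],
    "file_message": ["disk_io", "network_load", "database_status"],
}


def inspection_focus(scenario: str) -> List[str]:
    """Return the metrics to watch for a scenario (empty if unknown)."""
    return FOCUS_BY_SCENARIO.get(scenario, [])
```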
Referring to Figure 11, which is a flowchart of an inspection method, provided by an embodiment of the present application, in which the monitoring object is event/scenario data: it can be understood that, when the inspection method is applied to events or scenarios, a large amount of event information of the monitoring object should be collected before the real-time operating data is obtained. To make this information easier to read, the collected event information is disassembled into collections of machine-language items with higher readability, namely machine events. The machine events correspond one-to-one to the event information. After the machine events are obtained, they are stored in the preset event library.
It should be noted that the benchmark event library is the cornerstone of the inspection method of the present application and the basis of its strong fuzzy recognition capability. Storing machine events in the preset event library is a long-term, continuous process of accumulation: a large number of different machine events are stored in the library, and each stored machine event expands and strengthens the sample types the library covers. Continuously enhancing the event library enriches its stored samples, which makes it easier to query the library later based on real-time operating data.
Referring to Figure 3, in the first aspect, an embodiment of the present application provides a method for analyzing the real-time operating data to obtain an analysis result, including but not limited to steps S310 and S320.
Step S310: compare the real-time operating data with preset initial operating data to obtain a differentiation index;
Step S320: compare the differentiation index with a preset differentiation index threshold to obtain the analysis result.
Referring to Figure 10, which is a flowchart of an inspection method, provided by an embodiment of the present application, in which the monitoring object is the system status: it should be noted that the inspection method can be applied not only to monitoring system status but also to monitoring service operation. Taking system status monitoring as an example, initial operating data must be set for the monitoring object before its real-time operating data is obtained. In one implementation, the initial operating data is defined by the operator and can be understood as a baseline value. It may be the operating state parameters of the system under ideal conditions or the operating state parameters of the system under critical conditions.
According to an embodiment of the present application, after the initial operating data is set, the real-time operating data of the monitoring object can be obtained and compared with the initial operating data to obtain a differentiation index. Specifically, the real-time operating data includes CPU usage, disk IO, memory space, file handles and the like, and the differentiation index represents the deviation between the real-time operating data and the initial operating data. After the differentiation index is obtained, it is compared with the preset differentiation index threshold to obtain the analysis result.
In one implementation, the analysis result indicates whether the current monitoring object is abnormal.
According to another embodiment of the present application, if the differentiation index shows that the system storage space of the monitoring object is greater than the threshold set by the initial operating data, the disk IO of the current monitoring object, the CPU usage of each process, the file handle usage, the presence of oversized files and the like need to be checked further.
It can be understood that comparing the differentiation index with the preset differentiation index threshold to obtain the analysis result includes one of the following: when the differentiation index is greater than the differentiation index threshold, the analysis result is that the monitoring object is abnormal; or when the differentiation index is less than the differentiation index threshold, the analysis result is that the monitoring object is normal.
It should be noted that the differentiation index reflects the deviation between the real-time operating data and the preset initial operating data. The larger the differentiation index, the further the real-time operating data deviates from the initial operating data. A differentiation index greater than the threshold indicates that the monitoring object is abnormal at that moment; a differentiation index less than the threshold indicates that the monitoring object is operating normally.
According to an embodiment of the present application, when the differentiation index shows that the system storage space of the monitoring object is greater than the threshold set by the initial operating data, the system storage space of the monitoring object has exceeded the limit of files that the system can accommodate.
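Steps S310 and S320 can be sketched as follows. The application does not define how the differentiation index is computed, so this sketch assumes it is the largest relative deviation of any metric from its (non-zero) baseline value. It also assumes that an index exactly equal to the threshold counts as normal, a case the text leaves open. The function names are hypothetical.

```python
from typing import Dict


def differentiation_index(
    realtime: Dict[str, float], baseline: Dict[str, float]
) -> float:
    """S310: largest relative deviation from the baseline (assumed definition).

    Assumes every baseline value is non-zero.
    """
    return max(
        abs(realtime[name] - base) / abs(base) for name, base in baseline.items()
    )


def analyze(
    realtime: Dict[str, float], baseline: Dict[str, float], threshold: float
) -> str:
    """S320: compare the index with the threshold.

    Equality is treated as normal here; this is an assumption.
    """
    index = differentiation_index(realtime, baseline)
    return "abnormal" if index > threshold else "normal"
```

For example, with a baseline CPU usage of 0.5 and a threshold of 0.2, a real-time CPU usage of 0.9 gives a relative deviation of 0.8 and is reported as abnormal.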
Referring to Figure 4, in the first aspect, an embodiment of the present application provides another method for analyzing the real-time operating data to obtain an analysis result, including but not limited to steps S410 and S420.
Step S410: extract parameters from the real-time operating data to obtain first system operating parameters;
Step S420: compare the first system operating parameters with second system operating parameters to obtain the analysis result, wherein the second system operating parameters are obtained by extracting parameters from the real-time operating data acquired in the previous period.
Referring to Figure 11, it can be understood that, when the monitoring object is an event/scenario in the network environment, determining whether the event/scenario has changed requires first extracting parameters from the real-time operating data to obtain the first system operating parameters, which are the operating parameters of the system under the current event/scenario. The first system operating parameters are then compared with the preset second system operating parameters to obtain the analysis result, which reflects whether an event/scenario change has occurred for the current monitoring object.
It should be noted that the first system operating parameters of the monitoring object differ between events/scenarios. The first system operating parameters reflect the current state of the event/scenario, and different first system operating parameters correspond to different events/scenarios. When several first system operating parameters change, the event/scenario has changed. The first system operating parameters therefore need to be compared with the second system operating parameters to determine whether the event/scenario has changed.
In one implementation, the inspection method of the present application collects the real-time operating data of the monitoring object in real time, which can also be understood as collecting it at preset intervals. The interval is defined by the operator and may be, for example, 1 second or 1 millisecond. The second system operating parameters are obtained by extracting parameters from the real-time operating data acquired in the previous period. By comparing the first and second system operating parameters in real time and checking whether the parameters have changed, the present application can quickly determine whether the event/scenario has changed.
It can be understood that comparing the first system operating parameters with the second system operating parameters to obtain the analysis result includes one of the following: when the first system operating parameters differ from the second system operating parameters, the analysis result is that the event has changed; or when the first system operating parameters are the same as the second system operating parameters, the analysis result is that the event has not changed.
It should be noted that system indicators differ between events/scenarios, that is, when the event/scenario changes, the first system operating parameters change as well. The inspection method of the present application can therefore determine whether the event/scenario has changed by monitoring whether the first system operating parameters change. When the first system operating parameters differ from the second system operating parameters, the event/scenario is determined to have changed; when they are the same, the event/scenario is determined not to have changed.
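The comparison in step S420 can be sketched as a parameter-by-parameter diff between the current period and the previous one. The function names and the result strings are assumptions for this sketch. The exact-equality test is also an assumption, since a real system would probably tolerate small numeric noise.

```python
from typing import Dict, List


def changed_parameters(
    first: Dict[str, float], second: Dict[str, float]
) -> List[str]:
    """Names of parameters whose values differ between the two periods.

    A parameter present in only one period also counts as changed.
    """
    names = sorted(set(first) | set(second))
    return [name for name in names if first.get(name) != second.get(name)]


def analyze_event_change(
    first: Dict[str, float], second: Dict[str, float]
) -> str:
    """S420: report whether the event/scenario changed."""
    if changed_parameters(first, second):
        return "event changed"
    return "event unchanged"
```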
In one implementation, different events/scenarios correspond to different inspection modes. When the event/scenario changes, the inspection mode should change accordingly.
Referring to Figure 5, in the first aspect, an embodiment of the present application provides a method for inspecting the monitoring object according to the analysis result to obtain an inspection result, including but not limited to steps S510 and S520.
Step S510: when the analysis result is that the monitoring object is abnormal, obtain the corresponding specified parameters of the monitoring object according to the differentiation index;
Step S520: inspect the monitoring object according to the specified parameters to obtain the inspection result.
Referring to Figure 10, it should be noted that, when the differentiation index is greater than the differentiation index threshold and the analysis result is that the monitoring object is abnormal, the specified parameters of the monitoring object need to be obtained according to the differentiation index. In one implementation, when the differentiation index shows that the system storage space exceeds the preset threshold, the system storage space of the monitoring object is suspected to be abnormal, and it is necessary to determine more precisely which parameter caused the abnormality. In this case the specified parameters of the monitoring object may be the system disk IO status, the CPU usage of each process, the file handle usage and so on. In other words, the inspection method of the present application uses real-time monitoring data to accurately locate the suspected abnormal nodes in the system.
After the specified parameters are obtained, the monitoring object can be inspected according to them, that is, a precise inspection of the monitoring object is quickly initiated at the suspected abnormal nodes and the inspection result is obtained, which saves a large amount of inspection resources. In one implementation, the inspection result reflects whether the suspected abnormal node is in fact abnormal, so that the inspection method can perform the corresponding correction and monitoring operations on the monitoring object based on the inspection result.
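Step S510 can be pictured as a drill-down table that maps each suspicious metric to the narrower parameters worth inspecting. The sketch below encodes only the storage-space example given in the text. The table `DRILL_DOWN`, the metric names and the per-metric deviation input are assumptions made here, and metrics without a table entry simply fall back to themselves.

```python
from typing import Dict, List

# Drill-down targets per suspicious metric. Only the storage-space entry
# comes from the embodiment; the identifiers themselves are hypothetical.
DRILL_DOWN: Dict[str, List[str]] = {
    "storage_space": [
        "disk_io",
        "per_process_cpu",
        "file_handle_usage",
        "oversized_files",
    ],
}


def specified_parameters(
    deviations: Dict[str, float], threshold: float
) -> List[str]:
    """S510: list the parameters to inspect for each metric above threshold."""
    params: List[str] = []
    for metric, deviation in deviations.items():
        if deviation > threshold:
            params.extend(DRILL_DOWN.get(metric, [metric]))
    return params
```

Limiting the inspection in step S520 to the returned parameters is what keeps it targeted rather than exhaustive.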
It can be understood that the method further includes: when the analysis result is that the monitoring object is normal, adjusting the monitoring object according to the differentiation index and a preset baseline adaptive adjustment algorithm.
It should be noted that, when the differentiation index is less than the differentiation index threshold and the analysis result is that the monitoring object is normal, the first system operating data of the monitoring object is still within a controllable range. To prevent the deviation between the first system operating data and the second system operating data from growing and the working condition of the monitoring object from deteriorating further, the monitoring object still needs to be adjusted according to the differentiation index and the preset baseline adaptive adjustment algorithm.
In one implementation, the baseline adaptive adjustment algorithm described in the present application is not unique. It may be conventional PID, minimum mean square error, high-order Kalman filtering and so on. In practice, the best choice is made by observing the degree of deviation shown by the differentiation index in the actual application scenario of the system. The present application does not specifically limit this.
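Of the candidate algorithms named above, conventional PID is the simplest to show. The following is a textbook discrete PID controller, given only as a generic sketch: the gains, the time step, and the way the output would be applied to the monitoring object are all assumptions, since the application does not specify them.

```python
from typing import Optional


class PID:
    """Textbook discrete PID controller (generic sketch, not the patented method)."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self._integral = 0.0
        self._prev_error: Optional[float] = None

    def update(self, error: float, dt: float = 1.0) -> float:
        """Return the control output for the current deviation.

        `error` plays the role of the differentiation index for one metric.
        """
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative
```

The output would be fed back as an adjustment each monitoring period; how it maps to a concrete action on the monitoring object depends on the deployment.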
Referring to Figure 6, in the first aspect, an embodiment of the present application provides a method for inspecting the monitoring object according to the analysis result to obtain an inspection result, including but not limited to steps S610, S620 and S630.
Step S610: when the analysis result is that the event has changed, perform big data analysis on the real-time operating data to obtain real-time event information;
Step S620: query the event library according to the real-time event information to obtain inspection information corresponding to the real-time event information;
Step S630: inspect the monitoring object according to the inspection information to obtain the inspection result.
Referring to Figure 11, it should be noted that, when the first system operating parameters differ from the second system operating parameters and the analysis result is that the event/scenario has changed, big data analysis needs to be performed on the real-time operating data to obtain real-time event information. In one implementation, the inspection method of the present application uses fuzzy recognition to extract the valid event information from the real-time operating data and thereby determine the currently running event/scenario, that is, obtain the real-time event information.
It should be noted that, after the real-time event information is obtained, the event library is queried according to it. The event library has been expanded many times and contains sample information for a large number of events. Different events/scenarios correspond to different inspection modes, and the event library also records the inspection mode corresponding to each event/scenario. The inspection method of the present application can therefore look up the event/scenario corresponding to the real-time event information in the event library to identify the current event/scenario of the monitoring object, and obtain from the event library the inspection information, that is, the inspection mode, corresponding to that event/scenario. Finally, the monitoring object is inspected according to the inspection mode to obtain the inspection result, and the corresponding correction and monitoring operations are performed on the monitoring object based on the inspection result.
It can be understood that performing big data analysis on the real-time operating data to obtain real-time event information includes: performing fuzzy recognition on the real-time operating data in combination with the event library to extract the real-time event information.
It should be noted that fuzzy recognition is a method of classifying an object to be recognized into the corresponding entry of a given standard library. Fuzzy recognition methods are divided into direct methods and indirect methods. The present application can perform fuzzy recognition on the real-time operating data in combination with the event library to accurately extract the real-time event information.
According to an embodiment of the present application, event one has features A, B, C and D, and event two has features A, E, F and G. When fuzzy recognition is performed on an unknown event and features A, B, C and D are recognized first, the event is directly classified as event one; this is the direct method. If an unknown event is then recognized as having features A, E, F and G, it is indirectly classified as event two according to the nearest principle. In practice, actual events/scenarios are often more complex, and handling them requires more advanced and more accurate fuzzy recognition techniques.
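The two-event example above can be illustrated with a minimal sketch. It is not the recognition algorithm of the application, which is not specified here. It assumes that an exact feature match counts as the direct method, that partial matches are resolved by the nearest principle, and that closeness is measured by Jaccard similarity; the names `EVENT_LIBRARY`, `similarity`, `recognize` and the `min_score` cutoff are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# Hypothetical event library taken from the two-event example in the text.
EVENT_LIBRARY: Dict[str, FrozenSet[str]] = {
    "event_1": frozenset("ABCD"),
    "event_2": frozenset("AEFG"),
}


def similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two feature sets (an assumed closeness measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recognize(
    observed: Iterable[str],
    library: Dict[str, FrozenSet[str]] = EVENT_LIBRARY,
    min_score: float = 0.5,
) -> Tuple[Optional[str], str, float]:
    """Return (event name or None, method used, similarity score)."""
    features = frozenset(observed)
    # Direct method (as assumed here): the feature set matches a known event exactly.
    for name, known in library.items():
        if known == features:
            return name, "direct", 1.0
    # Nearest principle: pick the known event with the highest similarity.
    best_name: Optional[str] = None
    best_score = 0.0
    for name, known in library.items():
        score = similarity(features, known)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= min_score:
        return best_name, "nearest", best_score
    # Too far from every known event: report it as unknown rather than guess.
    return None, "unknown", best_score
```

With this sketch, an observation with features A, E and F is assigned to `event_2` by the nearest principle with a score of 0.75, while an observation sharing no features with the library is reported as unknown. The cutoff keeps weak matches from being forced onto a known event; a real system would tune it against the depth of its event library.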
It should be noted that the effect achieved by the inspection method of the present application also depends on the depth of the event library and the accuracy of the fuzzy recognition algorithm.
It can be understood that correcting and continuously monitoring the monitoring object according to the inspection result includes one of the following: when the inspection result indicates that no fault exists, continuously monitoring the monitoring object; or, when the inspection result indicates that a fault exists, correcting the monitoring object to obtain a correction result and performing the corresponding operation on the monitoring object according to the correction result.
It should be noted that, once the inspection has finished and the inspection result has been obtained, the corresponding operation needs to be performed on the monitoring object according to that result. When the inspection result indicates that no fault exists, the monitoring object has no abnormality and can simply be monitored continuously. When the inspection result indicates that a fault exists, some node of the monitoring object is abnormal; so that the monitoring object can work normally afterwards, it needs to be corrected to repair the abnormal node. In one embodiment, after the inspection method of the present application corrects the abnormal node, a correction result is returned, and the corresponding operation is performed on the monitoring object according to it. The correction result may be success or failure, and different correction results lead to different follow-up operations.
It can be understood that performing the corresponding operation on the monitoring object according to the correction result includes one of the following: when the correction result is success, continuously monitoring the monitoring object; or, when the correction result is failure, sending an alarm signal to the monitoring object.
It should be noted that a successful correction result indicates that the abnormal node in the monitoring object has been corrected; to guard against abnormalities in subsequent operation, the monitoring object still needs to be monitored continuously. A failed correction result indicates that the abnormal node could not be corrected, that is, automatic correction alone cannot repair it. In that case an alarm signal needs to be sent to the monitoring object so that it issues a warning and operators can manually correct the abnormal node in time.
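The branching described above (no fault, fault corrected, fault not corrected) can be summarized in a short sketch. The names `Action`, `handle_inspection` and the `try_correct` callback are hypothetical and stand in for whatever correction mechanism a real system provides.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    CONTINUE_MONITORING = "continue_monitoring"
    RAISE_ALARM = "raise_alarm"


def handle_inspection(fault_found: bool, try_correct: Callable[[], bool]) -> Action:
    """Choose the follow-up action for one inspection result.

    try_correct is a hypothetical callback that attempts the automatic
    correction and returns True on success, False on failure.
    """
    if not fault_found:
        # No fault: nothing to repair, keep monitoring.
        return Action.CONTINUE_MONITORING
    if try_correct():
        # Fault repaired automatically: resume continuous monitoring.
        return Action.CONTINUE_MONITORING
    # Automatic correction failed: raise an alarm so that an operator can step in.
    return Action.RAISE_ALARM
```

The correction is only attempted when a fault was found, so a healthy monitoring object is never modified.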
Referring to Figure 8, Figure 8 is a schematic diagram of a networking structure under a multi-network-element topology provided by an embodiment of the present application. It should be noted that the embodiments described above concern only event changes within a single network element. In 5G service networking, the network element topology is intricate, and network element characteristics need to be considered in addition to event characteristics. When the monitoring object changes from a single network element to a topological system containing multiple network elements, the situational awareness of the inspection method of the present application needs to extend from events/scenarios to network element switching. In one embodiment, the present application can, through model and scenario adaptation, sense and quickly capture a switch of the monitoring object: when network element A switches to network element E, the present application can quickly sense the change of network element and adjust the inspection mode in time. This avoids manual operation by operators and effectively improves the universality and portability of the inspection method.
Referring to Figure 7, in a second aspect, an embodiment of the present application provides an alarm method, including but not limited to step S710, step S720, step S730 and step S740.
Step S710: obtain an alarm signal of the monitoring object;
Step S720: perform intent recognition on the monitoring object according to the alarm signal to obtain the alarm intent of the monitoring object;
Step S730: inspect the monitoring object according to the alarm intent to obtain an inspection result;
Step S740: perform fault determination and correction on the monitoring object according to the inspection result.
It should be noted that the inspection methods described in the first aspect of the present application all actively obtain the target information of the monitoring object. The alarm method of the present application instead uses intent-sensing technology: it passively receives an alarm notification or inspection request initiated by the monitoring object, senses the intent of the monitoring object, and actively initiates the precise inspection operation that the monitoring object needs. This makes the inspection automatic, reduces the manual operations of traditional inspection methods, improves inspection efficiency and lowers labor costs.
According to an embodiment of the present application, when an abnormal node in the monitoring object has not been corrected successfully, the monitoring object actively initiates an alarm signal. The inspection method of the present application passively obtains the alarm signal and performs intent recognition on the monitoring object according to it, obtaining the alarm intent of the monitoring object. In one embodiment, the alarm signals include, but are not limited to, service success rate alarms, system overload alarms, link abnormality alarms, service abnormality alarms and database abnormality alarms. An alarm signal reflects the alarm intent of the monitoring object, so that intent can be identified by performing intent recognition on the alarm signal.
In one embodiment, after the alarm intent of the monitoring object is obtained, the monitoring object needs to be inspected according to the alarm intent to determine once more whether the suspected abnormal node is in fact abnormal, which yields the inspection result. Fault determination is then performed according to the inspection result: if the suspected abnormal node is indeed abnormal, it needs to be corrected again or manual correction needs to be brought in; if the suspected abnormal node is not abnormal, the monitoring object is monitored continuously.
According to another embodiment of the present application, when the alarm signal is a service success rate alarm, the inspection method of the present application triggers process and log inspection after receiving the alarm signal, to further determine the cause of failure. When the alarm signal is a database abnormality alarm, the inspection method checks the current running state of the database, its storage capacity, disk I/O, read/write latency and similar information after receiving the alarm signal. When the alarm signal is a link abnormality alarm, the inspection method checks the running state and network connections of the relevant network elements after receiving the alarm signal.
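The mapping from alarm type to targeted checks described above can be pictured as a lookup table. The sketch below is illustrative only: the alarm-type keys, the check names and the function `plan_inspection` are hypothetical labels for the three cases named in the text, not identifiers from the application.

```python
from typing import Dict, List

# Hypothetical table: recognized alarm intent -> checks to run.
INSPECTION_PLAN: Dict[str, List[str]] = {
    "service_success_rate": ["process", "log"],
    "database_abnormal": [
        "db_running_state",
        "storage_capacity",
        "disk_io",
        "read_write_latency",
    ],
    "link_abnormal": ["network_element_state", "network_connection"],
}


def plan_inspection(alarm_type: str) -> List[str]:
    """Return the checks for an alarm type, or an empty list if it is unknown."""
    # A copy is returned so that callers cannot mutate the shared table.
    return list(INSPECTION_PLAN.get(alarm_type, []))
```

An empty result means that no targeted plan is known for the alarm; how to handle that case (for example a broader inspection or manual handling) is left open here, since the text does not specify it.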
It should be noted that the monitoring object can also actively send an intent signal. The intent signal includes a designated inspection operation, that is, the monitoring object actively requests a certain type of inspection action rather than being inspected only after an alarm signal is received.
In a third aspect, referring to Figure 9, which is a schematic structural diagram of an inspection system provided by an embodiment of the present application, an embodiment of the present application provides an inspection system, including: an intelligent control center, configured to execute the inspection method of the first-aspect embodiments and the alarm method of the second-aspect embodiments of the present application, for example the above-described method steps S110 to S140 in Figure 1, S210 to S230 in Figure 2, S310 to S320 in Figure 3, S410 to S420 in Figure 4, S510 to S520 in Figure 5, S610 to S630 in Figure 6 and S710 to S740 in Figure 7; a measurement unit, communicatively connected to the intelligent control center and to the monitoring object, and configured to obtain the real-time operating data of the monitoring object and send it to the intelligent control center; a control unit, communicatively connected to the intelligent control center and configured to receive instructions sent by the intelligent control center, where the instructions include inspection instructions and correction instructions; and an actuator, communicatively connected to the control unit and to the monitoring object, where the control unit is further configured to control the actuator, according to the instructions, to perform the corresponding operations on the monitoring object.
Referring to Figure 9, it should be noted that the basic idea of the inspection method of the present application is as follows: the inspection system collects the operating data of the monitoring object in real time and feeds it into a closed-loop control system, which drives decisions intelligently and initiates precise inspections. Manual operation is removed from the inspection action itself (zero contact, zero waiting), and only the necessary repair operations are provided according to the fault result. In one embodiment, the inspection system includes an intelligent control center, a measurement unit, a control unit and an actuator. The intelligent control center is configured to collect the performance indicators of the monitoring object, manage full-stack operation and maintenance data centrally, analyze and predict the data intelligently, sense potential fault points and provide correction plans; in other words, the intelligent control center is configured to execute the inspection method or the alarm method. The measurement unit is communicatively connected to the intelligent control center and to the monitoring object, and is configured to obtain the indicator parameters of the monitoring object, that is, the real-time operating data, and send them to the intelligent control center. The control unit is communicatively connected to the intelligent control center and is configured to receive instructions sent by it: the intelligent control center can pinpoint the type of inspection the monitoring object needs and send an inspection instruction to the control unit, and, once the inspection result has made the fault type of the monitoring object clear, it can start the system self-repair function and send a correction instruction to the control unit. The actuator is communicatively connected to the control unit and to the monitoring object; the control unit is configured to control the actuator to perform inspection operations on the monitoring object according to the inspection instruction and correction operations according to the correction instruction.
Referring to Figure 10, according to an embodiment of the present application, the intelligent control center presets initial operating data before obtaining the real-time operating data of the monitoring object, and then derives a differentiation indicator from the real-time operating data and the initial operating data. The differentiation indicator is compared with a preset differentiation indicator threshold. When the analysis result is that the monitoring object is normal, the monitoring object is adjusted according to the differentiation indicator and a preset baseline-value adaptive adjustment algorithm. When the analysis result is that the monitoring object is abnormal, the intelligent control center actively sends a precise inspection instruction to the control unit, specifying the parameters or states of the monitoring object to be inspected, and the control unit controls the actuator to initiate the inspection operation on the monitoring object according to that instruction. When the inspection operation ends, the monitoring object returns the inspection result to the intelligent control center, which evaluates it. If the inspection result indicates that no fault exists, the intelligent control center continues to monitor the monitoring object. If the inspection result indicates that a fault exists, the intelligent control center sends a correction instruction to the control unit, so that the control unit controls the actuator to correct the monitoring object according to the correction instruction. When the correction operation ends, the monitoring object returns the correction result to the intelligent control center: if the correction succeeded, the monitoring object continues to be monitored; if it failed, the intelligent control center raises an alarm so that manual intervention can take place.
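The threshold comparison and baseline adjustment in this embodiment can be illustrated with a small numeric sketch. The passage does not define how the differentiation indicator is computed or how the baseline is adapted, so both choices below are assumptions: the indicator is taken as the relative deviation from the baseline, and the baseline is updated with an exponential moving average. The names `differentiation`, `analyze` and `alpha` are hypothetical.

```python
from typing import Tuple


def differentiation(realtime: float, baseline: float) -> float:
    """Relative deviation of a real-time value from its baseline (assumed definition)."""
    if baseline == 0:
        return abs(realtime)
    return abs(realtime - baseline) / abs(baseline)


def analyze(
    realtime: float, baseline: float, threshold: float, alpha: float = 0.1
) -> Tuple[str, float]:
    """Return (analysis result, baseline to use for the next sample)."""
    if differentiation(realtime, baseline) > threshold:
        # Abnormal: leave the baseline untouched and let the caller start an inspection.
        return "abnormal", baseline
    # Normal: move the baseline toward the observed value (assumed moving-average update).
    return "normal", (1 - alpha) * baseline + alpha * realtime
```

For example, with a baseline of 100, a threshold of 0.2 and `alpha` of 0.5, a sample of 110 is normal and moves the baseline to 105, while a sample of 150 is abnormal and leaves the baseline at 100. Freezing the baseline on abnormal samples keeps a fault from being absorbed into the reference value.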
Referring to Figure 11, according to another embodiment of the present application, before obtaining the real-time operating data of the monitoring object, the intelligent control center collects a large amount of event information about the monitoring object and stores it in the event library. It then extracts parameters from the real-time operating data of the monitoring object to obtain the first system operating parameters, and determines from them whether the event/scenario has changed. If the event/scenario has not changed, the intelligent control center continues to monitor the monitoring object. If the event/scenario has changed, the intelligent control center performs big data analysis on the real-time operating data to obtain real-time event information and queries the event library with it to obtain the corresponding inspection information. The intelligent control center then converts the inspection information into an inspection instruction and sends it to the control unit, so that the control unit controls the actuator to inspect the monitoring object according to the instruction. After completing the inspection operation, the actuator returns the inspection result to the intelligent control center.
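The decision flow of this embodiment can be sketched as a single step function. This is a simplified illustration under stated assumptions: the event library maps an event name to a list of inspection items, an event/scenario change is detected by comparing two parameter dictionaries for equality, and event identification is delegated to a caller-supplied function. The event names, inspection items and the names `next_step` and `identify_event` are all hypothetical.

```python
from typing import Callable, Dict, List, Mapping, Tuple

# Hypothetical event library: event/scenario name -> inspection mode (list of checks).
EVENT_LIBRARY: Dict[str, List[str]] = {
    "peak_traffic": ["cpu_load", "link_state"],
    "version_upgrade": ["process", "log"],
}


def next_step(
    previous_params: Mapping[str, float],
    current_params: Mapping[str, float],
    identify_event: Callable[[Mapping[str, float]], str],
    library: Dict[str, List[str]] = EVENT_LIBRARY,
) -> Tuple[str, List[str]]:
    """Decide what the control center does after one round of parameter extraction."""
    if dict(current_params) == dict(previous_params):
        # Event/scenario unchanged: keep monitoring, no inspection instruction.
        return "continue_monitoring", []
    # Event/scenario changed: identify it and look up its inspection mode.
    event = identify_event(current_params)
    if event not in library:
        # Not in the library yet: no inspection mode is known for this event.
        return "unknown_event", []
    return "inspect", list(library[event])
```

The returned list plays the role of the inspection instruction sent to the control unit. The `unknown_event` branch is an addition for completeness; the text does not say what happens when the event library has no matching entry.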
In one embodiment, after the inspection operation on the monitoring object is completed, the intelligent control center collects event/scenario information again and continues to inspect the monitoring object.
Referring to Figure 12, Figure 12 is another main flow chart of the alarm method provided by an embodiment of the present application. According to another embodiment of the present application, the monitoring object actively initiates an alarm and sends the alarm signal to the intelligent control center. The intelligent control center performs intent recognition on the monitoring object according to the alarm signal, obtains its alarm intent, and initiates a precise inspection according to that intent by sending an inspection instruction to the control unit. After receiving the inspection instruction, the control unit controls the actuator to perform the inspection operation on the monitoring object. When the inspection operation ends, the monitoring object sends the inspection result to the intelligent control center, which then performs fault determination and correction on the monitoring object according to the inspection result.
Referring to Figure 8, it can be understood that the intelligent control center includes a centralized event management module, an intelligent engine module, a big data processing module, an impact analysis module and an event library module. The centralized event management module is configured to perform event sensing and to manage the parameters in the event library module. The intelligent engine module is configured to collect the performance indicators of the monitoring object and to analyze and predict them intelligently. The big data processing module is configured to extract the real-time operating data and to adjust the monitoring object according to the real-time operating data and a preset baseline-value adaptive adjustment algorithm. The impact analysis module is configured to predict the impact of faults on the monitoring object. The five modules are communicatively connected to one another and work together to provide situational awareness.
According to an embodiment of the present application, the intelligent control center, as the core component of the precise inspection system, consists of the centralized event management module, the intelligent engine module, the big data processing module, the impact analysis module and the event library module; together these modules provide complete situational awareness.
In one embodiment, the centralized event management module is configured to manage the event types, features, parameters and so on in the event library module. The intelligent engine module is the core control component of the intelligent control center; it is configured to collect the performance indicators of the monitoring object, to sense the type of trigger (for example a change in the real-time operating data or an event/scenario switch), and to analyze and predict the real-time operating data intelligently. The big data processing module is configured to collect the real-time operating data, extract the operating parameters of different events, objects and scenarios, and identify events/scenarios by fuzzy recognition; it is also configured to adaptively adjust the inspection baseline value according to the real-time data of the monitoring object. The impact analysis module is configured to predict the impact of a fault from the inspection result, adjust the inspection strategy or initiate automatic fault repair in time, and, when necessary, trigger an alarm and bring in manual intervention.
In a fourth aspect, referring to Figure 13, an embodiment of the present application provides an electronic device, including:
at least one memory;
at least one processor;
at least one program;
the program is stored in the memory, and the processor executes the at least one program to implement:
the inspection method of any embodiment of the first aspect of the present application and the alarm method of any embodiment of the second aspect of the present application.
The processor and the memory may be connected by a bus or in another manner.
The memory, as a non-transitory readable storage medium, can be used to store non-transitory software instructions and non-transitory executable instructions. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device or other non-transitory solid-state storage device. It can be understood that the memory may include memory located remotely from the processor, and such remote memory may be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The processor performs various functional applications and data processing by running the non-transitory software instructions, instructions and signals stored in the memory, that is, it implements the inspection method of the first-aspect embodiments described above.
The non-transitory software instructions and instructions required to implement the inspection method of the above embodiments are stored in the memory. When executed by the processor, they carry out the inspection method of the first-aspect embodiments and the alarm method of the second-aspect embodiments of the present application, for example the above-described method steps S110 to S140 in Figure 1, S210 to S230 in Figure 2, S310 to S320 in Figure 3, S410 to S420 in Figure 4, S510 to S520 in Figure 5, S610 to S630 in Figure 6 and S710 to S740 in Figure 7.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable signals, the computer-executable signals being used to execute:
the inspection method of any embodiment of the first aspect of the present application and the alarm method of any embodiment of the second aspect of the present application,
for example the above-described method steps S110 to S140 in Figure 1, S210 to S230 in Figure 2, S310 to S320 in Figure 3, S410 to S420 in Figure 4, S510 to S520 in Figure 5, S610 to S630 in Figure 6 and S710 to S740 in Figure 7.
The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The embodiments of the present application include: obtaining real-time operating data of a monitoring object; analyzing the real-time operating data to obtain an analysis result; inspecting the monitoring object according to the analysis result to obtain an inspection result; and correcting and continuously monitoring the monitoring object according to the inspection result. Compared with traditional inspection methods, the present application can analyze the real-time operating data of the current monitoring object and, based on its operating behavior, actively predict and sense the links where a fault may occur, which yields the analysis result. The monitoring object is then inspected promptly and precisely according to the analysis result, and corrected and continuously monitored according to the inspection result. This shifts operation and maintenance from passive to active and safeguards the accuracy and timeliness of the service.
From the description of the above embodiments, those of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above may be implemented as software, firmware, hardware or an appropriate combination thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable signals, data structures, instruction modules or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is known to those of ordinary skill in the art, communication media typically contain computer-readable signals, data structures, instruction modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The embodiments of the present application have been described in detail above with reference to the accompanying drawings. However, the present application is not limited to the above embodiments; within the scope of knowledge possessed by those of ordinary skill in the art, various changes can be made without departing from the purpose of the present application.

Claims (18)

  1. An inspection method, comprising:
    obtaining real-time operation data of a monitoring object;
    analyzing the real-time operation data to obtain an analysis result;
    inspecting the monitoring object according to the analysis result to obtain an inspection result; and
    correcting and continuously monitoring the monitoring object according to the inspection result.
  2. The inspection method according to claim 1, wherein the real-time operation data includes event information, and obtaining the real-time operation data of the monitoring object comprises:
    collecting a plurality of pieces of the event information of the monitoring object;
    disassembling the event information to obtain a plurality of machine events corresponding to the event information, and storing the machine events in a preset event library, wherein each machine event is a collection of several machine-language items obtained by disassembling the event information; and
    continuously monitoring the monitoring object and obtaining the real-time operation data of the monitoring object.
  3. The inspection method according to claim 1, wherein analyzing the real-time operation data to obtain the analysis result comprises:
    comparing the real-time operation data with preset initial operation data to obtain a differentiation index; and
    comparing the differentiation index with a preset differentiation index threshold to obtain the analysis result.
  4. The inspection method according to claim 3, wherein comparing the differentiation index with the preset differentiation index threshold to obtain the analysis result comprises one of the following:
    when the differentiation index is greater than the differentiation index threshold, the analysis result is that the monitoring object is abnormal; or
    when the differentiation index is less than the differentiation index threshold, the analysis result is that the monitoring object is normal.
  5. The inspection method according to claim 2, wherein analyzing the real-time operation data to obtain the analysis result further comprises:
    extracting parameters from the real-time operation data to obtain a first system operating parameter; and
    comparing the first system operating parameter with a second system operating parameter to obtain the analysis result, wherein the second system operating parameter is obtained by extracting parameters from the real-time operation data acquired in the previous period.
  6. The inspection method according to claim 5, wherein comparing the first system operating parameter with the second system operating parameter to obtain the analysis result comprises one of the following:
    when the first system operating parameter differs from the second system operating parameter, the analysis result is that an event has changed; or
    when the first system operating parameter is the same as the second system operating parameter, the analysis result is that no event has changed.
  7. The inspection method according to claim 4, wherein inspecting the monitoring object according to the analysis result to obtain the inspection result comprises:
    when the analysis result is that the monitoring object is abnormal, obtaining corresponding specified parameters of the monitoring object according to the differentiation index; and
    inspecting the monitoring object according to the specified parameters to obtain the inspection result.
  8. The inspection method according to claim 4, further comprising:
    when the analysis result is that the monitoring object is normal, adjusting the monitoring object according to the differentiation index and a preset reference-value adaptive adjustment algorithm.
  9. The inspection method according to claim 6, wherein inspecting the monitoring object according to the analysis result to obtain the inspection result further comprises:
    when the analysis result is that an event has changed, performing big data analysis on the real-time operation data to obtain real-time event information;
    querying the event library according to the real-time event information to obtain inspection information corresponding to the real-time event information; and
    inspecting the monitoring object according to the inspection information to obtain the inspection result.
  10. The inspection method according to claim 9, wherein performing big data analysis on the real-time operation data to obtain the real-time event information comprises:
    performing fuzzy recognition on the real-time operation data in combination with the event library, and extracting the real-time event information.
  11. The inspection method according to claim 6, further comprising:
    when the analysis result is that no event has changed, continuously monitoring the monitoring object.
  12. The inspection method according to claim 1, wherein correcting and continuously monitoring the monitoring object according to the inspection result comprises one of the following:
    when the inspection result is that no fault exists, continuously monitoring the monitoring object; or
    when the inspection result is that a fault exists, correcting the monitoring object to obtain a correction result, and performing a corresponding operation on the monitoring object according to the correction result.
  13. The inspection method according to claim 12, wherein performing the corresponding operation on the monitoring object according to the correction result comprises one of the following:
    when the correction result is success, continuously monitoring the monitoring object; or
    when the correction result is failure, sending an alarm signal to the monitoring object.
  14. An alarm method, comprising:
    obtaining an alarm signal of a monitoring object;
    performing intent recognition on the monitoring object according to the alarm signal to obtain an alarm intent of the monitoring object;
    inspecting the monitoring object according to the alarm intent to obtain an inspection result; and
    performing fault determination and correction on the monitoring object according to the inspection result.
  15. An inspection system, comprising:
    an intelligent control center, configured to execute the inspection method according to any one of claims 1 to 13 and/or the alarm method according to claim 14;
    a measurement unit, communicatively connected to the intelligent control center and to a monitoring object respectively, the measurement unit being configured to obtain real-time operation data of the monitoring object and send the real-time operation data to the intelligent control center;
    a control unit, communicatively connected to the intelligent control center, the control unit being configured to receive instructions sent by the intelligent control center, wherein the instructions include an inspection instruction and a correction instruction; and
    an actuator, communicatively connected to the control unit and to the monitoring object respectively, wherein the control unit is further configured to control the actuator, according to the instructions, to perform a corresponding operation on the monitoring object.
  16. The inspection system according to claim 15, wherein the intelligent control center comprises a centralized event management module, an intelligent engine module, a big data processing module, an impact analysis module, and an event library module; wherein the centralized event management module is configured to perform event awareness and to manage the parameters in the event library module; the intelligent engine module is configured to collect indicator performance of the monitoring object and to intelligently analyze and predict that indicator performance; the big data processing module is configured to extract the real-time operation data and to adjust the monitoring object according to the real-time operation data and a preset reference-value adaptive adjustment algorithm; and the impact analysis module is configured to predict the impact of a fault of the monitoring object;
    and wherein the centralized event management module, the intelligent engine module, the big data processing module, the impact analysis module, and the event library module are communicatively connected with one another and cooperate to provide situational awareness capability.
  17. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the inspection method according to any one of claims 1 to 13 and/or the alarm method according to claim 14.
  18. A computer-readable storage medium storing a computer-executable program, the computer-executable program being used to cause a computer to execute the inspection method according to any one of claims 1 to 13 and/or the alarm method according to claim 14.
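The two analysis branches recited in claims 3 to 6 can be illustrated with a short sketch. The claims do not say how the differentiation index is computed, so the largest relative deviation from the preset initial data is used here purely as an assumed definition; the function names and the "undetermined" outcome for an index exactly equal to the threshold (a case the claims leave open) are likewise hypothetical.

```python
from typing import Mapping


def differentiation_index(
    current: Mapping[str, float], baseline: Mapping[str, float]
) -> float:
    """Assumed definition: largest relative deviation of any baseline metric.

    Metrics missing from `current` are skipped; a zero baseline value falls
    back to an absolute deviation to avoid division by zero.
    """
    worst = 0.0
    for name, base in baseline.items():
        if name not in current:
            continue
        scale = abs(base) if base != 0 else 1.0
        worst = max(worst, abs(current[name] - base) / scale)
    return worst


def classify(index: float, threshold: float) -> str:
    """Threshold comparison in the manner of claim 4."""
    if index > threshold:
        return "abnormal"
    if index < threshold:
        return "normal"
    return "undetermined"  # equality is not addressed by the claims


def event_changed(
    first_params: Mapping[str, object], second_params: Mapping[str, object]
) -> bool:
    """Parameter comparison in the manner of claim 6: any difference counts as a change."""
    return dict(first_params) != dict(second_params)
```

For example, with a baseline of `{"cpu": 0.6}`, a current reading of `{"cpu": 0.9}`, and a threshold of 0.2, the assumed index is about 0.5 and `classify` reports the object as abnormal.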
PCT/CN2023/099167 2022-07-11 2023-06-08 Inspection method, alarm method, inspection system, and computer-readable storage medium WO2024012109A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210810434.2A CN117424825A (en) 2022-07-11 2022-07-11 Inspection method, alarm method, inspection system and computer readable storage medium
CN202210810434.2 2022-07-11

Publications (1)

Publication Number Publication Date
WO2024012109A1 true WO2024012109A1 (en) 2024-01-18

Family

ID=89528942

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/099167 WO2024012109A1 (en) 2022-07-11 2023-06-08 Inspection method, alarm method, inspection system, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN117424825A (en)
WO (1) WO2024012109A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070075780A (en) * 2006-01-16 2007-07-24 에스케이 텔레콤주식회사 System for inspecting faultiness of mobile terminal and method thereof
CN102638838A (en) * 2011-03-23 2012-08-15 中兴通讯股份有限公司 Method, subscriber terminals, servers, inspection update terminals and system for intelligent inspection
CN113537415A (en) * 2021-09-17 2021-10-22 中国南方电网有限责任公司超高压输电公司广州局 Convertor station inspection method and device based on multi-information fusion and computer equipment
CN113993005A (en) * 2021-10-27 2022-01-28 南方电网大数据服务有限公司 Power grid equipment inspection method and device, computer equipment and storage medium
CN116165484A (en) * 2023-02-21 2023-05-26 海南电网有限责任公司信息通信分公司 Fault positioning robot-assisted fixed inspection, inspection and scheduling method based on electric power automation operation and maintenance

Also Published As

Publication number Publication date
CN117424825A (en) 2024-01-19

Similar Documents

Publication Publication Date Title
WO2021008176A1 (en) Fault detection method and system for water chilling unit, and water chilling unit
CN105955876B (en) Data monitoring processing method and device
CN109462490B (en) Video monitoring system and fault analysis method
CN107704359B (en) Monitoring system of big data platform
CN115118581B (en) Internet of things data all-link monitoring and intelligent guaranteeing system based on 5G
CN112734971A (en) Automatic inspection method, storage medium and inspection robot
CN114493204A (en) Industrial equipment monitoring method and equipment based on industrial Internet
CN110231998B (en) Detection method and device for distributed timing task and storage medium
CN115622867A (en) Industrial control system safety event early warning classification method and system
CN117351271A (en) Fault monitoring method and system for high-voltage distribution line monitoring equipment and storage medium thereof
CN113537652A (en) Equipment health monitoring and early warning method, system, storage medium and equipment
WO2024012109A1 (en) Inspection method, alarm method, inspection system, and computer-readable storage medium
CN116975938B (en) Sensor data processing method in product manufacturing process
CN114138617B (en) Self-learning frequency conversion monitoring method and system, electronic equipment and storage medium
CN115442209B (en) Fault detection method and device, electronic equipment and storage medium
CN115378841B (en) Method and device for detecting state of equipment accessing cloud platform, storage medium and terminal
JP5388122B2 (en) Server monitoring apparatus and server failure determination method thereof
US20220173980A1 (en) Ai machine learning technology based fault management system for network equpment that supports sdn open flow protocol
CN110896545B (en) Online charging roaming fault positioning method, related device and storage medium
WO2019142414A1 (en) Network monitoring system and method, and non-transitory computer-readable medium containing program
TW201729236A (en) Data management apparatus and monitoring method of same
CN116204386B (en) Method, system, medium and equipment for automatically identifying and monitoring application service relationship
CN113452659A (en) Active defense system based on dynamic technology and method thereof
CN115150307B (en) Method and device for collecting frequency safety detection, storage medium and electronic equipment
CN113114987B (en) Hadoop-based power system inspection method and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23838603

Country of ref document: EP

Kind code of ref document: A1