WO2024012109A1

WO2024012109A1 - Inspection method, alarm method, inspection system, and computer-readable storage medium

Info

Publication number: WO2024012109A1
Application number: PCT/CN2023/099167
Authority: WO
Inventors: 单童; 李学领
Original assignee: 中兴通讯股份有限公司
Priority date: 2022-07-11
Filing date: 2023-06-08
Publication date: 2024-01-18
Also published as: CN117424825A

Abstract

Disclosed are an inspection method, an alarm method, an inspection system, and a computer-readable storage medium. The inspection method comprises: acquiring real-time operation data of a monitored object (S110); analyzing the real-time operation data to obtain an analysis result (S120); performing an inspection on the monitored object on the basis of the analysis result to obtain an inspection result (S130); and correcting and continuously monitoring the monitored object on the basis of the inspection result (S140).

Description

Inspection method, alarm method, inspection system and computer-readable storage medium

Cross-references to related applications

This application is filed based on a Chinese patent application with application number 202210810434.2 and a filing date of July 11, 2022, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is hereby incorporated by reference into this application.

Technical field

Embodiments of the present application relate to, but are not limited to, the technical field of network operation and maintenance, and particularly relate to an inspection method, an alarm method, an inspection system, and a computer-readable storage medium.

Background

Operation and maintenance management essentially refers to the continuous operation and maintenance of software and hardware performance in a network so that cost, stability, and efficiency remain at an acceptable level. Operation and maintenance management is therefore a long-term, complex, and continuous activity. Automatic inspection technology serves operation and maintenance management: it relies on a specific intelligent management platform to hand periodic, repetitive, and regular work over to tools, thereby improving operation and maintenance efficiency. However, traditional automatic inspection technology still relies on the experience of operators and cannot handle faults in the network in a timely manner, making it difficult to guarantee the accuracy and real-time performance of services.

Summary of the invention

The following is an overview of the topics described in detail in this article. This summary is not intended to limit the scope of the claims.

Embodiments of the present application provide an inspection method, an alarm method, an inspection system, and a computer-readable storage medium.

In a first aspect, embodiments of the present application provide an inspection method, which includes: obtaining real-time operating data of a monitored object; analyzing the real-time operating data to obtain an analysis result; inspecting the monitored object according to the analysis result to obtain an inspection result; and correcting and continuously monitoring the monitored object based on the inspection result.

In a second aspect, embodiments of the present application provide an alarm method, which includes: obtaining an alarm signal of a monitored object; performing intention recognition on the monitored object according to the alarm signal to obtain an alarm intention of the monitored object; inspecting the monitored object according to the alarm intention to obtain an inspection result; and performing fault determination and correction on the monitored object based on the inspection result.

In a third aspect, embodiments of the present application provide an inspection system, including: an intelligent control center configured to execute the inspection method described in the first aspect and/or the alarm method described in the second aspect; a measurement unit communicatively connected with the intelligent control center and the monitored object respectively, the measurement unit being configured to obtain the real-time operating data of the monitored object and send the real-time operating data to the intelligent control center; a control unit communicatively connected with the intelligent control center and configured to receive instructions sent by the intelligent control center, wherein the instructions include inspection instructions and correction instructions; and an execution mechanism communicatively connected with the control unit and the monitored object respectively, wherein the control unit is also configured to control the execution mechanism to perform corresponding operations on the monitored object according to the instructions.

In a fourth aspect, embodiments of the present application provide an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the inspection method described in the first aspect and/or the alarm method described in the second aspect is implemented.

In a fifth aspect, embodiments of the present application provide a computer-readable storage medium that stores a computer-executable program. The computer-executable program is used to cause a computer to execute the inspection method described in the first aspect and/or the alarm method described in the second aspect.

Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.

Brief description of the drawings

The drawings are used to provide a further understanding of the technical solution of the present application and constitute a part of the specification. They are used to explain the technical solution of the present application together with the embodiments of the present application and do not constitute a limitation of the technical solution of the present application.

Figure 1 is a main flow chart of an inspection method provided by an embodiment of the present application;

Figure 2 is a sub-flow chart of an inspection method provided by an embodiment of the present application;

Figure 3 is another sub-flow chart of an inspection method provided by an embodiment of the present application;

Figure 4 is another sub-flow chart of an inspection method provided by an embodiment of the present application;

Figure 5 is another sub-flow chart of an inspection method provided by an embodiment of the present application;

Figure 6 is another sub-flow chart of an inspection method provided by an embodiment of the present application;

Figure 7 is a main flow chart of an alarm method provided by an embodiment of the present application;

Figure 8 is a schematic diagram of a networking structure under a multi-network element topology provided by an embodiment of the present application;

Figure 9 is a schematic structural diagram of an inspection system provided by an embodiment of the present application;

Figure 10 is a flow chart of an inspection method in which the monitoring object is system status provided by an embodiment of the present application;

Figure 11 is a flow chart of an inspection method in which the monitoring object is event/scenario data provided by an embodiment of the present application;

Figure 12 is another main flow chart of an alarm method provided by an embodiment of the present application;

Figure 13 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.

Detailed description

In order to make the purpose, technical solutions and advantages of the present application more clear, the present application will be further described in detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here are only used to explain the present application and are not used to limit the present application.

It should be understood that in the description of the embodiments of this application, "multiple" (or "a plurality of") means two or more. "Greater than", "less than", "exceeding", and the like are understood to exclude the stated number, while "above", "below", "within", and the like are understood to include it. Terms such as "first" and "second" are used only to distinguish technical features and cannot be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features.

Operation and maintenance management essentially refers to the continuous operation and maintenance of software and hardware performance in a network so that cost, stability, and efficiency remain at an acceptable level. Operation and maintenance management is therefore a long-term, complex, and continuous activity. Automatic inspection technology serves operation and maintenance management: it relies on a specific intelligent management platform to hand periodic, repetitive, and regular work over to tools, thereby improving operation and maintenance efficiency. Although traditional inspection methods can discover and collect data, their processing and analysis capabilities are weak, and they cannot process complex data well. Moreover, traditional inspection methods are limited in accuracy, real-time performance, and universality, and depend heavily on the experience of operators, which makes it difficult to guarantee the stability and security of the business and hinders digital transformation. On the one hand, the waiting time from the occurrence of a fault to the discovery of the problem depends entirely on the frequency and timing of inspections, making operation and maintenance management extremely passive. On the other hand, the traditional inspection function is still limited by the network environment: when the network environment changes, the original inspection function may no longer be applicable. The network environment described in this application refers to the network element topology, not to network bandwidth or network speed.

Based on this, this application improves the traditional inspection method. Compared with the traditional inspection method, the inspection method of this application can proactively predict and perceive links where failures may occur based on the current operating behavior of the system, and can initiate inspection actions in a more timely and accurate manner, thereby improving the awareness capability of network operation and maintenance. At the same time, the inspection method of this application can also abstract away the individual characteristics of network elements, increasing the versatility and portability of the inspection system so that it can be applied to similar network elements more conveniently and quickly.

Referring to Figure 1, in a first aspect, an embodiment of the present application provides an inspection method, including but not limited to step S110, step S120, step S130, and step S140.

Step S110, obtain real-time operating data of the monitoring object;

Step S120, analyze the real-time operating data and obtain the analysis results;

Step S130, perform inspection on the monitoring object according to the analysis results, and obtain inspection results;

Step S140: Modify and continuously monitor the monitoring object based on the inspection results.
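As a non-limiting illustration, one pass through steps S110 to S140 can be sketched as a single function that chains four hooks. The function name `run_cycle` and the four callables are hypothetical and are not defined by this application; how each step is actually carried out depends on the embodiment.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]


def run_cycle(
    acquire: Callable[[], Metrics],
    analyze: Callable[[Metrics], List[str]],
    inspect: Callable[[List[str]], Dict[str, bool]],
    correct: Callable[[Dict[str, bool]], None],
) -> Dict[str, bool]:
    """Run one pass of the loop; a real system would call this repeatedly."""
    data = acquire()              # S110: obtain real-time operating data
    suspects = analyze(data)      # S120: analysis result (suspected items)
    findings = inspect(suspects)  # S130: inspection driven by the analysis
    correct(findings)             # S140: correction; monitoring resumes next pass
    return findings


# Stub hooks, for illustration only.
corrected: List[str] = []
findings = run_cycle(
    acquire=lambda: {"cpu": 0.95, "packet_loss": 0.01},
    analyze=lambda m: [k for k, v in m.items() if v > 0.9],
    inspect=lambda names: {n: True for n in names},
    correct=lambda f: corrected.extend(n for n, bad in f.items() if bad),
)
```

Calling `run_cycle` in a loop, or on a timer, would give the continuous closed-loop behavior; the single pass is shown only for clarity.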

It should be noted that, compared with the traditional inspection method, this application relies on the intelligent operation and maintenance network and proposes an inspection method based on situation prejudgment. The inspection method of this application introduces an automated and intelligent closed-loop architecture. Based on situational awareness and starting from the two dimensions of time and space, it uses information extraction, element analysis, and situation prejudgment to accurately predict the development trend of the target and initiate more targeted inspection operations.

Situational awareness is an environment-based, dynamic, and holistic ability to gain insight into the security risks of monitored objects. Based on security big data, situational awareness acquires, understands, and displays the security factors that cause changes in the network situation in a large-scale network environment, and then supports decisions and actions. From a global perspective, it improves the ability to discover and identify security threats, to understand and analyze them, and to respond to them.

The inspection method according to the present application can obtain the real-time operating data of the monitored object and analyze it, so as to proactively predict and perceive links where failures may occur according to the current operating behavior of the monitored object, that is, to obtain the corresponding analysis results. Then, based on the analysis results, the monitored object is inspected in a timely and accurate manner, and the monitored object is corrected and continuously monitored based on the obtained inspection results. This application separates manual operation and maintenance from the inspection actions and introduces an automated closed-loop structure, realizing the transformation from passive operation and maintenance to active operation and maintenance and ensuring the accuracy and real-time performance of the business.

In one implementation, the monitored objects include but are not limited to systems, services, networks, databases, physical machines, storage, and the like. At a finer granularity, they also include the performance and event indicators of each monitored object, such as network load, business failure rate, business success rate, packet loss rate, alarms, system CPU, memory, and disk IO. According to an embodiment of the present application, the real-time operating data of the monitored object may be the network load at the current time, or the packet loss rate of service transmission at the current time, which is not specifically limited by this application.

In one embodiment, the inspection method of the present application adopts data-adaptive sensing technology, which can obtain real-time operating data of the monitored object, intelligently determine the abnormal points of the current monitored object based on the real-time operating data, quickly locate the abnormal points, actively initiate precise inspections, and perform fault determination and automatic repair based on the inspection results. This solves the problems of poor real-time performance and low accuracy of traditional inspection methods. In addition, the inspection method of this application uses event/scenario awareness technology. When the monitored object is a network event or network scenario, this application can use event/scenario awareness technology to monitor scenario switching or event changes while a single network element is running its business, sense changes in monitoring targets between different network elements, and adjust the inspection mode in a timely manner. This solves the problems of poor scalability and low portability of traditional inspection methods and ensures the universality of the business.

Moreover, manual processing is still an indispensable part of the traditional inspection method: the operator is required to identify and judge the real-time operating data of the monitored object and to initiate inspection operations. In view of this, the inspection method of this application adopts intent-aware technology. It can receive alarm notifications or inspection requests initiated by monitored objects, analyze the intentions of the monitored objects, and proactively initiate the precise inspection operations that the monitored objects require. This achieves autonomous inspection, reduces the manual operations of the traditional inspection method, effectively improves inspection efficiency, and reduces labor costs.

It should be noted that the inspection method of this application is a dynamic and continuous cyclic action. In order to better explain the inspection method, this application disassembles the closed-loop process into a one-way process for explanation. In practice, this application continuously monitors, inspects, and corrects the monitored objects, and the inspection, positioning, and fault determination operations form a relatively complex process. Different monitored objects have different judgment standards, so this application requires a large amount of data analysis as support.

Referring to Figure 2, in the first aspect, embodiments of the present application provide a method for obtaining real-time operating data of a monitoring object, including but not limited to steps S210, S220, and S230.

Step S210, collect several event information of the monitoring object;

Step S220, disassemble the event information to obtain a number of machine events corresponding to the event information, and store the machine events in a preset event library; where the machine events are a collection of several machine languages obtained by disassembling the event information;

Step S230: Continue to monitor the monitoring object and obtain real-time operating data of the monitoring object.

It should be noted that, for the operation of a business system, the focus on the health status of system operation differs under different events/scenarios. In one embodiment, when the scenario or event is switched, the inspection of the real-time operating data of the monitored object should switch to the inspection mode corresponding to the current event/scenario. The inspection method of this application can adjust the inspection mode in time according to scenario switching or event changes while the service of a single network element is running, or by sensing changes in monitoring targets between different network elements, thus solving the problems of poor scalability and low portability of traditional inspection methods.

According to an embodiment of the present application, when the monitored object is point-to-point message transmission, the focus of the system operation health status is the success rate of message delivery; when the monitored object is group-sent or push messages, the focus is whether the system indicators are overloaded; when the monitored object is file messages, the focus is on disk IO, network load, database running status, and the like, because file messages add file read and write operations, occupy considerable network bandwidth, and may even involve database storage.

Referring to Figure 11, Figure 11 is a flow chart of an inspection method in which the monitored object is event/scenario data provided by an embodiment of the present application. It can be understood that when the inspection method of the present application is applied to events or scenarios, a large amount of event information of the monitored object should be collected before the real-time operating data of the monitored object is obtained. To make this large amount of event information easier to read, the collected event information needs to be disassembled into collections of machine language with higher readability, that is, machine events, which correspond one-to-one to the event information. After the machine events are obtained, they need to be stored in the preset event library.

It should be noted that the preset event library, which serves as a benchmark, is the cornerstone of the inspection method of this application and the basis for its strong fuzzy identification capability. Storing machine events in the preset event library is a long-term, continuous process of accumulation in which a large number of different machine events are stored in the library. Each time a machine event is stored, the sample types of the event library are expanded and enhanced. Continuously updating the event library enriches the samples stored in it, making it easier to query the event library based on real-time operating data.
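As a non-limiting illustration, steps S210 and S220 can be sketched as follows. This application does not specify how event information is disassembled, so the whitespace tokenization in `disassemble` is purely an assumption, and the names `EventLibrary`, `store`, and `disassemble` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


def disassemble(event_info: str) -> FrozenSet[str]:
    """Assumed disassembly rule: lower-cased whitespace-separated tokens."""
    return frozenset(event_info.lower().split())


@dataclass
class EventLibrary:
    """Preset event library: one machine event per piece of event information."""
    events: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def store(self, name: str, event_info: str) -> FrozenSet[str]:
        # S220: disassemble the event information and keep the machine event.
        machine_event = disassemble(event_info)
        self.events[name] = machine_event
        return machine_event


library = EventLibrary()
library.store("file_message", "Disk write NETWORK upload")
library.store("p2p_message", "single send ack")
```

Each call to `store` enlarges the sample set, which mirrors the accumulation process described above.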

Referring to Figure 3, in the first aspect, embodiments of the present application provide a method for analyzing real-time operating data and obtaining analysis results, including but not limited to step S310 and step S320.

Step S310: Compare the real-time operating data and the preset initial operating data to obtain differentiation indicators;

Step S320: Compare the differentiation index with the preset differentiation index threshold to obtain an analysis result.

Referring to Figure 10, Figure 10 is a flow chart of an inspection method in which the monitored object is system status provided by an embodiment of the present application. It should be noted that the inspection method of the present application can be used not only to monitor system status but also to monitor business operation. Taking the monitoring of system status as an example, before the real-time operating data of the monitored object is obtained, initial operating data needs to be set for the monitored object. In one embodiment, the initial operating data is customized by the operator and can be understood as a benchmark value. The initial operating data can be the operating state parameters of the system under ideal conditions, or the operating state parameters of the system under critical conditions.

According to an embodiment of the present application, after the initial operating data is set, the real-time operating data of the monitored object can be obtained and compared with the initial operating data to obtain the differentiation index. Specifically, the real-time operating data includes CPU occupancy, disk IO, memory space, file handles, and the like, and the differentiation index represents the deviation between the real-time operating data and the initial operating data. After the differentiation index is obtained, it needs to be compared with the preset differentiation index threshold to obtain the analysis result.

In one implementation, the analysis results indicate whether there is an abnormality in the current monitoring object.

According to another embodiment of the present application, if the differential index reflects that the system storage space of the monitoring object is greater than the threshold set by the initial operating data, it is necessary to further check the disk IO of the current monitoring object, the CPU status of each process, and the file handle occupancy status and whether there are oversized files, etc.

It can be understood that comparing the differentiation index with the preset differentiation index threshold to obtain the analysis result includes one of the following: when the differentiation index is greater than the differentiation index threshold, the analysis result is that the monitored object is abnormal; or when the differentiation index is less than the differentiation index threshold, the analysis result is that the monitored object is normal.

It should be noted that the differentiation index reflects the deviation between the real-time operating data and the preset initial operating data. Therefore, when the differentiation index is larger, it means that the real-time operating data deviates from the initial operating data to a greater extent. When the differentiation index is greater than the differentiation index threshold, it indicates that the monitoring object is abnormal at this time; when the differentiation index is less than the differentiation index threshold, it indicates that the monitoring object is operating normally at this time.
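As a non-limiting illustration, steps S310 and S320 can be sketched as follows. This application does not give a formula for the differentiation index, so the relative deviation used here is an assumption, as are the function names and the choice to treat an index equal to its threshold as normal.

```python
from typing import Dict, List


def differentiation_index(
    realtime: Dict[str, float], initial: Dict[str, float]
) -> Dict[str, float]:
    """S310: assumed definition, relative deviation from the initial value."""
    index: Dict[str, float] = {}
    for name, base in initial.items():
        if name in realtime and base != 0:
            index[name] = abs(realtime[name] - base) / abs(base)
    return index


def abnormal_metrics(
    index: Dict[str, float], thresholds: Dict[str, float]
) -> List[str]:
    """Names of metrics whose index is greater than its threshold."""
    return [n for n, v in index.items() if n in thresholds and v > thresholds[n]]


def analyze(index: Dict[str, float], thresholds: Dict[str, float]) -> str:
    """S320: 'abnormal' if any index exceeds its threshold, else 'normal'."""
    return "abnormal" if abnormal_metrics(index, thresholds) else "normal"


initial = {"cpu": 0.5, "storage": 0.4}    # operator-defined benchmark values
realtime = {"cpu": 0.55, "storage": 0.8}  # current sample
index = differentiation_index(realtime, initial)
result = analyze(index, {"cpu": 0.2, "storage": 0.5})
```

In this example only the storage deviation exceeds its threshold, so the analysis result is that the monitored object is abnormal.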

According to an embodiment of the present application, when the differentiation index reflects that the system storage space of the monitored object is greater than the threshold set by the initial operating data, it indicates that the system storage space of the monitored object has exceeded the limit up to which the system can accommodate files.

Referring to Figure 4, in the first aspect, embodiments of the present application provide another method of analyzing real-time operating data and obtaining analysis results, including but not limited to step S410 and step S420.

Step S410, perform parameter extraction on the real-time operating data to obtain the first system operating parameters;

Step S420: Compare the first system operating parameters and the second system operating parameters to obtain analysis results; wherein the second system operating parameters are obtained by parameter extraction from the real-time operating data obtained in the previous period.

Referring to Figure 11, it can be understood that when the monitoring object is an event/scene in a network environment, in order to determine whether the event/scene has changed, it is necessary to first perform parameter extraction on the real-time operating data to extract the first system operating parameters. The first system operating parameters are the operating parameters of the system under the current event/scenario. After obtaining the first system operating parameters, it is necessary to compare the first system operating parameters with the preset second system operating parameters to obtain analysis results. The analysis results reflect whether the current monitoring object has an event/scenario change.

It should be noted that the first system operating parameters of the monitored object differ under different events/scenarios. The first system operating parameters reflect the current status of the event/scenario, and different first system operating parameters correspond to different events/scenarios. When multiple first system operating parameters change, the event/scenario has changed. Therefore, it is necessary to compare the first system operating parameters with the second system operating parameters to determine whether the event/scenario has changed.

In one embodiment, because the inspection method of the present application collects real-time operating data of the monitored object in real time, it can also be understood as collecting real-time operating data of the monitored object at a preset period. The period is customized by the operator and can be, for example, 1 second or 1 millisecond. The second system operating parameters are obtained by extracting parameters from the real-time operating data obtained in the previous period. This application compares the first system operating parameters with the second system operating parameters in real time and, by analyzing whether the parameters have changed, can quickly determine whether the event/scenario has changed within a short period of time.

It can be understood that comparing the first system operating parameters with the second system operating parameters to obtain the analysis result includes one of the following: when the first system operating parameters and the second system operating parameters are different, the analysis result is an event change; or when they are the same, the analysis result is that the event has not changed.

It should be noted that the system indicators differ under different events/scenarios, that is, when the event/scenario changes, the first system operating parameters also change. Therefore, the inspection method of this application can determine whether the event/scenario has changed by monitoring whether the first system operating parameters change. When the first system operating parameters and the second system operating parameters are different, it can be determined that the event/scenario has changed; when they are the same, it can be determined that the event/scenario has not changed.
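As a non-limiting illustration, steps S410 and S420 can be sketched as follows. The function names and the parameter keys are hypothetical. The optional `tolerance` argument is an assumption added for noisy measurements; with its default of 0.0 the comparison reduces to the same/different test described above.

```python
from typing import Dict, Iterable


def extract_parameters(data: Dict[str, float], keys: Iterable[str]) -> Dict[str, float]:
    """S410: keep only the operating parameters of interest."""
    return {k: data[k] for k in keys if k in data}


def event_changed(
    first: Dict[str, float], second: Dict[str, float], tolerance: float = 0.0
) -> bool:
    """S420: True if the current parameters differ from the previous period's."""
    if first.keys() != second.keys():
        return True
    return any(abs(first[k] - second[k]) > tolerance for k in first)


KEYS = ("msg_rate", "disk_io")
previous = extract_parameters({"msg_rate": 100.0, "disk_io": 5.0, "temp": 40.0}, KEYS)
current = extract_parameters({"msg_rate": 100.0, "disk_io": 80.0, "temp": 41.0}, KEYS)
changed = event_changed(current, previous)
```

Here `previous` plays the role of the second system operating parameters and `current` the role of the first system operating parameters.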

In one implementation, different events/scenarios correspond to different inspection modes. When the event/scenario changes, the inspection mode should also be changed accordingly.

Referring to FIG. 5, in the first aspect, embodiments of the present application provide a method of inspecting a monitoring object based on analysis results and obtaining inspection results, including but not limited to step S510 and step S520.

Step S510: When the analysis result is that the monitoring object is abnormal, the designated parameters of the corresponding monitoring object are obtained according to the differentiation index;

Step S520: Perform inspection on the monitoring object according to specified parameters to obtain inspection results.

Referring to Figure 10, it should be noted that when the differentiation index is greater than the differentiation index threshold and the analysis result is that the monitoring object is abnormal, it is necessary to obtain the specified parameters of the monitoring object based on the differentiation index. In one implementation, when the differential index reflects that the system storage space exceeds a preset threshold, it indicates that the system storage space of the monitored object is suspected to be abnormal. It is necessary to more carefully determine which parameters of the system storage space cause the system storage space to be abnormal. At this time, the specified parameters of the monitoring object can be the system disk IO status, the CPU status of each process, or the file handle occupancy status, etc. That is to say, the inspection method of this application accurately locates suspected abnormal nodes in the system through real-time monitoring data.

After obtaining the specified parameters, the monitoring object can be inspected according to the specified parameters, that is, a precise inspection of the monitoring object is quickly initiated based on the suspected abnormal nodes, and the inspection results are obtained, which saves a lot of inspection resources. In one embodiment, the inspection results can reflect whether the currently suspected abnormal node is indeed abnormal, so that the inspection method of the present application can perform corresponding correction operations and monitoring operations on the monitoring object based on the inspection results.
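As a non-limiting illustration, steps S510 and S520 can be sketched as a drill-down table followed by targeted probes. The table `DRILL_DOWN` reuses the storage-space example above; the function names and the probe callables are hypothetical.

```python
from typing import Callable, Dict, List

# Assumed mapping from an abnormal metric to the specified parameters to check.
DRILL_DOWN: Dict[str, List[str]] = {
    "storage_space": ["disk_io", "process_cpu", "file_handles"],
}


def specified_parameters(abnormal: List[str]) -> List[str]:
    """S510: expand each abnormal metric into its specified parameters."""
    params: List[str] = []
    for metric in abnormal:
        for param in DRILL_DOWN.get(metric, [metric]):
            if param not in params:
                params.append(param)
    return params


def inspect(params: List[str], probes: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """S520: run only the probes for the specified parameters.

    Each probe returns True when the parameter is confirmed abnormal.
    """
    return {p: probes[p]() for p in params if p in probes}


probes = {
    "disk_io": lambda: True,       # stub: disk IO confirmed abnormal
    "process_cpu": lambda: False,  # stub: CPU per process looks fine
    "network_load": lambda: True,  # never called for a storage anomaly
}
results = inspect(specified_parameters(["storage_space"]), probes)
```

Because only the probes for the suspected nodes run, the inspection touches a small part of the system, which is the resource saving described above.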

It can be understood that the method further includes: when the analysis result is that the monitored object is normal, adjusting the monitored object according to the differentiation index and a preset baseline value adaptive adjustment algorithm.

It should be noted that when the differentiation index is less than the differentiation index threshold and the analysis result is that the monitored object is normal, the first system operating data of the monitored object is still within the controllable range. To prevent the deviation between the first system operating data and the second system operating data from growing larger and larger and causing the working condition of the monitored object to deteriorate further, the monitored object still needs to be adjusted based on the differentiation index and the preset baseline value adaptive adjustment algorithm.

In one implementation, the baseline adaptive adjustment algorithm described in this application is not unique. It can be a traditional PID algorithm, minimum mean square error, a high-order Kalman filter, etc. In actual application, the optimal choice needs to be made based on the practical usage scenario of the system and the degree of differentiation of the monitored differentiation index. This application does not specifically limit it.
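As one illustration of the options listed above, a textbook discrete PID controller can turn the differentiation index into a correction value. This is a sketch under stated assumptions: the class name `PidBaselineAdjuster`, the gains, and the choice of zero deviation as the setpoint are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass


@dataclass
class PidBaselineAdjuster:
    """Discrete PID controller that drives the differentiation index toward zero."""

    kp: float  # proportional gain (assumed tuning parameter)
    ki: float  # integral gain (assumed tuning parameter)
    kd: float  # derivative gain (assumed tuning parameter)
    integral: float = 0.0
    previous_error: float = 0.0

    def step(self, diff_index: float, dt: float = 1.0) -> float:
        """Return the correction for one sampling period of length dt."""
        error = -diff_index  # setpoint is zero deviation from the baseline
        self.integral += error * dt
        derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

How the returned correction is applied to the monitoring object depends on the deployment and is left open here, as it is in the text.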

Referring to FIG. 6 , in the first aspect, embodiments of the present application provide a method of inspecting a monitoring object based on analysis results and obtaining inspection results, including but not limited to step S610, step S620, and step S630.

Step S610: When the analysis result is an event change, perform big data analysis on the real-time operating data to obtain real-time event information;

Step S620: Query the event database according to the real-time event information to obtain inspection information corresponding to the real-time event information;

Step S630: Perform inspection on the monitored object according to the inspection information to obtain inspection results.

Referring to Figure 11, it should be noted that when the first system operating parameters are different from the second system operating parameters and the analysis result is an event/scenario change, big data analysis needs to be performed on the real-time operating data to obtain real-time event information. In one embodiment, the inspection method of the present application extracts valid event information from real-time operating data through fuzzy recognition, and then determines the current operating events/scenarios, that is, obtains real-time event information.

It should be noted that after the real-time event information is obtained, the event library needs to be queried based on it. The event library has been expanded multiple times and contains a large amount of event sample information, and different events/scenarios correspond to different inspection modes, which are also recorded in the event library. Therefore, the inspection method of this application can search the event library for the event/scenario corresponding to the real-time event information to identify the current event/scenario of the monitoring object, and obtain from the event library the inspection information corresponding to that event/scenario, that is, the inspection mode. Finally, the monitoring object is inspected according to the inspection mode to obtain the inspection results, and corresponding correction operations and monitoring operations are performed on the monitoring object based on the inspection results.
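Steps S610 to S630 can be sketched as a three-stage pipeline. This is a minimal illustration: the function `inspect_on_event_change`, the dictionary-based event library, and the injected `recognize` and `run_inspection` callables are hypothetical stand-ins for the big data analysis and inspection execution, which the application does not specify in detail.

```python
from typing import Any, Callable, Dict, Optional


def inspect_on_event_change(
    real_time_data: Dict[str, Any],
    recognize: Callable[[Dict[str, Any]], str],
    event_library: Dict[str, str],
    run_inspection: Callable[[str], Any],
) -> Optional[Any]:
    """Run the inspection mode that the event library records for the current event."""
    # S610: derive real-time event information from the real-time operating data.
    event = recognize(real_time_data)
    # S620: query the event library for the inspection mode of that event.
    inspection_mode = event_library.get(event)
    if inspection_mode is None:
        # Unknown event: nothing to run (assumed behavior, not stated in the text).
        return None
    # S630: inspect the monitoring object according to the inspection mode.
    return run_inspection(inspection_mode)
```

Passing the recognizer and the inspection runner as arguments keeps the sketch independent of any particular recognition algorithm or execution mechanism.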

It is understandable that big data analysis of real-time operating data is performed to obtain real-time event information, including: fuzzy identification of real-time operating data in combination with the event library, and extraction of real-time event information.

It should be noted that fuzzy recognition is a method of classifying objects to be recognized into corresponding standard libraries given a standard library. Fuzzy recognition methods are divided into direct methods and indirect methods. This application can combine the event library to perform fuzzy identification on real-time operating data to accurately extract real-time event information.

According to an embodiment of the present application, event one has feature A, feature B, feature C, and feature D, and event two has feature A, feature E, feature F, and feature G. When fuzzy recognition is performed on an unknown event, if feature A, feature B, feature C and feature D are identified, the unknown event is directly positioned as event one, which is the direct method; if the unknown event is recognized as having feature A, feature E, feature F and feature G, it is positioned as event two according to the nearest principle, which is the indirect method. In practice, actual events/scenarios are often more complex, and handling them requires more advanced and accurate fuzzy recognition technology.
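The feature-based example above can be sketched with sets. This is one possible reading rather than the application's algorithm: an exact feature match stands in for the direct method, and Jaccard similarity is an assumed choice of distance for the nearest principle. The function name `recognize_event` and the library layout are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def recognize_event(
    features: Iterable[str], event_library: Dict[str, Set[str]]
) -> Tuple[str, str]:
    """Return (event name, method used) for an observed set of features."""
    if not event_library:
        raise ValueError("event library is empty")
    observed = set(features)

    # Direct method: the observed features coincide with a library entry.
    for name, reference in event_library.items():
        if observed == reference:
            return name, "direct"

    # Indirect method: pick the closest entry (Jaccard similarity is an assumption).
    def jaccard(a: Set[str], b: Set[str]) -> float:
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    best = max(event_library, key=lambda n: jaccard(observed, event_library[n]))
    return best, "indirect"
```

With a library holding event one as {A, B, C, D} and event two as {A, E, F, G}, an observation of {A, E, F} shares three of four features with event two and only one with event one, so the nearest match is event two.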

It should be noted that the expected effect achieved by the inspection method of this application also depends on the depth of the event library and the accuracy of the fuzzy recognition algorithm.

It can be understood that correcting and continuously monitoring the monitoring object based on the inspection results includes one of the following: when the inspection result is that there is no fault, continuously monitoring the monitoring object; or when the inspection result is that there is a fault, correcting the monitoring object to obtain a correction result, and performing corresponding operations on the monitoring object based on the correction result.

It should be noted that after the inspection is completed and the inspection results are obtained, corresponding operations need to be performed on the monitoring object based on the inspection results. When the inspection result is that there is no fault, the monitoring object has no abnormality at this time and can be continuously monitored; when the inspection result is that there is a fault, a node of the monitoring object is abnormal, and in order for the monitoring object to work normally afterwards, the monitoring object needs to be corrected to repair the abnormal node. In one implementation, when the inspection method of the present application corrects the abnormal node, a correction result is returned, and corresponding operations need to be performed on the monitoring object based on the correction result. The correction result may be a successful correction or a failed correction, and different correction results correspond to different subsequent processing operations.

It can be understood that corresponding operations are performed on the monitoring object according to the correction result, including one of the following: when the correction result is successful, the monitoring object is continuously monitored; or when the correction result is failure, an alarm signal is sent to the monitoring object.

It should be noted that when the correction result is successful, the abnormal node in the monitoring object has been successfully corrected at this moment. To guard against abnormalities appearing in the subsequent work process, the monitoring object still needs to be continuously monitored. When the correction result is failure, the correction of the abnormal node in the monitoring object has failed, and automatic correction alone cannot repair it. An alarm signal needs to be sent to the monitoring object so that the monitoring object issues a warning, which enables operators to manually correct the abnormal node in the monitoring object in a timely manner.
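The branching described in the preceding paragraphs can be summarized in a small decision function. This is a sketch only: the names `Action` and `handle_inspection_result` are hypothetical, and the correction itself is abstracted as a callable that reports success or failure.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    CONTINUE_MONITORING = "continue_monitoring"
    SEND_ALARM = "send_alarm"


def handle_inspection_result(
    fault_found: bool, correct: Callable[[], bool]
) -> Action:
    """Map an inspection result (and, if needed, a correction result) to the next action."""
    if not fault_found:
        # No fault: keep monitoring the object.
        return Action.CONTINUE_MONITORING
    # Fault: attempt automatic correction and branch on its result.
    if correct():
        return Action.CONTINUE_MONITORING
    # Automatic correction failed: raise an alarm so operators can intervene.
    return Action.SEND_ALARM
```

The correction callable is invoked only when a fault was found, so a healthy object never triggers a repair attempt.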

Referring to Figure 8, Figure 8 is a schematic diagram of a networking structure under a multi-network element topology provided by an embodiment of the present application. It should be noted that the above-mentioned embodiment only covers an event change of a single network element. In 5G service networking, the network element topology is complex, and network element characteristics need to be considered in addition to event characteristics. When the monitoring object changes from a single network element to a topological system containing multiple network elements, the situational awareness of the inspection method of this application needs to be extended from events/scenarios to network element switching. In one implementation, this application can sense and quickly capture the switching of monitoring objects through model and scene adaptation. When network element A switches to network element E, this application can quickly sense the change of network element and adjust the inspection mode in a timely manner, which avoids manual operations by operators and effectively improves the universality and portability of the inspection method.

Referring to Figure 7, in the second aspect, the embodiment of the present application provides an alarm method, including but not limited to step S710, step S720, step S730, and step S740.

Step S710, obtain the alarm signal of the monitored object;

Step S720, perform intent identification on the monitored object based on the alarm signal, and obtain the alarm intent of the monitored object;

Step S730, perform inspection on the monitoring object according to the alarm intention, and obtain inspection results;

Step S740: Perform fault determination and correction on the monitored object based on the inspection results.

It should be noted that any of the inspection methods described in the first aspect of this application actively obtains the target information of the monitoring object, while the alarm method of this application uses intention sensing technology to passively accept alarm notifications or inspection requests initiated by the monitoring object, senses the intention of the monitoring object, and proactively initiates the precise inspection operations required by the monitoring object according to that intention. This automates inspection, reduces the manual operations of traditional inspection methods, effectively improves inspection efficiency, and reduces labor costs.

According to an embodiment of the present application, when the abnormal nodes in the monitoring object are not corrected successfully, the monitoring object actively generates an alarm signal. The inspection method of the present application passively obtains the alarm signal of the monitoring object, performs intent identification on the monitoring object according to the alarm signal, and obtains the alarm intention of the monitoring object. In one implementation, the alarm signals include but are not limited to business success rate alarms, system overload alarms, link abnormality alarms, service abnormality alarms, database abnormality alarms, etc. The alarm signal reflects the alarm intention of the monitoring object, so the alarm intention can be identified by performing intent identification on the alarm signal.

In one embodiment, after the alarm intention of the monitoring object is obtained, the monitoring object needs to be inspected according to the alarm intention to judge again whether the suspected abnormal node of the monitoring object is indeed abnormal, that is, to obtain the inspection result, and fault determination is performed according to the inspection result. If the inspection result is that the suspected abnormal node is indeed abnormal, the abnormal node needs to be corrected again, or manual correction is introduced; if the inspection result is that the suspected abnormal node is not abnormal, the monitoring object is continuously monitored.

According to another embodiment of the present application, when the alarm signal is a business success rate alarm, the inspection method of the present application triggers process and log inspection after receiving the alarm signal to further obtain the cause of failure; when the alarm signal is a database abnormality alarm, the inspection method of this application checks the current operating status of the database, storage capacity, disk IO, read and write delay and other information after receiving the alarm signal; when the alarm signal is a link abnormality alarm, the inspection method of this application checks the operating status and network connection of the relevant network elements after receiving the alarm signal.
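The three alarm examples above amount to a dispatch table from alarm type to inspection items. The sketch below restates them as data; the key strings, the item names, and the function `plan_inspection` are hypothetical labels, and returning an empty plan for an unlisted alarm type is an assumption rather than behavior described in the application.

```python
from typing import Dict, List

# Inspection items per alarm type, following the three examples in the text.
INSPECTIONS_BY_ALARM: Dict[str, List[str]] = {
    "business_success_rate": ["process", "log"],
    "database_abnormality": [
        "database_operating_status",
        "storage_capacity",
        "disk_io",
        "read_write_delay",
    ],
    "link_abnormality": ["network_element_operating_status", "network_connection"],
}


def plan_inspection(alarm_type: str) -> List[str]:
    """Return the inspection items implied by an alarm type."""
    # A copy is returned so callers cannot mutate the shared table.
    return list(INSPECTIONS_BY_ALARM.get(alarm_type, []))
```

Other alarm types named in the text, such as system overload or service abnormality, would need their own entries, which the application does not enumerate.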

It should be noted that the monitoring object can also actively send intention signals. The intention signals include designated inspection operations, that is, actively requesting certain types of inspection actions, rather than performing inspections only after receiving alarm signals.

In the third aspect, referring to FIG. 9, which is a schematic structural diagram of an inspection system provided by an embodiment of the present application, the embodiment of the present application provides an inspection system, including an intelligent control center, a measurement unit, a control unit, and an execution mechanism. The intelligent control center is configured to execute the inspection method of the first aspect of the present application and the alarm method of the second aspect of the present application, for example, to perform the above-described method steps S110 to S140 in Figure 1, method steps S210 to S230 in Figure 2, method steps S310 to S320 in Figure 3, method steps S410 to S420 in FIG. 4, method steps S510 to S520 in FIG. 5, method steps S610 to S630 in FIG. 6, and method steps S710 to S740 in FIG. 7. The measurement unit is communicatively connected to the intelligent control center and the monitoring object respectively, and is configured to obtain real-time operating data of the monitoring object and send the real-time operating data to the intelligent control center. The control unit is communicatively connected to the intelligent control center and is configured to receive instructions sent by the intelligent control center, where the instructions include inspection instructions and correction instructions. The execution mechanism is communicatively connected to the control unit and the monitoring object respectively, and the control unit is also configured to control the execution mechanism to perform corresponding operations on the monitoring object according to the instructions.

Referring to Figure 9, it should be noted that the basic idea of the inspection method of this application is as follows: the inspection system collects the operating data of the monitoring object in real time, embeds it in a closed-loop control system, intelligently drives decision-making, initiates precise inspections, and removes manual operations from the inspection actions, achieving zero contact and zero waiting, with only the necessary repair operations provided based on the fault results. In one embodiment, the inspection system includes an intelligent control center, a measurement unit, a control unit and an execution mechanism. The intelligent control center is configured to collect the performance indicators of the monitoring object, centrally manage full-stack operation and maintenance data, conduct intelligent analysis and prediction of the data, perceive potential fault points, and provide correction solutions, that is, the intelligent control center is configured to perform the inspection method or the alarm method. The measurement unit is communicatively connected to the intelligent control center and the monitoring object respectively, and is configured to obtain the indicator parameters of the monitoring object, that is, the real-time operating data, and send the real-time operating data to the intelligent control center. The control unit is communicatively connected to the intelligent control center and is configured to receive instructions sent by the intelligent control center. The intelligent control center can accurately determine the inspection type required for the monitoring object and send the inspection instruction to the control unit; it can also start the system self-healing function after identifying the fault type of the monitoring object based on the inspection results, and send the correction instruction to the control unit. The execution mechanism is communicatively connected to the control unit and the monitoring object respectively. The control unit is configured to perform inspection operations on the monitoring object through the execution mechanism according to the inspection instruction, and is also configured to perform correction operations on the monitoring object through the execution mechanism according to the correction instruction.
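The division of roles among the four components can be sketched as follows. This is a structural illustration only: the class names, the dictionary-shaped instruction, and the injected `measure`, `decide`, and `actuator` callables are hypothetical, and real units would communicate over a network rather than through direct calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Instruction = Dict[str, str]  # assumed shape, e.g. {"kind": "inspect", "target": "disk_io"}


@dataclass
class ControlUnit:
    """Receives instructions and drives the execution mechanism (the actuator callable)."""

    actuator: Callable[[Instruction], str]

    def receive(self, instruction: Instruction) -> str:
        # Only inspection and correction instructions are accepted, as in the text.
        if instruction["kind"] not in ("inspect", "correct"):
            raise ValueError("unsupported instruction kind: " + instruction["kind"])
        return self.actuator(instruction)


@dataclass
class IntelligentControlCenter:
    """Pulls data from the measurement unit and issues instructions to the control unit."""

    measure: Callable[[], Dict[str, float]]
    decide: Callable[[Dict[str, float]], Optional[Instruction]]
    control_unit: ControlUnit

    def cycle(self) -> Optional[str]:
        data = self.measure()             # measurement unit reports real-time data
        instruction = self.decide(data)   # analysis yields an instruction, or None
        if instruction is None:
            return None                   # nothing to do: keep monitoring
        return self.control_unit.receive(instruction)
```

One call to `cycle` corresponds to a single pass around the closed loop; a deployment would repeat it on a schedule or on each new measurement.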

Referring to Figure 10, according to an embodiment of the present application, the intelligent control center first presets the initial operation data before obtaining the real-time operation data of the monitoring object, then obtains the differentiation index based on the real-time operation data and the initial operation data of the monitoring object, and compares the differentiation index with the preset differentiation index threshold. When the analysis result is that the monitoring object is normal, the monitoring object needs to be adjusted according to the differentiation index and the preset baseline adaptive adjustment algorithm. When the analysis result is that the monitoring object is abnormal, the intelligent control center actively sends a precise inspection instruction to the control unit, specifying the parameters or status of the monitoring object to be inspected. The control unit controls the execution mechanism to initiate an inspection operation on the monitoring object according to the precise inspection instruction. When the inspection operation is completed, the monitoring object returns the inspection results to the intelligent control center, and the intelligent control center evaluates them. If the inspection result is that there is no fault, the intelligent control center continues to monitor the monitoring object. If the inspection result is that there is a fault, the intelligent control center sends a correction instruction to the control unit, so that the control unit controls the execution mechanism to correct the monitoring object according to the correction instruction. When the correction operation is completed, the monitoring object returns the correction result to the intelligent control center. If the correction result is successful, the monitoring object continues to be monitored. If the correction result is failure, the intelligent control center initiates an alarm to bring in manual intervention.
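The first stage of this flow, computing a differentiation index and comparing it with a threshold, can be sketched as below. The application does not give a formula, so the relative deviation from the initial value is an assumption, as are the function names and the use of the largest per-indicator deviation as the overall index. A value exactly equal to the threshold is treated as normal here, a case the text leaves unspecified.

```python
from typing import Dict


def differentiation_index(
    real_time: Dict[str, float], initial: Dict[str, float]
) -> Dict[str, float]:
    """Relative deviation of each indicator from its preset initial value (assumed metric)."""
    index: Dict[str, float] = {}
    for name, baseline in initial.items():
        deviation = abs(real_time[name] - baseline)
        # Fall back to the absolute deviation when the baseline is zero.
        index[name] = deviation / abs(baseline) if baseline != 0 else deviation
    return index


def analyze(
    real_time: Dict[str, float], initial: Dict[str, float], threshold: float
) -> str:
    """Return "abnormal" if the largest deviation exceeds the threshold, else "normal"."""
    index = differentiation_index(real_time, initial)
    worst = max(index.values(), default=0.0)
    return "abnormal" if worst > threshold else "normal"
```

For example, a CPU reading of 90 against an initial value of 50 gives a relative deviation of 0.8, which exceeds a threshold of 0.5 and would trigger the precise inspection branch.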

Referring to Figure 11, according to another embodiment of the present application, before obtaining the real-time operation data of the monitoring object, the intelligent control center needs to collect a large amount of event information of the monitoring object and store the event information in the event library. It then performs parameter extraction on the real-time operation data of the monitoring object to obtain the first system operating parameters, and judges based on the first system operating parameters whether the event/scene has changed. If the event/scene has not changed, the intelligent control center continues to monitor the monitoring object. If the event/scene has changed, the intelligent control center performs big data analysis on the real-time operating data to obtain real-time event information, and queries the event library based on the real-time event information to obtain the inspection information corresponding to the real-time event information. After obtaining the inspection information, the intelligent control center converts it into inspection instructions and sends the inspection instructions to the control unit, so that the control unit controls the execution mechanism to inspect the monitoring object according to the inspection instructions. After the execution mechanism completes the inspection operation, it returns the inspection results to the intelligent control center.
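The change check in this flow compares the parameters extracted in the current period with those from the previous period. A minimal sketch follows; the class name `EventChangeDetector` and the injected `extract` callable are hypothetical, and treating the very first sample as "no change" is an assumption.

```python
from typing import Any, Callable, Dict, Optional


class EventChangeDetector:
    """Flags an event/scene change when extracted parameters differ between periods."""

    def __init__(self, extract: Callable[[Dict[str, Any]], Dict[str, Any]]) -> None:
        self._extract = extract
        self._previous: Optional[Dict[str, Any]] = None  # second system operating parameters

    def update(self, real_time_data: Dict[str, Any]) -> bool:
        """Return True if the current parameters differ from the previous period's."""
        current = self._extract(real_time_data)  # first system operating parameters
        changed = self._previous is not None and current != self._previous
        self._previous = current
        return changed
```

Because the detector keeps only the previous period's parameters, each call to `update` both answers the change question and rolls the comparison window forward.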

In one embodiment, after completing the inspection operation on the monitoring object, the intelligent control center will collect event/scenario information again and continue to inspect the monitoring object.

Referring to Figure 12, Figure 12 is another main flow chart of an alarm method provided by an embodiment of the present application. According to another embodiment of the present application, the monitoring object actively initiates an alarm and sends the alarm signal to the intelligent control center, so that the intelligent control center performs intent identification on the monitoring object based on the alarm signal, obtains the alarm intention of the monitoring object, and initiates a precise inspection based on the alarm intention, that is, sends an inspection instruction to the control unit. After receiving the inspection instruction, the control unit controls the execution mechanism to perform inspection operations on the monitoring object. When the inspection operation is completed, the monitoring object sends the inspection results to the intelligent control center, and the intelligent control center then performs fault determination and correction on the monitoring object based on the inspection results.

Referring to Figure 8, it can be understood that the intelligent control center includes a centralized event management module, an intelligent engine module, a big data processing module, an impact analysis module, and an event library module. The centralized event management module is configured to perform event sensing and to manage various parameters in the event library module; the intelligent engine module is configured to collect the indicator performance of the monitoring object and perform intelligent analysis and prediction; the big data processing module is configured to extract real-time operating data, and to adjust the monitoring object based on the real-time operating data and the preset baseline adaptive adjustment algorithm; the impact analysis module is configured to predict the impact of faults on the monitoring object; and the centralized event management module, intelligent engine module, big data processing module, impact analysis module and event library module communicate with each other and work together to provide situational awareness.

According to an embodiment of this application, the intelligent control center, as the core component of the precise inspection system, consists of a centralized event management module, an intelligent engine module, a big data processing module, an impact analysis module and an event library module. These modules work together, providing complete situational awareness capabilities.

In one embodiment, the centralized event management module is configured to manage event types, characteristics, parameters, etc. in the event library module. The intelligent engine module is the core control component in the intelligent control center and is configured to collect the indicator performance of the monitoring object and intelligently perceive the driving type, such as changes in real-time operating data or event/scene switching, so as to conduct intelligent analysis and prediction of the real-time operating data. The big data processing module is configured to collect real-time operating data, extract the operating parameters of different events, different objects, and different scenarios, and perform intelligent fuzzy identification of events/scenes; it is also configured to adaptively adjust the inspection baseline value based on the real-time data of the monitoring object. The impact analysis module is configured to predict the impact of a fault based on the inspection results, adjust the inspection strategy or initiate automatic fault repair in a timely manner, and, if necessary, trigger an alarm and introduce manual intervention.

In the fourth aspect, referring to Figure 13, an embodiment of the present application provides an electronic device, including:

at least one memory;

at least one processor;

at least one program;

The at least one program is stored in the memory, and the processor executes the at least one program to implement:

the inspection method of any embodiment of the first aspect of this application and the alarm method of any embodiment of the second aspect of this application.

The processor and memory may be connected via a bus or other means.

As a non-transitory readable storage medium, memory can be used to store non-transitory software instructions and non-transitory instructions. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. It will be appreciated that the memory may include memory located remotely relative to the processor, and that the remote memory may be connected to the processor via a network. Examples of the above-mentioned networks include but are not limited to the Internet, intranets, local area networks, mobile communication networks and combinations thereof.

The processor executes non-transient software instructions, instructions and signals stored in the memory to implement various functional applications and data processing, that is, to implement the inspection method of the first embodiment.

The non-transient software instructions and instructions required to implement the inspection method of the above embodiment are stored in the memory. When executed by the processor, they perform the inspection method of the first aspect embodiment of the present application and the alarm method of the second aspect embodiment of the present application, for example, the above-described method steps S110 to S140 in Figure 1, method steps S210 to S230 in Figure 2, method steps S310 to S320 in Figure 3, method steps S410 to S420 in Figure 4, method steps S510 to S520 in FIG. 5, method steps S610 to S630 in FIG. 6, and method steps S710 to S740 in FIG. 7.

In the fifth aspect, embodiments of the present application provide a computer-readable storage medium. The computer-readable storage medium stores computer-executable signals. The computer-executable signals are used to execute:

the inspection method of any embodiment of the first aspect and the alarm method of any embodiment of the second aspect of this application.

For example, the above-described method steps S110 to S140 in FIG. 1, method steps S210 to S230 in FIG. 2, method steps S310 to S320 in FIG. 3, method steps S410 to S420 in FIG. 4, method steps S510 to S520 in FIG. 5, method steps S610 to S630 in FIG. 6, and method steps S710 to S740 in FIG. 7 are performed.

The device embodiments described above are only illustrative. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Embodiments of the present application include: obtaining real-time operation data of a monitoring object; analyzing the real-time operation data to obtain analysis results; performing an inspection on the monitoring object according to the analysis results to obtain inspection results; and correcting and continuously monitoring the monitoring object according to the inspection results. Based on this, compared with traditional inspection methods, this application can analyze the real-time operating data of the current monitoring object to proactively predict and perceive possible failure links based on the operating behavior of the current monitoring object, that is, obtain the analysis results. Then, based on the analysis results, the monitoring object is inspected in a timely and accurate manner, and is corrected and continuously monitored based on the obtained inspection results. This shift from passive operation and maintenance to active operation and maintenance ensures the accuracy and real-time performance of the business.

Through the description of the above embodiments, those of ordinary skill in the art can understand that all or some steps and systems in the methods disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable signals, data structures, modules of instructions, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disk (DVD) or other optical disk storage, magnetic cassettes, tapes, disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Additionally, it is known to those of ordinary skill in the art that communication media typically embodies a computer-readable signal, data structure, instruction module, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

The embodiments of the present application have been described in detail above in conjunction with the accompanying drawings. However, the present application is not limited to the above embodiments. Within the scope of knowledge possessed by those of ordinary skill in the technical field, various changes can be made without departing from the purpose of the present application.

Claims

An inspection method including:

Obtain real-time operating data of monitoring objects;

Analyze the real-time operating data and obtain analysis results;

Conduct inspections on the monitoring objects according to the analysis results to obtain inspection results;

The monitoring objects are corrected and continuously monitored based on the inspection results.
The inspection method according to claim 1, wherein the real-time operation data includes event information, and the obtaining the real-time operation data of the monitoring object includes:

Collect several event information of the monitoring object;

The event information is disassembled to obtain a number of machine events corresponding to the event information, and the machine events are stored in a preset event library; wherein each machine event is a collection of several machine languages obtained by disassembling the event information;

Continuously monitor the monitoring object and obtain the real-time operating data of the monitoring object.
The inspection method according to claim 1, wherein the analysis of the real-time operation data to obtain the analysis results includes:

Compare the real-time operating data with the preset initial operating data to obtain differentiation indicators;

Compare the differentiation index with a preset differentiation index threshold to obtain the analysis result.
The inspection method according to claim 3, wherein the analysis results obtained by comparing the differentiation index with a preset differentiation index threshold include one of the following:

When the differentiation index is greater than the differentiation index threshold, the analysis result is that the monitoring object is abnormal; or

When the differentiation index is less than the differentiation index threshold, the analysis result is that the monitoring object is normal.
The inspection method according to claim 2, wherein analyzing the real-time operating data to obtain the analysis result further comprises:

performing parameter extraction on the real-time operating data to obtain first system operating parameters; and

comparing the first system operating parameters with second system operating parameters to obtain the analysis result, wherein the second system operating parameters are obtained by parameter extraction from the real-time operating data obtained in the previous period.
The inspection method according to claim 5, wherein comparing the first system operating parameters with the second system operating parameters to obtain the analysis result comprises one of the following:

When the first system operating parameters are different from the second system operating parameters, the analysis result is an event change; or

When the first system operating parameters are the same as the second system operating parameters, the analysis result is that the event has not changed.
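
As a non-limiting illustration, the period-over-period comparison recited above might look like the sketch below. The parameter names in `TRACKED_KEYS` and the functions `extract_parameters` and `compare_periods` are hypothetical; the claims do not state which parameters are extracted.

```python
from typing import Any, Dict, Mapping

# Hypothetical parameter names; the claims do not list the extracted parameters.
TRACKED_KEYS = ("software_version", "config_hash", "link_state")


def extract_parameters(data: Mapping[str, Any]) -> Dict[str, Any]:
    """Keep only the tracked system operating parameters (missing ones become None)."""
    return {key: data.get(key) for key in TRACKED_KEYS}


def compare_periods(current_data: Mapping[str, Any],
                    previous_data: Mapping[str, Any]) -> str:
    """Compare the parameters of the current period with those of the previous period."""
    first = extract_parameters(current_data)    # first system operating parameters
    second = extract_parameters(previous_data)  # second system operating parameters
    return "event change" if first != second else "event unchanged"


if __name__ == "__main__":
    previous = {"software_version": "1.0", "config_hash": "abc", "link_state": "up"}
    current = {"software_version": "1.0", "config_hash": "abc", "link_state": "down"}
    print(compare_periods(current, previous))
    print(compare_periods(previous, previous))
```

Fields outside `TRACKED_KEYS` are ignored, so noise such as timestamps does not register as an event change in this sketch.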
The inspection method according to claim 4, wherein performing the inspection on the monitoring object according to the analysis result to obtain the inspection result comprises:

when the analysis result is that the monitoring object is abnormal, obtaining corresponding specified parameters of the monitoring object according to the differentiation index; and

inspecting the monitoring object according to the specified parameters to obtain the inspection result.
The inspection method according to claim 4, further comprising:

when the analysis result is that the monitoring object is normal, adjusting the monitoring object according to the differentiation index and a preset reference-value adaptive adjustment algorithm.
The inspection method according to claim 6, wherein performing the inspection on the monitoring object according to the analysis result to obtain the inspection result further comprises:

when the analysis result is an event change, performing big data analysis on the real-time operating data to obtain real-time event information;

querying the event library according to the real-time event information to obtain inspection information corresponding to the real-time event information; and

inspecting the monitoring object according to the inspection information to obtain the inspection result.
The inspection method according to claim 9, wherein performing big data analysis on the real-time operating data to obtain the real-time event information comprises:

performing fuzzy identification on the real-time operating data in combination with the event library to extract the real-time event information.
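
As a non-limiting illustration, fuzzy identification against an event library followed by a lookup of the corresponding inspection information could be sketched as below. The claims do not specify the matching technique; this sketch substitutes simple string similarity from Python's standard `difflib` module. The contents of `EVENT_LIBRARY`, the function names, and the 0.6 similarity cutoff are all assumptions.

```python
import difflib
from typing import Mapping, Optional

# Hypothetical event library: event name -> inspection information.
EVENT_LIBRARY = {
    "link down": "check port status",
    "cpu overload": "check process load",
    "disk full": "check disk usage",
}


def identify_event(raw_text: str,
                   library: Mapping[str, str],
                   cutoff: float = 0.6) -> Optional[str]:
    """Return the library event most similar to raw_text, or None if nothing is close enough."""
    matches = difflib.get_close_matches(raw_text.lower(), list(library), n=1, cutoff=cutoff)
    return matches[0] if matches else None


def lookup_inspection(raw_text: str, library: Mapping[str, str]) -> Optional[str]:
    """Map raw real-time text to inspection information via fuzzy event identification."""
    event = identify_event(raw_text, library)
    return library[event] if event is not None else None


if __name__ == "__main__":
    print(lookup_inspection("CPU overlod", EVENT_LIBRARY))  # misspelled input still matches
    print(lookup_inspection("xyz", EVENT_LIBRARY))          # no sufficiently similar event
```

A higher cutoff reduces false matches at the cost of missing noisier inputs.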
The inspection method according to claim 6, further comprising:

When the analysis result is that the event has not changed, the monitoring object is continuously monitored.
The inspection method according to claim 1, wherein correcting and continuously monitoring the monitoring object according to the inspection result comprises one of the following:

When the inspection result is that there is no fault, continue to monitor the monitoring object; or

When the inspection result indicates that there is a fault, the monitoring object is corrected to obtain a correction result, and corresponding operations are performed on the monitoring object based on the correction result.
The inspection method according to claim 12, wherein performing the corresponding operations on the monitoring object based on the correction result comprises one of the following:

When the correction result is successful, continue monitoring the monitoring object; or

When the correction result is failure, an alarm signal is sent to the monitoring object.
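
As a non-limiting illustration, the branching recited above (no fault, fault corrected, correction failed) can be sketched as a small decision function. The callables `correct` and `send_alarm` and the returned labels are hypothetical stand-ins for the actual correction and alarm operations, which the claims do not detail.

```python
from typing import Callable


def handle_inspection_result(has_fault: bool,
                             correct: Callable[[], bool],
                             send_alarm: Callable[[], None]) -> str:
    """Decide the next action from the inspection result.

    correct() is assumed to return True on successful correction, False on failure.
    """
    if not has_fault:
        # No fault found: keep monitoring.
        return "continue_monitoring"
    if correct():
        # Fault found and corrected: resume monitoring.
        return "continue_monitoring"
    # Correction failed: raise an alarm.
    send_alarm()
    return "alarm_sent"


if __name__ == "__main__":
    alarms = []
    outcome = handle_inspection_result(
        has_fault=True,
        correct=lambda: False,                      # simulate a failed correction
        send_alarm=lambda: alarms.append("alarm"),  # record the alarm instead of sending it
    )
    print(outcome, alarms)
```

Passing the operations in as callables keeps the decision logic independent of how correction and alarming are actually carried out.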
An alarm method, comprising:

obtaining an alarm signal of a monitoring object;

performing intention identification on the monitoring object according to the alarm signal to obtain an alarm intention of the monitoring object;

performing an inspection on the monitoring object according to the alarm intention to obtain an inspection result; and

performing fault determination and correction on the monitoring object based on the inspection result.
An inspection system, comprising:

an intelligent control center configured to perform the inspection method according to any one of claims 1 to 13 and/or the alarm method according to claim 14;

a measuring unit communicatively connected to the intelligent control center and to the monitoring object respectively, the measuring unit being configured to obtain the real-time operating data of the monitoring object and send the real-time operating data to the intelligent control center;

a control unit communicatively connected to the intelligent control center and configured to receive instructions sent by the intelligent control center, wherein the instructions include inspection instructions and correction instructions; and

an execution mechanism communicatively connected to the control unit and to the monitoring object respectively, wherein the control unit is further configured to control the execution mechanism to perform corresponding operations on the monitoring object according to the instructions.
The inspection system according to claim 15, wherein the intelligent control center includes a centralized event management module, an intelligent engine module, a big data processing module, an impact analysis module and an event library module; wherein the centralized event management module is configured to perform event awareness and to manage various parameters in the event library module; the intelligent engine module is configured to collect the indicator performance of the monitoring object and to perform intelligent analysis and prediction on the indicator performance; the big data processing module is configured to extract the real-time operating data and to adjust the monitoring object according to the real-time operating data and a preset reference-value adaptive adjustment algorithm; and the impact analysis module is configured to predict the impact of faults on the monitoring object;

and wherein the centralized event management module, the intelligent engine module, the big data processing module, the impact analysis module and the event library module communicate with each other and work together to provide situation awareness capabilities.
An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the inspection method according to any one of claims 1 to 13 and/or the alarm method according to claim 14 is implemented.
A computer-readable storage medium storing a computer-executable program, the computer-executable program being used to cause a computer to execute the inspection method according to any one of claims 1 to 13 and/or the alarm method according to claim 14.