CN112671557A - Situation awareness based fault monitoring method and system - Google Patents

Info

Publication number
CN112671557A
Authority
CN
China
Prior art keywords
event
scene
data
access
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011417081.7A
Other languages
Chinese (zh)
Inventor
李攻科
姜晓辉
顾建国
王嘉佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Eastcom Software Technology Co ltd
Original Assignee
Hangzhou Eastcom Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-07
Filing date
2020-12-07
Publication date
2021-04-16
Application filed by Hangzhou Eastcom Software Technology Co ltd
Priority to CN202011417081.7A
Publication of CN112671557A
Legal status: Pending

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The disclosure relates to a situation awareness based fault monitoring method and system. The method of the embodiment comprises the following steps: accessing a scene event, wherein the scene event comprises an event fault type; reading a delimiting rule according to the fault type of the scene event, analyzing and calculating scene data, and outputting a delimiting result for the scene event. The scene data is multidimensional data, the delimiting rule is configured based on the data, the network elements and the fault types, and the delimiting result can be used for evaluating and predicting the network situation. Compared with the prior art, the embodiment of the invention offers shorter processing time, timely data feedback, centralized processing means and intelligent fault delimiting and positioning, and makes quick fault response and fault duplication more convenient.

Description

Situation awareness based fault monitoring method and system
Technical Field
The embodiment of the invention relates to the technical field of network management, in particular to a situation awareness based fault monitoring method and system.
Background
With the rapid development of network services of telecom operators, how to quickly and effectively process faults in order to guarantee the network quality and improve the experience of users on the network becomes a problem to be solved urgently in the system construction of the telecom operators.
Traditional fault handling in such systems has three main disadvantages:
First, many network support systems have been built, but the handling means are scattered across them, so when a fault occurs, operation and maintenance personnel often need to consult several systems to locate it.
Second, the topology established in the existing support systems suffers from inaccurate data, untimely updates, no association across specialties and no association with other network data; a topology that reflects the current network condition in a timely manner needs to be built to support the daily work of network operation and maintenance.
Third, fault delimiting and positioning currently relies on professionals manually analyzing multidimensional data such as alarms, resources, topology and performance to reach a delimiting conclusion, which takes too long; the automatic generation of delimiting conclusions is therefore urgently needed.
Disclosure of Invention
The invention aims to solve the problems that existing fault handling often takes too long, that the handling means are scattered, and that fault delimiting and positioning is entirely manual.
In order to achieve the above object, embodiments of the present invention provide a method and a system for monitoring a fault based on situational awareness.
According to a first aspect of the embodiments of the present invention, a situation awareness-based fault monitoring method is provided, including: accessing a scene event, wherein the scene event comprises an event fault type; reading a delimitation rule through a scene event fault type, analyzing and calculating scene data, and outputting a scene event delimitation result; the scene data is multidimensional data, the delimiting rule is configured based on the data, the network elements and the fault types, and the delimiting result can be used for evaluating and predicting the network situation.
In one embodiment, the scene delimiting rules include filtering rules and delimiting conclusion rules, wherein the filtering rules include event topology rules and multidimensional data rules, and the filtering rules are used for filtering and screening the network element range related to the access scene event, the topology relationship among the network elements and the multidimensional data thereof; and the delimitation conclusion rule is used for making an integral delimitation conclusion on the access scene event based on the topology and the data filtered by the filtering rule.
In one embodiment, the data types of the scene data include: alarm data, engineering data, operation logs, complaint amount data, network delivery aggregation data and performance indexes.
In one embodiment, the manner of accessing the scene event includes event-triggered access, which comprises self-triggered access and external-system-triggered access; self-triggered access accesses scene events according to the CMNET, EPC, IP bearer network and VoLTE voice scenes; external-system-triggered access accesses scene events by listening to middleware messages.
In one embodiment, the manner of accessing the scene event further includes event access; specifically, event access is the access of scene events obtained by monitoring system operation and maintenance data.
In one embodiment, the method includes collecting data both in real time and at timed intervals.
In one embodiment, the data for event-triggered access is collected in real time, and the data for event access is collected at timed intervals.
In one embodiment, the method further comprises providing a REST interface to output the scene event delimiting results for sharing.
In one embodiment, the method further comprises building, by the component configuration, a failure scenario corresponding to the access scenario event.
In one embodiment, the categories of component configuration include scene topology configuration, preliminary conclusion configuration and event access configuration. Scene topology configuration is used for drawing a scene topology according to the network hierarchy and structure within each specialty, and supports topology network element configuration, link configuration and icon configuration; preliminary conclusion configuration configures a conclusion model, which is filled according to the multidimensional data results to output a preliminary delimiting result; event access configuration is used for accessing different types of events by configuring event sources and event types.
According to a second aspect of the embodiments of the present invention, there is provided a situation awareness based fault monitoring system for performing the method according to the first aspect of the embodiments of the present invention, the system including: a delimitation unit, used for configuring a scene delimiting rule based on the data, the network elements and the fault types of events; an acquisition unit, used for accessing a scene event, wherein the scene event comprises an event fault type; and an analysis unit, used for reading the delimiting rule according to the fault type of the scene event, analyzing and calculating scene data and outputting a scene event delimiting result; the scene data is multidimensional data, and the delimiting result can be used for evaluating and predicting the network situation.
According to the embodiment of the invention, a multidimensional data rule model is configured so that events are matched with rule configurations: the corresponding delimiting rule is found through the fault type and the corresponding multidimensional data is analyzed, so that the preliminary cause and the possible causes of the fault are located quickly and intelligently and the fault processing time is shortened. In addition, the embodiment of the invention opens a REST data service interface for the preliminary delimiting result so that event conclusions can be shared, and operation and maintenance knowledge is accumulated through the accumulation of delimiting rules.
Drawings
Fig. 1 is a flowchart of the situation awareness based fault monitoring method of the present invention;
Fig. 2 is a diagram of the situation awareness based fault monitoring system of the present invention.
Detailed Description
The invention relates to a fault monitoring method and system designed on the basis of situation awareness. The underlying principle of situation awareness is as follows:
Network situation awareness comprises three parts: situation perception, situation understanding and situation assessment.
Situation perception is achieved through multi-level, multi-dimensional data collection; no single state or mode alone can be called situation awareness. The collected multidimensional data comprises six types: alarm data (alarms generated by the equipment (network elements)); engineering data (engineering cutover operation information of the equipment (network elements), used to judge whether the fault is related to an engineering operation); operation logs (operations and commands executed on the equipment); complaint amount data (the trend of complaints of each complaint type over a period of time); network delivery aggregation data (the degree to which user complaints are concentrated at a given terminal, county, TAC and the like); and performance indexes (trends of the various performance indexes on the equipment). The process of data collection is in fact the situation perception process.
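For illustration only, the six data categories listed above could be modelled as an enumeration with a common record type. The names `DataKind`, `DataRecord` and `group_by_kind`, and the record fields, are assumptions introduced for this sketch and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class DataKind(Enum):
    """The six categories of multidimensional data named in the description."""
    ALARM = "alarm"
    ENGINEERING = "engineering"
    OPERATION_LOG = "operation_log"
    COMPLAINT_AMOUNT = "complaint_amount"
    NETWORK_DELIVERY_AGGREGATION = "network_delivery_aggregation"
    PERFORMANCE_INDEX = "performance_index"


@dataclass
class DataRecord:
    """One collected item; the field layout is an assumption of this sketch."""
    kind: DataKind
    network_element: str
    timestamp: datetime
    payload: dict = field(default_factory=dict)


def group_by_kind(records):
    """Bucket records per category so that every dimension is always present."""
    grouped = {kind: [] for kind in DataKind}
    for record in records:
        grouped[record.kind].append(record)
    return grouped
```

Keeping an empty bucket for every category makes a missing dimension visible to later rule calculation instead of silently absent.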
After the multidimensional data is acquired, it must be processed for subsequent operations. To ensure the accuracy and comprehensiveness of the situation awareness result, the integrity of the acquired data should be preserved as far as possible, so the original multidimensional data needs to be analyzed. Because the volume of data to be processed is large, complex correlation filtering would lengthen the processing time and degrade the real-time performance of the system.
To solve this problem, that is, to meet the real-time requirements of the system, the concept of situation understanding arises. Situation understanding is the process of applying simple data-level fusion and then analyzing the correlation of the fused data: the original safety data is analyzed, repeated redundant information is removed and information of the same kind is combined. In particular, the calculation of the multidimensional data rules provides the data basis.
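A minimal sketch of such simple data-level fusion is given below: exact duplicates are dropped and the remaining records of the same kind on the same network element are combined into one group. The record keys (`ne`, `kind`, `time`, `detail`) and the function name `fuse_records` are assumptions made for the example; the disclosure does not specify a record format.

```python
from collections import defaultdict


def fuse_records(records):
    """Remove repeated records, then combine records of the same kind per element."""
    seen = set()
    fused = defaultdict(list)
    for rec in records:
        identity = (rec["ne"], rec["kind"], rec["time"], rec["detail"])
        if identity in seen:
            # Repeated redundant information: keep only the first copy.
            continue
        seen.add(identity)
        # Same-kind information on the same network element is grouped together.
        fused[(rec["ne"], rec["kind"])].append(rec)
    return dict(fused)
```

Both steps run in a single pass over the data, which is consistent with the stated aim of avoiding heavier correlation filtering.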
Situation assessment, the last component of the network situation awareness system, is the core of network situation awareness and gives a qualitative and quantitative description of network conditions. A multi-level, multi-dimensional, multi-granular situation assessment framework may be employed, in which each data dimension evaluates the network situation at a different granularity.
With the understanding of the principle of situational awareness, the technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Fig. 1 is a flowchart of a fault monitoring method based on situation awareness according to an embodiment of the present invention.
As shown in Fig. 1, the specific process of situation awareness based fault monitoring is as follows:
In step S102, data types are formulated and scene delimiting rules are configured based on the data, the network elements and the fault types.
Formulating data types means defining the data types that can be accessed, namely alarm data, engineering data, operation logs, complaint amount data, network delivery aggregation data and performance indexes.
Configuring delimiting rules accumulates operation and maintenance experience in the system. Specifically, the delimiting rules comprise filtering rules and delimiting conclusion rules, and the filtering rules comprise event topology rules and event multidimensional data rules. The event topology rule ultimately presents the range of network elements related to the fault event and the topological relationships among them. The multidimensional data rule ultimately presents the alarm data, engineering data, operation logs, complaint amount data, network delivery aggregation data, performance indexes and the like related to the fault event. The delimiting conclusion rule ultimately presents the possible causes of the fault event. Configuring a scene delimiting rule requires defining rule variables and ensuring that the configuration information of the rule is complete. This step is the basis for the preliminary positioning of the event.
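As a hedged illustration of how such rules might be stored, the sketch below keeps delimiting rules in a registry keyed by fault type and rejects a rule whose configuration is incomplete. The dictionary keys, the `RuleRegistry` class and the `validate_rule` helper are hypothetical; the disclosure only states that a rule consists of an event topology rule, a multidimensional data rule and a delimiting conclusion rule, and that its configuration must be complete.

```python
# Parts every delimiting rule must carry in this sketch (assumed key names).
REQUIRED_KEYS = ("fault_type", "topology_rule", "data_rule", "conclusion_rule")


def validate_rule(rule):
    """Return the names of required parts that are missing or empty."""
    return [key for key in REQUIRED_KEYS if not rule.get(key)]


class RuleRegistry:
    """Stores delimiting rules so they can be read back by event fault type."""

    def __init__(self):
        self._rules = {}

    def register(self, rule):
        missing = validate_rule(rule)
        if missing:
            raise ValueError(f"incomplete delimiting rule, missing: {missing}")
        self._rules[rule["fault_type"]] = rule

    def lookup(self, fault_type):
        """Return the rule for a fault type, or None if none is configured."""
        return self._rules.get(fault_type)
```

Rejecting incomplete rules at configuration time means that a later lookup by fault type never returns a rule that cannot be calculated.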
In step S104, a scene event and scene multidimensional data are accessed.
After the configuration of the scene delimiting rule is completed, the scene event and the scene multidimensional data can be accessed.
Scene events are accessed in two ways: event triggering and event access. Event triggering comprises self triggering and external system triggering; event access specifically accesses events obtained by monitoring system operation and maintenance data.
In one embodiment, scene events accessed by event triggering undergo event standardization: the event elements, event occurrence time, event title, early warning id, event fault type, city and the like are converted into a unified event format and stored in a database.
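Event standardization of this kind can be sketched as a mapping from source-specific field names onto one unified record. The unified field names, the alias table and the ISO 8601 time format below are all assumptions made for the example; the disclosure names the attributes but not their representation, and database storage is omitted.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SceneEvent:
    """Unified event format; field names are illustrative assumptions."""
    element: str
    occurred_at: datetime
    title: str
    warning_id: str
    fault_type: str
    city: str


# Hypothetical source field names for each unified field, tried in order.
ALIASES = {
    "element": ("element", "ne_name"),
    "occurred_at": ("occurred_at", "event_time"),
    "title": ("title", "event_title"),
    "warning_id": ("warning_id", "alert_id"),
    "fault_type": ("fault_type", "type"),
    "city": ("city", "region"),
}


def standardize(raw):
    """Convert a raw event from any source into the unified format."""
    values = {}
    for field_name, candidates in ALIASES.items():
        for key in candidates:
            if key in raw:
                values[field_name] = str(raw[key])
                break
        else:
            raise KeyError(f"raw event lacks field: {field_name}")
    # Assumes the source supplies an ISO 8601 timestamp string.
    values["occurred_at"] = datetime.fromisoformat(values["occurred_at"])
    return SceneEvent(**values)
```

An event lacking any of the unified attributes is rejected rather than stored partially, so that later rule matching can rely on the fault type being present.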
In one embodiment, self triggering accesses scene events according to four basic scenes: CMNET, EPC, IP bearer network and VoLTE voice. In another embodiment, external system triggering listens to middleware messages in real time. The corresponding scene event can be generated from alarm data or early warning data, or accessed from an external system.
Accessing a scene event also includes accessing its multidimensional data.
As follows from the principle described above, situation awareness is a methodology for analyzing and presenting fault events based on multidimensional data; that is, the accessed multidimensional data is the data basis for rule calculation. In one embodiment, the accessed scene multidimensional data comprises alarm data, engineering data, operation logs, complaint amount data, network delivery aggregation data and performance indexes. The accessed data further comprises the fault type attribute of the scene event. A sound storage period, a full range of data types and complete data guarantee the diversity and integrity of the multidimensional data and thus provide the basis for correct analysis of events.
Multidimensional data may be accessed by real-time acquisition or by timed acquisition, depending on frequency. Data for triggered events is mostly acquired in real time, whereas data for events accessed by monitoring system operation and maintenance data is mostly acquired at timed intervals. In practice, data can be collected at timed intervals by receiving messages via sftp plus a web service with a period of 15 minutes, one hour, one day or one month; in one embodiment, the engineering data is collected in this timed manner. In another embodiment, data is collected in real time by listening to middleware messages in real time. The implementer may choose the manner of data acquisition.
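The timed acquisition periods mentioned above (15 minutes, hour, day, month) can be illustrated by a small scheduling helper. This is only a sketch: the period labels, the function names `next_run` and `is_due`, and the choice to start each monthly run on the first day of the following month are assumptions, and the actual sftp or web service transfer is not shown.

```python
from datetime import datetime, timedelta

# Periods with a fixed length; "month" is handled separately below.
FIXED_PERIODS = {
    "15min": timedelta(minutes=15),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def next_run(last_run, period):
    """Return when the next timed collection should start."""
    if period in FIXED_PERIODS:
        return last_run + FIXED_PERIODS[period]
    if period == "month":
        # Assumption: monthly jobs run on the first day of the following month.
        year = last_run.year + last_run.month // 12
        month = last_run.month % 12 + 1
        return last_run.replace(year=year, month=month, day=1)
    raise ValueError(f"unknown period: {period}")


def is_due(last_run, now, period):
    """True when a timed collection job should be started."""
    return now >= next_run(last_run, period)
```

Real-time acquisition needs no such schedule, since each middleware message is handled as it arrives.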
In step S106, the scene delimiting rule is calculated.
The calculation of the scene delimiting rule may be based on data accessed through event triggering, or on data accessed through events obtained by monitoring system operation and maintenance data.
Specifically, the calculation reads the delimiting rule matched by the event fault type, outputs the rule variables, and derives a preliminary delimiting result according to the conclusion configuration. That is, the scene multidimensional data is analyzed, the accessed scene event is calculated, and a preliminary delimiting result is output as the preliminary conclusion of fault monitoring. After logging in, the user can view the event delimiting result on the interface at any time; the result comprises the event-related data and the preliminary fault delimiting conclusion.
It can be understood that scene delimiting rule calculation comprises filtering rule calculation and conclusion rule calculation. Filtering rule calculation comprises multidimensional data rule calculation and event topology rule calculation, that is, the filtering and screening of alarm data, engineering data and topology. Conclusion rule calculation produces the overall delimiting conclusion from the data left after the scene delimiting rule filtering; in other words, the delimiting conclusion rule gives the summary, and the scene delimiting rule calculation gives the detail.
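The calculation described in step S106 can be sketched as a three-stage function: match the rule by fault type, apply the two filtering rules, then apply the conclusion rule to what remains. The rule representation (a dictionary of three callables), the record keys and the function name `delimit` are assumptions of this example and not the implementation of the disclosure.

```python
def delimit(event, rules, topology, records):
    """Return event-related data plus a preliminary delimiting conclusion."""
    rule = rules.get(event["fault_type"])
    if rule is None:
        # No delimiting rule configured for this fault type.
        return {"event_id": event["id"], "elements": [], "records": [],
                "conclusion": None}
    # Filtering rule 1: event topology rule selects the network elements in scope.
    in_scope = {ne["name"] for ne in topology if rule["topology_rule"](event, ne)}
    # Filtering rule 2: multidimensional data rule keeps related records only.
    related = [rec for rec in records
               if rec["ne"] in in_scope and rule["data_rule"](event, rec)]
    # Conclusion rule: summary over the filtered detail.
    conclusion = rule["conclusion_rule"](sorted(in_scope), related)
    return {"event_id": event["id"], "elements": sorted(in_scope),
            "records": related, "conclusion": conclusion}
```

Returning the filtered elements and records together with the conclusion mirrors the statement that the result shown to the user contains both the event-related data and the preliminary conclusion.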
In one embodiment, a REST interface is further provided to output results to external systems, so that the preliminary delimiting results of events can be shared.
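The disclosure does not define the shape of this interface. Purely as an illustration, the sketch below serves stored results as JSON using only the Python standard library; the URL layout `/events/<id>/delimitation`, the in-memory `RESULTS` store and the names `lookup`, `DelimitationHandler` and `make_server` are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Event id -> delimiting result; stands in for the database in this sketch.
RESULTS = {}


def lookup(path, results):
    """Map a path such as /events/<id>/delimitation to (status, JSON body)."""
    parts = path.strip("/").split("/")
    if len(parts) == 3 and parts[0] == "events" and parts[2] == "delimitation":
        result = results.get(parts[1])
        if result is not None:
            return 200, json.dumps(result)
    return 404, json.dumps({"error": "not found"})


class DelimitationHandler(BaseHTTPRequestHandler):
    """Read-only endpoint exposing preliminary delimiting results."""

    def do_GET(self):
        status, body = lookup(self.path, RESULTS)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=8080):
    """Build (but do not start) a local server; call serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), DelimitationHandler)
```

Keeping the path handling in the pure function `lookup` lets the routing be checked without opening a network socket.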
In one embodiment, in addition to the calculation configuration of the multidimensional data, other components are configurable; this rich configuration allows the corresponding fault scene to be located and matched quickly across multiple dimensions, so that the scene is built rapidly. For example: scene topology configuration, in which a user draws the scene topology according to the network hierarchy and structure within each specialty, with support for topology network element configuration, link configuration and icon configuration; preliminary conclusion configuration, in which a conclusion model is configured, filled according to the multidimensional data results, and output as a preliminary delimiting result; and event access configuration, in which different categories of events can be accessed quickly by configuring the event source and event category.
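One way to picture a configurable conclusion model is as a text template whose placeholders are filled from the multidimensional data results. The template wording, the placeholder names and the function `fill_conclusion` below are assumptions for illustration; the disclosure does not describe the format of the model.

```python
from string import Template

# A hypothetical conclusion model; an operator would configure text like this.
CONCLUSION_MODEL = Template(
    "Event $title: $alarm_count related alarm(s) on $elements; "
    "engineering operation in window: $engineering."
)


def fill_conclusion(model, title, alarms, engineering_ops):
    """Fill the model from filtered data and return the preliminary conclusion."""
    elements = sorted({alarm["ne"] for alarm in alarms}) or ["no network element"]
    return model.substitute(
        title=title,
        alarm_count=len(alarms),
        elements=", ".join(elements),
        engineering="yes" if engineering_ops else "no",
    )
```

For instance, two alarms on a single element with no engineering operation yield a one-sentence conclusion naming that element and stating "engineering operation in window: no".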
In one embodiment, a user configures a 4G complaint early warning rule. After an event is triggered into the situation awareness system, the system finds the corresponding multidimensional data delimiting rule according to the fault type and filters the basic data down to the alarms, engineering operations, performance indexes and the like related to the event. It then analyzes whether the event was caused by an engineering operation, which network element generated the alarm, and whether the performance indexes on that network element are normal, thereby helping maintenance personnel locate the problem quickly.
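The three questions asked in this 4G complaint example can be sketched as one small analysis function. The record keys, the index names used in the test data and the convention that an index is abnormal when it falls below its threshold are assumptions made for the example only.

```python
def analyze_4g_complaint(alarms, engineering_ops, kpis, thresholds):
    """Answer the three questions of the example on already filtered data."""
    # Which network elements generated alarms?
    alarmed = sorted({alarm["ne"] for alarm in alarms})
    # Which performance indexes are abnormal? Assumption: below threshold is abnormal;
    # an index without a configured threshold is treated as normal.
    abnormal = sorted(
        f"{kpi['ne']}:{kpi['name']}"
        for kpi in kpis
        if kpi["name"] in thresholds and kpi["value"] < thresholds[kpi["name"]]
    )
    return {
        # Was any engineering operation found among the related data?
        "engineering_related": bool(engineering_ops),
        "alarmed_elements": alarmed,
        "abnormal_indexes": abnormal,
    }
```

The returned dictionary gives maintenance personnel the three findings side by side, leaving the final judgement of the cause to them.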
Fig. 2 illustrates one possible arrangement of the functions described in the embodiments of this specification when implemented in hardware, firmware, software or a combination thereof. Specifically, Fig. 2 is a schematic diagram of a situation awareness based fault monitoring system for executing the method shown in Fig. 1. The system includes: a delimitation unit 902, configured to configure a scene delimiting rule based on the data, the network elements and the fault types of events; an acquisition unit 904, configured to access a scene event, where the scene event includes an event fault type; and an analysis unit 906, configured to read a delimiting rule according to the fault type of the scene event, analyze and calculate scene data, and output a scene event delimiting result. The scene data is multidimensional data, and the delimiting result can be used for evaluating and predicting the network situation.
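If the three units were realized in software, their division of responsibility might look like the sketch below. The class and method names are assumptions, and a rule is reduced to a single callable for brevity; only the unit roles and reference numerals come from the description.

```python
class DelimitationUnit:
    """Holds scene delimiting rules keyed by fault type (unit 902)."""

    def __init__(self):
        self.rules = {}

    def configure(self, fault_type, rule):
        self.rules[fault_type] = rule


class AcquisitionUnit:
    """Accesses scene events (unit 904)."""

    def __init__(self):
        self.events = []

    def access(self, event):
        if "fault_type" not in event:
            raise ValueError("scene event must carry an event fault type")
        self.events.append(event)
        return event


class AnalysisUnit:
    """Reads the rule for the event's fault type and applies it (unit 906)."""

    def __init__(self, delimitation_unit):
        self.delimitation_unit = delimitation_unit

    def analyze(self, event, scene_data):
        rule = self.delimitation_unit.rules.get(event["fault_type"])
        # Without a configured rule there is no delimiting result.
        return None if rule is None else rule(event, scene_data)
```

The analysis unit depends only on the rules held by the delimitation unit, so new fault types can be supported by configuration alone.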
According to the embodiment of the invention, a multidimensional data rule model is configured so that events are matched with rule configurations: the corresponding delimiting rule is found through the fault type, the corresponding multidimensional data is analyzed, and the fault cause is preliminarily located. The user can view the preliminary delimiting result of an event on the system interface at any time, so that the preliminary cause and the possible causes of the fault are located quickly. A REST data service interface is opened for the preliminary delimiting result so that event conclusions can be shared. Operation and maintenance knowledge is accumulated through the accumulation of delimiting rules, and the fault processing time is shortened by analyzing faults automatically.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A fault monitoring method based on situation awareness is characterized by comprising the following steps:
accessing a scene event, wherein the scene event comprises an event fault type;
reading a delimitation rule through a scene event fault type, analyzing and calculating scene data, and outputting a scene event delimitation result;
the scene data is multidimensional data, the delimiting rule is configured based on the data, the network elements and the fault types, and the delimiting result can be used for evaluating and predicting the network situation.
2. The method of claim 1, wherein the scene delimiting rules comprise filtering rules and delimiting conclusion rules, wherein the filtering rules comprise event topology rules, multidimensional data rules,
the filtering rule is used for filtering and screening the network element range, the topological relation among the network elements and the multidimensional data thereof related to the access scene event;
and the delimitation conclusion rule is used for making an integral delimitation conclusion on the access scene event based on the topology and the data filtered by the filtering rule.
3. The method of claim 1, wherein the data types of the scene data comprise: alarm data, engineering data, operation logs, complaint amount data, network delivery aggregation data and performance indexes.
4. The method of claim 1, wherein accessing the scene event comprises event-triggered access,
the event-triggered access comprises self-triggered access and external-system-triggered access;
the self-triggered access accesses scene events according to the CMNET, EPC, IP bearer network and VoLTE voice scenes;
the external-system-triggered access accesses scene events by listening to middleware messages.
5. The method of claim 1, wherein accessing the scene event further comprises event access; specifically, the event access is the access of scene events obtained by monitoring system operation and maintenance data.
6. The method of claim 1, comprising collecting data in real time and periodically.
7. The method of claim 1, further comprising providing a REST interface to output scene event delimitation results for sharing.
8. The method of claim 1, further comprising building a fault scenario corresponding to an access scenario event by component configuration.
9. The method of claim 8, wherein the categories of component configurations include: configuration of scene topology, configuration of preliminary conclusions, event access configuration,
the configuration of the scene topology is that a user draws the scene topology according to the network hierarchy and structure in each specialty, and topology network element configuration, link configuration and icon configuration are supported;
the configuration of the preliminary conclusion is specifically a configuration conclusion model, the model is filled according to a multi-dimensional data result, and a preliminary delimitation result is output;
the event access is configured to be used for accessing different types of events through configuration of event sources and event types.
10. A situation awareness based fault monitoring system for performing the method of any one of claims 1-9, comprising:
a delimitation unit, used for configuring a scene delimitation rule based on the data, the network elements and the fault types of events;
an acquisition unit, used for accessing a scene event, wherein the scene event comprises an event fault type;
the analysis unit is used for reading the delimitation rule through the scene event fault type, analyzing and calculating scene data and outputting a scene event delimitation result;
the scene data is multidimensional data, and the delimiting result can be used for evaluating and predicting the network situation.
CN202011417081.7A 2020-12-07 2020-12-07 Situation awareness based fault monitoring method and system Pending CN112671557A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011417081.7A CN112671557A (en) 2020-12-07 2020-12-07 Situation awareness based fault monitoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011417081.7A CN112671557A (en) 2020-12-07 2020-12-07 Situation awareness based fault monitoring method and system

Publications (1)

Publication Number Publication Date
CN112671557A true CN112671557A (en) 2021-04-16

Family

ID=75401343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011417081.7A Pending CN112671557A (en) 2020-12-07 2020-12-07 Situation awareness based fault monitoring method and system

Country Status (1)

Country Link
CN (1) CN112671557A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination