CN117251353A - Monitoring method, system and platform for civil aviation weak current system - Google Patents

Publication number
CN117251353A
Authority
CN
China
Prior art keywords
data
monitoring
alarm
civil aviation
weak current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311542275.3A
Other languages
Chinese (zh)
Inventor
来秋月
张建翔
徐立中
单义升
牟文刚
王德强
刘文文
王芳
刘晓疆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Civil Aviation Cares Co ltd
Original Assignee
Qingdao Civil Aviation Cares Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Civil Aviation Cares Co ltd
Priority to CN202311542275.3A
Publication of CN117251353A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages

Abstract

The invention belongs to the technical field of information processing and discloses a monitoring method, system and platform for a civil aviation weak current system. The method uses deployed Telegraf/Exporter agents to collect and report indexes, extracting indexes, events and logs from the containers in which the Telegraf/Exporter runs; uses a Prometheus server to periodically pull monitoring data from the interfaces exposed by the monitored terminal equipment and to standardize the data format; performs data storage and alarm rule matching on the Prometheus server side; and connects the Web data display end to the data of the background TSDB database for data query, alarm pushing and data display. By monitoring the various index data of the system, operation and maintenance personnel can check its running condition in real time and track and handle abnormalities promptly.

Description

Monitoring method, system and platform for civil aviation weak current system
Technical Field
The invention belongs to the technical field of information processing, and particularly relates to a monitoring method, system and platform of a civil aviation weak current system.
Background
The civil aviation weak current system refers to weak current equipment and systems such as electric power, communication, network, security and protection used by civil aviation airports, airlines and other civil aviation related units. The system comprises a production operation system, a departure system, a security information management system, a security protection system, a 1.8G system, a broadcasting system, a freight system, a flight display system, an internal communication system and other weak current systems, and the normal operation of the equipment and the information system is critical to the safety and the efficiency of the civil aviation industry. Therefore, monitoring and managing the stability and performance of civil aviation weak current systems is an important task.
The traditional monitoring method often adopts a manual inspection and post-processing mode, and the mode has a plurality of problems. First, manual inspection requires a lot of labor and time costs, and cannot monitor and diagnose the state of the system in real time. Second, manual alarms are susceptible to human factors, and there may be delays or omissions. In addition, with the continuous expansion of the scale and the increase of the complexity of the civil aviation weak current system, the traditional monitoring method cannot meet the requirements on the performance and the stability of the system.
To solve these problems, building a monitoring platform for the civil aviation weak current system based on Prometheus has become a research direction. Prometheus is an open source monitoring and alerting tool that helps collect, store and analyze metric data for a system. By integrating Prometheus with a civil aviation weak current system, key indexes of the system, such as application service states, network bandwidth and equipment running states, can be collected and monitored in real time. Prometheus also provides powerful query and alarm functions and can raise alarms automatically according to preset rules and thresholds.
Through the above analysis, the problems and defects existing in the prior art are as follows:
(1) The prior art has low monitoring efficiency, high labor cost and long inspection time;
(2) The prior art cannot monitor the state of the system in real time, cannot raise alarms in time, and cannot effectively anticipate potential faults and problems;
(3) The prior art cannot monitor and analyze the key indexes of the system continuously, and has low fault positioning accuracy on civil aviation weak current equipment.
Disclosure of Invention
In order to overcome the problems in the related art, the embodiments of the invention disclose a monitoring method, system and platform for a civil aviation weak current system. The monitoring platform is built on Prometheus; by monitoring the various index data of the system, operation and maintenance personnel can check its running condition in real time and track and handle abnormalities promptly.
The technical scheme is as follows: a monitoring method of a civil aviation weak current system comprises the following steps:
S1: collecting and reporting indexes by using the deployed Telegraf/Exporter, and extracting indexes, events and logs from the container in which the Telegraf/Exporter runs;
S2: periodically pulling monitoring data from the interface exposed by the monitored terminal equipment by using the Prometheus server, and standardizing the data format;
S3: performing data storage and alarm rule matching on the Prometheus server side;
S4: connecting the Web data display end to the data of the background TSDB database, and performing data query, alarm pushing and data display.
In step S1, the parameters of the Telegraf plug-in are configured in the container in which the Telegraf/Exporter runs, including: server address, port and authentication credentials;
in step S2, the Prometheus server periodically pulling the monitoring data from the interface exposed by the monitored terminal equipment and standardizing the data format includes: processing and converting the collected data, arranging the raw data into a form suitable for storage and analysis, and transmitting the processed data to the configured target database to provide data support for Grafana's query and analysis;
in step S3, the Prometheus server side includes:
the Prometheus server, used for storing the monitoring data and matching the alarm rules, and for formatting the collected monitoring indexes into time series by using its own storage engine;
the Alertmanager, used for aggregating and silencing alarms and sending alarm information to the DingTalk interface;
OSAgents, used for monitoring the running states of the Prometheus server component and the Alertmanager component.
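The aggregation and silencing role assigned to the Alertmanager above can be illustrated with a minimal Python sketch. It is not Alertmanager's own implementation; the function names `group_alerts` and `is_silenced`, the label keys and the sample alerts are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def is_silenced(alert, silences):
    """An alert is silenced when all matchers of at least one silence equal its labels."""
    labels = alert["labels"]
    return any(all(labels.get(k) == v for k, v in s.items()) for s in silences)

def group_alerts(alerts, group_by, silences=()):
    """Drop silenced alerts, then bucket the rest by the chosen label values."""
    groups = defaultdict(list)
    for alert in alerts:
        if is_silenced(alert, silences):
            continue
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)

# Hypothetical alerts from two weak current systems.
alerts = [
    {"labels": {"alertname": "HostDown", "system": "departure", "severity": "critical"}},
    {"labels": {"alertname": "HostDown", "system": "departure", "severity": "warning"}},
    {"labels": {"alertname": "DiskFull", "system": "broadcast", "severity": "warning"}},
]
silences = [{"system": "broadcast"}]  # e.g. during planned maintenance
grouped = group_alerts(alerts, group_by=("alertname", "system"), silences=silences)
```

Grouping by alert name and system means one notification can carry both departure alerts, while the silenced broadcast alert is never sent.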
Further, in the Prometheus server, the time series is the format in which Prometheus stores monitoring data, and each time series is composed of the following elements:
index name: indicates the specific index being monitored, such as CPU utilization or memory usage;
a set of labels: key-value pairs that uniquely identify the time series;
timestamp: the time point of data acquisition;
data value: the specific numerical value of the monitored index;
generation of the time series: at each data acquisition, the Prometheus server generates a unique time series for each index and its corresponding label combination; the time series is appended to the storage engine for subsequent query and analysis; the Prometheus server uses a series of data compression and storage policies to manage the size and precision of the stored data, including how historical data is retained and how data of different precision is stored.
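The four elements listed above can be modelled in a few lines of Python. This is a toy sketch under stated assumptions, not the Prometheus storage engine: the class names `SeriesKey` and `Store` and the sample values are hypothetical. It shows that the index name together with the label set identifies a series, and that each acquisition appends a (timestamp, value) sample to it.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class SeriesKey:
    """Identity of a time series: index name plus a sorted label set."""
    name: str
    labels: tuple  # sorted (key, value) pairs, so the key is hashable

    @classmethod
    def of(cls, name, **labels):
        return cls(name, tuple(sorted(labels.items())))

@dataclass
class Store:
    """Toy append-only store: one list of (timestamp, value) samples per series."""
    series: dict = field(default_factory=dict)

    def append(self, key, timestamp, value):
        self.series.setdefault(key, []).append((timestamp, float(value)))

store = Store()
cpu = SeriesKey.of("cpu_usage_percent", instance="webserver1")
store.append(cpu, 1700000000, 41.5)
store.append(cpu, 1700000015, 43.0)
# A different label value yields a different series for the same index name.
store.append(SeriesKey.of("cpu_usage_percent", instance="webserver2"), 1700000000, 12.0)
```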
In step S3, data storage and alert rule matching are performed, including:
the Prometheus server has a built-in, efficient TSDB database, and the captured monitoring data is stored in the TSDB database as time series;
when Grafana needs to use the monitoring data, PromQL is used as the time series data query language, data is retrieved from Prometheus according to indexes and dimension labels, and a specific index and label combination is selected with PromQL in the query editing interface, including: querying the up index of the node exporter with up{job="node_exporter"}, where the job label equals node_exporter, thereby providing the query and analysis functions for the monitoring data.
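As a hedged illustration of the query described above, the following Python sketch builds the PromQL selector and a request URL for the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`). The helper names and the base address `http://localhost:9090` are assumptions for the example; no request is actually sent.

```python
from urllib.parse import urlencode

def selector(metric, **labels):
    """Render a PromQL selector such as up{job="node_exporter"}."""
    if not labels:
        return metric
    body = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{metric}{{{body}}}"

def instant_query_url(base_url, promql):
    """Build a URL for the Prometheus instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"

query = selector("up", job="node_exporter")
# Assumed address of a local Prometheus server.
url = instant_query_url("http://localhost:9090", query)
```

Grafana issues equivalent queries on the user's behalf once Prometheus is configured as a data source.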
In step S4, the Web data display end includes: a DingTalk component for sending alarms and a Grafana component for displaying index data and alarm information;
the Web data display end uses the Grafana UI tool to summarize, filter and aggregate data through the PromQL time series query language, displays Prometheus data in chart form, and realizes custom data display adapted to service requirements;
the display content comprises:
an alarm center: displaying alarm details, a service overview and an alarm overview;
wherein the fields in the alarm details include:
date: fixed format: yyyy-MM-dd HH:mm:ss;
error summary: a brief description of the error;
service grouping: the civil aviation weak current system to which the alarm belongs;
severity level: divided into two levels, warning and critical, according to the importance of the service system, the criticality of the node and the scope of service impact;
service type: the alarm list is divided by professional line;
error details: the details of the alarm.
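The alarm detail fields listed above can be sketched as a small Python record. The class name `AlarmDetail`, the attribute names and the sample values are hypothetical; the date is rendered in the year-month-day hour:minute:second form that the fixed format is assumed to denote.

```python
from dataclasses import dataclass
from datetime import datetime

SEVERITIES = ("warning", "critical")  # the two levels named in the text

@dataclass
class AlarmDetail:
    date: datetime
    summary: str
    service_group: str
    severity: str
    service_type: str
    details: str

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")

    def row(self):
        """Return the display row with the date as a fixed-format string."""
        return [self.date.strftime("%Y-%m-%d %H:%M:%S"), self.summary,
                self.service_group, self.severity, self.service_type, self.details]

# Hypothetical alarm for illustration only.
alarm = AlarmDetail(datetime(2023, 11, 20, 8, 5, 9), "Host unreachable",
                    "departure system", "critical", "network",
                    "ICMP probe failed for 5 minutes")
```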
Another object of the present invention is to provide a monitoring platform for a civil aviation weak current system, the platform implementing the monitoring method for the civil aviation weak current system, the platform comprising:
the data collection layer, which configures the parameters of the Telegraf plug-in, including server addresses, ports and authentication credentials, according to the requirements of various data sources such as operating system indexes, databases, cloud platforms, communication equipment, message queues, network equipment and security equipment; processes and converts the collected data, including filtering, label adding and data format conversion, so as to arrange the raw data into a form suitable for storage and analysis; and transmits the processed data to the configured target database to provide data support for Grafana's query and analysis;
the data storage layer, which stores the data imported by the Telegraf agent in the Prometheus built-in time series database, defines alarm rules in the configuration file of the Alertmanager, and divides the alarm rules at multiple levels according to civil aviation system and software type; the alarm rules define when to trigger alarms, how to group them and how to silence them, and use PromQL, the query language of Prometheus, to select the time series that trigger alarms; the stored data is matched against the preset rules, and an alarm is generated when the matching succeeds;
the data display layer, which provides a Web interface for data alarming and display, and intuitively displays the data in the Prometheus built-in time series database and the generated alarms in graphical forms such as bar graphs and line graphs.
Further, in the data storage layer, the preset rule includes the following parts:
alert: the rule name, which is also the name of the triggered alarm;
expr: a PromQL expression defining the condition that triggers the alarm;
for: the duration for which the condition must hold before the alarm is triggered;
labels: labels added to the alarm for subsequent processing and filtering;
annotations: comments added to the alarm, including a summary and a detailed description;
a plurality of alarm rules are defined in the configuration file according to civil aviation scenarios and actual demands, and the expressions, durations, labels and annotations are adjusted as required; the Alertmanager periodically reads and applies these rules, judging when to trigger an alarm, how to handle the alarm, and which channels to notify;
in the data display layer, the user creates a data source connection through the Grafana interface and configures the address and access credentials of the Prometheus server as the data source; the visualization is selected and customized by means of graphs, tables and dashboards; queries are edited and data is retrieved from the Prometheus server using PromQL, the query language of Prometheus; the returned time series data is used to draw charts, and the style, color and axes of the charts are customized so that the presented data meets requirements; meanwhile, alarm rules are set for the charts, and when the data meets specific conditions, Grafana triggers an alarm and sends a notification.
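The five parts of a rule can be written down as a plain Python dictionary, as in the sketch below. The field names mirror those of a Prometheus alerting rule (`alert`, `expr`, `for`, `labels`, `annotations`); the rule content, the label names and the helper `validate_rule` are hypothetical and serve only to show the shape of a rule.

```python
REQUIRED = ("alert", "expr")
OPTIONAL = ("for", "labels", "annotations")

def validate_rule(rule):
    """Check that a rule dict has the required parts and no unknown ones."""
    missing = [k for k in REQUIRED if not rule.get(k)]
    unknown = [k for k in rule if k not in REQUIRED + OPTIONAL]
    if missing or unknown:
        raise ValueError(f"missing={missing} unknown={unknown}")
    return rule

# Hypothetical rule: a departure-system target has been down for 5 minutes.
rule = validate_rule({
    "alert": "DepartureHostDown",
    "expr": 'up{job="node_exporter", system="departure"} == 0',
    "for": "5m",
    "labels": {"severity": "critical", "service_type": "network"},
    "annotations": {
        "summary": "Host {{ $labels.instance }} is down",
        "description": "No successful scrape for 5 minutes.",
    },
})
```

The labels attached here are what later grouping, silencing and display by severity or service type would operate on.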
Another object of the present invention is to provide a monitoring system for a civil aviation weak current system, the system implementing a monitoring method for the civil aviation weak current system, the system comprising:
the alarm module is used for displaying alarm information of the system in real time, and comprises key fields of time, error reporting summary, service grouping, severity level, service type and error reporting details;
The system operation module is used for displaying the operation conditions of each professional line of the civil aviation weak current system, and comprises a production line, a departure line, security protection, a cloud platform and communication;
the statistical report module is used for generating report statistics based on the collected monitoring index data;
and the log collection module is responsible for collecting and storing the logs of the system operation module in the log aggregation system and providing a web interface for calling and displaying.
Further, the system operation module monitoring indexes comprise an operating system, middleware and infrastructure monitoring, an application layer and a service layer, and are used for acquiring performance indexes, resource utilization conditions and service operation states, timely finding potential problems and performing corresponding optimization and adjustment.
Further, the statistical report module is also used for analyzing the operation data of the whole system and acquiring the overall trend and performance of the whole system;
the log collection module is also used for troubleshooting, locating faults and diagnosing faults of the whole system.
By combining all the above technical schemes, the invention has the following advantages and positive effects. The Prometheus-based monitoring platform for the civil aviation weak current system provided by the invention improves operation and maintenance efficiency: automatic monitoring and real-time data acquisition greatly improve monitoring efficiency and reduce labor cost and inspection time;
perceiving system risk in advance: by setting reasonable rules and thresholds, the system state can be monitored in real time and alarms raised promptly, so that potential faults and problems are found in advance;
improving the operation and maintenance quality: through continuous monitoring and analysis of key indexes of the system, operation and maintenance personnel can be helped to quickly locate and solve the problems, and the stability and usability of the system are improved;
data analysis and optimization: through analysis and mining of the monitoring data, the operation trend and the performance bottleneck of the system can be known, and basis is provided for system optimization and decision making.
Compared with the prior art, the advantages of the invention further include: in the Prometheus-based monitoring platform for the civil aviation weak current system, Prometheus stores the monitoring data captured from the various indexes collected by Telegraf as time series in its built-in TSDB time series database, and user-defined PromQL provides the query and analysis functions for the monitoring data, achieving high-quality visualization of the operating condition of civil aviation services; operation and maintenance targets are customized and divided along multiple dimensions to meet operation and maintenance requirements. Information such as system condition and fault alarms is clear at a glance, thereby improving the level of operation and maintenance work and production efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure;
FIG. 1 is a schematic diagram of a monitoring platform framework of a civil aviation weak current system provided by an embodiment of the invention;
FIG. 2 is a diagram of a monitoring system of a civil aviation weak current system provided by an embodiment of the invention;
FIG. 3 is a flowchart of a monitoring method of a civil aviation weak current system provided by an embodiment of the invention;
FIG. 4 is a schematic diagram of a monitoring method of a civil aviation weak current system provided by an embodiment of the invention;
in the figures: 1. a data collection layer; 2. a data storage layer; 3. a data display layer; 4. an alarm module; 5. a system operation module; 6. a statistical report module; 7. a log collection module.
Detailed Description
In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to the appended drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The invention may be embodied in many other forms than described herein and similarly modified by those skilled in the art without departing from the spirit or scope of the invention, which is therefore not limited to the specific embodiments disclosed below.
The innovation of the monitoring method of the civil aviation weak current system provided by the embodiment of the invention lies in the following: it takes the particularity of the civil aviation field into account, is guided by automated and intelligent operation and maintenance, and, through customization, data integration and visualization, provides a comprehensive, real-time and multi-level monitoring solution for the civil aviation weak current system. It also gives the management layer clearer data support, so that decisions are data-driven.
Embodiment 1, as shown in fig. 1, the monitoring platform for a civil aviation weak current system provided by the embodiment of the invention includes:
The data collection layer 1: for the civil aviation weak current system, the indexes of concern include the state of communication equipment, network traffic, operating system indexes, the indexes of various middleware, and the health state of various front-end equipment such as cameras and industrial personal computers. Telegraf is used as the probe program for acquiring the various indexes; it collects the data and uploads it to the server, where the next step of data storage processing is performed.
Illustratively, the parameters of the Telegraf plug-in, such as server addresses, ports and authentication credentials, are configured according to the requirements of various data sources, such as operating system indexes (CPU, memory and the like), databases, cloud platforms, communication equipment, message queues, network equipment and security equipment. The collected data is processed and converted, including filtering, label adding and data format conversion, so that the raw data is arranged into a form suitable for storage and analysis. The processed data is then transmitted to the configured target database to provide data support for Grafana's query and analysis.
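The filtering, label adding and format conversion described above can be sketched in Python as follows. This is an illustration only, not Telegraf's implementation: the function names, the shape of the raw readings and the `system` label are hypothetical, and the output is assumed to be the Prometheus text exposition format.

```python
def to_exposition_line(name, value, labels, timestamp_ms=None):
    """Render one sample in the Prometheus text exposition format."""
    def esc(v):
        return str(v).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    label_part = ",".join(f'{k}="{esc(v)}"' for k, v in sorted(labels.items()))
    line = f"{name}{{{label_part}}} {float(value)}" if labels else f"{name} {float(value)}"
    return line if timestamp_ms is None else f"{line} {timestamp_ms}"

def process(readings, extra_labels, keep):
    """Filter raw readings, add labels, and convert each one to an exposition line."""
    lines = []
    for r in readings:
        if r["name"] not in keep:                           # filtering
            continue
        labels = {**r.get("labels", {}), **extra_labels}    # label adding
        lines.append(to_exposition_line(r["name"], r["value"], labels))  # conversion
    return lines

# Hypothetical raw readings from a terminal device.
raw = [
    {"name": "cpu_usage_percent", "value": "41.5", "labels": {"host": "srv01"}},
    {"name": "debug_counter", "value": 7},
]
lines = process(raw, extra_labels={"system": "departure"}, keep={"cpu_usage_percent"})
```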
The data storage layer 2 stores the data imported by the Telegraf agent in the Prometheus built-in time series database, defines alarm processing rules in the configuration file of the Alertmanager, and divides the alarm rules at multiple levels, such as civil aviation system and software and hardware type. These rules define when to trigger alarms, how to group them, how to silence them, and so on. The rules use PromQL, the query language of Prometheus, to select the time series that trigger an alarm. The stored data is matched against the preset rules, and an alarm is generated if the matching succeeds. The definition of a rule includes the following parts:
alert: the rule name, which is also the name of the triggered alarm.
expr: a PromQL expression that defines the condition triggering the alarm.
for: the duration for which the condition must hold before the alarm is triggered.
labels: labels added to the alarm, which can be used for subsequent processing and filtering.
annotations: comments added to the alarm, including a summary and a detailed description.
A plurality of alarm rules are defined in the configuration file according to civil aviation scenarios and actual demands, and the expressions, durations, labels, annotations and the like are adjusted as required. The Alertmanager periodically reads and applies these rules, deciding when to trigger an alarm, how to handle it, and which channels to notify.
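The effect of the `for` duration can be illustrated with a small Python sketch: the condition must hold continuously for the stated time before the alarm moves from pending to firing. The function name `alert_states`, the threshold and the sample values are hypothetical, and the sketch simplifies rule evaluation to a single series with one sample per evaluation.

```python
def alert_states(samples, threshold, for_seconds):
    """Yield (timestamp, state) where state is inactive, pending, or firing.

    samples: iterable of (timestamp_seconds, value), ordered by time.
    The condition is value > threshold; it must hold for for_seconds to fire.
    """
    active_since = None
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts          # condition just became true
            state = "firing" if ts - active_since >= for_seconds else "pending"
        else:
            active_since = None            # condition broken, reset the timer
            state = "inactive"
        yield ts, state

# Hypothetical CPU usage samples, one per minute.
samples = [(0, 50), (60, 95), (120, 96), (180, 97), (240, 40)]
states = list(alert_states(samples, threshold=90, for_seconds=120))
```

A short spike that ends before the duration elapses never reaches the firing state, which is how the duration suppresses transient alarms.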
The data display layer 3: the Grafana program provides a Web interface for data alarming and display, and intuitively displays the data in the Prometheus built-in time series database and the generated alarms in graphical forms such as bar graphs and line graphs.
The user creates a data source connection through the Grafana interface and configures the address and access credentials of the Prometheus server as the data source. The visualization is selected and customized by means of graphs, tables, dashboards and the like; queries are edited and data is retrieved from the Prometheus server using PromQL, the query language of Prometheus; the returned time series data is used to draw charts, and the style, color and axes of the charts are customized so that the presented data meets requirements. Meanwhile, alarm rules are set for the charts, and when the data meets specific conditions, Grafana triggers an alarm and sends a notification.
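Besides the interactive interface, a data source can also be registered through Grafana's HTTP API. The Python sketch below only builds the JSON body that would be posted to `/api/datasources`; the data source name, the server address and the optional basic-auth credentials are assumptions for the example, and no request is sent.

```python
import json

def prometheus_datasource(name, url, user=None, password=None):
    """Build a JSON-serializable body describing a Prometheus data source."""
    body = {"name": name, "type": "prometheus", "url": url,
            "access": "proxy", "isDefault": True}
    if user is not None:
        # Access credentials are only included when they are provided.
        body["basicAuth"] = True
        body["basicAuthUser"] = user
        body["secureJsonData"] = {"basicAuthPassword": password}
    return body

# Assumed address of the Prometheus server.
payload = json.dumps(prometheus_datasource("Prometheus", "http://localhost:9090"))
```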
Embodiment 2, as another implementation manner of the present invention, as shown in fig. 2, the monitoring system for a civil aviation weak current system provided by the embodiment of the present invention provides a plurality of key modules, including an alarm module 4, a system operation module 5, a statistics report module 6 and a log collection module 7, so as to comprehensively monitor and manage the operation condition of the civil aviation weak current system.
The alarm module 4 is used for displaying alarm information of the system in real time, and comprises key fields such as time, error reporting summary, service grouping, severity level, service type, error reporting details and the like. Through the alarm module, a manager can quickly acquire abnormal conditions of the system and timely take corresponding measures to solve the problems, so that the stability and reliability of the system are ensured.
The system operation module 5 is used for displaying the operation conditions of professional lines of the civil aviation weak current system, including production lines, departure lines, security protection, cloud platforms, communication and the like. The system operation module 5 monitors the indexes to cover key aspects such as operating system, middleware and infrastructure monitoring, application layer and business layer. Through the system operation module 5, the manager can comprehensively understand the performance index, the resource utilization condition and the service operation state of the system, discover potential problems in time and make corresponding optimization and adjustment.
And the statistics report module 6 is used for generating report statistics based on the collected monitoring index data. Through the statistical report module, a manager can deeply analyze the operation data of the system to know the overall trend and performance of the system. Report statistics can help management personnel make decisions and plans, formulate reasonable resource allocation strategies, and identify bottlenecks and improvement points existing in the system.
The log collecting module 7 is responsible for collecting and storing logs of the operating system, the application and the like in a log aggregation system, and providing a web interface for calling and displaying. Through the log collection module, a manager can easily access and analyze log data of the system to help troubleshoot problems, locate faults and conduct fault diagnosis. The method has important significance for timely finding out system abnormality, maintaining system safety and improving system performance.
In summary, the civil aviation weak current system monitoring platform constructed based on Prometheus realizes comprehensive monitoring and management functions through the alarm module, the system operation module, the statistics report module and the log collection module, helps management staff grasp the state and the problem of the system in real time, and supports decision making and optimization work.
Embodiment 3 as shown in fig. 3, the method for monitoring a civil aviation weak current system provided by the embodiment of the invention includes:
S1: collecting and reporting indexes by using the Telegraf/Exporter deployed on servers, minicomputers, security IPSAN storage, broadcast matrices, 1.8G, IPTV, cloud platform operation and maintenance center equipment, switches, departure self-service check-in equipment and aviation industrial control computer terminal equipment, and directly extracting various indexes, events and logs from the containers in which the Telegraf/Exporter runs;
wherein the containers comprise a departure service system, a baggage tracking service system and a production operation service system;
illustratively, to collect terminal data indexes, the parameters of the Telegraf plug-in, such as server addresses, ports and authentication credentials, are configured according to the requirements of various data sources, such as operating system indexes (CPU, memory and the like), databases, cloud platforms, communication equipment, message queues, network equipment and security equipment, and various indexes, events and logs are extracted directly from the containers and systems in which the Telegraf plug-in runs. The Telegraf/Exporter needs to be deployed on each terminal device.
S2: periodically pulling monitoring data from the interface exposed by the monitored terminal equipment by using the Prometheus server, and standardizing the data format;
illustratively, the collected data is processed and converted, including filtering, tag addition, data format conversion, etc., so that the raw data is consolidated into a form suitable for storage and analysis, and the processed data is transmitted to a configured target database, providing data support for Grafana's query and analysis.
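The standardization applied after a pull can be sketched as parsing the scraped text into uniform records. The Python example below assumes the monitored device exposes the Prometheus text exposition format; the function name `parse_scrape` and the sample text are hypothetical, and the parser is deliberately simplified (label values containing escaped quotes are not handled).

```python
import re

LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+(\d+))?$')
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')

def parse_scrape(text, scrape_time_ms):
    """Turn scraped text into uniform dicts; comments and blank lines are skipped."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        m = LINE.match(line)
        if not m:
            continue
        name, labels, value, ts = m.groups()
        samples.append({
            "name": name,
            "labels": dict(LABEL.findall(labels or "")),
            "value": float(value),
            # Samples without their own timestamp get the scrape time.
            "timestamp_ms": int(ts) if ts else scrape_time_ms,
        })
    return samples

# Hypothetical response body from one monitored device.
text = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_network_up{device="eth0",host="srv01"} 1
"""
samples = parse_scrape(text, scrape_time_ms=1700000000000)
```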
S3: performing data storage and alarm rule matching on the Prometheus server side;
the following are deployed on the Prometheus server side:
the Prometheus server, a core component, which stores monitoring data and matches alarm rules.
The Alertmanager, a core component, which aggregates and silences alarms and sends alarm messages to other interfaces such as DingTalk.
OSAgents, a necessary component, which monitors the server itself.
The Prometheus server uses its own storage engine to format the collected monitoring indexes into time series. The time series is the format in which Prometheus stores monitoring data, and each time series is composed of the following elements:
Index name (MetricName): indicates the specific index being monitored, such as CPU usage or memory usage.
A set of labels (Labels): key-value pairs that uniquely identify a time series. For example, an instance of a particular server may be identified by instance="webserver1".
Timestamp (Timestamp): indicates the time point of data acquisition.
Data value (Value): indicates the specific value of the monitored index.
Generation of the time series: at each data acquisition, the Prometheus server generates a unique time series for each index and its corresponding label combination. The time series is appended to the storage engine for subsequent query and analysis. The Prometheus server uses a series of data compression and storage policies to manage the size and precision of the stored data, including how historical data is retained and how data of different precision is stored.
The Prometheus server has a built-in, efficient TSDB database in which the captured monitoring data is stored as time series. Compared with the traditional RRD format, the data compression of the TSDB does not cause loss of data precision, which is of great significance for subsequent big-data monitoring applications.
When Grafana needs to use the monitoring data, it uses PromQL (Prometheus Query Language), the time series data query language, to retrieve data from Prometheus according to the index (Metric) and dimension label (Label), and a specific index and label combination is selected with PromQL in the query editing interface. For example, the "up" index of the node exporter is queried with up{job="node_exporter"}, where the "job" label equals "node_exporter", thereby providing the query and analysis functions for the monitoring data.
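To illustrate how compression can avoid any loss of precision, the Python sketch below applies delta-of-delta encoding to sample timestamps, the general technique on which the Prometheus TSDB chunk format is based. It is a simplified illustration, not the actual on-disk encoding: the function names are hypothetical and the bit-level packing of the small residuals is omitted.

```python
def encode(timestamps):
    """Delta-of-delta encode integer timestamps; regular intervals become zeros."""
    out, prev, prev_delta = [], None, 0
    for t in timestamps:
        if prev is None:
            out.append(t)                   # first value is stored as-is
        else:
            delta = t - prev
            out.append(delta - prev_delta)  # change relative to the last interval
            prev_delta = delta
        prev = t
    return out

def decode(encoded):
    """Invert encode() exactly, recovering the original timestamps."""
    out, prev, prev_delta = [], None, 0
    for x in encoded:
        if prev is None:
            prev = x
        else:
            prev_delta += x
            prev += prev_delta
        out.append(prev)
    return out

# Hypothetical scrape timestamps at a roughly 15-second interval.
ts = [1700000000, 1700000015, 1700000030, 1700000046, 1700000060]
packed = encode(ts)
```

Because scrapes arrive at nearly regular intervals, most encoded values are zero or very small and can be stored in a few bits, while decoding reproduces the input exactly.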
S4: Connect the Web data display end to the data in the back-end TSDB database, and perform data query, alarm pushing, and data display.
The system operation module 5 monitors the application layer among the metrics. It works mainly at the DingTalk and Web data display ends, connects to the data in the back-end TSDB database, implements the service logic, and provides functions such as data query and alarm pushing.
The following components are deployed at the Web data display end:
DingTalk, an optional component, which sends alarms to DingTalk.
Grafana, a core component, which displays metric data, alarm information, and the like.
Illustratively, taking DingTalk pushing as an example: the DingTalk component is deployed on the Prometheus Server side (see step S3), the DingTalk external network interface is opened, and a network proxy is configured. After a robot is added to the DingTalk group, the robot is connected to a custom service through a webhook, and the secret and url are configured in the DingTalk configuration file on the Prometheus Server side. Finally, an AlertManager configuration rule dedicated to DingTalk is added, and starting the service enables the DingTalk alarm pushing function module.
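The role of the secret and url can be sketched as follows, assuming the signing scheme described in DingTalk's custom robot documentation: an HMAC-SHA256 over the millisecond timestamp and the secret, Base64-encoded and appended to the webhook URL. The function name, the token, and the secret below are placeholders; in the deployment described above this step would be performed by the DingTalk component rather than by hand-written code.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import quote_plus


def signed_webhook_url(url: str, secret: str,
                       timestamp_ms: Optional[int] = None) -> str:
    """Append timestamp and signature parameters to a robot webhook URL."""
    if timestamp_ms is None:
        timestamp_ms = int(round(time.time() * 1000))
    # Assumed scheme: sign "<timestamp>\n<secret>" with the secret as the key.
    string_to_sign = "{}\n{}".format(timestamp_ms, secret)
    digest = hmac.new(
        secret.encode("utf-8"), string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    sign = quote_plus(base64.b64encode(digest).decode("ascii"))
    separator = "&" if "?" in url else "?"
    return "{}{}timestamp={}&sign={}".format(url, separator, timestamp_ms, sign)


# Placeholder webhook URL and secret, for illustration only.
example_url = signed_webhook_url(
    "https://oapi.dingtalk.com/robot/send?access_token=EXAMPLE_TOKEN",
    "SEC_example_secret",
    timestamp_ms=1_700_000_000_000,
)
```

Passing the timestamp explicitly makes the result reproducible for testing; in normal use it would be taken from the current time.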
The Web data display end uses the Grafana UI tool to summarize, filter, and aggregate data through PromQL, the time series query language, displays Prometheus data in chart form, and provides custom data display adapted to service requirements.
The displayed content is as follows:
Alarm center: displays the alarm details, the business overview, and the alarm overview. The fields in the alarm details are explained as follows:
Date: fixed format yyyy-MM-DD HH:MM:ss;
Error summary: a brief description of the error;
Service group: the civil aviation weak current system to which the alarm belongs;
Severity level: divided into two levels, warning and severe, according to the importance of the service system, the criticality of the node, and the scope of service impact;
Service type: the alarm list is divided by professional line, that is, by service type;
Error details: the details of the alarm.
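The alarm detail fields listed above can be sketched as a simple record type. The class and field names are hypothetical, the sample values are invented for illustration, and the date pattern is assumed to mean year-month-day followed by hour:minute:second.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Severity(Enum):
    """The two severity levels named in the alarm details."""
    WARNING = "warning"
    SEVERE = "severe"


@dataclass
class AlarmDetail:
    date: datetime        # time at which the alarm was raised
    summary: str          # brief description of the error
    service_group: str    # weak current system the alarm belongs to
    severity: Severity
    service_type: str     # professional line
    details: str          # full alarm details

    def formatted_date(self) -> str:
        # Assumed rendering of the fixed date format.
        return self.date.strftime("%Y-%m-%d %H:%M:%S")


# Invented sample alarm, for illustration only.
alarm = AlarmDetail(
    date=datetime(2023, 11, 20, 8, 30, 5),
    summary="Disk usage above threshold",
    service_group="baggage tracking system",
    severity=Severity.WARNING,
    service_type="database",
    details="/data is 91% full on db-node-1",
)
print(alarm.formatted_date())  # 2023-11-20 08:30:05
```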
The business overview module provided by the embodiment of the invention displays all managed civil aviation weak current systems, including the departure system, production operation system, security information management system, flight display system, baggage tracking system and the like, and can link to the system operation center to display specific monitoring data.
The alarm overview compiles statistics on the alarm details, which improves operation and maintenance efficiency.
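As a rough sketch of the kind of statistics an alarm overview might compute, the following counts alarms by severity level and by service group. The function name, the dictionary keys, and the sample alarms are assumptions for illustration; the source does not specify which statistics are produced.

```python
from collections import Counter
from typing import Dict, Iterable


def alarm_overview(alarms: Iterable[Dict[str, str]]) -> Dict[str, Counter]:
    """Count alarms per severity level and per service group."""
    by_severity: Counter = Counter()
    by_service_group: Counter = Counter()
    for alarm in alarms:
        by_severity[alarm["severity"]] += 1
        by_service_group[alarm["service_group"]] += 1
    return {"by_severity": by_severity, "by_service_group": by_service_group}


# Invented sample alarms, for illustration only.
sample_alarms = [
    {"severity": "warning", "service_group": "departure system"},
    {"severity": "severe", "service_group": "flight display system"},
    {"severity": "warning", "service_group": "departure system"},
]
overview = alarm_overview(sample_alarms)
```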
System operation center: organized by civil aviation weak current system, with visual interfaces covering the operating system, network throughput, service status, database space, backup status, middleware and the like.
Statistical report module: mainly comprises alarm statistics and server load, and is used to display the overall operating trend and performance of the service systems over a period of time.
Log collection module: provides centralized collection, storage, and visualization of logs at every level of the operating system and applications, so that developers and operation and maintenance personnel can conveniently view and analyze system log data, quickly find problems, and carry out troubleshooting and performance optimization. The architecture is also highly scalable and flexible, which suits the log processing requirements of large-scale distributed systems.
As the above scheme shows, the Prometheus-based monitoring platform for civil aviation weak current systems provided by the invention improves the safety and reliability of the system: stable operation of civil aviation weak current systems is critical to flight safety and normal flight operation. The monitoring platform monitors key metrics of the system in real time so that the running condition of the system is fully understood. By finding faults, anomalies, and potential risks in time and issuing alarms and notifications, it helps operation and maintenance personnel respond quickly and take the necessary measures, improving the safety and reliability of the system.
Fault diagnosis and maintenance efficiency are improved: the traditional manual inspection and fault detection method has the problems of low efficiency and omission. By means of the monitoring platform built based on Prometheus, a large amount of system measurement index data can be collected and analyzed in real time. The data can be used for fault diagnosis and problem positioning, help operation and maintenance personnel to quickly and accurately find a fault source and take corresponding maintenance measures, so that the efficiency of fault diagnosis and maintenance is improved.
Preventive maintenance and optimization is achieved: by continuously monitoring the performance and operating state of the system, the monitoring platform can help operation and maintenance personnel to perform preventive maintenance and system optimization. Through analysis and mining of the data, the operation trend, bottleneck and potential problems of the system can be identified, measures can be taken in advance to optimize and adjust, so that potential faults are avoided, and the efficiency and usability of the system are improved.
Providing comprehensive visual monitoring and reporting: the monitoring platform provides a visual interface that displays the real-time state, historical data, and trend analysis of the system. Through visual monitoring and reporting, managers can better understand the running condition of the system and make corresponding decisions and plans. In addition, the monitoring platform can generate reports and statistical data for evaluating system performance and the effect of operation and maintenance, providing a reference for future planning and improvement.
In conclusion, the monitoring platform of the civil aviation weak current system is built based on Prometheus, so that the safety, stability and reliability of the system are improved, the fault diagnosis and maintenance efficiency is improved, preventive maintenance and optimization are realized, the operation and maintenance quality is improved, and a basic guarantee is provided for smooth production of civil aviation. Through real-time monitoring and data analysis, the platform can help civil aviation related units to realize comprehensive monitoring and management of weak current systems, so that the operation efficiency and the safety level of the whole civil aviation industry are improved.
Further, through the above embodiments, the present invention has the following features:
Improved safety and stability: the monitoring platform monitors the key metrics of the civil aviation weak current system in real time, covering for example the machine room environment, the communication network, and security equipment. By quickly discovering potential faults, anomalies, or risks, the monitoring platform helps reduce system interruptions and failures, improving the safety and stability of the airport's civil aviation weak current systems as a whole.
The maintenance cost is reduced: automatic monitoring reduces the need for manual inspection, thereby saving time and labor costs. By quickly finding problems and providing accurate diagnostic information, the platform can shorten the fault recovery time, ultimately reducing maintenance costs.
Optimizing resource utilization: the monitoring platform provides real-time monitoring and analysis of system performance and resource utilization. This helps to identify resource waste and performance bottlenecks, thereby optimizing resource allocation, improving efficiency and performance of the system, and reducing resource waste.
Enhancing user satisfaction: through real-time monitoring, the problems can be found and solved more quickly, and the usability and stability of the system can be improved, so that the user experience is improved. In the civil aviation field, this will directly affect the punctuality of flights and passenger satisfaction.
Support decision and planning: the report and the statistical data provided by the platform can help the management layer to know the overall operation condition and performance index of the system. This is important for making future investment decisions, technical planning and system upgrade plans.
Commercial value extension: as an advanced technical solution, the monitoring platform can be provided to airlines, airports and related units as a differentiated service for the civil aviation industry. This will increase the market competitiveness of the enterprise and may create a new source of commercial revenue.
In general, a Prometheus-based civil aviation system monitoring platform brings higher safety, efficiency and satisfaction to the civil aviation industry, and creates commercial competitive advantages and value-added opportunities for enterprises.
Comprehensive monitoring solution: in the civil aviation field, and in weak current system monitoring in particular, a comprehensive monitoring solution is often lacking. The Prometheus-based monitoring platform integrates the monitoring requirements of civil aviation weak current systems, machine room environments, cloud platforms, databases, communication, networks, security and the like into one unified monitoring platform, filling the technical gap left by earlier single-system monitoring, which could not meet these requirements.
Real-time and automated monitoring: traditional monitoring methods often rely on manual inspection and cannot monitor the state of the system in real time. The Prometheus-based platform collects and monitors data in real time and automatically sends alarms and notifications, filling the technical gap in real-time monitoring in the civil aviation field.
Cross-level monitoring: civil aviation weak current systems are typically made up of devices and components at different levels, covering physical devices, network devices, software applications and the like. The Prometheus-based monitoring platform provides cross-level monitoring capability, from the underlying devices up to the application layer, filling the technical gap in cross-level monitoring.
Metric analysis and data-driven decisions: traditional monitoring stops at data collection and lacks in-depth analysis and use of the data. The Prometheus-based platform has powerful data analysis and query functions and, in a data-driven way, gives civil aviation decision makers more insight into the running state and problems of the system, filling the technical gap in data analysis and decision support.
Visualization and user experience: combined with tools such as Grafana, the Prometheus-based monitoring platform provides a visual interface and displays monitoring data as charts and graphs, making the data easier to understand and analyze. The user-friendly interface fills the technical gap of earlier monitoring systems in visualization and user experience.
In summary, through its comprehensive solution, real-time monitoring, cross-level monitoring, data analysis, visualization and other advantages, the Prometheus-based civil aviation monitoring platform fills a technical gap in the civil aviation industry at home and abroad and brings the industry a more advanced, efficient, and comprehensive monitoring solution.
Open source flexibility: Prometheus is an open source tool with a flexible architecture and powerful customization capabilities. The scheme therefore does not depend on a specific vendor and can be customized and extended according to actual requirements, which avoids technical prejudice and makes the system more flexible and adaptable to different technical environments.
Wide application: Prometheus is widely used for monitoring and data collection in many fields and is not limited to specific industries. The technical scheme of the Prometheus-based monitoring platform is therefore general and is not affected by the technical prejudices of a specific industry.
Standardized monitoring metrics: Prometheus takes a standardized approach to collecting and storing monitoring metrics. Devices and systems from different suppliers can therefore output monitoring data that conforms to the Prometheus standard, eliminating technical prejudice toward a specific device or technology.
Data-driven decisions: Prometheus-based platforms provide powerful data analysis and query capabilities that help decision makers decide on the basis of data rather than subjective bias. This helps eliminate the influence of past subjective judgment and prejudice on decisions.
Technology integration and interaction: the scheme integrates Prometheus, Grafana, and AlertManager and provides a unified platform for monitoring and data analysis across different aspects. This integration removes the prejudice toward any single technology or tool and provides cross-system data interaction, making decisions more comprehensive and accurate.
Proven practical benefit: Prometheus is widely used worldwide and has delivered practical benefits. On that basis, the technical scheme of the Prometheus-based civil aviation monitoring platform can demonstrate its effectiveness through actual results, reducing the prejudice people may hold against new technologies.
In summary, the technical scheme of the Prometheus-based civil aviation monitoring platform effectively overcomes technical prejudice through open source flexibility, wide application, standardization, data-driven decisions, technology integration, and proven practical benefit, so that the monitoring scheme is objective and general.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.
The content of the information interaction and the execution process between the devices/units and the like is based on the same conception as the method embodiment of the present invention, and specific functions and technical effects brought by the content can be referred to in the method embodiment section, and will not be described herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present invention. For specific working processes of the units and modules in the system, reference may be made to corresponding processes in the foregoing method embodiments.
Based on the technical solutions described in the embodiments of the present invention, the following application examples may be further proposed.
According to an embodiment of the present application, the present invention also provides a computer apparatus, including: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, which when executed by the processor performs the steps of any of the various method embodiments described above.
Embodiments of the present invention also provide a computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of the respective method embodiments described above.
The embodiment of the invention also provides an information data processing terminal which, when implemented on an electronic device, provides a user input interface to implement the steps in the method embodiments; the information data processing terminal is not limited to a mobile phone, a computer, or a switch.
The embodiment of the invention also provides a server, which is used for realizing the steps in the method embodiments when being executed on the electronic device and providing a user input interface.
Embodiments of the present invention also provide a computer program product which, when run on an electronic device, causes the electronic device to perform the steps of the method embodiments described above.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the methods of the above embodiments of the present application may be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal device, a recording medium, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
To further demonstrate the positive effects of the above embodiments, the following experiments were carried out based on the above technical solutions.
(1) The alarm center interface displays all managed civil aviation weak current systems, including the departure system, production operation system, security inspection information management system, flight display system, baggage tracking system and the like, and can link to the system operation center to display specific monitoring data.
(2) The system operation center interface is organized by civil aviation weak current system, with visual interfaces covering the operating system, network throughput, service status, database space, backup status, middleware and the like.
(3) The DingTalk component is deployed on the Prometheus Server side, the DingTalk external network interface is opened, and a network proxy is configured. After a robot is added to the DingTalk group, the robot is connected to a custom service through a webhook, and the secret and url are configured in the DingTalk configuration file on the Prometheus Server side. Finally, an AlertManager configuration rule dedicated to DingTalk is added, and starting the service enables the DingTalk alarm pushing function module.
While the invention has been described with respect to what is presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

Claims (10)

1. A monitoring method for a civil aviation weak current system, characterized by comprising the following steps:
S1: collecting and reporting metrics by using the deployed Telegraf/Exporter, and extracting metrics, events, and logs from the container in which the Telegraf/Exporter runs;
S2: periodically pulling monitoring data from the interface opened by the monitoring terminal device by using the Prometheus Server, and standardizing the data format;
S3: performing data storage and alarm rule matching on the Prometheus Server side;
S4: connecting the Web data display end to the data in the back-end TSDB database, and performing data query, alarm pushing, and data display.
2. The method for monitoring a civil aviation weak current system according to claim 1, wherein in step S1, parameters of the Telegraf plug-in are configured for the container in which the Telegraf/Exporter runs, the parameters comprising: server address, port, and authentication credentials;
in step S2, the Prometheus Server periodically pulling the monitoring data from the interface opened by the monitoring terminal device and standardizing the data format comprises: processing and converting the collected data, arranging the raw data into a form suitable for storage and analysis, transmitting the processed data to a configured target database, and providing data support for Grafana's query and analysis;
in step S3, the Prometheus Server side comprises:
the Prometheus Server, used for storing the monitoring data and matching the alarm rules, and for formatting the collected monitoring metrics into time series using its own storage engine;
the AlertManager, used for aggregating and silencing alarms and sending alarm information to the DingTalk interface;
OSAgents, used for monitoring the running states of the Prometheus Server component and the AlertManager component.
3. The method for monitoring a civil aviation weak current system according to claim 2, wherein in the Prometheus Server, a time series is the format in which Prometheus stores monitoring data, and each time series consists of the following elements:
metric name: identifies the specific metric being monitored, including CPU usage and memory usage;
label set: key-value pairs used to uniquely identify the time series;
timestamp: the point in time at which the data was collected;
data value: the specific value of the monitored metric;
time series generation: for each data collection, the Prometheus Server produces a unique time series for each metric and its corresponding label combination; the time series is appended to the storage engine for subsequent querying and analysis; the Prometheus Server uses a set of data compression and storage policies to manage the size and precision of the stored data, including how historical data is retained and how data of different precision is stored.
4. The method for monitoring a civil aviation weak current system according to claim 1, wherein in step S3, performing data storage and alarm rule matching comprises:
the Prometheus Server has a built-in, efficient TSDB database, and the scraped monitoring data is stored in the TSDB database as time series;
when Grafana needs the monitoring data, PromQL is used as the time series data query language, data is retrieved from Prometheus by metric and dimension label, and the specific metric and label combination is selected with PromQL in the query editor, comprising: querying the up metric of the node exporter with up{job="node_exporter"}, where the job label equals node_exporter, thereby providing query and analysis functions over the monitoring data.
5. The method for monitoring a civil aviation weak current system according to claim 1, wherein in step S4, the Web data display end comprises: a DingTalk component for sending alarms and a Grafana component for displaying metric data and alarm information;
the Web data display end uses the Grafana UI tool to summarize, filter, and aggregate data through PromQL, the time series query language, displays Prometheus data in chart form, and provides custom data display adapted to service requirements;
the displayed content comprises:
alarm center: displaying the alarm details, the service overview, and the alarm overview;
wherein the fields in the alarm details include:
date: fixed format yyyy-MM-DD HH:MM:ss;
error summary: a brief description of the error;
service group: the civil aviation weak current system to which the alarm belongs;
severity level: divided into two levels, warning and severe, according to the importance of the service system, the criticality of the node, and the scope of service impact;
service type: the alarm list is divided by professional line;
error details: the details of the alarm.
6. A monitoring platform for a civil aviation weak current system, characterized in that the platform implements the monitoring method for a civil aviation weak current system according to any one of claims 1 to 5, the platform comprising:
the data collection layer (1), which configures parameters of the Telegraf plug-in according to the requirements of the various data sources, namely operating system metrics, databases, cloud platforms, communication equipment, message queues, network equipment, and security equipment, wherein the parameters comprise server addresses, ports, and authentication credentials; processes and converts the collected data, including filtering, label adding, and data format conversion, so as to arrange the raw data into a form suitable for storage and analysis; and transmits the processed data to a configured target database to provide data support for Grafana's query and analysis;
the data storage layer (2), which stores the data imported by the Telegraf agent in the Prometheus built-in time series database, defines alarm rules in the configuration file of the AlertManager, and organizes the alarm rules in multiple levels by civil aviation system and by software and hardware type; the alarm rules define when to trigger an alarm, how to group, and how to silence, and use PromQL, the query language of Prometheus, to select the time series that trigger alarms; the stored data is matched against the preset rules, and an alarm is generated when a match succeeds;
and the data display layer (3), which provides a Web interface for data alarming and display, and intuitively displays the data in the Prometheus built-in time series database and the generated alarms in graphical forms such as histograms and line graphs.
7. The monitoring platform for a civil aviation weak current system according to claim 6, characterized in that in the data storage layer (2), the preset rules comprise the following parts:
alert: the rule name, which is also the name of the triggered alarm;
expr: a PromQL expression defining the condition that triggers the alarm;
for: the duration for which the condition must hold before the alarm is triggered;
labels: labels added to the alarm for subsequent processing and filtering;
annotations: annotations added to the alarm, including a summary and a detailed description;
a plurality of alarm rules are defined in a configuration file according to civil aviation scenarios and actual requirements, and the expressions, durations, labels, and annotations are adjusted as required; the AlertManager periodically reads and applies these rules, determining when to trigger an alarm, how to handle the alarm, and which channels to notify;
in the data display layer (3), a user creates a data source connection through the Grafana interface and configures the address and access credentials of the Prometheus server as the data source; the visualized data is selected in a user-defined way in the form of graphs, tables, and dashboards; queries are edited and data is retrieved from the Prometheus server using the Prometheus query language, the returned time series data is used to draw charts, and the style, color, and axes of each chart are customized so that the data it presents meets the requirements; in addition, alarm rules are set for the charts, and when the data meets specific conditions, Grafana triggers an alarm and sends a notification.
8. A monitoring system for a civil aviation weak current system, characterized in that the system implements the monitoring method for a civil aviation weak current system according to any one of claims 1 to 5, the system comprising:
the alarm module (4), used for displaying alarm information of the system in real time, including the key fields of time, error summary, service group, severity level, service type, and error details;
the system operation module (5), used for displaying the operating conditions of each professional line of the civil aviation weak current system, including the production line, departure line, security, cloud platform, and communication;
the statistical report module (6), used for generating report statistics based on the collected monitoring metric data;
and the log collection module (7), responsible for collecting and storing the logs of the system operation module (5) in a log aggregation system and providing a web interface for retrieval and display.
9. The monitoring system of a civil aviation weak current system according to claim 8, wherein the metrics monitored by the system operation module (5) cover operating system, middleware, and infrastructure monitoring as well as the application layer and the service layer, and are used for acquiring performance metrics, resource utilization, and service operating states, finding potential problems in time, and making corresponding optimizations and adjustments.
10. The monitoring system of a civil aviation weak current system according to claim 8, wherein the statistical report module (6) is further used for analyzing the operating data of the whole system and obtaining the overall trend and performance of the whole system;
the log collection module (7) is also used for troubleshooting, fault location, and fault diagnosis for the whole system.

Priority Applications (1)

Application number: CN202311542275.3A
Publication: CN117251353A (en)
Priority date: 2023-11-20
Filing date: 2023-11-20
Title: Monitoring method, system and platform for civil aviation weak current system

Publications (1)

Publication number: CN117251353A (en)
Publication date: 2023-12-19

Family

ID=89137323

Country Status (1)

Country: CN
Link: CN117251353A (en)
Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103491354A (en) * 2013-10-10 2014-01-01 国家电网公司 System operation monitoring and controlling visual platform
US20200133814A1 (en) * 2018-10-25 2020-04-30 Capital One Services, Llc Application performance analytics platform
CN113486351A (en) * 2020-06-15 2021-10-08 中国民用航空局空中交通管理局 Civil aviation air traffic control network safety detection early warning platform
CN114996085A (en) * 2022-05-26 2022-09-02 中电云数智科技有限公司 Prometheus-based real-time service monitoring method and system
CN115934464A (en) * 2022-12-13 2023-04-07 浪潮云信息技术股份公司 Information platform monitoring and collecting system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
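Huang Jing; Chen Qiuyan: "Monitoring of an enterprise campus informatization PaaS platform based on Prometheus + Grafana", Digital Communication World (数字通信世界), no. 09 *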

Similar Documents

Publication Publication Date Title
US11657309B2 (en) Behavior analysis and visualization for a computer infrastructure
CN104407964B (en) A kind of centralized monitoring system and method based on data center
US20200259716A1 (en) Hierarchical network analysis service
CN110782370B (en) Comprehensive operation and maintenance management platform for power dispatching data network
CN110505102B (en) Power information communication fusion monitoring and service standardization management platform system and method
CN110581773A (en) automatic service monitoring and alarm management system
CN107958337A (en) A kind of information resources visualize mobile management system
JP6085550B2 (en) Log analysis apparatus and method
CN104699759A (en) Method for maintaining automatic operation of database
CN101989931A (en) Operation alarm processing method and device
CN113176770A (en) Remote monitoring system for equipment failure
CN116738163A (en) Energy consumption monitoring management system and method based on rule engine
CN116010456A (en) Equipment processing method, server and rail transit system
CN109858807B (en) Enterprise operation monitoring method and system
CN113468022B (en) Automatic operation and maintenance method for centralized monitoring of products
KR20060093524A (en) Information technology service management system
US8099527B2 (en) Operation management apparatus, display method, and record medium
CN117270937A (en) Digital operation and maintenance management system
CN116955434A (en) Full life cycle management and multidimensional energy efficiency analysis system for industrial equipment
CN116755992A (en) Log analysis method and system based on OpenStack cloud computing
CN117251353A (en) Monitoring method, system and platform for civil aviation weak current system
CN116155687A (en) Remote operation and maintenance system
CN109450103A (en) Condition detection method, device and the intelligent terminal of pressing plate
CN114240211A (en) Intelligent production management system and management method
CN116151787A (en) IT operation and maintenance management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination