CN114124662A - Resource intelligent operation and maintenance system based on cross-network environment - Google Patents

Info

Publication number
CN114124662A
Authority
CN
China
Prior art keywords
monitoring
platform
maintenance
data
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111126506.3A
Other languages
Chinese (zh)
Inventor
丁义镇
吕鹤
陈焕新
李存冰
王方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Technology Co Ltd
Original Assignee
Inspur Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Technology Co Ltd filed Critical Inspur Software Technology Co Ltd
Priority to CN202111126506.3A priority Critical patent/CN114124662A/en
Publication of CN114124662A publication Critical patent/CN114124662A/en
Withdrawn legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/02 Standardisation; Integration
    • H04L41/0213 Standardised network management protocols, e.g. simple network management protocol [SNMP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H04L43/045 Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a resource intelligent operation and maintenance system based on a cross-network environment, which belongs to the technical field of intelligent operation and maintenance.

Description

Resource intelligent operation and maintenance system based on cross-network environment
Technical Field
The invention relates to the technical field of intelligent operation and maintenance, in particular to a resource intelligent operation and maintenance system based on a cross-network environment.
Background
The intelligent operation and maintenance platform (hereinafter the "operation and maintenance platform") provides comprehensive, unified, multidimensional, automated and visual monitoring around the clock (7 x 24). Detailed monitoring data and graphical management are used to trace and locate problems, and threshold-based alarm detection finds faults promptly and raises alarms, which improves response speed and reduces operation and maintenance cost.
In an actual production environment, the operation and maintenance platform may be delivered into networks that cannot reach one another. For example, a deployment may comprise an A network and a video network: the A network holds confidential data and has a higher security level, whereas the video network must connect to cameras in various regions and belongs to the Internet category. For security and business reasons, the A network must be logically isolated by multiple means to keep data secure and prevent leaks. As a result, operation and maintenance platforms are built repeatedly in the mutually unreachable networks, the level of operation and maintenance is uneven, resource monitoring data is scattered, and users cannot manage resources as a whole, which greatly increases operation and maintenance cost.
Disclosure of Invention
In view of the above technical problems, the invention provides a resource intelligent operation and maintenance system based on a cross-network environment, which solves the problem that, in a cross-network environment, resource monitoring data from different networks and different platforms cannot be displayed together in one unified operation and maintenance platform.
The technical scheme of the invention is as follows:
A resource intelligent operation and maintenance system based on a cross-network environment combines a cloud computing platform, a monitoring platform, a reporting center and a unified intelligent operation and maintenance platform, among other components, planned and deployed together, to solve the above problems.
(1) Cloud computing platform
The cloud computing platform provides a set of standards for interfacing with IaaS, called the infrastructure adaptation driver interface (CPI). The IaaS drivers currently built into the platform include VMware, OpenStack, Xen, Docker and some industry-standard virtualization environments, and drivers supporting other IaaS can be written against the same standard. A driver exposes calls for managing the underlying resources and connects to the cloud platform through a standard API (application program interface), so that a variety of infrastructure environments can be accessed.
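The driver standard described above could be modeled as an abstract interface that every IaaS driver implements, plus a registry that selects a driver by IaaS type. The following is a minimal sketch under stated assumptions: the names `InfrastructureDriver`, `list_instances`, `register_driver` and `get_driver` are hypothetical, since the patent does not list the actual CPI methods, and the OpenStack driver returns canned data rather than calling a real API.

```python
from abc import ABC, abstractmethod


class InfrastructureDriver(ABC):
    """Hypothetical CPI-style contract; the method name is an assumption."""

    @abstractmethod
    def list_instances(self) -> list[dict]:
        """Return the compute instances visible in this IaaS environment."""


# Registry mapping an IaaS type name to its driver class.
_DRIVERS: dict[str, type[InfrastructureDriver]] = {}


def register_driver(iaas_type: str):
    """Class decorator that registers a driver under an IaaS type name."""
    def decorator(cls):
        _DRIVERS[iaas_type] = cls
        return cls
    return decorator


@register_driver("openstack")
class FakeOpenStackDriver(InfrastructureDriver):
    """Stand-in driver that returns canned data instead of calling OpenStack."""

    def list_instances(self) -> list[dict]:
        return [{"id": "vm-1", "state": "running"}]


def get_driver(iaas_type: str) -> InfrastructureDriver:
    """Instantiate the driver registered for the given IaaS type."""
    try:
        return _DRIVERS[iaas_type]()
    except KeyError:
        raise ValueError(f"no driver registered for {iaas_type!r}") from None
```

With this shape, supporting another IaaS only requires a new class decorated with `register_driver`; the calling code keeps using `get_driver(...)` unchanged.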
(2) Monitoring platform
The monitoring platform provides a plug-in architecture that is easy to manage and configure. Its automatic discovery function greatly reduces the daily management workload, its varied data acquisition modes and APIs allow data to be collected flexibly, and its distributed system architecture supports monitoring a larger number of devices.
(3) Reporting center
The reporting center links the lower and upper layers of the resource intelligent operation and maintenance system in the cross-network environment. Following the standard interface specifications that the operation and maintenance platform defines for the capacity, asset, performance, alarm and other information of each resource type, it reports the monitoring data received from the cloud computing platforms and monitoring platforms in the different networks to the operation and maintenance platform for unified display.
(4) Operation and maintenance platform
The operation and maintenance platform adopts a multilayer architecture and a modular design. Its system functions are comprehensive and its modules are independent, so they can be freely combined to suit different customer requirements. It provides a rich set of monitoring item types, including CPU usage, CPU load, memory usage, network traffic, disk space usage, process state and number of processes. The overall architecture comprises seven parts: asset management, asset monitoring, a 3D machine room, alarm management, inspection management, automatic deployment and work order management. The platform is also highly extensible and can be integrated seamlessly with third-party products.
Drawings
Fig. 1 is a working block diagram of an inter-network resource intelligent operation and maintenance system.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer and more complete, the technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention, and based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative efforts belong to the scope of the present invention.
For complex operation and maintenance environments, the invention combines a cloud computing platform, a monitoring platform, a reporting center and a unified intelligent operation and maintenance platform, among other components, planned and deployed together, to solve the above problems.
The invention provides a resource intelligent operation and maintenance system based on a cross-network environment, in which the cloud computing platform and the monitoring platform collect resource monitoring data, the reporting center reports the monitoring data of different networks and different areas, and the operation and maintenance platform displays the monitoring data and running state in a comprehensive, unified, multidimensional and visual way, so that efficient and rapid management is achieved at low operation and maintenance cost.
Step one: collector installation
To collect monitoring data from a resource, the corresponding collector must be installed first. The collector differs according to the type of resource device.
1) For resource objects such as virtual machines, cloud storage, cloud hard disks and networks, monitoring is realized by installing and deploying collection plug-ins such as Cloud-Agent and Ceilometer on the monitoring nodes.
2) For physical servers, monitoring data is collected by installing a probe on the server or by enabling SNMP.
3) For storage devices, monitoring data is collected by enabling SNMP on the device.
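The three cases above amount to a lookup from resource category to collection method. A minimal sketch of that lookup follows; the category keys and the function name `collection_methods_for` are illustrative assumptions, not terms defined by the patent.

```python
# Hypothetical mapping from resource category to the collection methods listed
# in the steps above. The category keys are illustrative names, not terms
# taken from the patent.
COLLECTION_METHODS = {
    "cloud_resource": ["cloud-agent", "ceilometer"],  # VMs, cloud storage, cloud disks, networks
    "physical_server": ["probe", "snmp"],             # installed probe or SNMP enabled
    "storage_device": ["snmp"],                       # SNMP enabled on the device
}


def collection_methods_for(category: str) -> list[str]:
    """Return the collection methods applicable to a resource category."""
    if category not in COLLECTION_METHODS:
        raise ValueError(f"unknown resource category: {category!r}")
    # Return a copy so callers cannot mutate the shared table.
    return list(COLLECTION_METHODS[category])
```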
Step two: monitoring data acquisition
The resource intelligent operation and maintenance system in the cross-network environment collects resource monitoring data mainly through the cloud computing platform, the monitoring platform, or autonomous reporting.
The cloud computing platform collects resource monitoring data according to configured monitoring object types, monitoring indexes and the like, and raises alarms on abnormal resources according to a configured alarm scheme. The monitoring collector and the Cloud-Agent collect monitoring information about virtual machines, containers, bare metal, cloud hard disks and the like in the infrastructure environment and send it to the monitoring bus. The monitoring, alarm and metering component, Ceilometer, obtains the raw data from the monitoring bus, parses the monitoring indexes and index values, and performs alarm evaluation. The cloud platform monitoring module obtains the monitoring information from the monitoring bus, parses and aggregates it, stores it in the database, and provides a REST API for upper layers to call.
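The flow from monitoring bus to alarm evaluation can be illustrated with an in-memory queue standing in for the bus. This is only a sketch under assumptions: `Sample`, `AlarmRule` and `evaluate` are hypothetical names, the rule is a simple "value above threshold" check, and nothing here reflects Ceilometer's real API or data model.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One monitoring sample placed on the bus by a collector."""
    resource_id: str
    metric: str
    value: float


@dataclass(frozen=True)
class AlarmRule:
    """Hypothetical alarm scheme entry: alarm when the metric exceeds the threshold."""
    metric: str
    threshold: float


def evaluate(bus: queue.Queue, rules: list[AlarmRule]) -> list[str]:
    """Drain the bus and return one message per sample that exceeds its threshold."""
    limits = {rule.metric: rule.threshold for rule in rules}
    alarms = []
    while True:
        try:
            sample = bus.get_nowait()
        except queue.Empty:
            break
        limit = limits.get(sample.metric)
        # Metrics without a configured rule are consumed but never alarm.
        if limit is not None and sample.value > limit:
            alarms.append(
                f"{sample.resource_id}: {sample.metric}={sample.value} exceeds {limit}"
            )
    return alarms
```

In a real deployment the bus would be a message queue shared by several consumers (alarm evaluation and the storage module both read from it), which a single `queue.Queue` does not model.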
The monitoring platform provides probes, SNMP, JMX, IPMI and other modes for collecting monitoring data from different resources, and also provides a rich set of APIs for the upper operation and maintenance platform to call.
For autonomous reporting, the resource intelligent operation and maintenance system in the cross-network environment defines a standard API for reporting monitoring information; a third-party application only needs to integrate with this interface according to the interface specification to report its own monitoring data.
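A standard reporting interface implies that incoming third-party payloads are checked against the specification before they are accepted. The sketch below shows one way such a check might look; the field names in `REQUIRED_FIELDS` are assumptions chosen for illustration, because the patent does not publish the actual interface specification.

```python
# Hypothetical report schema: field name -> accepted Python type(s).
# These fields are illustrative; the patent does not define the real ones.
REQUIRED_FIELDS = {
    "network": str,
    "resource_id": str,
    "resource_type": str,
    "metric": str,
    "value": (int, float),
    "timestamp": (int, float),
}


def validate_report(payload: dict) -> list[str]:
    """Return a list of problems found in the payload; an empty list means valid."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
            continue
        value = payload[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"wrong type for field: {field}")
    return errors
```

Returning every problem at once, rather than failing on the first one, lets the integrating application fix its payload in a single round trip.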
Step three: cross-network monitoring data reporting
As shown in Fig. 1, cross-network reporting of resource data is illustrated by taking the deployment of the A network and the video network as an example.
1. The A network and the video network each collect resource monitoring data for the devices within their own services.
2. The reporting center establishes a uniform, standard interface specification along dimensions such as monitored object type, performance index and alarm.
3. The A network and the video network each provide a jump server that can reach the upper operation and maintenance platform over the network, on which the reporting center is installed and deployed. The two jump servers do not need network connectivity with each other, but each must be able to communicate with the upper network.
4. The operation and maintenance platform integrates the resource monitoring data reported by each reporting center and finally presents it to users visually as reports or views.
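The topology in these steps can be sketched as several reporting centers that each push upward to one platform and never talk to each other. The classes below are an in-memory illustration only: `OperationsPlatform`, `ReportingCenter` and their methods are hypothetical names, and a direct method call stands in for the network request a real reporting center would make.

```python
from collections import defaultdict


class OperationsPlatform:
    """Upper-layer platform that every reporting center pushes to (in-memory stand-in)."""

    def __init__(self):
        self._by_network = defaultdict(list)

    def receive(self, network: str, records: list[dict]) -> None:
        """Accept a batch of records reported from one network."""
        self._by_network[network].extend(records)

    def unified_view(self) -> list[dict]:
        """Return all records across networks, each tagged with its source network."""
        return [
            {**record, "network": network}
            for network in sorted(self._by_network)
            for record in self._by_network[network]
        ]


class ReportingCenter:
    """Runs on one network's jump server; it only reports upward, never sideways."""

    def __init__(self, network: str, platform: OperationsPlatform):
        self.network = network
        self._platform = platform

    def report(self, records: list[dict]) -> None:
        self._platform.receive(self.network, records)
```

Note that the two `ReportingCenter` instances hold no reference to each other, which mirrors the requirement that the jump servers need connectivity only to the upper network.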
Step four: operation and maintenance platform overall management
The operation and maintenance platform is a new-generation operation and maintenance solution that integrates management, monitoring, visualization and analysis capabilities. It manages multiple systems and multiple devices in a unified way, raises alarms in real time and locates faults within seconds. It is highly extensible: by integrating seamlessly with the reporting centers through a standard REST API specification, it ensures that each reporting center can aggregate the monitoring data of all networks and areas into the operation and maintenance platform in real time.
Through views and report analysis, the operation and maintenance platform displays in real time the current usage and running state of the various resources in the system, and presents the monitoring data of resource devices to users more intuitively. This addresses the problems of repeated platform construction, uneven operation and maintenance levels and scattered resource monitoring data in the cross-network environment, and improves the user experience.
In summary, the cloud computing platform adapts to heterogeneous IaaS resources, the monitoring platform collects data automatically, the reporting center reports information across networks, and the operation and maintenance platform provides a global view and overall management.
The cloud computing platform adapts to heterogeneous IaaS resources as follows: a standard interface for IaaS environment access, the CPI, is designed with usability in mind while still meeting the requirements of resource requests and scheduling. It allows a variety of infrastructure environments to be accessed, including OpenStack, VMware, cloud OS and public clouds (AWS, Alibaba Cloud), among others.
After the IaaS environment is accessed, the system periodically synchronizes information of the IaaS environment, including clusters, networks, physical hosts, virtual machines, cloud hard disks, cloud storage and the like, and acquires key indexes such as CPU utilization rate, memory utilization rate, storage utilization rate, running state and the like through the collector, thereby providing data support for unified management of the upper operation and maintenance platform.
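Periodic synchronization usually reduces to comparing the latest inventory snapshot with the previous one. The sketch below shows that comparison step only; `diff_inventory` is a hypothetical helper, the snapshots are assumed to be dictionaries keyed by resource ID, and scheduling and the actual IaaS queries are left out.

```python
def diff_inventory(
    previous: dict[str, dict], current: dict[str, dict]
) -> dict[str, list[str]]:
    """Compare two inventory snapshots keyed by resource ID.

    Returns the IDs that were added, removed, or whose attributes changed
    between the previous synchronization and the current one.
    """
    prev_ids, cur_ids = set(previous), set(current)
    return {
        "added": sorted(cur_ids - prev_ids),
        "removed": sorted(prev_ids - cur_ids),
        "changed": sorted(
            rid for rid in prev_ids & cur_ids if previous[rid] != current[rid]
        ),
    }
```

For example, a virtual machine that moved from `running` to `stopped` between two synchronization rounds would appear under `changed`, so only that record needs to be pushed to the upper platform.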
The monitoring platform collects data automatically as follows: it provides a rich set of resource templates and APIs, brings resource devices under management through automatic discovery or manual entry, and collects monitoring data using several acquisition modes such as probes or SNMP.
The reporting center reports information across networks as follows: for resources in different networks, each network provides a jump server on which a reporting center is deployed, which solves the problem of integrating the resources in a unified way. Because the reporting center establishes a uniform standard for asset, capacity, performance, alarm and other data, each network only needs to report the data collected by its cloud computing platform and monitoring platform according to that standard.
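Reporting "according to that standard" means that records arriving from different source platforms must be converted to one common shape first. A minimal sketch of such a conversion follows; the source names, the source field names and the unified field names in `FIELD_MAPS` are all assumptions made for illustration.

```python
# Hypothetical per-source field mappings: source field name -> unified field name.
# Neither the source fields nor the unified fields come from the patent.
FIELD_MAPS = {
    "cloud_platform": {"instance_id": "resource_id", "cpu_util": "cpu_usage"},
    "monitoring_platform": {"host": "resource_id", "cpu": "cpu_usage"},
}


def normalize(source: str, record: dict) -> dict:
    """Rename a source-specific record's fields to the unified standard.

    Fields that the mapping does not know about are dropped.
    """
    if source not in FIELD_MAPS:
        raise ValueError(f"unknown source: {source!r}")
    mapping = FIELD_MAPS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}
```

Dropping unmapped fields is a design choice in this sketch; a stricter reporting center could instead reject records that contain fields outside the standard.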
The operation and maintenance platform provides a global view and overall management as follows: it is a new-generation operation and maintenance solution that integrates resource management, monitoring, alarming, visualization and analysis capabilities, and it uses threshold-based alarm detection to raise alarms in real time and locate faults within seconds. The platform presents data to users comprehensively, uniformly and intuitively through visualizations and charts, so that whether a resource device has a problem is clear at a glance. This addresses resource data dispersion, uneven operation and maintenance levels and repeated platform construction, and improves working efficiency.
The above description is only a preferred embodiment of the present invention, and is only used to illustrate the technical solutions of the present invention, and not to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. A resource intelligent operation and maintenance system based on a cross-network environment, characterized in that
the system comprises: a cloud computing platform, a monitoring platform, a reporting center and an operation and maintenance platform;
wherein
the cloud computing platform and the monitoring platform are used for acquiring resource monitoring data;
the reporting center is responsible for reporting the monitoring data of different networks and different areas; and
the operation and maintenance platform displays the monitoring data and the operation state.
2. The system of claim 1,
the cloud computing platform provides a set of standards for interfacing with IaaS, namely the infrastructure adaptation driver interface (CPI); the IaaS drivers built into the platform include VMware, OpenStack, Xen, Docker, industry-standard virtualization environments, or drivers written with reference to the standard that support other IaaS; a driver provides calls for managing underlying resources and connects to the cloud platform through a standard API (application program interface), so that a variety of infrastructure environments can be accessed.
3. The system of claim 2,
the monitoring platform provides a plug-in architecture with an automatic discovery function, collects data through its data acquisition modes and APIs, and has a distributed system architecture that supports monitoring a larger number of devices.
4. The system of claim 3,
monitoring of resource objects is realized by installing and deploying collection plug-ins on the monitoring nodes;
monitoring data of a physical server is collected by installing a probe on the server or enabling SNMP (Simple Network Management Protocol);
monitoring data of a storage device is collected by enabling SNMP on the storage device.
5. The system of claim 4,
the cloud computing platform collects resource monitoring data according to configured monitoring object types, monitoring indexes and the like, and raises alarms on abnormal resources according to a configured alarm scheme; the monitoring collector and the Cloud-Agent collect monitoring information in the infrastructure environment and send it to the monitoring bus;
the monitoring, alarm and metering component Ceilometer obtains raw data from the monitoring bus, parses the monitoring indexes and index values, and performs alarm evaluation;
the monitoring platform obtains monitoring information from the monitoring bus, parses and aggregates it, stores it in the database, and provides a REST API for upper layers to call.
6. The system of claim 1,
the reporting center follows the standard interface specifications that the operation and maintenance platform defines for the information of each resource type, and reports the monitoring data received from the cloud computing platforms and monitoring platforms in the different networks to the operation and maintenance platform for unified display.
7. The system of claim 1,
the operation and maintenance platform displays the current use condition and the running state of various resources in the system in real time through view and report analysis, and displays the monitoring data of the resource equipment more intuitively.
8. The system of claim 1,
the operation and maintenance platform adopts a multilayer architecture and a modular design, its modules are functionally independent and can be freely combined to suit different customer requirements, and the monitoring item types it provides include: CPU utilization, CPU load, memory utilization, network traffic, disk space utilization, process state and process count.
CN202111126506.3A 2021-09-26 2021-09-26 Resource intelligent operation and maintenance system based on cross-network environment Withdrawn CN114124662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111126506.3A CN114124662A (en) 2021-09-26 2021-09-26 Resource intelligent operation and maintenance system based on cross-network environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111126506.3A CN114124662A (en) 2021-09-26 2021-09-26 Resource intelligent operation and maintenance system based on cross-network environment

Publications (1)

Publication Number Publication Date
CN114124662A true CN114124662A (en) 2022-03-01

Family

ID=80441472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111126506.3A Withdrawn CN114124662A (en) 2021-09-26 2021-09-26 Resource intelligent operation and maintenance system based on cross-network environment

Country Status (1)

Country Link
CN (1) CN114124662A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115967657A (en) * 2022-12-20 2023-04-14 浪潮云信息技术股份公司 SDWAN-based cloud platform capacity acquisition method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220301