WO2018192569A1 - Device monitoring method, apparatus, and system

Info

Publication number
WO2018192569A1
Authority
WO
WIPO (PCT)
Prior art keywords
epg
big data
monitoring
epg device
online
Prior art date
Application number
PCT/CN2018/083905
Other languages
English (en)
French (fr)
Inventor
张彦垒
Original Assignee
南京中兴软件有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京中兴软件有限责任公司

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24 Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2404 Monitoring of server processing errors or hardware failure
    • H04N21/2405 Monitoring of the internal components or processes of the server, e.g. server load
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/254 Management at additional data server, e.g. shopping server, rights management server
    • H04N21/2541 Rights Management

Definitions

  • the present invention relates to the field of communications, and in particular to a device monitoring method, apparatus, and system.
  • IPTV: Internet Protocol Television
  • EPG: Electronic Program Guide
  • The large number of devices makes manual maintenance harder. It is difficult to monitor the performance indicators and online load of every EPG device, and device risks cannot be flagged in advance. Once a machine runs short of performance or even goes down, services are affected. Device performance therefore needs to be monitored in real time, and devices whose performance indicators are running high need to be checked in advance, so that device anomalies do not affect services.
  • Embodiments of the present invention provide a device monitoring method, apparatus, and system, so as to at least solve the problem that a large number of EPG devices cannot be monitored in real time.
  • a device monitoring method including: acquiring performance indicators and an online load of an EPG device in real time; and performing big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
  • a device monitoring apparatus including: an acquisition module configured to acquire performance indicators and an online load of an EPG device in real time; and a generation module configured to perform big data analysis on the performance indicators and online load acquired by the acquisition module to generate a monitoring result of the EPG device.
  • a device monitoring system including: a big data platform configured to acquire performance indicators and an online load of an EPG device in real time, and to perform big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device; the EPG device, configured to collect the performance indicators and online load through a Flume agent set up on it; and a big data portal configured to display the monitoring result of the EPG device.
  • a computer-readable storage medium having a device monitoring program stored thereon, wherein the device monitoring program, when executed by a processor, causes the processor to perform the steps of: acquiring performance indicators and an online load of an EPG device in real time; and performing big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
  • FIG. 1 is a flow chart of a device monitoring method according to an embodiment of the present invention.
  • FIGS. 2 and 3 are schematic structural diagrams of a device monitoring system according to an embodiment of the present invention.
  • FIGS. 4 to 6 are block diagrams showing the structure of a device monitoring apparatus according to various embodiments of the present invention.
  • FIG. 1 is a flow chart of a device monitoring method in accordance with an embodiment of the present invention.
  • As shown in FIG. 1, a device monitoring method according to an embodiment of the present invention may include steps S102 and S104.
  • In step S102, the performance indicators and online load of the EPG device are acquired in real time.
  • In step S104, big data analysis is performed on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
  • An application scenario of the device monitoring method includes, but is not limited to, monitoring an EPG device in an IPTV network.
  • In this application scenario, the performance indicators and online load of the EPG device are acquired in real time, and big data analysis is performed on them to generate a monitoring result of the EPG device. In other words, big data analysis makes it possible to monitor a large number of EPG devices in real time, which solves the problem that a large number of EPG devices could not be monitored in real time and achieves real-time monitoring of the EPG devices across the whole network.
  • The main purpose of this example is to monitor the performance and load of the EPG devices across the whole network based on big data, and to present the results in an intuitive interface so that devices at risk are flagged in real time. Once engineers spot a high-risk device, they can deal with it in advance, so that device anomalies do not affect services.
  • To achieve the above purpose, the present invention provides a big-data-based monitoring framework for monitoring the EPG device information of the whole network.
  • The monitoring framework includes: deploying a scheduled task on each EPG device to periodically generate a server performance log, which serves as the base data collected by the big data system; deploying a Flume agent on each EPG device to collect the server performance log; deploying a Hadoop-based big data platform environment, which includes installing the related services, initializing service data, configuring the presentation dimensions of the EPG devices, and setting the correspondence between the name and ID of each EPG device and the group it belongs to, as a basis for the later multi-dimensional display; deploying a Flume server in the big data platform environment to receive the collected base data; and collecting the incremental logs generated by the EPG device through the Flume agent deployed on it, so that the big data platform can analyze the online user load.
  • The big data platform processes the collected data to generate an index for display by the big data portal, where the index may include (but is not limited to) an ElasticSearch index. When EPG device information is queried in the big data portal interface, the details of each EPG device are displayed by dimension; the key indices can be sorted, and key indices above the threshold are marked with an alarm to indicate that an EPG device is at high risk and to prompt maintenance personnel to investigate promptly.
  • According to an embodiment of the present invention, before the performance indicators and online load of the EPG device are acquired, the method further includes: setting up a Hadoop-based big data platform environment, wherein the environment is set up through at least one of the following operations: installing services used for the monitoring operation; configuring the presentation dimensions of the EPG device; and setting the correspondence between the name and ID of the EPG device and the group to which the EPG device belongs.
  • Acquiring the performance indicators and online load of the EPG device in real time includes: collecting the performance indicators and online load through a Flume agent on the EPG device, wherein the performance indicators and online loads collected on different EPG devices enter the Hadoop Distributed File System (HDFS) through a Flume server.
  • the device monitoring method may further include: generating an index (for example, an ElasticSearch index) so that the big data portal displays the monitoring result according to the index.
  • The big data portal displaying the monitoring result according to the index may include (but is not limited to): the big data portal displaying the monitoring result of the EPG device by dimension, which may include: the big data portal sorting the monitoring results of the EPG devices according to a key index; and, when the key index of a monitoring result is higher than the threshold, the big data portal marking the corresponding EPG device with an alarm.
  • FIGS. 2 and 3 are schematic structural diagrams of a device monitoring system according to an embodiment of the present invention.
  • The embodiment shown in FIGS. 2 and 3 uses a big data cluster to monitor the EPG network elements (i.e., EPG devices) in an IPTV network and displays the relevant key indices to give early warning.
  • Referring to FIGS. 2 and 3, a device monitoring system according to an embodiment of the present invention may include: an EPG device on which a Flume agent for log collection is deployed; a big data cluster including the various service components that support cluster operation; and a big data portal interface for displaying EPG device information.
  • Flume is a distributed, massive log collection, aggregation and transport system provided by Cloudera. Flume supports the customization of various data senders in the logging system for data collection. At the same time, Flume provides the ability to simply process data and write it to various data recipients.
  • As shown in FIG. 2, the scheduled task deployed on the EPG device can periodically generate EPG performance logs and behavior logs to record how the EPG device is used, and the Flume agent can collect the incremental logs of the EPG device in real time and provide the log information to the big data cluster for analysis through, for example, a Best Effort (BE) interface.
  • the Flume agent can reside as a program process in the EPG device to collect incremental logs from the EPG device.
  • FIG. 2 also shows that an Emergency Alert System (EAS) device together with each EPG device constitutes an EPG cluster.
  • Flume agents can also be deployed on EAS devices to provide log information for EAS devices to big data clusters.
  • the log files that the Flume agent can collect include (but are not limited to): an EPG device performance log (for example, a vmstat.log file); and an EPG device service usage log. These log files can be used to analyze the business performance metrics of the EPG device.
  • the Flume agent as a program process can always exist in the server process of the EPG device.
  • Flume can collect data from a variety of data sources, including (but not limited to) the console, remote procedure calls (RPC), files, the tail command (which displays the end of a specified file and is commonly used to view log files), and system logs.
  • According to an embodiment of the present invention, logs can be collected with the tail command. When the log being collected grows, the incremental log file content is therefore collected automatically and sent to the Flume server on the big data cluster side, which achieves real-time collection.
  • As shown in FIG. 3, the big data cluster used in the device monitoring system may include the various service components that support cluster operation, including (but not limited to) a Flume server, HDFS, HIVE, an Integrated Access Gateway (IAG), a TOMCAT server cluster, and embedded storage (ES).
  • HIVE is a data warehouse infrastructure built on Hadoop. HIVE provides a set of tools that can be used for Extract-Transform-Load (ETL), and is a mechanism for querying and analyzing big data stored in Hadoop.
  • the TOMCAT server is a web application server and is a lightweight application server.
  • According to Flume's architecture, the Flume server can run as a program process on the big data cluster side. It receives, in real time, the log information collected by the Flume agents on the EPG devices and records the received information into HDFS. The logs collected by the Flume agents deployed on different EPG devices can therefore all enter HDFS through the Flume server. A scheduled task program can then extract information from the data recorded in HDFS and enter it into the index (for example, the ElasticSearch index), at which point data collection from the EPG devices to the big data environment is essentially complete.
  • The monitoring results are displayed through a big data portal interface. For example, the EPG device information entered into the ElasticSearch index can be queried through the RESTful web interface that ElasticSearch provides.
  • A query request can be sent to the interface provided by the big data cluster (for example, the Rest interface shown in FIG. 3) according to the combination of dimension conditions selected in the big data portal interface, and a response is assembled from the information the big data cluster returns and presented to maintenance personnel on the page as intuitive, friendly prompts. For example, EPG devices with a high resource usage rate can be sorted and an alarm function provided, so that maintenance personnel can discover the status of an EPG device in time and prevent it from failing through resource exhaustion.
  • Hadoop is a distributed system infrastructure developed by the Apache Foundation. Hadoop enables efficient computing and is the underlying framework for executing distributed applications on large clusters of general purpose computing devices.
  • the device monitoring system according to an embodiment of the present invention can be built on top of Hadoop to fully utilize the high-speed computing and storage capabilities of the cluster to reliably store and process massive amounts of PB-level data.
  • HDFS is a distributed file system built on top of PC hardware, ideal for applications that need to access massive amounts of data.
  • the biggest difference between HDFS and existing distributed systems is that HDFS is highly fault tolerant and low cost. Therefore, the visual operation and maintenance system built on Hadoop can realize powerful external service capabilities and can adapt to the massive data processing capability under large-scale networking conditions, so that it can be well applied in IPTV operation and maintenance.
  • the monitoring entry of the EPG device may be provided, and the EPG device that meets the condition may be listed according to the region, according to the query condition, etc., to query related information of the EPG device.
  • The related information may include: basic EPG device information, including (but not limited to) the EPG device ID, the EPG device name, the group to which the EPG device belongs, the server type of the EPG device, the equipment room it is located in, its geographical location, and a line graph of the CPU and/or memory usage of the EPG device; and EPG device log analysis, including (but not limited to) page status statistics, service authentication failure statistics, service order failure statistics, and the page response time of the EPG device.
  • The technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, or optical disk) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.
  • FIGS. 4 to 6 are block diagrams showing the structure of a device monitoring apparatus according to various embodiments of the present invention.
  • a device monitoring apparatus is further provided for implementing a device monitoring method according to various embodiments of the present invention, and details are not described herein.
  • As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments can be implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible.
  • the device monitoring apparatus may include an obtaining module 42 and a generating module 44.
  • the obtaining module 42 is configured to acquire the performance indicator and the online load of the EPG device in real time
  • the generating module 44 is configured to perform big data analysis on the obtained performance indicator and the online load to generate a monitoring result of the EPG device.
  • An application scenario of the device monitoring apparatus includes, but is not limited to, monitoring an EPG device in an IPTV network.
  • the performance indicators and online load of the EPG device are obtained in real time, and the obtained performance index and the online load are analyzed by big data to generate a monitoring result of the EPG device. That is to say, based on big data analysis, a large number of EPG devices can be monitored in real time, thereby solving the problem of real-time monitoring of a large number of EPG devices, and achieving the technical effect of real-time monitoring of the entire network EPG device.
  • the device monitoring apparatus may further include a setting module 52.
  • Before the performance indicators and online load of the EPG device are acquired in real time, the setting module 52 is configured to set up the Hadoop-based big data platform environment.
  • The setting operation of the setting module 52 may include at least one of: installing services used for the monitoring operation; configuring the presentation dimensions of the EPG device; and setting the correspondence between the name and ID of the EPG device and the group to which the EPG device belongs.
  • the obtaining module 42 may be further configured to collect performance indicators and online loads through the Flume agent on the EPG device. Therefore, the performance indicators and online load collected on different EPG devices can enter HDFS through the Flume server.
  • the device monitoring apparatus may further include a processing module 62.
  • the processing module 62 is arranged to generate an index to cause the big data portal to present the monitoring results according to the index.
  • the index may include, but is not limited to, an ElasticSearch index.
  • The big data portal displaying the monitoring result according to the index may include (but is not limited to): the big data portal displaying the monitoring result of the EPG device by dimension, which may include: the big data portal sorting the monitoring results of the EPG devices according to a key index; and, when the key index of a monitoring result is higher than the threshold, the big data portal marking the corresponding EPG device with an alarm.
  • each of the above modules can be implemented by software or hardware.
  • each of the above modules may be located in the same processor, or each of the above modules may be located in different processors in any combination.
  • An embodiment of the present invention further provides a device monitoring system, including: a big data platform configured to acquire performance indicators and an online load of an EPG device, and to perform big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device; the EPG device, configured to collect the performance indicators and online load through a Flume agent set up on it; and a big data portal configured to display the monitoring result of the EPG device.
  • Embodiments of the present invention further provide a computer-readable storage medium having a device monitoring program stored thereon, wherein the device monitoring program, when executed by a processor, causes the processor to perform the steps of: acquiring performance indicators and an online load of an EPG device in real time; and performing big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
  • The computer-readable storage medium may include, but is not limited to, a USB flash drive, Read-Only Memory (ROM), Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media that can store program code.
  • The modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed across a network of multiple computing devices.
  • The modules or steps may be implemented as program code executable by a computing device, so that the program code can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from the one given here, or they may each be made into individual integrated circuit modules, or several of the modules or steps may be made into a single integrated circuit module.
  • the invention is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Debugging And Monitoring (AREA)

Abstract

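The present invention provides a device monitoring method, apparatus, and system. The device monitoring method includes: acquiring performance indicators and an online load of an EPG device in real time; and performing big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.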

Description

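Device monitoring method, apparatus, and system
Technical Field
The present invention relates to the field of communications, and in particular to a device monitoring method, apparatus, and system.
Background
As operators' Internet Protocol Television (IPTV) services continue to grow in scale, the number of Electronic Program Guide (EPG) devices in IPTV networks keeps increasing. The large number of devices makes manual maintenance harder: it is difficult to monitor the performance indicators and online load of every EPG device, and device risks cannot be flagged in advance. Once a machine runs short of performance or even goes down, services are affected. Device performance therefore needs to be monitored in real time, and devices whose performance indicators are running high need to be checked in advance, so that device anomalies do not affect services.
Summary
Embodiments of the present invention provide a device monitoring method, apparatus, and system, so as to at least solve the problem that a large number of EPG devices cannot be monitored in real time.
According to an embodiment of the present invention, a device monitoring method is provided, including: acquiring performance indicators and an online load of an EPG device in real time; and performing big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
According to an embodiment of the present invention, a device monitoring apparatus is provided, including: an acquisition module configured to acquire performance indicators and an online load of an EPG device in real time; and a generation module configured to perform big data analysis on the performance indicators and online load acquired by the acquisition module to generate a monitoring result of the EPG device.
According to an embodiment of the present invention, a device monitoring system is provided, including: a big data platform configured to acquire performance indicators and an online load of an EPG device in real time, and to perform big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device; the EPG device, configured to collect the performance indicators and online load through a Flume agent set up on it; and a big data portal configured to display the monitoring result of the EPG device.
According to an embodiment of the present invention, a computer-readable storage medium is provided, having a device monitoring program stored thereon, wherein the device monitoring program, when executed by a processor, causes the processor to perform the steps of: acquiring performance indicators and an online load of an EPG device in real time; and performing big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.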
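Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions serve to explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of a device monitoring method according to an embodiment of the present invention;
FIGS. 2 and 3 are schematic structural diagrams of a device monitoring system according to an embodiment of the present invention; and
FIGS. 4 to 6 are structural block diagrams of a device monitoring apparatus according to various embodiments of the present invention.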
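Detailed Description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence.
FIG. 1 is a flowchart of a device monitoring method according to an embodiment of the present invention.
As shown in FIG. 1, a device monitoring method according to an embodiment of the present invention may include steps S102 and S104.
In step S102, the performance indicators and online load of the EPG device are acquired in real time.
In step S104, big data analysis is performed on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
Application scenarios of the device monitoring method according to an embodiment of the present invention include (but are not limited to) monitoring the EPG devices in an IPTV network. In this scenario, the performance indicators and online load of the EPG device are acquired in real time, and big data analysis is performed on them to generate a monitoring result of the EPG device. In other words, big data analysis makes it possible to monitor a large number of EPG devices in real time, which solves the problem that a large number of EPG devices could not be monitored in real time and achieves real-time monitoring of the EPG devices across the whole network.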
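The two steps S102 and S104 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Sample` fields, the function names, and the choice of average CPU and peak online users as the "monitoring result" are all hypothetical, and a plain in-memory aggregation stands in for the big data analysis.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    device_id: str      # hypothetical identifier of an EPG device
    cpu_percent: float  # one example of a performance indicator
    online_users: int   # the online load


def acquire(raw_rows):
    """Step S102 (sketch): turn raw (device_id, cpu, users) rows into samples."""
    return [Sample(d, float(c), int(u)) for d, c, u in raw_rows]


def analyze(samples):
    """Step S104 (sketch): aggregate samples per device into a monitoring result."""
    by_device = {}
    for s in samples:
        by_device.setdefault(s.device_id, []).append(s)
    return {
        dev: {
            "avg_cpu_percent": mean(x.cpu_percent for x in rows),
            "peak_online_users": max(x.online_users for x in rows),
        }
        for dev, rows in by_device.items()
    }
```

In a real deployment the aggregation would run on the big data platform rather than in a single process; the sketch only shows the data flow from acquisition to result.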
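This embodiment is illustrated below with a specific example.
The main purpose of this example is to monitor the performance and load of the EPG devices across the whole network based on big data, and to present the results in an intuitive interface so that devices at risk are flagged in real time. Once engineers spot a high-risk device, they can deal with it in advance, so that device anomalies do not affect services.
To achieve the above purpose, the present invention provides a big-data-based monitoring framework for monitoring the EPG device information of the whole network. The monitoring framework includes: deploying a scheduled task on each EPG device to periodically generate a server performance log, which serves as the base data collected by the big data system; deploying a Flume agent on each EPG device to collect the server performance log; deploying a Hadoop-based big data platform environment, which includes installing the related services, initializing service data, configuring the presentation dimensions of the EPG devices, and setting the correspondence between the name and ID of each EPG device and the group it belongs to, as a basis for the later multi-dimensional display; deploying a Flume server in the big data platform environment to receive the collected base data; collecting the incremental logs generated by the EPG device through the Flume agent deployed on it, so that the big data platform can analyze the online user load; having the big data platform process the collected data to generate an index for display by the big data portal, where the index may include (but is not limited to) an ElasticSearch index; and, when EPG device information is queried in the big data portal interface, displaying the details of each EPG device by dimension, where the key indices can be sorted and key indices above the threshold are marked with an alarm to indicate that an EPG device is at high risk and to prompt maintenance personnel to investigate promptly.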
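The first element of the framework, a scheduled task that periodically writes a server performance log, might look like the sketch below. The log line layout (`timestamp device_id key=value ...`), the field names, and both function names are assumptions made for illustration; the source does not specify the log format beyond mentioning a vmstat.log file as one example.

```python
import time


def format_perf_line(device_id, cpu_idle_percent, free_mem_kb, now=None):
    """Format one performance log line; `now` is seconds since the epoch (UTC)."""
    ts = time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(now))
    return f"{ts} {device_id} cpu_idle={cpu_idle_percent} free_kb={free_mem_kb}"


def append_perf_line(path, line):
    """Append the line to the log file that a collection agent would follow."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

A scheduler such as cron would call these at a fixed interval; appending rather than rewriting keeps the file suitable for incremental collection.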
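According to an embodiment of the present invention, before the performance indicators and online load of the EPG device are acquired, the method further includes: setting up a Hadoop-based big data platform environment, wherein the environment is set up through at least one of the following operations: installing services used for the monitoring operation; configuring the presentation dimensions of the EPG device; and setting the correspondence between the name and ID of the EPG device and the group to which the EPG device belongs.
Setting up the Hadoop-based big data platform environment makes it possible to monitor the performance and load of the EPG devices across the whole network on the big data platform.
According to an embodiment of the present invention, acquiring the performance indicators and online load of the EPG device in real time includes: collecting the performance indicators and online load through a Flume agent on the EPG device, wherein the performance indicators and online loads collected on different EPG devices enter the Hadoop Distributed File System (HDFS) through a Flume server.
The device monitoring method according to an embodiment of the present invention may further include: generating an index (for example, an ElasticSearch index) so that the big data portal displays the monitoring result according to the index.
According to an embodiment of the present invention, the big data portal displaying the monitoring result according to the index may include (but is not limited to): the big data portal displaying the monitoring result of the EPG device by dimension, which may include: the big data portal sorting the monitoring results of the EPG devices according to a key index; and, when the key index of a monitoring result is higher than the threshold, the big data portal marking the corresponding EPG device with an alarm.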
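The sort-and-flag behavior of the portal can be sketched in a few lines. The function name, the record layout, and the `alarm` field are hypothetical; the only behavior taken from the text is sorting by a key index and marking devices whose key index is strictly higher than the threshold.

```python
def rank_and_flag(results, key, threshold):
    """Sort monitoring results by a key index (highest first) and flag alarms."""
    ranked = sorted(results, key=lambda r: r[key], reverse=True)
    # Copy each record and add an alarm flag; a value equal to the threshold is not flagged.
    return [dict(r, alarm=r[key] > threshold) for r in ranked]
```

For example, with a CPU threshold of 80.0, a device at 92.5 is listed first and flagged, while a device at exactly 80.0 is listed but not flagged.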
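This embodiment is illustrated below with a specific example.
FIGS. 2 and 3 are schematic structural diagrams of a device monitoring system according to an embodiment of the present invention.
The embodiment shown in FIGS. 2 and 3 uses a big data cluster to monitor the EPG network elements (i.e., EPG devices) in an IPTV network and displays the relevant key indices to give early warning.
Referring to FIGS. 2 and 3, a device monitoring system according to an embodiment of the present invention may include: an EPG device on which a Flume agent for log collection is deployed; a big data cluster including the various service components that support cluster operation; and a big data portal interface for displaying EPG device information.
Flume is a distributed system provided by Cloudera for collecting, aggregating, and transporting massive volumes of logs. Flume supports customizing various data senders in a logging system for data collection, and it can also perform simple processing on the data and write it to various data receivers.
As shown in FIG. 2, the scheduled task deployed on the EPG device can periodically generate EPG performance logs and behavior logs to record how the EPG device is used, and the Flume agent can collect the incremental logs of the EPG device in real time and provide the log information to the big data cluster for analysis through, for example, a Best Effort (BE) interface. The Flume agent can stay resident on the EPG device as a program process to collect the device's incremental logs.
In addition, FIG. 2 shows that an Emergency Alert System (EAS) device, together with the EPG devices, forms an EPG cluster. A Flume agent can also be deployed on the EAS device to provide the EAS device's log information to the big data cluster.
According to an embodiment of the present invention, the log files that the Flume agent can collect include (but are not limited to): EPG device performance logs (for example, a vmstat.log file); and EPG device service usage logs. These log files can be used to analyze the service performance indicators of the EPG device.
The Flume agent can remain in the server processes of the EPG device as a program process. Flume can collect data from a variety of data sources, including (but not limited to) the console, remote procedure calls (RPC), files, the tail command (which displays the end of a specified file and is commonly used to view log files), and system logs. According to an embodiment of the present invention, logs can be collected with the tail command. When the log being collected grows, the incremental log file content is therefore collected automatically and sent to the Flume server on the big data cluster side, which achieves real-time collection.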
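The idea behind tail-style incremental collection is that each poll reads only what was appended since the previous poll. The sketch below illustrates that idea with plain file offsets; it is not how Flume itself is implemented, and the function name and return shape are hypothetical.

```python
def read_increment(path, offset):
    """Return (new_complete_lines, new_offset) for data appended after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume complete lines only; a partial trailing line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end
```

A collector would call `read_increment` in a loop, forward the returned lines, and persist the new offset so that a restart does not resend old log content.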
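As shown in FIG. 3, the big data cluster used in the device monitoring system according to an embodiment of the present invention may include the various service components that support cluster operation, including (but not limited to) a Flume server, HDFS, HIVE, an Integrated Access Gateway (IAG), a TOMCAT server cluster, and embedded storage (ES). HIVE is a data warehouse infrastructure built on Hadoop. HIVE provides a set of tools that can be used for Extract-Transform-Load (ETL), and is a mechanism for querying and analyzing big data stored in Hadoop. The TOMCAT server is a web application server and counts as a lightweight application server.
According to Flume's architecture, the Flume server can run as a program process on the big data cluster side. It receives, in real time, the log information collected by the Flume agents on the EPG devices and records the received information into HDFS. The logs collected by the Flume agents deployed on different EPG devices can therefore all enter HDFS through the Flume server. A scheduled task program can then extract information from the data recorded in HDFS and enter it into the index (for example, the ElasticSearch index), at which point data collection from the EPG devices to the big data environment is essentially complete.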
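The extraction step, in which a scheduled task turns raw log lines into index entries, could be sketched as below. The sketch assumes a hypothetical log line layout of `timestamp device_id key=value ...` with numeric values, and a hypothetical index name; the output follows the newline-delimited action/document pairing used by the Elasticsearch bulk API, but the body is only built here, not sent.

```python
import json


def to_bulk_body(index_name, log_lines):
    """Build a newline-delimited bulk body: one action line plus one document per log line."""
    out = []
    for line in log_lines:
        ts, device_id, *pairs = line.split()
        doc = {"timestamp": ts, "device_id": device_id}
        for pair in pairs:
            key, value = pair.split("=", 1)
            doc[key] = float(value)  # assumes every value is numeric
        out.append(json.dumps({"index": {"_index": index_name}}))
        out.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(out) + "\n" if out else ""
```

Keeping the parsing separate from the transport makes the extraction logic easy to check before it is wired to a real cluster.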
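The monitoring results are displayed through a big data portal interface. For example, the EPG device information entered into the ElasticSearch index can be queried through the RESTful web interface that ElasticSearch provides.
A query request can be sent to the interface provided by the big data cluster (for example, the Rest interface shown in FIG. 3) according to the combination of dimension conditions selected in the big data portal interface, and a response is assembled from the information the big data cluster returns and presented to maintenance personnel on the page as intuitive, friendly prompts. For example, EPG devices with a high resource usage rate can be sorted and an alarm function provided, so that maintenance personnel can discover the status of an EPG device in time and prevent it from failing through resource exhaustion.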
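A portal query built from a combination of dimension conditions could be expressed as a search request body like the one below. The field names (`group`, `region`, `cpu_usage`) and the function name are hypothetical; the structure uses the standard `bool`/`filter`, `sort`, and `size` elements of the Elasticsearch query DSL, and the dictionary would be sent as the JSON body of a `_search` request.

```python
def build_device_query(group=None, region=None, sort_field="cpu_usage", size=20):
    """Build a search body that filters by the chosen dimensions and sorts descending."""
    filters = []
    if group is not None:
        filters.append({"term": {"group": group}})
    if region is not None:
        filters.append({"term": {"region": region}})
    return {
        "query": {"bool": {"filter": filters}},
        "sort": [{sort_field: {"order": "desc"}}],
        "size": size,
    }
```

Sorting in the query keeps the devices with the highest usage at the top of the response, which matches the portal's goal of surfacing high-risk devices first.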
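Hadoop is a distributed system infrastructure developed by the Apache Foundation. Hadoop enables efficient computing and is the underlying framework for running distributed applications on large clusters of general-purpose computing devices. The device monitoring system according to an embodiment of the present invention can be built on Hadoop, making full use of the cluster's high-speed computing and storage capabilities to reliably store and process massive, PB-scale data.
Hadoop itself runs on HDFS and the MapReduce distributed parallel programming framework on large-scale clusters. HDFS is a distributed file system built on PC hardware and is well suited to applications that need to access massive amounts of data. The biggest difference between HDFS and existing distributed systems is that HDFS is highly fault tolerant and low cost. A visual operation and maintenance system built on Hadoop can therefore offer strong external service capabilities and handle the massive data processing needed in large-scale networks, which makes it well suited to IPTV operation and maintenance.
According to an embodiment of the present invention, a monitoring entry for EPG devices can be provided, and the EPG devices that meet given conditions can be listed by region, by query condition, and so on, in order to query the related information of the EPG devices. The related information may include: basic EPG device information, including (but not limited to) the EPG device ID, the EPG device name, the group to which the EPG device belongs, the server type of the EPG device, the equipment room it is located in, its geographical location, and a line graph of the CPU and/or memory usage of the EPG device; and EPG device log analysis, including (but not limited to) page status statistics, service authentication failure statistics, service order failure statistics, and the page response time of the EPG device. Information such as a line graph of the EPG device's response time and the EPG device's page response time can thus be presented.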
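The log analysis items listed above (page status statistics, authentication and order failure counts, page response time) can be illustrated with a small summary function. The record fields (`status`, `event`, `failed`, `response_ms`) are hypothetical, since the source does not describe the layout of the service usage log.

```python
from collections import Counter
from statistics import mean


def summarize_service_log(records):
    """Summarize parsed service log records for one EPG device."""
    statuses = Counter(r["status"] for r in records)
    failures = Counter(r["event"] for r in records if r.get("failed"))
    return {
        "page_status": dict(statuses),
        "auth_failures": failures.get("auth", 0),
        "order_failures": failures.get("order", 0),
        # None signals that there were no records to average.
        "avg_response_ms": mean(r["response_ms"] for r in records) if records else None,
    }
```

Computing the same summary per time bucket would give the points for the response-time line graph mentioned in the text.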
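From the above description of the embodiments, those skilled in the art will clearly understand that the methods according to the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, and of course can also be implemented in hardware. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, or optical disk) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.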
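FIGS. 4 to 6 are structural block diagrams of a device monitoring apparatus according to various embodiments of the present invention.
According to an embodiment of the present invention, a device monitoring apparatus is further provided for implementing the device monitoring method according to the various embodiments of the present invention; what has already been explained is not repeated. As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments can be implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible.
As shown in FIG. 4, the device monitoring apparatus according to an embodiment of the present invention may include an acquisition module 42 and a generation module 44.
The acquisition module 42 is configured to acquire the performance indicators and online load of the EPG device in real time, and the generation module 44 is configured to perform big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
Application scenarios of the device monitoring apparatus according to an embodiment of the present invention include (but are not limited to) monitoring the EPG devices in an IPTV network. In this scenario, the performance indicators and online load of the EPG device are acquired in real time, and big data analysis is performed on them to generate a monitoring result of the EPG device. In other words, big data analysis makes it possible to monitor a large number of EPG devices in real time, which solves the problem that a large number of EPG devices could not be monitored in real time and achieves real-time monitoring of the EPG devices across the whole network.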
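As shown in FIG. 5, the device monitoring apparatus according to an embodiment of the present invention may further include a setting module 52.
Before the performance indicators and online load of the EPG device are acquired in real time, the setting module 52 is configured to set up the Hadoop-based big data platform environment. The setting operation of the setting module 52 may include at least one of the following operations: installing services used for the monitoring operation; configuring the presentation dimensions of the EPG device; and setting the correspondence between the name and ID of the EPG device and the group to which the EPG device belongs.
According to an embodiment of the present invention, the acquisition module 42 may further be configured to collect the performance indicators and online load through the Flume agent on the EPG device. The performance indicators and online loads collected on different EPG devices can thus enter HDFS through the Flume server.
As shown in FIG. 6, the device monitoring apparatus according to an embodiment of the present invention may further include a processing module 62.
The processing module 62 is configured to generate an index so that the big data portal displays the monitoring result according to the index. According to an embodiment of the present invention, the index may include (but is not limited to) an ElasticSearch index.
According to an embodiment of the present invention, the big data portal displaying the monitoring result according to the index may include (but is not limited to): the big data portal displaying the monitoring result of the EPG device by dimension, which may include: the big data portal sorting the monitoring results of the EPG devices according to a key index; and, when the key index of a monitoring result is higher than the threshold, the big data portal marking the corresponding EPG device with an alarm.
It should be noted that each of the above modules can be implemented in software or hardware. Where the modules are implemented in hardware, they may all be located in the same processor, or they may be located in different processors in any combination.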
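An embodiment of the present invention further provides a device monitoring system, including: a big data platform configured to acquire performance indicators and an online load of an EPG device, and to perform big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device; the EPG device, configured to collect the performance indicators and online load through a Flume agent set up on it; and a big data portal configured to display the monitoring result of the EPG device.
An embodiment of the present invention further provides a computer-readable storage medium having a device monitoring program stored thereon, wherein the device monitoring program, when executed by a processor, causes the processor to perform the steps of: acquiring performance indicators and an online load of an EPG device in real time; and performing big data analysis on the acquired performance indicators and online load to generate a monitoring result of the EPG device.
According to an embodiment of the present invention, the computer-readable storage medium may include (but is not limited to): a USB flash drive, Read-Only Memory (ROM), Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media that can store program code.
Those skilled in the art will appreciate that the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed across a network of multiple computing devices. According to an embodiment of the present invention, the modules or steps may be implemented as program code executable by a computing device, so that the program code can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from the one given here, or they may each be made into individual integrated circuit modules, or several of the modules or steps may be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the present invention. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within its scope of protection.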

Claims (12)

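  1. A device monitoring method, comprising:
    acquiring, in real time, performance indicators and online load of an electronic program guide (EPG) device; and
    performing big data analysis on the acquired performance indicators and online load to generate a monitoring result for the EPG device.
  2. The method according to claim 1, wherein, before acquiring the performance indicators and online load of the EPG device in real time, the method further comprises:
    setting up a Hadoop-based big data platform environment,
    wherein the Hadoop-based big data platform environment is set up by at least one of the following operations:
    installing a service used for monitoring operations;
    configuring the display dimensions of the EPG device; and
    setting a correspondence between the name and ID of the EPG device and the group to which the EPG device belongs.
  3. The method according to claim 1, wherein acquiring the performance indicators and online load of the EPG device in real time comprises:
    collecting the performance indicators and online load through a Flume agent on the EPG device,
    wherein the performance indicators and online load collected on different EPG devices enter the Hadoop Distributed File System (HDFS) through a Flume server.
  4. The method according to claim 1, further comprising:
    generating an index so that a big data portal presents the monitoring result according to the index.
  5. The method according to claim 4, wherein the big data portal presenting the monitoring result according to the index comprises:
    the big data portal presenting the monitoring result of the EPG device by dimension.
  6. The method according to claim 5, wherein the big data portal presenting the monitoring result of the EPG device by dimension comprises:
    the big data portal sorting the monitoring results of the EPG devices by a key indicator; and
    when the key indicator of a monitoring result is above a threshold, the big data portal marking the corresponding EPG device with an alert.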
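  7. A device monitoring apparatus, comprising:
    an acquisition module configured to acquire, in real time, performance indicators and online load of an electronic program guide (EPG) device; and
    a generation module configured to perform big data analysis on the performance indicators and online load acquired by the acquisition module to generate a monitoring result for the EPG device.
  8. The apparatus according to claim 7, further comprising:
    a setting module configured to set up a Hadoop-based big data platform environment before the performance indicators and online load of the EPG device are acquired in real time,
    wherein the setup performed by the setting module comprises at least one of the following operations:
    installing a service used for monitoring operations;
    configuring the display dimensions of the EPG device; and
    setting a correspondence between the name and ID of the EPG device and the group to which the EPG device belongs.
  9. The apparatus according to claim 7, wherein the acquisition module is further configured to collect the performance indicators and online load through a Flume agent on the EPG device,
    wherein the performance indicators and online load collected on different EPG devices enter the Hadoop Distributed File System (HDFS) through a Flume server.
  10. The apparatus according to claim 7, further comprising:
    a processing module configured to generate an index so that a big data portal presents the monitoring result according to the index.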
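  11. A device monitoring system, comprising:
    a big data platform configured to acquire, in real time, performance indicators and online load of an electronic program guide (EPG) device, and to perform big data analysis on the acquired performance indicators and online load to generate a monitoring result for the EPG device;
    the EPG device, configured to collect the performance indicators and online load through a Flume agent set up on it; and
    a big data portal configured to present the monitoring result of the EPG device.
  12. A computer-readable storage medium storing a device monitoring program which, when executed by a processor, causes the processor to perform the device monitoring method according to any one of claims 1 to 6.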
PCT/CN2018/083905 2017-04-21 2018-04-20 Device monitoring method, apparatus, and system WO2018192569A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710265366.5 2017-04-21
CN201710265366.5A CN108737855A (zh) 2017-04-21 2017-04-21 Device monitoring method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2018192569A1 true WO2018192569A1 (zh) 2018-10-25

Family

ID=63856208

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/083905 WO2018192569A1 (zh) 2017-04-21 2018-04-20 Device monitoring method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN108737855A (zh)
WO (1) WO2018192569A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115292150B (zh) * 2022-10-09 2023-04-07 帕科视讯科技(杭州)股份有限公司 Method for monitoring the health status of IPTV EPG services based on an AI algorithm

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
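CN102291616A (zh) * 2011-09-03 2011-12-21 四川公用信息产业有限责任公司 IPTV terminal management system
CN104735144A (zh) * 2015-03-20 2015-06-24 努比亚技术有限公司 Method and server for changing terminal state based on big data
CN105357549A (zh) * 2015-11-09 2016-02-24 天津网络广播电视台有限公司 Set-top box data collection system and data collection method
CN105979273A (zh) * 2016-05-06 2016-09-28 苏州清云网络科技有限公司 Cloud monitoring and cloud operation and maintenance of smart commercial televisions based on big data and cloud computing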

Also Published As

Publication number Publication date
CN108737855A (zh) 2018-11-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18787935; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 18787935; Country of ref document: EP; Kind code of ref document: A1)