TWI763177B - Management system and method for a plurality of network devices and computer readable medium


Info

Publication number
TWI763177B
Authority
TW
Taiwan
Prior art keywords
network
data
root cause
network devices
module
Prior art date
Application number
TW109144091A
Other languages
Chinese (zh)
Other versions
TW202223692A (en)
Inventor
洪民翰
鄧哲君
陳泓桔
黃雅泙
陳美君
Original Assignee
中華電信股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中華電信股份有限公司
Priority to TW109144091A
Application granted
Publication of TWI763177B
Publication of TW202223692A

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention discloses a management system and method for a plurality of network devices. First, event lists regarding multiple network devices, and operating data and network data of multiple network devices are recorded; next, data compensation or prediction, dimensional value collection, contribution filtering, dimensional grouping, and root cause analysis with respect to event lists, operating data, and network data are performed in order to obtain a root cause of network anomaly, such that an autonomous decision-making performance management of multiple network devices can be carried out based on the root cause of the network anomaly. The present invention further provides a computer readable medium for executing the management method for multiple network devices of the present invention.

Description

Management system, method and computer readable medium for multiple network devices

The present invention relates to technology for wireless communication network equipment and, more specifically, to a management system, method and computer readable medium for wireless communication network equipment.

As the volume of mobile data keeps growing and new services and application scenarios must be supported, fifth-generation mobile communication systems (5th Generation Wireless System, 5G) and future mobile communication systems are expected to carry enormous amounts of mobile data and device connections. For example, to provide suitable network services, a network splitting architecture separates a traditional base station into several units, such as a central unit (CU), a distributed unit (DU), and a radio unit (RU).

In addition, as emerging information technology applications such as mobile devices, the Internet of Things and cloud services become widespread, the network and the physical world have gradually merged. Emerging information and communication technologies make daily life more convenient, but as enterprise information architectures keep evolving, operations and maintenance face greater challenges: the boundary of the equipment to be maintained and the scale of the data keep growing, and fault localization becomes difficult. Operations management that relies on large numbers of human decisions can no longer cope with network systems that are large, complex, fast-changing, and continually evolving in their provisioning.

Therefore, how to effectively maintain wireless network equipment such as 5G equipment is a problem the industry urgently needs to solve.

To solve the above and other problems, the present invention discloses a management system, method and computer readable medium for multiple network devices.

The management system for multiple network devices of the present invention includes: an event ticket module for recording event tickets concerning multiple network devices; a system log module for recording operation data of the multiple network devices; a network log module for recording network data of the multiple network devices; and a root cause analysis module for sequentially performing data compensation or prediction, dimension value aggregation, contribution filtering, dimension grouping, and root cause analysis on the event tickets, the operation data, and the network data, so as to obtain the root cause of a network anomaly.

The management system for multiple network devices of the present invention further includes a performance management module, wherein the root cause analysis module transmits the root cause of the network anomaly to the performance management module, so that the performance management module performs autonomous decision-making performance management on the multiple network devices according to the root cause of the network anomaly.

The management system for multiple network devices of the present invention further includes a probe module, wherein the probe module, according to the fronthaul, midhaul and backhaul network environments, actively polls at regular intervals or passively receives monitoring indicator data of the multiple network devices, and notifies the event ticket module to generate an event ticket when the monitoring indicator data is abnormal.

The management method for multiple network devices of the present invention includes: recording event tickets concerning multiple network devices, and operation data and network data of the multiple network devices; and sequentially performing data compensation or prediction, dimension value aggregation, contribution filtering, dimension grouping, and root cause analysis on the event tickets, the operation data, and the network data, so as to obtain the root cause of a network anomaly.

In one embodiment, the operation data includes fault or anomaly data for equipment, provisioning, systems, and routing, and the network data includes network traffic or packet data.

In one embodiment, performing the data compensation or prediction includes performing it according to historical data; performing the contribution filtering includes calculating a contribution from the severity and the predicted normal value; performing the dimension grouping includes forming groups whose members are selected from the group consisting of system, event, network and device; and performing the root cause analysis includes performing within-group and between-group root cause analysis.

The present invention further provides a computer readable medium, for use in a computing device or computer, that stores instructions for executing the above management method for multiple network devices.

Accordingly, the management system, method and computer readable medium for multiple network devices of the present invention quickly identify the cause of an anomaly and adjust dynamically using automated tools, which can effectively improve the stability of device services and reduce information system maintenance costs.

10: Management system

11: Infrastructure module

12: System log module

13: Network log module

14: Probe module

15: Event ticket module

16: Root cause analysis module

17: Performance management module

S201~S207: Steps

FIG. 1 is a schematic block diagram of the management system for multiple network devices of the present invention.

FIG. 2 is a schematic flowchart of the management method for multiple network devices of the present invention.

The following specific embodiments illustrate how the present application is implemented; those skilled in the art can readily understand its other advantages and effects from the content disclosed herein. The structures, proportions, sizes and the like shown in the accompanying drawings serve only to accompany the content disclosed in the specification, for the understanding and reading of those skilled in the art, and are not intended to limit the conditions under which the present application can be implemented. Any modification, change or adjustment that does not affect the effects the present application can produce or the purposes it can achieve therefore still falls within the scope that the disclosed technical content can cover.

Referring to FIG. 1, the management system 10 for multiple network devices of the present invention includes the modules described below. Each module of management system 10 in FIG. 1 may be software, hardware or firmware: if hardware, it may be a processing unit, processor, computer or server with data processing and computing capability; if software or firmware, it may comprise instructions executable by a processing unit, processor, computer or server.

Infrastructure module 11, typically at the resource management layer, records device information of the network devices, covering the mobile network and/or the physical network, together with the connections between devices, for example the physical links and network routing table information of fronthaul, midhaul and backhaul network devices. Here, the fronthaul network environment refers to the segment between a radio unit (RU) and a distributed unit (DU); the midhaul network environment refers to the segment between a DU and a central unit (CU); and the backhaul network environment refers to the segment between a CU and the core network (Core).
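The three segment definitions above amount to a lookup from a pair of unit types to a transport segment. A minimal sketch follows; the names `SEGMENTS` and `segment_of` are hypothetical and are not taken from the disclosure.

```python
# Hypothetical lookup: which transport segment a link between two unit types belongs to.
SEGMENTS = {
    frozenset({"RU", "DU"}): "fronthaul",
    frozenset({"DU", "CU"}): "midhaul",
    frozenset({"CU", "Core"}): "backhaul",
}


def segment_of(a: str, b: str) -> str:
    """Return the transport segment for a link between unit types a and b."""
    try:
        return SEGMENTS[frozenset({a, b})]
    except KeyError:
        raise ValueError(f"no segment defined for link {a}-{b}") from None
```

Using an unordered pair as the key means the direction of the link does not matter, which matches the "between X and Y" wording of the definitions.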

System log module 12, typically at the resource management layer, records operation data of the multiple network devices, such as fault or anomaly data for equipment, provisioning, systems, and routing. Its main function is to collect monitoring data across various dimensions, including real-time indicator data and log data of devices, systems, application services, databases and networks.

Network log module 13, typically at the network management layer, records network data of the multiple network devices, such as network traffic or packet data. Its main functions are bandwidth management, deep packet inspection, and network routing management for fronthaul, midhaul and backhaul.

Probe module 14, typically at the network management layer, actively polls at regular intervals or passively receives monitoring indicator data from the multiple network devices according to the fronthaul, midhaul and backhaul network environments, and notifies event ticket module 15 to generate an event ticket when the monitoring indicator data is abnormal. The monitoring indicator data include key performance indicators of the equipment, such as the status of a wireless network device and its connection status with uplink and downlink network devices. An abnormal indicator means that the network device has failed, possibly because one of its underlying components is abnormal. An abnormal component can be reflected by anomalies in one or more key performance indicators (see Table 1). For example, when the Sync-E BITS status of a network device becomes abnormal, a passive probe sends the detected change to event ticket module 15, whereas an active probe checks the network device status at a fixed period and, if the status differs from the configured normal value, likewise sends the anomaly information together with its event ticket severity level to event ticket module 15.
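The active and passive probe behaviour described above can be sketched as one comparison against configured normal values, reached from two entry points. This is an illustrative sketch only: the class `Probe`, the record `Anomaly`, and the indicator name used in the usage note are hypothetical, and the `notify` callback merely stands in for the event ticket module.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Anomaly:
    device: str
    indicator: str
    value: str
    severity: str   # "low", "medium" or "high"
    source: str     # "active" or "passive"


class Probe:
    """Hypothetical probe: compares indicator values with configured normal values."""

    def __init__(self, normal: Dict[str, Tuple[str, str]],
                 notify: Callable[[Anomaly], None]) -> None:
        # normal maps indicator name -> (normal value, severity to report if abnormal)
        self.normal = normal
        self.notify = notify  # stands in for the event ticket module

    def _check(self, device: str, indicator: str, value: str, source: str) -> None:
        if indicator not in self.normal:
            return  # indicator is not monitored
        expected, severity = self.normal[indicator]
        if value != expected:
            self.notify(Anomaly(device, indicator, value, severity, source))

    def poll(self, device: str, read: Callable[[str, str], str]) -> None:
        """Active mode: query every monitored indicator of one device."""
        for indicator in self.normal:
            self._check(device, indicator, read(device, indicator), "active")

    def on_report(self, device: str, indicator: str, value: str) -> None:
        """Passive mode: handle a status change pushed by the device."""
        self._check(device, indicator, value, "passive")
```

A caller would invoke `poll` from whatever scheduler provides the fixed period, and wire `on_report` to the channel on which devices push status changes; both paths end in the same notification.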

Event ticket module 15, typically at the network management layer, records event tickets concerning the multiple network devices. Its main function is to record all abnormal values and to manage alarms, which can effectively reduce the error rate and improve overall network performance. When probe module 14 detects a problem, event ticket module 15 generates a time-stamped event ticket (see Table 2). An event ticket contains many attributes, such as ticket number, category, and event severity, and each attribute takes its own values. The severity attribute follows the severity levels defined by probe module 14; there are at least three levels (low, medium, and high), each of which carries a different weight in root cause analysis module 16, which speeds up finding the root cause.
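A time-stamped ticket with a severity attribute that maps to a weight can be sketched as a small record type. The field names and the numeric weights below are assumptions for illustration; the text states only that each severity level carries a different weight.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import count

# Assumed weights; the text only says each severity level has a different weight.
SEVERITY_WEIGHT = {"low": 1, "medium": 3, "high": 5}

_ticket_numbers = count(1)  # simple in-process ticket numbering


@dataclass
class EventTicket:
    category: str
    device: str
    description: str
    severity: str
    timestamp: datetime = field(default_factory=datetime.now)
    number: int = field(default_factory=lambda: next(_ticket_numbers))

    def __post_init__(self) -> None:
        if self.severity not in SEVERITY_WEIGHT:
            raise ValueError(f"unknown severity: {self.severity!r}")

    @property
    def weight(self) -> int:
        """Weight that a later analysis step would apply to this ticket."""
        return SEVERITY_WEIGHT[self.severity]
```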

Root cause analysis module 16, typically at the network management layer, sequentially performs data compensation or prediction, dimension value aggregation, contribution filtering, dimension grouping, and root cause analysis on the event tickets of event ticket module 15, the operation data of system log module 12, and the network data of network log module 13, thereby obtaining the root cause of a network anomaly. Performing the data compensation or prediction includes performing it according to historical data, for example converting missing values and then using the processed data to predict the normal value at the anomalous time point, so that the difference between the predicted value and the actual value determines whether an anomaly exists. Performing the dimension value aggregation includes combining the time values and predicted values and summarizing the information of each dimension in dimension order. Performing the contribution filtering includes calculating a contribution from the severity and the predicted normal value. Performing the dimension grouping includes forming groups whose members are selected from the group consisting of system, event, network and device. Performing the root cause analysis includes within-group and between-group root cause analysis, which reduces search complexity and speeds up identification of the root cause.

Performance management module 17, typically at the network management layer, receives the root cause of the network anomaly from root cause analysis module 16 and performs autonomous decision-making performance management on the multiple network devices according to that root cause. Its main function is to integrate and manage the key performance indicators of the overall fronthaul, midhaul and backhaul transport network and the radio access network.

Referring to FIG. 2, the management method for multiple network devices of the present invention includes the following steps:

In step S201, data is recorded. Specifically, all operation data of the network devices (the system log shown in Table 3), all network packet and traffic data of the network devices (the network log shown in Table 4), and the monitored indicator data (the event tickets shown in Table 2) are recorded.

Figure 109144091-A0305-02-0008-1

Table 2. Example event tickets.

Figure 109144091-A0101-12-0007-2

Table 3. Example system log.

Figure 109144091-A0101-12-0007-3

Table 4. Example network log.

Figure 109144091-A0101-12-0007-4

Figure 109144091-A0101-12-0008-5

In step S202, data compensation or prediction is performed. Specifically, incomplete entries in the event tickets (Table 2), the system log (Table 3), and the network log (Table 4) first undergo data conversion, and then the data prediction stage begins. Data prediction handles two cases: the first fills in missing data; the second determines the correct value for an anomalous value. Both use an artificial intelligence method (for example LSTM, although the method is not limited to this algorithm) to predict the normal value at the time of the anomaly; with the normal value predicted, the root cause at the anomalous time point can be located accurately. For example, when part of the data is missing, as in No. 21 of Table 4, where an anomaly during network logging left the record incomplete, the device information (such as the traffic of device 601 and device 602) is filled in first, and the data prediction stage then supplies the value (as in No. 1 of Table 6). When a traffic anomaly occurs, as in No. 5 of Table 4, the data prediction stage predicts the normal network traffic from past data (as in No. 5 of Table 6, where the normal traffic is predicted to be 10G). Taking Table 3 as a further example, No. 7 is a device network anomaly; other devices should subsequently be affected by it and raise alarms, but a system anomaly or network problem has left gaps in the system log.
In that case, based on past data, the event tickets and the network log, an artificial intelligence algorithm can predict that, after the device network anomaly, events such as a blocked network port and a loop anomaly will occur, and the predicted results (such as No. 8 and No. 9) are filled back into the log table (see Table 5).

Table 5. Example system log compensation and prediction table.

Figure 109144091-A0101-12-0009-6

Table 6. Example network log compensation and prediction table.

Figure 109144091-A0101-12-0010-7
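The two cases of step S202, filling a gap and predicting the normal value at an anomalous point, can be sketched with any predictor of the normal value. The text names LSTM as one option; the sketch below deliberately swaps in a simple moving average so that it stays self-contained. The function name `compensate`, the `window` size and the `tolerance` threshold are assumptions, not values from the disclosure.

```python
from statistics import mean
from typing import List, Optional, Tuple

Row = Tuple[float, float, bool]  # (value used, predicted normal value, anomalous?)


def compensate(series: List[Optional[float]], window: int = 3,
               tolerance: float = 0.5) -> List[Row]:
    """Fill gaps and flag anomalies using a moving-average stand-in predictor."""
    normal: List[float] = []  # values accepted as normal so far
    rows: List[Row] = []
    for value in series:
        predicted = mean(normal[-window:]) if normal else None
        if value is None:
            # Case 1: missing data is replaced by the predicted normal value.
            if predicted is None:
                raise ValueError("cannot fill a gap before any history exists")
            rows.append((predicted, predicted, False))
            normal.append(predicted)
            continue
        if predicted is None:
            predicted = value  # first sample: nothing to compare against
        # Case 2: a large relative deviation from the prediction marks an anomaly.
        anomalous = abs(value - predicted) > tolerance * abs(predicted)
        rows.append((value, predicted, anomalous))
        # Keep anomalies out of the history so they do not skew later predictions.
        normal.append(predicted if anomalous else value)
    return rows
```

Each output row keeps both the observed value and the predicted normal value, which is the pairing that later steps compare.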

In step S203, dimension value aggregation is performed. Specifically, after the event tickets, network log and system log have been compensated, they are consolidated into a single table according to a predefined format; the table contains the data of each monitoring dimension together with the predicted data, as shown in Table 7.

Table 7. Example multi-dimensional monitoring indicator table.

Figure 109144091-A0101-12-0011-8

Figure 109144091-A0101-12-0012-9
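Consolidating the three compensated sources into one table can be sketched as a merge that tags each record with its dimension and orders the result by time. The record layout (dictionaries with a `time` key holding a sortable timestamp string) and the function name `aggregate` are assumptions made for illustration; the actual table format is the one shown in Table 7.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def aggregate(event_tickets: Iterable[Record], system_log: Iterable[Record],
              network_log: Iterable[Record]) -> List[Record]:
    """Merge compensated records into one table, ordered by time then dimension."""
    sources = (("event", event_tickets), ("system", system_log),
               ("network", network_log))
    table: List[Record] = []
    for dimension, records in sources:
        for record in records:
            row = dict(record)             # copy so the inputs stay untouched
            row["dimension"] = dimension
            row.setdefault("predicted", None)
            table.append(row)
    table.sort(key=lambda row: (str(row["time"]), str(row["dimension"])))
    for number, row in enumerate(table, start=1):
        row["id"] = number                 # row number within the combined table
    return table
```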

In step S204, contribution filtering is performed, which consists of first calculating contributions and then filtering by contribution. Specifically, a contribution is calculated from each entry's severity and its predicted normal value (Table 8), and irrelevant entries are deleted to reduce the amount of computation (Table 9; an entry is kept only if its contribution exceeds 6). For example, No. 1 and No. 2 in Table 8 are unrelated to this network anomaly, so they are filtered out.

Table 8. Example contribution calculation for the multi-dimensional monitoring indicator table.

Figure 109144091-A0101-12-0013-10

Table 9. Example filtering of the multi-dimensional monitoring indicator table.

Figure 109144091-A0101-12-0014-11
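The text does not give the contribution formula, only its inputs (severity and the predicted normal value) and the threshold of 6. The sketch below therefore uses an assumed formula, severity weight scaled by relative deviation from the predicted normal value, purely to show the calculate-then-filter shape of step S204. `SEVERITY_WEIGHT`, `contribution` and `filter_by_contribution` are hypothetical names.

```python
from typing import Dict, List, Optional

SEVERITY_WEIGHT = {"low": 1, "medium": 3, "high": 5}  # assumed weights
THRESHOLD = 6.0  # from the text: keep an entry only if its contribution exceeds 6


def contribution(severity: str, actual: Optional[float],
                 predicted: Optional[float]) -> float:
    """Assumed formula: severity weight scaled by relative deviation from normal."""
    deviation = 0.0
    if actual is not None and predicted:
        deviation = abs(actual - predicted) / abs(predicted)
    return SEVERITY_WEIGHT[severity] * (1.0 + deviation)


def filter_by_contribution(rows: List[Dict],
                           threshold: float = THRESHOLD) -> List[Dict]:
    """Score every row, then keep only rows whose score exceeds the threshold."""
    kept = []
    for row in rows:
        score = contribution(row["severity"], row.get("actual"),
                             row.get("predicted"))
        if score > threshold:
            kept.append({**row, "contribution": score})
    return kept
```

Rows without a measured or predicted value fall back to the bare severity weight, so under these assumed weights they never pass the threshold on severity alone.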

In step S205, dimension grouping is performed. That is, the entries are grouped by dimension into system (Table 10), event ticket (Table 11), network (Table 12) and device (Table 13) dimensions, forming the dimension grouping table (Table 14).

Table 10. Example system dimension group.

Figure 109144091-A0101-12-0014-12

Table 11. Example event ticket dimension group.

Figure 109144091-A0101-12-0015-13

Table 12. Example network dimension group.

Figure 109144091-A0101-12-0015-14

Table 13. Example device dimension group.

Figure 109144091-A0101-12-0015-15

Table 14. Dimension grouping table.

Figure 109144091-A0101-12-0015-16
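Grouping by dimension is a partition of the filtered entries by one attribute. A minimal sketch, assuming each entry is a dictionary with a `dimension` key; the helper `combine`, which concatenates several dimension groups into one working set, is likewise hypothetical.

```python
from typing import Dict, Iterable, List, Sequence

DIMENSIONS = ("system", "event", "network", "device")


def group_by_dimension(rows: Iterable[Dict]) -> Dict[str, List[Dict]]:
    """Partition rows into the four dimension groups named in the text."""
    groups: Dict[str, List[Dict]] = {name: [] for name in DIMENSIONS}
    for row in rows:
        dimension = row["dimension"]
        if dimension not in groups:
            raise ValueError(f"unknown dimension: {dimension!r}")
        groups[dimension].append(row)
    return groups


def combine(groups: Dict[str, List[Dict]], names: Sequence[str]) -> List[Dict]:
    """Concatenate several dimension groups, e.g. ("system", "network")."""
    return [row for name in names for row in groups[name]]
```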

In step S206, root cause analysis is performed. Specifically, within-group root cause analysis is computed from the dimension grouping table (Table 14). For example, Table 15 does not yet yield a root cause, whereas from Table 16 it can be judged that No. 6, "00103 loop network anomaly", is the root cause of the network anomaly, followed possibly by No. 3, because a system anomaly affects network conditions. In addition, according to Table 17, the main cause may be the network anomaly of No. 4, followed by the network anomaly of No. 5, and so on, so that the top three root causes are obtained within each group. Next, between-group root cause analysis is computed: possible causes of the error are identified by comparing the groups with one another, and finally the root cause of this network anomaly is determined to be the "Nokia 001 network anomaly" at 2020-02-20 07:56.

Table 15. Example system and event group.

Figure 109144091-A0101-12-0016-17

Table 16. Example system and network group.

Figure 109144091-A0101-12-0016-18

Table 17. Example system, event, and network group.

Figure 109144091-A0101-12-0017-19
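The two-stage analysis of step S206 can be sketched as a ranking inside each group followed by a comparison across groups. The within-group rule (top three by contribution) follows the text; the between-group rule used here, preferring the candidate shortlisted by the most groups, is an assumption, since the text says only that groups are compared with one another. All names and row fields are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def top_candidates(rows: List[Dict], k: int = 3) -> List[Dict]:
    """Within-group step: the k rows with the highest contribution (earlier time wins ties)."""
    return sorted(rows, key=lambda r: (-r["contribution"], r["time"]))[:k]


def root_cause(groups: Dict[str, List[Dict]], k: int = 3) -> Optional[Dict]:
    """Between-group step (assumed rule): prefer the candidate shortlisted by most groups."""
    votes: Counter = Counter()
    by_id: Dict[int, Dict] = {}
    for rows in groups.values():
        for row in top_candidates(rows, k):
            votes[row["id"]] += 1
            by_id[row["id"]] = row
    if not by_id:
        return None
    # Most votes first, then highest contribution, then earliest time.
    return min(by_id.values(),
               key=lambda r: (-votes[r["id"]], -r["contribution"], r["time"]))
```

Restricting the cross-group comparison to each group's shortlist is what keeps the search small: only a few rows per group are ever compared.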

In step S207, performance management is performed. Specifically, the configuration is adjusted automatically according to predefined rules, giving the network devices the ability to regulate themselves.
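Rule-driven adjustment can be sketched as an ordered rule table that maps a root cause to an adjustment. The rules and action names below are invented for illustration and are not taken from the disclosure; the function returns the name of an action instead of executing it, so the sketch has no side effects.

```python
from typing import Callable, Dict, List, Tuple

Rule = Tuple[Callable[[Dict], bool], str]

# Hypothetical rule table: predicate on the root cause -> name of an adjustment.
RULES: List[Rule] = [
    (lambda cause: cause["dimension"] == "network" and cause["severity"] == "high",
     "reroute_traffic"),
    (lambda cause: cause["dimension"] == "network", "adjust_bandwidth"),
    (lambda cause: cause["dimension"] == "system", "restart_service"),
]


def decide(cause: Dict, rules: List[Rule] = RULES,
           default: str = "open_manual_ticket") -> str:
    """Return the first matching adjustment, or a fallback that hands over to an operator."""
    for matches, action in rules:
        if matches(cause):
            return action
    return default
```

Because the first matching rule wins, more specific rules are listed before more general ones.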

In addition, the present invention discloses a computer readable medium for use in a computing device or computer having a processor (for example, a CPU or GPU) and/or memory. The medium stores instructions, and the computing device or computer can execute it through the processor and/or memory so as to carry out the above method and its steps.

In summary, the management system, method and computer readable medium for multiple network devices of the present invention provide various data collection mechanisms that substantially improve control over network devices, and use active/passive probe technology to manage and support different data access technologies, providing monitoring of diverse device service information. In addition, data from various dimensions is used to determine the root cause of a problem intelligently and to select the best repair decision. The present invention can therefore substantially reduce the manpower required to maintain 5G wireless network equipment, and shorten operations repair time from hours to minutes or even seconds.

The above embodiments merely illustrate the effects of the present application and are not intended to limit it. Anyone skilled in the art may modify and change the above embodiments without departing from the spirit and scope of the present application. The scope of protection of the present application is therefore as set out in the claims below.


Claims (10)

1. A management system for a plurality of network devices, comprising: an event list module for recording event lists regarding the plurality of network devices; a system log module for recording operating data of the plurality of network devices; a network log module for recording network data of the plurality of network devices; and a root cause analysis module for sequentially performing data compensation or prediction, dimensional value collection, contribution filtering, dimensional grouping, and root cause analysis on the event lists, the operating data, and the network data, so as to obtain a root cause of a network anomaly, wherein performing the data compensation or prediction includes performing the data compensation or prediction based on historical data.
2. The management system according to claim 1, further comprising a performance management module, wherein the root cause analysis module transmits the root cause of the network anomaly to the performance management module, so that the performance management module performs autonomous decision-making performance management on the plurality of network devices based on the root cause of the network anomaly.
3. The management system according to claim 1, further comprising a probe module, wherein the probe module, according to the fronthaul, midhaul, and backhaul network environments, actively and periodically polls or passively receives monitoring indicator data of the plurality of network devices, so as to notify the event list module to generate the event list when the monitoring indicator data is abnormal.
4. The management system according to claim 1, wherein performing the contribution filtering includes calculating a contribution degree based on severity and a normal predicted value, performing the dimensional grouping includes having the members of each resulting group selected from the group consisting of system, event, network, and device, and performing the root cause analysis includes performing intra-group and inter-group root cause analysis.
5. A management method for a plurality of network devices, comprising: recording event lists regarding the plurality of network devices, and operating data and network data of the plurality of network devices; and sequentially performing data compensation or prediction, dimensional value collection, contribution filtering, dimensional grouping, and root cause analysis on the event lists, the operating data, and the network data, so as to obtain a root cause of a network anomaly, wherein performing the data compensation or prediction includes performing the data compensation or prediction based on historical data.
6. The management method according to claim 5, further comprising performing autonomous decision-making performance management on the plurality of network devices based on the root cause of the network anomaly.
7. The management method according to claim 5, further comprising, according to the fronthaul, midhaul, and backhaul network environments, actively and periodically polling or passively receiving monitoring indicator data of the plurality of network devices, so as to generate the event list when the monitoring indicator data is abnormal.
8. The management method according to claim 5, wherein performing the contribution filtering includes calculating a contribution degree based on severity and a normal predicted value, performing the dimensional grouping includes having the members of each resulting group selected from the group consisting of system, event, network, and device, and performing the root cause analysis includes performing intra-group and inter-group root cause analysis.
9. The management method according to claim 5, wherein the operating data includes fault or anomaly data of equipment, provisioning, systems, and routing, and the network data includes network traffic or packet data.
10. A computer-readable medium, applied in a computing device or a computer, storing instructions for executing the management method for a plurality of network devices according to any one of claims 5 to 9.
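
The five steps recited in the independent claims can be illustrated with a minimal Python sketch. It is not the patented implementation: the claims do not disclose formulas, so the record layout (`group`, `dim`, `severity`, `value`, `history`), the use of the historical mean both to fill missing values and as the normal predicted value, the contribution formula (severity times absolute deviation, normalized to a share), and the `min_share` threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Group labels taken from the claims: system, event, network, and device.
GROUPS = ("system", "event", "network", "device")


def compensate(value, history):
    """Assumed compensation rule: fill a missing value with the historical mean."""
    return mean(history) if value is None else value


def find_root_cause(records, min_share=0.1):
    """Return (group, dimension value, contribution share), or None if nothing stands out."""
    # Steps 1 and 2: data compensation from historical data, then collection of
    # severity-weighted deviations per (group, dimension value).
    deviation = defaultdict(float)
    for r in records:
        if r["group"] not in GROUPS:
            raise ValueError(f"unknown group: {r['group']}")
        predicted = mean(r["history"])  # assumed "normal predicted value"
        observed = compensate(r["value"], r["history"])
        deviation[(r["group"], r["dim"])] += r["severity"] * abs(observed - predicted)

    # Step 3: contribution filtering. Keep only dimension values whose share of
    # the total weighted deviation reaches the (hypothetical) threshold.
    total = sum(deviation.values())
    if total == 0:
        return None
    kept = {k: v / total for k, v in deviation.items() if v / total >= min_share}
    if not kept:
        return None

    # Step 4: dimensional grouping.
    by_group = defaultdict(dict)
    for (group, dim), share in kept.items():
        by_group[group][dim] = share

    # Step 5: intra-group analysis picks the top contributor in each group;
    # inter-group analysis picks the strongest of those winners.
    winners = {g: max(dims.items(), key=lambda kv: kv[1]) for g, dims in by_group.items()}
    group, (dim, share) = max(winners.items(), key=lambda kv: kv[1][1])
    return group, dim, share
```

Calling `find_root_cause` on a list of such records yields the group and dimension value with the largest contribution share; a record whose `value` is `None` is compensated from its history and therefore contributes no deviation under these assumptions.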
TW109144091A 2020-12-14 2020-12-14 Management system and method for a plurality of network devices and computer readable medium TWI763177B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109144091A TWI763177B (en) 2020-12-14 2020-12-14 Management system and method for a plurality of network devices and computer readable medium

Publications (2)

Publication Number Publication Date
TWI763177B (en) 2022-05-01
TW202223692A (en) 2022-06-16

Family

ID=82593958

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109144091A TWI763177B (en) 2020-12-14 2020-12-14 Management system and method for a plurality of network devices and computer readable medium

Country Status (1)

Country Link
TW (1) TWI763177B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259383A (en) * 2020-01-15 2020-06-09 视联动力信息技术股份有限公司 Safety management center system
CN111555895A (en) * 2019-02-12 2020-08-18 北京数安鑫云信息技术有限公司 Method, device, storage medium and computer equipment for analyzing website faults

Similar Documents

Publication Publication Date Title
KR100840129B1 (en) System and method for management of performance fault using statistical analysis
US7107339B1 (en) Predictive monitoring and problem identification in an information technology (IT) infrastructure
CN107872457B (en) Method and system for network operation based on network flow prediction
US20230140836A9 (en) Anomaly detection method and device, terminal and storage medium
WO2015090098A1 (en) Method and apparatus for realizing fault location
US20160283307A1 (en) Monitoring system, monitoring device, and test device
CN113204461B (en) Server hardware monitoring method, device, equipment and readable medium
CN113014418B (en) Fault diagnosis method based on network historical topology flow
WO2023216457A1 (en) Method for predicting and positioning abnormity of transmission network between core network and base station
CN117118807B (en) Data analysis method and system based on artificial intelligence
WO2019137052A1 (en) Method and device for network operation and maintenance
CN117675522A (en) Power communication fault diagnosis and prevention method and system
CN117221088A (en) Computer network intensity detection system and device
CN116455729A (en) Fault link detection and recovery method based on link quality assessment model
CN116418653A (en) Fault positioning method and device based on multi-index root cause positioning algorithm
CN116663747B (en) Intelligent early warning method and system based on data center infrastructure
TWI763177B (en) Management system and method for a plurality of network devices and computer readable medium
CN116389304B (en) SG-TMS-based network operation state trend analysis system
CN115378841B (en) Method and device for detecting state of equipment accessing cloud platform, storage medium and terminal
JP7173273B2 (en) Failure analysis device, failure analysis method and failure analysis program
CN113438116A (en) Power communication data management system and method
CN115242706B (en) Operation management method of electric power system information network simulation platform
Luo et al. Critical node identification of power wireless private communication network based on complex network
CN112291804B (en) Service fault diagnosis method for noise network under 5G network slice
CN117792903A (en) Enterprise center service evaluation and dynamic treatment method based on deep reinforcement learning