KR20200137637A

KR20200137637A - Training data generation method using virtual alarm, learning method of network failure analysis model, and network system implementing the same method

Info

Publication number: KR20200137637A
Application number: KR1020190064337A
Authority: KR
Inventors: 최민환
Original assignee: 주식회사 케이티
Priority date: 2019-05-31
Filing date: 2019-05-31
Publication date: 2020-12-09

Abstract

Provided is a network failure analysis apparatus comprising: an alarm collector that collects virtual alarm events generated in transmission devices of a network; and a training data generator that classifies the virtual alarm events into a plurality of groups according to a unique code included in the virtual alarm events, extracts a virtual root alarm for each group based on the virtual alarm type included in each virtual alarm event, and generates training data in which each group is labeled with the failure cause related to its virtual root alarm. The virtual alarm events in each group are generated in transmission devices related to a specific transmission device by a virtual failure that the specific transmission device virtually generates. The specific transmission device and the related transmission devices generate virtual alarm events in which the same unique code is written.

Description

A method of generating learning data using a virtual alarm and a network failure analysis model learning method using the same, and a network system implementing the same.

The present invention relates to network failure analysis.

Network transmission equipment is organized into a plurality of hierarchy levels (layers), and lower-level tributary signals are systematically multiplexed into higher-level signals at each multiplexing stage. Standards for multiplexed signal hierarchies include the Synchronous Digital Hierarchy (SDH), the Plesiochronous Digital Hierarchy (PDH), and the Optical Transport Hierarchy (OTH).

When a root failure occurs in a transmission device at a particular level, derived alarms propagate to the other transmission devices physically connected to it. Network management systems (NMS) receive reports of the alarm events generated by the transmission devices of a network and are under continuous development to identify failures from those reports. However, a network combines not only multiple hierarchy levels but also redundant, ring, and mesh structures, so it is not easy for an NMS to determine the cause of a failure from the alarms. In particular, because alarms propagated through a complex network are reported simultaneously and in large numbers, it is hard even to single out the root failure. In practice, only an experienced network operator can find, among the alarm events, the root alarm reported when the root failure occurred and classify the derived alarms associated with it.

Meanwhile, attempts have recently been made to analyze and predict network failures using learning-based artificial intelligence. Information collected from transmission devices is processed into training data, which is then used to train a failure analysis model.

However, communication failures are not common, so training data is hard to collect. Failures are more likely under particular environmental factors such as disasters, weather, and construction work, but they also occur in places where they are not expected at all. If a model is trained, for lack of alternatives, only on failures that occur frequently, training is inevitably biased toward the frequent failure causes and failure locations. As a result, a model trained on alarm events from actual network failures analyzes failures less accurately and copes markedly worse when a real failure occurs. It is therefore necessary to obtain the training data needed to train a failure analysis model without affecting network operation.

One problem to be solved is to provide a method in which an arbitrary transmission device generates a virtual failure to which a unique code is assigned and the related transmission devices propagate virtual alarms containing that unique code, so that a network failure analysis apparatus can collect the virtual alarm events generated in the transmission devices by the virtual failure.

Another problem to be solved is to provide a method in which the network failure analysis apparatus classifies the virtual alarm events generated in the transmission devices by unique code and labels the classified virtual alarm events with a failure cause, thereby generating the training data needed to train a failure analysis model.

A further problem to be solved is to provide a method in which the network failure analysis apparatus determines the connection relationships among transmission devices from the virtual alarm events classified by unique code and constructs a network topology.

A network failure analysis apparatus according to an embodiment includes an alarm collector that collects virtual alarm events generated in transmission devices of a network, and a training data generator that classifies the virtual alarm events into a plurality of groups according to a unique code included in the virtual alarm events, extracts a virtual root alarm for each group based on the virtual alarm type included in each virtual alarm event, and generates training data in which each group is labeled with the failure cause related to its virtual root alarm. The virtual alarm events in each group are generated in transmission devices related to a specific transmission device by a virtual failure that the specific transmission device virtually generates, and the specific transmission device and the related transmission devices generate virtual alarm events in which the same unique code is written.

The unique code may be written in a designated spare position of the header of a packet transmitted between transmission devices.

The virtual alarm type may be written in a designated spare position of the header of a packet transmitted between transmission devices.

The training data generator may extract the transmission device that reported the virtual root alarm as the failure location and generate training data in which each group is labeled with a failure cause and a failure location.
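The grouping and labeling just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event fields (`code`, `device`, `alarm`), the function name `build_training_data`, and the mapping from a root-alarm type to a failure cause are hypothetical and are not taken from the specification.

```python
from collections import defaultdict

# Hypothetical mapping from a virtual root-alarm type to a failure cause.
# The specification names LOS as an example root alarm and a cable cut as an
# example cause; the mapping itself is an assumption made for illustration.
ROOT_ALARM_CAUSES = {"LOS": "cable-cut"}


def build_training_data(events):
    """Group virtual alarm events by unique code and label each group."""
    groups = defaultdict(list)
    for event in events:
        groups[event["code"]].append(event)

    samples = []
    for code, group in groups.items():
        # The root alarm is picked by alarm type, not by arrival order.
        root = next((e for e in group if e["alarm"] in ROOT_ALARM_CAUSES), None)
        if root is None:
            continue  # no root alarm in this group, so it stays unlabeled
        samples.append({
            "code": code,
            "events": group,
            "cause": ROOT_ALARM_CAUSES[root["alarm"]],
            "location": root["device"],  # device that reported the root alarm
        })
    return samples


# Two virtual failures (codes 17 and 42) whose events arrive interleaved.
events = [
    {"code": 17, "device": "C", "alarm": "AIS"},
    {"code": 42, "device": "E", "alarm": "RDI"},
    {"code": 17, "device": "B", "alarm": "LOS"},
    {"code": 17, "device": "A", "alarm": "RDI"},
    {"code": 42, "device": "F", "alarm": "LOS"},
]
samples = build_training_data(events)
```

Because the unique code alone decides group membership, interleaved or out-of-order reports do not affect the result.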

The training data generator may generate a matrix of the connection relationships among the transmission devices that generated the virtual alarm events in each group, convert the matrix into a relation matrix image in which colors indicate the degree of relation, and label the relation matrix image with the failure cause.
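A minimal sketch of the matrix-to-image step follows. The text here does not fix how the degree of relation is encoded or which colors are used, so the encoding (2 for a link touching the root-alarm device, 1 for any other link, 0 for no link), the palette, and all names are assumptions chosen only to make the idea concrete.

```python
# Hypothetical palette: relation degree -> RGB color. The specification only
# says that colors follow the degree of relation; these values are assumed.
PALETTE = {0: (255, 255, 255), 1: (0, 0, 255), 2: (255, 0, 0)}


def relation_matrix(devices, links, root_device):
    """Return an N x N matrix of relation degrees for one alarm group."""
    index = {name: i for i, name in enumerate(devices)}
    matrix = [[0] * len(devices) for _ in devices]
    for a, b in links:
        # Assumed encoding: 2 if the link touches the root-alarm device, else 1.
        degree = 2 if root_device in (a, b) else 1
        matrix[index[a]][index[b]] = degree
        matrix[index[b]][index[a]] = degree
    return matrix


def to_image(matrix):
    """Turn each relation degree into an RGB pixel."""
    return [[PALETTE[value] for value in row] for row in matrix]


devices = ["A", "B", "C", "D"]
links = [("A", "B"), ("B", "C"), ("C", "D")]
matrix = relation_matrix(devices, links, root_device="B")
image = to_image(matrix)
sample = {"image": image, "cause": "cable-cut"}  # label attached to the image
```

Representing each group as a fixed-layout image is what lets an image classifier consume the alarm groups directly.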

The network failure analysis apparatus may further include a trainer that, based on the training data generated by the training data generator, trains a failure analysis model for estimating the relationship between a failure and the alarms generated in transmission devices by that failure, and a failure analyzer that preprocesses collected actual alarm events to match the input of the failure analysis model, feeds the preprocessed input to the trained failure analysis model, and outputs the failure information estimated by the trained model.

The failure information may include a failure cause and a failure location.

The network failure analysis apparatus may further include a topology generator that estimates the connection relationships among the transmission devices that generated the virtual alarm events in each group, based on the direction information indicated by the virtual alarm type in each virtual alarm event, and generates the topology of the network from the estimated connection relationships.

A method by which a network failure analysis apparatus generates training data according to an embodiment includes: collecting virtual alarm events generated in transmission devices of a network; classifying the virtual alarm events into a plurality of groups according to a unique code included in the virtual alarm events; determining, based on the virtual root alarm included in each virtual alarm event, the failure cause that produced the virtual alarm events of each group; and generating training data in which each group is labeled with its failure cause. The virtual alarm events in each group are generated in transmission devices related to a specific transmission device by a virtual failure that the specific transmission device virtually generates, and the specific transmission device and the related transmission devices generate virtual alarm events in which the same unique code is written.

The training data generation method may further include instructing the specific transmission device to generate a virtual failure.

The generating of the training data may include generating a matrix of the connection relationships among the transmission devices that generated the virtual alarm events in each group, converting the matrix into a relation matrix image in which colors indicate the degree of relation, and labeling the relation matrix image with the failure cause.

A method by which a network failure analysis apparatus trains a network failure analysis model according to an embodiment includes: collecting virtual alarm events generated in transmission devices of a network; classifying the virtual alarm events into a plurality of groups according to a unique code included in the virtual alarm events; extracting a virtual root alarm for each group based on the virtual alarm type included in each virtual alarm event and determining the failure cause related to the virtual root alarm; generating training data by generating a matrix of the connection relationships among the transmission devices that generated the virtual alarm events in each group, converting the matrix into a relation matrix image in which colors indicate the degree of relation, and labeling the relation matrix image with the failure cause; and training a failure analysis model based on the training data.

The failure analysis model may be implemented as a convolutional neural network (CNN) that estimates the relationship between a failure and the alarms generated in transmission devices by that failure.

The generating of the training data may generate the matrix based on a network topology that includes the connection relationships among the transmission devices.

The virtual alarm events in each group are generated in transmission devices related to a specific transmission device by a virtual failure that the specific transmission device virtually generates, and the specific transmission device and the related transmission devices may generate virtual alarm events in which the same unique code is written.

A method by which a network transmission device generates a virtual failure according to an embodiment includes: writing, when the time for the virtual failure arrives, a unique code and a virtual alarm corresponding to the virtual failure in the header of a packet to be transmitted to an adjacent device; and transmitting the packet in which the unique code and the virtual alarm are written to the adjacent device. The unique code and the virtual alarm are written in header positions different from the position assigned to actual alarms. Based on the virtual alarm, the adjacent device determines that the virtual failure has occurred on its receiving link and performs the operation designated for the occurrence of the virtual failure.
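As a rough sketch of the header-writing step, the following Python models a packet header as a dictionary of named byte positions. The position names `K2`, `Z1`, and `Z2` follow the STM-1 example given later in the description, but which spare position carries the unique code and which carries the virtual alarm, the one-byte code width, and the value `VIRTUAL_LOS` are assumptions made for illustration.

```python
VIRTUAL_LOS = 0x01  # hypothetical encoding of a virtual alarm type


def inject_virtual_alarm(header, code, virtual_alarm):
    """Return a copy of the header that also carries a virtual alarm.

    The position used for actual alarms ("K2") is never modified, so a
    receiver handling real alarms sees an ordinary packet.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("unique code must fit in one byte")
    marked = dict(header)
    marked["Z1"] = code           # assumed position of the unique code
    marked["Z2"] = virtual_alarm  # assumed position of the virtual alarm
    return marked


header = {"K2": 0x00, "Z1": 0x00, "Z2": 0x00}
sent = inject_virtual_alarm(header, code=0x2A, virtual_alarm=VIRTUAL_LOS)
```

Keeping the virtual alarm out of the actual-alarm position is what allows the payload and real alarm handling to continue undisturbed.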

According to an embodiment, virtual alarm events can be clearly distinguished and grouped using the unique code, and by generating virtual failures in a variety of transmission devices, a large amount of training data can be obtained even when no actual communication failure occurs.

According to an embodiment, the failure analysis model is trained under supervision to classify failure causes and failure locations using virtual alarm events, which can improve the accuracy of the model.

According to an embodiment, alarm events that can arise in a variety of transmission devices can be collected through virtual failures, without affecting network operation and without actually creating a cable-cut condition in a live network. That is, even though a virtual alarm arising from a virtual failure propagates between devices, it travels in the header of a normal packet whose payload is filled with data, so communication between devices is not affected. In addition, when an arbitrary transmission device generates a virtual failure, packets containing the virtual alarm propagate among the transmission devices only briefly, and the network failure analysis apparatus can collect the virtual alarm events generated in the devices. The virtual alarm events needed for training can therefore be collected at any desired time without affecting the live network.

According to an embodiment, even when the changed topology of a network is not accurately known, or the topology is unknown as with an overseas network, the network failure analysis apparatus can determine the connection relationships among transmission devices from the virtual alarm events they generate and construct the network topology from them.

FIG. 1 is a diagram illustrating the principle of alarm generation by way of example.
FIG. 2 is a diagram illustrating the difficulty of discerning the root alarm among the alarms of a real network.
FIG. 3 is an example of a packet structure in which a virtual alarm is transmitted according to an embodiment.
FIG. 4 is a diagram illustrating virtual alarm propagation according to an embodiment.
FIG. 5 is a block diagram of a network system according to an embodiment.
FIG. 6 is an example of training data for a failure analysis model according to an embodiment.
FIG. 7 is a flowchart of a method of generating training data using virtual alarms according to an embodiment.
FIG. 8 is a flowchart of a failure analysis method using a trained failure analysis model according to an embodiment.
FIG. 9 is a diagram illustrating a topology construction method according to an embodiment.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice them. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, not that it excludes them, unless specifically stated otherwise. Terms such as "…unit", "…er/…or", and "module" used in the specification denote a unit that processes at least one function or operation and may be implemented in hardware, software, or a combination of the two.

Throughout the specification, the various transmission devices that make up a network may be referred to simply as devices, and a root failure may be referred to simply as a failure.

Failures in a network can have various causes, including, for example, a cable cut (link cut), a unit failure, a node failure, and a power failure. The description mainly uses a cable-cut failure as its example.

FIG. 1 is a diagram illustrating the principle of alarm generation by way of example, and FIG. 2 is a diagram illustrating the difficulty of discerning the root alarm among the alarms of a real network.

Referring to FIG. 1, when a failure occurs in a device of the network, alarms are propagated so that transmission devices at other levels, where no failure has occurred, can learn of the failure.

Alarms are defined in ITU-T standards and include Loss of Signal (LOS), Alarm Indication Signal (AIS), and Remote Defect Indication (RDI)/Remote Alarm Indication (RAI). AIS is a forward (downstream) alarm used to inform forward/lower-level devices that a failure has occurred at a preceding higher level. RDI is a backward (upstream) alarm used to inform backward/higher-level devices of the failure. Other alarms include EQPT, raised when the transmission device itself has a problem, and Signal Degradation, raised when the signal is weak. Here, the alarm announcing a root failure is called the root alarm (for example, LOS), alarms other than the root alarm are called derived alarms (for example, AIS and RDI), and root and derived alarms together may simply be called alarms.

For example, suppose traffic between end terminals passes through device A, device B, and device C, with packets traveling from device A toward device C. If the line connecting device A and device B is cut, or the port connecting them is unplugged from a device, packets are no longer delivered properly. Device B, which now detects only noise instead of packets, reports an alarm event containing LOS to the element management system (EMS)/network management system (NMS) to announce the failure. Device B also propagates alarms to its adjacent devices.

Device C propagates the AIS alarm sent from device B to forward/lower-level devices, and device A propagates the RDI alarm sent from device B to backward/higher-level devices. Devices that receive the AIS and RDI alarms report alarm events containing AIS/RDI to the EMS/NMS. In a large network, alarm events pass through the EMS and are finally collected at the NMS. A small network may instead be implemented so that alarm events are ultimately managed at the EMS.
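The A, B, C example can be reproduced with a small simulation. It is a simplified sketch, assuming a single linear path with traffic flowing in one direction and no redundancy, ring, or mesh structure; the function name `alarms_for_cut` is hypothetical.

```python
def alarms_for_cut(path, cut_index):
    """Alarms reported when the link path[cut_index] -> path[cut_index + 1] is cut.

    `path` lists devices in the direction of traffic. The device just after
    the cut loses its input signal and reports LOS; devices further
    downstream report AIS, and devices upstream of the cut report RDI.
    """
    if not 0 <= cut_index < len(path) - 1:
        raise ValueError("cut_index must identify a link on the path")
    detector = cut_index + 1
    reports = {}
    for i, device in enumerate(path):
        if i == detector:
            reports[device] = "LOS"   # root alarm
        elif i > detector:
            reports[device] = "AIS"   # forward (downstream) derived alarm
        else:
            reports[device] = "RDI"   # backward (upstream) derived alarm
    return reports


# Link between A and B is cut while traffic flows A -> B -> C.
reports = alarms_for_cut(["A", "B", "C"], cut_index=0)
```

In this idealized chain exactly one device reports LOS, which is why the root alarm is easy to identify here and hard to identify in the real networks discussed next.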

Because the NMS collects the alarm events generated by all devices, it may seem that the failure cause and failure location could be determined by analyzing them. In a real network, however, many devices are connected in complex hierarchies and structures, so it is not easy to pick out from the alarm events the root alarm that reveals the failure cause and/or location. Consequently, even if one wishes to train a failure analysis model that infers the failure cause and/or location from alarm events, securing training data is difficult.

Referring to FIG. 2, a real network has hierarchy levels and a mixture of redundant, ring, and mesh structures, so alarms arise in very complicated patterns. For example, when a cable-cut failure occurs at a ROADM (Reconfigurable Optical Add-Drop Multiplexer) between Bukdaegu (North Daegu) and Dongdaegu, the NMS collects alarm events such as those in Table 1. An alarm event consists of information about the device that raised the alarm, an alarm item (alarm_msg) indicating the alarm type, region information, and so on. Specifically, an alarm event may include the alarm time, alarm location (sysname), equipment type (equip_type), unit type (unit_type), signal type (sig_type), occurrence point (alarm_loc), and alarm item (alarm_msg).

By the alarm generation principle described with FIG. 1, a failure should occur at one place and derived alarms (AIS, RDI, and so on) should arise in the surrounding devices. Yet the alarm_msg column of Table 1 shows LOS alarms at many locations, which makes it hard to tell where the failure actually occurred. Naturally, grouping the alarm events by failure cause is not easy either.

Thus, even among alarm events collected in a simple test network like that of FIG. 2, it is not easy to find the root alarm that reveals the failure cause and to identify the derived alarms stemming from each root alarm, so classifying alarm events by failure cause in a real network is very difficult. Moreover, when two or more failures are mixed they are hard to tell apart, and when alarms spread, the NMS does not collect them in chronological order, which makes grouping alarm events by failure cause harder still.

Table 1
sysname                          equip_type   alarm_msg
Daegu-MSPP-L-A-_NODE-SNH         MSPP3HSNH    LOF
Daegu-PTS-0532-01                PTN3HALU     LOS
Daegu-PTS-0532-01                PTN3HALU     TRF
Gunwi-MSPP-L-A-_NODE-TF          MSPP3HTF     SDH-LOS
Daegu-PTS-0532-01                PTN3HALU     RDI
Daegu-PTS-0532-01                PTN3HALU     TRF
Daegu-PTS-0532-01                PTN3HALU     TRF
Daegu-PTS-0532-01                PTN3HALU     ServerSignal Failure
Namdaegu-ROADM-0936-01-B-NODE    ROADM8HHW    MUT_LOS
Namdaegu-ROADM-0936-01-B-NODE    ROADM8HHW    MUT_LOS
Namdaegu-ROADM-0936-01-B-NODE    ROADM8HHW    MUT_LOS
Namdaegu-ROADM-0936-01-B-NODE    ROADM8HHW    OSC_LOS
Dongdaegu-ROADM-0936-01-C-NODE   ROADM8HHW    MUT_LOS
Dongdaegu-ROADM-0936-01-C-NODE   ROADM8HHW    MUT_LOS
Dongdaegu-ROADM-0936-01-C-NODE   ROADM8HHW    MUT_LOS
Dongdaegu-ROADM-0936-01-C-NODE   ROADM8HHW    R_LOS
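To make the ambiguity concrete, the sketch below loads a subset of the Table 1 rows (repeated rows and the TRF and ServerSignal Failure rows are omitted) and lists every location that reported an LOS-type alarm. Treating any alarm_msg containing the substring "LOS" as an LOS-type alarm is an assumption, as are the class and function names. Place names are romanized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmEvent:
    sysname: str     # alarm location
    equip_type: str  # equipment type
    alarm_msg: str   # alarm item (alarm type)


# Subset of Table 1.
TABLE_1 = [
    AlarmEvent("Daegu-MSPP-L-A-_NODE-SNH", "MSPP3HSNH", "LOF"),
    AlarmEvent("Daegu-PTS-0532-01", "PTN3HALU", "LOS"),
    AlarmEvent("Gunwi-MSPP-L-A-_NODE-TF", "MSPP3HTF", "SDH-LOS"),
    AlarmEvent("Daegu-PTS-0532-01", "PTN3HALU", "RDI"),
    AlarmEvent("Namdaegu-ROADM-0936-01-B-NODE", "ROADM8HHW", "MUT_LOS"),
    AlarmEvent("Namdaegu-ROADM-0936-01-B-NODE", "ROADM8HHW", "OSC_LOS"),
    AlarmEvent("Dongdaegu-ROADM-0936-01-C-NODE", "ROADM8HHW", "MUT_LOS"),
    AlarmEvent("Dongdaegu-ROADM-0936-01-C-NODE", "ROADM8HHW", "R_LOS"),
]


def los_locations(events):
    """Return the distinct sysnames that reported any LOS-type alarm."""
    seen = []
    for event in events:
        # Assumption: any alarm_msg containing "LOS" counts as LOS-type.
        if "LOS" in event.alarm_msg and event.sysname not in seen:
            seen.append(event.sysname)
    return seen


locations = los_locations(TABLE_1)
```

Under that assumption four different locations report an LOS-type alarm for a single cable cut, so alarm type alone does not identify the failure location.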

한편, 주요 통신사들은 NMS를 기반으로 장비들의 연결 관계를 나타내는 네트워크 토폴로지를 통합 관리하고 있지만, 작은 규모의 네트워크 사용처 혹은 개인(private) 네트워크 사용자들은 이를 관리하기 쉽지 않다. 게다가, 통신사들도 장비를 보고 직접 입력하여 장비 대장을 관리하는 방식이므로, 분기별/반기별 포트 최적화 배치 또는 임의 변경 등의 이유로 네트워크 토폴로지가 정확하지 않은 경우가 많다.On the other hand, major telecommunications companies integrate and manage the network topology representing the connection relationship of devices based on NMS, but it is difficult for small network users or private network users to manage it. In addition, since the communication companies also manage the equipment ledger by looking at the equipment and directly inputting it, the network topology is often inaccurate for reasons such as quarterly/half-yearly port optimization arrangements or random changes.

또한, 근원 경보에 따른 파생 경보는 논리적(logical)으로 발생하기 때문에, 단순한 네트워크라면 규칙(Rule)을 만들어 다수의 경보들을 분석하고, 근원 경보가 어디에서 발생했는지 정확히 탐지할 수 있다. 그러나 실제 통신 장애가 발생하는 네트워크는, 기가 인터넷/5세대 통신으로 발전하면서 장비 수가 기하급수적으로 증가하므로, 근원 경보와 파생 경보를 규칙 기반으로 판단하는 것은 불가능에 가깝다.In addition, since derived alarms according to the source alarm are logically generated, if it is a simple network, rules can be created to analyze multiple alarms and accurately detect where the source alarm occurred. However, as the number of devices increases exponentially as the network develops into Giga Internet/5G communication, it is almost impossible to judge the source and derivative alarms based on rules.

The following describes a method of training a failure analysis model that estimates the cause and location of a failure based on alarm events collected from a network that is subject to change. In particular, it describes a method of collecting, in large quantities, the alarm events needed to train the failure analysis model by generating virtual failures at arbitrary transmission devices, and of accurately and quickly grouping the alarm events caused by the same failure.

FIG. 3 is an example of a packet structure in which a virtual alarm is transmitted according to an embodiment, and FIG. 4 is a diagram illustrating virtual alarm propagation according to an embodiment.

Referring to FIG. 3, a transmission device transmits a packet in which the alarm type (AIS, RDI) is indicated in the header and the payload is filled with data. The packet may be, for example, an STM-1 (Synchronous Transport Module level 1) packet, STM-1 being a transmission standard of the Synchronous Digital Hierarchy (SDH).

An STM-1 packet may consist of a header forming a 9x9 byte matrix and a payload. The header bytes carry various information, such as combined packet boundary identification information, synchronization information, and error block detection. In particular, the K2 field of the STM-1 packet indicates the alarm type (AIS, RDI): if bits 6, 7, and 8 of K2 are "111", the alarm is AIS, and if they are "110", it is RDI. The two Z1 fields and the two Z2 fields of the STM-1 packet are spare bytes.
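The K2 check described above can be sketched as follows. This is only an illustration: it assumes the usual SDH convention that bit 1 is the most significant bit, so that bits 6 to 8 are the three least significant bits of the byte, and the function name `decode_k2_alarm` is hypothetical.

```python
from typing import Optional


def decode_k2_alarm(k2: int) -> Optional[str]:
    """Return "AIS", "RDI", or None based on bits 6-8 of a K2 byte.

    Assumption: bit 1 is the most significant bit, so bits 6-8 are
    the three least significant bits of the byte.
    """
    if not 0 <= k2 <= 0xFF:
        raise ValueError("K2 must fit in one byte")
    low3 = k2 & 0b111  # bits 6, 7, 8
    if low3 == 0b111:
        return "AIS"
    if low3 == 0b110:
        return "RDI"
    return None  # neither AIS nor RDI is indicated in these bits
```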

In the present invention, an arbitrary transmission device generates an alarm not because of an actual failure but because of a virtual failure. An alarm caused by a virtual failure is therefore called a virtual alarm, and virtual alarms can be divided into virtual source alarms and virtual derived alarms. Virtual alarms are marked so as to be distinguishable from actual alarms. For example, when a disconnection failure is generated virtually, the virtual alarms may be denoted VLOS (Virtual LOS), VAIS (Virtual AIS), VRDI (Virtual RDI), and so on. A packet containing a virtual alarm may be called a virtual alarm packet; a virtual alarm is transmitted in the same kind of packet that the transmission devices already use, such as an STM-1 packet. To distinguish it from an actual alarm, which is written in the K2 field of the packet header, the virtual alarm is written in a separate field. A virtual alarm propagated between connected devices is carried in the header of a normal packet whose payload is filled with data. Virtual alarms therefore do not affect communication between devices: the payload contains the actual data being transmitted, such as photos or video, and the header merely carries an identifier marking the virtual alarm.

Meanwhile, so that the virtual alarms arising from a single virtual failure can be identified, each virtual alarm is transmitted together with a unique code. The unique code only needs to be a unique value; for example, it may be generated from unique information (e.g., the MAC address) of the device at which the virtual failure occurred.

The unique code and the virtual alarm type may be written in spare fields of the packet header. Among the spare bits, the leading bits (e.g., 24 bits) may hold the unique code that identifies a single virtual failure, and the remaining bits (e.g., 8 bits) may hold the virtual alarm type.

Referring to Table 2, the unique code and the virtual alarm type can be written in the 4 bytes provided by the two Z1 fields and the two Z2 fields. If the Z1 and Z2 fields are already in use, other spare fields may be used instead.

| Z1-1 (8 bits) | Z1-2 (8 bits) | Z2-1 (8 bits) | Z2-2 (8 bits) |
| --- | --- | --- | --- |
| Unique code | Unique code | Unique code | Virtual alarm type: VLOS, VAIS, VRDI, etc. |
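As a rough sketch of the four-byte layout in Table 2, the following Python helpers pack and unpack the spare bytes. They assume the example split given in the description (a 24-bit unique code followed by an 8-bit virtual alarm type) and, as a further assumption not stated in the description, big-endian ordering of the unique code across Z1-1, Z1-2, and Z2-1. The function names are hypothetical.

```python
from typing import Tuple


def pack_spare_bytes(unique_code: int, alarm_type: int) -> bytes:
    """Pack a 24-bit unique code and an 8-bit alarm type into 4 bytes.

    Byte order (assumed): Z1-1, Z1-2, Z2-1 hold the unique code,
    most significant byte first; Z2-2 holds the alarm type.
    """
    if not 0 <= unique_code < (1 << 24):
        raise ValueError("unique code must fit in 24 bits")
    if not 0 <= alarm_type <= 0xFF:
        raise ValueError("alarm type must fit in 8 bits")
    return unique_code.to_bytes(3, "big") + bytes([alarm_type])


def unpack_spare_bytes(spare: bytes) -> Tuple[int, int]:
    """Return (unique_code, alarm_type) from the 4 spare bytes."""
    if len(spare) != 4:
        raise ValueError("expected exactly 4 spare bytes")
    return int.from_bytes(spare[:3], "big"), spare[3]
```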

The virtual alarm type may be encoded in various ways. For example, VLOS may be assigned 00000000, VAIS 11111111, and VRDI 00001111. Identifiers may be assigned so that the same digit is repeated, which allows the type to be estimated by majority rule even if a bit error occurs.
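One possible reading of the majority rule is sketched below. The description does not specify the decoding procedure, so the nibble-wise vote used here (one vote over the upper four bits and one over the lower four bits) is an assumption, as is the function name `decode_alarm_type`. With the example identifiers above, a single flipped bit is still decoded correctly; a 2-2 tie within a nibble is treated as undecodable.

```python
from typing import Optional

# Example identifiers from the description, viewed as (upper, lower) nibble values.
_NIBBLE_PATTERNS = {(0, 0): "VLOS", (1, 1): "VAIS", (0, 1): "VRDI"}


def _majority(nibble: int) -> Optional[int]:
    """Majority bit of a 4-bit value, or None on a 2-2 tie."""
    ones = bin(nibble & 0xF).count("1")
    if ones == 2:
        return None
    return 1 if ones > 2 else 0


def decode_alarm_type(byte: int) -> Optional[str]:
    """Estimate the virtual alarm type from a possibly corrupted byte."""
    upper = _majority(byte >> 4)
    lower = _majority(byte)
    return _NIBBLE_PATTERNS.get((upper, lower))
```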

Referring to FIG. 4, if a virtual failure occurs in place of the actual failure of FIG. 1, the devices propagate virtual alarms and report the resulting virtual alarm events to a designated device. The devices propagate alarms and report them to the designated device (e.g., an NMS) in the same way as for alarms caused by an actual failure. The differences are that the failure is realized virtually through virtual alarms, that spare fields of the header are used to distinguish virtual alarms from actual alarms, and that a unique code is additionally written.

To virtually implement a disconnection failure between device A and device B, device A includes a unique code and the identifier corresponding to VLOS (00000000) in the header of a packet that it transmits to device B. Device B then recognizes from the packet header containing the VLOS identifier (00000000) that a virtual failure has occurred, and reports the virtual alarm (VLOS) event that occurred on its receiving link to the designated device (e.g., an NMS). When device C receives from device B a packet whose header contains the unique code and the identifier corresponding to VAIS (11111111), it propagates the virtual alarm to its adjacent device and reports the resulting virtual alarm (VAIS) event to the designated device. Likewise, when device A receives from device B a packet whose header contains the unique code and the identifier corresponding to VRDI (00001111), it propagates the virtual alarm to its adjacent device and reports the resulting virtual alarm (VRDI) event to the designated device.

FIG. 5 is a block diagram of a network system according to an embodiment.

Referring to FIG. 5, the network failure analysis apparatus 100 may include an alarm collector 110, a virtual failure generation controller 120, a training data generator 130, a failure analysis model learner 150, and a failure analyzer 170 that uses the trained failure analysis model. The network failure analysis apparatus 100 may further include a topology generator 190 that generates a network topology from the virtual alarm events collected by the alarm collector 110. The topology generated or updated by the topology generator 190 may be used in training the failure analysis model. The virtual failure generation controller 120 may be implemented separately from the network failure analysis apparatus 100, may be implemented within a specific device of the network, or may be omitted, depending on how virtual failures are generated.

The alarm collector 110 collects virtual alarm events generated by the devices of the network. Each virtual alarm generated at a device is collected as one virtual alarm event. A virtual alarm event includes device information, the unique code, and the virtual alarm type. The alarm collector 110 may be connected directly to each device, to an EMS to which devices are connected, or to an NMS to which EMSs are connected, and may collect the alarm events reported by them. The alarm collector 110 may also collect alarm events reported because of actual failures and pass them to the failure analyzer 170.

The training data generator 130 classifies the virtual alarm events collected by the alarm collector 110 by unique code. Among the virtual alarm events classified by unique code, the training data generator 130 can distinguish the virtual source alarm from the virtual derived alarms based on the virtual alarm type. For example, it may extract the virtual alarm event containing VLOS as the virtual source alarm of the classified group.

The training data generator 130 generates training data from the virtual alarm events classified by unique code. The form of the training data is determined by the input layer and output layer of the failure analysis model. For example, the training data generator 130 may pair alarm information, obtained by processing the virtual alarm events generated at the devices, with failure information (the failure cause and/or the failure location), and generate training data for supervised learning of the relationship between the alarm information and the failure information.

The training data generator 130 may extract the virtual source alarm from the classified virtual alarm events and extract the failure cause (e.g., disconnection) associated with that virtual source alarm (e.g., VLOS). It may also extract the device that reported the virtual source alarm as the location of the virtual failure.
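The grouping and labeling performed by the training data generator 130 might look like the sketch below. The event dictionary keys (`device`, `code`, `alarm`), the cause label `"cable_cut"`, and the function name are hypothetical; the description gives only VLOS and disconnection as an example of a source alarm and its cause.

```python
from collections import defaultdict

# Assumed mapping from virtual source alarm type to failure cause.
# Only the VLOS -> disconnection pairing appears in the description.
CAUSE_BY_SOURCE_ALARM = {"VLOS": "cable_cut"}


def build_labeled_groups(events):
    """Group events by unique code and label each group.

    `events` is an iterable of dicts with the hypothetical keys
    'device', 'code' (unique code), and 'alarm' (virtual alarm type).
    """
    groups = defaultdict(list)
    for event in events:
        groups[event["code"]].append(event)

    samples = []
    for code, group in groups.items():
        sources = [e for e in group if e["alarm"] in CAUSE_BY_SOURCE_ALARM]
        if not sources:
            continue  # no virtual source alarm collected for this code
        source = sources[0]
        samples.append({
            "code": code,
            "events": group,
            "cause": CAUSE_BY_SOURCE_ALARM[source["alarm"]],
            "location": source["device"],  # device that reported the source alarm
        })
    return samples
```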

Meanwhile, when the failure analysis model is implemented as a convolutional neural network (CNN), the training data generator 130 may, as shown in FIG. 5, render the connection relationships among the devices that reported virtual alarm events as a matrix image, and generate training data in which the relationship matrix image is labeled with the failure cause.

The failure analysis model learner 150 trains the failure analysis model using the training data generated by the training data generator 130. Because virtual failures allow training data to be generated in large quantities, the failure analysis model learner 150 can train the failure analysis model sufficiently to improve its accuracy.

A unique code that identifies whether alarms stem from the same virtual failure is propagated between devices together with the virtual alarm, and the virtual alarm events collected by the alarm collector contain that unique code. Therefore, even if virtual failures are generated at many devices, the virtual alarm events can be classified by unique code. That is, virtual alarm events with the same unique code can be grouped into a single failure group, and within each single failure group the virtual source alarm and the virtual derived alarms can be distinguished based on the virtual alarm type. The training data generator 130 can thus produce the various kinds of training data that a given failure analysis model requires.

The failure analysis model learner 150 may take alarm information processed from the alarm events generated at the devices as input, set the failure information to be estimated (the failure cause and/or the failure location) as the output (target), and then train the node weights of the layers that make up the failure analysis model.

The failure analyzer 170 preprocesses the alarm events collected from the actual network to fit the input layer of the failure analysis model, and then feeds them into the trained failure analysis model. The failure analyzer 170 outputs the failure information that the trained failure analysis model estimates from the actual alarm information. The failure information is the information that the model was trained to estimate from its input, for example the failure cause and/or the failure location.

Meanwhile, the failure analysis model learner 150 may use the network topology to generate training data. The more accurately the frequently changing network topology is known, the more accurately the failure location can be estimated, so the present invention can generate and update the network topology from virtual alarm events.

The topology generator 190 generates the network topology by estimating the connection relationships among the devices that generated virtual alarm events. Specifically, it exploits the fact that when a failure occurs at a particular device, alarms propagate along the devices connected to it. Network information changes frequently, yet it is maintained only on a quarterly or half-yearly basis, which makes it hard to keep current. By generating virtual alarms at arbitrary devices and collecting information about the surrounding devices, the network map information can be updated continuously.

Virtual failures can be generated in various ways.

According to one embodiment, the virtual failure generation controller 120 may instruct a specific device in the network to generate a virtual failure. The virtual failure generation controller 120 may order various kinds of virtual failures, such as disconnection failures and unit failures, so that alarm events arising from a variety of failure causes are collected. The virtual failure generation instruction may further include information identifying the specific port of the specific device at which the virtual failure is to be generated. For example, to virtually generate a disconnection failure between device A and device B as in FIG. 1, the virtual failure generation controller 120 sends a virtual failure generation instruction to device A. On receiving the instruction, device A may transmit a packet carrying the unique code and the virtual failure type (e.g., VLOS) to the adjacent device (e.g., device B of FIG. 4), as shown in FIG. 4. Alternatively, the virtual failure generation controller 120 may, on behalf of device A, send device B a packet carrying the unique code and the virtual failure type (VLOS), so that device B learns that a disconnection failure between device A and device B has virtually occurred.

For reference, a disconnection failure is a state in which a signal cannot be sent to a port of another device (unit) because a specific port is not operating or the line is damaged; a device that receives a virtual disconnection failure instruction therefore generates the virtual disconnection failure by transmitting a virtual alarm on that specific port. A unit failure is a state in which signals cannot be sent on any port; a device that receives a virtual unit failure instruction can therefore generate the virtual unit failure by transmitting virtual alarms on all of its ports.
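The difference between the two failure kinds comes down to which ports carry the virtual alarm. A minimal sketch, using hypothetical kind labels `"cable_cut"` and `"unit_fail"` and a hypothetical function name:

```python
def ports_for_virtual_failure(kind, all_ports, target_port=None):
    """Return the ports on which a device should send the virtual alarm.

    kind: "cable_cut" (one specific port) or "unit_fail" (every port).
    These labels are illustrative, not taken from the description.
    """
    if kind == "cable_cut":
        if target_port not in all_ports:
            raise ValueError("a cable cut needs a valid target port")
        return [target_port]
    if kind == "unit_fail":
        return list(all_ports)
    raise ValueError("unknown virtual failure kind: %r" % (kind,))
```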

The virtual failure generation controller 120 may be implemented in a smart switch that controls the transmission devices.

According to another embodiment, the devices may generate virtual failures without any instruction from the virtual failure generation controller 120, by transmitting a packet containing a unique code and a virtual alarm (e.g., VLOS) to a connected device at a particular time. A device that receives a packet containing a virtual alarm (e.g., VLOS) recognizes that a disconnection failure has virtually occurred, and can propagate and report the virtual alarm. The time at which each device generates a virtual failure may be chosen at random. Alternatively, to prevent virtual failures from being generated at several devices simultaneously, connected devices may pass a token among themselves and generate virtual failures one after another.
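The token-passing alternative can be illustrated with a simple schedule in which only the current token holder generates a virtual failure. The description does not define the token protocol, so the fixed visiting order and the use of a running counter as the unique code are assumptions made for this sketch.

```python
from itertools import count


def token_schedule(devices, rounds=1):
    """Yield (step, device, unique_code), one acting device per step.

    The token visits `devices` in order; only its holder generates a
    virtual failure, so no two devices act in the same step.
    """
    codes = count(1)  # assumed: a running counter stands in for the unique code
    step = 0
    for _ in range(rounds):
        for device in devices:  # the token moves to the next connected device
            yield step, device, next(codes)
            step += 1
```

Because each step has exactly one acting device and a fresh code, events collected afterwards can still be separated by unique code even if reports arrive out of order.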

FIG. 6 is an example of training data for a failure analysis model according to an embodiment.

Referring to FIG. 6, the failure analysis model 10 learns the relationship between the alarm information generated at the devices and the failure information (the failure cause and/or the failure location). To this end, the training data generator 130 classifies the collected virtual alarm events by unique code and generates, from the classified events, training data suited to the failure analysis model. The training data generator 130 may group the virtual alarm events that share a unique code into a single failure group, and may generate training data in which each single failure group is labeled with the failure cause extracted from its virtual source alarm.

The failure analysis model 10 may be implemented as, for example, a convolutional neural network (CNN) and trained to classify its input into failure causes (Cable/Link Cut, Unit Fail, Node Fail, Power Fail).

According to one embodiment, the training data generator 130 computes, from the virtual alarm events classified by unique code, an NxN relationship matrix (network tensor) among the devices at which virtual alarms occurred (device 1 to device N). The training data generator 130 may then convert the relationship matrix into an image 20 in which color encodes the degree of relationship, thereby generating training data for a failure analysis model implemented as a CNN. The degree of relationship between two devices is calculated from designated items. The relationship matrix 20 may be generated so as to reflect an NxN connection information matrix representing the connections among the devices (device 1 to device N). The connection information matrix may be generated from the network topology, which represents how the transmission devices are connected.

According to another embodiment, the training data generator 130 may compute, from the virtual alarm events classified by unique code, an MxM relationship matrix among the devices at which virtual alarms occurred and the ports of each device. For example, for a network of five devices with four ports each, a 20x20 relationship matrix may be generated. The training data generator 130 may first create an initial relationship matrix by assigning 0 or 1 according to whether each pair of device ports is connected in the network topology. It then assigns matrix values according to the type of virtual alarm propagated between connected device ports. A different value is assigned for each alarm type (level); for example, 7 may be assigned for VLOS, 3 for VAIS, and 4 for VRDI. The training data generator 130 may generate an image whose colors correspond to the values of the relationship matrix, and may generate training data by labeling each relationship matrix image with a failure cause.
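The port-level relationship matrix could be built roughly as follows, using the example values 7, 3, and 4 from the text. The endpoint representation `(device_index, port_index)`, the row-major port numbering, and the choice to write the alarm value only in the direction of propagation are assumptions made for illustration.

```python
ALARM_VALUE = {"VLOS": 7, "VAIS": 3, "VRDI": 4}  # example values from the text


def port_index(device, port, ports_per_device):
    """Map a (device, port) pair to a row/column index (assumed layout)."""
    return device * ports_per_device + port


def build_relation_matrix(n_devices, ports_per_device, links, alarms):
    """Build the MxM matrix, M = n_devices * ports_per_device.

    links:  iterable of ((dev, port), (dev, port)) pairs from the topology.
    alarms: iterable of (src_endpoint, dst_endpoint, alarm_type) tuples.
    """
    size = n_devices * ports_per_device
    matrix = [[0] * size for _ in range(size)]
    for a, b in links:  # initial matrix: 1 where two ports are connected
        i = port_index(a[0], a[1], ports_per_device)
        j = port_index(b[0], b[1], ports_per_device)
        matrix[i][j] = matrix[j][i] = 1
    for src, dst, alarm in alarms:  # overwrite with the alarm-type value
        i = port_index(src[0], src[1], ports_per_device)
        j = port_index(dst[0], dst[1], ports_per_device)
        if matrix[i][j]:  # only connected ports can carry an alarm
            matrix[i][j] = ALARM_VALUE[alarm]
    return matrix
```

Rendering the matrix as an image then amounts to mapping each cell value to a color, which is left out here.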

The failure analysis model learner 150 trains the failure analysis model 10 using the training data in which each relationship matrix image is labeled with a failure cause.

The failure analyzer 170 analyzes failures using the trained failure analysis model 10. It generates an inter-device relationship matrix image from the actual alarm events collected over a certain period and feeds the image into the trained failure analysis model 10. The failure analysis model 10 outputs the single failure cause that has the highest probability among the failure causes.
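Selecting the most probable cause can be sketched as below. The sketch assumes that the model emits one raw score per failure cause and that a softmax turns the scores into probabilities; neither detail is stated in the description, and the class order simply follows the list of causes given for the failure analysis model 10.

```python
import math

CAUSES = ["Cable/Link Cut", "Unit Fail", "Node Fail", "Power Fail"]


def most_likely_cause(scores):
    """Return (cause, probability) for the highest-probability class."""
    if len(scores) != len(CAUSES):
        raise ValueError("expected one score per failure cause")
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract peak for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return CAUSES[best], probs[best]
```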

In doing so, the failure analyzer 170 can use an attention mechanism to find the information (region) of the relationship matrix image that influenced the prediction of that failure cause. Because the relationship matrix image is an NxN matrix representing the relationships among device 1 to device N, finding the information (region) that influenced the prediction reveals the failure location.
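Once an attention map over the NxN image is available, mapping its strongest cell back to devices is straightforward. The sketch below assumes the attention map has already been computed by some means (the description does not say how) and has the same NxN shape as the relationship matrix; the function name is hypothetical.

```python
def locate_failure(attention, device_names):
    """Return the device pair at the highest-weight cell of an NxN map."""
    n = len(device_names)
    if len(attention) != n or any(len(row) != n for row in attention):
        raise ValueError("attention map must be NxN for N devices")
    best_i, best_j = max(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda ij: attention[ij[0]][ij[1]],
    )
    return device_names[best_i], device_names[best_j]
```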

FIG. 7 is a flowchart of a method of generating training data using virtual alarms according to an embodiment.

Referring to FIG. 7, the network failure analysis apparatus 100 collects the virtual alarm events generated (reported) by the transmission devices of the network (S110). Packets containing a unique code and a virtual alarm type propagate to the connected transmission devices, and the network failure analysis apparatus 100 can collect virtual alarm events that contain the unique code and the virtual alarm type. A virtual alarm event may include device information, the unique code, the virtual alarm type, region information, and the like.

The network failure analysis apparatus 100 classifies the virtual alarm events into a plurality of groups by the unique code they contain (S120). Each group consists of the virtual alarm events that contain the same unique code.

The network failure analysis apparatus 100 extracts the virtual source alarm from among the virtual alarm events classified by unique code (S130).

The network failure analysis apparatus 100 extracts the failure cause associated with the virtual source alarm (S140).

The network failure analysis apparatus 100 generates training data in which the virtual alarm events classified by unique code are labeled with the extracted failure cause (S150). The network failure analysis apparatus 100 may also extract the device that reported the virtual source alarm as the failure location and generate training data labeled with both the failure cause and the failure location. The training data are generated from virtual alarm events containing the same unique code, and their form may vary with the type of failure analysis model, the structure of its input and output layers, and so on.

The network failure analysis apparatus 100 trains the failure analysis model using the training data (S160).

FIG. 8 is a flowchart of a failure analysis method using a trained failure analysis model according to an embodiment.

Referring to FIG. 8, the network failure analysis apparatus 100 collects actual alarm events generated by the transmission devices of the network (S210).

The network failure analysis apparatus 100 processes (preprocesses) the collected actual alarm events into input information for the trained failure analysis model (S220). Even if the collected actual alarm events do not contain a unique code, the network failure analysis apparatus 100 can group them into alarms associated with a specific failure by referring to the network topology.
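The description does not say how the topology is used to group actual alarms. One plausible approach, shown here purely as an assumption, is to treat alarmed devices that are directly linked in the topology as belonging to the same failure, that is, to take connected components of the alarmed subgraph. A real system would presumably also restrict grouping to a time window, which this sketch omits.

```python
def group_by_topology(alarmed_devices, links):
    """Group alarmed devices into connected components of the topology.

    links: iterable of (device_a, device_b) pairs from the topology.
    Only links whose two ends both raised alarms are followed.
    """
    alarmed = set(alarmed_devices)
    neighbors = {d: set() for d in alarmed}
    for a, b in links:
        if a in alarmed and b in alarmed:
            neighbors[a].add(b)
            neighbors[b].add(a)

    groups, seen = [], set()
    for start in sorted(alarmed):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over alarmed neighbors
            device = stack.pop()
            if device in component:
                continue
            component.add(device)
            stack.extend(neighbors[device] - component)
        seen |= component
        groups.append(sorted(component))
    return groups
```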

The network failure analysis apparatus 100 feeds the input information generated from the actual alarm events into the trained failure analysis model (S230).

The network failure analysis apparatus 100 displays the failure information output by the trained failure analysis model (S240). The output failure information may include the failure cause and the failure location.

FIG. 9 is a diagram illustrating a topology construction method according to an embodiment.

Referring to FIG. 9, the network failure analysis apparatus 100 may generate a network topology using the collected virtual alarm events. Assume that the transmission equipment is connected as A-B-C-D-E and that a signal is transmitted from equipment A (end user) to equipment E.

Referring to (a), to virtually implement a failure in which the link between equipment B and equipment C is broken, equipment B includes VLOS in the header of a packet that it transmits to equipment C. Equipment C, having received the packet containing VLOS, reports the occurrence of VLOS on its receiving link to the NMS. Because equipment B is connected in the reverse direction of signal travel, it propagates a packet containing VRDI in the reverse direction and reports the occurrence of VRDI to the NMS. Because equipment D, adjacent to equipment C, is connected in the forward direction of signal travel, it propagates a packet containing VAIS in the forward direction and reports the occurrence of VAIS to the NMS. Because equipment A and equipment E are at a lower hierarchy level, they generate VRDI and VAIS, respectively, and the VRDI and VAIS are reported to the NMS. If equipment A and equipment E are at the same hierarchy level, VRDI and VAIS are not propagated any further.

The NMS receives virtual alarm events, such as VLOS, VRDI, and VAIS alarm events, from the equipment. Here, the equipment may transmit packets containing a unique code together with the virtual alarm type, and may report to the NMS a virtual alarm event containing the virtual alarm type and the unique code.

Based on information received from the NMS, such as that shown in Table 3, the network failure analysis apparatus 100 may determine the direction information and hierarchy level indicated by the virtual alarms and generate the topology connected as A-B-C-D-E.

Table 3
Transmission equipment | Virtual alarm type | Hierarchy level information (included in header)
C | VLOS | Level of equipment C
B | VRDI | Level of equipment B
D | VAIS | Level of equipment D
A | VRDI | Level of equipment A
E | VAIS | Level of equipment E
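
To illustrate how direction information could yield a chain, the sketch below orders the equipment of one unique-code group. It relies on an assumption that is not stated in the embodiment: events are listed in order of increasing distance from the failed link, as the rows of Table 3 happen to be. The function name `chain_from_group` is hypothetical, and hierarchy level information is ignored for brevity.

```python
def chain_from_group(events):
    """Order the devices of one unique-code group along the signal direction.

    `events` is a list of (device, alarm_type) pairs, assumed to be sorted by
    increasing distance from the failed link.
    """
    upstream = [dev for dev, alarm in events if alarm == "VRDI"]    # reverse direction
    center = [dev for dev, alarm in events if alarm == "VLOS"]      # receiving end of the failed link
    downstream = [dev for dev, alarm in events if alarm == "VAIS"]  # forward direction
    # The farthest upstream device comes first, so the upstream list is reversed.
    return upstream[::-1] + center + downstream


table3 = [("C", "VLOS"), ("B", "VRDI"), ("D", "VAIS"), ("A", "VRDI"), ("E", "VAIS")]
chain = chain_from_group(table3)
```

With the rows of Table 3 as input, the chain comes out as A, B, C, D, E.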

Referring to (b), to virtually implement a failure in which the link between equipment A and equipment B is broken, equipment A includes VLOS in the header of a packet that it transmits to equipment B. Equipment B, having received the packet containing VLOS, reports the occurrence of VLOS on its receiving link to the NMS. Because equipment A is connected in the reverse direction of signal travel, it propagates a packet containing VRDI in the reverse direction and reports the occurrence of VRDI to the NMS. Because equipment C is connected in the forward direction of signal travel, it propagates a packet containing VAIS in the forward direction and reports the occurrence of VAIS to the NMS. If equipment C and equipment D are at the same hierarchy level, the derived alarms do not extend any further. The NMS receives virtual alarm events, such as VLOS, VRDI, and VAIS alarm events, from the equipment.

Using information received from the NMS, such as that shown in Table 4, the network failure analysis apparatus 100 may generate the topology connected as A-B-C.

Table 4
Transmission equipment | Virtual alarm type | Hierarchy level information (included in header)
B | VLOS | Level of equipment B
A | VRDI | Level of equipment B
C | VAIS | Level of equipment D

In this way, because the network failure analysis apparatus 100 can use virtual alarms to collect virtual alarm events classified under the same unique code, it can construct the network topology by determining the direction information (forward, reverse) and hierarchy level indicated by the virtual alarms. Even when only a partial topology is obtained, as in (b), the topology can be updated continuously through multiple virtual failures.
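
The continuous update described above can be sketched as accumulating the links observed in each partial chain. Representing the topology as a set of undirected links and the function name `merge_chain` are assumptions made for illustration.

```python
def merge_chain(links, chain):
    """Add the adjacent pairs of one observed chain to the set of known links."""
    for a, b in zip(chain, chain[1:]):
        links.add(frozenset((a, b)))  # undirected link between neighbouring devices
    return links


links = set()
merge_chain(links, ["A", "B", "C"])            # partial topology, as in (b)
merge_chain(links, ["A", "B", "C", "D", "E"])  # wider topology, as in (a)
```

Links already known are unaffected by later merges, so repeated virtual failures only extend or confirm the topology.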

As in (a), when a virtual failure is generated near central equipment, a wide range of the topology can be identified. Accordingly, a wide range of the topology can first be identified by generating virtual failures in higher-level equipment, and the topology can then be updated in detail by progressively generating virtual failures in lower-level equipment.

Thus, according to the embodiment, virtual alarm events can be clearly distinguished and grouped using the unique code, and by generating virtual failures in various transmission equipment, a large amount of training data can be obtained even when no actual communication failure occurs.

According to the embodiment, the failure analysis model is trained by supervised learning on virtual alarm events to classify the failure cause and/or the failure location, which can improve the accuracy of the failure analysis model.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited thereto. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

Claims

A network failure analysis apparatus comprising:
an alarm collector that collects virtual alarm events generated by transmission equipment of a network; and
a training data generator that classifies the virtual alarm events into a plurality of groups based on a unique code included in the virtual alarm events, extracts a virtual source alarm for each group based on the virtual alarm type included in each virtual alarm event, and generates training data in which the corresponding group is labeled with a failure cause related to the virtual source alarm,
wherein the virtual alarm events included in each group are generated by transmission equipment related to specific transmission equipment due to a virtual failure generated by the specific transmission equipment, and the specific transmission equipment and the related transmission equipment generate virtual alarm events in which the same unique code is written.

The network failure analysis apparatus of claim 1,
wherein the unique code is written in a designated reserved field of a packet header transmitted between transmission equipment.

The network failure analysis apparatus of claim 1,
wherein the virtual alarm type is written in a designated reserved field of a packet header transmitted between transmission equipment.

The network failure analysis apparatus of claim 1,
wherein the training data generator
extracts the transmission equipment that reported the virtual source alarm as a failure location, and generates training data in which each group is labeled with the failure cause and the failure location.

The network failure analysis apparatus of claim 1,
wherein the training data generator
creates a matrix of the connection relationships between the transmission equipment that generated the virtual alarm events included in each group, converts the matrix into a relationship matrix image in which colors are displayed according to the relationships, and labels the relationship matrix image with the failure cause.

The network failure analysis apparatus of claim 1, further comprising:
a learner that trains, based on the training data generated by the training data generator, a failure analysis model that estimates the relationship between a failure and the alarms generated by transmission equipment due to the failure; and
a failure analyzer that preprocesses collected actual alarm events to fit the input of the failure analysis model, feeds the preprocessed input information into the trained failure analysis model, and outputs failure information estimated by the trained failure analysis model.

The network failure analysis apparatus of claim 6,
wherein the failure information includes a failure cause and a failure location.

The network failure analysis apparatus of claim 1, further comprising
a topology generator that estimates, based on the direction information indicated by the virtual alarm type included in each virtual alarm event, the connection relationships between the transmission equipment that generated the virtual alarm events included in each group, and generates a network topology based on the estimated connection relationships.

A method by which a network failure analysis apparatus generates training data, the method comprising:
collecting virtual alarm events generated by transmission equipment of a network;
classifying the virtual alarm events into a plurality of groups based on a unique code included in the virtual alarm events;
determining the failure cause that produced the virtual alarm events of each group, based on the virtual source alarm included in each virtual alarm event; and
generating training data in which each group is labeled with the failure cause,
wherein the virtual alarm events included in each group are generated by transmission equipment related to specific transmission equipment due to a virtual failure generated by the specific transmission equipment, and the specific transmission equipment and the related transmission equipment generate virtual alarm events in which the same unique code is written.

The training data generation method of claim 9, further comprising
instructing the specific transmission equipment to generate the virtual failure.

The training data generation method of claim 9,
wherein generating the training data comprises
creating a matrix of the connection relationships between the transmission equipment that generated the virtual alarm events included in each group, converting the matrix into a relationship matrix image in which colors are displayed according to the relationships, and labeling the relationship matrix image with the failure cause.

A method by which a network failure analysis apparatus trains a network failure analysis model, the method comprising:
collecting virtual alarm events generated by transmission equipment of a network;
classifying the virtual alarm events into a plurality of groups based on a unique code included in the virtual alarm events;
extracting a virtual source alarm for each group based on the virtual alarm type included in each virtual alarm event, and determining a failure cause related to the virtual source alarm;
generating training data by creating a matrix of the connection relationships between the transmission equipment that generated the virtual alarm events included in each group, converting the matrix into a relationship matrix image in which colors are displayed according to the relationships, and labeling the relationship matrix image with the failure cause; and
training a failure analysis model based on the training data.

The network failure analysis model training method of claim 12,
wherein the failure analysis model is implemented as a convolutional neural network that estimates the relationship between a failure and the alarms generated by transmission equipment due to the failure.

The network failure analysis model training method of claim 12,
wherein generating the training data comprises
generating the matrix based on a network topology that includes the connection relationships between the transmission equipment.

The network failure analysis model training method of claim 12,
wherein the virtual alarm events included in each group are generated by transmission equipment related to specific transmission equipment due to a virtual failure generated by the specific transmission equipment, and the specific transmission equipment and the related transmission equipment generate virtual alarm events in which the same unique code is written.

A method by which network transmission equipment generates a virtual failure, the method comprising:
writing, when the virtual failure occurs, a unique code and a virtual alarm corresponding to the virtual failure in a header of a packet to be transmitted to adjacent equipment; and
transmitting the packet including the unique code and the virtual alarm to the adjacent equipment,
wherein the unique code and the virtual alarm are written in a field of the header different from the field assigned to actual alarms, and
wherein the adjacent equipment determines, based on the virtual alarm, that the virtual failure has occurred on its receiving link and performs the operation designated for the occurrence of the virtual failure.