KR20190104759A

KR20190104759A - System and method for intelligent equipment abnormal symptom proactive detection

Info

Publication number: KR20190104759A
Application number: KR1020180025268A
Authority: KR
Inventors: 채윤주; 이종필; 권성용; 성종규
Original assignee: 주식회사 케이티
Priority date: 2018-03-02
Filing date: 2018-03-02
Publication date: 2019-09-11
Also published as: KR102150622B1

Abstract

A system for detecting abnormalities in network equipment in advance comprises: a graph similarity derivation module that normalizes a plurality of resource usages collected from network equipment to generate resource usage graphs, and compares a reference resource graph selected from the resource usage graphs with the remaining resource graphs to extract at least one associated resource graph associated with the reference resource graph; a failure occurrence information storage module that stores failure history information on network equipment in which a failure has occurred; and a failure detection module that compares the associated resource graph with the failure history information to define abnormality-prone equipment, and compares the resource usage graph of the network equipment defined as abnormality-prone equipment with the failure history information to identify equipment in which an abnormality is highly likely to occur, thereby detecting abnormal symptoms in advance.

Description

System and method for intelligent equipment abnormal symptom proactive detection

The present invention relates to a system and method for proactively detecting abnormal symptoms in intelligent equipment.

Various detection methods have been proposed for detecting network overload. To detect network overload, an alarm identifying the failed equipment is issued to the responsible personnel when a network device fails. Because such a failure alarm is raised only after the failure has occurred, the problem is addressed only after the network equipment has stopped providing service, so service interruption is unavoidable.

In addition, detecting network overload requires network equipment personnel who are highly skilled engineers with knowledge of many different kinds of network equipment. Moreover, maintaining network equipment in step with a changing network system, such as new expansion or redesign, continuously requires a large number of such highly skilled personnel.

Furthermore, there is a limit to how well network equipment personnel can review the vast amount of network resource status data generated by various types of equipment and judge how it relates to failures of the network equipment. In addition, failures of all network equipment cannot be determined from the network status alone.

Beyond this, conventional techniques for detecting network equipment abnormalities in advance perform learning on the information of all resources, so learning takes a long time and the information reaches the personnel when little time remains to resolve the abnormality. Also, when a technique assigns a 'threshold' to the usage of the network equipment and reports a possible abnormality once the threshold is exceeded, no information can be delivered to the personnel until the usage crosses the threshold.

Accordingly, the present invention provides a system and method for proactively detecting abnormal symptoms of intelligent equipment by learning the correlation among the trends of change of individual network equipment resources.

A system for detecting abnormal symptoms of network equipment in advance, which is one feature of the present invention for achieving the above technical object, includes:

a graph similarity derivation module that normalizes a plurality of resource usages collected from network equipment to generate resource usage graphs, and compares a reference resource graph selected from the resource usage graphs with the remaining resource graphs to extract at least one associated resource graph associated with the reference resource graph; a failure occurrence information storage module that stores failure history information on network equipment in which a failure has occurred; and a failure detection module that compares the associated resource graph with the failure history information to define abnormality-prone equipment, and compares the resource usage graph of the network equipment defined as abnormality-prone equipment with the failure history information to identify abnormality-likely equipment, thereby detecting abnormal symptoms in advance.

According to the present invention, the collected resource data of network equipment is analyzed on the basis of resource correlation and compared with existing records of abnormalities in various network equipment, and the results are provided, so that an improvement in network quality can be expected.

In addition, the preliminary learning method detects a fine-grained learning start time, and the trailing learning method shortens the time needed to respond to abnormalities in network equipment.

In addition, abnormalities in newly designed and installed network equipment are detected in advance and reported, which supports the decision-making of network equipment managers and thereby improves the efficiency of human resources.

In addition, a system can be provided that analyzes the collected network equipment resources on the basis of correlation and automatically detects, in advance, information on network equipment abnormalities for network equipment personnel.

FIG. 1 is an exemplary diagram of an environment to which a pre-detection system according to an embodiment of the present invention is applied.
FIG. 2 is a structural diagram of a pre-detection system according to an embodiment of the present invention.
FIG. 3 is a flowchart of a method for proactively detecting abnormal symptoms of intelligent equipment according to an embodiment of the present invention.
FIG. 4 is an exemplary diagram of data normalization for per-device resource usage according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram of a per-device resource usage graph according to an embodiment of the present invention.
FIG. 6 is another exemplary diagram of graphing per-device resource usage according to an embodiment of the present invention.
FIG. 7 is an exemplary diagram showing the graph-distance-based association of each resource with traffic according to an embodiment of the present invention.
FIG. 8 is an exemplary diagram of a variation similarity analysis based on filtered resource usage according to an embodiment of the present invention.
FIG. 9 is an exemplary diagram of a trailing similarity analysis according to an embodiment of the present invention.
FIG. 10 is an exemplary diagram of information added to an abnormality occurrence history according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice them. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present invention clearly, and like reference numerals designate like parts throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

In this specification, a terminal may refer to a mobile station (MS), a mobile terminal (MT), a subscriber station (SS), a portable subscriber station (PSS), user equipment (UE), an access terminal (AT), or the like, and may include all or some of the functions of a mobile terminal, a subscriber station, a portable subscriber station, user equipment, and the like.

Hereinafter, a system and method for detecting abnormal symptoms of intelligent equipment in advance according to an embodiment of the present invention are described in detail with reference to the drawings.

FIG. 1 is an exemplary diagram of an environment to which a pre-detection system according to an embodiment of the present invention is applied.

As shown in FIG. 1, an intelligent equipment abnormality pre-detection system 100 (hereinafter referred to as the 'pre-detection system' for convenience of description), which interworks with a plurality of network devices 200-1 to 200-n, collects information on the resource usage of each of the network devices 200-1 to 200-n.

After analyzing the collected resource usage information, the system performs a preliminary analysis of the resource records whose usage has changed abruptly. The pre-detection system 100 detects the possibility of a problem occurring in the network devices 200-1 to 200-n by using the information of the preliminarily analyzed resource records together with resource information from network equipment in which abnormalities previously occurred. Having detected the possibility of a problem, the pre-detection system 100 transmits information on that possibility to a terminal 300 carried by network equipment personnel.

The pre-detection system 100 detects the onset time of an anomaly at fine granularity. Then, based on the resource records of network equipment for which a similarity to a past abnormality at or above a certain level was detected, it learns the full set of resources from the starting point of the abrupt resource change identified in the preliminary learning and compares the result with the history of network equipment in which abnormalities previously occurred. In this way it can detect network abnormalities in advance for the user on the basis of automated learning and similarity.

The structure of the pre-detection system 100, which collects the resource usage information of the network devices 200-1 to 200-n, analyzes the resource records whose usage has changed abruptly, and detects the possibility of a problem, is described with reference to FIG. 2.

FIG. 2 is a structural diagram of a pre-detection system according to an embodiment of the present invention.

As shown in FIG. 2, the pre-detection system 100, which operates under the control of at least one processor, includes a resource collection module 110, a graph similarity derivation module 120, a failure detection module 130, a presentation module 140, and a failure occurrence information storage module 150.

The resource collection module 110 collects, from the plurality of network devices 200-1 to 200-n, the resource usage of each network device. The resource usage includes identification information of each network device, information on the time at which the resource usage was collected, and the various resource usages of each network device.
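The collected record described above can be pictured as a small data structure. The following is a minimal sketch, not the patent's implementation: the class name `ResourceSample`, its field names, and the memory, disk, and traffic values are hypothetical, while the 17% CPU usage and 20-degree temperature are taken from the example values given later in the description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class ResourceSample:
    """One collected record (hypothetical layout, for illustration only)."""
    device_id: str             # identification information of the network device
    collected_at: datetime     # time at which the resource usage was collected
    usages: Dict[str, float]   # resource name -> raw usage value


# Illustrative values; only the CPU and temperature figures come from the description.
sample = ResourceSample(
    device_id="200-1",
    collected_at=datetime(2018, 3, 2, 0, 0),
    usages={"cpu": 17.0, "memory": 40.0, "disk_io": 9.0,
            "traffic": 26.0, "temperature": 20.0},
)
print(sample.device_id, sorted(sample.usages))
```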

In this embodiment, the resource usage of a network device is described, by way of example, as a plurality of resource usages: the usage of each of the CPU, memory, and disk that make up the network device, the resource usage for processing traffic in the network device, and the resource usage attributable to the temperature of the network device. However, the invention is not necessarily limited to this.

The graph similarity derivation module 120 normalizes the resource usage collected by the resource collection module 110 into numerical values and generates resource usage information. The resource usage is normalized so that usages collected from different kinds of components can be expressed on a unified numerical scale. Because the graph similarity derivation module 120 can normalize the resource usage in various ways, a detailed description is omitted in this embodiment.
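The description leaves the normalization method open. As one possible choice, and purely as an assumption, min-max scaling maps each resource series onto a common 0 to 100 scale so that, for example, temperatures in degrees and usages in percent become comparable. The function name `min_max_normalize` and the `scale` parameter are hypothetical.

```python
def min_max_normalize(values, scale=100.0):
    """Map a series onto [0, scale] by min-max scaling (an assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) * scale for v in values]


temperatures = [20.0, 25.0, 30.0]
print(min_max_normalize(temperatures))  # [0.0, 50.0, 100.0]
```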

Based on the quantified resource usage information, the graph similarity derivation module 120 generates a resource usage graph on a two-dimensional plane. The resource usage graph generated for a given network device includes a CPU graph, a memory graph, a disk input/output graph, a traffic volume graph, and a temperature graph. Because the graph similarity derivation module 120 can generate the resource usage graph from the resource usage information in various ways, a detailed description is omitted in this embodiment.

In a resource usage graph containing a plurality of graphs, the graph similarity derivation module 120 determines the distance between the traffic volume graph, taken as the reference graph, and each of the other graphs (the CPU graph, memory graph, disk input/output graph, and temperature graph). This embodiment is described using the traffic volume graph as the reference graph, but the invention is not necessarily limited to this.

The graph similarity derivation module 120 checks the distances between the graphs and determines whether any graph exceeds a preset threshold distance. If so, then taking the time point at which the threshold distance was judged to be exceeded (hereinafter the 'reference time point') as the basis, it extracts only the portions of the graphs drawn at the preceding time point (first time point) and the following time point (second time point) and generates a distance difference graph. Here, the distance difference graph plots the distance between the reference graph and each of the other graphs, and covers only the span from the first time point through the reference time point to the second time point.
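The threshold check and windowing described above might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: all series are assumed to be normalized and of equal length, the distance is taken as the pointwise absolute difference, and the function names and sample values are hypothetical.

```python
def distance_series(reference, other):
    """Pointwise absolute distance between the reference graph and another graph."""
    return [abs(r - o) for r, o in zip(reference, other)]


def find_reference_time(reference, others, threshold_distance):
    """Index of the first time point at which any graph lies farther from the
    reference graph than the threshold distance, or None if none does."""
    for t in range(len(reference)):
        if any(abs(reference[t] - series[t]) > threshold_distance
               for series in others.values()):
            return t
    return None


def distance_difference_graph(reference, others, t):
    """Distances restricted to the window (t-1, t, t+1) around the reference time point."""
    start, stop = max(t - 1, 0), t + 2
    return {name: distance_series(reference, series)[start:stop]
            for name, series in others.items()}


# Illustrative normalized values (hypothetical).
traffic = [20, 22, 26, 27]
others = {"cpu": [18, 20, 23, 24], "disk_io": [17, 18, 9, 8]}

t_ref = find_reference_time(traffic, others, threshold_distance=5)
print(t_ref)                                              # 2
print(distance_difference_graph(traffic, others, t_ref))  # {'cpu': [2, 3, 3], 'disk_io': [4, 17, 19]}
```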

In the distance difference graph, the graph similarity derivation module 120 identifies only those graphs for which the difference in absolute value between successive time points is smaller than a predetermined threshold. It then identifies the components of the network device whose resource usage produced those graphs. The graph similarity derivation module 120 defines the resource usage collected from the identified components as 'associated resources' and generates an associated resource graph.
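One way to read this selection step is sketched below. It assumes that "difference between time points" means the absolute change in distance from one time point to the next within the window; the function name `associated_resources`, the threshold value, and the window values are hypothetical.

```python
def associated_resources(distance_windows, change_threshold):
    """Names of resources whose distance to the reference graph changes by less
    than change_threshold between every pair of successive time points."""
    selected = []
    for name, window in distance_windows.items():
        steps = [abs(b - a) for a, b in zip(window, window[1:])]
        if all(step < change_threshold for step in steps):
            selected.append(name)
    return selected


# Illustrative distance windows over (t-1, t, t+1); values are hypothetical.
windows = {
    "cpu": [2, 3, 3],
    "disk_io": [16, 17, 18],
    "memory": [10, 25, 12],
    "temperature": [8, 20, 30],
}
print(associated_resources(windows, change_threshold=5))  # ['cpu', 'disk_io']
```

In this sketch the disk series stays associated even though its distance from the reference is large, because the selection looks only at how steadily the distance moves from one time point to the next.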

The failure detection module 130 detects a similarity by comparing the failure history graph stored in the failure occurrence information storage module 150 with the associated resource graph generated by the graph similarity derivation module 120. Here, the failure history graph is a graph generated from the resource usage of network equipment in which a failure has already occurred.

When the failure detection module 130 compares the two graphs to detect the similarity, it judges the similarity to be high if the distance between the graphs is smaller than a preset similarity threshold distance, although the invention is not necessarily limited to this. In this embodiment the module is called the 'failure detection module 130' for convenience of description, but its operation may also be described in terms of learning.
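The description does not fix how the distance comparison becomes a similarity value. As one hedged possibility, the score below is the fraction of time points at which the two graphs lie closer together than the similarity threshold distance. The scoring rule, the function name `graph_similarity`, and the sample values are all assumptions made for illustration.

```python
def graph_similarity(candidate, history, threshold_distance):
    """Fraction of time points at which the two graphs are closer than
    threshold_distance (an assumed scoring rule)."""
    if not candidate or len(candidate) != len(history):
        raise ValueError("graphs must be non-empty and of equal length")
    close = sum(1 for c, h in zip(candidate, history)
                if abs(c - h) < threshold_distance)
    return close / len(candidate)


candidate = [10, 12, 30, 31]   # associated resource graph (hypothetical values)
history = [11, 13, 29, 40]     # failure history graph (hypothetical values)
print(graph_similarity(candidate, history, threshold_distance=2))  # 0.75
```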

When the detected similarity is greater than or equal to a preset first similarity threshold, the failure detection module 130 defines the network device from which the resource usage was collected as a network device in which, according to the preliminary analysis, an abnormality may occur (hereinafter 'abnormality-prone equipment' for convenience of description). The failure detection module 130 then compares, with the failure history graph, the resource usage graph for the entire resource usage collected from the abnormality-prone equipment from the first time point (Δt-1), which precedes the reference time point (Δt), to the second time point (Δt-2), which follows it.

If the similarity between the resource usage graph and the failure history graph, analyzed from the first time point onward, is greater than or equal to a preset second similarity threshold, the network device is defined as a network device in which, according to the trailing analysis, an abnormality is likely to occur (hereinafter 'abnormality-likely equipment' for convenience of description). The first and second similarity thresholds may or may not be equal, and neither is limited to any particular value.
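The two-stage decision (preliminary analysis on the associated resources, then trailing analysis on all resources) can be sketched as below. The similarity measure, the threshold values of 0.8, the labels, and the data are hypothetical choices made only to show the control flow.

```python
def similarity(graph, history, threshold_distance=2.0):
    """Fraction of points, over the resources both graphs share, that lie
    closer than threshold_distance (an assumed similarity measure)."""
    pairs = [(g, h)
             for name in graph if name in history
             for g, h in zip(graph[name], history[name])]
    if not pairs:
        return 0.0
    return sum(abs(g - h) < threshold_distance for g, h in pairs) / len(pairs)


def classify(associated, full, history, first_threshold=0.8, second_threshold=0.8):
    """Stage 1 compares only the associated resources; stage 2 compares all resources."""
    if similarity(associated, history) < first_threshold:
        return "normal"
    if similarity(full, history) < second_threshold:
        return "abnormality-prone"
    return "abnormality-likely"


# Hypothetical failure history and current usage over (t-1, t, t+1).
history = {"cpu": [17, 20, 24], "disk_io": [9, 10, 12], "memory": [30, 31, 33]}
full = {"cpu": [17, 21, 24], "disk_io": [9, 10, 13], "memory": [30, 40, 50]}
associated = {name: full[name] for name in ("cpu", "disk_io")}

print(classify(associated, full, history))  # abnormality-prone
```

Running the cheaper comparison on the associated resources first means the full comparison is needed only for devices that already look suspicious, which is the motivation the description gives for splitting the analysis.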

Finally, the failure detection module 130 compares the resource usage data of the abnormality-likely equipment once more against actual equipment abnormality information, that is, information on equipment with an actual failure history. The actual equipment abnormality information may be stored in the failure occurrence information storage module 150 or managed in a separate database.

If the resource usage graph generated from the resource usage of the abnormality-likely equipment matches the graph of the resource usage contained in the abnormality information of equipment in which an abnormality actually occurred, the information of the network device corresponding to the abnormality-likely equipment, the collected resource usage, and the graph are temporarily stored in the failure occurrence information storage module 150.

The presentation module 140 transmits the information on the abnormality-likely equipment analyzed by the failure detection module 130, together with the history of the past network failure that showed high similarity, to the terminal 300 carried by network equipment personnel. It then receives feedback from the terminal 300 indicating whether the warning actually developed into an abnormality of the network equipment.

If the warning did develop into an actual abnormality of the network equipment, the information on the network device that was flagged as possibly abnormal is stored in the failure occurrence information storage module 150 as a failure occurrence history.
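The temporary storage and feedback steps described above could look like the following sketch. The class `FailureHistoryStore` and its methods are hypothetical stand-ins for the failure occurrence information storage module 150, intended only to show how a warning is promoted to failure history once the terminal confirms it.

```python
class FailureHistoryStore:
    """Hypothetical stand-in for the failure occurrence information storage module."""

    def __init__(self):
        self.confirmed = []   # failure occurrence history
        self.pending = {}     # temporarily stored warnings, keyed by device id

    def hold(self, device_id, record):
        """Temporarily store the record of abnormality-likely equipment."""
        self.pending[device_id] = record

    def feedback(self, device_id, became_failure):
        """Apply terminal feedback; promote the record only if a real failure followed."""
        record = self.pending.pop(device_id, None)
        if record is not None and became_failure:
            self.confirmed.append(record)
            return True
        return False


store = FailureHistoryStore()
store.hold("200-1", {"device_id": "200-1", "cpu": [17, 21, 24]})
print(store.feedback("200-1", became_failure=True))  # True
print(len(store.confirmed))                          # 1
```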

A method of detecting abnormal symptoms of intelligent network equipment in advance using the pre-detection system 100 described above is described with reference to FIGS. 3 to 10.

FIG. 3 is a flowchart of a method for proactively detecting abnormal symptoms of intelligent equipment according to an embodiment of the present invention.

As shown in FIG. 3, the pre-detection system 100 collects, from the plurality of network devices 200-1 to 200-n, the resource usage of each network device, and normalizes the collected resource usage into numerical values (S100). An example of normalizing and quantifying the resource usage of one of the network devices is described with reference to FIG. 4.

FIG. 4 is an exemplary diagram of data normalization for per-device resource usage according to an embodiment of the present invention.

As shown in FIG. 4(a), resource usage is collected over time, including that of the components making up a given network device, the resource usage for processing traffic generated in the network device, and the temperature of the network device. This embodiment is described using, as the collected resource usage, the usage of the CPU, the usage of the memory, the usage for disk input/output, the traffic volume in the network device, and the temperature of the network device.

Here, the CPU, memory, disk input/output, and traffic volume figures are collected as percentages indicating how much of the total resource usage of the network device each component accounts for. For example, the resource usage of the CPU at the first time point (Δt-1) is 17%. The temperature of the network device at the first time point is 20 degrees.

The graph similarity derivation module 120 normalizes the resource usage collected in FIG. 4(a) into the numerical values shown in FIG. 4(b). Because the graph similarity derivation module 120 can normalize and quantify the resource usage in various ways, a detailed description is omitted in this embodiment.

Meanwhile, as shown in FIG. 3, the pre-detection system 100 generates resource usage graphs from the resource usage of the plurality of network devices 200-1 to 200-n quantified in step S100 (S101). An example of generating a resource usage graph from the quantified resource usage is described with reference to FIG. 5.

FIG. 5 is an exemplary diagram of a per-device resource usage graph according to an embodiment of the present invention.

FIG. 5(a) is a table of the normalized, quantified resource usage. As shown in FIG. 5(b), the graph similarity derivation module 120 generates a resource usage graph from the normalized values. Here, the X axis is the time axis and the Y axis shows the normalized values.

At a given time point (Δt), the network device used 17 pt of resources for the CPU, 26 pt for the traffic volume, and 9 pt for disk input/output. It can also be seen that, for a single network device, a resource usage graph containing a plurality of graphs is generated.

Meanwhile, as shown in FIG. 3, once the resource usage graph has been generated in step S101, the pre-detection system 100 compares the distance between the reference graph and each of the remaining graphs in the resource usage graph containing the plurality of graphs (S102). This embodiment is described using the traffic volume graph as the reference graph.

At every time point at which resource usage is collected, the pre-detection system 100 checks whether there is any graph whose distance from the traffic volume graph exceeds the threshold distance (S103). If there is no such graph, the method continues from step S100.

If, however, one of the graphs exceeds the threshold distance, the pre-detection system 100 takes the time point at which the threshold distance is exceeded as the reference time point and extracts a distance difference graph spanning from the first time point, immediately before the reference time point, to the second time point, immediately after it. It then identifies the associated resources from the distance difference graph and generates an associated resource graph (S104). This is described with reference to FIGS. 6 and 7.

FIG. 6 is another exemplary diagram of graphing per-device resource usage according to an embodiment of the present invention, and FIG. 7 is an exemplary diagram showing the graph-distance-based association of each resource with traffic according to an embodiment of the present invention.

As shown in FIG. 6, the distance between the traffic volume graph, which is the reference graph, and the CPU graph is 3 pt, and the distance between the traffic volume graph and the disk input/output graph is 17 pt. Assuming that the threshold distance is set to 5 pt in this embodiment, it can be seen that the distance between the traffic volume graph and the disk graph reaches or exceeds the 5 pt threshold distance at a certain time point (Δt).

At that same time point, however, the distance between the traffic volume graph and the CPU graph is 3pt, which is shorter than the threshold distance. If the time point at which an inter-graph distance of at least the threshold distance of 5pt occurs is denoted Δt, then Δt becomes the reference time point.
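The threshold check of step S103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `find_reference_time`, the sample values, and the use of an absolute point-wise difference as the "distance" are all hypothetical, and the comparison uses "at least the threshold" to match the 5pt example above.

```python
from typing import Dict, List, Optional


def find_reference_time(
    reference: List[float],
    others: Dict[str, List[float]],
    threshold: float,
) -> Optional[int]:
    """Return the first time index at which any graph lies at least
    `threshold` away from the reference graph, or None if none does."""
    for t, ref_value in enumerate(reference):
        for values in others.values():
            if abs(values[t] - ref_value) >= threshold:
                return t
    return None


# Hypothetical normalized samples: at index 2 the CPU graph is 3pt away
# from the traffic graph and the disk graph is 17pt away.
traffic = [10.0, 12.0, 20.0, 14.0]
others = {
    "cpu": [11.0, 13.0, 23.0, 15.0],
    "disk": [12.0, 14.0, 37.0, 16.0],
}
reference_time = find_reference_time(traffic, others, threshold=5.0)
```

With these sample values the disk graph triggers the check at index 2, which then plays the role of the reference time point Δt; the CPU graph alone never does.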

Here, an inter-graph distance narrower than the threshold distance means that the component of the proactive detection system 100 changes together with the change in the traffic volume plotted as the reference graph. An inter-graph distance wider than the threshold distance means that the component of the proactive detection system 100 changes, or fails to change, regardless of the change in traffic volume.

Then, around the reference time point, a distance difference graph is extracted that covers the distance difference between the previous time point (Δt-1) and the reference time point (Δt) and the distance difference between the reference time point (Δt) and the next time point (Δt+1). Referring to FIG. 7, (a) of FIG. 7 shows the distance differences between the traffic volume graph, which is the reference graph, and the other graphs, and (b) of FIG. 7 plots the calculated distance differences as a graph.

In the graph extracted as shown in (b) of FIG. 7, the inter-time-point distance differences of the traffic volume-CPU graph and the traffic volume-disk graph are narrower than those of the traffic volume-memory graph and the traffic volume-temperature graph.

For example, in the distance difference graph of the traffic volume-CPU graph, the first distance difference, between the previous time point (Δt-1) and the reference time point (Δt), is 1pt (8pt-9pt), and the second distance difference, between the reference time point (Δt) and the next time point (Δt+1), is also 1pt (9pt-8pt). Assuming a threshold distance of 1pt, the inter-time-point distance difference between the first and second distance differences of the traffic volume-CPU graph is 0pt (1pt-1pt), which is narrower than the threshold distance.

Similarly, in the distance difference graph of the traffic volume-disk graph, the first distance difference, between the previous time point (Δt-1) and the reference time point (Δt), is 4pt (13pt-17pt), and the second distance difference, between the reference time point (Δt) and the next time point (Δt+1), is also 4pt (17pt-13pt). Since the threshold distance is assumed to be 1pt, the inter-time-point distance difference between the first and second distance differences of the traffic volume-disk graph is 0pt (4pt-4pt), which is narrower than the threshold distance.

In contrast, in the distance difference graph of the traffic volume-memory graph, the first distance difference, between the previous time point (Δt-1) and the reference time point (Δt), is 7pt (7pt-14pt), and the second distance difference, between the reference time point (Δt) and the next time point (Δt+1), is 5pt (14pt-9pt). Since the threshold distance is assumed to be 1pt, the inter-time-point distance difference between the first and second distance differences of the traffic volume-memory graph is 2pt (7pt-5pt), which is wider than the threshold distance.

Likewise, in the distance difference graph of the traffic volume-temperature graph, the first distance difference, between the previous time point (Δt-1) and the reference time point (Δt), is 3pt (2pt-5pt), and the second distance difference, between the reference time point (Δt) and the next time point (Δt+1), is 5pt (5pt-0pt). Since the threshold distance is assumed to be 1pt, the inter-time-point distance difference between the first and second distance differences of the traffic volume-temperature graph is 2pt (3pt-5pt, taken as an absolute value), which is wider than the threshold distance.

In this way, the resource usages of the components whose graphs show an inter-time-point distance difference narrower than the threshold distance are defined as associated resources. The proactive detection system 100 then generates them as the associated resource graph indicated by ① in (b) of FIG. 7.
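The selection of associated resources in the worked example above can be reproduced with a short sketch. The first and second distance differences are the values stated in the description of FIG. 7; the names `inter_time_gap` and `associated_resources` are hypothetical, and the strict "narrower than" comparison is an assumption drawn from the wording of the example.

```python
from typing import Dict, List, Tuple

# (first distance difference, second distance difference) in pt, as stated
# for each traffic-volume pairing in the description of FIG. 7.
distance_differences: Dict[str, Tuple[float, float]] = {
    "cpu": (1.0, 1.0),
    "disk": (4.0, 4.0),
    "memory": (7.0, 5.0),
    "temperature": (3.0, 5.0),
}


def inter_time_gap(first: float, second: float) -> float:
    """Absolute gap between the first and second distance differences."""
    return abs(first - second)


def associated_resources(
    differences: Dict[str, Tuple[float, float]], threshold: float
) -> List[str]:
    """Keep resources whose inter-time-point gap is narrower than threshold."""
    return [
        name
        for name, (first, second) in differences.items()
        if inter_time_gap(first, second) < threshold
    ]


selected = associated_resources(distance_differences, threshold=1.0)
```

With the 1pt threshold, `selected` holds the CPU and disk entries (gap 0pt each), while memory and temperature (gap 2pt each) are filtered out, matching the text.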

Meanwhile, as shown in FIG. 3, the proactive detection system 100 compares the associated resource graph generated in step S104 with a previously stored failure history graph, and detects a similarity indicating how closely the associated resource graph resembles the failure history graph (S105). It checks whether the detected similarity is greater than a preset first similarity threshold (S106); if the similarity is smaller than the first similarity threshold, the procedure from step S105 onward is repeated.

However, if the detected similarity is greater than the preset first similarity threshold, the proactive detection system 100 defines the network equipment as abnormality-prone equipment (S107). This is first described with reference to FIG. 8.

FIG. 8 is an exemplary diagram of variation similarity analysis based on filtered resource usage according to an embodiment of the present invention.

As shown in FIG. 8, the proactive detection system 100 compares the associated resource graph, generated from the associated resources, with the previously stored failure history graphs, and detects a similarity by checking whether any failure history graph has a shape similar to that of the associated resource graph. In this embodiment, traffic volume, CPU, and disk are used as examples of associated resources, but the invention is not necessarily limited to these.

If the detected similarity is at or above a certain level, that is, if the failure history graph retrieved on the basis of the existing categories and the associated resource graph show similar shapes, the network equipment is defined as abnormality-prone equipment, in which an abnormality may occur. Here, each of the traffic volume, CPU, and disk detected as associated resources is referred to as a category.

In this embodiment, the image of the associated resource graph is vectorized, the image of the failure history graph is likewise vectorized, and the two vectors are compared to detect the similarity. However, the invention is not necessarily limited to this approach.
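The description does not specify how the two vectors are compared. As one possible illustration, the sketch below uses cosine similarity, which is an assumption chosen for this example rather than the measure named by the patent; the function name `cosine_similarity` and the idea of passing already-vectorized graphs as plain sequences of floats are likewise hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors.

    Returns 0.0 when either vector has zero magnitude, so that an empty
    or flat graph never counts as similar to a failure history graph.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vectorized graphs: the second has the same shape at twice the scale.
associated_vector = [1.0, 2.0, 3.0]
history_vector = [2.0, 4.0, 6.0]
similarity = cosine_similarity(associated_vector, history_vector)
```

Cosine similarity depends only on the shape of the vectors and not on their scale, which fits the emphasis on graphs of "similar shape", but any other vector distance could be substituted.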

FIG. 8 shows that, as a result of analysis using the associated resources, that is, the filtered resources, a resource usage pattern has been found on a given piece of network equipment that resembles the resource usage pattern of network equipment that previously suffered an abnormality due to a DDoS attack. Accordingly, the proactive detection system 100 defines that network equipment as equipment that is exposed to a DDoS attack and may fail.

Returning to FIG. 3, once abnormality-prone equipment has been defined in step S107, the proactive detection system 100 compares the resource usage graph covering all resource usages from the first time point to the second time point, for the network equipment defined as abnormality-prone equipment, with the previously stored failure history graph (S108).

The proactive detection system 100 checks whether the similarity detected by the comparison in step S108 is greater than a preset second similarity threshold (S109). If the similarity is smaller than the second similarity threshold, the procedure from step S108 onward is repeated. However, if the detected similarity is greater than the second similarity threshold, the proactive detection system 100 defines the network equipment as abnormality-likely equipment (S110). This is first described with reference to FIG. 9.

FIG. 9 is an exemplary diagram of the trailing similarity analysis according to an embodiment of the present invention.

As shown in FIG. 9, the proactive detection system 100 compares the resource usage graph, generated from all resource usages of the network equipment collected at the time points before and after the reference time point, with the failure history graph detected in the preceding similarity analysis, and detects how similar the shape of the graph is to that failure history graph. If the detected similarity is at or above a certain level, the network equipment is defined as abnormality-likely equipment, in which an abnormality is highly likely to occur.
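The two-stage decision of steps S105 to S110 can be summarized in a small sketch. The names `Status` and `classify`, and the example threshold values, are hypothetical; the description only states that each similarity must be greater than its preset threshold for the equipment to advance to the next stage.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "normal"
    PRONE = "abnormality-prone"    # passed the first stage (S107)
    LIKELY = "abnormality-likely"  # passed the second stage (S110)


def classify(
    associated_similarity: float,
    full_similarity: float,
    first_threshold: float,
    second_threshold: float,
) -> Status:
    """Two-stage check: associated resource graph first, then the full
    resource usage graph, each against its own similarity threshold."""
    if associated_similarity <= first_threshold:
        return Status.NORMAL
    if full_similarity <= second_threshold:
        return Status.PRONE
    return Status.LIKELY


# Hypothetical similarities and thresholds for one piece of equipment.
status = classify(0.8, 0.9, first_threshold=0.7, second_threshold=0.8)
```

Running the cheaper comparison on the filtered associated resources first means the full resource usage graph only needs to be compared for equipment that has already been flagged.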

Returning again to FIG. 3, after defining the abnormality-likely equipment in step S110, the proactive detection system 100 once more compares the resource usage graph of the abnormality-likely network equipment with the actual equipment abnormality information, that is, information on equipment that actually failed among all the real network equipment stored in the failure occurrence information storage module 150 or in an external database, and checks the similarity (S111, S112). If the compared similarity is higher than a preset threshold, the proactive detection system 100 temporarily stores the identification information, resource usage, and graphs of the abnormality-likely equipment, and then provides an abnormality list to the administrator terminal 300 (S113, S114).

However, if the similarity is lower than the threshold, the abnormality list is provided directly to the administrator terminal 300 without the temporary storage (S114). When the administrator provides feedback based on the abnormality list provided to the administrator terminal 300 in step S114, the proactive detection system 100 stores, as failure history in the failure occurrence information storage module 150, either the feedback together with the information temporarily stored in step S113, or the feedback alone (S115). This is described with reference to FIG. 10.

FIG. 10 is an exemplary diagram of information added to the abnormality occurrence history according to an embodiment of the present invention.

(a) of FIG. 10 shows an example of adding information to the abnormality occurrence history by comparing the actual equipment abnormality information with the information on the abnormality-likely equipment, and (b) of FIG. 10 shows an example of adding information that includes feedback entered by the administrator.

If the result for the network equipment in which an abnormality is highly likely to occur matches the result of an actual equipment abnormality, the proactive detection system 100 adds the information, resource usage, graph information, and the like of that network equipment to the failure occurrence information storage module 150. This information is later used as reference information for proactively detecting failures of new network equipment.

In addition, the information fed back from the terminal 300 of the administrator who manages the network equipment is stored together with the network equipment information, resource usage, and graph information, and is used as reference information for proactively detecting failures of new network equipment. Here, the fed-back information includes information indicating whether or not an abnormality actually occurred in the network equipment.
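The storage behaviour of steps S113 to S115 can be sketched as below. The record layout (`FailureRecord`), the function name `store_feedback`, the equipment identifiers, and the threshold values are all hypothetical; the description only states that the temporarily stored information accompanies the administrator's feedback when the similarity exceeded the threshold, and that otherwise the feedback alone is stored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FailureRecord:
    equipment_id: str
    failure_occurred: bool  # administrator feedback: did a failure occur?
    resource_usage: Optional[Dict[str, List[float]]] = None


def store_feedback(
    history: List[FailureRecord],
    equipment_id: str,
    resource_usage: Dict[str, List[float]],
    similarity: float,
    threshold: float,
    failure_occurred: bool,
) -> FailureRecord:
    """Append a failure history record once administrator feedback arrives.

    Resource usage is kept only when the similarity to the actual equipment
    abnormality information exceeded the threshold (the case in which it
    was temporarily stored); otherwise only the feedback is recorded.
    """
    kept_usage = resource_usage if similarity > threshold else None
    record = FailureRecord(equipment_id, failure_occurred, kept_usage)
    history.append(record)
    return record


history: List[FailureRecord] = []
usage = {"cpu": [0.2, 0.9], "disk": [0.1, 0.8]}
confirmed = store_feedback(history, "router-01", usage, 0.92, 0.8, True)
feedback_only = store_feedback(history, "router-02", usage, 0.40, 0.8, False)
```

Records stored this way can later serve as the failure history against which new associated resource graphs are compared.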

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited thereto, and various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

Claims

A system for proactively detecting abnormal symptoms of network equipment, the system comprising:
a graph similarity derivation module that normalizes a plurality of resource usages collected from network equipment to generate resource usage graphs, and compares the differences between a reference resource graph selected from the resource usage graphs and the remaining resource graphs to extract at least one associated resource graph associated with the reference resource graph;
a failure occurrence information storage module that stores failure history information on failed network equipment; and
a failure detection module that compares the associated resource graph with the failure history information to define abnormality-prone equipment, and compares the resource usage graph of the network equipment defined as the abnormality-prone equipment with the failure history information to detect abnormality-likely equipment, thereby detecting a failure in advance.

The system of claim 1,
wherein the failure detection module:
compares the associated resource graph with the failure history information to detect a similarity between the associated resource graph and the failure history information and, when the detected similarity is higher than a preset first similarity threshold, defines the network equipment as abnormality-prone equipment; and
compares the resource usage graph with the failure history information to detect a similarity between the resource usage graph and the failure history information and, when the detected similarity is higher than a preset second similarity threshold, defines the network equipment defined as the abnormality-prone equipment as abnormality-likely equipment.

The system of claim 1,
wherein the graph similarity derivation module:
extracts, as a distance difference graph, at least one resource graph spanning from a first time point immediately before a reference time point, at which the difference between the reference resource graph and the remaining resource graphs exceeds a threshold difference, to a second time point immediately after the reference time point; and
extracts the associated resource graph based on the absolute distance difference from the first time point to the reference time point and the absolute distance difference from the reference time point to the second time point in the distance difference graph.

The system of claim 1, further comprising:
a resource collection module that collects a plurality of resource usages for a plurality of equipment components constituting the network equipment; and
an expression module that provides information on the abnormality-likely equipment detected by the failure detection module to a terminal of an administrator managing the network equipment.