KR20040028400A - Fault management system of metro ethernet network and method thereof

Info

Publication number: KR20040028400A
Application number: KR1020020059570A
Authority: KR
Inventors: 성종규; 김영일; 손정표; 홍원규
Original assignee: 주식회사 케이티
Priority date: 2002-09-30
Filing date: 2002-09-30
Publication date: 2004-04-03
Also published as: KR100500836B1

Abstract

PURPOSE: A device and a method for handling faults in a metro Ethernet network are provided, in which gateway systems monitor the metro Ethernet network at frequent intervals and transmit fault information to a central server, so that faults are handled efficiently through the central server. CONSTITUTION: Gateway systems (3, 4, 5), which manage the local networks into which the network is divided, are placed in each region. The gateway systems and an operator (1) are connected through a metro Ethernet network management system (2). Each gateway system collects information on its network and alarm information such as fault information, and transmits the alarm information to the metro Ethernet network management system. The metro Ethernet network management system handles and manages faults using the alarm information and informs the operator of each fault. The metro Ethernet network management system includes a network management database (20) and the central server (21). The central server includes a failure manager (22), a configuration manager (23), a performance manager (24), a GUI (Graphic User Interface) device (25), an additional device (26), and a linking device (27).

Description

Fault management system of metro Ethernet network and method thereof

The present invention relates to a fault management apparatus and method for a metro Ethernet network, in which a metro Ethernet network management system placed between the operator and the gateway systems handles faults quickly and accurately, enabling efficient network management.

The domestic Internet environment is currently moving away from the conventional router-centered access network configuration. With the rapid spread of asymmetric digital subscriber lines (hereinafter, ADSL) and the growing number of personal computer (PC) game rooms, wireless LAN environments and metro Ethernet access network configurations are on the rise.

Accordingly, to provide stable service through smooth operation and management of the access network, network operators have come to need a network management system that manages each of the network elements making up the new access network.

However, when a fault occurs in a large-scale network with hundreds of thousands of network elements, the location of the fault must be identified accurately and the fault handled quickly. Conventional network management systems do little more than store the results of simple tests in a database, process them, and output them, so stable and efficient network management has been difficult.

In addition, monitoring the state of every device and port in a large-scale Internet network and collecting performance data has conventionally required adding high-capacity servers, but such expansion is costly in both money and labor and has been difficult to realize in practice.

To address this, a conventional approach uses distributed gateway systems: the management area is divided into unit gateway management areas, and each gateway collects data from the devices in its area, generates raw data, and reports it to a central management server through various communication means. The central management server gathers the received data, processes it, stores it in a database, and makes it available to other related systems such as a billing system.

However, when gateway systems located in each region transmit data to the central server, it has been difficult to process a vast amount of data efficiently within a short period and to transmit the correct information in a small volume.

An object of the present invention, devised to solve the above problems, is to handle faults efficiently through a central server by having the gateway systems monitor the metro Ethernet network at frequent intervals, detect the occurrence of a fault, and transmit the fault information to the central server.

FIG. 1 is an overall system diagram for fault handling in a metro Ethernet network according to the present invention.

FIG. 2 is a block diagram of the metro Ethernet network management system of FIG. 1.

FIG. 3 is a block diagram of the fault management device of FIG. 2.

FIG. 4 is a block diagram of the gateway system of FIG. 3.

FIG. 5 is a diagram showing the format structure used for transmission between the central server and the gateway systems according to the present invention.

FIG. 6 is a flowchart of fault handling in a metro network according to the present invention.

To achieve the above object, the present invention comprises: an operator who manages the metro Ethernet network; a plurality of gateway systems that collect information on a plurality of local networks, test the local networks, and transmit the test results; and a network management system, located between the plurality of gateway systems and the operator, that analyzes the test results and information collected from the plurality of gateway systems, performs fault handling, and displays the results to the operator.

The above and other objects, features, and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is an overall system diagram for fault handling in a metro Ethernet network according to the present invention. To manage faults in the metro Ethernet network, gateway systems (3, 4, 5) that manage the local networks into which the network is divided are located in each region, and a metro Ethernet network management system (2) connects each gateway system (3, 4, 5) with the operator (1).

The gateway systems (3, 4, 5) collect alarm information, such as information on each network and fault information, from each local network and transmit it to the metro Ethernet network management system (2). The metro Ethernet network management system (2) uses the alarm information received from the gateway systems (3, 4, 5) to handle and manage faults, and notifies the operator (1).

Referring to FIG. 2 for a more detailed description, the metro Ethernet network management system (2) comprises a network management database (20) and a central server (21).

The network management database (20) receives, through the central server (21), the fault information that the gateway systems (3, 4, 5) collected from each local network and stores it. It also works together with the central server (21) to supply the information the central server (21) needs.

The central server (21) comprises a fault management device (22), a configuration management device (23), a performance management device (24), a GUI function device (25), an additional function device (26), and an interworking function device (27), and handles and manages faults using the fault information received from the gateway systems (3, 4, 5).

The fault management device (22) collects information on each local network from each gateway system (3, 4, 5); detects faults through device ping (packet internet groper; hereinafter, ping) tests, port ping tests, simple network management protocol (hereinafter, SNMP) polling tests, SNMP trap collection, and the like; handles the faults and stores the results in the network management database (20); and retrieves stored alarm information from the network management database (20).

Here, ping is a basic Internet program that checks whether a particular Internet address exists and can accept requests. It is used to check whether the host a user is trying to reach is actually running. Ping can also show how quickly a running host responds, and it can be used to find the IP (Internet Protocol) address of a site for which only the domain name is known.

Polling is a transmission control method in which a program or device continually checks the state of other programs or devices, typically whether they are connected or whether they have data to transmit.

SNMP is a protocol for network management and for monitoring and supervising the operation of network devices. A trap is a special condition set within a running program for testing purposes. For example, an error trap tests for an error condition and provides a recovery routine, while a debugging trap waits for a particular instruction to execute, halts the program, and analyzes the system state at that moment.

The configuration management device (23) uses the alarm information collected through the fault management device (22) to register information such as the location and subscriber information of the relevant network device in the network management database (20), and uses information on the relevant network facility to generate, revise, and otherwise manage reports on that facility.

The performance management device (24) collects performance information on the network devices that were collected through the fault management device (22) and already registered in the network management database (20), and on the ports and links of those devices. After processing and analyzing this information, it generates reports on the network devices and ports, and either notifies the operator or stores the reports in the network management database (20).

The GUI function device (25) displays the information and reports processed by the configuration management device (23) and the performance management device (24), together with information on the relevant network facilities and devices, through a graphical interface on the screen of the remotely located operator (1), so that the operator (1) can check the information.

The additional function device (26) handles auxiliary functions of the central server (21) such as operator management and system management. The interworking function device (27) interworks with other external systems (28), such as billing-related systems, to exchange information such as billing data for the relevant network facilities.

The fault management device (22) is described in detail with reference to FIG. 3. The fault management device (22) comprises a communication processor (220), a config manager (221), a management target distributor (222), a buffer (223), and a fault information processor (224).

The communication processor (220) works with each gateway system (3, 4, 5) to send and receive the alarm information and SNMP trap information collected from each local network. When the management target distributor (222) requests a ping test or an SNMP polling test, the communication processor (220) forwards the request to the relevant gateway system (3, 4, 5), receives the test result, and passes it to the management target distributor (222).

In addition, if a designated gateway system (3, 4, 5) becomes unreachable while alarm information is being exchanged, the communication processor (220) notifies the config manager (221) that the gateway is down and temporarily forwards the alarm information for that network, such as the device list and port list, to a previously designated backup gateway system (225).

As described above, the config manager (221) specifies the data transmission structure between the communication processor (220) and the regional gateway systems (3, 4, 5) and monitors whether the gateway systems (3, 4, 5) are operating normally. It also manages the designation of the backup gateway system (225), the number of buffers (223) and the size of each buffer (223), the period (T) of the management target distributor (222), the list of IP addresses to be filtered out, and so on.

Once in every predefined period (T), the management target distributor (222) retrieves, from the configuration management device (23), the network devices registered in the network management database (20) together with the port information and node information of those devices. Using the retrieved node information, it generates a device list and a port list and filters out unsuitable IP entries on the basis of the IP information; the filtered information is then sent to the interworking function device (27) and on to the associated external system (28). An IP address is assigned to every device, and each port is also assigned its own IP address.

The generated device list and port list are stored in a memory (not shown) built into the management target distributor (222). Thereafter, if the period (T) of the management target distributor (222) has not changed, the distributor compares and analyzes the node information, device list, and port list held in the memory (not shown). If anything has changed, it extracts the changed entries, generates a new device list and port list, and sends them to the communication processor (220); if nothing has changed, it notifies the communication processor (220) of "no change".
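The change check described above can be sketched as a set comparison between two cycles. This is a minimal Python sketch under stated assumptions, not the patent's implementation: the function name `diff_target_lists`, the use of IP strings as list entries, and the shape of the returned dictionary are all hypothetical.

```python
def diff_target_lists(previous, current):
    """Compare the stored list with the newly generated one.

    Returns None to signal "no change"; otherwise returns the added and
    removed entries (hypothetical result shape, not taken from the patent).
    """
    prev_set, cur_set = set(previous), set(current)
    added = sorted(cur_set - prev_set)      # entries that appeared this cycle
    removed = sorted(prev_set - cur_set)    # entries that disappeared
    if not added and not removed:
        return None
    return {"added": added, "removed": removed}
```

Returning `None` for the unchanged case mirrors the "no change" notification, so the caller can send a short notice instead of the full lists.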

The buffer (223) is a memory array in the central server (21) system. The size and number of the memory arrays are specified through the config manager (221) and are determined according to the central server (21) system and the size of the set of network elements to be managed.

The fault information processor (224) monitors the buffer (223) and, when information is stored in it, extracts that information and determines whether it concerns a device or a port. If it concerns a device, the processor compares it with the information on that device stored in the network management database (20) and, if anything has changed, stores the change in the device table (not shown) of the network management database (20).

If the information concerns a port, the processor determines whether the device containing that port has failed. If the device has failed, the port information is discarded. If the device is normal and only the port has failed, the processor compares the information with the information on that port stored in the network management database (20) and, if anything has changed, stores the change in the port table (not shown) of the network management database (20). The reason is that when a device fails, all of its ports are in a failed state, whereas a device having recovered does not mean that all of its ports have returned to the normal, recovered state.

In this way, the equipment of the node where the fault occurred and the number of faults at that node are stored, and the information is displayed on the operator's screen through the GUI function device (25).

If the fault is of a type other than the ping test, SNMP polling, and SNMP trap types used by the fault management device (22) proposed in the present invention, only the fault type is added to the fault history of the relevant device and port lists, and the rest of the content is discarded.

FIG. 4 is a block diagram of the gateway system of FIG. 3. The gateway system comprises a fault collector (40), a communication processor (46), and a filter (45).

The fault collector (40) comprises a device ping tester (41), a port ping tester (42), an SNMP polling tester (43), and an SNMP trap collector (44).

Once in every period (T) delivered from the central server (21) through the communication processor (46), the device ping tester (41) performs a ping test on every device in the local network and sends the results to the filter (45). The period (T) is preferably set long enough for the device ping tester (41) to test every device in the local network.

Once in every period (T) delivered from the central server (21) through the communication processor (46), the port ping tester (42) performs a ping test on every port of every device in the local network. Because ports far outnumber devices, it creates threads to run many tests concurrently. The number of threads is sent from the central server (21) through the communication processor (46). Here too, the period (T) is preferably set long enough for the port ping tester (42) to finish testing all devices in the local network.
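The threaded port test can be sketched with a fixed-size thread pool. This is a hedged Python illustration, not the patent's code: `run_port_pings` is a hypothetical name, and the actual reachability check is passed in as a `probe` callable (an assumption) so that the sketch does not depend on any particular ping mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


def run_port_pings(port_ips, probe, thread_count):
    """Probe every port IP using `thread_count` worker threads.

    `probe` is a caller-supplied function taking an IP string and returning
    True (reachable) or False; `thread_count` stands in for the thread count
    that the central server sends to the gateway.
    """
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # pool.map returns results in the same order as the inputs
        results = list(pool.map(probe, port_ips))
    return dict(zip(port_ips, results))
```

A caller would run this once per period (T) and hand the resulting mapping to the filter.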

Once in every period (T) delivered from the central server (21) through the communication processor (46), the SNMP polling tester (43) performs SNMP polling on every device in the local network. From each device's SNMP agent it reads the ifOperStatus item, which carries up/down information, for every port of the device, and determines whether each port is in the normal state. If SNMP polling fails, a device fault is assumed. The results are sent to the filter (45).
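The interpretation of a polling result can be sketched as follows. This Python sketch makes several assumptions: the function name `classify_poll` and its result shape are hypothetical, the SNMP query itself is out of scope (the caller passes in the values already read), and any ifOperStatus value other than up(1), as defined in the standard IF-MIB, is treated as "down".

```python
IF_OPER_STATUS_UP = 1  # IF-MIB ifOperStatus value up(1)


def classify_poll(device_ip, oper_status_by_port):
    """Turn one device's polling result into device and port states.

    `oper_status_by_port` maps an interface index to its ifOperStatus value,
    or is None when the poll itself failed (treated as a device fault).
    """
    if oper_status_by_port is None:
        return {"device": (device_ip, "down"), "ports": {}}
    ports = {
        if_index: "up" if status == IF_OPER_STATUS_UP else "down"
        for if_index, status in oper_status_by_port.items()
    }
    return {"device": (device_ip, "up"), "ports": ports}
```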

The SNMP trap collector (44) is a daemon process that receives the SNMP trap messages reported by each device, and it forwards every received trap message to the filter (45).

The filter (45) keeps the results of the ping tests and SNMP polling tests requested by the management target distributor (222). It compares each new ping test result with the stored one: if the result has not changed, the new result is discarded; if it has changed, the result, including the test period, is sent to the central server (21) through the communication processor (46).
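The change-only forwarding performed by the filter can be sketched as a small class that remembers the last state per target. This is an illustrative Python sketch, not the patent's implementation: the class name `StateChangeFilter`, the field names, and the choice to treat the first observation of a target as a change are all assumptions.

```python
class StateChangeFilter:
    """Forward a test result only when the target's state has changed."""

    def __init__(self):
        self._last_state = {}  # target IP -> last observed state

    def submit(self, target_ip, state, tested_at):
        """Return a record to send to the central server, or None to discard."""
        if self._last_state.get(target_ip) == state:
            return None  # unchanged result: drop it
        self._last_state[target_ip] = state
        return {"ip": target_ip, "state": state, "tested_at": tested_at}
```

Because unchanged results never leave the gateway, traffic to the central server grows with the number of state changes rather than with the number of tests.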

In addition, when the management target distributor (222) requests SNMP trap collection, forwarding every kind of trap sent by the devices would place a heavy load on the central server (21), so the filter (45) removes all traps except port up/down traps. Also, because some devices generate several traps for a single device fault, the filter (45) sends only one of them to the central server (21).
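The two trap rules above (keep only port up/down traps, and forward one trap per fault) can be sketched as a single pass over a batch of traps. This Python sketch rests on assumptions: traps are represented as dictionaries with hypothetical keys `device_ip`, `if_index`, and `type`, the standard trap names `linkUp` and `linkDown` stand in for port up/down traps, and duplicates are detected within one batch by those three fields.

```python
PORT_TRAP_TYPES = ("linkUp", "linkDown")


def filter_traps(traps):
    """Keep port up/down traps only and drop repeats within the batch."""
    seen = set()
    kept = []
    for trap in traps:
        if trap["type"] not in PORT_TRAP_TYPES:
            continue  # any other trap type is removed
        key = (trap["device_ip"], trap["if_index"], trap["type"])
        if key in seen:
            continue  # same fault reported again: forward only the first
        seen.add(key)
        kept.append(trap)
    return kept
```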

The communication processor (46) handles communication with the central server (21): it receives from the central server (21) the targets for which fault information is to be collected within the region and the period (T) of the management target distributor, and it sends the information filtered by the filter (45) to the central server (21).

The communication processor (46) delivers the filtered information to the communication processor (220) of the central server (21) over a socket in a predetermined format. To communicate over a socket channel, both sides must know the port being used, and, much like a struct declaration in a program, the sequence of values is fixed in advance so that communication is possible only in the agreed format.

The format proposed by the present invention is described with reference to FIG. 5.

As shown in FIG. 5, the format includes IP address information, state information (state), object type (objectType), event type (eventType), time information, gateway IP address information (destIpAddress), gateway index information (gwIndex), and the like.
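A fixed-order record like the one listed above can be sketched with Python's `struct` module. The field names follow the text, but everything else is an assumption made for illustration: the field widths, the network byte order, the use of epoch seconds for the time field, and the helper names `pack_record` and `unpack_record` are not specified in the source.

```python
import socket
import struct
from collections import namedtuple

# Assumed layout: 4-byte IP, three 1-byte codes, 4-byte time,
# 4-byte gateway IP, 2-byte gateway index, network byte order.
RECORD_FORMAT = "!4sBBBI4sH"

AlarmRecord = namedtuple(
    "AlarmRecord",
    "ip_address state object_type event_type time dest_ip_address gw_index",
)


def pack_record(rec):
    """Serialize a record into the fixed field order both sides agree on."""
    return struct.pack(
        RECORD_FORMAT,
        socket.inet_aton(rec.ip_address),
        rec.state,
        rec.object_type,
        rec.event_type,
        rec.time,
        socket.inet_aton(rec.dest_ip_address),
        rec.gw_index,
    )


def unpack_record(data):
    """Parse bytes produced by pack_record back into an AlarmRecord."""
    ip, state, obj, event, time_, dest, gw = struct.unpack(RECORD_FORMAT, data)
    return AlarmRecord(
        socket.inet_ntoa(ip), state, obj, event, time_, socket.inet_ntoa(dest), gw
    )
```

Because the layout is fixed, the receiver can read exactly `struct.calcsize(RECORD_FORMAT)` bytes per record from the socket without any delimiter.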

FIG. 6 is a flowchart of fault handling in a metro network according to the present invention, showing the overall flow from the occurrence of a fault in a device or port of a local network to the operator (1) recognizing that fault.

The gateway systems (3, 4, 5) periodically run ping tests and SNMP polling tests on each local network: the ping testers (41, 42) detect faults during testing, the SNMP polling tester (43) reads the up/down state of each port's ifOperStatus item, and trap information is collected from the SNMP trap collector (44) (S60).

The filter (45) of the gateway system (3, 4, 5) filters the collected information and determines whether it contains a state transition (S61). If there is no state transition, the test results and trap information are discarded (S69); if there is, they are sent to the communication processor (220) of the central server (21) (S62).

The transmitted test results and trap information are stored in the buffer (223) (S63). The fault information processor (224) checks whether their target is a device or a port (S64). If the target is a device, the information is compared with the corresponding device information registered in the network management database (20) to determine whether there is a state transition (S65).

If step S65 finds no state transition, the information is discarded (S66); if there is a state transition, the test results and trap information are stored in the network management database (20) and processing ends.

If the target in step S64 is a port, it is determined whether the test results and trap information are fault information or fault recovery information (S68). If they are fault information, it is determined whether the device to which the port belongs has failed (S69); if that device has failed, the information is discarded (S66).

If the device to which the port belongs has not failed, the test results and trap information are compared with the stored port list to determine whether there is a state transition (S70). If there is none, the information is discarded (S66); if there is, it is stored in the network management database (20) (S71) and processing ends.

If step S68 finds that the test results and trap information are fault recovery information, it is determined whether the recovery information represents a state transition (S72). If not, it is discarded (S73); if so, it is stored in the network management database (20) and processing ends (S74).
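The device and port branches of the flow (steps S64 to S74) can be sketched as one decision function. This is a minimal Python sketch under stated assumptions, not the patent's implementation: the function name `handle_record`, the record keys, the `"up"`/`"down"` state strings, and the use of plain dictionaries in place of the device and port tables of the network management database are all hypothetical.

```python
def handle_record(record, device_table, port_table):
    """Apply the device/port decision flow to one buffered record.

    `device_table` and `port_table` map an IP string to its last stored
    state. Returns "stored" or "discarded".
    """
    ip, state = record["ip"], record["state"]

    if record["object_type"] == "device":            # S64: target is a device
        if device_table.get(ip) == state:            # S65: no state transition
            return "discarded"                       # S66
        device_table[ip] = state
        return "stored"

    # Target is a port. Fault information (S68) is dropped when the parent
    # device is already down (S69); recovery information skips that check.
    if state == "down" and device_table.get(record["device_ip"]) == "down":
        return "discarded"                           # S66
    if port_table.get(ip) == state:                  # S70 / S72: no transition
        return "discarded"
    port_table[ip] = state                           # S71 / S74
    return "stored"
```

Checking the parent device only for fault records reflects the reasoning given earlier in the description: a failed device implies failed ports, but a recovered device does not imply recovered ports.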

In this way, once the alarm information sent from the gateway systems (3, 4, 5) is stored in the buffer (223), the fault information processor (224), which monitors the buffer (223) continually, extracts the stored alarm information and determines whether it concerns a device or a port.

Then, if the alarm information concerns a device, it is stored in the network management database (20). If it concerns a port, it is determined whether the device to which the port belongs has failed; if that device has not failed, the information is discarded, and if it has failed, the information on the device to which the port belongs is stored in the network management database (20).

Accordingly, the central server (21), together with the gateway systems (3, 4, 5) that collect fault information from the access network devices in each region and exchange management information with the server system, allows fault management to be adapted flexibly to the size of the network.

As described above, the fault handling apparatus and method for a metro Ethernet network according to the present invention detect faults quickly and accurately through a device ping tester, a port ping tester, a simple network management protocol (SNMP) polling tester, and an SNMP trap collector, and handle the detected faults. This enables stable network management and, in turn, the best possible service to customers.

The preferred embodiments of the present invention are presented for illustrative purposes. Those skilled in the art will be able to make various modifications, changes, substitutions, and additions within the technical spirit and scope of the appended claims, and such modifications and changes should be regarded as falling within the scope of the following claims.

Claims

A fault handling apparatus for a metro Ethernet network, comprising:

an operator who manages the metro Ethernet network;

a plurality of gateway systems that collect information on a plurality of local networks, test the local networks, and transmit the test results; and

a network management system, located between the plurality of gateway systems and the operator, that analyzes and tests the test results and the information collected from the plurality of gateway systems and displays them to the operator.

The fault handling apparatus of claim 1, wherein the network management system comprises:

a network management database that stores network information, alarm information, device and port information, and fault handling records; and

a central server that compares the information received from the gateway systems with the information in the network management database and, if any information has changed relative to the network management database, handles the fault.

The fault handling apparatus of claim 2, wherein the central server comprises:

a fault management device that receives fault information on the local networks from the gateway systems and handles and manages the faults;

a configuration management device that, if the fault information indicates that the failed target is a device, stores the location information and subscriber information of the device in the network management database and generates a report on the device;

a performance management device that, if the fault information indicates that the failed target is a port, stores information on the port in the network management database and generates a report on the port;

a graphical interface device that displays the information stored in the network management database to the operator through the configuration management device and the performance management device;

an additional function device for operator management and system management; and

an interworking function device that interworks with an external system.

The fault handling apparatus of claim 3, wherein the fault management device comprises:

a communication processor that transmits a test request message to the gateway systems and receives the fault information and test results;

a management target distributor that retrieves the device and port information stored in the network management database once per predetermined period and generates a device list and a port list from the retrieved information;

a configuration manager that designates the data transmission structure between the communication processor and the gateway systems and monitors whether the gateway systems are operating normally;

a buffer whose size and number are designated by the configuration manager and which temporarily stores the test results and the fault information; and

a fault information processor that uses the fault information to determine whether the fault concerns a device or a port and, if the fault concerns a device, performs fault handling using the information related to that device.

The fault handling apparatus of claim 1, wherein each gateway system comprises:

a device ping tester that pings a device;

a port ping tester that pings a port;

a simple network management protocol (SNMP) polling tester that performs an SNMP polling test on the local network; and

an SNMP trap collector that collects SNMP traps for the local network.

A fault handling method for a metro Ethernet network, comprising:

a fault detection step of detecting a fault in a network element using a ping test, a simple network management protocol (SNMP) polling test, and SNMP trap collection;

a transmission step of transmitting the detected fault information to a central server and storing it in a buffer;

a determination step of determining whether the target of the fault information taken from the buffer is a device or a port;

a device step of, if the target is determined to be a device in the determination step, comparing the previously stored device information with the device information in the alarm information, discarding the alarm information if there is no change, and newly storing the changed information if there is a change;

a port step of, if the target is determined to be a port in the determination step, discarding the alarm information if the device to which the port belongs has failed and, otherwise, newly storing the changed information if there is a change; and

a storage step of, if the alarm information about the port in the port step is fault recovery information, comparing the previously stored fault recovery information with the fault recovery information in the alarm information, discarding the alarm information if there is no change, and newly storing the changed information if there is a change.