KR20220168420A

KR20220168420A - Real-time abnormal symptoms detection system and method by in-memory

Info

Publication number: KR20220168420A
Application number: KR1020210078136A
Authority: KR
Inventors: 이진택
Original assignee: 인터리젠 주식회사
Priority date: 2021-06-16
Filing date: 2021-06-16
Publication date: 2022-12-23
Also published as: KR102636239B1

Abstract

An analysis method by which an RTA analysis processor unit analyzes anomalies in real time in an in-memory-based real-time anomaly analysis system comprises: a first comparison field processing step of checking whether second input data transmitted from a second Kafka unit corresponds to a specific condition; a second profiled data comparison step of comparing the data requiring analysis with previously profiled data on the basis of a specific reference value when the data corresponds to the specific condition in the first comparison field processing step; and a third aggregation data comparison step of extracting an anomaly class aggregation for the corresponding customer from memory, setting it, and comparing the second input data with the set anomaly class aggregation. When the second input data corresponds to the anomaly class aggregation set in the third aggregation data comparison step, the RTA analysis processor unit outputs the case as an anomaly. Accordingly, real-time analysis of anomalies is possible.

Description

Real-time abnormal symptoms detection system and method by in-memory

The present invention relates to an in-memory-based method for analyzing anomalies in real time.

With the development of communication networks, including the Internet, and of automated devices, financial fraud using the Internet and similar channels is increasing rapidly worldwide.

For example, financial fraud techniques are becoming more sophisticated and intelligent by the day. They include stealing an individual's ID to transfer funds to a borrowed-name account and then withdrawing the money, and using voice phishing to induce a victim to deposit money into a borrowed-name account and then withdrawing it.

Damage from such financial fraud has become a serious social problem, causing losses of time and money not only to individuals but also to banks and electronic financial service providers.

As a related technique for preventing financial fraud, Internet banking service is blocked when access comes from an IP address that has been reported as used for financial fraud.

In addition, research continues on rule-based abnormal financial transaction detection systems, which collect various information about a payer to build a pattern, catch abnormal payments that deviate from the pattern, and block the payment route.

In this regard, Korean Registered Patent Publication No. 10-1153968 discloses a financial fraud prevention system. Patent No. 10-1153968 provides a financial fraud prevention system, and a prevention method using it, that includes: a financial fraud pattern database unit that stores and manages, by type, data on financial fraud cases collected through multiple channels; a user pattern database unit that stores and manages, per user, data on each user's usual financial transactions; and a financial fraud detection unit that determines, from the data on the user's usual financial transactions and the data on financial fraud cases, whether a financial transaction the user performs over a communication network constitutes financial fraud, and blocks the transaction if so. Because financial fraud cases are collected through multiple channels and put to use, the system has the advantage of coping flexibly with continuously evolving fraud techniques.

However, the relational databases (databases based on the relational data model) and IP tracking techniques used in the prior art make real-time analysis difficult.

Recently, in-memory database technology has been proposed and adopted. An in-memory-based system is a database management system that resides and operates in main memory rather than on disk, and has the advantage of faster processing than a disk-resident system.

Korean Registered Patent Publication No. 10-1153968 (Financial fraud prevention system and method)

An object of the present invention is to provide an in-memory-based analysis system and method capable of analyzing anomalies in real time.

According to one aspect of the present invention, an in-memory-based real-time anomaly analysis system includes: a first Kafka unit that receives input data for analysis and is formed of a plurality of Kafka modules; a parser unit that parses the input data received from the first Kafka unit and converts it into second input data, which is normalized data; a second Kafka unit that distributes the second input data converted by the parser unit across a plurality of Kafka modules and transmits it; an RTA analysis processor unit that analyzes the second input data received from the second Kafka unit in real time according to a given process; a profiler unit that profiles the second input data analyzed by the RTA analysis processor unit 120 according to a condition process set on the basis of a specific reference value, and converts the profiled data into key-value form; and a Redis unit that stores in memory the data converted into key-value form by the profiler unit.

In addition, the first Kafka unit and the second Kafka unit are each formed of 2 to 10 Kafka modules arranged in parallel.

In addition, the RTA analysis processor unit is formed of a plurality of RTA analysis processors, which divide the processing load among themselves and analyze in parallel.

In addition, in the profiler unit, the specific reference value includes a customer number or ID, and the set condition process includes at least one of the terminal device, country, account number, transaction channel, and average transaction amount with which transactions were made under that reference value.

In addition, the analysis system receives the input data through the customer interface and the account system of a financial server.

According to another aspect of the present invention, an analysis method by which the RTA analysis processor unit analyzes anomalies in real time in the analysis system includes: a first comparison field processing step of checking whether the second input data transmitted from the second Kafka unit corresponds to a specific condition; a second profiled data comparison step of, when the specific condition is met in the first comparison field processing step, comparing the data requiring analysis with previously profiled data on the basis of a specific reference value; and a third aggregation data comparison step of, when the specific reference value is not satisfied in the second profiled data comparison step, extracting an anomaly class aggregation for the corresponding customer from memory, setting it, and comparing the second input data with the set anomaly class aggregation. When the second input data corresponds to the anomaly class aggregation set in the third aggregation data comparison step, the RTA analysis processor unit outputs the transaction as an anomalous transaction.

In addition, when the second input data does not correspond to the anomaly class aggregation set in the third aggregation data comparison step, the second input data is profiled by the profiler unit and stored in the Redis unit.
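The three comparison steps can be read as a short decision cascade. The following Python sketch is illustrative only and is not the patented implementation: the function `analyze` and its three predicate parameters are hypothetical names, and the handling of a record that satisfies the stage 2 reference value (treated here as normal and sent to profiling) is an assumption, since this passage does not spell it out.

```python
from typing import Callable, Dict

Record = Dict[str, object]
Predicate = Callable[[Record], bool]


def analyze(record: Record,
            matches_specific_condition: Predicate,
            satisfies_reference_value: Predicate,
            matches_aggregation: Predicate) -> str:
    """Return "anomaly" or "profile" for one normalized record."""
    # Stage 1: comparison field processing. Records that do not meet
    # the specific condition are not analyzed further.
    if not matches_specific_condition(record):
        return "profile"
    # Stage 2: comparison with previously profiled data.
    # Assumption: a record consistent with the customer's history is normal.
    if satisfies_reference_value(record):
        return "profile"
    # Stage 3: comparison with the anomaly class aggregation.
    if matches_aggregation(record):
        return "anomaly"
    return "profile"
```

Passing the stages in as predicates keeps the cascade independent of how each comparison is implemented.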

In addition, the specific condition of the first comparison field processing step is "a transaction in which the connecting country is not Korea, the connecting terminal is mobile, and a transfer was made"; or it includes, as terminal information, at least one of a terminal identification key (MAC, HDD, IMEI, UUID, androidID, widevineID), connection IP, connection country code, whether a VPN is used, whether a VM is used, whether a proxy is used, whether Wi-Fi is used, terminal OS, and terminal language; or it includes, as transaction information, at least one of transaction channel, direction (deposit/withdrawal), customer number, ID, terminal information, account number, counterparty account number, counterparty bank code, transaction amount, balance, account holder name, counterparty account holder name, whether the account is a corporate account, and change record code.

In addition, the specific reference value of the second profiled data comparison step includes at least one of: whether the customer has a history of overseas use, whether the terminal is one the customer has used in the past, and whether the account is one to which the customer has transferred in the past.
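As an illustration of how such reference values might be checked against a customer's profile, here is a minimal Python sketch. The `CustomerProfile` class, the record field names (`overseas`, `device_id`, `counterparty_account`), and the rule that every check must pass are all assumptions made for the example; the text only states which reference values may be included, not how they are combined.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CustomerProfile:
    """Hypothetical per-customer history accumulated by the profiler."""
    used_overseas: bool = False
    known_devices: Set[str] = field(default_factory=set)
    known_accounts: Set[str] = field(default_factory=set)


def satisfies_reference_values(profile: CustomerProfile,
                               record: Dict[str, object]) -> bool:
    """True when the record is consistent with the customer's history."""
    # Overseas access is acceptable only if the customer has done it before.
    overseas_ok = profile.used_overseas or not record["overseas"]
    # The terminal must be one the customer has used in the past.
    device_ok = record["device_id"] in profile.known_devices
    # The counterparty account must be one the customer has paid before.
    account_ok = record["counterparty_account"] in profile.known_accounts
    return overseas_ok and device_ok and account_ok
```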

In addition, the anomaly class aggregation set in the third aggregation data comparison step includes at least one of: the same ID connecting at least a set threshold number of times within a reference time; a customer whose cumulative transferred amount processed within the reference time is at least a threshold; and the count for a reference value exceeding a threshold during a given duration.
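The first of these aggregations, "the same ID connecting at least a threshold number of times within a reference time", can be sketched as a sliding-window counter. This is a minimal Python illustration under stated assumptions: the class name `WindowCounter` is hypothetical, timestamps are plain seconds, and events for a given key are assumed to arrive in non-decreasing time order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class WindowCounter:
    """Counts events per key inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds: float, threshold: int) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, key: str, timestamp: float) -> bool:
        """Register one event; return True when the threshold is reached."""
        events = self._events[key]
        events.append(timestamp)
        # Drop events that have fallen out of the window.
        while events and events[0] <= timestamp - self.window_seconds:
            events.popleft()
        return len(events) >= self.threshold
```

The same structure extends to the cumulative-amount aggregation by storing `(timestamp, amount)` pairs and summing the amounts instead of counting entries.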

According to an embodiment of the present invention, the algorithm resides and operates in main memory, and the simple architecture shortens processing time, which makes real-time analysis of anomalies possible.

FIG. 1 is a schematic diagram of the structure of an in-memory-based real-time anomaly analysis system according to an embodiment of the present invention.
FIG. 2 illustrates the real-time anomaly analysis method performed in the RTA analysis processor unit 120 of the in-memory-based real-time anomaly analysis system according to an embodiment of the present invention.
FIG. 3 illustrates an example in which the in-memory-based real-time anomaly analysis system according to an embodiment of the present invention is applied to a financial system.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this application, when a part is said to "include" a component, this means that it may further include other components, not that it excludes them, unless specifically stated otherwise.

In addition, terms such as "...unit", "...device", and "module" used in the specification denote a unit that processes at least one function or operation, and such a unit may be implemented in hardware, in software, or in a combination of hardware and software.

In addition, terms such as first and second may be used in describing the components of embodiments of the present invention. These terms serve only to distinguish one component from another and do not limit the nature, sequence, or order of the components. When a component is described as being "connected", "coupled", or "joined" to another component, the component may be directly connected, coupled, or joined to that other component, but it should be understood that a further component may also be "connected", "coupled", or "joined" between the two.

Hereinafter, an in-memory-based real-time anomaly analysis system and method according to an implementation of the present invention are described in detail.

An in-memory-based data processing system is a database that holds all data in main memory rather than on disk. Its greatest advantage is that data access is much faster than a disk lookup. This resolves the problem of database response times degrading as data volume grows rapidly, and makes real-time anomaly analysis possible.

A typical disk-based approach runs queries against data stored on disk, whereas an in-memory approach keeps an index in memory so that all required information can be retrieved quickly through that in-memory index.

FIG. 1 is a schematic diagram of the structure of an in-memory-based real-time anomaly analysis system according to an embodiment of the present invention.

Referring to FIG. 1, the in-memory-based real-time anomaly analysis system 100 according to an embodiment of the present invention includes: a first Kafka unit 150, which receives raw data for analysis and in which a plurality of Kafka modules are arranged in parallel to distribute the input raw data and forward it to a parser unit 160; the parser unit 160, which parses the unstructured raw data received from the first Kafka unit 150 and converts it into second input data, which is normalized data; a second Kafka unit 170, in which a plurality of Kafka modules are arranged in parallel to distribute the second input data converted by the parser unit 160 and forward it to the RTA unit; an RTA analysis processor unit 120, which analyzes the second input data received from the second Kafka unit 170 in real time according to a given process; a profiler unit 130, which profiles the second input data analyzed by the RTA analysis processor unit 120 according to a set condition process and converts the profiled data into key=value format; and a Redis unit 110, which stores in memory the data converted into key=value format by the profiler unit 130.

In an embodiment of the present invention, the first Kafka unit 150 and the second Kafka unit 170 are placed before and after the parser unit 160 and serve as distributed data queues (temporary storage) for distributed data processing.

According to an embodiment of the present invention, a Kafka module is a module used as a distributed data queue that temporarily buffers additional incoming data while the connected process that received data from a socket is still reading it.

In an embodiment of the present invention, the first Kafka unit 150 distributes processing across 2 to 10 Kafka modules, matched to the throughput of the parser unit 160, which makes data transmission more efficient and increases the transmission speed.

In FIG. 1, when the amount of input data exceeds the amount the parser unit 160 can parse in real time, the data is distributed across the 2 to 10 Kafka modules of the first Kafka unit 150 and processed sequentially. Likewise, the parser unit 160 normalizes data and the RTA analysis processor unit 120 must then analyze it; if the parser's data processing throughput is higher than that of the RTA, the data is distributed across the 2 to 10 Kafka modules of the second Kafka unit 170 and transmitted sequentially, so that the parser unit 160 can keep parsing incoming data without waiting for the RTA analysis processor unit 120 to finish processing.
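The decoupling described above, where a queue sits on each side of the parser so that no stage waits for the next, can be sketched in plain Python. This is an in-process illustration only: `queue.Queue` stands in for the first and second Kafka units and is not the Kafka client API, and `run_pipeline`, `parse`, and `analyze` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Iterable, List

_DONE = object()  # sentinel marking the end of the stream


def run_pipeline(raw_items: Iterable[str],
                 parse: Callable[[str], object],
                 analyze: Callable[[object], object]) -> List[object]:
    first_queue: queue.Queue = queue.Queue()   # stand-in for the first Kafka unit
    second_queue: queue.Queue = queue.Queue()  # stand-in for the second Kafka unit
    results: List[object] = []

    def parser_worker() -> None:
        # Parses as fast as it can; output waits in second_queue.
        while True:
            item = first_queue.get()
            if item is _DONE:
                second_queue.put(_DONE)
                return
            second_queue.put(parse(item))

    def analyzer_worker() -> None:
        # Consumes normalized records at its own pace.
        while True:
            item = second_queue.get()
            if item is _DONE:
                return
            results.append(analyze(item))

    workers = [threading.Thread(target=parser_worker),
               threading.Thread(target=analyzer_worker)]
    for worker in workers:
        worker.start()
    for raw in raw_items:
        first_queue.put(raw)
    first_queue.put(_DONE)
    for worker in workers:
        worker.join()
    return results
```

With one worker per stage the output order matches the input order; running several workers per stage would trade that ordering for throughput.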

Accordingly, an embodiment of the present invention speeds up the overall data flow and thereby makes real-time analysis possible.

According to an embodiment of the present invention, the parser unit 160 parses unstructured raw data and converts it into normalized data.

The parser unit 160 refers to a program that determines whether a given sequence of terminal symbols can be generated by a particular grammar and finds the sequence of production rules that derives that sequence from the start symbol. Among structured inputs, it parses sentences written in a natural language or programs written in an artificial language according to grammar rules and converts them.

For example, input unstructured data may look like the following.

- Unstructured data example 1: "KR", "abc", "1.1.1.1", ...

- Unstructured data example 2: con="JP", id="efg", ip="2.2.2.2", ...

So that the RTA analysis processor unit 120 can perform its analysis efficiently, the parser unit 160 converts the unstructured data into normalized second input data, as shown in Table 1 below.

Table 1 shows an example of normalized second input data converted from the input unstructured data.

Country    ID     IP
KR         abc    1.1.1.1
JP         efg    2.2.2.2
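A normalization step of this kind can be sketched as follows. This is a minimal Python illustration, not the parser described in the specification: the function name `normalize`, the positional field order (country, ID, IP), and the mapping of the `con` key to `country` are assumptions inferred from the two examples and Table 1.

```python
import re
from typing import Dict

# Assumed column order for positional input such as: "KR", "abc", "1.1.1.1"
FIELDS = ("country", "id", "ip")
# Assumed mapping from raw keys to normalized field names.
KEY_ALIASES = {"con": "country", "id": "id", "ip": "ip"}

_PAIR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')  # matches key="value"
_QUOTED = re.compile(r'"([^"]*)"')            # matches "value"


def normalize(raw: str) -> Dict[str, str]:
    """Convert one raw line in either example format into a flat record."""
    pairs = _PAIR.findall(raw)
    if pairs:
        # Keyed format: rename known keys, ignore unknown ones.
        return {KEY_ALIASES[k]: v for k, v in pairs if k in KEY_ALIASES}
    # Positional format: assign values to fields in order.
    return dict(zip(FIELDS, _QUOTED.findall(raw)))
```

Both example inputs then yield records with the same three fields, which is what lets the analysis stage treat them uniformly.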

The RTA analysis processor unit 120 is an analysis processor that analyzes the incoming second input data in real time against the profiled data in the Redis unit 110 according to a given process. In an embodiment of the present invention, for real-time processing speed, it is formed of a plurality of RTA (real-time analyzer) analysis processors arranged in parallel, which divide the processing load among themselves and analyze in parallel.

For example, if there are 100 scenarios to analyze in real time and two RTA analysis processors are running, each processor is assigned 50 scenarios and analyzes them in real time.
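The division of scenarios among processors can be illustrated with a few lines of Python. The round-robin assignment used here is an assumption; the text states only that the scenarios are divided among the processors, and `partition_scenarios` is a hypothetical name.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def partition_scenarios(scenarios: Sequence[T], processor_count: int) -> List[List[T]]:
    """Split scenarios across processors in round-robin order."""
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")
    # Processor i receives scenarios i, i + n, i + 2n, ...
    return [list(scenarios[i::processor_count]) for i in range(processor_count)]


# 100 scenarios over 2 processors gives 50 scenarios each.
assignments = partition_scenarios(list(range(100)), 2)
```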

The profiler unit 130 according to an embodiment of the present invention is a module that continuously profiles the second input data on the basis of a specific reference value.

Profiling is the task of computing and accumulating data over a certain period on the basis of a specific reference value (for example, a customer number or ID). For example, the terminal devices, countries, account numbers, transaction channels, and average transaction amounts with which a customer has transacted are accumulated per customer number.

To apply this to a financial system for anomaly detection, an embodiment of the present invention includes a step of continuously profiling each customer's past history and accumulating the data, as in the example above.

For example, suppose a scenario treats a transfer of 50 million won or more as abnormal behavior. If a customer now transfers 50 million won or more, but to an account with which the customer has transacted in the past, the transfer can be regarded as normal.

In an embodiment of the present invention, the counterparty account numbers to which each customer has transferred are continuously profiled and accumulated in memory, and the RTA analysis processor unit 120 compares them with the current data to determine anomalies step by step according to the conditions.
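The 50 million won example can be worked through in a short Python sketch. The `Profiler` class and the function `is_suspicious_transfer` are hypothetical names for illustration; a real deployment would keep the accumulated accounts in the in-memory store rather than in a Python object.

```python
from collections import defaultdict
from typing import DefaultDict, Set

LARGE_TRANSFER_KRW = 50_000_000  # threshold from the example scenario


class Profiler:
    """Accumulates, per customer, the accounts the customer has transferred to."""

    def __init__(self) -> None:
        self._accounts: DefaultDict[str, Set[str]] = defaultdict(set)

    def record_transfer(self, customer_id: str, counterparty_account: str) -> None:
        self._accounts[customer_id].add(counterparty_account)

    def has_transferred_to(self, customer_id: str, counterparty_account: str) -> bool:
        return counterparty_account in self._accounts.get(customer_id, set())


def is_suspicious_transfer(profiler: Profiler, customer_id: str,
                           counterparty_account: str, amount_krw: int) -> bool:
    """A large transfer is suspicious only if the counterparty is new."""
    if amount_krw < LARGE_TRANSFER_KRW:
        return False
    return not profiler.has_transferred_to(customer_id, counterparty_account)
```

A customer who has previously paid account "111" can therefore send 60 million won to it without being flagged, while the same amount to an unseen account is flagged.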

The profiler unit 130 converts the profiled data into non-relational data in key-value form, and the Redis unit 110 stores the converted key-value data in memory.

A typical DBMS creates multiple tables, can define relationships among them, and manages data by creating multiple fields in each table. By contrast, the Redis unit 110 according to an embodiment of the present invention stores data in a simple key=value form, so looking up the value for a given key is very fast, which makes real-time lookup possible.

The key=value scheme stores data as a collection of key-value pairs in which the key serves as a unique identifier. Keys and values can be anything from simple objects to complex compound objects. The key=value scheme supports partitioning and allows horizontal scaling to a degree that other types of databases cannot achieve. Such a key=value scheme can be implemented using a key-value database such as "DynamoDB".
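To make the key=value layout concrete, the sketch below flattens a customer profile into string keys and values. It is illustrative only: a plain Python dict stands in for the in-memory store, and the key pattern `profile:<customer>:<field>` and the JSON encoding of values are assumptions, not details given in the specification.

```python
import json
from typing import Dict, Iterable, List


def profile_to_key_values(customer_id: str,
                          profile: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Flatten one customer's profile into key=value pairs."""
    return {
        f"profile:{customer_id}:{name}": json.dumps(sorted(values))
        for name, values in profile.items()
    }


def lookup(store: Dict[str, str], customer_id: str, name: str) -> List[str]:
    """Fetch one profiled field with a single key lookup."""
    return json.loads(store.get(f"profile:{customer_id}:{name}", "[]"))


store: Dict[str, str] = {}  # stand-in for the in-memory key-value store
store.update(profile_to_key_values(
    "C001", {"accounts": ["222", "111"], "countries": ["KR"]}))
```

Because each profiled field lives under its own key, the analysis stage needs one lookup per check and no joins.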

FIG. 2 illustrates the real-time anomaly analysis method performed in the RTA analysis processor unit 120 of the in-memory-based real-time anomaly analysis system according to an embodiment of the present invention.

The in-memory-based real-time anomaly analysis system according to an embodiment of the present invention can be applied to a financial system to analyze financial transactions for anomalies and output the relevant information when an anomaly is found.

FIG. 3 illustrates an example in which the in-memory-based real-time anomaly analysis system according to an embodiment of the present invention is applied to a financial system.

Referring to FIG. 3, a customer of the financial system uses a customer terminal 500 to connect to the financial server 200 over a network and conduct financial transactions.

The customer terminal includes a smartphone, telebanking, a PC, an ATM, and the like.

The in-memory-based real-time anomaly analysis system 100 according to an embodiment of the present invention receives customers' financial processing data in real time through the MCI (Multi channel integration, 210) and the account system 250 of the financial server 200.

The in-memory-based real-time anomaly analysis system 100 according to an embodiment of the present invention receives, through the MCI (Multi channel integration, 210) customer interface of the financial server 200, data on the financial transactions that customers process using the customer terminal 500. In addition, by connecting to the account system 250 of the financial server 200, it receives in real time data on the financial transactions that customers process through other financial institutions.

Referring to FIG. 2, in the real-time anomaly analysis method performed in the RTA analysis processor unit 120 of the in-memory-based real-time anomaly analysis system according to an embodiment of the present invention, the first comparison field processing (1st comparison field operation) step 1210 is performed first.

In the first comparison field processing (1st comparison field operation) step 1210, data corresponding to the required specific condition is extracted from the normalized data requiring analysis that was transmitted from the second Kafka unit 170, and is compared.

The first comparison field processing (1st comparison field operation) step 1210 according to an embodiment of the present invention is performed as an operation on a condition field and a data field.

For example,

- The following operations on fields and values are supported: =, !=, like, not like, in, not in, >, <, >=, <=, and so on.

- Example: given variables A, B, C, and D

(A=1 and B <= 3 and A != B) or (C like 'man*' and D in ('t', 's')) …
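The example condition above can be evaluated with a small operator table. This Python sketch is an illustration under assumptions: the helper names `check` and `example_condition` are hypothetical, and `like` is interpreted as shell-style wildcard matching (`*` matches any run of characters), which the text does not define precisely.

```python
import operator
from fnmatch import fnmatchcase
from typing import Callable, Dict, Mapping

# One entry per operator listed in the text.
OPERATORS: Dict[str, Callable[[object, object], bool]] = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "like": lambda value, pattern: fnmatchcase(str(value), pattern),
    "not like": lambda value, pattern: not fnmatchcase(str(value), pattern),
    "in": lambda value, choices: value in choices,
    "not in": lambda value, choices: value not in choices,
}


def check(record: Mapping[str, object], field: str, op: str, operand: object) -> bool:
    """Apply one field/operator/operand comparison to a record."""
    return OPERATORS[op](record[field], operand)


def example_condition(r: Mapping[str, object]) -> bool:
    # (A=1 and B <= 3 and A != B) or (C like 'man*' and D in ('t', 's'))
    left = check(r, "A", "=", 1) and check(r, "B", "<=", 3) and r["A"] != r["B"]
    right = check(r, "C", "like", "man*") and check(r, "D", "in", ("t", "s"))
    return left or right
```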

In addition, when the in-memory-based real-time anomaly analysis system according to an embodiment of the present invention is applied to a financial system as a method of analyzing anomalies in financial processing, the specific condition may be set to a condition corresponding to financial fraud.

For example, in the method of analyzing anomalies in financial processing in which the in-memory-based real-time anomaly analysis system according to an embodiment of the present invention is applied to a financial system, the specific condition in the first comparison field processing (1st comparison field operation) step 1210 may be set to "a transaction in which the connecting country is not Korea, the connecting terminal is mobile, and a transfer was made".

The conditions corresponding to such financial fraud may include, as terminal information, a terminal identification key (MAC, HDD, IMEI, UUID, androidID, widevineID), access IP, access country code, VPN usage, VM usage, proxy usage, Wi-Fi usage, terminal OS, terminal language, and the like. They may also be set to include, as transaction information, transaction type (preliminary transaction/main transaction), transaction channel, deposit/withdrawal classification, customer number, ID, terminal information, account number, counterparty account number, counterparty bank code, transaction amount, balance, account holder name, counterparty account holder name, corporate account status, change record code, and the like.
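As a minimal sketch, the example condition "access country is not Korea, access terminal is mobile, and a transfer is made" could be expressed over a normalized record as shown below. The `Transaction` class, its field names, and the value encodings (`"KR"`, `"mobile"`, `"transfer"`) are assumptions made for illustration; the description lists the kinds of fields but not their names or formats.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical normalized record holding a small subset of the listed fields."""
    customer_no: str
    country_code: str          # access country code, assumed to be e.g. "KR"
    device_type: str           # assumed encoding: "mobile", "pc", ...
    tx_type: str               # assumed encoding: "transfer", "inquiry", ...
    device_key: str            # terminal identification key
    counterparty_account: str
    amount: int                # transaction amount in KRW


def is_special_condition(tx: Transaction) -> bool:
    """Country is not Korea, terminal is mobile, and the transaction is a transfer."""
    return (
        tx.country_code != "KR"
        and tx.device_type == "mobile"
        and tx.tx_type == "transfer"
    )
```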

If the data does not correspond to the specific condition in the first comparison field operation step (1210) (NO), the analysis ends, and the input data is profiled by the profiler (130) and stored in the Redis unit (110).

If the data corresponds to the specific condition in the first comparison field operation step (1210) (YES), the second profiled data comparison step (1220) is performed, in which the data requiring analysis is compared with previously profiled data based on a specific reference value.

For the analysis of anomalies in financial processing, examples of the specific reference value in the second profiled data comparison step (1220) may include the following.

- Whether the customer has used the service overseas in the past

- Whether the terminal has been used by the customer in the past

- Whether the account is one to which the customer has transferred in the past

If the specific reference value is satisfied (YES), the analysis ends, and the input data is profiled by the profiler (130), converted into key=value form, and stored in the Redis unit (110).
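The comparison against past profiled data can be sketched as below. A plain dictionary stands in for the Redis unit (110), and the key layout `profile:<customer>:<attribute>` is a hypothetical choice. The sketch also assumes that the reference value counts as satisfied when at least one of the three checks matches the customer's history, so that it is unsatisfied only when all three fail, as in the example given in the description.

```python
# Stand-in for the Redis unit: key -> set of values seen for that customer.
# The key layout is an assumption for illustration only.
profile_store = {}


def _key(customer_no, attribute):
    return f"profile:{customer_no}:{attribute}"


def update_profile(customer_no, country_code, device_key, account):
    """Profile a transaction into key=value form after it has been analyzed."""
    profile_store.setdefault(_key(customer_no, "countries"), set()).add(country_code)
    profile_store.setdefault(_key(customer_no, "devices"), set()).add(device_key)
    profile_store.setdefault(_key(customer_no, "accounts"), set()).add(account)


def matches_past_profile(customer_no, device_key, account, home_country="KR"):
    """Return True if any of the three history checks is satisfied."""
    countries = profile_store.get(_key(customer_no, "countries"), set())
    devices = profile_store.get(_key(customer_no, "devices"), set())
    accounts = profile_store.get(_key(customer_no, "accounts"), set())

    used_overseas_before = any(code != home_country for code in countries)
    known_device = device_key in devices
    known_account = account in accounts
    return used_overseas_before or known_device or known_account
```

Because each lookup is a single key read followed by a set membership test, this kind of check stays cheap enough to run per transaction, which is consistent with the in-memory design the description emphasizes.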

On the other hand, when the customer has no history of overseas transactions, uses a terminal not used in the past, and transacts with an account with which there is no past transaction history, that is, when the specific reference value is not satisfied (NO), the anomaly class aggregation for the customer is extracted from memory (1232) through the aggregation step (1231) and set, and the third aggregation data comparison step (1235) is performed against that setting.

In an embodiment of the present invention, the aggregation step (1231) includes counting the number of transactions, per customer, for customers who have no overseas transaction history, used a terminal not used in the past, and transacted with an account with which there is no past transaction history.

In addition, in the aggregation data by pivot value step (1232), the result computed in step 1231 is temporarily stored in memory for a threshold time, and the third aggregation data comparison step (1235) may include comparing that result against a threshold.

For example, for the rule "within one hour, a customer who has no past overseas transaction history and is using a terminal not used in the past transacts with two or more accounts with which there is no past transaction history", the aggregation step (1231) in an embodiment of the present invention includes counting, over one hour and per customer, the number of transactions by customers who have no overseas transaction history, used a terminal not used in the past, and transacted with an account with which there is no past transaction history.

In addition, in the aggregation data by pivot value step (1232), the result computed in the aggregation step (1231) is temporarily stored in memory, and in the third aggregation data comparison step (1235) the transaction is detected as an anomaly if two or more such transactions have occurred.

For the analysis of anomalies in financial processing, examples of the aggregation data and of the comparative analysis set in the third aggregation data comparison step (1235) may further include the following.

- Configuration values: duration, reference value, target value, operator (count, sum), threshold

- When the count for the reference value exceeds the threshold during the reference time or duration

(Ex> When the same ID (reference value) accesses 10 (threshold) or more times within 1 hour)

- When the SUM of the target value for the reference value exceeds the threshold during the reference time or duration

(Ex> A customer (reference value) who transfers a cumulative amount (target value) of KRW 10 million (threshold) or more within 30 minutes)

- When the unique COUNT of the target value for the reference value exceeds the threshold during the duration

(Ex> When 10 (threshold) or more IDs (target value) access from the same terminal (reference value) within 1 hour)

- The above computations are kept in memory only for the duration, and data older than the duration is deleted from memory (why)
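The configuration values listed above (duration, reference value, target value, operator, threshold) can be sketched as one generic rule object. The class name `AggregationRule`, the operator label `"unique_count"`, and the use of seconds for time are assumptions. The comparison uses "at or above the threshold" to follow the worked examples ("10 or more", "KRW 10 million or more"), although the bullets themselves say "exceeds".

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class AggregationRule:
    """Hypothetical rule: aggregate target values per reference value over a duration."""
    duration: float     # seconds the data is kept in memory
    operator: str       # "count", "sum", or "unique_count"
    threshold: float
    events: dict = field(default_factory=lambda: defaultdict(deque))

    def observe(self, reference_value, target_value, timestamp):
        """Record one event and report whether the threshold is reached."""
        window = self.events[reference_value]
        window.append((timestamp, target_value))
        # Data older than the duration is deleted from memory.
        while window and window[0][0] <= timestamp - self.duration:
            window.popleft()

        targets = [target for _, target in window]
        if self.operator == "count":
            result = len(targets)
        elif self.operator == "sum":
            result = sum(targets)
        elif self.operator == "unique_count":
            result = len(set(targets))
        else:
            raise ValueError(f"unsupported operator: {self.operator}")
        return result >= self.threshold
```

For instance, the second example would be `AggregationRule(duration=1800, operator="sum", threshold=10_000_000)` with the customer as the reference value and the transfer amount as the target value.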

The anomaly class aggregation for a given customer according to an embodiment of the present invention is characterized in that it is continuously accumulated in memory for anticipated financial incidents.

For example, when "a customer with no past overseas transaction history uses a mobile device not used in the past to transfer KRW 10 million or more overseas within one hour to an account to which the customer has never transferred", the set aggregation data is judged to be satisfied (YES), the transaction is output as an anomalous transaction, and the input data requiring analysis is profiled by the profiler (130) and stored in the Redis unit (110).

Alternatively, if the data does not correspond to the set aggregation data (NO), the input data is profiled by the profiler (130) and stored in the Redis unit (110).
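Taken together, the three comparison steps form a short-circuiting chain in which every branch ends with the input data being profiled and stored. The sketch below illustrates that control flow only; the function `analyze` and its callable parameters are hypothetical placeholders for steps 1210, 1220, and 1231 to 1235 and for the profiler (130).

```python
def analyze(tx, is_special, matches_profile, reaches_aggregation, store_profile):
    """Run the three stages in order and always profile the input afterwards.

    is_special          -- stage 1: does the record meet the specific condition?
    matches_profile     -- stage 2: is the reference value satisfied by past data?
    reaches_aggregation -- stage 3: does the aggregation reach its threshold?
    store_profile       -- profiler: persist the record for future comparisons
    """
    # "and" short-circuits, so later stages run only when earlier ones require it.
    anomaly = bool(
        is_special(tx)
        and not matches_profile(tx)
        and reaches_aggregation(tx)
    )
    # In every branch of the flow the input data is profiled and stored.
    store_profile(tx)
    return anomaly
```

Most traffic would be expected to exit at the first or second stage, so the comparatively more expensive aggregation runs only for the small share of transactions that look unfamiliar.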

100: analysis system
110: Redis unit
120: RTA analysis processor unit
130: profiler unit
150: first Kafka unit
160: parser unit
170: second Kafka unit
200: financial server
500: customer terminal

Claims

An in-memory-based real-time anomaly analysis system,
the analysis system comprising:
a first Kafka unit formed of a plurality of Kafka modules and receiving input data for analysis;
a parser unit parsing the input data transmitted from the first Kafka unit and converting it into second input data, which is normalized data;
a second Kafka unit distributing the second input data converted by the parser unit across a plurality of Kafka modules and transmitting it;
an RTA analysis processor unit performing real-time analysis, according to a given process, on the second input data transmitted from the second Kafka unit;
a profiler unit profiling the second input data analyzed by the RTA analysis processor unit (120) according to a condition process set based on a specific reference value, and converting the profiled data into key-value form; and
a Redis unit storing, in memory, the data converted into key-value form by the profiler unit.

The analysis system of claim 1,
wherein the first Kafka unit and the second Kafka unit are each formed of 2 to 10 Kafka modules arranged in parallel.

The analysis system of claim 1,
wherein the RTA analysis processor unit is formed of a plurality of RTA analysis processors, each of which is assigned a share of the processing volume and performs analysis in parallel.

The analysis system of claim 1,
wherein, in the profiler unit, the specific reference value includes a customer number or an ID, and the set condition process includes any one or more of the terminal device, country, account number, transaction channel, and average transaction amount with which transactions have been made, based on the specific reference value.

The analysis system of claim 1,
wherein the analysis system receives the input data through a customer interface and an account system of a financial server.

An analysis method for analyzing anomalies in real time in the RTA analysis processor unit of the analysis system of claim 1, the method comprising:
a first comparison field processing step of comparing whether the second input data transmitted from the second Kafka unit includes data corresponding to a specific condition;
a second profiled data comparison step of, when the specific condition is met in the first comparison field processing step, comparing the data requiring analysis with previously profiled data based on a specific reference value; and
a third aggregation data comparison step of, when the specific reference value is not satisfied in the second profiled data comparison step, extracting an anomaly class aggregation for the corresponding customer from memory, setting it, and comparing the second input data with the set anomaly class aggregation,
wherein the RTA analysis processor unit outputs the transaction as an anomalous transaction when it corresponds to the anomaly class aggregation set in the third aggregation data comparison step.

The analysis method of claim 6,
wherein, when the second input data does not correspond to the anomaly class aggregation set in the third aggregation data comparison step, the second input data is profiled by the profiler and stored in the Redis unit.

The analysis method of claim 6,
wherein the specific condition of the first comparison field processing step
is "a transaction in which the access country is not Korea, the access terminal is a mobile device, and a transfer is made", or includes, as terminal information, any one or more of a terminal identification key, access IP, access country code, VPN usage, VM usage, proxy usage, Wi-Fi usage, terminal OS, and terminal language,
or includes, as transaction information, any one or more of transaction channel, deposit/withdrawal classification, customer number, ID, terminal information, account number, counterparty account number, counterparty bank code, transaction amount, balance, account holder name, counterparty account holder name, corporate account status, and change record code.

The analysis method of claim 6,
wherein the specific reference value of the second profiled data comparison step
includes any one or more of whether the customer has used the service overseas in the past, whether the terminal has been used by the customer in the past, and whether the account is one to which the customer has transferred in the past.

The analysis method of claim 6,
wherein the anomaly class aggregation set in the third aggregation data comparison step
includes any one or more of: a case in which the same ID accesses a set threshold number of times or more during a reference time; a customer whose accumulated transferred amount during the reference time is at or above a threshold; and a case in which the count for a reference value exceeds a threshold during a duration.