KR102156891B1

KR102156891B1 - System and method for detecting and blocking web attack through web protocol behavior analysis based on AI machine learning

Info

Publication number: KR102156891B1
Application number: KR1020200023050A
Authority: KR
Inventors: 이대호; 이동근; 이인영; 김현수; 이형; 신경아; 진세민
Original assignee: 주식회사 에프원시큐리티
Priority date: 2020-02-25
Filing date: 2020-02-25
Publication date: 2020-09-16
Also published as: JP2023516621A; WO2021172711A1; JP7391313B2

Abstract

Provided is an artificial intelligence-based web attack detection system comprising: a filter unit which receives a plurality of HTTP request packets from a web user; a learning unit which extracts a plurality of features from the plurality of HTTP request packets, clusters the plurality of HTTP request packets into a plurality of groups based on the plurality of features, transmits the clustered information to a web manager server, receives from the web manager server labeling information on whether the plurality of groups are abnormal clusters, and performs machine learning based on the labeling information; and an analysis unit which determines whether an HTTP request packet received from the web user is a web attack packet by using the machine learning with the HTTP request packet received from the web user as an input variable.

Description

System and method for detecting and blocking web attacks through AI machine-learning, behavior-based web protocol analysis {SYSTEM AND METHOD FOR DETECTING AND BLOCKING WEB ATTACK THROUGH WEB PROTOCOL BEHAVIOR ANALYSIS BASED ON AI MACHINE LEARNING}

The present invention relates to a web attack detection system and method, and more particularly to a web attack detection system and method that uses AI machine-learning, behavior-based analysis of the web protocol.

Currently, payload-oriented research on attack detection in HTTP communication, such as injection attack detection, parameter inspection, and uploaded-binary inspection, is being actively conducted.

Research on AI-based attack detection methods is also being actively conducted. Conventional AI-based attack detection methods extract data sets and features from the network and the payload (length of payload, byte entropy of payload, number of distinct bytes, etc.), and consequently suffer from low web attack detection accuracy.
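For illustration only, the payload-level features named above (length, byte entropy, number of distinct bytes) could be computed as in the following minimal sketch. The function name `payload_features` and the returned keys are hypothetical and do not appear in the specification; entropy here is Shannon entropy in bits per byte.

```python
import math
from collections import Counter


def payload_features(payload: bytes) -> dict:
    """Compute the three conventional payload features (hypothetical helper)."""
    counts = Counter(payload)  # occurrences of each byte value
    n = len(payload)
    # Shannon entropy in bits per byte; defined as 0.0 for an empty payload.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    return {
        "length": n,
        "byte_entropy": entropy,
        "distinct_bytes": len(counts),
    }
```

Features of this kind describe a single payload in isolation and carry no information about how a user's requests relate to one another, which is the limitation the behavior-based approach addresses.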

Accordingly, there is a need for a technology that can improve the accuracy of web attack detection by performing feature selection, extraction, and clustering based on the user's web behavior.

The technical problem to be solved by the present invention is to provide an AI-based web attack detection system and method that can improve the accuracy of web attack detection by performing feature selection, extraction, and clustering based on a user's web behavior.

According to an embodiment, an AI-based web attack detection system is provided. The web attack detection system includes: a filter unit that receives a plurality of HTTP request packets from a web user; a learning unit that extracts a plurality of features from the plurality of HTTP request packets, clusters the plurality of HTTP request packets into a plurality of groups based on the plurality of features, transmits the clustered information to a web manager server, receives from the web manager server labeling information on whether the plurality of groups are abnormal clusters, and performs machine learning based on the labeling information; and an analysis unit that determines whether an HTTP request packet received from the web user is a web attack packet by using the machine learning with the HTTP request packet received from the web user as an input variable.

The web attack detection system may further include a visualization unit that, based on the clustered information, displays each cluster in a different color on a screen and receives labeling information corresponding to each cluster from the web manager server.

The plurality of features may include at least one of: the remote public IP of the web user, the main request packet of the web user, the number of sub-request packets linked to the main request packet, the resource types of the sub-request packets, the number of sub-requests per resource type, the header of the request packet, the session ID of the requesting user, the session ID generation interval, the number of repeated session ID renewals, a change in the header cookie within a group of request packets, and a change in the header user agent within a group of request packets.

When the HTTP request packet received from the web user is determined to be a web attack packet, the analysis unit may block the requested resource or perform a redirection operation.

According to an embodiment, an AI-based web attack detection method is provided. The web attack detection method includes performing machine learning using a plurality of HTTP request packets received from a web user, and determining whether an HTTP request packet received from the web user is a web attack packet by using the machine learning with the HTTP request packet received from the web user as an input variable. Performing the machine learning includes receiving a plurality of HTTP request packets from the web user, extracting a plurality of features from the plurality of HTTP request packets, clustering the plurality of HTTP request packets into a plurality of groups based on the plurality of features, transmitting the clustered information to a web manager server, receiving from the web manager server labeling information on whether the plurality of groups are abnormal clusters, and performing machine learning based on the labeling information.

The accuracy of web attack detection can be improved by performing feature selection, extraction, and clustering based on the user's web behavior.

By performing feature selection, extraction, and clustering on bundles of multiple HTTP request packets, it is possible to detect a hacker's anomalous behavior preceding a hacking attempt (e.g., requests for a single isolated resource and command requests, deliberate inducement of explicit errors, periodic requests for non-existent resources, uniform request patterns at regular intervals, persistent occurrence of the same error, and impossible-travel request behavior identified through GeoIP).

By selecting, extracting, and clustering features on the value of each field, the accuracy of attack detection can be improved compared with clustering the content as a whole.

FIG. 1 is a block diagram of an AI-based web attack detection system according to an embodiment.
FIG. 2 is a diagram illustrating HTTP request packets according to an embodiment.
FIG. 3 is a diagram illustrating the operation of a visualization unit according to an embodiment.
FIGS. 4 and 5 are flowcharts of an AI-based web attack detection method according to an embodiment.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice them. The present invention may, however, be implemented in various different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a block diagram of an AI-based web attack detection system according to an embodiment. FIG. 2 is a diagram illustrating HTTP request packets according to an embodiment. FIG. 3 is a diagram illustrating the operation of a visualization unit according to an embodiment.

Referring to FIG. 1, an AI-based web attack detection system 100 according to an embodiment includes a filter unit 110, a learning unit 120, an analysis unit 130, a database unit 140, and a visualization unit 150.

The filter unit 110 receives a plurality of HTTP request packets from a web user 10. Upon receiving an HTTP request packet from the web user 10, the filter unit 110 transmits an attack detection request message to the analysis unit 130.

In an embodiment, the filter unit 110 may receive an attack detection result from the analysis unit 130 and either perform a blocking operation (e.g., Deny, Allow) or forward the filtered data to the web application.

In an embodiment, the filter unit 110 may deliver to the actual web user 10 an HTTP response that has been filtered by the analysis unit 130 (e.g., with comments and personal information such as resident registration numbers removed). In an embodiment, the filter unit 110 may be an Apache filter module.
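As a rough sketch of the kind of response filtering mentioned above, the snippet below strips HTML comments and redacts strings shaped like resident registration numbers. It rests on assumptions not stated in the specification: that "comments" means HTML comments, that the number is written as six digits, a hyphen, and seven digits, and that a redacted value is replaced with a fixed placeholder. The names `filter_response_body`, `RRN_PATTERN`, and `HTML_COMMENT_PATTERN` are hypothetical.

```python
import re

# Assumed format: YYMMDD-NNNNNNN (six digits, hyphen, seven digits).
RRN_PATTERN = re.compile(r"\b\d{6}-\d{7}\b")
# Assumed meaning of "comments": HTML comments, possibly spanning lines.
HTML_COMMENT_PATTERN = re.compile(r"<!--.*?-->", re.DOTALL)


def filter_response_body(body: str) -> str:
    """Remove HTML comments and redact resident-number-like strings."""
    body = HTML_COMMENT_PATTERN.sub("", body)
    return RRN_PATTERN.sub("[REDACTED]", body)
```

A production filter would also need to handle other encodings of the same data (for example, numbers without a hyphen), which this sketch deliberately ignores.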

The learning unit 120 preprocesses the plurality of HTTP request packets (web traffic) received from the web user 10, performs feature selection, extraction, and clustering on the preprocessed data, and performs machine learning based on the labeling information received from the web manager server 30.

Specifically, in an embodiment, the learning unit 120 uses a pre-stored algorithm to convert, as a preprocessing step, HTTP traffic information in JSON format into data suitable for feature extraction.
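The specification does not give the JSON schema of the traffic records, so the following is only a sketch under an assumed schema with the keys `remote_ip`, `timestamp`, `method`, `uri`, and `headers`. The function name `preprocess` and the derived `resource_type` field (taken from the URI's file extension, with `"document"` as a fallback) are likewise hypothetical.

```python
import json
import posixpath
from urllib.parse import urlsplit


def preprocess(raw_json: str) -> dict:
    """Normalize one JSON traffic record (assumed schema) for feature extraction."""
    record = json.loads(raw_json)
    # Header names are case-insensitive in HTTP, so normalize them to lower case.
    headers = {k.lower(): v for k, v in record.get("headers", {}).items()}
    parts = urlsplit(record["uri"])
    # Derive a coarse resource type from the file extension of the path.
    ext = posixpath.splitext(parts.path)[1].lstrip(".").lower()
    return {
        "remote_ip": record["remote_ip"],
        "timestamp": float(record["timestamp"]),
        "method": record["method"].upper(),
        "path": parts.path,
        "query": parts.query,
        "resource_type": ext or "document",
        "headers": headers,
    }
```

Normalizing header names and splitting the URI up front keeps the later feature-extraction code free of parsing concerns.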

Using a pre-stored algorithm, the learning unit 120 selects and extracts a plurality of features from the preprocessed HTTP request packets. As shown in Table 1, the learning unit 120 extracts data values for each content type (Content-Type) of the plurality of HTTP request packets.

Using a pre-stored algorithm, the learning unit 120 clusters the plurality of HTTP request packets into a plurality of groups based on the plurality of features. The learning unit 120 may cluster all sub-requests (Sub Request) that follow an HTTP main request (Main Request) within a predetermined time (e.g., up to 10 seconds) into a request group (Request Group).
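The request grouping described above might be sketched as follows. The specification does not say how a main request is recognized, so this sketch assumes each record already carries a hypothetical `is_main` flag, along with `remote_ip` and `timestamp` keys; the names `RequestGroup`, `group_requests`, and `WINDOW_SECONDS` are also illustrative. A sub-request that arrives with no open group inside the window is treated as the start of its own group.

```python
from dataclasses import dataclass, field

WINDOW_SECONDS = 10.0  # example window from the description ("up to 10 seconds")


@dataclass
class RequestGroup:
    main: dict                                 # the request that opened the group
    subs: list = field(default_factory=list)   # sub-requests inside the window


def group_requests(records, window=WINDOW_SECONDS):
    """Group each client's sub-requests under its most recent main request."""
    groups = []
    open_group = {}  # remote_ip -> currently open RequestGroup
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        current = open_group.get(rec["remote_ip"])
        in_window = (
            current is not None
            and rec["timestamp"] - current.main["timestamp"] <= window
        )
        if rec["is_main"] or not in_window:
            # A main request, or a stray request outside any window, opens a new group.
            current = RequestGroup(main=rec)
            open_group[rec["remote_ip"]] = current
            groups.append(current)
        else:
            current.subs.append(rec)
    return groups
```

Under this rule, a request for a single resource with no accompanying page load ends up as a group with zero sub-requests, which makes such isolated requests easy to pick out downstream.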

As shown in Table 2, the plurality of features may include the remote public IP of the web user, the main request packet of the web user, the number of sub-request packets linked to the main request packet, the resource types of the sub-request packets, the number of sub-requests per resource type, the header of the request packet, the session ID of the requesting user, the session ID generation interval, the number of repeated session ID renewals, a change in the header cookie within a group of request packets, and a change in the header user agent within a group of request packets.
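To make several of the listed features concrete, the sketch below derives them from one request group. It assumes each request is a dictionary with hypothetical keys `remote_ip`, `path`, `resource_type`, and `headers` (lower-cased header names); the function name `group_features` and the output keys are illustrative. The session ID features are left out because the specification does not say how the session ID is carried.

```python
from collections import Counter


def group_features(main: dict, subs: list) -> dict:
    """Derive behavior features for one request group (main request plus sub-requests)."""
    requests = [main, *subs]
    # Distinct Cookie / User-Agent header values seen anywhere in the group.
    cookies = {r["headers"].get("cookie", "") for r in requests}
    agents = {r["headers"].get("user-agent", "") for r in requests}
    per_type = Counter(r["resource_type"] for r in subs)
    return {
        "remote_ip": main["remote_ip"],
        "main_request": main["path"],
        "sub_request_count": len(subs),
        "sub_resource_types": sorted(per_type),
        "sub_count_by_type": dict(per_type),
        "cookie_changed": len(cookies) > 1,
        "user_agent_changed": len(agents) > 1,
    }
```

A browser loading a page normally keeps the same user agent across the main request and its sub-requests, so a `user_agent_changed` value of `True` within one group is the kind of behavioral signal these features are meant to expose.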

The learning unit 120 transmits the clustered information to the web manager server 30. The learning unit 120 receives from the web manager server 30 labeling information on whether the plurality of groups are abnormal clusters. The web manager server 30 may receive, from a web administrator or a security administrator, normal or abnormal labeling settings for the clustered information.

Using a pre-stored algorithm, the learning unit 120 performs machine learning based on the labeling information received from the web manager server 30. In an embodiment, the pre-stored algorithm may be an unsupervised learning algorithm or a supervised learning algorithm.
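The specification does not name a particular learning algorithm. Purely as an illustration of how per-cluster labels from the administrator can drive supervised learning, the sketch below propagates each cluster's label to its members and then fits a nearest-centroid classifier, a deliberately simple stand-in chosen here and not taken from the source. All function names and the numeric feature vectors are hypothetical.

```python
import math


def propagate_labels(cluster_ids, cluster_labels):
    """Give every sample the label the administrator assigned to its cluster."""
    return [cluster_labels[c] for c in cluster_ids]


def train_nearest_centroid(vectors, labels):
    """Fit one centroid (mean vector) per label."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], vec))
```

For example, with vectors of the form (sub-request count, cookie-changed flag), groups that look like ordinary page loads and groups that look like isolated probing requests end up near different centroids, and a new request group is classified by whichever centroid it falls closer to.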

Referring to FIG. 2, because most web users use a web browser or a mobile app, the HTTP request packets sent to the web server may be several rather than one. By performing feature selection, extraction, and clustering on bundles of multiple HTTP request packets, the present invention can detect a hacker's anomalous behavior preceding a hacking attempt (e.g., requests for a single isolated resource and command requests, deliberate inducement of explicit errors, periodic requests for non-existent resources, uniform request patterns at regular intervals, persistent occurrence of the same error, and impossible-travel request behavior identified through GeoIP).
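One of the listed behaviors, a uniform request pattern at regular intervals, can be illustrated with a simple statistical check. The sketch below flags a series of request timestamps whose inter-arrival gaps have a very low coefficient of variation. The function name `is_uniform_interval` and the thresholds `max_cv` and `min_requests` are assumptions made for this example, not values from the specification.

```python
from statistics import mean, pstdev


def is_uniform_interval(timestamps, max_cv=0.05, min_requests=5):
    """Return True when requests arrive at suspiciously regular intervals."""
    if len(timestamps) < min_requests:
        return False  # too few requests to judge regularity
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(gaps)
    if avg == 0:
        return True  # all requests share one timestamp: treated as uniform here
    # Coefficient of variation: standard deviation relative to the mean gap.
    return pstdev(gaps) / avg <= max_cv
```

Human browsing tends to produce irregular gaps, whereas scripted scanners often fire on a fixed timer, which is why a near-zero variation is treated as a signal here.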

For a given content type (Content-Type), the content consists of fields and values. By selecting, extracting, and clustering features on the value of each field, the present invention can improve the accuracy of attack detection compared with clustering the content as a whole.
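A minimal sketch of splitting request content into per-field values by content type is shown below. It covers only two common media types, `application/x-www-form-urlencoded` and `application/json`, and falls back to treating the whole body as a single unnamed field; the function name `extract_field_values` and this fallback convention are assumptions, since Table 1 of the specification is not reproduced here.

```python
import json
from urllib.parse import parse_qsl


def extract_field_values(content_type: str, body: str) -> dict:
    """Split a request body into {field: value} according to its Content-Type."""
    # Drop parameters such as "; charset=UTF-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        return dict(parse_qsl(body, keep_blank_values=True))
    if media_type == "application/json":
        parsed = json.loads(body)
        # Non-object JSON (list, string, number) is kept as one unnamed field.
        return parsed if isinstance(parsed, dict) else {"": parsed}
    return {"": body}  # unknown type: whole body as a single unnamed field
```

Working per field means an unusual value in one parameter (for instance a password field containing `1' OR 1=1`) is not diluted by the ordinary values in the other fields of the same body.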

The analysis unit 130 determines whether an HTTP request packet received from the web user 10 is a web attack packet by using the machine learning with the HTTP request packet received from the web user 10 as an input variable.

When the HTTP request packet received from the web user 10 is determined to be a web attack packet, the analysis unit 130 may block the requested resource or perform a redirection operation.
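The block-or-redirect decision could be expressed as in the following sketch. The choice of HTTP 403 for blocking and HTTP 302 for redirection is an assumption made for illustration; the specification does not state which status codes are used. The function name `respond_to_verdict` and the `redirect_url` parameter are hypothetical.

```python
from http import HTTPStatus
from typing import Optional


def respond_to_verdict(is_attack: bool, redirect_url: Optional[str] = None):
    """Map an attack verdict to a response, or None to let the request through."""
    if not is_attack:
        return None  # not an attack: pass the request on unchanged
    if redirect_url:
        # Redirection: send the client elsewhere (assumed status: 302 Found).
        return HTTPStatus.FOUND, {"Location": redirect_url}
    # Blocking: refuse the requested resource (assumed status: 403 Forbidden).
    return HTTPStatus.FORBIDDEN, {}
```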

The analysis unit 130 transmits the analysis result for the HTTP request packet to the filter unit 110.

In an embodiment, the analysis unit 130 may be a web firewall daemon module. The analysis unit 130 may support filter modules installed on the web servers of multiple users, that is, it may support multiple web servers or virtual web servers.

The database unit 140 stores the data received by the filter unit 110 and the data analyzed or processed by the analysis unit 130.

The database unit 140 can directly store and manage JSON documents and can perform distributed storage and processing through auto-sharding. In an embodiment, the database unit 140 may be MongoDB.

Referring to FIG. 3, the visualization unit 150 may display each cluster in a different color on the screen of a web browser, based on the information clustered by the learning unit 120. The visualization unit 150 may do so using a pre-stored multi-dimensional visualization tool (e.g., TensorBoard).
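The specification relies on an existing visualization tool for rendering. As a tool-independent illustration of assigning a different color to each cluster, the sketch below spaces hues evenly around the color wheel. The function name `cluster_colors` and the saturation and brightness values are arbitrary choices for this example.

```python
import colorsys


def cluster_colors(cluster_ids):
    """Assign each distinct cluster ID an evenly spaced hue as a hex RGB string."""
    unique = sorted(set(cluster_ids))
    colors = {}
    for i, cid in enumerate(unique):
        # Evenly spaced hues keep neighboring clusters visually distinct.
        r, g, b = colorsys.hsv_to_rgb(i / len(unique), 0.8, 0.9)
        colors[cid] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return colors
```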

The visualization unit 150 may receive labeling information corresponding to each cluster from the web manager server 30 and pass it to the learning unit 120.

FIGS. 4 and 5 are flowcharts of an AI-based web attack detection method according to an embodiment.

Referring to FIGS. 4 and 5, the AI-based web attack detection method includes performing machine learning using a plurality of HTTP request packets received from a web user (S100), and determining whether an HTTP request packet received from the web user is a web attack packet by using the machine learning with the HTTP request packet received from the web user as an input variable (S200). Performing machine learning (S100) includes receiving a plurality of HTTP request packets from the web user (S110), extracting a plurality of features from the plurality of HTTP request packets (S120), clustering the plurality of HTTP request packets into a plurality of groups based on the plurality of features (S130), transmitting the clustered information to the web manager server (S140), receiving from the web manager server labeling information on whether the plurality of groups are abnormal clusters (S150), and performing machine learning based on the labeling information (S160).

The step of performing machine learning using a plurality of HTTP request packets received from a web user (S100), the step of determining whether an HTTP request packet received from the web user is a web attack packet (S200), and sub-steps S110 through S160 (receiving the HTTP request packets, extracting features, clustering, transmitting the clustered information, receiving the labeling information, and performing machine learning based on the labeling information) are the same as the operations of the web attack detection system 100 described above, so a detailed description is omitted.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited thereto; various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

Claims

An artificial intelligence-based web attack detection system, comprising:
a filter unit that receives a plurality of HTTP request packets from a web user;
a learning unit that extracts a plurality of features from the plurality of HTTP request packets, clusters the plurality of HTTP request packets into a plurality of groups based on the plurality of features, transmits the clustered information to a web manager server, receives from the web manager server labeling information on whether the plurality of groups are abnormal clusters, and performs machine learning based on the labeling information; and
an analysis unit that determines whether an HTTP request packet received from the web user is a web attack packet by using the machine learning with the HTTP request packet received from the web user as an input variable,
wherein the plurality of features include the main request packet of the web user, the number of sub-request packets linked to the main request packet, the resource types of the sub-request packets, the number of sub-requests per resource type, the header of the request packet, a change in the header cookie within a group of request packets, and a change in the header user agent within a group of request packets,
wherein the filter unit, upon receiving the plurality of HTTP request packets from the web user, transmits an attack detection request message to the analysis unit, and
wherein the learning unit extracts data values for each content type from the plurality of HTTP request packets, and performs clustering on the value of each field included in each content type of the plurality of HTTP request packets.

The web attack detection system of claim 1,
further comprising a visualization unit that, based on the clustered information, displays each cluster in a different color on a screen and receives labeling information corresponding to each cluster from the web manager server.

(deleted)

The web attack detection system of claim 1,
wherein the analysis unit,
when the HTTP request packet received from the web user is determined to be a web attack packet, blocks the requested resource or performs a redirection operation.

An artificial intelligence-based web attack detection method, comprising:
performing machine learning using a plurality of HTTP request packets received from a web user; and
determining whether an HTTP request packet received from the web user is a web attack packet by using the machine learning with the HTTP request packet received from the web user as an input variable,
wherein performing the machine learning includes:
receiving a plurality of HTTP request packets from the web user;
extracting a plurality of features from the plurality of HTTP request packets;
clustering the plurality of HTTP request packets into a plurality of groups based on the plurality of features;
transmitting the clustered information to a web manager server;
receiving from the web manager server labeling information on whether the plurality of groups are abnormal clusters; and
performing machine learning based on the labeling information,
wherein the plurality of features include the main request packet of the web user, the number of sub-request packets linked to the main request packet, the resource types of the sub-request packets, the number of sub-requests per resource type, the header of the request packet, a change in the header cookie within a group of request packets, and a change in the header user agent within a group of request packets,
wherein receiving the plurality of HTTP request packets from the web user includes,
upon receiving the plurality of HTTP request packets from the web user, transmitting an attack detection request message to an analysis unit,
wherein extracting the plurality of features from the plurality of HTTP request packets includes
extracting data values for each content type from the plurality of HTTP request packets, and
wherein clustering the plurality of HTTP request packets into the plurality of groups includes
performing clustering on the value of each field included in each content type of the plurality of HTTP request packets.

(deleted)