JP2023516621A

JP2023516621A - Web attack detection and blocking system and method by artificial intelligence machine learning behavior-based web protocol analysis

Info

Publication number: JP2023516621A
Application number: JP2022551682A
Authority: JP
Inventors: デホリ; ドングンリ; インヨンリ
Original assignee: F1 Security Inc
Current assignee: F1 Security Inc
Priority date: 2020-02-25
Filing date: 2020-12-08
Publication date: 2023-04-20
Anticipated expiration: 2040-12-08
Also published as: JP7391313B2; WO2021172711A1; KR102156891B1

Abstract

An artificial intelligence-based web attack detection system is provided, comprising: a filter unit that receives a plurality of HTTP request packets from web users; a learning unit that extracts a plurality of features from the plurality of HTTP request packets, clusters the plurality of HTTP request packets into a plurality of groups based on the plurality of features, transmits the clustered information to a web administrator server, receives from the web administrator server labeling information regarding whether the plurality of groups are abnormal clusters, and performs machine learning based on the labeling information; and an analysis unit that determines, using the machine learning with an HTTP request packet received from a web user as an input variable, whether the HTTP request packet received from the web user is a web attack packet.

Description

The present invention relates to a web attack detection system and method, and more particularly to a web attack detection system and method based on behavior-based web protocol analysis using artificial intelligence machine learning.

Currently, payload-centered research, such as injection attack detection, parameter inspection, and uploaded-binary inspection, is being actively conducted on methods for detecting attacks over HTTP communication.

Research on artificial intelligence-based attack detection methods is also active. Conventional AI-based attack detection methods extract datasets and features based on the network and the payload (length of payload, byte entropy of payload, number of distinct bytes, etc.), and therefore suffer from low web attack detection accuracy.

Therefore, there is a need for a technique that can improve the accuracy of web attack detection by selecting, extracting, and clustering features based on the user's web behavior.

The technical problem to be solved by the present invention is to provide an artificial intelligence-based web attack detection system and method that can improve the accuracy of web attack detection by selecting, extracting, and clustering features based on the user's web behavior.

According to one embodiment, an artificial intelligence-based web attack detection system is provided. The web attack detection system includes: a filter unit that receives a plurality of HTTP request packets from web users; a learning unit that extracts a plurality of features from the plurality of HTTP request packets, clusters the plurality of HTTP request packets into a plurality of groups based on the plurality of features, transmits the clustered information to a web administrator server, receives from the web administrator server labeling information regarding whether the plurality of groups are abnormal clusters, and performs machine learning based on the labeling information; and an analysis unit that determines, using the machine learning with an HTTP request packet received from a web user as an input variable, whether the HTTP request packet received from the web user is a web attack packet.

The web attack detection system may further include a visualization unit that, based on the clustered information, outputs each cluster on a screen in a different color and receives labeling information corresponding to each cluster from the web administrator server.

The plurality of features may include at least one of: the web user's remote authorized IP, the web user's main request packet, the number of sub-request packets linked to the main request packet, the resource type of each sub-request packet, the number of sub-requests per resource type, the header of the request packet, the session ID of the requesting user, the session ID generation interval, the number of repeated session ID updates, changes in the header cookie within a group of request packets, and changes in the header user agent within a group of request packets.

When the analysis unit determines that an HTTP request packet received from the web user is a web attack packet, it may block the requested resource or perform a redirection operation.

According to one embodiment, an artificial intelligence-based web attack detection method is provided. The web attack detection method includes: performing machine learning using a plurality of HTTP request packets received from web users; and determining, using the machine learning with an HTTP request packet received from a web user as an input variable, whether the HTTP request packet received from the web user is a web attack packet. The step of performing machine learning includes: receiving a plurality of HTTP request packets from web users; extracting a plurality of features from the plurality of HTTP request packets; clustering the plurality of HTTP request packets into a plurality of groups based on the plurality of features; transmitting the clustered information to a web administrator server; receiving from the web administrator server labeling information regarding whether the plurality of groups are abnormal clusters; and performing machine learning based on the labeling information.

Performing feature selection, extraction, and clustering based on the user's web behavior can improve the accuracy of web attack detection.

Performing feature selection, extraction, and clustering on groups of multiple HTTP request packets makes it possible to detect abnormal behavior that precedes a hacker's hacking attempt (e.g., requests for a single resource and command requests, deliberate inducement of errors, periodic requests for non-existent resources, uniform request patterns at fixed intervals, repeated occurrence of the same error, and requests implying impossible movement as determined by GeoIP).

Performing feature selection, extraction, and clustering on the value of each field can improve attack detection accuracy compared with clustering the content as a whole.

FIG. 1 is a block diagram of an artificial intelligence-based web attack detection system according to one embodiment.
FIG. 2 is a diagram for explaining an HTTP request packet according to one embodiment.
FIG. 3 is a diagram for explaining the operation of a visualization unit according to one embodiment.
FIG. 4 is a flowchart of an artificial intelligence-based web attack detection method according to one embodiment.
FIG. 5 is a flowchart of an artificial intelligence-based web attack detection method according to one embodiment.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that a person of ordinary skill in the art to which the invention pertains can readily practice them. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described here. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a block diagram of an artificial intelligence-based web attack detection system according to one embodiment. FIG. 2 is a diagram for explaining an HTTP request packet according to one embodiment. FIG. 3 is a diagram for explaining the operation of the visualization unit according to one embodiment.

Referring to FIG. 1, an artificial intelligence-based web attack detection system 100 according to one embodiment includes a filter unit 110, a learning unit 120, an analysis unit 130, a database unit 140, and a visualization unit 150.

The filter unit 110 receives a plurality of HTTP request packets from the web user 10. Upon receiving an HTTP request packet from the web user 10, the filter unit 110 transmits an attack detection request message to the analysis unit 130.

In one embodiment, the filter unit 110 may receive the attack detection result from the analysis unit 130 and perform a blocking operation (e.g., Deny, Allow), or may pass the filtered data to the web application.

In one embodiment, the filter unit 110 may deliver to the actual web user 10 an HTTP response that has been filtered by the analysis unit 130 (e.g., with personal information such as resident registration numbers, and annotations, removed). In one embodiment, the filter unit 110 may be an Apache filter module.
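
The response filtering mentioned above could look roughly like the following sketch. It is an illustration only: the text does not describe how the filtering is implemented. The sketch assumes that "annotations" means HTML comments and that resident registration numbers appear in the common six-digit, hyphen, seven-digit form; the names `sanitize_response`, `RRN_PATTERN`, and `HTML_COMMENT` are hypothetical.

```python
import re

# Assumption: resident registration numbers are written as 6 digits,
# a hyphen, and 7 digits (e.g. 900101-1234567).
RRN_PATTERN = re.compile(r"\b\d{6}-\d{7}\b")
# Assumption: "annotations" refers to HTML comments in the response body.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def sanitize_response(body):
    """Strip HTML comments and mask resident registration numbers."""
    body = HTML_COMMENT.sub("", body)
    return RRN_PATTERN.sub("******-*******", body)

cleaned = sanitize_response("<p>ID 900101-1234567</p><!-- debug: db=prod -->")
print(cleaned)
```

Masking instead of deleting keeps the page layout intact while still hiding the number; either choice would fit the description.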

The learning unit 120 preprocesses the plurality of HTTP request packets (web traffic) received from the web user 10, performs feature selection, feature extraction, and clustering on the preprocessed data, and performs machine learning based on the labeling information received from the web administrator server 30.

Specifically, in one embodiment, the learning unit 120 uses a pre-stored algorithm to process, as a preprocessing step, HTTP traffic information in JSON format into data suitable for feature extraction.
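
As a minimal sketch of this preprocessing step, the snippet below flattens one JSON-encoded request record into a plain dictionary. The record schema (`ts`, `remote_ip`, `method`, `url`, `headers`) and the function name `preprocess` are assumptions made for illustration; the text does not specify the JSON layout or the algorithm.

```python
import json
from urllib.parse import urlsplit

def preprocess(record_json):
    """Flatten one JSON-encoded HTTP request record (hypothetical schema)."""
    rec = json.loads(record_json)
    # Header names are case-insensitive, so normalise them to lower case.
    headers = {k.lower(): v for k, v in rec.get("headers", {}).items()}
    path = urlsplit(rec.get("url", "")).path
    # Use the extension of the last path segment as a rough resource type.
    last = path.rsplit("/", 1)[-1]
    ext = last.rsplit(".", 1)[-1].lower() if "." in last else ""
    return {
        "ts": float(rec["ts"]),
        "remote_ip": rec.get("remote_ip", ""),
        "method": rec.get("method", "GET").upper(),
        "path": path,
        "resource_type": ext or "document",
        "user_agent": headers.get("user-agent", ""),
        "cookie": headers.get("cookie", ""),
        "content_type": headers.get("content-type", ""),
    }

sample = json.dumps({
    "ts": 1700000000.25,
    "remote_ip": "203.0.113.7",
    "method": "get",
    "url": "https://example.com/static/app.js?v=3",
    "headers": {"User-Agent": "ExampleBrowser/1.0", "Cookie": "sid=abc"},
})
row = preprocess(sample)
print(row["path"], row["resource_type"])
```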

The learning unit 120 uses a pre-stored algorithm to select and extract a plurality of features from the preprocessed HTTP request packets. As shown in Table 1, the learning unit 120 extracts data values for each content type (Content-Type) of the plurality of HTTP request packets.

The learning unit 120 uses a pre-stored algorithm to cluster the plurality of HTTP request packets into a plurality of groups based on the plurality of features. The learning unit 120 may cluster (group) all sub-requests (SubRequest) that begin with an HTTP main request (MainRequest) into a request group (RequestGroup) over a predetermined time (e.g., up to 10 seconds).
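
The request grouping described above can be sketched as follows. This is one possible reading, not the patented algorithm: it assumes each request already carries a flag saying whether it is a main request, and that sub-requests are tied to the most recent main request from the same remote IP. The field names (`ts`, `remote_ip`, `is_main`) and `group_requests` are hypothetical; only the 10-second window comes from the text.

```python
WINDOW_SECONDS = 10.0  # the "up to 10 seconds" example from the text

def group_requests(rows, window=WINDOW_SECONDS):
    """Attach each sub-request to the latest main request from the same IP.

    rows: iterable of dicts with 'ts', 'remote_ip' and 'is_main' keys.
    Returns a list of groups, each {'main': row or None, 'subs': [rows]}.
    """
    groups = []
    open_group = {}  # remote_ip -> group currently collecting sub-requests
    for row in sorted(rows, key=lambda r: r["ts"]):
        ip = row["remote_ip"]
        if row["is_main"]:
            group = {"main": row, "subs": []}
            groups.append(group)
            open_group[ip] = group
            continue
        group = open_group.get(ip)
        if group is not None and row["ts"] - group["main"]["ts"] <= window:
            group["subs"].append(row)
        else:
            # No main request inside the window: keep it as a lone group.
            groups.append({"main": None, "subs": [row]})
    return groups

rows = [
    {"ts": 0.0, "remote_ip": "203.0.113.7", "is_main": True, "path": "/"},
    {"ts": 0.4, "remote_ip": "203.0.113.7", "is_main": False, "path": "/app.js"},
    {"ts": 0.5, "remote_ip": "203.0.113.7", "is_main": False, "path": "/logo.png"},
    {"ts": 12.0, "remote_ip": "203.0.113.7", "is_main": False, "path": "/late.css"},
    {"ts": 1.0, "remote_ip": "198.51.100.9", "is_main": False, "path": "/admin.php"},
]
groups = group_requests(rows)
print(len(groups))
```

Requests that arrive without any preceding main request end up in a group of their own, which lines up with the "single resource request" behavior the description lists as a possible pre-attack signal.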

As shown in Table 2, the plurality of features may include the web user's remote authorized IP, the web user's main request packet, the number of sub-request packets linked to the main request packet, the resource type of each sub-request packet, the number of sub-requests per resource type, the header of the request packet, the session ID of the requesting user, the session ID generation interval, the number of repeated session ID updates, changes in the header cookie within a group of request packets, and changes in the header user agent within a group of request packets.
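
A few of the listed features can be computed from one request group as in the sketch below. It covers only a subset (sub-request count, count per resource type, cookie change, user agent change) and leaves out the session ID features. The group layout, the field names, and the helpers `group_features` and `req` are assumptions for illustration, not part of the original disclosure.

```python
from collections import Counter

def group_features(group):
    """Compute a subset of the listed features for one request group."""
    main = group["main"]
    members = ([main] if main else []) + group["subs"]
    type_counts = Counter(r["resource_type"] for r in group["subs"])
    return {
        "remote_ip": members[0]["remote_ip"],
        "main_path": main["path"] if main else None,
        "sub_count": len(group["subs"]),
        "sub_type_counts": dict(type_counts),
        # True when more than one distinct value appears inside the group.
        "cookie_changed": len({r["cookie"] for r in members}) > 1,
        "user_agent_changed": len({r["user_agent"] for r in members}) > 1,
    }

def req(path, rtype, ua="ExampleBrowser/1.0"):
    """Build a sample request record (hypothetical fields)."""
    return {"remote_ip": "203.0.113.7", "path": path, "resource_type": rtype,
            "cookie": "sid=a", "user_agent": ua}

group = {
    "main": req("/index.html", "document"),
    "subs": [req("/app.js", "js"), req("/logo.png", "png"),
             req("/api.js", "js", ua="curl/8.0")],
}
feats = group_features(group)
print(feats)
```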

The learning unit 120 transmits the clustered (grouped) information to the web administrator server 30. The learning unit 120 receives from the web administrator server 30 labeling information regarding whether the plurality of groups are abnormal clusters. The web administrator server 30 may receive, from a web administrator or a security administrator, input of normal or abnormal labeling settings for the clustered information.

The learning unit 120 uses a pre-stored algorithm to perform machine learning based on the labeling information received from the web administrator server 30. In one embodiment, the pre-stored algorithm may be an unsupervised learning algorithm or a supervised learning algorithm.
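
The text does not name a specific learning algorithm, so the following sketch swaps in a simple nearest-centroid classifier purely to show how cluster-level labels from the administrator can become training labels for individual request groups. The feature vectors, the label strings, and the functions `train_centroids` and `classify` are all hypothetical.

```python
def train_centroids(vectors, cluster_ids, cluster_labels):
    """Average the feature vectors per label ('normal' / 'abnormal')."""
    sums, counts = {}, {}
    for vec, cid in zip(vectors, cluster_ids):
        label = cluster_labels[cid]  # each member inherits its cluster's label
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def classify(model, vec):
    """Return the label whose centroid is closest to vec."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(model, key=lambda lab: dist2(model[lab]))

# Toy vectors: [sub_request_count, user_agent_changed]
vectors = [[12, 0], [9, 0], [0, 1], [1, 1]]
cluster_ids = [0, 0, 1, 1]                       # output of the clustering step
cluster_labels = {0: "normal", 1: "abnormal"}    # supplied by the administrator
model = train_centroids(vectors, cluster_ids, cluster_labels)
print(classify(model, [0, 1]), classify(model, [11, 0]))
```

Labeling whole clusters rather than single requests keeps the administrator's workload proportional to the number of clusters, which appears to be the motivation for the cluster-then-label flow in the description.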

Referring to FIG. 2, because most web users use a web browser or a mobile app, the HTTP request (Request) packets sent to a web server are often multiple rather than single. By performing feature selection, extraction, and clustering on groups of multiple HTTP request packets, the present invention can detect abnormal behavior that precedes a hacker's hacking attempt (e.g., requests for a single resource and command requests, deliberate inducement of errors, periodic requests for non-existent resources, uniform request patterns at fixed intervals, repeated occurrence of the same error, and requests implying impossible movement as determined by GeoIP).
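
One of the listed behaviors, a uniform request pattern at fixed intervals, can be illustrated with a simple heuristic. The sketch below flags a sequence of request timestamps when the gaps between them barely vary. The coefficient-of-variation test and the thresholds `max_cv` and `min_requests` are assumptions chosen for the example; the text does not say how this pattern is detected.

```python
from statistics import mean, pstdev

def looks_scripted(timestamps, max_cv=0.05, min_requests=5):
    """Return True when the gaps between requests are nearly constant."""
    if len(timestamps) < min_requests:
        return False  # too few requests to judge
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(gaps)
    if avg == 0:
        return True  # all requests share one timestamp
    # Coefficient of variation: spread of the gaps relative to their mean.
    return pstdev(gaps) / avg <= max_cv

bot = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
human = [0.0, 1.3, 7.9, 8.2, 30.0, 31.5]
print(looks_scripted(bot), looks_scripted(human))
```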

In the case of a content type (Content-Type), the content consists of fields and values. By performing feature selection, extraction, and clustering on the value of each field, the present invention can improve attack detection accuracy compared with clustering the content as a whole.

The analysis unit 130 determines whether an HTTP request packet received from the web user 10 is a web attack packet, using the machine learning with the HTTP request packet received from the web user 10 as an input variable.

When the analysis unit 130 determines that an HTTP request packet received from the web user 10 is a web attack packet, it may block the requested resource or perform a redirection operation.
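
The block-or-redirect decision can be sketched as a small function that turns a verdict into an HTTP response. The choice of status codes (403 Forbidden for blocking, 302 Found for redirection) and the name `respond_to_verdict` are assumptions; the text only says that the resource is blocked or a redirection is performed.

```python
from http import HTTPStatus

def respond_to_verdict(is_attack, redirect_to=None):
    """Map a verdict to (status, headers); a None status means pass through."""
    if not is_attack:
        return None, {}
    if redirect_to:
        # Redirection: send the client to another location.
        return HTTPStatus.FOUND, {"Location": redirect_to}
    # Blocking: refuse to serve the requested resource.
    return HTTPStatus.FORBIDDEN, {}

blocked = respond_to_verdict(True)
redirected = respond_to_verdict(True, redirect_to="/blocked.html")
allowed = respond_to_verdict(False)
print(blocked, redirected, allowed)
```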

The analysis unit 130 passes the analysis result for the HTTP request packet to the filter unit 110.

In one embodiment, the analysis unit 130 may be a web firewall daemon module. The analysis unit 130 can support filter modules installed on the web servers of a plurality of users, that is, it can support multiple web servers or virtual web servers.

The database unit 140 stores the data received by the filter unit 110 and the data analyzed or processed by the analysis unit 130.

The database unit 140 can directly store and manage documents in JSON format, and can perform distributed storage and processing through auto-sharding. In one embodiment, the database unit 140 may be MongoDB.

Referring to FIG. 3, the visualization unit 150 can output each cluster in a different color on the screen of a web browser, based on the information clustered by the learning unit 120. The visualization unit 150 can do so using a pre-stored multidimensional visualization tool (e.g., TensorBoard).

The visualization unit 150 can receive labeling information corresponding to each cluster from the web administrator server 30 and pass it to the learning unit 120.

FIGS. 4 and 5 are flowcharts of an artificial intelligence-based web attack detection method according to one embodiment.

Referring to FIGS. 4 and 5, the artificial intelligence-based web attack detection method includes a step (S100) of performing machine learning using a plurality of HTTP request packets received from web users, and a step (S200) of determining, using the machine learning with an HTTP request packet received from a web user as an input variable, whether the HTTP request packet received from the web user is a web attack packet. The step (S100) of performing machine learning includes a step (S110) of receiving a plurality of HTTP request packets from web users, a step (S120) of extracting a plurality of features from the plurality of HTTP request packets, a step (S130) of clustering the plurality of HTTP request packets into a plurality of groups based on the plurality of features, a step (S140) of transmitting the clustered information to a web administrator server, a step (S150) of receiving from the web administrator server labeling information regarding whether the plurality of groups are abnormal clusters, and a step (S160) of performing machine learning based on the labeling information.

Steps S100 and S200, together with sub-steps S110 through S160 (receiving the HTTP request packets, extracting features, clustering, transmitting the clustered information, receiving the labeling information, and performing machine learning based on it), are identical to the operations of the web attack detection system 100 described above, so a detailed description is omitted.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited to them. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention, as defined in the following claims, also fall within the scope of the present invention.

Claims

1. An artificial intelligence-based web attack detection system comprising:
a filter unit that receives a plurality of HTTP request packets from web users;
a learning unit that extracts a plurality of features from the plurality of HTTP request packets, clusters the plurality of HTTP request packets into a plurality of groups based on the plurality of features, transmits the clustered information to a web administrator server, receives from the web administrator server labeling information regarding whether the plurality of groups are abnormal clusters, and performs machine learning based on the labeling information; and
an analysis unit that determines whether an HTTP request packet received from a web user is a web attack packet, using the machine learning with the HTTP request packet received from the web user as an input variable.

2. The web attack detection system according to claim 1, further comprising a visualization unit that outputs each cluster in a different color on a screen based on the clustered information, and receives labeling information corresponding to each cluster from the web administrator server.

3. The web attack detection system according to claim 1, wherein the plurality of features are:
a web user's remote authorized IP, the web user's main request packet, the number of sub-request packets linked to the main request packet, the resource type of each sub-request packet, the number of sub-request packets per resource type, the header of the request packet, the session ID of the requesting user, the session ID generation interval, the number of repeated session ID updates, changes in the header cookie within a group of request packets, and changes in the header user agent within a group of request packets.

4. The web attack detection system according to claim 1, wherein the analysis unit,
when the HTTP request packet received from the web user is determined to be a web attack packet, blocks the requested resource or performs a redirection operation.

5. An artificial intelligence-based web attack detection method comprising:
performing machine learning using a plurality of HTTP request packets received from web users; and
determining whether an HTTP request packet received from a web user is a web attack packet, using the machine learning with the HTTP request packet received from the web user as an input variable,
wherein the step of performing machine learning includes:
receiving a plurality of HTTP request packets from web users;
extracting a plurality of features from the plurality of HTTP request packets;
clustering the plurality of HTTP request packets into a plurality of groups based on the plurality of features;
transmitting the clustered information to a web administrator server;
receiving from the web administrator server labeling information regarding whether the plurality of groups are abnormal clusters; and
performing machine learning based on the labeling information.

6. The web attack detection method according to claim 5, wherein the plurality of features are:
a web user's remote authorized IP, the web user's main request packet, the number of sub-request packets linked to the main request packet, the resource type of each sub-request packet, the number of sub-request packets per resource type, the header of the request packet, the session ID of the requesting user, the session ID generation interval, the number of repeated session ID updates, changes in the header cookie within a group of request packets, and changes in the header user agent within a group of request packets.