CN111147490A

CN111147490A - Directional phishing attack event discovery method and device

Info

Publication number: CN111147490A
Application number: CN201911367293.6A
Authority: CN
Inventors: 刘澄澄; 廖纯; 赵双; 白波; 于平; 王菲飞; 于海波
Original assignee: Institute of Information Engineering of CAS
Current assignee: Institute of Information Engineering of CAS
Priority date: 2019-12-26
Filing date: 2019-12-26
Publication date: 2020-05-12

Abstract

The invention discloses a method and a device for discovering directional phishing attack events. The method comprises: acquiring and analyzing a user's network access data, and screening out suspicious login behaviors according to configuration rules; acquiring the features of the user's actual login page according to the suspicious login behavior; and comparing the actual login page features against the configuration rules, and calculating the counterfeit degree between the actual login page features and the official login page features, in order to determine whether a directional phishing attack event has occurred. Because the invention is not limited to the single goal of traditional phishing page identification, it makes complete event discovery and comprehensive threat assessment of directional phishing attacks possible. By monitoring the raw traffic of the protected system, the invention enables a service department to supervise login behavior effectively, to discover directional phishing attack events accurately and promptly, to raise an alarm, and to remind users to change their passwords in time so as to block further attacks.

Description

Directional phishing attack event discovery method and device

Technical Field

The invention belongs to the technical field of network security, and particularly relates to a method and a device for discovering a directional phishing attack event.

Background Art

Directional phishing is an attack in which the attacker builds a counterfeit login page that closely resembles the login page of a system the target uses frequently, and induces the target to click a malicious phishing link pointing to that counterfeit page, in order to steal credential information such as a user name and a password. A directional phishing attack event is an attack event in which a user enters and submits credential information such as a user name and a password on a counterfeit login page, causing that credential information to be disclosed.

A large number of phishing website detection methods exist in the prior art, but they focus only on discovering phishing pages. They do not adequately address whether the phishing succeeded or how severe the threat is, and they suffer from incomplete detection dimensions and an unclear detection flow. As a result, they cannot discover a directional phishing attack event efficiently and accurately, raise an accurate alarm, or remind the user to change the password in time to block further attacks. For example, Chinese patent application CN103179095 discloses a method and a client device for detecting a phishing website. It obtains the address of a target website and obtains the page information of the target website from that address; it extracts key area features from the page information and matches them for interface image similarity against the key area features in a library of real key area features; if the degree of interface image similarity meets a second preset condition, the target website is determined to be a phishing website, and otherwise a normal website. That patent application focuses on page information and cannot determine whether the phishing succeeded.

Disclosure of Invention

To meet the need for discovering directional phishing attack events, the invention aims to provide a method and a device for discovering directional phishing attack events.

A method for discovering directional phishing attack events comprises the following steps:

1) acquiring user network access data, and screening out suspicious login behaviors according to configuration rules;

2) acquiring the actual login page characteristics of the user according to the suspicious login behavior;

3) and finding out a directional phishing attack event according to the actual login page characteristics, the configuration rules and the counterfeiting degree of the actual login page characteristics and the official login page characteristics.

Further, the network access data includes network traffic data and host data acquired by the user terminal.

Further, the method for establishing the configuration rule comprises the following steps:

1) setting a plurality of monitored target systems, collecting target system data, and extracting characteristics;

2) and acquiring a user name password keyword, a directional phishing page domain name blacklist, a directional phishing page IP blacklist and a non-directional phishing page domain name white list as configuration rules according to the characteristics of the target system.

Furthermore, the target system comprises a mailbox system, an OA system and a website system which perform identity authentication in a Web login mode.

Further, the target system characteristics include an IP address, a domain name, a URL address, title content, and style content of the target system identified from the user access data.

Further, the actual login page characteristics are an IP address, a domain name, a URL address, title content, and style content of the actual login page identified from the user network access data.

Further, the counterfeit degree refers to the maximum similarity between the actual login page and the official login page calculated based on the multidimensional characteristics of the domain name, the URL address, the page title and the page style.

Further, when a directional phishing attack event is determined, an alarm is issued; the content of the alarm comprises a phishing page name, a phishing page URL, a counterfeit object name, a counterfeit object URL, a threat degree, an alarm ID and an alarm time.

A targeted phishing attack event discovery apparatus comprising:

the rule configuration and updating module is used for storing and updating configuration rules;

the data acquisition and screening module is used for acquiring network access data of the user and screening out suspicious login behaviors according to configuration rules;

the characteristic extraction module is used for acquiring the actual login page characteristics of the user according to the suspicious login behavior;

and the directional phishing attack event detection module is used for finding the directional phishing attack event according to the actual login page characteristics, the configuration rules and the counterfeit degree of the actual login page characteristics and the official login page characteristics.

Further, the configuration rules include a username and password keyword, a domain name blacklist of directional phishing pages, an IP blacklist of directional phishing pages, and a domain name white list of non-directional phishing pages.

Compared with the prior art, the invention has the following positive effects:

The invention discovers directional phishing attack events by detecting suspicious login behaviors in network access data. Because it is not limited to the single goal of traditional phishing page identification, it makes complete event discovery and comprehensive threat assessment of directional phishing attacks possible. By monitoring the raw traffic of the protected system, the invention enables a service department to supervise login behavior effectively, to discover directional phishing attack events accurately and promptly, to raise an alarm, and to remind users to change their passwords in time so as to block further attacks.

Drawings

Fig. 1 is a flow chart of a method for discovering a directional phishing attack event.

Fig. 2 is a relationship diagram of modules of a directional phishing attack event discovery device.

Detailed Description

The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, and it should be understood that the embodiments described herein are merely for the purpose of illustrating and explaining the present invention and are not intended to limit the present invention.

The embodiment provides a method for discovering a directional phishing attack event, fig. 1 is a flowchart thereof, and the method for discovering a directional phishing attack event will be described with reference to fig. 1.

1. Target setting and feature extraction

The method can set a monitored target system, collect target system data and extract features, and has the following specific implementation mode:

(step 1): setting monitored target systems, such as mailboxes, OA (office automation) systems, websites and the like which can perform identity authentication in a Web login mode;

(step 2): extracting the domain name of the set target system, and the IP address and the URL address corresponding to the domain name;

(step 3): acquiring html page data of a target system login page according to the URL address;

(step 4): extracting the <title> tag data inside the <head> tag from the html page data to serve as the page title;

(step 5): extracting all attributes and attribute values under the <style> and <class> tags from the html page data to serve as the page style;

(step 6): and recording the extracted domain name, IP address, URL address, page title and page style as the characteristics of each target system.
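Steps 4 and 5 above can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the class name LoginPageFeatureParser and the function extract_page_features are hypothetical, the page style is assumed to mean the text of <style> blocks plus the values of class and style attributes, and the check that <title> sits inside <head> is omitted for brevity.

```python
from html.parser import HTMLParser


class LoginPageFeatureParser(HTMLParser):
    """Collects <title> text, <style> block text, and class/style attribute values."""

    def __init__(self):
        super().__init__()
        self.title_parts = []   # raw text chunks found inside <title>
        self.styles = set()     # (kind, value) pairs describing the page style
        self._current = None    # "title" or "style" while inside that element

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "style"):
            self._current = tag
        for name, value in attrs:
            # Assumption: class and style attribute values are part of the page style.
            if name in ("class", "style") and value:
                self.styles.add((name, value.strip()))

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title_parts.append(data)
        elif self._current == "style" and data.strip():
            # Normalize whitespace so formatting differences do not matter.
            self.styles.add(("style-block", " ".join(data.split())))


def extract_page_features(html: str) -> dict:
    """Returns the page title and page style extracted from html page data."""
    parser = LoginPageFeatureParser()
    parser.feed(html)
    parser.close()
    return {"title": "".join(parser.title_parts).strip(), "styles": parser.styles}
```

Calling extract_page_features on the html page data of a login page returns a dictionary whose title and styles entries could then be recorded next to the domain name, IP address and URL address as the features of that target system.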

2. Rule configuration and update module

The method can configure the screening and matching rules required by the directional phishing attack event discovery method, and can update them according to the judgment result of the directional phishing event. The specific implementation is as follows:

(step 1): analyzing data characteristics of the monitored target system, and setting a combined screening feature of a user name field and a password field to be used as the user name and password keywords, wherein the user name field is [ 'user', 'email', 'account', 'login', 'access', 'pass' ] and the password field is [ 'pass', 'pwd', 'key' ], and wherein "-" represents a wildcard; or updating the user name and password keywords according to the judgment result of the directional phishing event;

(step 2): configuring/updating domain names of all target systems and a domain name list used when the user identity authentication is carried out, and using the domain names as a domain name white list of the non-directional phishing page; or updating a domain name white list of the non-directional phishing page according to the judgment result of the directional phishing event;

(step 3): configuring a domain name list of all found directional phishing pages as a domain name blacklist of the directional phishing pages; or updating the domain name blacklist of the directional phishing page according to the judgment result of the directional phishing event;

(step 4): configuring an IP list of all found directional phishing pages as a directional phishing page IP blacklist; or updating the IP blacklist of the directional phishing page according to the judgment result of the directional phishing event.
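The four rule sets above could be held in a single structure, sketched below in Python. The class DetectionRules and its method record_verdict are hypothetical names, and the update policy is an assumption: the text only states that the rules are updated according to the judgment result, not how. The domains and IP addresses used with it would come from the deployment.

```python
from dataclasses import dataclass, field


@dataclass
class DetectionRules:
    """Configuration rules: credential keywords, whitelist and blacklists."""

    username_keywords: list = field(
        default_factory=lambda: ["user", "email", "account", "login", "access", "pass"])
    password_keywords: list = field(default_factory=lambda: ["pass", "pwd", "key"])
    domain_whitelist: set = field(default_factory=set)   # non-directional phishing pages
    domain_blacklist: set = field(default_factory=set)   # known directional phishing pages
    ip_blacklist: set = field(default_factory=set)       # known directional phishing IPs

    def record_verdict(self, domain: str, ip: str, is_phishing: bool) -> None:
        """Updates the lists from one judgment result (assumed policy)."""
        if is_phishing:
            self.domain_blacklist.add(domain)
            self.ip_blacklist.add(ip)
            self.domain_whitelist.discard(domain)
        else:
            # Assumption: a login page judged benign is added to the whitelist.
            self.domain_whitelist.add(domain)
```

Keeping the lists as sets makes the membership checks in the later screening and detection steps constant-time lookups.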

3. Data acquisition and screening

The method can screen out suspicious login behaviors from network access data of the user, and the specific implementation mode is as follows:

(step 1): acquiring network access data of a user, and analyzing and restoring HTTP protocol metadata comprising contents such as a source IP, a destination IP, an HTTP request header, an HTTP request body, an HTTP response header, an HTTP response body and the like;

(step 2): analyzing the HTTP protocol metadata to identify HTTP POST behavior data;

(step 3): for HTTP POST behavior data, parameter fields in a request body of the HTTP POST behavior data are extracted, and target system login behaviors containing user name and password fields are screened out according to configured combination rules;

(step 5): for each screened target system login behavior, extracting the value of the 'Host' field in the request header of the POST behavior data, comparing the domain name corresponding to that value with the domain name whitelist of non-directional phishing pages, and filtering out the login behaviors whose domain name is on the whitelist, to obtain the suspicious login behaviors to be detected.
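The screening steps above can be sketched as follows. This is a minimal illustration under several assumptions: each restored HTTP request is represented as a dictionary with method, headers and body keys (a hypothetical layout), the request body is form-encoded, the keywords are matched as case-insensitive substrings of the parameter names, and the user name field and the password field must be two different parameters.

```python
from urllib.parse import parse_qs

USERNAME_KEYWORDS = ("user", "email", "account", "login", "access", "pass")
PASSWORD_KEYWORDS = ("pass", "pwd", "key")


def has_credential_fields(body: str) -> bool:
    """True if a form-encoded body has a user-name-like and a password-like field."""
    names = [name.lower() for name in parse_qs(body, keep_blank_values=True)]
    user_fields = {n for n in names if any(k in n for k in USERNAME_KEYWORDS)}
    pass_fields = {n for n in names if any(k in n for k in PASSWORD_KEYWORDS)}
    # Require two distinct parameters, since 'pass' appears in both keyword lists.
    return any(u != p for u in user_fields for p in pass_fields)


def is_suspicious_login(request: dict, domain_whitelist: set) -> bool:
    """True for a POST that submits credentials to a host not on the whitelist."""
    if request["method"].upper() != "POST":
        return False
    if not has_credential_fields(request.get("body", "")):
        return False
    # Strip an optional port from the Host header before the whitelist lookup.
    host = request["headers"].get("Host", "").split(":")[0].lower()
    return host not in domain_whitelist
```

Filtering on the whitelist last keeps the cheap method check first and means only credential-bearing POST requests reach the later, more expensive feature extraction.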

4. Actual landing page feature extraction

The method can extract the features of the actual login page, and the specific implementation mode is as follows:

(step 1): for each suspicious login behavior to be detected, extracting the value of the 'Referer' field in the request header of the HTTP POST behavior data to obtain the URL address of the actual login page;

(step 2): extracting a domain name and an IP address corresponding to the domain name according to the URL address of the actual login page;

(step 3): if the actual login page is online, crawling the html page data of the actual login page from the URL address using web crawler technology; otherwise, backtracking through the original HTTP protocol metadata and obtaining the html page data from the response body of the GET request corresponding to the URL address;

(step 4): extracting text data of a < title > tag in the < head > tag from html page data to be used as a page title;

(step 5): extracting all attributes and attribute values under 'style' and 'class' tags from html page data to serve as page styles;

(step 6): and recording the extracted URL address, domain name, IP address, page title and page style as the characteristics of the actual login page.
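Steps 1 to 3 above can be sketched as follows. The function names login_page_location and page_html are hypothetical, as are the resolve and fetch parameters, which are injected so that the sketch can run without network access. The recorded_responses argument is assumed to map a URL to the html body of the GET response captured in the traffic.

```python
import socket
from urllib.parse import urlsplit


def login_page_location(headers: dict, resolve=socket.gethostbyname):
    """Derives the URL, domain name and IP address of the actual login page."""
    referer = headers.get("Referer", "")
    domain = urlsplit(referer).hostname
    if not domain:
        return None  # no usable Referer header on this request
    try:
        ip = resolve(domain)
    except OSError:  # socket.gaierror is a subclass of OSError
        ip = None
    return {"url": referer, "domain": domain, "ip": ip}


def page_html(url: str, recorded_responses: dict, fetch=None):
    """Uses the live page when reachable, else the GET response seen in the traffic."""
    if fetch is not None:
        try:
            return fetch(url)
        except OSError:
            pass  # page is offline: fall back to the captured response body
    return recorded_responses.get(url)
```

In a deployment, fetch could wrap a web crawler or an HTTP client; the fallback matters because phishing pages are often taken offline shortly after the attack.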

5. Directional phishing attack event detection

The method can detect directional phishing attack events based on multidimensional features. The specific implementation is as follows:

(step 1): aiming at each suspicious login behavior to be detected, comparing the domain name and the IP address of the actual login page with the domain name and the IP blacklist of the preset phishing page, if the domain name and the IP address are matched with the preset phishing page, determining that a directional phishing attack event occurs, and otherwise, performing the next detection;

(step 2): extracting the multidimensional characteristics of each suspicious login behavior to be detected:

(step 2-1): for each suspicious login behavior to be detected, extracting the value of the 'Referer' field in the request header of the HTTP POST behavior data to obtain the URL of the actual login page;

(step 2-2): extracting the domain name and the IP address corresponding to the domain name according to the actual login page URL;

(step 2-3): if the actual login page is online, crawling the html page data of the actual login page from the URL address using web crawler technology; otherwise, backtracking through the original HTTP protocol metadata and obtaining the html page data from the response body of the GET request corresponding to the URL address;

(step 2-4): extracting the <title> tag data inside the <head> tag from the html page data to serve as the page title;

(step 2-5): extracting all attributes and attribute values under the <style> and <class> tags from the html page data to serve as the page style;

(step 2-6): and recording the extracted domain name, IP address, URL address, page title and page style as characteristics for each target system.

(step 3): computing the URL similarity $s_l(u_1, u_2)$ and the domain name similarity $s_l(d_1, d_2)$ between the actual login page $p_1$ and the official login page $p_2$, using the following formula:

$$s_l(x_1, x_2) = 1 - \frac{\mathrm{Leven}(x_1, x_2)}{\mathrm{maxlen}(x_1, x_2)}$$

wherein $x_1, x_2$ are the two sequences to be compared, either $(u_1, u_2)$ or $(d_1, d_2)$; $\mathrm{Leven}(x_1, x_2)$ is the Levenshtein distance between the two sequences; and $\mathrm{maxlen}(x_1, x_2)$ is the length of the longer of the two sequences;

(step 4): performing a weighted summation of the URL similarity $s_l(u_1, u_2)$ and the domain name similarity $s_l(d_1, d_2)$ to obtain the link similarity $sim_l(p_1, p_2)$ between the actual login page $p_1$ and the official login page $p_2$;

(step 5): computing the proportion of identical characters in the page titles of the actual login page $p_1$ and the official login page $p_2$ to obtain the title similarity $s_p(t_1, t_2)$; computing the proportion of attributes with identical attribute values in the page styles of the actual login page $p_1$ and the official login page $p_2$ to obtain the style similarity $s_p(m_1, m_2)$;

(step 6): performing a weighted summation of the title similarity $s_p(t_1, t_2)$ and the style similarity $s_p(m_1, m_2)$ to obtain the page similarity $sim_p(p_1, p_2)$;

(step 7): computing the comprehensive similarity between the actual login page $p_1$ and any official login page $p_2$; the calculation formula of the comprehensive similarity is as follows:

(step 8): taking the maximum similarity between the actual login page $p_1$ and all pages in the login page feature knowledge base of the protected system as the counterfeit degree $\mathrm{Fake}(p_1)$ of page $p_1$;

(step 9): if the counterfeit degree $\mathrm{Fake}(p_1)$ is higher than a set threshold, determining that a directional phishing attack event has occurred, and forming an alarm comprising fields such as the phishing page name, phishing page URL, counterfeit object name, counterfeit object URL, threat degree, alarm ID and alarm time; otherwise, treating the login as normal behavior.
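Steps 3 to 9 can be sketched in Python as follows. Several parts are assumptions rather than the patent's definitions: the text gives no weights, no threshold value and no formula for the comprehensive similarity, so equal weights, a threshold of 0.8 and a weighted sum of the link similarity and the page similarity are used here purely for illustration. The title similarity is assumed to be the number of shared characters divided by the longer title length, and the style similarity the shared fraction of (attribute, value) pairs. All function names are hypothetical.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def s_l(x1: str, x2: str) -> float:
    """Step 3: 1 - Leven(x1, x2) / maxlen(x1, x2)."""
    longest = max(len(x1), len(x2))
    return 1.0 if longest == 0 else 1 - levenshtein(x1, x2) / longest


def title_similarity(t1: str, t2: str) -> float:
    """Step 5 (assumed reading): shared characters over the longer title length."""
    longest = max(len(t1), len(t2))
    if longest == 0:
        return 1.0
    return sum((Counter(t1) & Counter(t2)).values()) / longest


def style_similarity(m1: set, m2: set) -> float:
    """Step 5 (assumed reading): shared fraction of (attribute, value) pairs."""
    union = m1 | m2
    return len(m1 & m2) / len(union) if union else 1.0


def similarity(p1: dict, p2: dict, w_url=0.5, w_title=0.5, w_link=0.5) -> float:
    """Steps 4, 6 and 7 with assumed weights and an assumed weighted-sum combination."""
    sim_l = w_url * s_l(p1["url"], p2["url"]) + (1 - w_url) * s_l(p1["domain"], p2["domain"])
    sim_p = (w_title * title_similarity(p1["title"], p2["title"])
             + (1 - w_title) * style_similarity(p1["styles"], p2["styles"]))
    return w_link * sim_l + (1 - w_link) * sim_p


def fake_degree(p1: dict, official_pages: list) -> float:
    """Step 8: maximum similarity to any official login page in the knowledge base."""
    return max((similarity(p1, p2) for p2 in official_pages), default=0.0)


def is_directional_phishing(p1: dict, official_pages: list, threshold=0.8) -> bool:
    """Step 9: compare the counterfeit degree with an assumed threshold."""
    return fake_degree(p1, official_pages) > threshold
```

Each page is represented here as a dictionary with url, domain, title and styles keys, matching the recorded feature set. In practice the weights and the threshold would have to be tuned on labeled login traffic.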

A directional phishing attack event discovery device comprises a target setting and feature extraction module, a data acquisition and screening module, an actual login page feature extraction module and a directional phishing attack event detection module. The target setting and feature extraction module is used for building specific blacklists, whitelists and a feature knowledge base according to the protected target, and for updating the knowledge base in real time according to the analysis and judgment of the detection results; the data acquisition and screening module is used for screening out the suspicious login behaviors to be detected from the original traffic of the protected system or from HTTP (Hypertext Transfer Protocol) metadata restored from the original traffic; the actual login page feature extraction module is used for extracting features such as the domain name, IP, URL, page title and page style of the login page in each suspicious login behavior to be detected, to obtain a login page feature set; the directional phishing attack event detection module is used for examining each suspicious login behavior, judging whether it is a directional phishing attack event according to the IP blacklist knowledge base of directional phishing pages and the counterfeit degree obtained through comprehensive calculation, and raising an alarm according to the judgment result.

The above embodiments are only intended to illustrate the technical solution of the present invention and not to limit the same, and a person skilled in the art can modify the technical solution of the present invention or substitute the same without departing from the spirit and scope of the present invention, and the scope of the present invention should be determined by the claims.

Claims

1. A method for discovering directional phishing attack events comprises the following steps:

2. The method of claim 1, wherein the network access data comprises network traffic data and host data obtained by the user terminal.

3. The method of claim 1, wherein the step of establishing the configuration rule comprises:

4. The method of claim 3, wherein the target system comprises a mailbox system, an OA system, and a website system for identity authentication through a Web login.

5. The method of claim 3, wherein the target system characteristics include an IP address, a domain name, a URL address, title content, and style content that identify the target system from the user access data.

6. The method of claim 1, wherein the actual login page characteristics are an IP address, a domain name, a URL address, title content, and style content of the actual login page identified from the user network access data.

7. The method of claim 1, wherein the counterfeit degree refers to the maximum similarity between the actual login page and the official login page, calculated based on the multi-dimensional features of domain name, URL address, page title, and page style.

8. The method of claim 1, wherein upon determining the targeted phishing attack event, issuing an alert; the content of the alarm comprises a phishing page name, a phishing page URL, a counterfeit object name, a counterfeit object URL, a threat degree, an alarm ID and an alarm time.

9. A targeted phishing attack event discovery apparatus comprising:

10. The targeted phishing attack event discovery apparatus of claim 9 wherein the configuration rules include username password keywords, a targeted phishing page domain name blacklist, a targeted phishing page IP blacklist, a non-targeted phishing page domain name whitelist.