WO2024049148A1 - Phishing attack prevention method, and recording media and devices for performing same

Info

Publication number: WO2024049148A1
Application number: PCT/KR2023/012749
Authority: WO
Inventors: 조해현; 심경민
Original assignee: 숭실대학교 산학협력단
Priority date: 2022-09-01
Filing date: 2023-08-29
Publication date: 2024-03-07

Abstract

The present invention relates to a phishing attack prevention method in a phishing attack prevention device for protecting a user browser from a phishing attack by a phishing website that uses a fingerprinting-based concealment technique, the phishing attack prevention method comprising the steps of: accessing a uniform resource locator (URL) through a user browser; disguising the user browser as a crawler by setting fingerprinting-based cloaking in the user browser; and receiving web page content from a phishing website corresponding to the URL. This can protect a user from the first visit to an unknown phishing website and also protect against a phishing attack by returning a harmless web page to the user.

Description

Methods for preventing phishing attacks, recording media and devices for performing them

The present invention relates to a phishing attack prevention method for protecting a user's browser from phishing attacks, and a recording medium and device for performing the same.

Existing research on protecting websites has proposed many anti-phishing technologies, such as URL (Uniform Resource Locator)-based phishing detection and web content analysis techniques.

For example, commercial URL blacklists such as Google Safe Browsing and Microsoft SmartScreen support the anti-phishing ecosystem on the backend.

In this ecosystem, anti-phishing systems use machine learning classifiers to alert users if the websites they visit are suspicious.

However, this ecosystem has the disadvantage that anti-phishing systems can accurately classify phishing websites only when they are able to retrieve the phishing content. In other words, phishing websites cannot be accurately classified when the phishing content cannot be retrieved.

Accordingly, the latest phishing techniques exploit this shortcoming of conventional anti-phishing systems by hiding phishing content from them, thereby delaying or disabling detection in the browser.

Cloaking, one of the techniques these latest phishing attacks use to hide from anti-phishing systems, falls into two categories: client-side and server-side.

First, client-side cloaking executes JavaScript in the user's browser and fingerprints the browser to distinguish visitors and display different web page content.

Second, server-side cloaking analyzes HTTP requests to identify visits by anti-phishing systems. Server-side cloaking does not work well when only a small amount of information is available, but with a relatively large amount of information it can easily distinguish HTTP requests from legitimate users from HTTP requests from anti-phishing systems.

Therefore, fingerprinting-based concealment techniques are widely used in advanced phishing websites. Figure 1 is a diagram showing the general operating process of server-side fingerprinting-based cloaking performed by such advanced phishing websites.

As shown in Figure 1, when a visitor visits a phishing website using a browser, fingerprinting-based cloaking identifies the visitor by using the hostname, IP address, User-Agent HTTP header, or Referrer HTTP header.

If the request is determined to be an HTTP request from an anti-phishing system, the phishing web server's cloaking code, having fingerprinted the profile of the HTTP request, responds with different web page content.

Figure 2 is a diagram showing a simplified PHP code snippet of fingerprinting-based cloaking that verifies IP, hostname, and User-Agent in a phishing kit.

As phishing techniques continue to evolve, the number of fingerprints checked increases, and if the hostname, IP address, or User-Agent matches, a 404 Page Not Found error response is returned. Therefore, since a visit by an anti-phishing system triggers fingerprinting-based cloaking on the phishing server as shown in Figures 1 and 2, there is a problem in that the anti-phishing system cannot retrieve the phishing content, resulting in false positives or detection delays.
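The server-side check described above can be illustrated with a minimal Python sketch. This is not the PHP snippet of Figure 2, which is not reproduced here; the block lists, the `Visit` type, and the function names are hypothetical and chosen only to show the idea of matching the IP address, hostname, and User-Agent and answering with a 404 on a match.

```python
from dataclasses import dataclass

# Hypothetical block lists; an actual phishing kit would ship its own lists.
BLOCKED_UA_WORDS = ("bot", "crawler", "google", "facebook")
BLOCKED_HOST_WORDS = ("amazonaws", "google")
BLOCKED_IP_PREFIXES = ("203.0.113.",)  # documentation range used as a stand-in


@dataclass
class Visit:
    ip: str
    hostname: str
    user_agent: str


def looks_like_anti_phishing_bot(visit: Visit) -> bool:
    """Return True if any fingerprint (IP, hostname, User-Agent) matches."""
    ua = visit.user_agent.lower()
    host = visit.hostname.lower()
    return (
        any(word in ua for word in BLOCKED_UA_WORDS)
        or any(word in host for word in BLOCKED_HOST_WORDS)
        or visit.ip.startswith(BLOCKED_IP_PREFIXES)
    )


def respond(visit: Visit) -> int:
    """Return the HTTP status code the cloaking server would send."""
    return 404 if looks_like_anti_phishing_bot(visit) else 200
```

Under these assumptions, a request whose User-Agent contains a listed word receives the error page, which is exactly the behavior the invention later provokes on purpose.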

[Prior art literature]

[Patent Document]

(Patent Document 1) Korean Patent Publication No. 10-2008-0072978

The present invention was created to solve the above problems, and the purpose of the present invention is to provide a phishing attack prevention method that can not only protect users from the first visit to an unknown phishing website but also protect against phishing attacks by returning a harmless web page to the user, and a recording medium and a device for performing the same.

A phishing attack prevention method according to an embodiment of the present invention for achieving the above object is a phishing attack prevention method in a phishing attack prevention device for protecting a user's browser from a phishing attack by a phishing website that uses a fingerprinting-based concealment technique, the method comprising: accessing a URL (Uniform Resource Locator) through the user browser; disguising the user browser as a crawler by setting fingerprinting-based cloaking in the user browser; and receiving web page content from a phishing website corresponding to the URL.

The step of disguising the user browser as a crawler may include: comparing the URL with a blacklist prepared in advance; blocking access to the URL if the URL is included in the blacklist; and, if the URL is not included in the blacklist, checking the past history of the URL by querying a fingerprinting database prepared in advance.

In addition, in the step of disguising the user browser as a crawler, if a history of successfully preventing phishing attacks on the URL is confirmed in the step of checking the past history, the user browser may be disguised as the crawler by maintaining the profile of the user browser, and if no such history is confirmed, the user browser may be disguised as the crawler by changing the profile of the user browser.

Changing the profile of the user browser may mean changing the bot profile included in the profile of the user browser based on an anti-phishing bot profile database prepared in advance.

In addition, after the step of disguising the user browser as a crawler, the method may further include sending an HTTP (HyperText Transfer Protocol) request that includes the maintained or changed profile of the user browser to the server of the phishing website.

The method for preventing phishing attacks may further include: after receiving the web page content, classifying phishing content among the web page content using a classification engine running in the background; and updating the fingerprint database using the accessed URL, the changed user browser profile, and a classification result of the web page content.

Meanwhile, a recording medium according to an embodiment of the present invention for achieving the above object is a computer-readable recording medium on which a computer program for performing a method for preventing phishing attacks according to an embodiment of the present invention is recorded.

Meanwhile, a phishing attack prevention device according to an embodiment of the present invention for achieving the above object is a phishing attack prevention device for protecting the user's browser from phishing attacks by phishing websites that use a fingerprinting-based concealment technique, and may include: a communication unit that accesses a URL through the user browser and receives web page content from a phishing website corresponding to the URL; and a control unit that sets fingerprinting-based cloaking in the user browser and disguises the user browser as a crawler.

Additionally, the control unit may include: a comparison unit that compares the URL with a blacklist prepared in advance; a blocking unit that blocks access to the URL when the URL is included in the blacklist; and a past history confirmation unit that, if the URL is not included in the blacklist, verifies the past history of the URL by querying a fingerprint database prepared in advance.

The control unit may further include a profile management unit that, when the past history confirmation unit confirms a history of successfully preventing a phishing attack against the URL, maintains the profile of the user browser to disguise the user browser as the crawler, and, when no such history is confirmed, changes the profile of the user browser to disguise the user browser as the crawler.

Additionally, the profile management unit may change the bot profile included in the profile of the user browser based on a pre-prepared anti-phishing bot profile database.

After disguising the user browser as the crawler, the control unit may send an HTTP (HyperText Transfer Protocol) request that includes the maintained or changed user browser profile to the server of the phishing website.

In addition, the control unit may further include a classification unit that, after the web page content is received, classifies phishing content among the web page content using a classification engine running in the background, and the control unit may update the fingerprint database using the accessed URL, the changed user browser profile, and the classification result of the web page content.

According to one aspect of the present invention described above, by providing a method for preventing phishing attacks, and a recording medium and a device for performing the same, it is possible to protect users from the first visit to an unknown phishing website and to protect against phishing attacks by returning a web page that is harmless to the user.

Figure 1 is a diagram showing the general operating process of server-side fingerprinting-based cloaking performed on a phishing website;

Figure 2 shows a simplified PHP code snippet of fingerprinting-based cloaking that verifies IP, hostname, and User-Agent in a phishing kit;

Figure 3 is a diagram for explaining the configuration of a phishing attack prevention device according to an embodiment of the present invention;

Figure 4 is a flowchart illustrating a method for preventing phishing attacks according to an embodiment of the present invention;

Figure 5 is a flowchart for explaining in more detail a method for preventing phishing attacks according to an embodiment of the present invention;

Figure 6a shows the content displayed by a phishing website when visited by a default browser;

Figures 6b and 6c show the content displayed when a browser according to the present invention visits a phishing website;

Figures 7a and 7b are diagrams comparing a default browser and a browser according to the present invention when visiting a general website; and

Figures 8a and 8b are diagrams for explaining pop-up permission results in a default browser and a browser according to the present invention.

The detailed description of the present invention described below refers to the accompanying drawings, which show by way of example specific embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the present invention are different from one another but are not necessarily mutually exclusive. For example, specific shapes, structures, and characteristics described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the invention. Additionally, it should be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. Accordingly, the detailed description that follows is not intended to be taken in a limiting sense, and the scope of the invention is limited only by the appended claims, together with all equivalents to what those claims assert, if properly described. Similar reference numerals in the drawings refer to identical or similar functions across various aspects.

The components according to the present invention are components defined by functional division rather than physical division, and can be defined by the functions each performs. Each component may be implemented as hardware or as program code and processing units that perform each function, and the functions of two or more components may be included and implemented in one component. Therefore, the names given to the components in the following embodiments are not intended to physically distinguish each component, but are given to suggest the representative function performed by each component, and it should be noted that the technical idea of the present invention is not limited by the names of the components.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the drawings.

Figure 3 is a diagram for explaining the configuration of a phishing attack prevention device 100 according to an embodiment of the present invention.

The phishing attack prevention device 100 (hereinafter referred to as device) according to this embodiment is provided to protect the user's browser from phishing attacks by phishing websites that use fingerprinting-based concealment techniques.

To this end, the device 100 according to this embodiment may include a communication unit 110, a storage unit 130, and a control unit 150. In addition, the device 100 may have software (an application) for performing the phishing attack prevention method installed and executed on it, and the communication unit 110, storage unit 130, and control unit 150 may be controlled by that software (application) to perform the phishing attack prevention method.

At this time, the device 100 may be a separate terminal or a partial module of the terminal. Additionally, the communication unit 110, storage unit 130, and control unit 150 may be formed as an integrated module or may be comprised of one or more modules. However, on the contrary, each component may be comprised of a separate module.

Additionally, the device 100 may be mobile or fixed. The device 100 may be in the form of a server or an engine, and may be called by other terms such as device, apparatus, terminal, user equipment (UE), mobile station (MS), wireless device, or handheld device. The device 100 can execute or produce various software based on an operating system (OS). Here, the operating system is a system program that allows software to use the hardware of the device, and may include mobile operating systems such as Android OS, iOS, Windows Mobile OS, Bada OS, Symbian OS, and Blackberry OS, as well as computer operating systems such as Windows, Linux, Unix, MAC, AIX, and HP-UX.

First, the communication unit 110 is provided to transmit and receive various information.

The communication unit 110 according to this embodiment can enable the user to access the URL through the user browser.

And the communication unit 110 may receive web page content from a phishing website corresponding to the URL entered through the user's browser.

The storage unit 130 records a program for performing a phishing attack prevention method. In addition, the storage unit 130 temporarily or permanently stores data processed by the control unit 150, and may include a volatile storage medium or a non-volatile storage medium, but the scope of the present invention is not limited thereto.

The storage unit 130 also stores data accumulated while performing a phishing attack prevention method.

The storage unit 130 may include a fingerprint database unit 131 and an anti-phishing bot profile database unit 133.

The fingerprint database unit 131 may be provided to store URL past history, which is the URL processing history. And the fingerprint database unit 131 may be updated according to the information processed by the control unit 150.

This fingerprint database unit 131 may operate locally to prevent sharing of visit records between users to protect personal information.

Alternatively, the fingerprint database unit 131 may be configured to share URL past history through a centralized server while still protecting the user's personal information.

Meanwhile, the anti-phishing bot profile database unit 133 may be provided to store information that allows changing the profile of the user's browser.

Here, a profile can be defined as a set of attributes that can be fingerprinted and their values. For example, one profile might contain a User-Agent string that includes "bot", an empty Referrer, or an AWS IP address. These profiles are used to generate appropriate HTTP requests to the target website.
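A profile in this sense can be modeled as a small record that is turned into HTTP request headers. The following is a minimal sketch under stated assumptions: the `Profile` class, its field names, and the `proxy` field are hypothetical and are not defined in the source. Note that the HTTP header is spelled `Referer` on the wire.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical record of fingerprintable attributes and their values."""
    user_agent: str
    referrer: Optional[str] = None  # None means the header is omitted
    proxy: Optional[str] = None     # assumed proxy address, e.g. a cloud-hosted host

    def to_headers(self) -> Dict[str, str]:
        """Build the HTTP request headers implied by this profile."""
        headers = {"User-Agent": self.user_agent}
        if self.referrer:
            # "Referer" is the spelling used by the HTTP header field itself.
            headers["Referer"] = self.referrer
        return headers
```

For example, `Profile("Mozilla/5.0 bot").to_headers()` yields only a `User-Agent` header, which corresponds to the "empty Referrer" case mentioned above.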

Accordingly, whenever a phishing website, that is, a phisher, adds a new fingerprint to the fingerprint-based concealment, these attributes can be stored in the anti-phishing bot profile database unit 133.

That is, the anti-phishing bot profile database unit 133 can store a profile that allows the user's browser to be disguised as a crawler. A profile that can be disguised as such a crawler can be extracted and stored from the phishing kit shown in FIG. 2.

Additionally, the storage unit 130 may store a blacklist maintained by an anti-phishing system. Of course, this is an example, and the blacklist maintained by the anti-phishing system may be received in real time through the communication unit 110.

Meanwhile, the control unit 150 is provided to control the entire process of providing a method to prevent phishing attacks.

Additionally, the control unit 150 can set cloaking based on fingerprinting on the user's browser to disguise the user's browser as a crawler.

For this purpose, the control unit 150 may include a comparison unit 151, a blocking unit 153, a past history confirmation unit 155, and a profile management unit 157.

The comparison unit 151 can compare the URL to be accessed through the user's browser with the blacklist.

And the blocking unit 153 can block access to the URL if the URL is included in the blacklist.

If the corresponding URL is included in the blacklist, the blocking unit 153 can completely block access by outputting a phishing route that can increase identification.

If the URL is not included in the blacklist, the past history confirmation unit 155 may query the fingerprint database unit 131 to check the past history of the URL.

If the past history confirmation unit 155 confirms the history of successfully preventing phishing attacks on the URL, the profile management unit 157 maintains the profile of the user's browser as a profile used in the past and disguises the user's browser as a crawler.

If the past history confirmation unit 155 does not confirm a history of successfully preventing phishing attacks for the URL, the profile management unit 157 can disguise the user browser as a crawler by changing the profile of the user browser so as to trigger fingerprinting-based cloaking on the phishing website.

When the profile management unit 157 changes the profile to disguise the user's browser as a crawler, it may mean changing the bot profile included in the user browser's profile based on the anti-phishing bot profile database unit 133.

Specifically, the profile management unit 157 can edit or change profile items such as the User-Agent HTTP header.

Anti-phishing crawlers typically include "bot" and "crawler" or company names such as "Google" and "Facebook" in the User-Agent HTTP header, as shown in Table 1 below.

Table 1 is an exemplary table showing a list of the top 10 sensitive words that appeared as a result of analyzing at least one phishing kit as shown in FIG. 2. Accordingly, the profile management unit 157 is provided in advance with the trigger words listed in Table 1, which are automatically extracted from phishing kits, and can disguise the user's browser as a crawler by changing the profile using these words.

Additionally, the profile management unit 157 according to this embodiment may be arranged to allow more trigger words to modify the user profile according to anti-phishing crawlers, bots, and server-side cloaking conditions.

Additionally, the profile management unit 157 can edit or change the Referrer HTTP header. Typically, users who are potential victims access phishing websites through the phisher's phishing lures. Therefore, phishers can block all visits that do not arrive through a phishing lure.

Additionally, the profile management unit 157 can optionally disguise the user's IP address by using a proxy server.

More specifically, according to the results of analyzing phishing kits as shown in Table 1 above, phishers appear to have inferred that some anti-phishing crawlers run on AWS EC2, so a proxy server hosted on AWS EC2 is useful, and the profile management unit 157 can use such a proxy server.

In this case, the profile management unit 157 can proxy the request through a disguised IP address, and the proxy server can help evade phishing websites that use fingerprinting-based cloaking.

Additionally, in the process of changing the profile, the profile management unit 157 may add one trigger word from the anti-phishing bot profile database unit 133 to the User-Agent string according to the popularity order in Table 1.

Additionally, the profile management unit 157 may avoid reusing trigger words that were previously unsuccessful for the same URL, and may set the referring page to none (that is, remove the header).

Additionally, when changing the IP address or hostname, the profile management unit 157 can optionally reroute the request to a proxy server in one of the most frequently blocked IP ranges.
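The profile change rules above (append one trigger word in popularity order, skip words that already failed for the same URL, drop the referring page) can be sketched as follows. Table 1 is not reproduced in this text, so the word list and its ordering below are placeholders taken from the examples named earlier, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Sequence

# Placeholder ordering; the actual popularity ranking of Table 1 is not available here.
TRIGGER_WORDS_BY_POPULARITY: Sequence[str] = ("bot", "crawler", "Google", "Facebook")


def next_trigger_word(
    failed_words: Iterable[str],
    candidates: Sequence[str] = TRIGGER_WORDS_BY_POPULARITY,
) -> Optional[str]:
    """Return the most popular trigger word not yet tried unsuccessfully."""
    failed = {word.lower() for word in failed_words}
    for word in candidates:
        if word.lower() not in failed:
            return word
    return None  # every known trigger word has already failed for this URL


def changed_headers(base_user_agent: str, failed_words: Iterable[str]) -> Optional[Dict[str, str]]:
    """Build request headers that imitate a crawler, or None if no word is left."""
    word = next_trigger_word(failed_words)
    if word is None:
        return None
    # Only User-Agent is set; omitting the Referer key models "header removal".
    return {"User-Agent": f"{base_user_agent} {word}"}
```

Rerouting through a proxy server is left out of this sketch because it depends on the network stack in use.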

Therefore, the control unit 150 can disguise the user's browser as a crawler through the profile management unit 157 and then send an HTTP (HyperText Transfer Protocol) request that includes the browser's profile to the server of the phishing website.

Accordingly, the control unit 150 can receive an HTTP response from the server.

Additionally, the control unit 150 may further include a classification unit 159.

The classification unit 159 may classify phishing content using a classification engine running in the background after receiving web page content through an HTTP request.

Specifically, after receiving web page content, the control unit 150 may check whether there is suspicious content using a classification engine running in the background to prevent page rendering delays.

The reason why the classification unit 159 classifies suspicious content in this way is to check whether the profile changed in the profile management unit 157 is effective in causing cloaking.

And the control unit 150 can update the fingerprint database unit 131 using the accessed URL, the changed user browser profile, and the classification result of the web page content.

Below, four possibilities will be described when the device 100 according to this embodiment receives an HTTP response after disguising the user's browser as a crawler.

The first is when the server is harmless and responds with harmless content; the second is when the server is harmless and responds with suspicious content; the third is when the server is malicious and responds with harmless content (such as an error page or a redirect to a harmless website); and the last is when the server is malicious and responds with suspicious content.

The criteria for determining this are the presence or absence of features such as login forms, sensitive (phishing) words such as user names or passwords, and submit buttons. Whenever these features are absent, phishing attacks can be considered successfully prevented.
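The criteria above can be turned into a small feature check. The sketch below uses Python's standard `html.parser` module; the word list, the class name, and the rule that any single feature marks the page as suspicious are assumptions made for illustration, and the source does not disclose how its classification engine is implemented.

```python
from html.parser import HTMLParser

SENSITIVE_WORDS = ("username", "password")  # hypothetical word list


class FeatureScanner(HTMLParser):
    """Collects the three features named in the text from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.has_form = False
        self.has_submit = False
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form":
            self.has_form = True
        input_type = (attributes.get("type") or "").lower()
        if tag in ("input", "button") and input_type == "submit":
            self.has_submit = True
        # Attribute values such as name="password" also carry sensitive words.
        self.text_parts.extend(value for value in attributes.values() if value)

    def handle_data(self, data):
        self.text_parts.append(data)


def is_suspicious(html: str) -> bool:
    """Return True if a login form, a submit button, or a sensitive word is present."""
    scanner = FeatureScanner()
    scanner.feed(html)
    scanner.close()
    text = " ".join(scanner.text_parts).lower()
    has_sensitive_word = any(word in text for word in SENSITIVE_WORDS)
    return scanner.has_form or scanner.has_submit or has_sensitive_word
```

When `is_suspicious` returns False for the received page, the visit would count as a successfully prevented phishing attack in the sense described above.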

For example, when the user visits the harmless paypal.com for the first time in a situation where the device 100 does not know whether the URL is harmless or harmful, the profile management unit 157 changes the profile, but the genuine PayPal page is returned, so the displayed web page still includes username and password fields. In this case, because the page still contains sensitive words, the device 100 may regard the prevention of phishing attacks as unsuccessful.

Likewise, if the user visits a PayPal phishing page such as paypal-cerify.com that does not use cloaking, the device 100 changes the profile but still receives the phishing web page, so the control unit 150 can regard the prevention of the phishing attack as unsuccessful for the same reason as described above.

In the case of the first of the above four possibilities (harmless server and content), there is no security risk to the user, so the device 100 can regard the prevention of the phishing attack as successful.

The second case (harmless server and suspicious content) may occur when a user visits a harmless website that contains phishing-like features such as login forms, sensitive words, and submit buttons. These websites are often indistinguishable from phishing websites. Accordingly, the device 100 may classify all unknown websites with login forms as suspicious and regard the phishing attack prevention as unsuccessful. That is, the device 100 may judge the content to be suspicious and mark the profile modification as a failure in the suspicious content classification.

Meanwhile, in the third case (malicious server and harmless content), the malicious server uses a concealment technique and determines that a URL visit made through the device 100 is a visit by an anti-phishing bot. Therefore, in this case, the phishing website returns an error web page or redirects the visit to a normal website. As a result, the device 100 can successfully prevent users from viewing phishing content by triggering fingerprinting-based cloaking on phishing websites.

In the fourth case (malicious servers and suspicious content), the phishing website may not perform fingerprinting-based cloaking, or the profile may not trigger fingerprinting-based cloaking.

In the former case, the device 100 cannot trigger any cloaking actions, so it can be considered to have failed to prevent phishing attacks. However, in this case, the phishing website can be quickly detected by the phishing prevention system, and this will be described later with reference to FIGS. 6 to 8.

In the latter case, the device 100 may save the failed profile so that future visits to that URL can try a different profile to trigger fingerprinting-based cloaking. From the browser's perspective, this case is the same as the second case (harmless server and suspicious content), so the device 100 cannot simply block the URL because it does not know whether the server is malicious or harmless.
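The four cases can be summarized in a few lines. In this sketch, `server_malicious` stands for ground truth that the browser cannot observe, so the success decision depends only on the content; the function name and labels are hypothetical.

```python
from typing import Tuple


def assess(server_malicious: bool, content_suspicious: bool) -> Tuple[str, bool]:
    """Return (case label, whether prevention is considered successful)."""
    server = "malicious" if server_malicious else "harmless"
    content = "suspicious" if content_suspicious else "harmless"
    # The device only sees the content, so cases two and four look identical to it.
    prevented = not content_suspicious
    return f"{server} server, {content} content", prevented
```

This makes explicit why the device cannot block a URL on suspicious content alone: both the harmless and the malicious server lead to the same observation.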

FIG. 4 is a flowchart illustrating a method for preventing phishing attacks according to an embodiment of the present invention. The method for preventing phishing attacks according to an embodiment of the present invention can be carried out in substantially the same configuration as the phishing attack prevention device 100 shown in FIG. 3. Accordingly, the same reference numerals are assigned to the same components as those of the phishing attack prevention device 100 of FIG. 3, and repeated descriptions are omitted.

This phishing attack prevention method includes the step of accessing a URL (S110), disguising the user's browser as a crawler (S130), and receiving web page content (S150).

In the step of accessing the URL (S110), the communication unit 110 can access the URL through the user's browser.

In the step of disguising the user's browser as a crawler (S130), the control unit 150 can disguise the user's browser as a crawler by setting cloaking based on fingerprinting on the user's browser.

In the step of disguising the user's browser as a crawler (S130), the control unit 150 may compare the URL with a blacklist prepared in advance.

And in the step of disguising the user's browser as a crawler (S130), if the URL is included in the blacklist, the control unit 150 may block access to the URL.

Additionally, in the step of disguising the user's browser as a crawler (S130), if the URL is not included in the blacklist, the control unit 150 can check the past history of the URL by querying the fingerprint database unit 131 prepared in advance.

In the step of disguising the user's browser as a crawler (S130), if the control unit 150 checks the past history and confirms a history of successfully preventing phishing attacks on the URL, the control unit 150 can disguise the user's browser as a crawler by maintaining the profile that successfully prevented the phishing attack as the profile of the user's browser.

Additionally, in the step of disguising the user browser as a crawler (S130), if the control unit 150 does not confirm a history of successfully preventing phishing attacks on the URL, the control unit 150 can disguise the user browser as a crawler by changing the profile of the user browser.

In the step of disguising the user browser as a crawler (S130), changing the profile of the user browser may mean that the control unit 150 changes the bot profile included in the profile of the user browser based on the anti-phishing bot profile database unit 133 prepared in advance.

The method of preventing phishing attacks according to this embodiment may further include, after the step of disguising the user's browser as a crawler (S130), a step in which the control unit 150 sends an HTTP (HyperText Transfer Protocol) request that includes the maintained or changed profile of the user's browser to the server of the phishing website.

In the step of receiving web page content (S150), the control unit 150 may receive web page content from the phishing website corresponding to the URL.

Additionally, the method for preventing phishing attacks may further include classifying phishing content and updating the fingerprint database unit 131 after receiving web page content (S150).

The step of classifying phishing content may be a step in which the control unit 150 classifies phishing content among web page content using a classification engine running in the background.

In the step of updating the fingerprint database unit 131, the control unit 150 may update the fingerprint database unit 131 using the accessed URL, the changed user browser profile, and the classification result of the web page content.

Figure 5 is a flowchart to explain in more detail a method for preventing phishing attacks according to an embodiment of the present invention.

First, the device 100 can access the URL using a browser (S210).

And the device 100 can determine in real time whether the corresponding URL is included in the blacklist using a commercial URL blacklist such as Google Safe Browsing or Microsoft SmartScreen (S220).

If the URL is included in the blacklist (S220-Yes), the device 100 can block access to the URL (S225).

On the other hand, if the URL is not included in the blacklist (S220-No), the device 100 queries the fingerprint database unit 131 to check whether the URL has been processed previously (S230).

If there is a history of processing the URL and the phishing attack was previously successfully prevented (S230-Yes), the device 100 can maintain a profile based on the processing history (S235).

If there is no history of processing the URL or the phishing attack was not successfully prevented (S230-No), the device 100 can change the profile using the profile information stored in the anti-phishing bot profile database unit 133 (S240).

Thereafter, the device 100 may send an HTTP request to the phishing website server using the maintained or changed profile (S245).

The device 100 may then receive web page content from the phishing website in response to the HTTP request.

Additionally, the device 100 may check whether there is suspicious content using a classification engine running in the background (S255).
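The flow of steps S210 to S255 can be sketched as a single function. This is a simplified illustration under stated assumptions, not the implementation of the device 100: the blacklist is a plain set, the fingerprint database is an in-memory dictionary, and `pick_new_profile`, `fetch`, and `is_suspicious` are hypothetical callables supplied by the caller. The classification here runs synchronously, whereas the text describes it as running in the background.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Profile = Dict[str, str]
# Maps URL -> (profile used, whether the phishing attack was prevented).
FingerprintDb = Dict[str, Tuple[Profile, bool]]


def handle_navigation(
    url: str,
    blacklist: Set[str],
    fingerprint_db: FingerprintDb,
    pick_new_profile: Callable[[str], Profile],
    fetch: Callable[[str, Profile], str],
    is_suspicious: Callable[[str], bool],
) -> Optional[str]:
    """Return the received page content, or None if access was blocked."""
    # S220 / S225: block URLs that are already blacklisted.
    if url in blacklist:
        return None

    # S230: look up how this URL was handled before.
    record = fingerprint_db.get(url)
    if record is not None and record[1]:
        profile = record[0]              # S235: keep the profile that worked
    else:
        profile = pick_new_profile(url)  # S240: change the profile

    # S245: send the request with the chosen profile and read the response.
    content = fetch(url, profile)

    # S255: classify the content, then record the outcome for future visits.
    prevented = not is_suspicious(content)
    fingerprint_db[url] = (profile, prevented)
    return content
```

Passing the network and classification steps in as callables keeps the sketch independent of any particular HTTP library or classifier.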

The phishing attack prevention method of the present invention can be implemented in the form of program instructions that can be executed through various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, etc., singly or in combination.

Program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and usable by those skilled in the computer software field.

Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks and magnetic tapes, optical recording media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, and hardware devices specifically configured to store and perform program instructions, such as ROM, RAM, flash memory, etc.

Examples of program instructions include not only machine language code such as that created by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules to perform processing according to the invention and vice versa.

Figures 6 to 8 are diagrams for explaining the effectiveness of a method for preventing phishing attacks according to an embodiment of the present invention.

In order to evaluate the phishing attack prevention method of the present invention, the device 100 was implemented as a Chrome browser extension, and evaluation was conducted from three perspectives: effectiveness, latency, and functional impact.

These perspectives demonstrate the practical feasibility of the framework: it evades phishing websites at a high rate, does not harm the use of benign websites, and introduces only negligible latency to user navigation.

Malicious and benign data sets were used to test the effectiveness of the phishing attack prevention method.

More specifically, for the effectiveness evaluation, 160,728 live phishing websites were visited from November 2020 to July 2021 as the malicious data set, using the Anti-Phishing Working Group (APWG) URL feed, a curated data set of reported phishing URLs. Another 8,474 live phishing websites from the APWG data set were then used to evaluate the effectiveness of IP changes, that is, profile changes.

Meanwhile, to evaluate the impact of the phishing attack prevention method of the present invention on benign websites, a benign data set of 60,848 domains randomly selected from 629,843 domains in the Alexa Top One Million Domain List was collected.

Then, the effectiveness of the phishing attack prevention method was evaluated by visiting the same phishing website using the default browser and a browser reflecting the application for performing the phishing attack prevention method of the present invention (hereinafter referred to as this browser).

The phishing URLs visited with the default browser and this browser were taken from the APWG data set.

To reduce the impact of trigger word selection on the results, for each URL this browser used a profile with one trigger word chosen at random from the 407 trigger words, no referrer header, and no IP proxy. A simple profile was sufficient because, according to the phishing kit behavior analysis, cloaking is triggered whenever a cloaking rule matches.

We also evaluated the effectiveness of each trigger word, evaluated the effectiveness of IP address proxies, and recorded the URL and content of the last visited web page each time a URL was visited.

As a result, in the experiment on 160,728 phishing URLs from the APWG feed from November 2020 to July 2021, 132,247 (82.28%) returned no malicious content when viewed in this browser.

Here, an HTTP response to this browser was considered benign if it (1) differed from the web page displayed in the default browser and (2) did not contain suspicious content such as "phishing" words or malformations, as determined by a reimplementation of the CANTINA+ content features.
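As a non-limiting illustration, the two-part criterion above may be sketched as follows. The keyword list and the function names are hypothetical, and the simple keyword match is only a stand-in for the CANTINA+ content features, which are not reproduced here.

```python
# Hypothetical keyword list standing in for the CANTINA+ content features.
SUSPICIOUS_WORDS = ("password", "verify your account", "login")


def contains_suspicious_content(html: str) -> bool:
    """Very rough stand-in for a content classifier."""
    text = html.lower()
    return any(word in text for word in SUSPICIOUS_WORDS)


def is_evaded(default_html: str, protected_html: str) -> bool:
    """Criterion (1): the page differs from the default-browser page.
    Criterion (2): the page shows no suspicious content."""
    differs = protected_html != default_html
    return differs and not contains_suspicious_content(protected_html)
```

For example, is_evaded("<form>Enter password</form>", "<h1>404 Not Found</h1>") evaluates to True, whereas two identical responses never count as an evasion.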

Figure 6 is a diagram illustrating the difference in response web page content between a visit to a cloaked phishing website through the default browser and a visit through this browser.

Figure 6a shows the phishing content displayed when the default browser visits the website.

Meanwhile, when this browser changed the HTTP profile to include a random trigger word and removed the referrer header, the phishing website displayed the error web page in Figure 6b.

Additionally, other phishing websites redirected this browser to a benign URL instead of returning an error page. In this case this browser received the web page content shown in Figure 6c, indicating a successful evasion.

These results show that the present invention can disguise users as anti-phishing entities through fingerprinting-based cloaking technology and protect users from the phishing content served by phishing sites.

And because users view harmless content (error pages or harmless URLs), they are not exposed to phishing attacks, thus preventing them from becoming victims of phishing attacks. This also applies to users who visit a phishing URL for the first time.

Below, we will describe the results of evaluating the effectiveness of each trigger word that actually triggers the fingerprinting-based cloaking technique.

Trigger words are retrieved through the phishing kit analysis in Figure 2, and all trigger words are tested by visiting each phishing website using different profiles, each consisting of one trigger word, no referrer header, and no IP proxy.

For evaluation purposes, each profile included only one trigger word, so one profile means one trigger word.
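As a non-limiting illustration, building one profile per trigger word may be sketched as follows. The sketch assumes that the trigger word is appended to the User-Agent string; the base User-Agent value, the function names, and the example URL are placeholders. No request is actually sent.

```python
import urllib.request
from typing import Dict, List

# Placeholder base User-Agent; a real browser string would be used in practice.
BASE_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_profile(trigger_word: str) -> Dict[str, str]:
    """One profile = one trigger word, no referrer header, no IP proxy."""
    return {"User-Agent": f"{BASE_USER_AGENT} {trigger_word}"}


def build_profiles(trigger_words: List[str]) -> Dict[str, Dict[str, str]]:
    # Map each trigger word to its own single-word profile.
    return {word: build_profile(word) for word in trigger_words}


def build_request(url: str, profile: Dict[str, str]) -> urllib.request.Request:
    # Only constructs the request object; nothing is sent over the network.
    return urllib.request.Request(url, headers=profile)


profiles = build_profiles(["bot", "google"])
request = build_request("http://example.test/", profiles["bot"])
```

Because each profile differs from the default only in its trigger word, any difference in the returned page can be attributed to that word.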

As in the effectiveness evaluation, each website was also visited with the default browser for comparison, and the evaluation was performed on 916 phishing websites.

As a result, it was confirmed that 725 phishing websites displayed different web pages to this browser and the default browser for one or more trigger words.

Across these 725 cloaked phishing websites, each trigger word showed a different evasion effectiveness.

Table 2 below shows the results of the top 10 trigger words that successfully avoided phishing content.

As can be seen in Table 2, the word "bot" evades the most phishing websites. In other words, 99.31% of cloaked phishing websites can be avoided by adding "bot" to the User-Agent. Compared with the popularity ranking in Table 1, it is also the most frequently blocked word in the phishing kits investigated by the inventors of the present invention.

The effectiveness of this word is thus consistent with its popularity in phishing kits; similarly, "amazonaws", "phishtank", and "google" are also frequently used in phishing kits.

Notably, "bot" and "amazonaws" combined can trigger cloaking on any phishing website that uses fingerprinting-based cloaking techniques.

These results show that trigger words can effectively evade phishing websites through fingerprinting-based cloaking technology.

A large number of cloaked phishing websites can therefore be avoided with just a few trigger words.
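As a non-limiting illustration, choosing a small set of trigger words that together evade as many websites as possible may be sketched with a greedy selection. The function name and the toy mapping below are purely hypothetical and do not reproduce the measured results; greedy selection is one possible technique and is not stated to be the one used in the evaluation.

```python
from typing import Dict, List, Set


def greedy_trigger_words(evaded_by: Dict[str, Set[str]], limit: int) -> List[str]:
    """Pick up to `limit` words, each time taking the word that evades
    the most websites not yet covered by the words already chosen."""
    chosen: List[str] = []
    covered: Set[str] = set()
    if not evaded_by:
        return chosen
    for _ in range(limit):
        best = max(evaded_by, key=lambda word: len(evaded_by[word] - covered))
        if not evaded_by[best] - covered:
            break  # no remaining word adds any new website
        chosen.append(best)
        covered |= evaded_by[best]
    return chosen


# Toy data for illustration only (not measured values).
toy = {
    "bot": {"site1", "site2", "site3"},
    "amazonaws": {"site3", "site4"},
    "google": {"site1"},
}
selected = greedy_trigger_words(toy, limit=2)
```

With this toy mapping, two words already cover all four example sites, which mirrors the observation that a few trigger words suffice.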

Additionally, this phishing attack prevention method achieved an evasion rate of over 80% by changing both the User-Agent and Referrer headers. To evaluate this, three different browsers visited the same data set of phishing websites: (1) the UA browser, which is this browser changing only the User-Agent string (UA); (2) the REF browser, which is this browser changing only the Referrer; and (3) a general browser.

In this evaluation, 4,905 phishing websites were visited using the three browsers above. In addition, the analysis compared the web page content of each website in the UA browser and REF browser with the web page content in the general browser.

This made it possible to count the web pages in the UA browser or the REF browser that did not contain suspicious content.

Among the visited phishing websites, the UA browser avoided 4,028 and the REF browser avoided 16. Because only a limited number of phishing kits include Referrer checking, this browser was able to avoid only a small number of phishing websites by changing the Referrer alone.

The Referrer is nevertheless kept as an option in this browser, because phishers may check it in future phishing kits.

And by combining Referrer and User-Agent, 82.44% of phishing websites can be avoided. These results show that this browser can avoid more phishing websites by changing only the User-Agent header than by changing only the Referrer.
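As a worked example of the figures above, the per-browser evasion rates over the 4,905 visited phishing websites can be computed as follows. The helper name rate is hypothetical; the counts are those stated in this description, and the combined figure of 82.44% is taken from the description as-is rather than recomputed.

```python
def rate(evaded: int, visited: int) -> float:
    """Evasion rate in percent, rounded to two decimal places."""
    return round(100.0 * evaded / visited, 2)


VISITED = 4905                    # phishing websites visited by all three browsers
ua_rate = rate(4028, VISITED)     # UA browser: only the User-Agent changed
ref_rate = rate(16, VISITED)      # REF browser: only the Referrer changed
```

This gives roughly 82.12% for the UA browser and 0.33% for the REF browser, which is consistent with the statement that changing only the User-Agent avoids far more phishing websites than changing only the Referrer.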

Meanwhile, although the present invention can successfully avoid phishing content in more than 80% of phishing websites by changing the User-Agent and Referrer headers, the effectiveness of changing the IP address was also analyzed.

Therefore, we performed another experiment with a profile of this browser that had the default User-Agent and Referrer headers but proxied the connection through a server with an Amazon AWS IP address.

Because the proxy server option may affect user privacy, it is turned off by default. Users may enable this feature in this browser only after they have read, understood, and agreed to the privacy statement. In this experiment, 8,474 phishing websites were used, and the websites were visited in both the default browser and this browser (with only the IP address changed by the profile).

We then compared the web page content of each phishing website across the two visits to detect whether the browser's web pages contained suspicious content. Of the phishing sites visited, this browser evaded 88.98% (7,540) through proxy servers.

This allows this browser to avoid phishing websites that implement IP, User-Agent, or Referrer cloaking. Phishers may design new cloaking technologies in the future, but the invention is designed as an extensible framework so that further fingerprinting capabilities can be added.

Next, we looked at the impact of the browser on the user experience when the user visits a normal website.

By design, this browser can introduce latency into HTTP requests due to database queries, HTTP profile changes, and inspection of returned content. We conducted an experiment to measure the browser's latency from three perspectives: database queries, profile changes, and content inspection.

We used exthouse, a test bench for analyzing the impact of browser extensions on web performance, which reports the following five key measurements:

(1) Time to Interactive (TTI): the time it takes for the page to become fully interactive with the extension installed;

(2) First Input Delay (FID Δ): the time from when the user first interacts with the website until the browser can actually start processing event handlers in response to that action;

(3) Scripting time (Scripting Δ): the time during which JavaScript is executed in the extension;

(4) Added Long Tasks: the sum of the Long Tasks added by the extension, where a Long Task is defined as a task that blocks the main thread for more than 50 ms; and

(5) Additional CPU time: the additional CPU consumption of the extension for each URL visited by the browser.

For each of these measurements, a lower value indicates better performance.

And exthouse generates a score for the extension, with a higher score indicating better performance of the extension.

Table 3 shows the exthouse scores of the top 10 Chrome extensions and this browser (Spartacus) when visiting benign and malicious websites.

We tested these extensions on 100 websites, half benign and half malicious, and used the average for our metrics. This browser (Spartacus) received a score of 100 based on a 20 ms FID, a scripting delta of 0, and an 800 ms TTI when visiting benign websites. The metrics of this browser (Spartacus) when visiting malicious websites are also better than those of the other popular extensions.

Interacting with a malicious website takes longer because Spartacus needs time to change profiles, but this is still less time than other extensions require. For example, Avira Browser Safety (ABS), an extension that warns users if a website is not safe, adds long tasks and additional CPU time when visiting malicious websites.

The results of this evaluation show that our browser (Spartacus) adds minimal overhead to web browsing.

Test results show that Spartacus outperforms popular Chrome extensions and has a minimal impact on website performance compared to other extensions.

Additionally, it is important for the phishing attack prevention method of the present invention to minimize the negative impact of visiting harmless URLs. These impacts may include website accessibility, correct display of website layout and correct website functionality.

To evaluate the functionality of benign websites, for each URL, this browser used a profile without any trigger words, referrer headers, and IP proxies.

Specifically, we evaluated whether our framework would have a negative impact on access to a website or website layout through automatic analysis of the results of large-scale crawling of innocuous domains.

For this purpose, 60,848 (9.66%) of the 629,843 URLs in the Alexa Top One Million Domain List were randomly sampled and visited in both the default browser and this browser.

Both browsers were used with existing sessions, similar to a user's browser, and the resulting web pages were compared in terms of HTML similarity, screenshots, and visited URLs. The results are shown in Table 4 below.
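As a non-limiting illustration, the comparison of the visited URL and the HTML of two visits may be sketched as follows. The use of difflib, the function names, and the 0.9 threshold are assumptions for illustration; the similarity measure actually used in the evaluation is not specified in this description, and the screenshot comparison is omitted.

```python
from difflib import SequenceMatcher
from typing import Tuple

Visit = Tuple[str, str]  # (final URL, returned HTML)


def html_similarity(html_a: str, html_b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means the two documents are identical."""
    return SequenceMatcher(None, html_a, html_b).ratio()


def same_page(default_visit: Visit, protected_visit: Visit, threshold: float = 0.9) -> bool:
    """Treat two visits as equivalent when they end at the same URL and
    their HTML is at least `threshold` similar (hypothetical cut-off)."""
    same_url = default_visit[0] == protected_visit[0]
    similar = html_similarity(default_visit[1], protected_visit[1]) >= threshold
    return same_url and similar
```

Visits that fail this check would then be candidates for the manual inspection of layout and access differences.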

Of the sampled websites, 0.25% (150) used a different layout and 0.20% (124) blocked access by this browser.

First, we manually examined the results to determine why this browser displayed a different layout than the default browser. Although the screenshots and HTML differed between visits with the default browser and this browser, we found that these differences did not affect the use of the website.

Figures 7 and 8 show typical differences in rendering between visits with the default browser and this browser. For example, a web page is rendered differently in terms of screenshot similarity between the default browser visit shown in Figure 7a and the visit with this browser shown in Figure 7b.

Here, the differences between Figures 7a and 7b are the shape of the button, the background color, and the content spacing.

Referring to Figure 8, a window requesting permission to use cookies pops up in the default browser in Figure 8a, but the pop-up does not appear when visiting through this browser in Figure 8b.

The missing cookie request pop-up in this browser in Figure 8b was not caused by the extension; the pop-up appeared only three times in 10 visits in a different default browser.

Meanwhile, to further evaluate the potential impact of the present invention on harmless websites, an experiment was conducted by visiting 629,843 harmless websites on the Alexa Top One Million Domain List.

Through our experiments, we found that only 3,023 (0.48%) harmless websites blocked access or displayed a different web page layout in the browser.

This result confirms previous results showing that most harmless websites do not block visits by the browser and do not deliver web page content to the browser that is different from normal visits.

And some legitimate websites are built on web hosting services such as Cloudflare and Akamai, whose services include security mechanisms such as anti-DDoS and anti-crawling.

Therefore, to confirm that users protected by this browser can still successfully visit these websites, we used this browser to visit 5,000 Cloudflare-powered and 5,000 Akamai-powered benign websites.

Of the total 10,000 websites, 99.86% could be visited successfully, and 14 benign sites were inaccessible.

This is mainly because website owners use traffic filtering mechanisms through CDNs, and we confirmed that with such a low false evasion rate, users can successfully visit most legitimate websites hosted on CDNs.

Harmless websites that are incorrectly evaded may be reported to the provider of this browser, which may scan them asynchronously and have this browser visit them using the default profile.

Although various embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described above. Various modifications can of course be made by those skilled in the technical field to which the invention pertains without departing from the gist of the invention as claimed in the claims, and these modifications should not be understood separately from the technical idea or perspective of the present invention.

[Explanation of symbols]

100: Phishing attack prevention device 110: Communication unit

130: Storage unit 131: Fingerprint database unit

133: Anti-phishing bot profile database unit

150: Control unit 151: Comparison unit

153: Blocking unit 155: Past history confirmation unit

157: Profile management unit 159: Classification unit

Claims

A method for preventing phishing attacks in a phishing attack prevention device for protecting a user browser from phishing attacks by phishing websites using fingerprinting-based concealment techniques, the method comprising:

accessing a URL (Uniform Resource Locator) through the user browser;

disguising the user browser as a crawler by setting fingerprinting-based cloaking in the user browser; and

receiving web page content from a phishing website corresponding to the URL.
The method according to claim 1,

wherein disguising the user browser as a crawler comprises:

comparing the URL with a blacklist prepared in advance;

blocking access to the URL if the URL is included in the blacklist; and

checking the past history of the URL by querying a fingerprint database prepared in advance if the URL is not included in the blacklist.
The method according to claim 2,

wherein, in disguising the user browser as a crawler,

if a history of successfully preventing phishing attacks against the URL is confirmed in checking the past history, the profile of the user browser is maintained and the user browser is disguised as the crawler, and

if a history of successfully preventing phishing attacks against the URL is not confirmed, the profile of the user browser is changed and the user browser is disguised as the crawler.
The method according to claim 3,

wherein changing the profile of the user browser comprises

changing the bot profile included in the profile of the user browser based on an anti-phishing bot profile database prepared in advance.
The method according to claim 3,

further comprising, after disguising the user browser as a crawler, sending an HTTP (HyperText Transfer Protocol) request including the maintained or changed profile of the user browser to the server of the phishing website.
The method according to claim 5,

further comprising:

after receiving the web page content, classifying phishing content among the web page content using a classification engine running in the background; and

updating the fingerprint database using the accessed URL, the changed user browser profile, and a classification result of the web page content.
A computer-readable recording medium on which a computer program for performing the method for preventing phishing attacks according to claim 1 is recorded.
A phishing attack prevention device for protecting a user browser from phishing attacks by phishing websites using fingerprinting-based concealment techniques, the device comprising:

a communication unit that accesses a URL through the user browser and receives web page content from a phishing website corresponding to the URL; and

a control unit that sets fingerprinting-based cloaking in the user browser and disguises the user browser as a crawler.
The device according to claim 8,

wherein the control unit comprises:

a comparison unit that compares the URL with a blacklist prepared in advance;

a blocking unit that blocks access to the URL when the URL is included in the blacklist; and

a past history confirmation unit that checks the past history of the URL by querying a fingerprint database prepared in advance when the URL is not included in the blacklist.
The device according to claim 9,

wherein the control unit,

if the past history confirmation unit confirms a history of successfully preventing phishing attacks against the URL, maintains the profile of the user browser and disguises the user browser as the crawler, and

further comprises a profile management unit that, if a history of successfully preventing phishing attacks against the URL is not confirmed, changes the profile of the user browser to disguise the user browser as the crawler.
The device according to claim 10,

wherein the profile management unit

changes the bot profile included in the profile of the user browser based on an anti-phishing bot profile database prepared in advance.
The device according to claim 10,

wherein the control unit,

after disguising the user browser as the crawler, sends an HTTP (HyperText Transfer Protocol) request including the maintained or changed profile of the user browser to the server of the phishing website.
The device according to claim 12,

wherein the control unit

further comprises a classification unit that classifies phishing content among the web page content using a classification engine running in the background after the web page content is received, and

wherein the control unit

updates the fingerprint database using the accessed URL, the changed user browser profile, and a classification result of the web page content.