CN116032508A

CN116032508A - A method for automatic whitelist detection of phishing attacks based on process control

Info

Publication number: CN116032508A
Application number: CN202111255230.9A
Authority: CN
Inventors: 林薇
Original assignee: Nanjing Liancheng Technology Development Co ltd
Current assignee: Nanjing Liancheng Technology Development Co ltd
Priority date: 2021-10-27
Filing date: 2021-10-27
Publication date: 2023-04-28

Abstract

The invention discloses a method for automatic whitelist detection of phishing attacks based on process control. The method serves as an anti-reconnaissance technique within process control and comprises a matching module, a user confirmation module and a whitelist database update module. The matching module comprises a URL matching sub-module and a DNS matching sub-module. The user confirmation module determines whether a web page is a phishing page by extracting its hyperlinks and then applying a phishing detection algorithm, which includes Algorithm 1 for checking whether a hyperlink is legitimate or phishing: if it is phishing, the system warns the user; if it is legitimate, the system updates the whitelist database. The whitelist database update module writes the legitimate hyperlinks of a web page visited for the first time into the whitelist database, including the IP address and the DNS domain name. The invention can improve the detection rate of phishing attacks, thereby reducing cybercrime and avoiding property loss and the disclosure of confidential information to the public.

Description

A method for automatic whitelist detection of phishing attacks based on process control

Technical Field

The present invention relates to the technical fields of network security, SOC (security operations center), phishing, trusted centralized management and control, data collection, operating systems, file systems and data encryption, and in particular to a method for automatic whitelist detection of phishing attacks based on process control.

Background Art

The use of cyberspace keeps growing because it plays an important role in today's business and commercial activities, providing many online services that simplify people's daily lives. These services let people access information anywhere. For example, online banking and online shopping have become very popular as people have grown accustomed to them. The ubiquity of Internet information sharing has also brought various forms of attack, the most prominent of which is phishing.

Phishing can be succinctly defined as fraudulent and malicious behavior, and hackers often use it to reconnoiter a target network. According to the process control model (Figure 2), reconnaissance is the first step of an attack. Anti-phishing technology can keep hackers out of the target network so that they cannot get through the "door" and cause no damage or loss to the enterprise network, thereby protecting its normal operation. Research on anti-phishing technology is therefore crucial.

Existing anti-phishing techniques suffer from low detection rates. Blacklists are the most commonly used detection method. A blacklist contains phishing websites; however, maintaining it requires substantial resources to report and verify suspicious websites, and because new phishing websites keep emerging, a global blacklist is difficult to maintain. A whitelist, by contrast, contains legitimate websites; but like a blacklist, a global whitelist is difficult to maintain. It is impossible to build a whitelist database that contains all genuine legitimate websites, because their number is enormous and growing rapidly.

Summary of the Invention

To solve the above technical problems, the present invention provides a method for automatic whitelist detection of phishing attacks based on process control. It uses an automatic-whitelist algorithm to detect phishing attacks and improve their detection rate, thereby reducing cybercrime and avoiding property loss and the disclosure of confidential information to the public.

A method for automatic whitelist detection of phishing attacks based on process control, characterized in that it is an anti-reconnaissance technique used in process control and includes a matching module, a user confirmation module and a whitelist database update module. The matching module includes a URL matching submodule and a DNS matching submodule. The user confirmation module confirms whether a web page is a phishing page by extracting its hyperlinks and then applying a phishing detection algorithm. The phishing detection algorithm includes Algorithm 1, which checks whether a hyperlink is legitimate or phishing: if it is phishing, the system warns the user; if it is legitimate, the system updates the whitelist database. The whitelist database update module writes the legitimate hyperlinks of a web page visited for the first time into the whitelist database, including the IP address and the DNS domain name. The method also includes the following steps:

(1) If the user is visiting the web page for the first time, user confirmation is performed based on Algorithm 1 to decide whether the web page is a phishing page. If it is, the system warns the user; if the web page is legitimate, the system updates the whitelist database.

(2) If the user is not visiting the web page for the first time, the URL matches the whitelist. If the DNS also matches, the website is legitimate; otherwise it is a phishing website, and a warning is issued to the user.
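As a minimal sketch of steps (1) and (2), the following Python models the whitelist as a mapping from DNS domain name to the IP address recorded on the first legitimate visit. The `Whitelist` class, its `check` method, the `is_legitimate` callback standing in for Algorithm 1, and the choice to key entries by host name are all assumptions made for illustration; the text does not specify a data layout.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict
from urllib.parse import urlparse


@dataclass
class Whitelist:
    # Hypothetical layout: DNS domain name -> IP address stored on first visit.
    entries: Dict[str, str] = field(default_factory=dict)

    def check(self, url: str, resolved_ip: str,
              is_legitimate: Callable[[str], bool]) -> str:
        """Return "legitimate" or "phishing" for a visited URL."""
        domain = (urlparse(url).hostname or "").lower()
        if domain not in self.entries:
            # Step (1): first visit, so defer to Algorithm 1 (the callback).
            if is_legitimate(url):
                self.entries[domain] = resolved_ip  # update the whitelist
                return "legitimate"
            return "phishing"  # the caller would warn the user here
        # Step (2): the URL matched; the DNS/IP mapping must match as well.
        if self.entries[domain] == resolved_ip:
            return "legitimate"
        return "phishing"
```

A real deployment would need to tolerate sites that resolve to several IP addresses, which this sketch does not attempt.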

Algorithm 1 determines the whitelist through a combined analysis of actual links and visual links. It can also compute the similarity to known trusted websites, and it makes the final decision on the information extracted from the hyperlinks, which can likewise be obtained from the URL provided by the user. Hyperlinks are extracted because a phishing website copies page content from the targeted original or legitimate web page, so the phishing page may contain many forged and imitated hyperlinks pointing to the targeted legitimate page. Some URLs in phishing databases redirect to their corresponding original or legitimate websites, whereas a genuine web page does not point to phishing pages. The phishing detection algorithm decides the status of any URL based on three indicators: empty links present in the source code, web pages that contain no hyperlinks, and external links present in the source code.

Further, regarding the empty links present in the source code, that is, web pages containing null pointers: a link that does not point to any web page or document is called an empty link or null pointer. It is usually written as <a href="#">, and clicking it returns to the same page. Attackers use null pointers to serve their ulterior purposes.
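The empty-link indicator can be illustrated with Python's standard `html.parser` module. This is a sketch under assumptions: the names `LinkCollector` and `count_null_links` are hypothetical, and treating an empty `href` value as a null link alongside `#` goes beyond what the text states.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# "#" comes from the text; the empty string is an added assumption.
NULL_TARGETS = {"#", ""}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> element in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append((value or "").strip())
                    break


def count_null_links(html: str) -> Tuple[int, int]:
    """Return (number of null links, total number of links)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    nulls = sum(1 for href in parser.hrefs if href in NULL_TARGETS)
    return nulls, len(parser.hrefs)
```

A total of zero in the second element of the returned pair corresponds to a page from which no hyperlink can be extracted.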

Further, regarding web pages that contain no hyperlinks: at least one hyperlink can easily be extracted from a legitimate website. If the total number of extracted links is zero, that is, no hyperlink can be extracted, the web page is regarded as a phishing page.

Further, regarding the external links present in the source code: Algorithm 1 decides based on the extracted hyperlinks. On a legitimate page, most hyperlinks point to the page's own domain, whereas on a phishing page most hyperlinks point to the targeted domain or other external domains. Algorithm 1 counts the total number of links extracted from the web page source code and the number of links pointing to external domains, and selects a suitable threshold for the ratio. The nature of the hyperlinks is determined by the following equation:

$$\text{Ratio} = \frac{ND_i}{\sum L}$$

where ND_i is the total number of links pointing to the page's own domain and ∑L is the total number of links extracted from the source of the suspicious web page.
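The ratio of own-domain links to all extracted links can be sketched in Python with the standard `urllib.parse` module. The function names, the comparison of exact host names (rather than registered domains), and the default threshold of 0.5 are assumptions for illustration; the text only says that a suitable threshold is selected.

```python
from typing import Iterable, Optional
from urllib.parse import urljoin, urlparse


def own_domain_ratio(page_url: str, hrefs: Iterable[str]) -> Optional[float]:
    """Return ND_i / sum(L), or None when the page has no links."""
    page_host = (urlparse(page_url).hostname or "").lower()
    total = 0  # sum(L): all links extracted from the page
    own = 0    # ND_i: links pointing to the page's own domain
    for href in hrefs:
        # Resolve relative links against the page URL before comparing hosts.
        host = (urlparse(urljoin(page_url, href)).hostname or "").lower()
        total += 1
        if host == page_host:
            own += 1
    if total == 0:
        return None
    return own / total


def looks_legitimate(ratio: Optional[float], threshold: float = 0.5) -> bool:
    # The threshold value is a placeholder, not taken from the source.
    return ratio is not None and ratio >= threshold
```

Under these assumptions, a page whose links mostly resolve to its own host passes the check, while a page whose links mostly point to another domain does not.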

The technical effects of the present invention are as follows:

The present invention provides a method for automatic whitelist detection of phishing attacks based on process control, characterized in that it is an anti-reconnaissance technique used in process control and includes a matching module, a user confirmation module and a whitelist database update module. The matching module includes a URL matching submodule and a DNS matching submodule. The user confirmation module confirms whether a web page is a phishing page by extracting its hyperlinks and then applying a phishing detection algorithm. The phishing detection algorithm includes Algorithm 1, which checks whether a hyperlink is legitimate or phishing: if it is phishing, the system warns the user; if it is legitimate, the system updates the whitelist database. The whitelist database update module writes the legitimate hyperlinks of a web page visited for the first time into the whitelist database, including the IP address and the DNS domain name. The invention improves the detection rate of phishing attacks, thereby reducing cybercrime and avoiding property loss and the disclosure of confidential information to the public.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of the phishing attack life cycle for the method for automatic whitelist detection of phishing attacks based on process control;

FIG. 2 is a schematic diagram of process control for the method for automatic whitelist detection of phishing attacks based on process control;

FIG. 3 is a structural schematic diagram of the method for automatic whitelist detection of phishing attacks based on process control;

FIG. 4 is a schematic diagram of Algorithm 1 of the method for detecting phishing attacks based on an automatic whitelist.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and examples.

FIG. 1 is a schematic diagram of the phishing attack life cycle for the method for automatic whitelist detection of phishing attacks based on process control. A disguised web page usually carries a Trojan horse program or a similar payload. A phishing attack involves the following steps:

1. The attacker copies content from the website of a well-known company or bank and creates a phishing website. The attacker keeps the phishing website visually similar to the corresponding legitimate website in order to attract more users.

2. The attacker composes a message, such as an email, that includes a link to the phishing website and sends it to a large number of users or to selected target users.

3. When the user opens the email and visits the disguised website, the Trojan horse program or other payload embedded in the disguised website is activated. The present application performs detection before the user visits the disguised website: if the website is legitimate, the user is allowed to access it; if it is a phishing website, an alert is issued.

4. The attacker delivers the Trojan horse program, that is, the weaponized payload ("weaponization"), to the target network through the disguised website, followed by installation, privilege escalation, and so on.

FIG. 2 is a schematic diagram of process control for the method for automatic whitelist detection of phishing attacks based on process control. Process control comprises three stages:

Stage 1: the network stage of the process control model (reconnaissance and delivery). The enterprise network system operates normally, without any intrusion. In this stage, hackers or attackers use techniques such as phishing attacks to reconnoiter the target network; the present application provides an anti-reconnaissance technique for detecting such phishing attacks.

Stage 2: the endpoint stage of the process control model (installation and privilege escalation). From the start of this stage the system is under constant threat: the attacker is inside the enterprise network but does not fully control it.

Stage 3: the domain stage, or withdrawal stage, of the process control model (lateral movement, actions on objectives, and withdrawal). The attacker escalates privileges and takes full control of machines, and can delete and manipulate logs to erase traces of the attack.

Specifically, in the reconnaissance stage hackers actively or passively gather information that can be used to support targeting. Such information may include details about the victim enterprise, its critical infrastructure, or its staff. Hackers can use this information in other stages of the attack life cycle, for example to plan and execute delivery, to determine the scope and priority of post-intrusion objectives, or to drive and direct further reconnaissance.

FIG. 3 is a structural schematic diagram of the method for automatic whitelist detection of phishing attacks based on process control. The method is characterized in that it includes a matching module, a user confirmation module and a whitelist database update module. The matching module includes a URL matching submodule and a DNS matching submodule. The user confirmation module confirms whether a web page is a phishing page by extracting its hyperlinks and then applying a phishing detection algorithm. The phishing detection algorithm includes Algorithm 1, which checks whether a hyperlink is legitimate or phishing: if it is phishing, the system warns the user; if it is legitimate, the system updates the whitelist database. The whitelist database update module writes the legitimate hyperlinks of a web page visited for the first time into the whitelist database, including the IP address and the DNS domain name. The method also includes the following steps:

(1) If the user is visiting the web page for the first time, user confirmation is performed based on Algorithm 1 to decide whether the web page is a phishing page. If it is, the system warns the user; if the web page is legitimate, the system updates the whitelist database.

(2) If the user is not visiting the web page for the first time, the URL matches the whitelist. If the DNS also matches, the website is legitimate; otherwise it is a phishing website, and a warning is issued to the user.

FIG. 4 is a schematic diagram of Algorithm 1 of the method for detecting phishing attacks based on an automatic whitelist. Algorithm 1 determines the whitelist through a combined analysis of actual links and visual links. It can also compute the similarity to known trusted websites, and it makes the final decision on the information extracted from the hyperlinks, which can likewise be obtained from the URL provided by the user. Hyperlinks are extracted because a phishing website copies page content from the targeted original or legitimate web page, so the phishing page may contain many forged and imitated hyperlinks pointing to the targeted legitimate page. Some URLs in phishing databases redirect to their corresponding original or legitimate websites, whereas a genuine web page does not point to phishing pages. The phishing detection algorithm decides the status of any URL based on three indicators: empty links present in the source code, web pages that contain no hyperlinks, and external links present in the source code.

Further, regarding web pages that contain no hyperlinks: at least one hyperlink can easily be extracted from a legitimate website. If the total number of extracted links is zero, that is, no hyperlink can be extracted, the web page is regarded as a phishing page. Attackers create null pointers in fake web pages for the following two reasons:

1. The first reason is to create live-looking hyperlinks that lead nowhere. A genuine website contains many web pages, whereas a fake website contains very few. To pass as a legitimate web page, the attacker therefore creates a fake page and puts null values in its hyperlinks. When the user moves the mouse over these empty links, they appear to be active.

2. The second reason is to exploit web browser vulnerabilities using JavaScript combined with empty links. The attacker builds a hyperlink so that when the user moves the mouse over it, the browser shows something other than the actual target. In the example, the link looks like www.example1.org, but the real domain is http://example2.org. With href="#", the link is active and points to the same location, so the onClick attribute can be triggered.
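The second case can be illustrated with a short Python sketch that flags anchors combining `href="#"` with an `onclick` handler. The names `AnchorAuditor` and `suspicious_anchors` are hypothetical, and the rule itself (flag every such anchor) is an assumption for illustration rather than a step stated in the text; many legitimate pages also use this pattern, so a real detector would need further evidence.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class AnchorAuditor(HTMLParser):
    """Records href, onclick handler and visible text for each <a> element."""

    def __init__(self) -> None:
        super().__init__()
        self.anchors: List[Dict[str, str]] = []
        self._current: Optional[Dict[str, str]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Attribute names arrive lowercased, so onClick becomes onclick.
            attr_map = {name: (value or "") for name, value in attrs}
            self._current = {
                "href": attr_map.get("href", "").strip(),
                "onclick": attr_map.get("onclick", "").strip(),
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.anchors.append(self._current)
            self._current = None


def suspicious_anchors(html: str) -> List[Dict[str, str]]:
    """Return anchors whose href is "#" but which carry an onclick handler."""
    auditor = AnchorAuditor()
    auditor.feed(html)
    auditor.close()
    return [a for a in auditor.anchors if a["href"] == "#" and a["onclick"]]
```

Applied to hypothetical markup whose visible text reads www.example1.org while the handler navigates to http://example2.org, the function returns that anchor together with its handler text, which can then be compared against the displayed link.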

The above is only a preferred embodiment of the present invention and is not intended to limit its scope of implementation; all equivalent changes and modifications made according to the present invention are deemed to be covered by the patent scope of the present invention.

Claims

1. A method for automatic whitelist detection of phishing attacks based on process control, characterized in that it is an anti-reconnaissance technique used in process control and includes a matching module, a user confirmation module and a whitelist database update module; the matching module includes a URL matching submodule and a DNS matching submodule; the user confirmation module confirms whether a web page is a phishing page by extracting its hyperlinks and then applying a phishing detection algorithm; the phishing detection algorithm includes Algorithm 1, which checks whether a hyperlink is legitimate or phishing, wherein if it is phishing, the system warns the user, and if it is legitimate, the system updates the whitelist database; the whitelist database update module writes the legitimate hyperlinks of a web page visited for the first time into the whitelist database, including the IP address and the DNS domain name; the method also includes the following steps:

(1) If the user is visiting the web page for the first time, user confirmation is performed based on Algorithm 1 to decide whether the web page is a phishing page. If it is, the system warns the user; if the web page is legitimate, the system updates the whitelist database.

(2) If the user is not visiting the web page for the first time, the URL matches the whitelist. If the DNS also matches, the website is legitimate; otherwise it is a phishing website, and a warning is issued to the user.

Algorithm 1 determines the whitelist through a combined analysis of actual links and visual links. It can also compute the similarity to known trusted websites, and it makes the final decision on the information extracted from the hyperlinks, which can likewise be obtained from the URL provided by the user. Hyperlinks are extracted because a phishing website copies page content from the targeted original or legitimate web page, so the phishing page may contain many forged and imitated hyperlinks pointing to the targeted legitimate page. Some URLs in phishing databases redirect to their corresponding original or legitimate websites, whereas a genuine web page does not point to phishing pages. The phishing detection algorithm decides the status of any URL based on three indicators: empty links present in the source code, web pages that contain no hyperlinks, and external links present in the source code.

2. The method for automatic whitelist detection of phishing attacks based on process control as described in claim 1, characterized in that, regarding the empty links present in the source code, that is, web pages containing null pointers: a link that does not point to any web page or document is called an empty link or null pointer; it is usually written as <a href="#">, and clicking it returns to the same page; attackers use null pointers to serve their ulterior purposes.

3. The method for automatic whitelist detection of phishing attacks based on process control as described in claim 1, characterized in that, regarding web pages that contain no hyperlinks: at least one hyperlink can easily be extracted from a legitimate website; if the total number of extracted links is zero, that is, no hyperlink can be extracted, the web page is regarded as a phishing page.

4. The method for automatic whitelist detection of phishing attacks based on process control as described in claim 1, characterized in that, regarding the external links present in the source code: Algorithm 1 decides based on the extracted hyperlinks. On a legitimate page, most hyperlinks point to the page's own domain, whereas on a phishing page most hyperlinks point to the targeted domain or other external domains. Algorithm 1 counts the total number of links extracted from the web page source code and the number of links pointing to external domains, and selects a suitable threshold for the ratio. The nature of the hyperlinks is determined by the following equation:

$$\text{Ratio} = \frac{ND_i}{\sum L}$$

where ND_i is the total number of links pointing to the page's own domain, and ∑L is the total number of links extracted from the source of the suspicious web page.