CN115603924A - Detection method and device for phishing mails, electronic equipment and storage medium - Google Patents

Publication number
CN115603924A
Authority
CN
China
Prior art keywords
domain name
url
phishing
mail
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110723018.4A
Other languages
Chinese (zh)
Inventor
宁阳
闫凡
郜振锋
郑景中
王雄
许云中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd
Priority to CN202110723018.4A
Publication of CN115603924A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H04L63/1483: Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application discloses a method for detecting phishing mails, comprising: acquiring at least one Uniform Resource Locator (URL) link in a target mail, and determining at least one first-level domain name in the URL link; deleting known domain names from the at least one first-level domain name to obtain a suspicious URL set, wherein the known domain names are the intersection of the at least one first-level domain name and a set of safe domain names; and, if the suspicious URL set is not empty, outputting a phishing mail detection result according to the similarity between the suspicious URL set and safe URL links. The method can improve the accuracy of phishing mail detection. The application also discloses a phishing mail detection device, an electronic device and a storage medium, which have the same beneficial effects.

Description

Detection method and device for phishing mails, electronic equipment and storage medium
Technical Field
The present application relates to the field of network security technologies, and in particular, to a method and an apparatus for detecting phishing mails, an electronic device, and a storage medium.
Background
A phishing mail typically tricks the recipient into replying to a designated address with an account number, password or similar information, or lures the recipient to a specially crafted web page. The web page that a phishing mail points to is usually disguised as a genuine website, such as a bank or financial page, so that the user believes it is real and enters a credit card or bank card number, account name, password and the like, which are then stolen. Detecting phishing mails is therefore a major concern for network security personnel.
In the related art, phishing mails are mainly identified by the mailbox name and the sender IP address. Because both are easily disguised and changed, this scheme detects phishing mails with low accuracy.
Therefore, how to improve the accuracy of detecting phishing mails is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The application aims to provide a method and a device for detecting phishing mails, electronic equipment and a storage medium, which can improve the accuracy rate of detecting the phishing mails.
In order to solve the above technical problem, the present application provides a method for detecting phishing mails, including:
acquiring at least one Uniform Resource Locator (URL) link in a target mail, and determining at least one primary domain name in the URL link;
deleting a known domain name in the at least one first-level domain name to obtain a suspicious URL set; wherein the known domain name is an intersection of the at least one primary domain name and a set of security domain names;
and if the suspicious URL set is not empty, outputting a phishing mail detection result according to the similarity between the suspicious URL set and the safe URL link.
Optionally, outputting a phishing mail detection result according to the similarity between the suspicious URL set and the safe URL link includes:
and if the similarity between the suspicious URL set and the safe URL link is within a first similarity interval, outputting a detection result that the target mail is a phishing mail.
Optionally, the method further includes:
if the similarity between the suspicious URL set and the safe URL link is not within the first similarity interval, performing character replacement on the suspicious URL set to obtain a new suspicious URL set, and judging whether the similarity between the new suspicious URL set and the safe URL link is within a second similarity interval.
Optionally, the method further includes:
and if the similarity between the new suspicious URL set and the safe URL link is within the second similarity interval, outputting the detection result that the target mail is the phishing mail.
Optionally, performing character replacement on the suspicious URL set to obtain a new suspicious URL set, including:
and performing homoglyph (look-alike character) replacement and/or punycode replacement on the suspicious URL set to obtain the new suspicious URL set.
Optionally, the method further includes:
if the similarity between the new suspicious URL set and the safe URL link is not within the second similarity interval, extracting the core keyword of each safe domain name in the safe domain name set;
judging whether the suspicious URL set comprises the core keyword or not;
and if so, judging that the target mail is a phishing mail.
Optionally, the extracting the core keyword of each security domain name in the security domain name set includes:
and extracting a difference set of the first-level domain name and the top-level domain name of each safe domain name in the safe domain name set as the core keyword.
Optionally, after determining at least one primary domain name in at least one of the URL links, the method further includes:
and removing repeated segments in the primary domain name.
Optionally, the method further includes:
and if the target mail is a phishing mail, adding the first-level domain name in the URL link to a URL blacklist, and marking the camouflage type of the URL link.
The application also provides a phishing mail detection device, which comprises:
the domain name determining module is used for acquiring at least one Uniform Resource Locator (URL) link in a target mail and determining at least one primary domain name in the URL link;
the suspicious URL determining module is used for deleting a known domain name in the at least one first-level domain name to obtain a suspicious URL set; wherein the known domain name is an intersection of the at least one primary domain name and a set of security domain names;
and the judging module is used for outputting a phishing mail detection result according to the similarity between the suspicious URL set and the safe URL link if the suspicious URL set is not empty.
The application also provides a storage medium, wherein a computer program is stored on the storage medium, and the computer program realizes the steps executed by the detection method of the phishing mails when executed.
The application also provides electronic equipment which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps executed by the detection method of the phishing mails when calling the computer program in the memory.
The application provides a phishing mail detection method, which comprises the following steps: acquiring at least one Uniform Resource Locator (URL) link in a target mail, and determining at least one primary domain name in the URL link; deleting a known domain name in the at least one first-level domain name to obtain a suspicious URL set; wherein the known domain name is an intersection of the at least one primary domain name and a set of security domain names; and if the suspicious URL set is not empty, outputting a phishing mail detection result according to the similarity between the suspicious URL set and the safe URL link.
The present application obtains at least one URL link in the target mail and compares at least one first-level domain name of the at least one URL link with the known domain names. Because the page jump of a phishing mail is mainly realized through a URL link, if the first-level domain name belongs to the safe domain name set, the address corresponding to the URL link is not a tampered web page. If the suspicious URL set is not empty, the URL link contains domain names outside the safe domain name set, and the safe URL links can then be used to judge the similarity between the suspicious URL set and the safe URL links. Because detection is based on the URL link content of the target mail, it is not affected by changes to the mailbox name or the sender IP address, so disguised phishing mails can be effectively identified and the accuracy of phishing mail detection is improved. The application also provides a phishing mail detection device, an electronic device and a storage medium, which have the same beneficial effects and are not described again here.
Drawings
In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a phishing mail detection method provided in an embodiment of the present application;
Fig. 2 is a flowchart of a phishing mail detection method based on URL link comparison provided in an embodiment of the present application;
Fig. 3 is a flowchart of a keyword-based phishing mail detection method provided in an embodiment of the present application;
Fig. 4 is a flowchart of a phishing mail detection method based on identifying confusing URLs provided in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a phishing mail detection device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of a phishing mail detection method according to an embodiment of the present application.
The specific steps may include:
S101: acquiring at least one URL (Uniform Resource Locator) link in a target mail, and determining at least one first-level domain name in the at least one URL link;
The method can be applied to electronic equipment such as a firewall, a security all-in-one machine or a mail server, and the target mail can be a mail of unknown type sent by another terminal. Before the URL link in the target mail is acquired, it may first be determined whether the target mail contains a URL link; if so, the relevant operations of S101 are executed, and if not, the target mail is judged to be a normal mail.
The target mail may include any number of URL links. After acquiring at least one URL link in the target mail, this embodiment may acquire at least one first-level domain name in the at least one URL link. Further, this embodiment can acquire all URL links in the target mail and determine all first-level domain names in each URL link, so as to improve the accuracy of phishing mail detection. The domain names in a URL link are separated by dots. Counting from right to left, all characters to the right of the first dot form the top-level domain name, all characters to the right of the second dot form the first-level domain name, and so on; each lower-level domain name contains the whole of the level above it. For example, in www.abc.def.com the top-level domain name is .com, the first-level domain name is .def.com, and the second-level domain name is .abc.def.com.
Further, after the first-level domain name in the URL link is determined, repeated segments in the first-level domain name may also be removed. Specifically, this embodiment may treat content in the URL link whose number of repeated bytes exceeds a preset value as a repeated segment. Deleting the repeated segments reduces the amount of computation in the detection process and improves the efficiency of phishing mail detection.
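The right-to-left counting of domain levels described for S101 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `domain_levels` is hypothetical, levels are written without the leading dot, and the sketch follows the text's simple dot-counting rule, so multi-label suffixes such as `.co.uk` are not treated specially.

```python
def domain_levels(host: str) -> list[str]:
    """Split a host name into cumulative levels, counted from the right.

    Index 0 is the top-level domain, index 1 the first-level domain,
    index 2 the second-level domain, and so on (naming as in the text).
    """
    labels = host.strip(".").lower().split(".")
    return [".".join(labels[-n:]) for n in range(1, len(labels) + 1)]


levels = domain_levels("www.abc.def.com")
top_level, first_level, second_level = levels[0], levels[1], levels[2]
```

For `www.abc.def.com` this yields `com`, `def.com` and `abc.def.com` as the top-level, first-level and second-level domain names, matching the example above.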
S102: deleting a known domain name in at least one first-level domain name to obtain a suspicious URL set;
After the first-level domain name of the URL link is determined, it may be matched against the safe domain names in the safe domain name set, and the intersection of the at least one first-level domain name and the safe domain name set is taken as the known domain names. In this step, the known domain names can be deleted from all first-level domain names of all URL links to obtain the suspicious URL set. A safe domain name is a domain name that is known to be secure.
S103: if the suspicious URL set is empty, judging that the target mail is a normal mail;
If the suspicious URL set is empty, every first-level domain name of the URL links is a safe domain name and can be accessed safely, so the target mail can be judged to be a normal mail.
S104: and if the suspicious URL set is not empty, outputting a phishing mail detection result according to the similarity between the suspicious URL set and the safe URL link.
If the suspicious URL set is not empty, the safety of some of the first-level domain names of the URL links is unknown. The contents of the suspicious URL set can then be compared for similarity with the safe URL links, and a phishing mail detection result is output according to the comparison result.
Specifically, if the similarity between the suspicious URL set and the safe URL link is high, it indicates that the content of the first-level domain name other than the known domain name is safe, and it may be determined that the target email is a normal email. If the similarity between the suspicious URL set and the safe URL link is low, the security of the content except the known domain name in the first-level domain name is unknown, at this time, the target mail can be judged to be a phishing mail, and the suspicious URL set can be further detected by adopting a character replacement and keyword matching method. The secure URL is a known secure URL.
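Steps S102 to S104 amount to a set difference followed by an emptiness check, which can be sketched as below. The function names and the string verdicts are illustrative assumptions; the similarity stage that follows a non-empty result is deliberately left out here.

```python
def suspicious_url_set(first_level_domains, safe_domains):
    """S102: delete the known domain names from the first-level domain names."""
    candidates = set(first_level_domains)
    known = candidates & set(safe_domains)  # intersection = known domain names
    return candidates - known


def filter_stage(first_level_domains, safe_domains):
    """S103/S104: an empty suspicious set means a normal mail; otherwise
    the suspicious set is handed to the similarity comparison."""
    suspicious = suspicious_url_set(first_level_domains, safe_domains)
    if not suspicious:
        return "normal", suspicious
    return "needs_similarity_check", suspicious


# Hypothetical safe-domain set and mail domains, for illustration only.
safe = {"abc.com", "example.com"}
verdict, suspicious = filter_stage(["abc.com", "abc-login.net"], safe)
```

With these inputs only `abc-login.net` remains in the suspicious set, so the mail proceeds to the similarity comparison.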
As a possible implementation mode, after the target mail is detected to be the phishing mail, the first-level domain name in the URL link can be added to the URL blacklist, and the camouflage type of the URL link is marked. Specifically, the above-mentioned primary domain name added to the URL blacklist is a primary domain name that makes the target mail determined as a phishing mail. The camouflage types may include: domain name similarity camouflage, domain name character replacement camouflage and domain name keyword camouflage. When receiving the mail, the mail containing the URL link can be screened by utilizing the URL blacklist, and the network security is improved.
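One way to picture the blacklist bookkeeping is a small mapping from blacklisted first-level domain names to their camouflage type. The class and method names below are hypothetical; only the three camouflage types come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class CamouflageType(Enum):
    DOMAIN_SIMILARITY = "domain name similarity camouflage"
    CHARACTER_REPLACEMENT = "domain name character replacement camouflage"
    KEYWORD = "domain name keyword camouflage"


@dataclass
class UrlBlacklist:
    # Maps a blacklisted first-level domain name to its camouflage type.
    entries: dict = field(default_factory=dict)

    def add(self, first_level_domain: str, camouflage: CamouflageType) -> None:
        self.entries[first_level_domain.lower()] = camouflage

    def blocks(self, first_level_domains) -> bool:
        """Return True if any domain of an incoming mail is blacklisted."""
        return any(d.lower() in self.entries for d in first_level_domains)


blacklist = UrlBlacklist()
blacklist.add("paypa1.com", CamouflageType.DOMAIN_SIMILARITY)
```

Incoming mails can then be screened with `blacklist.blocks(...)` before the full detection flow runs.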
In this embodiment, at least one URL link in the target mail is obtained, and at least one first-level domain name of the at least one URL link is compared with the known domain names. Because the page jump of a phishing mail is mainly realized through a URL link, if the first-level domain name belongs to the safe domain name set, the address corresponding to the URL link is not a tampered web page. If the suspicious URL set is not empty, the URL link contains domain names outside the safe domain name set, and the safe URL links can then be used to judge the similarity between the suspicious URL set and the safe URL links. Because detection is based on the URL link content of the target mail, it is not affected by changes to the mailbox name or the sender IP address, so disguised phishing mails can be effectively identified and the accuracy of phishing mail detection is improved.
Referring to fig. 2, fig. 2 is a flowchart of a method for detecting phishing mails based on URL link comparison according to an embodiment of the present application, where this embodiment is a further description of the method for detecting phishing mails when a suspicious URL set is not empty in the embodiment corresponding to fig. 1, and a further implementation may be obtained by combining this embodiment with the embodiment corresponding to fig. 1, where this embodiment may include the following steps:
S201: calculating the similarity between the suspicious URL set and the safe URL link.
S202: judging whether the similarity between the suspicious URL set and the safe URL link is within a first similarity interval or not; if yes, entering S206; if not, the process proceeds to S203.
S203: and carrying out character replacement on the suspicious URL set to obtain a new suspicious URL set.
S204: judging whether the similarity between the new suspicious URL set and the safe URL link is within a second similarity interval or not; if yes, entering S206; if not, the process proceeds to S205.
S205: and judging that the target mail is a normal mail.
S206: and judging the target mail as the phishing mail.
It will be appreciated that, when creating phishing mails, hackers often disguise the URL link by using an illegal domain name that closely resembles a safe domain name, so that users do not easily notice it. Therefore, in the above embodiment, the safe URL link is compared for similarity with the suspicious URL set, and if the similarity is within the first similarity interval, the detection result that the target mail is a phishing mail is output. If the similarity is not within the first similarity interval, character replacement is performed on the suspicious URL set in order to catch phishing mails forged by means of character replacement. Specifically, in this embodiment, the new suspicious URL set may be obtained by performing character replacement on the suspicious URL set using punycode (a domain name encoding), or by performing character replacement using homoglyphs (look-alike characters). If the similarity between the new suspicious URL set and the safe URL link is within the second similarity interval, the detection result that the target mail is a phishing mail is output. The first similarity interval may be 70% to 95%, and the second similarity interval may be 60% to 85%.
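The two-stage comparison of S201 to S206 can be sketched as follows. Several points are assumptions and not taken from the text: the similarity metric (here `difflib.SequenceMatcher.ratio`, since the text does not name one), the tiny look-alike table `HOMOGLYPHS`, and all function names. The two intervals are the example values quoted above.

```python
from difflib import SequenceMatcher

FIRST_INTERVAL = (0.70, 0.95)   # example values from the text
SECOND_INTERVAL = (0.60, 0.85)

# Hypothetical, deliberately tiny look-alike table (Cyrillic -> Latin).
HOMOGLYPHS = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p"})


def similarity(a: str, b: str) -> float:
    """Assumed metric: difflib's ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def in_interval(value: float, interval) -> bool:
    low, high = interval
    return low <= value <= high


def replace_characters(domain: str) -> str:
    """S203: decode punycode labels, then fold look-alike characters."""
    try:
        decoded = domain.encode("ascii").decode("idna")
    except UnicodeError:
        decoded = domain  # not pure ASCII or not valid IDNA: keep as-is
    return decoded.translate(HOMOGLYPHS)


def is_phishing_by_similarity(suspicious, safe_links) -> bool:
    for domain in suspicious:
        # S201/S202: compare the raw suspicious entry with each safe link.
        if any(in_interval(similarity(domain, s), FIRST_INTERVAL) for s in safe_links):
            return True
        # S203/S204: compare again after character replacement.
        replaced = replace_characters(domain)
        if any(in_interval(similarity(replaced, s), SECOND_INTERVAL) for s in safe_links):
            return True
    return False
```

Under this assumed metric, `paypa1.com` scores 0.9 against `paypal.com` and is flagged in the first stage. Note that with the quoted intervals an entry that becomes identical to a safe link after replacement scores 1.0 and falls outside the second interval; the text does not say how that case is handled.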
Referring to fig. 3, fig. 3 is a flowchart of a method for detecting phishing mails based on keywords according to an embodiment of the present application, which is a further description of the method for identifying phishing mails in the embodiment corresponding to fig. 2, and a further implementation manner can be obtained by combining the embodiment with the embodiment corresponding to fig. 2, where the embodiment may include the following steps:
S301: if the similarity between the new suspicious URL set and the safe URL link is not within the second similarity interval, extracting the core keyword of each safe domain name in the safe domain name set;
After the suspicious URL set is obtained, this embodiment may extract the difference set of the first-level domain name and the top-level domain name of each safe domain name in the safe domain name set as the core keyword. For example, for the safe domain name www.abc.com, the difference between the first-level domain name, abc.com, and the top-level domain name, com, is abc.
S302: judging whether the suspicious URL set comprises the core keyword or not; if yes, entering S303; if not, entering S304;
S303: judging that the target mail is a phishing mail;
S304: judging that the target mail is a normal mail.
If the similarity between the suspicious URL set and the safe URL link is low and the similarity between the suspicious URL set after character replacement and the safe URL link is low, a situation that a hacker forges the phishing mail by adding the core keyword may exist. Therefore, whether the suspicious URL set comprises the core keywords or not is further judged, and the coverage rate of phishing mail detection is improved.
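The keyword stage (S301 to S304) can be sketched as below. The function names are hypothetical, and plain substring matching is an assumption, since the text only says the suspicious URL set "comprises" the core keyword.

```python
def core_keyword(safe_domain: str) -> str:
    """First-level domain minus top-level domain, e.g. abc.com minus com."""
    labels = safe_domain.strip(".").lower().split(".")
    if len(labels) < 2:
        raise ValueError(f"no first-level domain in {safe_domain!r}")
    return labels[-2]


def contains_core_keyword(suspicious, safe_domains) -> bool:
    """S302: True if any suspicious entry contains any core keyword."""
    keywords = {core_keyword(d) for d in safe_domains}
    return any(k in entry.lower() for entry in suspicious for k in keywords)


keyword = core_keyword("www.abc.com")
```

Here `keyword` is `abc`, so a suspicious entry such as `abc-login.example.net` would lead to S303, while `example.net` alone would lead to S304.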
In the related art, phishing mail detection generally relies on social engineering analysis, UEBA technology, binary text classification models and similar approaches. These approaches cannot reliably detect phishing mails in which hackers have carefully constructed confusing URLs, and at present there is no phishing mail detection technology dedicated to identifying confusing URLs.
The flow described in the above embodiments is illustrated below with a practical example. Referring to fig. 4, fig. 4 is a flowchart of a phishing mail detection method based on identifying confusing URLs provided in an embodiment of the present application; this embodiment provides a phishing mail detection method that identifies confusing URLs, filling the current gap. In this embodiment, the URLs in a mail are first detected and a first-level domain name is extracted from each URL. It is then judged whether the first-level domain name exists in a locally loaded set of well-known domain names (namely the safe domain name set). If not, the similarity between the first-level domain name and each locally stored well-known domain name (namely safe domain name) is calculated one by one; if the similarity is high, the mail is a high-risk phishing mail. If the similarity is low, the core keywords (the difference between the first-level domain name and the top-level domain name) are loaded and it is judged whether a core keyword exists in the URL link of the mail; if so, the mail can be judged to be a high-risk phishing mail. This embodiment may include the following steps:
Step 1: loading the mail log data of a user and detecting whether a URL link exists in the mail body; if yes, entering Step 2; if not, directly judging that the mail is a normal mail.
Step 2: de-duplicating the URL links and obtaining the corresponding first-level domain names; if the first-level domain names intersect the known domain name set, deleting the intersection and taking the remaining content as the suspicious URL set; if the suspicious URL set is empty, directly judging that the mail is a normal mail.
Step 3: traversing the suspicious URL set item by item and calculating the similarity of each item to the well-known URL links; if the similarity is within the upper and lower thresholds, judging that the mail is a high-risk phishing mail; otherwise, continuing with the next detection.
Step 4: performing punycode-based similar-character replacement on the items of the suspicious URL set one by one, then calculating the similarity between each replaced item and the well-known URLs; if the similarity is within the upper and lower thresholds, judging that the mail is a high-risk phishing mail; otherwise, continuing with the next detection.
Step 5: loading the core keywords of the well-known domain names locally (each extracted as the difference set of the first-level domain name and the top-level domain name) and judging whether a keyword exists in the suspicious URL set; if so, judging that the mail is a phishing mail; otherwise, judging that the mail is a normal mail.
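Steps 1 and 2 above (finding URL links in the mail body, de-duplicating them and reducing them to first-level domain names) can be sketched as follows. The regular expression, the function names and the sample mail body are illustrative assumptions; the sketch uses the text's two-label rule for the first-level domain and does not handle IP-address hosts or multi-label suffixes.

```python
import re
from urllib.parse import urlsplit

# Assumed pattern: http(s) links up to whitespace, quotes or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def first_level_domain(url: str):
    """Return the last two labels of the URL's host, or None if unavailable."""
    try:
        host = (urlsplit(url).hostname or "").rstrip(".")
    except ValueError:
        return None  # malformed URL, e.g. a broken IPv6 literal
    labels = host.split(".")
    if len(labels) < 2:
        return None
    return ".".join(labels[-2:])


def extract_first_level_domains(body: str) -> set:
    """Step 1 and the first half of Step 2: find, de-duplicate, reduce."""
    urls = set(URL_PATTERN.findall(body))
    return {d for d in map(first_level_domain, urls) if d}


def suspicious_from_mail(body: str, known_domains) -> set:
    """Second half of Step 2: an empty result means a normal mail."""
    return extract_first_level_domains(body) - set(known_domains)


body = ("Log in at http://www.abc.com/login or "
        "HTTP://mail.evil-abc.net/a and http://www.abc.com/login.")
result = suspicious_from_mail(body, {"abc.com"})
```

For this sample body the only remaining suspicious entry is `evil-abc.net`, which would then be passed to Steps 3 to 5.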
Existing phishing mail detection methods do not match the page content of URLs in the mail against the page content of well-known URLs for similarity, do not apply punycode or homoglyph replacement, and do not use the difference set of the first-level domain name and the top-level domain name to detect confusing URLs. However, in many current phishing mails hackers imitate well-known URLs, using punycode and homoglyph replacement as well as placing the keywords of well-known domain names elsewhere in the URL, to win the user's trust and cause user account information to be leaked. By identifying these URL confusion techniques, phishing mail attacks can be prevented and the user's basic interests protected from loss.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a detection device for phishing mails according to an embodiment of the present application;
the apparatus may include:
a domain name determining module 501, configured to obtain at least one uniform resource locator URL link in a target email, and determine at least one primary domain name in the at least one URL link;
a suspicious URL determining module 502, configured to delete a known domain name in the at least one first-level domain name to obtain a suspicious URL set; wherein the known domain name is an intersection of the at least one primary domain name and a set of security domain names;
and the judging module 503 is configured to output a phishing mail detection result according to the similarity between the suspicious URL set and the secure URL link if the suspicious URL set is not empty.
In this embodiment, at least one URL link in the target mail is obtained, and at least one first-level domain name of the at least one URL link is compared with the known domain names. Because the page jump of a phishing mail is mainly realized through a URL link, if the first-level domain name belongs to the safe domain name set, the address corresponding to the URL link is not a tampered web page. If the suspicious URL set is not empty, the URL link contains domain names outside the safe domain name set, and the safe URL links can then be used to judge the similarity between the suspicious URL set and the safe URL links. Because detection is based on the URL link content of the target mail, it is not affected by changes to the mailbox name or the sender IP address, so disguised phishing mails can be effectively identified and the accuracy of phishing mail detection is improved.
Further, the determining module 503 is configured to output a detection result that the target email is a phishing email if the similarity between the suspicious URL set and the safe URL link is within a first similarity interval.
Further, the apparatus also includes:
a character replacement module, configured to, if the similarity between the suspicious URL set and the safe URL link is not within the first similarity interval, perform character replacement on the suspicious URL set to obtain a new suspicious URL set, and judge whether the similarity between the new suspicious URL set and the safe URL link is within a second similarity interval.
Further, the apparatus also includes:
a new set detection module, configured to output the detection result that the target mail is a phishing mail if the similarity between the new suspicious URL set and the safe URL link is within the second similarity interval.
Further, the determining module 503 includes:
a character replacing unit, configured to perform homoglyph (look-alike character) replacement and/or punycode replacement on the suspicious URL set to obtain the new suspicious URL set.
Further, the apparatus also includes:
a keyword extraction module, configured to extract the core keyword of each safe domain name in the safe domain name set if the similarity between the new suspicious URL set and the safe URL link is not within the second similarity interval;
a keyword detection module, configured to determine whether the suspicious URL set includes the core keyword and, if so, to judge that the target mail is a phishing mail.
Further, the keyword extraction module is configured to extract the difference set of the first-level domain name and the top-level domain name of each safe domain name in the safe domain name set as the core keyword.
Further, the apparatus also includes:
a de-duplication module, configured to remove the repeated segments in the first-level domain name after at least one first-level domain name in at least one URL link is determined.
Further, the apparatus also includes:
a blacklist maintenance module, configured to add the first-level domain name in the URL link to a URL blacklist and mark the camouflage type of the URL link if the target mail is a phishing mail.
Since the embodiments of the apparatus portion and the method portion correspond to each other, please refer to the description of the embodiments of the method portion for the embodiments of the apparatus portion, which is not repeated here.
The present application also provides a storage medium having a computer program stored thereon which, when executed, may implement the steps provided by the above-described embodiments. The storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The present application further provides an electronic device, which may include a memory and a processor, where the memory stores a computer program, and when the processor calls the computer program in the memory, the steps provided in the foregoing embodiments may be implemented. Of course, the electronic device may also include various network interfaces, power supplies, and the like.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is brief; for the relevant details, refer to the description of the method. It should be noted that those skilled in the art may make several improvements and modifications to the present application without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.

Claims (12)

1. A method for detecting phishing mails, comprising:
acquiring at least one Uniform Resource Locator (URL) link in a target mail, and determining at least one primary domain name in the URL link;
deleting a known domain name from the at least one primary domain name to obtain a suspicious URL set; wherein the known domain name is the intersection of the at least one primary domain name and a set of safe domain names;
and if the suspicious URL set is not empty, outputting a phishing mail detection result according to the similarity between the suspicious URL set and the safe URL link.
2. A phishing mail detection method as claimed in claim 1 wherein outputting phishing mail detection results based on similarity of said suspect URL set and said secure URL link comprises:
and if the similarity between the suspicious URL set and the safe URL link is within a first similarity interval, outputting a detection result that the target mail is a phishing mail.
3. A method for detecting phishing mails according to claim 2, further comprising:
if the similarity between the suspicious URL set and the safe URL link is not within the first similarity interval, performing character replacement on the suspicious URL set to obtain a new suspicious URL set, and judging whether the similarity between the new suspicious URL set and the safe URL link is within a second similarity interval.
4. A phishing mail detection method according to claim 2 or 3, further comprising:
and if the similarity between the new suspicious URL set and the safe URL link is within the second similarity interval, outputting the detection result that the target mail is the phishing mail.
5. A phishing mail detection method according to claim 3 wherein character replacement of the set of suspect URLs to obtain a new set of suspect URLs comprises:
and carrying out homoglyph character replacement and/or Punycode replacement on the suspicious URL set to obtain the new suspicious URL set.
6. A method for detecting phishing mails according to claim 3, further comprising:
if the similarity between the new suspicious URL set and the safe URL link is not within the second similarity interval, extracting the core keyword of each safe domain name in the safe domain name set;
judging whether the suspicious URL set comprises the core keyword or not;
and if so, judging that the target mail is a phishing mail.
7. The method of detecting phishing mails according to claim 6, wherein extracting the core keyword of each of the set of safe domain names comprises:
and extracting a difference set of the first-level domain name and the top-level domain name of each safe domain name in the safe domain name set as the core keyword.
8. A phishing mail detection method as claimed in claim 1 further comprising after determining at least one primary domain name in at least one of said URL links:
and removing repeated segments in the primary domain name.
9. A phishing mail detection method as claimed in any one of claims 1 to 8 further comprising:
and if the target mail is a phishing mail, adding the first-level domain name in the URL link to a URL blacklist, and marking the camouflage type of the URL link.
10. A phishing mail detection apparatus comprising:
a domain name determining module, configured to acquire at least one Uniform Resource Locator (URL) link in a target mail and determine at least one primary domain name in the URL link;
a suspicious URL determining module, configured to delete a known domain name from the at least one primary domain name to obtain a suspicious URL set; wherein the known domain name is the intersection of the at least one primary domain name and a set of safe domain names;
and a judging module, configured to output a phishing mail detection result according to the similarity between the suspicious URL set and the safe URL link if the suspicious URL set is not empty.
11. An electronic device comprising a memory in which a computer program is stored and a processor which, when calling the computer program in the memory, carries out the steps of the method for detecting phishing mails according to any one of claims 1 to 9.
12. A storage medium having stored thereon computer-executable instructions which, when loaded and executed by a processor, carry out the steps of a method of detecting phishing mails according to any one of claims 1 to 9.
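The flow of claims 1 to 5 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the claimed implementation: the similarity measure (`difflib.SequenceMatcher`), the interval bounds `FIRST_LOW` and `SECOND_LOW`, the small `HOMOGLYPHS` table, the naive "last two labels" rule in `primary_domain`, and all function names and sample domains are hypothetical choices not specified in the claims.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, Optional
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
FIRST_LOW = 0.9   # assumed first interval: [0.9, 1.0)
SECOND_LOW = 0.9  # assumed second interval: [0.9, 1.0]
# Tiny illustrative table: digits 0 and 1, Cyrillic a, e, o.
HOMOGLYPHS = str.maketrans(
    {"0": "o", "1": "l", "\u0430": "a", "\u0435": "e", "\u043e": "o"}
)


def primary_domain(url: str) -> str:
    """Naive primary domain: the last two labels of the host name."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def normalize(domain: str) -> str:
    """Decode Punycode labels, then map look-alike characters."""
    labels = []
    for label in domain.split("."):
        if label.startswith("xn--"):
            try:
                label = label[4:].encode("ascii").decode("punycode")
            except UnicodeError:
                pass  # keep the label unchanged if it is not valid Punycode
        labels.append(label)
    return ".".join(labels).translate(HOMOGLYPHS)


def best_similarity(domain: str, safe_domains: Iterable[str]) -> float:
    return max(
        (SequenceMatcher(None, domain, s).ratio() for s in safe_domains),
        default=0.0,
    )


def detect(mail_text: str, safe_domains: Iterable[str]) -> Optional[str]:
    """Return a camouflage label if the mail looks like phishing, else None."""
    safe = set(safe_domains)
    primaries = {primary_domain(u) for u in URL_RE.findall(mail_text)}
    primaries.discard("")
    suspicious = sorted(primaries - safe)  # delete the known domain names
    if not suspicious:
        return None
    # Stage 1: similarity of the raw suspicious domains.
    if any(FIRST_LOW <= best_similarity(d, safe) < 1.0 for d in suspicious):
        return "similar_spelling"
    # Stage 2: similarity after character replacement.
    replaced = [normalize(d) for d in suspicious]
    if any(SECOND_LOW <= best_similarity(d, safe) <= 1.0 for d in replaced):
        return "character_replacement"
    return None


if __name__ == "__main__":
    safe = ["portal.example"]
    print(detect("Log in at http://p0rta1.example/verify", safe))
```

The upper bound of the first interval is left open in this sketch because an exact match would already have been removed as a known domain name, while the second interval is closed because a replaced domain that exactly equals a safe domain is the strongest sign of character-level camouflage.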
CN202110723018.4A 2021-06-28 2021-06-28 Detection method and device for phishing mails, electronic equipment and storage medium Pending CN115603924A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110723018.4A CN115603924A (en) 2021-06-28 2021-06-28 Detection method and device for phishing mails, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115603924A true CN115603924A (en) 2023-01-13

Family

ID=84841103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110723018.4A Pending CN115603924A (en) 2021-06-28 2021-06-28 Detection method and device for phishing mails, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115603924A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116633684A (en) * 2023-07-19 2023-08-22 中移(苏州)软件技术有限公司 Phishing detection method, system, electronic device and readable storage medium
CN116633684B (en) * 2023-07-19 2023-10-13 中移(苏州)软件技术有限公司 Phishing detection method, system, electronic device and readable storage medium

Similar Documents

Publication Publication Date Title
US10530806B2 (en) Methods and systems for malicious message detection and processing
CN104468249B (en) Account abnormity detection method and device
US8984289B2 (en) Classifying a message based on fraud indicators
TWI593266B (en) Malicious message detection and processing
Alazab et al. Malicious spam emails developments and authorship attribution
US8719352B2 (en) Reputation management for network content classification
Kang et al. Advanced white list approach for preventing access to phishing sites
US20220030029A1 (en) Phishing Protection Methods and Systems
US11563757B2 (en) System and method for email account takeover detection and remediation utilizing AI models
CN111147489B (en) Link camouflage-oriented fishfork attack mail discovery method and device
JP2022530290A (en) Optimal scanning parameter calculation methods, devices, and systems for malicious URL detection
US11665195B2 (en) System and method for email account takeover detection and remediation utilizing anonymized datasets
CN112948725A (en) Phishing website URL detection method and system based on machine learning
Sankhwar et al. Email phishing: an enhanced classification model to detect malicious urls
WO2017162997A1 (en) A method of protecting a user from messages with links to malicious websites containing homograph attacks
EP3195140B1 (en) Malicious message detection and processing
CN115603924A (en) Detection method and device for phishing mails, electronic equipment and storage medium
JP4564916B2 (en) Phishing fraud countermeasure method, terminal, server and program
CN113556347B (en) Detection method, device and equipment for phishing mails and storage medium
EP3837625A1 (en) Fuzzy inclusion based impersonation detection
Balamuralikrishna et al. Mitigating Online Fraud by Ant phishing Model with URL & Image based Webpage Matching
KR101857969B1 (en) Method and Apparatus for Determining Risk of Fraudulent Mail
CN114095252B (en) FQDN domain name detection method, FQDN domain name detection device, computing equipment and storage medium
CN115801721A (en) Mail detection method and device
CN115396184A (en) Mail detection method and device and nonvolatile storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination