CN111614543B - URL-based spear phishing mail detection method and system - Google Patents

Publication number
CN111614543B
Authority
CN
China
Prior art keywords
mail
url
link
spear
domain name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010279729.2A
Other languages
Chinese (zh)
Other versions
CN111614543A (en)
Inventor
汪秋云
姜政伟
汪姝玮
辛丽玲
丁雄
刘宝旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN202010279729.2A
Publication of CN111614543A
Application granted
Publication of CN111614543B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/42 Mailbox-related aspects, e.g. synchronisation of mailboxes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/23 Reliability checks, e.g. acknowledgments or fault reporting

Abstract

The invention discloses a URL-based spear phishing mail detection method and system, relating to the fields of information security detection and cyberspace security. The method detects whether a mail body contains a URL link and selects the mails that contain one; extracts a feature vector of the URL link from each such mail based on the mail history record; classifies the feature vectors of the URL links with a trained link classifier and selects the mails with malicious links; extracts the metadata of the mails with malicious links and extracts mail feature vectors from the metadata using the mail history record; and classifies the mail feature vectors with a trained spear classifier to detect URL-based spear phishing mails. The invention achieves a lower false alarm rate and a higher detection rate with the support of historical mails alone.

Description

URL-based spear phishing mail detection method and system
Technical Field
The invention relates to the fields of information security detection and cyberspace security, and in particular to a URL-based spear phishing mail detection method and system.
Background
With the rapid development of computers and the internet, electronic mail has become an indispensable part of people's daily life and work. However, e-mail is as convenient for attackers as it is for ordinary users. Attackers use social engineering to steal money or sensitive information by sending phishing mails or spear phishing mails to users and employees.
A spear phishing mail is a highly targeted phishing mail: the attacker first collects information about the user, then carefully customizes the mail content and sends the mail to the recipient under a forged sender or a forged third-party service provider. Attackers usually deliver spear phishing mails in one of two ways: with a malicious attachment, or with a phishing link. For an attacker, a malicious attachment usually exploits a 0-day or similar vulnerability, and discovering vulnerabilities is time-consuming and laborious, so the barrier to producing such an attachment is high and places high demands on the attacker. Many current mail gateways also inspect attachments closely, for example by executing them in a sandbox, and these measures lower the success rate of attachment-based attacks. For link-based spear phishing mails, by contrast, the attacker only needs to build one phishing website; the cost and technical difficulty are relatively low, the attack is easy to carry out, and existing detection techniques struggle to find spear phishing mails that contain a carefully crafted link. Against this background, the present invention provides a detection scheme for URL-based spear phishing mails.
Existing URL-based spear phishing mail detection schemes have three main problems. The first is that the false alarm rate is too high: the false alarm rate of most detection methods is about 1%, and for the tens of thousands or hundreds of thousands of mails handled per day in practice, a 1% rate produces hundreds or even thousands of false alarms per day, which is unacceptable in practical use. The second is that URL-based spear phishing mails cannot be detected accurately: the detection rate is low, about 80%, and since a single missed spear phishing mail can cause a company or enterprise huge economic loss, such a detection rate is unacceptable. The third is that one existing academic method does achieve a high detection rate and a very low false alarm rate, but it requires complete logs (such as NIDS and SMTP logs); most companies and organizations have no facility for recording such detailed and complete logs, so this method is not widely practical.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a URL-based spear phishing mail detection method and system that achieve a lower false alarm rate and a higher detection rate with the support of historical mails alone.
To achieve this purpose, the invention adopts the following technical scheme:
A URL-based spear phishing mail detection method comprises the following steps:
detecting whether the mail body contains a URL link, and selecting the mails that contain a URL link;
extracting a feature vector of the URL link from each mail containing a URL link, based on the mail history record;
classifying the feature vectors of the URL links by using a trained link classifier, and selecting the mails with malicious links;
extracting the metadata of the mails with malicious links, and extracting mail feature vectors from the metadata by using the mail history record;
and classifying the mail feature vectors by using a trained spear classifier, thereby detecting the URL-based spear phishing mails.
Further, extracting the feature vector of the URL link includes the following steps:
extracting the URLs and the corresponding domain names from the mail body, and calculating the number of unique URLs and the number of unique domain names after deduplication;
querying the ranking of each domain name, and taking the lowest ranking as the global ranking of the mail's links;
querying the registration date of each domain name, and taking the latest registration date as the registration date of the mail's links;
querying the score with which each URL and domain name is analyzed as malicious, wherein the score is equal to the ratio of the number of analysis engines that judge the URL or domain name to be malicious to the total number of analysis engines, and taking the highest score as the score of the mail's URLs and domain names;
querying whether each URL is a phishing link, and taking the worst query result as the mail's phishing score;
counting the number of occurrences in the historical data of the Fully Qualified Domain Name (FQDN) corresponding to each URL, and the time interval since its last occurrence;
and forming the feature vector of the URL link from the results obtained in the above steps.
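The first and last of these steps can be sketched as below. This is a minimal illustration, not the patented implementation: the regular expression, the function names and the `history` structure (a list of `(fqdn, datetime)` pairs built from earlier mails) are all assumptions, and the URL host name is used as the FQDN without reducing it to a registered domain.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit

# Hypothetical pattern: http(s) URLs up to the next whitespace, quote or bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_urls_and_domains(body):
    """Return the de-duplicated URLs and host names found in a mail body."""
    # Strip trailing punctuation that belongs to the sentence, not the URL.
    urls = sorted({u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(body)})
    hosts = {urlsplit(u).hostname for u in urls}
    domains = sorted(h for h in hosts if h)
    return urls, domains


def fqdn_history_stats(fqdn, history, now):
    """Count earlier sightings of an FQDN and the days since the latest one."""
    seen = [ts for name, ts in history if name == fqdn and ts <= now]
    if not seen:
        return 0, None  # never seen in the historical data
    return len(seen), (now - max(seen)).days
```

The number of unique URLs and domain names is then simply `len(urls)` and `len(domains)`.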
Further, the ranking of each domain name is queried from the Alexa global domain name ranking website.
Further, the registration date of each domain name is queried from the WHOIS domain name query website.
Further, the score with which each URL and domain name is analyzed as malicious is queried from the VirusTotal malicious code analysis website.
Further, whether each URL is a phishing link is queried from the PhishTank phishing link analysis website.
Further, the feature vector of the URL link includes a reputation feature and a statistical feature.
Further, the metadata includes sender IP, sender address, sender name, recipient address, mail subject, mail body, and mail attachment.
Further, the mail feature vector comprises a reputation feature, a forwarding relation feature and a habit feature.
Further, extracting the mail feature vector from the metadata comprises the following steps:
querying the maliciousness scores of the sender IP and the sender mailbox address;
counting the number of times the sender name and the sender address appear together in the historical data, and the time interval since their latest joint appearance;
counting the number of occurrences of the sender name in the historical data and the time interval since its latest occurrence;
counting the number of occurrences of the sender address in the historical data and the time interval since its latest occurrence;
extracting the forwarding scale of the mail from the recipient list;
counting the number of times all recipients of the mail appear together in the historical data, and the time interval since their latest joint appearance;
judging whether a telephone number appears in the mail body;
judging whether a bank account number appears in the mail body;
calculating the subject richness of the mail, wherein the subject richness is equal to the ratio of the number of words in the mail subject to the number of characters;
calculating the body richness of the mail, wherein the body richness is equal to the ratio of the number of words in the mail body to the number of characters;
and constructing the mail feature vector from the results obtained in the above steps.
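The history-based counts in these steps all follow one pattern: count the earlier mails that share some key with the new mail, and measure the time since the most recent one. The sketch below illustrates that pattern under stated assumptions: the mail dictionaries and their field names are hypothetical, the interval is measured in days, and the "forwarding scale" is interpreted as the number of distinct recipients.

```python
from datetime import datetime


def count_and_recency(history, key_fn, key, now):
    """Return (number of earlier mails whose key matches, days since the latest)."""
    times = [m["date"] for m in history if m["date"] < now and key_fn(m) == key]
    if not times:
        return 0, None  # this key never appeared before
    return len(times), (now - max(times)).days


def _addr(mail):
    return mail["from_addr"].lower()


def _name_and_addr(mail):
    return (mail["from_name"], _addr(mail))


def _recipients(mail):
    return frozenset(r.lower() for r in mail["to"])


def sender_history_features(mail, history):
    """Forwarding-relation and habit statistics for one mail (illustrative only)."""
    now = mail["date"]
    return {
        "name_and_addr": count_and_recency(history, _name_and_addr, _name_and_addr(mail), now),
        "name": count_and_recency(history, lambda m: m["from_name"], mail["from_name"], now),
        "addr": count_and_recency(history, _addr, _addr(mail), now),
        "recipient_set": count_and_recency(history, _recipients, _recipients(mail), now),
        "forward_scale": len(_recipients(mail)),  # assumed: number of distinct recipients
    }
```

A familiar sender name paired with a never-seen address, as in a spoofed mail, shows up as a high `name` count next to a zero `name_and_addr` count.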
Further, the maliciousness scores of the sender IP and the sender mailbox address are queried from the VirusTotal malicious code analysis website.
Further, the link classifier and the spear classifier are each trained with a random forest classification algorithm, wherein the training set of the link classifier consists of mails containing malicious links and mails not containing malicious links, and the data set of the spear classifier consists of malicious mails containing malicious links, namely spear phishing mails and non-spear phishing mails.
Further, the spear phishing mail data is enhanced by using a K-means-based SMOTE (Synthetic Minority Over-sampling Technique) algorithm, and the resulting samples are used to train the spear classifier, wherein the enhancement comprises the following steps:
taking the URL-based spear phishing mail data set as the minority-class samples, and taking the non-URL-based spear phishing mail data set as the majority-class samples;
calculating the K nearest neighbors of each minority-class sample over the whole data set by using the Euclidean distance;
if the K nearest neighbors of a minority-class sample are all majority-class samples, marking that sample as a noise sample;
carrying out unsupervised clustering, by using the K-means algorithm, on the minority-class samples that remain after the noise samples are removed, to obtain a plurality of clusters;
enhancing each cluster by using the SMOTE algorithm;
calculating the K nearest neighbors of each sample in the cluster by using the Euclidean distance, randomly selecting one sample among the K nearest neighbors, interpolating based on the sample, the randomly selected sample and a generated random number to obtain a new sample, and repeating this step R times, wherein R is the enhancement ratio;
and finally taking the new samples obtained as the URL-based spear phishing mail sample set.
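The noise-filtering step that precedes clustering can be illustrated as follows. This is a sketch under assumptions: samples are numeric tuples, the function name is hypothetical, and ties in distance are broken by list order.

```python
import math


def find_noise_samples(minority, majority, k):
    """Return indices of minority samples whose k nearest neighbours are all majority."""
    # Label 0 = minority, 1 = majority; minority samples come first in the list.
    labelled = [(p, 0) for p in minority] + [(q, 1) for q in majority]
    noise = []
    for i, p in enumerate(minority):
        # Distance from p to every other sample in the whole data set.
        others = [
            (math.dist(p, x), label)
            for j, (x, label) in enumerate(labelled)
            if j != i
        ]
        nearest = sorted(others, key=lambda t: t[0])[:k]
        if all(label == 1 for _, label in nearest):
            noise.append(i)
    return noise
```

Samples flagged here are left out of both the K-means clustering and the later interpolation, so that synthetic samples are not generated deep inside majority-class territory.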
A URL-based spear phishing mail detection system comprises a memory and a processor, the memory storing a computer program configured to be executed by the processor to perform the steps of the above method.
Compared with the prior art, the method can quickly screen out suspicious URL-based spear phishing mails. Because only historical mails are used, the data requirement is greatly reduced; the double-layer classifier structure greatly reduces the false alarm rate of detection; and the combination of multi-source features greatly improves the classification accuracy.
Drawings
Fig. 1 is a flowchart of a method for detecting a spearphishing mail based on a URL according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a spearphishing mail detection system based on a URL according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of three key methods provided in the embodiment of the present invention.
Fig. 4 is a schematic diagram of extracting mail reputation features, forwarding relationship features, and habit features using a third-party platform and historical data according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the process of enhancing URL-based spear phishing mails by the K-means-based SMOTE algorithm according to an embodiment of the present invention.
Detailed Description
In order to make the technical solution of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
This embodiment provides a URL-based spear phishing mail detection method, which detects URL-based spear phishing mails by using a double-layer classifier structure. As shown in Fig. 1, the method comprises the following steps:
first, extracting the body of a new mail and judging whether the mail contains a URL link; if not, judging the mail to be a non-URL-based spear phishing mail, and if so, proceeding to the second step;
second, extracting the features of the URL link by using a third-party platform and the history record to obtain the feature vector of the mail's URL link;
third, classifying the feature vector of the URL link by using the trained link classifier; if the link classifier judges the link not to be malicious, judging the mail to be a non-URL-based spear phishing mail, and if the link is judged to be malicious, proceeding to the fourth step;
fourth, extracting features from the mail metadata by using a third-party platform and the history record, namely reputation features, forwarding relation features and habit features, to obtain the mail feature vector;
and fifth, classifying the mail feature vector by using the trained spear classifier; if the spear classifier judges the mail to be a suspicious URL-based spear phishing mail, the system output module outputs an alarm.
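The control flow of these five steps can be sketched as follows. This is only an illustration of the two-layer cascade: the feature extractors and the two classifiers are passed in as hypothetical callables that return a probability, and using 0.5 as the decision boundary for both layers is an assumption of this sketch.

```python
import re

# Hypothetical pattern used only to decide whether the body contains a link.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def detect_spear_phishing(mail, link_features, mail_features,
                          link_clf, spear_clf, threshold=0.5):
    """Return (is_spear_phishing, confidence) for one mail.

    `link_clf` and `spear_clf` stand in for the trained classifiers; each is
    assumed to map a feature vector to a probability in [0, 1].
    """
    # Step 1: mails without a URL link leave the pipeline immediately.
    if not URL_PATTERN.search(mail["body"]):
        return False, None
    # Steps 2-3: the first layer filters out mails whose links look benign.
    if link_clf(link_features(mail)) <= threshold:
        return False, None
    # Steps 4-5: the second layer only sees mails with malicious links.
    confidence = spear_clf(mail_features(mail))
    return confidence > threshold, confidence
```

Because the second layer runs only on mails that pass the first, the more expensive metadata features are computed for a small fraction of the traffic.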
This embodiment also provides a URL-based spear phishing mail detection system comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the program comprising instructions for performing the steps of the above method. In one embodiment, the system is divided by function into six modules, namely a data preprocessing module, a feature extraction module, a spear phishing mail enhancement module, a double-layer classifier pre-training module, an external interface calling module and a system output module, together with a double-layer classifier detection model, as shown in Fig. 2. It is noted that the present invention is not limited to these six modules.
In the model training phase:
The data preprocessing module extracts the metadata of each mail from the original mail flow, the metadata comprising the sender IP, sender address, sender name, recipient address, mail subject, mail body, mail attachment and the like. It then detects whether a URL link is embedded in the mail body, and passes the mails containing a URL link to the subsequent flow.
For the first-layer link classifier of the model, the feature extraction module extracts only the features of the mail's URL links, comprising reputation features and statistical features; for the second-layer spear classifier, it extracts habit features, forwarding relation features, reputation features and the like from the mail metadata. Finally, all features are uniformly encoded to standardize the samples.
The spear phishing mail enhancement module enhances the spear phishing mail data by using the K-means-based SMOTE algorithm.
The double-layer classifier pre-training module trains the first-layer link classifier and the second-layer spear classifier by using a random forest classification algorithm. The training set of the link classifier consists of mails containing malicious links and mails without malicious links, and the data set of the spear classifier consists of spear phishing mails containing malicious links and other malicious mails.
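As an illustration of what the data preprocessing module does, the sketch below parses one raw RFC 5322 message with Python's standard `email` package and checks the body for a URL. It is a simplified assumption of how such a module might look: the field names in the returned dictionary are hypothetical, only `text/plain` parts are treated as the body, and the sender IP (which would have to be parsed from the `Received` headers) is left out.

```python
import re
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Hypothetical pattern used to decide whether the body embeds a URL link.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def extract_metadata(raw):
    """Parse one raw mail and return the metadata used by later stages."""
    msg = message_from_string(raw)
    sender_name, sender_address = parseaddr(msg.get("From", ""))
    recipient_headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr for _, addr in getaddresses(recipient_headers)]

    body_parts, attachments = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        filename = part.get_filename()
        if filename:
            attachments.append(filename)
        elif part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            body_parts.append(payload.decode(charset, errors="replace"))
    body = "\n".join(body_parts)

    return {
        "sender_name": sender_name,
        "sender_address": sender_address,
        "recipients": recipients,
        "subject": msg.get("Subject", ""),
        "body": body,
        "attachments": attachments,
        "has_url": bool(URL_PATTERN.search(body)),
    }
```

Only mails for which `has_url` is true would continue to feature extraction.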
In the model use stage:
The detection interface of the external interface calling module calls the data preprocessing module and the feature extraction module to preprocess the input mails.
The external interface calling module then calls the trained double-layer classifier.
The detection result processing interface of the external interface calling module outputs the detection result for the mail.
The detection result is the confidence that the mail is a URL-based spear phishing mail. The confidence represents the degree of likelihood: the invention takes 0.5 as the boundary, a value higher than 0.5 means that a spear phishing mail has been detected, and the closer the value is to 1, the greater the likelihood.
The core of the method and the system lies in three aspects, as shown in Fig. 3:
extracting mail reputation features, forwarding relation features, habit features and the like by using a third-party platform and historical data;
carrying out data enhancement on the URL-based spear phishing mails by using the K-means-based SMOTE algorithm;
detecting URL-based spear phishing mails by using a double-layer classifier architecture.
In this embodiment, a third-party platform and historical data are used to extract the mail reputation features, statistical features, forwarding relation features and habit features. As shown in Fig. 4, the specific steps are as follows:
For the first-layer link classifier:
first, extracting the URLs and the corresponding domain names from the mail body, and calculating the number of unique URLs and the number of unique domain names after deduplication;
second, querying the ranking of each domain name from the Alexa global domain name ranking website, and taking the lowest ranking as the global ranking of the mail's links;
third, querying the registration date of each domain name from the WHOIS domain name query website, and taking the latest registration date as the registration date of the mail's links;
fourth, querying from the VirusTotal malicious code analysis website the score with which each URL and domain name is analyzed as malicious, the score being calculated by the following formula, and taking the highest score as the score of the mail's URLs and domain names;
score=numMalicious/numTotal
wherein: numMalicious is the number of analysis engines that judge the URL or domain name to be malicious, and numTotal is the total number of analysis engines;
fifth, querying from the PhishTank phishing link analysis website whether each URL is a phishing link, and taking the worst query result as the mail's phishing score;
sixth, counting the number of occurrences in the historical data of the FQDN corresponding to each URL, and the time interval since its last occurrence;
and seventh, applying the above link feature extraction to each sample in the data set to obtain a feature vector, which forms the input of the link classifier.
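The way per-URL lookup results are reduced to one value per mail can be sketched as below. No third-party service is called here: `lookups` is a hypothetical list of already-retrieved results, one dictionary per URL, and "lowest ranking" is interpreted as the worst position, that is, the largest rank number.

```python
from datetime import date


def malicious_score(num_malicious, num_total):
    """score = numMalicious / numTotal, with 0.0 when no engine reported."""
    return num_malicious / num_total if num_total else 0.0


def aggregate_link_features(lookups):
    """Reduce per-URL lookup results to mail-level link features.

    Each entry of `lookups` is assumed to hold: url, domain, rank,
    registered (a date), num_malicious, num_total and is_phish.
    """
    return {
        "num_urls": len({r["url"] for r in lookups}),
        "num_domains": len({r["domain"] for r in lookups}),
        # Worst-ranked domain (largest rank number) represents the mail.
        "global_rank": max(r["rank"] for r in lookups),
        # Most recently registered domain represents the mail.
        "registration_date": max(r["registered"] for r in lookups),
        "malicious_score": max(
            malicious_score(r["num_malicious"], r["num_total"]) for r in lookups
        ),
        # Worst phishing verdict: one phishing URL taints the whole mail.
        "is_phish": any(r["is_phish"] for r in lookups),
    }
```

Taking the worst value of each feature means that a single suspicious link is not diluted by several harmless ones in the same mail.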
For the second-layer spear classifier:
first, querying the maliciousness scores of the sender IP and the sender mailbox address from the VirusTotal malicious code analysis website;
second, counting the number of times the sender name and the sender address appear together in the historical data, and the time interval since their latest joint appearance;
third, counting the number of occurrences of the sender name in the historical data and the time interval since its latest occurrence;
fourth, counting the number of occurrences of the sender address in the historical data and the time interval since its latest occurrence;
fifth, extracting the forwarding scale of the mail from the recipient list;
sixth, counting the number of times all recipients of the mail appear together in the historical data, and the time interval since their latest joint appearance;
seventh, judging whether a telephone number appears in the mail body;
eighth, judging whether a bank account number appears in the mail body;
ninth, calculating the subject richness of the mail, namely:
subjectRichness=subjectNumWords/subjectNumCharacters
wherein: subjectNumWords is the number of words in the mail subject, and subjectNumCharacters is the number of characters in the mail subject;
tenth, calculating the body richness of the mail, namely:
bodyRichness=bodyNumWords/bodyNumCharacters
wherein: bodyNumWords is the number of words in the mail body, and bodyNumCharacters is the number of characters in the mail body;
and eleventh, applying the above mail feature extraction to each sample in the data set to obtain a feature vector, which forms the input of the spear classifier.
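The content-based features in this list can be sketched as follows. The richness function follows the ratio given above, with words taken as whitespace-separated tokens and characters counted including whitespace, both of which are assumptions. The two regular expressions are deliberately crude, hypothetical stand-ins: the source does not say how telephone numbers or bank account numbers are recognized, and real formats vary by country.

```python
import re

# Illustrative patterns only; a real system would need locale-specific rules.
PHONE_PATTERN = re.compile(r"\+?\d[\d\-\s]{6,}\d")
BANK_ACCOUNT_PATTERN = re.compile(r"\b\d{12,19}\b")


def richness(text):
    """Ratio of the number of words to the number of characters in `text`."""
    num_characters = len(text)
    if num_characters == 0:
        return 0.0
    return len(text.split()) / num_characters


def has_phone_number(body):
    """True if something shaped like a telephone number appears in the body."""
    return PHONE_PATTERN.search(body) is not None


def has_bank_account(body):
    """True if a long run of digits, shaped like an account number, appears."""
    return BANK_ACCOUNT_PATTERN.search(body) is not None
```

For example, `richness(subject)` gives subjectRichness and `richness(body)` gives bodyRichness; note that the crude phone pattern will also match long account numbers, so the two flags are not mutually exclusive in this sketch.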
The K-means-based SMOTE algorithm is used to enhance the URL-based spear phishing mails. As shown in Fig. 5, the specific steps are as follows:
first, denoting the minority-class data set of URL-based spear phishing mails as $P = \{p_1, p_2, \dots, p_m\}$, and the majority-class data set of non-URL-based spear phishing mails as $Q = \{q_1, q_2, \dots, q_n\}$;
second, calculating, by using the Euclidean distance, the K nearest neighbors of each minority-class sample $p_i$ ($i = 1, 2, \dots, m$) over the entire data set;
third, if the K nearest neighbors of a minority-class sample $p_i$ ($i = 1, 2, \dots, m$) are all majority-class samples, marking $p_i$ as a noise sample, which does not participate in the subsequent interpolation process;
fourth, carrying out unsupervised clustering, by using the K-means algorithm, on the minority-class samples that remain after the noise samples are removed, to obtain d clusters, denoted $C = \{C_1, C_2, \dots, C_d\}$;
fifth, for each cluster $C_i$ ($i = 1, 2, \dots, d$), enhancing the cluster by using the SMOTE algorithm (the sixth and seventh steps) as follows;
sixth, for each sample $X$ in $C_i$, calculating the K nearest neighbors of $X$ by using the Euclidean distance, and randomly selecting a sample $X_j$ from the K nearest neighbors;
seventh, randomly generating a number $\lambda$ between 0 and 1, and generating a new sample $X_{new}$ by the following formula:
$$X_{new} = X + \lambda \times (X_j - X)$$
eighth, repeating the sixth and seventh steps R times, wherein R is the enhancement ratio;
and ninth, taking the samples newly generated by the above interpolation as the URL-based spear phishing mail sample set.
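The interpolation in the sixth to eighth steps can be sketched for a single cluster as below. The sketch assumes that the cluster has already been produced by K-means after noise removal, that neighbors are searched within the cluster, and that "repeating R times" means R synthetic samples per original sample; the function name and the tuple representation are hypothetical.

```python
import math
import random


def smote_cluster(cluster, k, r, rng):
    """Generate r synthetic samples for every sample of one cluster.

    Each new sample is X_new = X + lam * (X_j - X), where X_j is drawn from
    the k nearest neighbours of X inside the cluster and lam is in [0, 1).
    """
    synthetic = []
    if len(cluster) < 2:
        return synthetic  # a lone sample has no neighbour to interpolate towards
    for i, x in enumerate(cluster):
        others = [c for j, c in enumerate(cluster) if j != i]
        neighbours = sorted(others, key=lambda c: math.dist(x, c))[:k]
        for _ in range(r):
            x_j = rng.choice(neighbours)
            lam = rng.random()
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, x_j)))
    return synthetic


# Example: enhance every cluster and pool the new minority samples.
clusters = [[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [(5.0, 5.0), (6.0, 5.0)]]
rng = random.Random(0)
new_samples = [s for c in clusters for s in smote_cluster(c, k=2, r=4, rng=rng)]
```

Because every synthetic point lies on a segment between two samples of the same cluster, interpolation never bridges two distant groups of spear phishing mails, which is the motivation for clustering before applying SMOTE.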
The above embodiments are only intended to illustrate the technical solution of the present invention, but not to limit it, and a person skilled in the art can modify the technical solution of the present invention or substitute it with an equivalent, and the protection scope of the present invention is subject to the claims.

Claims (10)

1. A URL-based spear phishing mail detection method, characterized by comprising the following steps:
detecting whether the mail body contains a URL link, and selecting the mails that contain a URL link;
extracting a feature vector of the URL link from each mail containing a URL link, based on the mail history record;
classifying the feature vectors of the URL links by using a trained link classifier, and selecting the mails with malicious links;
extracting the metadata of the mails with malicious links, and extracting mail feature vectors from the metadata by using the mail history record;
and classifying the mail feature vectors by using a trained spear classifier, thereby detecting the URL-based spear phishing mails.
2. The method of claim 1, wherein extracting the feature vector of the URL link comprises the steps of:
extracting the URLs and the corresponding domain names from the mail body, and calculating the number of unique URLs and the number of unique domain names after deduplication;
querying the ranking of each domain name, and taking the lowest ranking as the global ranking of the mail's links;
querying the registration date of each domain name, and taking the latest registration date as the registration date of the mail's links;
querying the score with which each URL and domain name is analyzed as malicious, wherein the score is equal to the ratio of the number of analysis engines that judge the URL or domain name to be malicious to the total number of analysis engines, and taking the highest score as the score of the mail's URLs and domain names;
querying whether each URL is a phishing link, and taking the worst query result as the mail's phishing score;
counting the number of occurrences in the historical data of the fully qualified domain name corresponding to each URL, and the time interval since its last occurrence;
and forming the feature vector of the URL link from the results obtained in the above steps.
3. The method of claim 2, wherein the ranking of each domain name is queried from the Alexa global domain name ranking website, the registration date of each domain name is queried from the WHOIS domain name query website, the score with which each URL and domain name is analyzed as malicious is queried from the VirusTotal malicious code analysis website, and whether each URL is a phishing link is queried from the PhishTank phishing link analysis website.
4. The method of claim 1 or 2, wherein the feature vector of the URL link includes a reputation feature and a statistical feature.
5. The method of claim 1 wherein the metadata includes sender IP, sender address, sender name, recipient address, mail subject, mail body, and mail attachment, and the mail feature vector includes reputation features, forwarding relationship features, and habit features.
6. The method of claim 1 or 5, wherein extracting the mail feature vector from the metadata comprises the steps of:
querying the maliciousness scores of the sender IP and the sender mailbox address;
counting the number of times the sender name and the sender address appear together in the historical data, and the time interval since their latest joint appearance;
counting the number of occurrences of the sender name in the historical data and the time interval since its latest occurrence;
counting the number of occurrences of the sender address in the historical data and the time interval since its latest occurrence;
extracting the forwarding scale of the mail from the recipient list;
counting the number of times all recipients of the mail appear together in the historical data, and the time interval since their latest joint appearance;
judging whether a telephone number appears in the mail body;
judging whether a bank account number appears in the mail body;
calculating the subject richness of the mail, wherein the subject richness is equal to the ratio of the number of words in the mail subject to the number of characters;
calculating the body richness of the mail, wherein the body richness is equal to the ratio of the number of words in the mail body to the number of characters;
and constructing the mail feature vector from the results obtained in the above steps.
7. The method of claim 6, wherein the maliciousness scores of the sender IP and the sender mailbox address are queried from the VirusTotal malicious code analysis website.
8. The method of claim 1, wherein the link classifier and the spear classifier are each trained with a random forest classification algorithm, wherein the training set of the link classifier consists of mails with malicious links and mails without malicious links, and the data set of the spear classifier consists of malicious mails containing malicious links, namely spear phishing mails and non-spear phishing mails.
9. The method of claim 1 or 8, wherein the spear phishing mail data is enhanced by using a K-means-based SMOTE algorithm, and the resulting samples are used to train the spear classifier, the enhancement comprising the steps of:
taking the URL-based spear phishing mail data set as the minority-class samples, and taking the non-URL-based spear phishing mail data set as the majority-class samples;
calculating the K nearest neighbors of each minority-class sample over the whole data set by using the Euclidean distance;
if the K nearest neighbors of a minority-class sample are all majority-class samples, marking that sample as a noise sample;
carrying out unsupervised clustering, by using the K-means algorithm, on the minority-class samples that remain after the noise samples are removed, to obtain a plurality of clusters;
enhancing each cluster by using the SMOTE algorithm;
calculating the K nearest neighbors of each sample in the cluster by using the Euclidean distance, randomly selecting one sample among the K nearest neighbors, interpolating based on the sample, the randomly selected sample and a generated random number to obtain a new sample, and repeating this step R times, wherein R is the enhancement ratio;
and finally taking the new samples obtained as the URL-based spear phishing mail sample set.
10. A URL-based spear phishing mail detection system comprising a memory and a processor, the memory storing a computer program, characterized in that the computer program is configured to be executed by the processor to perform the steps of the method of any one of the preceding claims 1-9.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010279729.2A CN111614543B (en) 2020-04-10 2020-04-10 URL-based spear phishing mail detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010279729.2A CN111614543B (en) 2020-04-10 2020-04-10 URL-based spear phishing mail detection method and system

Publications (2)

Publication Number Publication Date
CN111614543A (en) 2020-09-01
CN111614543B (en) 2021-09-14

Family

ID=72203720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010279729.2A Active CN111614543B (en) 2020-04-10 2020-04-10 URL-based spear phishing mail detection method and system

Country Status (1)

Country Link
CN (1) CN111614543B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112688926A (en) * 2020-12-18 2021-04-20 杭州安恒信息技术股份有限公司 Method, system and device for detecting spear type phishing mails based on attachments
CN113726806A (en) * 2021-09-03 2021-11-30 杭州安恒信息技术股份有限公司 BEC mail detection method, device and system and readable storage medium
CN117061198B (en) * 2023-08-30 2024-02-02 广东励通信息技术有限公司 Network security early warning system and method based on big data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105072137A (en) * 2015-09-15 2015-11-18 蔡丝英 Spear phishing mail detection method and device
US10397272B1 (en) * 2018-05-10 2019-08-27 Capital One Services, Llc Systems and methods of detecting email-based attacks through machine learning
CN110519150A (en) * 2018-05-22 2019-11-29 深信服科技股份有限公司 Mail-detection method, apparatus, equipment, system and computer readable storage medium
CN110648118A (en) * 2019-09-27 2020-01-03 深信服科技股份有限公司 Spear phishing mail detection method and device, electronic equipment and readable storage medium
US10601865B1 (en) * 2015-09-30 2020-03-24 Fireeye, Inc. Detection of credential spearphishing attacks using email analysis

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9985978B2 (en) * 2008-05-07 2018-05-29 Lookingglass Cyber Solutions Method and system for misuse detection
CN102098235B (en) * 2011-01-18 2013-08-07 南京邮电大学 Phishing mail inspection method based on text characteristic analysis
US10069862B2 (en) * 2013-03-15 2018-09-04 Symantec Corporation Techniques for predicting and protecting spearphishing targets
US9398047B2 (en) * 2014-11-17 2016-07-19 Vade Retro Technology, Inc. Methods and systems for phishing detection
US10284579B2 (en) * 2017-03-22 2019-05-07 Vade Secure, Inc. Detection of email spoofing and spear phishing attacks
CN109039875B (en) * 2018-09-17 2021-06-22 杭州安恒信息技术股份有限公司 Phishing mail detection method and system based on link characteristic analysis

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105072137A (en) * 2015-09-15 2015-11-18 蔡丝英 Spear phishing mail detection method and device
CN105072137B (en) * 2015-09-15 2016-08-17 北京灵创众和科技有限公司 Spear phishing mail detection method and device
US10601865B1 (en) * 2015-09-30 2020-03-24 Fireeye, Inc. Detection of credential spearphishing attacks using email analysis
US10397272B1 (en) * 2018-05-10 2019-08-27 Capital One Services, Llc Systems and methods of detecting email-based attacks through machine learning
CN110519150A (en) * 2018-05-22 2019-11-29 深信服科技股份有限公司 Mail-detection method, apparatus, equipment, system and computer readable storage medium
CN110648118A (en) * 2019-09-27 2020-01-03 深信服科技股份有限公司 Spear phishing mail detection method and device, electronic equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Spear phishing attack detection method; Chi Yaping, Ling Zhiting, Xu Ping, Yang Jianxi; Computer Engineering and Design; 20190122; Vol. 39 (No. 11); 3350-3355 *

Also Published As

Publication number Publication date
CN111614543A (en) 2020-09-01

Similar Documents

Publication Publication Date Title
Karim et al. A comprehensive survey for intelligent spam email detection
CN111614543B (en) URL-based spear phishing mail detection method and system
Ho et al. Detecting and characterizing lateral phishing at scale
JP7391110B2 (en) Phishing campaign detection
Bergholz et al. New filtering approaches for phishing email
CN104982011B (en) Use the document classification of multiple dimensioned text fingerprints
Ramanathan et al. phishGILLNET—phishing detection methodology using probabilistic latent semantic analysis, AdaBoost, and co-training
Wang Detecting spam bots in online social networking sites: a machine learning approach
Moradpoor et al. Employing machine learning techniques for detection and classification of phishing emails
US8489689B1 (en) Apparatus and method for obfuscation detection within a spam filtering model
US8112484B1 (en) Apparatus and method for auxiliary classification for generating features for a spam filtering model
Chen et al. Spammers are becoming" Smarter" on Twitter
JP2023515910A (en) System and method for using relationship structure for email classification
Li et al. Detection method of phishing email based on persuasion principle
Kaur et al. Improved email spam classification method using integrated particle swarm optimization and decision tree
Das et al. Analysis of an image spam in email based on content analysis
CN112333185A (en) Domain name shadow detection method and device based on DNS (Domain name Server) resolution
Akinyelu Machine learning and nature inspired based phishing detection: a literature survey
Kumar Birthriya et al. A comprehensive survey of phishing email detection and protection techniques
Ding et al. Spear phishing emails detection based on machine learning
Punkamol et al. Detection of account cloning in online social networks
CN109672678B (en) Phishing website identification method and device
Yazhmozhi et al. Natural language processing and Machine learning based phishing website detection system
CN113746814B (en) Mail processing method, mail processing device, electronic equipment and storage medium
Manek et al. ReP-ETD: A Repetitive Preprocessing technique for Embedded Text Detection from images in spam emails

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant