CN110868378A - Phishing mail detection method and device, electronic equipment and storage medium

Publication number
CN110868378A
Authority
CN
China
Prior art keywords
mail
detected
attachment
phishing
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811546175.7A
Other languages
Chinese (zh)
Inventor
余磊
李晓玲
刘燕子
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ahtech Network Safe Technology Ltd
Original Assignee
Beijing Ahtech Network Safe Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ahtech Network Safe Technology Ltd
Priority to CN201811546175.7A
Publication of CN110868378A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1483 Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/212 Monitoring or handling of messages using filtering or selective blocking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/42 Mailbox-related aspects, e.g. synchronisation of mailboxes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the invention discloses a phishing mail detection method, a phishing mail detection device, electronic equipment and a storage medium, relates to the technical field of network security, and aims to improve the recognition rate of phishing mails. The phishing mail detection method comprises the following steps: acquiring a mail to be detected from mail flow; detecting whether the mail to be detected contains preset behavior characteristics; and if the mail to be detected contains the preset behavior characteristics, determining that the mail to be detected is a phishing mail. The phishing mail detection device comprises: the acquisition module is used for acquiring the mail to be detected from the mail flow; the detection module is used for detecting whether the mail to be detected contains the preset behavior characteristics; and the judging module is used for determining that the mail to be detected is a phishing mail if the mail to be detected contains the preset behavior characteristics. The invention is suitable for detecting and identifying phishing mails in mail flow.

Description

Phishing mail detection method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of network security, in particular to a phishing mail detection method and device, electronic equipment and a storage medium.
Background
A phishing mail uses a disguised mailbox to deceive the recipient into replying with information such as an account number and password to a specified recipient, or induces the mail recipient to open an attachment containing a malicious program or to click a malicious link, so that the target host is controlled and sensitive information is stolen. In recent years, the number of phishing mails has been rising, which poses a serious challenge to network security.
Most prior-art phishing mail detection is based on the content characteristics of phishing mails, for example matching against blacklisted links and blacklisted addresses, Uniform Resource Locator (URL) checks, and the like. This approach depends heavily on updates to the content feature library, and content becomes difficult to identify after even a slight change, so the recognition rate of content-based phishing mail detection is not high.
Disclosure of Invention
In view of this, embodiments of the present invention provide a phishing mail detection method, apparatus, electronic device and storage medium, which can improve the recognition rate of phishing mails.
In a first aspect, an embodiment of the present invention provides a phishing mail detection method, including: acquiring a mail to be detected from mail flow; detecting whether the mail to be detected contains preset behavior characteristics; and if the mail to be detected contains the preset behavior characteristics, determining that the mail to be detected is a phishing mail.
According to a specific implementation manner of the embodiment of the present invention, the detecting whether the mail to be detected contains the predetermined behavior feature includes:
detecting whether the mail to be detected contains first-type behavior characteristics; the first type of behavior characteristics includes: in the mail to be detected, the sender mailbox or receiver mailbox used is a free mailbox or a mailbox in which a phishing event has occurred; and/or,
detecting whether the mail to be detected contains second-type behavior characteristics; the second type of behavior characteristics includes: in the mail to be detected, the subject, the text or the attachment of the mail contains inducing keywords, or the inquiry content in the text of the mail lacks critical description information; and/or,
detecting whether the mail to be detected contains third-type behavior characteristics; the third type of behavior characteristics includes: in the mail to be detected, the surface characteristics of the attachment are inconsistent with its actual characteristics; and/or,
detecting whether the mail to be detected contains fourth-type behavior characteristics; the fourth type of behavior characteristics includes: when the attachment of the mail to be detected runs in a virtual environment, the attachment exhibits malicious behaviors.
According to a specific implementation manner of the embodiment of the invention, the surface characteristics of the attachment comprise a file suffix name of the attachment and/or an icon of the attachment; the actual characteristics of the attachment comprise the behavior characteristics of the attachment after it is opened or executed.
According to a specific implementation manner of the embodiment of the present invention, the inconsistency between the surface characteristics and the actual characteristics of the attachment includes:
the attachment surface is displayed in a compressed format or a text format, but the attachment is actually an executable file;
the surface of the attachment is displayed in a picture format, but the attachment has a link guide behavior after being actually executed;
the attachment surface is displayed in a document format, but the attachment contains a link guide address in the document after being opened;
the attachment surface appears as an executable file, but is actually an office document with a macro virus;
the attachment surface is displayed as an executable file, but has the behavior of remote connection, remote downloading or installation after actual execution;
the attachment surface is shown as an executable file, but after actual execution, runs only in the background, or has the behavior of executing other programs.
According to a specific implementation manner of the embodiment of the present invention, if the mail to be detected includes a predetermined behavior feature, determining that the mail to be detected is a phishing mail includes: and if the mail to be detected contains at least three types of behavior characteristics of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics and the fourth type of behavior characteristics, determining that the mail to be detected is a phishing mail.
In a second aspect, an embodiment of the present invention provides a phishing mail detection apparatus, including: the acquisition module is used for acquiring the mail to be detected from the mail flow; the detection module is used for detecting whether the mail to be detected contains the preset behavior characteristics; and the judging module is used for determining that the mail to be detected is a phishing mail if the mail to be detected contains the preset behavior characteristics.
According to a specific implementation manner of the embodiment of the present invention, the detection module includes:
the first detection submodule is used for detecting whether the mail to be detected contains first-type behavior characteristics; the first type of behavior characteristics includes: in the mail to be detected, the sender mailbox or receiver mailbox used is a free mailbox or a mailbox in which a phishing event has occurred; and/or,
the second detection submodule is used for detecting whether the mail to be detected contains second-type behavior characteristics; the second type of behavior characteristics includes: in the mail to be detected, the subject, the text or the attachment of the mail contains inducing keywords, or the inquiry content in the text of the mail lacks critical description information; and/or,
the third detection submodule is used for detecting whether the mail to be detected contains third-type behavior characteristics; the third type of behavior characteristics includes: in the mail to be detected, the surface characteristics of the attachment are inconsistent with its actual characteristics; and/or,
the fourth detection submodule is used for detecting whether the mail to be detected contains fourth-type behavior characteristics; the fourth type of behavior characteristics includes: when the attachment of the mail to be detected runs in a virtual environment, the attachment exhibits malicious behaviors.
According to a specific implementation manner of the embodiment of the invention, the surface characteristics of the attachment comprise a file suffix name of the attachment and/or an icon of the attachment; the actual characteristics of the attachment comprise the behavior characteristics of the attachment after it is opened or executed.
According to a specific implementation manner of the embodiment of the present invention, the inconsistency between the surface characteristics and the actual characteristics of the attachment includes:
the attachment surface is displayed in a compressed format or a text format, but the attachment is actually an executable file;
the surface of the attachment is displayed in a picture format, but the attachment has a link guide behavior after being actually executed;
the attachment surface is displayed in a document format, but the attachment contains a link guide address in the document after being opened;
the attachment surface appears as an executable file, but is actually an office document with a macro virus;
the attachment surface is displayed as an executable file, but has the behavior of remote connection, remote downloading or installation after actual execution;
the attachment surface is shown as an executable file, but after actual execution, runs only in the background, or has the behavior of executing other programs.
According to a specific implementation manner of the embodiment of the present invention, the determining module is specifically configured to: and if the mail to be detected contains at least three types of behavior characteristics of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics and the fourth type of behavior characteristics, determining that the mail to be detected is a phishing mail.
In a third aspect, an embodiment of the present invention provides an electronic device, where the electronic device includes: the device comprises a shell, a processor, a memory, a circuit board and a power circuit, wherein the circuit board is arranged in a space enclosed by the shell, and the processor and the memory are arranged on the circuit board; a power supply circuit for supplying power to each circuit or device of the electronic apparatus; the memory is used for storing executable program codes; the processor executes the program corresponding to the executable program code by reading the executable program code stored in the memory, and is used for executing the method of any one of the foregoing implementation modes.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing one or more programs, which are executable by one or more processors to implement a method according to any one of the foregoing implementation manners.
According to the phishing mail detection method, the device, the electronic equipment and the storage medium provided by the embodiment of the invention, the mail to be detected is obtained from the mail flow, whether the mail to be detected contains the preset behavior characteristics is detected, and if the mail to be detected contains the preset behavior characteristics, the mail to be detected can be determined to be the phishing mail.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a phishing mail detection method according to a first embodiment of the invention;
FIG. 2 is a schematic flow chart of a phishing mail detection method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a third embodiment of a phishing mail detection method provided by the present invention;
FIG. 4 is a schematic structural diagram of a phishing mail detection apparatus according to a first embodiment of the present invention;
FIG. 5 is a schematic structural view of a second embodiment of a phishing mail detection apparatus provided in the present invention;
FIG. 6 is a schematic structural diagram of an embodiment of an electronic device according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a first aspect, an embodiment of the present invention provides a phishing mail detection method, which can improve the recognition rate of phishing mails.
Fig. 1 is a schematic flow chart of a phishing mail detection method according to a first embodiment of the present invention. As shown in fig. 1, the method of this embodiment may include:
Step 101, obtaining the mail to be detected from the mail flow.
In this embodiment, mail traffic in the backbone network may be collected, for example mail traffic on the link of the mail server; mail traffic passing through a local area network gateway may also be collected.
In order not to affect the existing network environment, a security device can be deployed in a bypass mode on a mail server link or a local area network gateway, the mail traffic is collected through the security device deployed in the bypass, the collected mail traffic is restored locally in the security device, and then the mail to be detected is obtained.
The data obtained after restoring the collected mail flow may include metadata, mail body information, attachments, and the like. Wherein the metadata includes: the address of the mail server, the sender and the address thereof, the receiver and the address thereof, the sending time, the subject and the like. The data obtained after the reduction can be classified and stored for subsequent analysis and processing.
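As a minimal sketch of the classification described above, the following Python code splits one already-restored message into metadata, body text and attachments using the standard `email` package. It assumes the traffic has already been reassembled into a raw RFC 5322 message; the function name `split_mail` and the dictionary layout are hypothetical and not part of the original disclosure.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr


def split_mail(raw: bytes) -> dict:
    """Split one restored message into metadata, body text and attachments.

    Hypothetical helper: the field names below are illustrative only.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    metadata = {
        # parseaddr returns (display name, address); keep the address part.
        "sender": parseaddr(str(msg.get("From", "")))[1],
        "recipient": parseaddr(str(msg.get("To", "")))[1],
        "sent_time": str(msg.get("Date", "")),
        "subject": str(msg.get("Subject", "")),
    }
    # Prefer the plain-text body, fall back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""
    # Each attachment is stored as (file name, decoded bytes).
    attachments = [
        (part.get_filename() or "", part.get_payload(decode=True) or b"")
        for part in msg.iter_attachments()
    ]
    return {"metadata": metadata, "body": body, "attachments": attachments}
```

The three groups returned here correspond to the metadata, mail body information and attachments that are stored separately for later analysis.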
Step 102, detecting whether the mail to be detected contains preset behavior characteristics.
The behavior characteristics of the mail can comprise various behaviors adopted in the process of sending the mail. For the sending behavior of the mail, the behavior characteristics may include the sending time, frequency, sending IP, the type of mailbox used, the inducing behavior of "please download the attachment", "please click and view" used in the content of the mail, and so on.
The behavior characteristics of the mails can be obtained by analyzing and counting the junk mails by adopting a probability statistical mathematical model in advance. The behavior characteristics of the mails obtained through analysis and statistics can be used for establishing a phishing mail behavior characteristic library, so that when the mail flow is monitored, the behavior characteristics of the mails detected in real time can be compared with the pre-established phishing mail behavior characteristic library to judge whether suspicious phishing mails appear in the current mail flow. The pre-established phishing mail behavior feature library can be continuously updated to adapt to new behavior features of phishing mails.
Step 103, if the mail to be detected contains the preset behavior characteristics, determining that the mail to be detected is a phishing mail.
In this embodiment, if detection determines that the mail to be detected contains the predetermined behavior characteristics, that is, the behavior characteristics detected from the mail hit behavior characteristics in the pre-established phishing mail behavior feature library, the mail to be detected is determined to be a phishing mail.
The content of phishing mails can vary widely, whereas their behavior characteristics are relatively stable. In the phishing mail detection method of this embodiment, the mail to be detected is obtained from the mail flow, whether it contains the preset behavior characteristics is detected, and if it does, the mail is determined to be a phishing mail. Because the judgment relies on behavior characteristics rather than on content alone, the recognition rate of phishing mails can be improved.
Fig. 2 is a schematic flow chart of a second embodiment of the phishing mail detection method provided by the present invention. As shown in fig. 2, this embodiment differs from the first method embodiment in that detecting whether the mail to be detected contains a predetermined behavior characteristic (step 102) may include:
Step 1021, detecting whether the mail to be detected contains first-type behavior characteristics; the first type of behavior characteristics includes: in the mail to be detected, the sender mailbox or receiver mailbox used is a free mailbox or a mailbox in which a phishing event has occurred; and/or,
Step 1022, detecting whether the mail to be detected contains second-type behavior characteristics; the second type of behavior characteristics includes: in the mail to be detected, the subject, the text or the attachment of the mail contains inducing keywords, or the inquiry content in the text of the mail lacks critical description information; and/or,
Step 1023, detecting whether the mail to be detected contains third-type behavior characteristics; the third type of behavior characteristics includes: in the mail to be detected, the surface characteristics of the attachment are inconsistent with its actual characteristics; and/or,
Step 1024, detecting whether the mail to be detected contains fourth-type behavior characteristics; the fourth type of behavior characteristics includes: when the attachment of the mail to be detected runs in a virtual environment, the attachment exhibits malicious behaviors.
In this embodiment, at least one of the above detection methods may be included, or more than two of the detection methods may be included at the same time, so as to improve the accuracy of detection.
Among the first category of behavior features, when the mailbox used by a sender or a receiver is a free mailbox, rather than a mailbox of an organization such as a government body, an educational institution or a company, the probability of a phishing event is relatively high. The recognition rate of phishing mails can therefore be improved by detecting the type of mailbox used by the mail to be detected. Free mailboxes can include HOTMAIL, YAHOO, GMAIL and other free mailboxes. Whether the mailbox used in the mail to be detected belongs to an organization such as a government body, an educational institution or a company can be detected through the domain name of the mailbox, for example, the mailbox domain name of the government organization is generally gov. Whether a mailbox has had phishing events can be determined by importing relevant mailbox addresses known in advance or by subsequent machine learning.
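The first-type check can be sketched as a domain lookup plus a lookup in a list of mailboxes known to have had phishing events. The domain set, the function names and the `known_phishing_mailboxes` parameter below are assumptions chosen for illustration; a deployment would load such lists from the feature library.

```python
# Illustrative free-mail domains only; not an exhaustive or authoritative list.
FREE_MAIL_DOMAINS = {"hotmail.com", "yahoo.com", "gmail.com"}


def mailbox_domain(address: str) -> str:
    """Return the lower-cased part after the last '@' of an address."""
    return address.rpartition("@")[2].strip().lower()


def has_first_type_feature(sender: str, recipient: str,
                           known_phishing_mailboxes=frozenset()) -> bool:
    """True if the sender or recipient mailbox is a free mailbox or is
    listed as a mailbox that has had phishing events."""
    for address in (sender, recipient):
        if mailbox_domain(address) in FREE_MAIL_DOMAINS:
            return True
        if address.strip().lower() in known_phishing_mailboxes:
            return True
    return False
```

For example, `has_first_type_feature("a@gmail.com", "b@corp.example")` evaluates to `True`, because the sender uses one of the listed free-mail domains.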
In the second category of behavior features, inducing keywords are keywords that prompt or induce the reader of the mail to perform the corresponding operation. Examples are "click here", "please download the attachment", "please open the attachment to view", or similar prompting or inducing text. An inquiry, also called a price inquiry, means that one party to a transaction who intends to buy or sell a certain commodity asks the other party about the transaction conditions for that commodity. The content of an inquiry can involve price, specification, quality, quantity, packaging, shipping, requests for samples and so on, and in most cases the inquiry simply asks for the price. The inquiry may take the form of a mail. In this embodiment, if the inquiry content in the mail is overly simple, for example, it states no explicit requirement for the product, description, price or specification, the mail is considered a suspicious phishing mail. In this embodiment, the recognition rate of phishing mails can be improved by detecting whether the subject, text or attachment of the mail to be detected contains inducing keywords or whether the inquiry content in the text of the mail lacks critical description information.
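A rough sketch of the second-type check is given below. The keyword lists, the `min_details` threshold and the function names are hypothetical; the text does not specify how "lacks critical description information" is measured, so counting detail words is only one simple stand-in.

```python
# All word lists are illustrative assumptions, not taken from the patent.
INDUCING_KEYWORDS = ("click here", "please download the attachment",
                     "please open the attachment")
INQUIRY_WORDS = ("inquiry", "quotation", "quote")
DETAIL_WORDS = ("price", "specification", "quantity", "quality",
                "packaging", "shipping", "sample")


def has_inducing_keyword(*texts: str) -> bool:
    """True if any inducing keyword occurs in the subject, body or
    attachment names passed in."""
    joined = " ".join(texts).lower()
    return any(keyword in joined for keyword in INDUCING_KEYWORDS)


def inquiry_lacks_details(body: str, min_details: int = 2) -> bool:
    """True if the body looks like an inquiry but mentions fewer than
    `min_details` of the usual transaction details."""
    text = body.lower()
    if not any(word in text for word in INQUIRY_WORDS):
        return False
    found = sum(1 for word in DETAIL_WORDS if word in text)
    return found < min_details


def has_second_type_feature(subject: str, body: str,
                            attachment_names=()) -> bool:
    return (has_inducing_keyword(subject, body, *attachment_names)
            or inquiry_lacks_details(body))
```

Plain substring matching is used to keep the sketch short; a real detector would likely need tokenization and language-specific keyword lists.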
In the third category of behavior features, the surface features of the attachment may include a file suffix name of the attachment and/or an icon of the attachment; the actual features of the attachment may include its behavior features after it is opened or executed. The surface features of the attachment are not limited to the above; in other embodiments, the surface features of the attachment may also include the size of the attachment.
As an example, inconsistency between the surface features and the actual features of the attachment may include:
the attachment surface is displayed in a compressed format or a text format, for example, the attachment is displayed with an icon such as a compressed package icon, a pdf icon, a word icon, a notepad icon or an excel icon, but the attachment is actually an executable file;
the surface of the attachment is displayed in a picture format, but the attachment has a link guide behavior after being actually executed; that is, the attachment looks like a picture on its surface, but has a link-oriented behavior after being opened, such as automatically jumping to another web page after being opened;
the surface of the attachment is displayed in a document format, such as an office document format or a notepad document format, but the attachment contains a link guide address in the document after being opened;
the attachment surface is shown as an executable file, such as a file with the suffix .exe, but is actually an office document with a macro virus;
the surface of the attachment is displayed as an executable file, but the attachment has the actions of remote connection, remote downloading, or installation and the like after being actually executed;
the attachment surface is shown as an executable file, but after actual execution, runs only in the background, or has the behavior of executing other programs.
Furthermore, inconsistency between the surface features and the actual features of the attachment may also include the following: the attachment appears on the surface to be an ordinary attachment, but after being opened it actually performs inducement based on the human engineering principle, that is, the inducement is designed to fit human physiological and psychological characteristics so as to improve the success rate of the inducement.
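Some of the mismatches listed above can only be observed by opening or executing the attachment, but two of them can be approximated statically by comparing the file suffix with the leading bytes of the file. The sketch below does only that; the suffix sets and function names are assumptions, and an OLE compound-file header is used as a rough proxy for a legacy Office document (it does not prove a macro is present).

```python
import os

# Leading bytes of Windows PE ("MZ") and ELF executables.
EXECUTABLE_MAGIC = (b"MZ", b"\x7fELF")
# Header of an OLE compound file, the container of legacy Office documents.
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

# Illustrative suffix sets only.
NON_EXECUTABLE_SUFFIXES = {".zip", ".rar", ".txt", ".pdf", ".doc", ".docx",
                           ".xls", ".xlsx", ".jpg", ".png"}
EXECUTABLE_SUFFIXES = {".exe", ".scr"}


def looks_executable(data: bytes) -> bool:
    return data.startswith(EXECUTABLE_MAGIC)


def surface_mismatch(filename: str, data: bytes) -> bool:
    """True if the suffix (surface feature) disagrees with the file header."""
    suffix = os.path.splitext(filename.lower())[1]
    # Shown as an archive, text, document or picture, but actually executable.
    if suffix in NON_EXECUTABLE_SUFFIXES and looks_executable(data):
        return True
    # Shown as an executable, but actually an Office compound document.
    if suffix in EXECUTABLE_SUFFIXES and data.startswith(OLE_MAGIC):
        return True
    return False
```

The remaining cases (link-guiding behavior, remote connection or download, background-only execution) depend on runtime behavior and would need the sandbox analysis described for the fourth category.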
In the fourth type of behavior characteristics, tools such as sandboxes and virtual machines can be used to run the attachment of the mail to be detected in the virtual environment to determine whether the attachment contains malicious behaviors or behaviors such as carrying viruses.
In the above embodiment, if the mail to be detected includes any two of the first type of behavior feature, the second type of behavior feature, the third type of behavior feature, and the fourth type of behavior feature, it may be determined that the mail to be detected is a phishing mail. In order to further improve the recognition rate of the phishing mails and reduce the occurrence of false detection, in another embodiment of the present invention, if the to-be-detected mails contain predetermined behavior characteristics, it is determined that the to-be-detected mails are phishing mails (step 103), including:
and if the mail to be detected contains at least three types of behavior characteristics of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics and the fourth type of behavior characteristics, determining that the mail to be detected is a phishing mail.
In the embodiment, the mails to be detected are detected through four dimensions, so that the recognition rate of the phishing mails can be further improved; and only if the mail to be detected contains at least three types of behavior characteristics of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics and the fourth type of behavior characteristics, the mail to be detected can be determined to be the phishing mail, so that the recognition rate of the phishing mail is improved, and meanwhile, the accuracy rate of detection can also be improved.
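The two decision rules described above (any two of the four types, or at least three of the four types) differ only in a threshold. A minimal sketch, with the hypothetical function name `is_phishing` and the stricter rule as the default:

```python
def is_phishing(marks, threshold: int = 3) -> bool:
    """Decide from four 0/1 marks, one per behavior type.

    `marks` is a sequence of four truthy/falsy values in the order
    (first type, second type, third type, fourth type).
    """
    if len(marks) != 4:
        raise ValueError("expected one mark per behavior type (4 in total)")
    return sum(1 for mark in marks if mark) >= threshold
```

With the default threshold, `is_phishing((1, 1, 1, 0))` is `True` and `is_phishing((1, 1, 0, 0))` is `False`; passing `threshold=2` gives the looser rule of the earlier embodiment.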
The following describes the technical solution of the embodiment of the method shown in fig. 1 in detail by using a specific embodiment.
Fig. 3 is a flowchart of a third embodiment of the phishing mail detection method provided by the present invention. An applicable scenario of this embodiment is one in which a security device deployed in bypass mode on a mail server link detects phishing mails in the mail flow. As shown in fig. 3, the method of this embodiment may include:
Step 301, establishing a phishing mail behavior feature library.
The junk mail may be analyzed and counted in advance by using a probability statistical mathematical model to obtain behavior characteristics of the phishing mail, and each behavior characteristic of the phishing mail is stored in a characteristic library, for example, first to fourth types of behaviors and specific behavior characteristics corresponding to each type of behavior are stored in the characteristic library, and the characteristic library in this embodiment may be shown in the following table:
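[Table image BDA0001908430780000091 in the original: the phishing mail behavior feature library, listing the first to fourth types of behavior and the specific behavior characteristics corresponding to each type]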
Step 302, obtaining the mail to be detected from the mail flow.
In this embodiment, the to-be-detected mail is acquired from the mail traffic by the security device deployed on the link of the mail server in a bypass manner. Specifically, the mail traffic may be mirrored to the security device, so that the security device locally obtains the mail to be detected from the mail traffic. After the mail to be detected is obtained and restored, metadata, mail text information, attachments and the like in the mail to be detected are obtained. Wherein the metadata includes: the address of the mail server, the sender and the address thereof, the receiver and the address thereof, the sending time, the subject and the like. And classifying and storing the data obtained after reduction so as to carry out analysis processing subsequently.
Step 303, detecting whether the mail to be detected contains preset behavior characteristics.
The behavior characteristics of the mail to be detected are determined from the restored data and matched one by one against the characteristics stored in the feature library. The specific matching process may be as follows:
(1) matching the first type of behavior: the sender mailbox or receiver mailbox is analyzed; when the mailbox is a free mailbox such as HOTMAIL, YAHOO or GMAIL, or a mailbox in which phishing events frequently occur, this behavior of the currently detected mail is marked as 1, otherwise it is marked as 0;
(2) matching the second type of behavior: when the mail body, subject or attachment contains inducing keywords or inducing prompts such as a request to click the attachment to view its content, or when the inquiry content in the mail body states no clear requirements for the specification and price of the product, this behavior of the currently detected mail is marked as 1, otherwise it is marked as 0;
(3) matching the third type of behavior: the surface characteristics and the actual characteristics of the mail attachment are compared; when any behavior characteristic of the third type of behavior in the table above appears, the surface characteristics and the actual characteristics of the attachment are considered inconsistent and this behavior of the currently detected mail is marked as 1, otherwise it is marked as 0;
(4) matching the fourth type of behavior: the attachment is analyzed in detail with an analysis tool such as a sandbox; when the attachment is found to have malicious behavior, this behavior of the currently detected mail is marked as 1, otherwise it is marked as 0.
Step 304, determining whether the mail to be detected is a phishing mail.
The analysis result of step 303 is read. When three behaviors of the currently detected mail are marked as 1, that is, when the behaviors of the currently detected mail hit three types of behaviors in the feature library, the currently detected mail can be judged to be a phishing mail, and an attack log can be reported.
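The judgment in step 304 can be sketched as a simple count over the 0/1 marks produced in step 303. This is only an illustration: the function name `is_phishing`, the dictionary keys and the `threshold` parameter are hypothetical, and the default of three follows the "at least three types" condition stated elsewhere in this document.

```python
def is_phishing(marks: dict[str, int], threshold: int = 3) -> bool:
    """Return True when at least `threshold` behavior types are marked 1."""
    hits = sum(1 for mark in marks.values() if mark == 1)
    return hits >= threshold


# Hypothetical marks for one mail: three of the four behavior types were hit.
marks = {
    "mailbox_type": 1,         # first type of behavior
    "inducing_content": 1,     # second type of behavior
    "attachment_mismatch": 1,  # third type of behavior
    "sandbox_malicious": 0,    # fourth type of behavior
}
if is_phishing(marks):
    print("phishing mail detected, reporting attack log")
```

Keeping the threshold as a parameter makes it easy to compare the stricter three-of-four rule with looser settings when tuning for false detections.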
Step 305, performing machine learning using the metadata, behavior characteristics and other information of the mail currently judged to be a phishing mail as a sample.
Step 306, updating the phishing mail behavior feature library through machine learning.
In this embodiment, after the currently detected mail is determined to be a phishing mail, the relevant feature information of the mail can be used as a labeled sample for supervised machine learning, so that new learning data is continuously added and the feature library is continuously updated, thereby improving the detection accuracy for phishing mails.
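The embodiment does not specify a learning algorithm. Purely to illustrate the feedback loop of steps 305 and 306, the sketch below uses a simple counting stand-in rather than a real supervised model: each confirmed phishing mail is reported to a library object, and a sender domain is added to the set of mailboxes with phishing events once it has been reported often enough. The class name `FeatureLibrary`, the method name and the `min_reports` parameter are all hypothetical.

```python
from collections import Counter


class FeatureLibrary:
    """Toy feature library that learns sender domains from confirmed samples."""

    def __init__(self, min_reports: int = 2):
        self.flagged_domains = set()   # domains treated as having phishing events
        self._reports = Counter()      # confirmed phishing mails seen per domain
        self._min_reports = min_reports

    def learn_from_confirmed(self, sender_address: str) -> None:
        """Record one confirmed phishing mail and update the library if needed."""
        domain = sender_address.rsplit("@", 1)[-1].lower()
        self._reports[domain] += 1
        if self._reports[domain] >= self._min_reports:
            self.flagged_domains.add(domain)
```

A real deployment would replace the counter with a trained model and would also learn from the other behavior characteristics of each sample, not only the sender domain.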
In a second aspect, an embodiment of the present invention provides a phishing mail detection apparatus, which can improve the recognition rate of phishing mails.
Fig. 4 is a schematic structural diagram of a first embodiment of the phishing mail detection apparatus according to an embodiment of the present invention. As shown in fig. 4, the apparatus of this embodiment may include: an acquisition module 11, a detection module 12 and a judgment module 13. The acquisition module 11 is configured to obtain the mail to be detected from the mail traffic. In this embodiment, the mail traffic in the backbone network, for example the mail traffic on the mail server link, may be collected, and the mail traffic passing through the local area network gateway may also be collected.
In order not to affect the existing network environment, a security device can be deployed in a bypass mode on a mail server link or a local area network gateway, the mail traffic is collected through the security device deployed in the bypass, the collected mail traffic is restored locally in the security device, and then the mail to be detected is obtained.
The data obtained after restoring the collected mail traffic may include metadata, mail body information, attachments, and the like. The metadata includes: the address of the mail server, the sender and its address, the recipient and its address, the sending time, the subject and the like. The restored data can be classified and stored for subsequent analysis and processing.
The detecting module 12 is configured to detect whether the mail to be detected contains a predetermined behavior characteristic.
The behavior characteristics of the mail can comprise various behaviors adopted in the process of sending the mail. For the sending behavior of the mail, the behavior characteristics may include the sending time, frequency, sending IP, the type of mailbox used, the inducing behavior of "please download the attachment", "please click and view" used in the content of the mail, and so on.
The behavior characteristics of the mails can be obtained by analyzing and counting the junk mails by adopting a probability statistical mathematical model in advance. The behavior characteristics of the mails obtained through analysis and statistics can be used for establishing a phishing mail behavior characteristic library, so that when the mail flow is monitored, the behavior characteristics of the mails detected in real time can be compared with the pre-established phishing mail behavior characteristic library to judge whether suspicious phishing mails appear in the current mail flow. The pre-established phishing mail behavior feature library can be continuously updated to adapt to new behavior features of phishing mails.
The judging module 13 is configured to determine that the mail to be detected is a phishing mail if the mail to be detected includes a predetermined behavior characteristic.
In this embodiment, if detection determines that the mail to be detected contains the predetermined behavior characteristic, that is, the behavior characteristic detected from the mail to be detected hits a behavior characteristic in the pre-established phishing mail behavior feature library, the mail to be detected is determined to be a phishing mail.
The content of phishing mails can vary widely, but their behavior characteristics are relatively stable. According to the phishing mail detection apparatus of this embodiment, the mail to be detected is obtained from the mail traffic, whether the mail to be detected contains the preset behavior characteristics is detected, and if it does, the mail to be detected can be determined to be a phishing mail.
Fig. 5 is a schematic structural diagram of a second embodiment of the phishing mail detection apparatus provided by the present invention. As shown in fig. 5, the difference from the first embodiment of the apparatus shown in fig. 4 is that in this embodiment, the detection module 12 may further include: a first detection submodule 121, a second detection submodule 122, a third detection submodule 123, and/or a fourth detection submodule 124.
The first detection submodule 121 is configured to detect whether the mail to be detected contains a first type of behavior feature; the first type of behavior features includes: in the mails to be detected, the used sender mailbox or receiver mailbox is a free mailbox or a mailbox with a phishing event.
When the mailbox used by the sender or the recipient is a free mailbox rather than a mailbox used by an organization such as a government body, educational institution or company, the probability of a phishing event is relatively high. Detecting the type of mailbox used by the mail to be detected can therefore improve the recognition rate of phishing mails. Free mailboxes may include HOTMAIL, YAHOO, GMAIL and other free mailboxes. Whether the mailbox used in the mail to be detected belongs to an organization such as a government body, educational institution or company can be detected through the domain name of the mailbox; for example, the mailbox domain name of a government organization is generally gov. Whether a mailbox has had phishing events can be determined by importing relevant mailbox addresses known in advance or through subsequent machine learning.
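A minimal sketch of the first-type check follows, assuming the sender and recipient addresses have already been extracted from the metadata. The three domain names are assumptions standing in for the HOTMAIL, YAHOO and GMAIL mailboxes named above, and `first_type_mark` and its `flagged` parameter (domains known to have had phishing events) are hypothetical names.

```python
# Assumed domain names for the free mailboxes named in the text.
FREE_MAIL_DOMAINS = {"hotmail.com", "yahoo.com", "gmail.com"}


def first_type_mark(sender: str, recipient: str,
                    flagged: frozenset = frozenset()) -> int:
    """Return 1 if either mailbox is free or known for phishing events, else 0."""
    for address in (sender, recipient):
        domain = address.rsplit("@", 1)[-1].strip().lower()
        if domain in FREE_MAIL_DOMAINS or domain in flagged:
            return 1
    return 0
```

For example, `first_type_mark("a@gmail.com", "b@corp.example")` yields a mark of 1 under these assumptions, while two organizational addresses that are not in `flagged` yield 0.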
The second detecting sub-module 122 is configured to detect whether the mail to be detected contains a second type of behavior feature; the second category of behavior characteristics includes: in the mail to be detected, the subject, text or attachment of the mail contains inductive keywords, or the inquiry content in the text of the mail lacks critical description information.
Inducing keywords are keywords that prompt or induce the reader of the mail to perform related operations. For example, they may be "click here", "please download the attachment", "please open the attachment to view", or similar prompting or inducing text. An inquiry, also called a price inquiry, means that one party to a transaction who is ready to buy or sell a certain commodity asks the other party about the transaction conditions for buying or selling it. The content of an inquiry can involve price, specification, quality, quantity, packaging, shipping, sample requests and the like, though most inquiries simply ask for the price. An inquiry may take the form of a mail. In this embodiment, if the inquiry content in the mail is overly simple, for example there is no explicit requirement for the product description, price or specification, the mail is considered a suspicious phishing mail. In this embodiment, the recognition rate of phishing mails can be improved by detecting whether the subject, body or attachment of the mail to be detected contains inducing keywords, or whether the inquiry content in the mail body lacks critical description information.
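The keyword part of the second-type check can be sketched as a case-insensitive substring search over the subject, body and attachment names. The phrase list reuses the three examples quoted above; the function name `has_inducing_keyword` is hypothetical, and the check for inquiries lacking critical description information is not covered by this sketch.

```python
# Example phrases taken from the description; a real list would be longer.
INDUCING_PHRASES = (
    "click here",
    "please download the attachment",
    "please open the attachment to view",
)


def has_inducing_keyword(subject: str, body: str, attachment_names=()) -> bool:
    """Return True if any inducing phrase appears in the mail's text fields."""
    haystack = " ".join([subject, body, *attachment_names]).lower()
    return any(phrase in haystack for phrase in INDUCING_PHRASES)
```

Plain substring matching is used here for clarity; it will miss paraphrases and other languages, which is one reason the description leaves room for updating the feature library over time.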
The third detecting sub-module 123 is configured to detect whether the mail to be detected contains a third type of behavior feature; the third category of behavior characteristics includes: in the mail to be detected, the surface characteristics of the attachment are inconsistent with the actual characteristics.
The surface characteristics of the attachment may include the file suffix name of the attachment and/or the icon of the attachment; the actual characteristics of the attachment may include the behavior characteristics of the attachment after it is opened or executed. The surface characteristics of the attachment are not limited to the above; in other embodiments, the surface characteristics may also include the size of the attachment.
For an exemplary description of the inconsistency between the surface characteristics and the actual characteristics of the attachment, reference may be made to the foregoing method embodiments, and further description is omitted here.
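One of the inconsistencies named in the claims, an attachment displayed in a compressed, text or similar format that is actually an executable file, can be approximated statically by comparing the file suffix with the leading bytes of the content. This is a swapped-in simplification: the embodiment compares against behavior after opening or execution, which this sketch does not do. The suffix set and the function name `third_type_mark` are assumptions; `MZ` and `\x7fELF` are the well-known leading signatures of Windows PE and ELF executables.

```python
import os

# Leading bytes of Windows PE ("MZ") and ELF executables.
EXECUTABLE_MAGIC = (b"MZ", b"\x7fELF")
# Assumed set of suffixes that present the attachment as a non-executable file.
HARMLESS_LOOKING_SUFFIXES = {".txt", ".zip", ".rar", ".jpg", ".png", ".doc", ".pdf"}


def third_type_mark(filename: str, content: bytes) -> int:
    """Return 1 if the suffix looks harmless but the content is executable."""
    suffix = os.path.splitext(filename)[1].lower()
    looks_harmless = suffix in HARMLESS_LOOKING_SUFFIXES
    is_executable = content.startswith(EXECUTABLE_MAGIC)
    return 1 if looks_harmless and is_executable else 0
```

The other listed inconsistencies, such as a picture that triggers link guidance or an executable that only runs in the background, depend on observed runtime behavior and would need a sandbox rather than a byte comparison.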
The fourth detection sub-module 124 is configured to detect whether the mail to be detected contains a fourth type of behavior feature; the fourth category of behavior characteristics includes: and when the attachment of the mail to be detected runs in the virtual environment, the attachment contains malicious behaviors. Specifically, the attachment of the mail to be detected can be run in the virtual environment by using tools such as a sandbox and a virtual machine, so as to determine whether the attachment contains a malicious behavior or a behavior such as carrying a virus.
In the embodiment of the apparatus, the detection module 12 may include any two of the first detection sub-module 121, the second detection sub-module 122, the third detection sub-module 123, and the fourth detection sub-module 124; in that case, the mail to be detected may be a phishing mail. In order to further increase the recognition rate of phishing mails and reduce false detections, in an embodiment of the invention, the detection module 12 specifically includes at least three of the first detection sub-module 121, the second detection sub-module 122, the third detection sub-module 123, and the fourth detection sub-module 124, which are configured to detect the behaviors of the mail to be detected, and the mail to be detected is determined to be a phishing mail if it includes at least three of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics, and the fourth type of behavior characteristics.
The apparatus of this embodiment may be used to execute the technical solutions of the method embodiments shown in fig. 1, fig. 2, or fig. 3, and the implementation principles and technical effects thereof are similar and will not be described herein again.
In a third aspect, an embodiment of the present invention further provides an electronic device. Fig. 6 is a schematic structural diagram of an embodiment of an electronic device according to an embodiment of the present invention, which may implement a flow of the embodiment shown in fig. 1, fig. 2, or fig. 3 of the present invention, and as shown in fig. 6, the electronic device may include: the device comprises a shell 41, a processor 42, a memory 43, a circuit board 44 and a power circuit 45, wherein the circuit board 44 is arranged inside a space enclosed by the shell 41, and the processor 42 and the memory 43 are arranged on the circuit board 44; a power supply circuit 45 for supplying power to each circuit or device of the electronic apparatus; the memory 43 is used for storing executable program code; the processor 42 executes a program corresponding to the executable program code by reading the executable program code stored in the memory 43, for executing the phishing mail detection method described in any of the foregoing embodiments.
The specific execution process of the above steps by the processor 42 and the steps further executed by the processor 42 by running the executable program code may refer to the description of the embodiment shown in fig. 1 of the present invention, and are not described herein again.
The electronic device may take many forms, including but not limited to a desktop computer with computing and processing capabilities, or other electronic devices with computing and processing capabilities.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium storing one or more programs, which are executable by one or more processors, for performing the phishing mail detection method of any of the preceding embodiments.
The content of phishing mails can vary widely, but their behavior characteristics are relatively stable. In the embodiment of the invention, the mail to be detected is acquired from the mail traffic, whether the mail to be detected contains the preset behavior characteristics is detected, and if it does, the mail to be detected can be determined to be a phishing mail. Furthermore, the mail to be detected is examined along four dimensions, which can further improve the recognition rate of phishing mails; and the mail to be detected is determined to be a phishing mail only if it contains at least three of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics and the fourth type of behavior characteristics, so that the detection accuracy is improved along with the recognition rate.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments.
In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
For convenience of description, the above apparatus is described with its functions divided into various units/modules. Of course, when implementing the invention, the functions of the units/modules may be implemented in one or more pieces of software and/or hardware.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A phishing mail detection method, comprising:
acquiring a mail to be detected from mail flow;
detecting whether the mail to be detected contains preset behavior characteristics;
and if the mail to be detected contains the preset behavior characteristics, determining that the mail to be detected is a phishing mail.
2. A phishing mail detection method as claimed in claim 1 wherein said detecting whether said mail to be detected contains predetermined behavioral characteristics comprises:
detecting whether the mail to be detected contains first type behavior characteristics; the first type of behavior characteristics includes: in the mail to be detected, the sender mailbox or recipient mailbox used is a free mailbox or a mailbox that has had phishing events; and/or,
detecting whether the mail to be detected contains second type behavior characteristics; the second type of behavior characteristics includes: in the mail to be detected, the subject, the body or the attachment of the mail contains inducing keywords, or the inquiry content in the body of the mail lacks critical description information; and/or,
detecting whether the mail to be detected contains third type behavior characteristics; the third type of behavior characteristics includes: in the mail to be detected, the surface characteristics of the attachment are inconsistent with its actual characteristics; and/or,
detecting whether the mail to be detected contains fourth type behavior characteristics; the fourth type of behavior characteristics includes: when the attachment of the mail to be detected is run in a virtual environment, the attachment exhibits malicious behavior.
3. A phishing mail detection method as claimed in claim 2, wherein the surface characteristics of the attachment comprise a file suffix name of the attachment and/or an icon of the attachment; the actual characteristics of the attachment comprise the behavior characteristics of the attachment after it is opened or executed.
4. A phishing mail detection method as claimed in claim 2, wherein the inconsistency between the surface characteristics and the actual characteristics of the attachment comprises:
the attachment surface is displayed in a compressed format or a text format, but the attachment is actually an executable file;
the surface of the attachment is displayed in a picture format, but the attachment has a link guide behavior after being actually executed;
the attachment surface is displayed in a document format, but the attachment contains a link guide address in the document after being opened;
the attachment surface appears as an executable file, but is actually an office document with a macro virus;
the attachment surface is displayed as an executable file, but has the behavior of remote connection, remote downloading or installation after actual execution;
the attachment surface is shown as an executable file, but after actual execution, runs only in the background, or has the behavior of executing other programs.
5. A phishing mail detection method as claimed in claim 2 wherein determining that said mail to be detected is a phishing mail if said mail to be detected contains a predetermined behavioral characteristic comprises:
and if the mail to be detected contains at least three types of behavior characteristics of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics and the fourth type of behavior characteristics, determining that the mail to be detected is a phishing mail.
6. A phishing mail detection apparatus comprising:
the acquisition module is used for acquiring the mail to be detected from the mail flow;
the detection module is used for detecting whether the mail to be detected contains the preset behavior characteristics;
and the judging module is used for determining that the mail to be detected is a phishing mail if the mail to be detected contains the preset behavior characteristics.
7. A phishing mail detection device as in claim 6 wherein said detection module comprises:
the first detection submodule is used for detecting whether the mail to be detected contains first type behavior characteristics; the first type of behavior characteristics includes: in the mail to be detected, the sender mailbox or recipient mailbox used is a free mailbox or a mailbox that has had phishing events; and/or,
the second detection submodule is used for detecting whether the mail to be detected contains second type behavior characteristics; the second type of behavior characteristics includes: in the mail to be detected, the subject, the body or the attachment of the mail contains inducing keywords, or the inquiry content in the body of the mail lacks critical description information; and/or,
the third detection submodule is used for detecting whether the mail to be detected contains third type behavior characteristics; the third type of behavior characteristics includes: in the mail to be detected, the surface characteristics of the attachment are inconsistent with its actual characteristics; and/or,
the fourth detection submodule is used for detecting whether the mail to be detected contains fourth type behavior characteristics; the fourth type of behavior characteristics includes: when the attachment of the mail to be detected is run in a virtual environment, the attachment exhibits malicious behavior.
8. A phishing mail detection apparatus as claimed in claim 7, wherein the surface characteristics of the attachment comprise the file suffix name of the attachment and/or the icon of the attachment; the actual characteristics of the attachment comprise the behavior characteristics of the attachment after it is opened or executed.
9. A phishing mail detection apparatus as claimed in claim 7, wherein the inconsistency between the surface characteristics and the actual characteristics of the attachment comprises:
the attachment surface is displayed in a compressed format or a text format, but the attachment is actually an executable file;
the surface of the attachment is displayed in a picture format, but the attachment has a link guide behavior after being actually executed;
the attachment surface is displayed in a document format, but the attachment contains a link guide address in the document after being opened;
the attachment surface appears as an executable file, but is actually an office document with a macro virus;
the attachment surface is displayed as an executable file, but has the behavior of remote connection, remote downloading or installation after actual execution;
the attachment surface is shown as an executable file, but after actual execution, runs only in the background, or has the behavior of executing other programs.
10. A phishing mail detection apparatus as in claim 7 wherein said determination module is specifically configured to: and if the mail to be detected contains at least three types of behavior characteristics of the first type of behavior characteristics, the second type of behavior characteristics, the third type of behavior characteristics and the fourth type of behavior characteristics, determining that the mail to be detected is a phishing mail.
11. An electronic device, characterized in that the electronic device comprises: the device comprises a shell, a processor, a memory, a circuit board and a power circuit, wherein the circuit board is arranged in a space enclosed by the shell, and the processor and the memory are arranged on the circuit board; a power supply circuit for supplying power to each circuit or device of the electronic apparatus; the memory is used for storing executable program codes; the processor executes a program corresponding to the executable program code by reading the executable program code stored in the memory for performing the method of any of the preceding claims.
12. A computer readable storage medium, characterized in that the computer readable storage medium stores one or more programs which are executable by one or more processors to implement the method of any preceding claim.
CN201811546175.7A 2018-12-17 2018-12-17 Phishing mail detection method and device, electronic equipment and storage medium Pending CN110868378A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811546175.7A CN110868378A (en) 2018-12-17 2018-12-17 Phishing mail detection method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811546175.7A CN110868378A (en) 2018-12-17 2018-12-17 Phishing mail detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110868378A true CN110868378A (en) 2020-03-06

Family

ID=69651893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811546175.7A Pending CN110868378A (en) 2018-12-17 2018-12-17 Phishing mail detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110868378A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111404939A (en) * 2020-03-16 2020-07-10 深信服科技股份有限公司 Mail threat detection method, device, equipment and storage medium
CN112003779A (en) * 2020-07-28 2020-11-27 杭州安恒信息技术股份有限公司 Phishing mail detection method and medium based on dynamic and static link characteristic identification
CN113794674A (en) * 2021-03-09 2021-12-14 北京沃东天骏信息技术有限公司 Method, device and system for detecting mail
CN114004604A (en) * 2021-12-30 2022-02-01 北京微步在线科技有限公司 Method and device for detecting URL data in mail and electronic equipment
CN114363033A (en) * 2021-12-29 2022-04-15 湖北天融信网络安全技术有限公司 Mail management and control method and device, network security equipment and storage medium
CN115643095A (en) * 2022-10-27 2023-01-24 山东星维九州安全技术有限公司 Method and system for security test of internal network of company

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106685803A (en) * 2016-12-29 2017-05-17 北京安天网络安全技术有限公司 Method and system of tracing APT attack event based on phishing mail
CN108200105A (en) * 2018-03-30 2018-06-22 杭州迪普科技股份有限公司 A kind of method and device for detecting fishing mail
US10063584B1 (en) * 2016-08-17 2018-08-28 Wombat Security Technologies, Inc. Advanced processing of electronic messages with attachments in a cybersecurity system
CN108965350A (en) * 2018-10-23 2018-12-07 杭州安恒信息技术股份有限公司 A kind of mail auditing method, device and computer readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10063584B1 (en) * 2016-08-17 2018-08-28 Wombat Security Technologies, Inc. Advanced processing of electronic messages with attachments in a cybersecurity system
CN106685803A (en) * 2016-12-29 2017-05-17 北京安天网络安全技术有限公司 Method and system of tracing APT attack event based on phishing mail
CN108200105A (en) * 2018-03-30 2018-06-22 杭州迪普科技股份有限公司 A kind of method and device for detecting fishing mail
CN108965350A (en) * 2018-10-23 2018-12-07 杭州安恒信息技术股份有限公司 A kind of mail auditing method, device and computer readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
华师傅咨询: "《我来教你"杀"电脑极限防毒、防黑必读手册》", 30 June 2004 *
王俊: "新网络环境下的黑客欺骗", 《科技资讯》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111404939A (en) * 2020-03-16 2020-07-10 深信服科技股份有限公司 Mail threat detection method, device, equipment and storage medium
CN112003779A (en) * 2020-07-28 2020-11-27 杭州安恒信息技术股份有限公司 Phishing mail detection method and medium based on dynamic and static link characteristic identification
CN113794674A (en) * 2021-03-09 2021-12-14 北京沃东天骏信息技术有限公司 Method, device and system for detecting mail
CN113794674B (en) * 2021-03-09 2024-04-09 北京沃东天骏信息技术有限公司 Method, device and system for detecting mail
CN114363033A (en) * 2021-12-29 2022-04-15 湖北天融信网络安全技术有限公司 Mail management and control method and device, network security equipment and storage medium
CN114004604A (en) * 2021-12-30 2022-02-01 北京微步在线科技有限公司 Method and device for detecting URL data in mail and electronic equipment
CN115643095A (en) * 2022-10-27 2023-01-24 山东星维九州安全技术有限公司 Method and system for security test of internal network of company
CN115643095B (en) * 2022-10-27 2023-08-29 山东星维九州安全技术有限公司 Method and system for testing network security inside company

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20200306