CN116319654A - Intelligent type junk mail scanning method - Google Patents

Info

Publication number
CN116319654A
Authority
CN
China
Prior art keywords
mail
suspected
preset
junk
spam
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310385460.XA
Other languages
Chinese (zh)
Inventor
王宇飞
戚红建
韩硕
张洪卫
秦绪帅
邓旭楠
朱梦迪
秦子杨
薛松
孟庆宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Bidding Branch Of China Huaneng Group Co ltd
Huaneng Information Technology Co Ltd
Original Assignee
Beijing Bidding Branch Of China Huaneng Group Co ltd
Huaneng Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bidding Branch Of China Huaneng Group Co ltd, Huaneng Information Technology Co Ltd
Priority to CN202310385460.XA
Publication of CN116319654A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/42 Mailbox-related aspects, e.g. synchronisation of mailboxes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1466 Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks

Abstract

The invention relates to the technical field of mail transmission and discloses an intelligent junk mail scanning method. The method receives a mail to be sent and obtains its IP address and mail information, then judges from the IP address whether the mail is a suspected junk mail. If so, the suspected junk mail is preprocessed according to its mail subject and mail body, and feature processing is performed on the preprocessed mail to obtain its trusted value. Whether the suspected junk mail meets a first preset condition is judged from the relation between the trusted value and a preset trusted value. If it does, the suspected junk mail is sent to a virtual mail receiving server, and it is judged whether the mail exhibits malicious attack behavior within a first preset time; when it does, the suspected junk mail is identified as junk mail.

Description

Intelligent type junk mail scanning method
Technical Field
The invention relates to the technical field of mail transmission, in particular to an intelligent junk mail scanning method.
Background
Junk e-mail (spam for short) refers to any e-mail delivered to a user's mailbox without the user's permission. E-mail is one of the basic applications for today's internet users, and spam is sent primarily through e-mail. Spam is increasingly widespread: it increases the time ordinary users spend processing mail, wastes valuable mail system resources, and hinders users from obtaining useful information. Spam is therefore a problem that urgently needs to be solved in the field of network communication.
In the prior art, a blacklist and a whitelist are set in advance. Mail sent by mailbox users on the whitelist passes preferentially and is not rejected as junk mail, while mail sent by mailbox users on the blacklist is intercepted and cannot pass. However, with this scanning approach a spam sender can hijack the mailboxes of legitimate users, for example by means of Trojans or virus programs, and make those mailboxes send large volumes of junk mail slowly, thereby bypassing blacklist interception and completing the delivery of the junk mail.
Therefore, how to provide a method for accurately scanning the junk mail is a technical problem to be solved at present.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide an intelligent junk mail scanning method that addresses the technical problem that junk mail cannot be scanned accurately, and thereby keeps junk mail from degrading network transmission and operation speed and congesting mail servers.
In order to achieve the above object, the present invention provides an intelligent spam scanning method, the method comprising:
receiving a mail to be sent, and parsing the mail to be sent to obtain an IP address and mail information of the mail to be sent, wherein the mail information comprises a mail subject and a mail body;
judging whether the mail to be sent is a suspected junk mail according to the IP address, and when the mail to be sent is a suspected junk mail, preprocessing the suspected junk mail according to the mail subject and the mail body;
performing feature processing on the preprocessed suspected junk mail to obtain a trusted value of the suspected junk mail, and judging whether the suspected junk mail meets a first preset condition according to the relation between the trusted value of the suspected junk mail and a preset trusted value;
when the suspected junk mail meets the first preset condition, sending the suspected junk mail to a virtual mail receiving server, judging whether the suspected junk mail exhibits malicious attack behavior within a first preset time, and when it does, identifying the suspected junk mail as junk mail.
In one embodiment, when determining whether the mail to be sent is a suspected spam according to the IP address, the method includes:
determining the mail sending server of the mail to be sent according to the IP address, and judging whether the mail sending server is in a blacklist;
if the mail sending server is in the blacklist, judging that the mail to be sent is junk mail;
and if the mail sending server is not in the blacklist, judging that the mail to be sent is suspected junk mail.
In one embodiment, when the pre-processing is performed on the suspected spam according to the mail subject and the mail body, the pre-processing includes:
performing word segmentation on the mail subject and the mail body to obtain a plurality of segmented words, and determining the number of occurrences of each segmented word;
identifying and classifying each segmented word according to the relation between its number of occurrences and a preset number of occurrences:
when the number of occurrences is greater than the preset number of occurrences, identifying the segmented word as a high-frequency word;
and when the number of occurrences is less than or equal to the preset number of occurrences, identifying the segmented word as a low-frequency word.
In one embodiment, after identifying and classifying the word segment according to the relationship between the occurrence frequency and the preset occurrence frequency, the method further includes:
the weight of the high frequency word is calculated according to the following formula:
Figure SMS_1
wherein φ is the weight of the high-frequency word, and X_{a,b} is the number of times that high-frequency word a appears in suspected spam b.
In one embodiment, when performing feature processing on the pre-processed suspected spam to obtain a trusted value of the suspected spam, the method includes:
judging whether the high-frequency word is in a preset recognition library, and if the high-frequency word is in the preset recognition library, marking the high-frequency word as a spam keyword;
and obtaining the number of occurrences of the spam keyword, and calculating the trusted value of the suspected spam according to the number of occurrences of the spam keyword and the weight of the high-frequency word.
In one embodiment, the trusted value of the suspected spam is calculated according to the following equation:
P=K×φ;
wherein P is the trusted value of the suspected spam, K is the number of occurrences of the spam keyword, and φ is the weight of the high-frequency word.
In one embodiment, when judging whether the suspected spam accords with a first preset condition according to the relation between the trusted value of the suspected spam and a preset trusted value, the method includes:
if the trusted value of the suspected junk mail is greater than or equal to the preset trusted value, judging that the suspected junk mail accords with the first preset condition;
and if the trusted value of the suspected junk mail is smaller than the preset trusted value, judging that the suspected junk mail does not meet the first preset condition, identifying the suspected junk mail as normal mail, and sending the normal mail.
In one embodiment, before the suspected spam is sent to the virtual mail receiving server, the method further includes:
and acquiring the target IP address of the suspected junk mail, and correcting the virtual IP address of the virtual mail receiving server based on the target IP address.
In one embodiment, after scanning the suspected spam to be spam, the method further comprises:
acquiring the quantity A of junk mails sent by the mail sending server in a second preset time;
and setting the network speed of the mail sending server according to the quantity A of the junk mails.
In one embodiment, when setting the network speed of the mail sending server according to the number of the junk mails, the method includes:
presetting a junk mail quantity matrix B for the mail sending server, and setting B (B1, B2, B3, B4), wherein B1 is a first preset junk mail quantity, B2 is a second preset junk mail quantity, B3 is a third preset junk mail quantity, B4 is a fourth preset junk mail quantity, and B1 < B2 < B3 < B4;
presetting a network speed matrix C of the mail sending server, and setting C (C1, C2, C3, C4, C5), wherein C1 is a first preset network speed, C2 is a second preset network speed, C3 is a third preset network speed, C4 is a fourth preset network speed, C5 is a fifth preset network speed, and C1 < C2 < C3 < C4 < C5;
setting the network speed of the mail sending server according to the relation between the number A of the junk mails sent by the mail sending server and the number of the preset junk mails:
when A < B1, selecting the first preset network speed C1 as the network speed of the mail sending server;
when B1 ≤ A < B2, selecting the second preset network speed C2 as the network speed of the mail sending server;
when B2 ≤ A < B3, selecting the third preset network speed C3 as the network speed of the mail sending server;
when B3 ≤ A < B4, selecting the fourth preset network speed C4 as the network speed of the mail sending server;
and when B4 ≤ A, selecting the fifth preset network speed C5 as the network speed of the mail sending server.
The invention provides an intelligent junk mail scanning method, which has the following beneficial effects compared with the prior art:
The invention discloses an intelligent junk mail scanning method. A mail to be sent is received and parsed to obtain its IP address and mail information, and whether it is a suspected junk mail is judged according to the IP address. A suspected junk mail is preprocessed according to its mail subject and mail body, and feature processing of the preprocessed mail yields its trusted value. Whether the suspected junk mail meets a first preset condition is judged from the relation between its trusted value and a preset trusted value. A suspected junk mail that meets the condition is sent to a virtual mail receiving server, where it is judged whether malicious attack behavior occurs within a first preset time; if so, the suspected junk mail is identified as junk mail.
Drawings
FIG. 1 is a flow chart of an intelligent junk mail scanning method according to an embodiment of the invention;
FIG. 2 is a flow chart of preprocessing a suspected spam according to a mail subject and a mail body in an embodiment of the invention.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
In the description of the present application, it should be understood that the terms "center," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate description of the present application and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the present application.
The terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
In the description of the present application, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the terms in this application will be understood by those of ordinary skill in the art in a specific context.
The following is a description of preferred embodiments of the invention, taken in conjunction with the accompanying drawings.
As shown in fig. 1, an embodiment of the present invention discloses an intelligent spam scanning method, which includes:
S110: Receiving a mail to be sent, and parsing the mail to be sent to obtain the IP address and the mail information of the mail to be sent, wherein the mail information comprises a mail subject and a mail body.
S120: judging whether the mail to be sent is a suspected junk mail or not according to the IP address, and when the mail to be sent is the suspected junk mail, preprocessing the suspected junk mail according to the mail subject and the mail body.
In order to prevent malicious actors from using the mailbox of a legitimate user to slowly send a large amount of junk mail, in some embodiments of the present application, judging whether the mail to be sent is a suspected junk mail according to the IP address includes:
determining the mail sending server of the mail to be sent according to the IP address, and judging whether the mail sending server is in a blacklist;
if the mail sending server is in the blacklist, judging that the mail to be sent is junk mail;
and if the mail sending server is not in the blacklist, judging that the mail to be sent is suspected junk mail.
In this embodiment, a blacklist and a whitelist are set in advance. The mail sending server of the mail to be sent is determined, and it is judged whether that server is in the blacklist. When the mail sending server is in the blacklist, the mail to be sent is directly judged to be junk mail. When the mail sending server is not in the blacklist, the mail to be sent is judged to be suspected junk mail and requires further judgment. This prevents malicious actors from using the mailbox of a legitimate user to slowly send a large amount of junk mail, and improves the accuracy of junk mail identification.
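The blacklist decision described above can be sketched in a few lines. This is only an illustration under stated assumptions: the blacklist is keyed directly by the sending server's IP address, and the names `Verdict`, `classify_by_sender`, and the example addresses are hypothetical and do not come from the patent.

```python
from enum import Enum


class Verdict(Enum):
    JUNK = "junk"            # sending server is blacklisted: treated as junk mail
    SUSPECTED = "suspected"  # not blacklisted: treated as suspected junk mail


def classify_by_sender(ip_address: str, blacklist: set[str]) -> Verdict:
    """Return JUNK if the sending server's IP is blacklisted, else SUSPECTED."""
    # Assumption: the mail sending server is identified by its IP address alone.
    return Verdict.JUNK if ip_address in blacklist else Verdict.SUSPECTED


# Hypothetical blacklist using addresses from documentation-only ranges.
BLACKLIST = {"203.0.113.7"}
print(classify_by_sender("203.0.113.7", BLACKLIST).value)    # junk
print(classify_by_sender("198.51.100.20", BLACKLIST).value)  # suspected
```

Note that a mail from a non-blacklisted server is never classified as normal at this stage; it is only marked as suspected and passed on for further judgment.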
As shown in fig. 2, in order to improve the recognition efficiency of the spam, in some embodiments of the present application, when preprocessing the suspected spam according to the mail subject and the mail body, the method includes:
S121: performing word segmentation on the mail subject and the mail body to obtain a plurality of segmented words, and determining the number of occurrences of each segmented word;
S122: identifying and classifying each segmented word according to the relation between its number of occurrences and a preset number of occurrences:
when the number of occurrences is greater than the preset number of occurrences, identifying the segmented word as a high-frequency word;
and when the number of occurrences is less than or equal to the preset number of occurrences, identifying the segmented word as a low-frequency word.
In this embodiment, the suspected junk mail is preprocessed according to the parsed mail subject and mail body. The preprocessing includes word segmentation and the identification and classification of the segmented words. Word segmentation is performed on the mail subject and mail body to obtain a plurality of segmented words. For example, suppose the mail subject is "mail standard format" and the mail body is "In order to standardize the mail standard format and build a professional enterprise culture, thereby improving the enterprise image, we need to formulate a unified mail standard format." The mail subject and mail body can then be segmented into: mail standard format, purpose, standard, mail standard format, construction, enterprise culture, specialization, promotion, enterprise image, therefore, we, need, formulation, share, unification, standard, and mail standard format. The number of occurrences of each segmented word is then determined; here "mail standard format" occurs 3 times and "standard" occurs 2 times. The segmented words are identified and classified according to the relation between the number of occurrences and the preset number of occurrences: a word whose number of occurrences is greater than the preset number is identified as a high-frequency word, and a word whose number of occurrences is less than or equal to the preset number is identified as a low-frequency word. If the preset number of occurrences is 2, "mail standard format" is a high-frequency word and all other segmented words are low-frequency words. The specific word segmentation rules and the preset number of occurrences can be set according to actual conditions.
In some embodiments of the present application, after identifying and classifying the word segment according to the relationship between the occurrence frequency and the preset occurrence frequency, the method further includes:
the weight of the high frequency word is calculated according to the following formula:
Figure SMS_2
wherein φ is the weight of the high-frequency word, and X_{a,b} is the number of times that high-frequency word a appears in suspected spam b.
In this embodiment, the weight of the high-frequency word is calculated according to the above formula, and by calculating the weight of the high-frequency word, reliable data support can be provided for calculating the trusted value of the suspected spam.
S130: Performing feature processing on the preprocessed suspected junk mail to obtain a trusted value of the suspected junk mail, and judging whether the suspected junk mail meets a first preset condition according to the relation between the trusted value of the suspected junk mail and a preset trusted value.
In order to further improve the recognition accuracy of the spam, in some embodiments of the present application, when performing feature processing on the pre-processed suspected spam to obtain a trusted value of the suspected spam, the method includes:
judging whether the high-frequency word is in a preset recognition library, and if the high-frequency word is in the preset recognition library, marking the high-frequency word as a spam keyword;
and obtaining the number of occurrences of the spam keyword, and calculating the trusted value of the suspected spam according to the number of occurrences of the spam keyword and the weight of the high-frequency word.
In some embodiments of the present application, the trusted value of the suspected spam is calculated according to the following formula:
P=K×φ;
wherein P is the trusted value of the suspected spam, K is the number of occurrences of the spam keyword, and φ is the weight of the high-frequency word.
In this embodiment, after the segmented words are divided into high-frequency words and low-frequency words, it is judged whether each high-frequency word appears in the preset recognition library. A high-frequency word that appears in the preset recognition library is a spam keyword. The number of occurrences of the spam keyword is then obtained, and the trusted value of the suspected spam is calculated from the number of occurrences of the spam keyword and the weight of the corresponding high-frequency word. It should be understood that if there is more than one spam keyword, the trusted values corresponding to the individual spam keywords are added together. For example, if one spam keyword occurs 30 times with a weight of 0.8 and a second spam keyword, such as "cash", occurs 20 times with a weight of 0.75, the trusted value of the suspected spam is 30×0.8+20×0.75=39. This example is illustrative and not limiting. Calculating the trusted value of the suspected spam allows spam to be identified accurately and improves the spam identification rate.
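The keyword lookup and the summed trusted value can be sketched as follows. The weight formula itself is not reproduced in the text, so this sketch simply takes the weights as given inputs; the function names, the placeholder keyword "keyword_1", and the contents of the recognition library are assumptions made for illustration.

```python
def spam_keywords(
    high_freq: dict[str, int], recognition_library: set[str]
) -> dict[str, int]:
    """Keep only the high-frequency words found in the preset recognition library."""
    return {w: n for w, n in high_freq.items() if w in recognition_library}


def trusted_value(keyword_counts: dict[str, int], weights: dict[str, float]) -> float:
    """Sum K * phi over all spam keywords, i.e. P = K x phi applied per keyword."""
    return sum(count * weights[word] for word, count in keyword_counts.items())


# Hypothetical inputs mirroring the worked example: 30 x 0.8 + 20 x 0.75.
high_freq = {"keyword_1": 30, "cash": 20, "meeting": 25}
library = {"keyword_1", "cash"}              # assumed library contents
weights = {"keyword_1": 0.8, "cash": 0.75}   # assumed to be precomputed

keywords = spam_keywords(high_freq, library)
print(keywords)                          # {'keyword_1': 30, 'cash': 20}
print(trusted_value(keywords, weights))  # 39.0
```

A high-frequency word that is absent from the library, such as "meeting" in this sketch, contributes nothing to the trusted value.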
In some embodiments of the present application, when determining whether the suspected spam meets the first preset condition according to the relationship between the trusted value of the suspected spam and the preset trusted value, the method includes:
if the trusted value of the suspected junk mail is greater than or equal to the preset trusted value, judging that the suspected junk mail accords with the first preset condition;
and if the trusted value of the suspected junk mail is smaller than the preset trusted value, judging that the suspected junk mail does not meet the first preset condition, identifying the suspected junk mail as normal mail, and sending the normal mail.
In this embodiment, once the trusted value of the suspected spam has been calculated, whether the suspected spam meets the first preset condition is judged from the relation between its trusted value and the preset trusted value. If the trusted value is greater than or equal to the preset trusted value, the suspected spam meets the first preset condition. If the trusted value is smaller than the preset trusted value, the suspected spam does not meet the first preset condition; it is identified as normal mail and sent.
S140: When the suspected junk mail meets the first preset condition, sending the suspected junk mail to a virtual mail receiving server, judging whether the suspected junk mail exhibits malicious attack behavior within a first preset time, and when it does, identifying the suspected junk mail as junk mail.
In order to prevent erroneous judgment, in some embodiments of the present application, before the suspected spam is sent to the virtual mail receiving server, the method further includes:
and acquiring the target IP address of the suspected junk mail, and correcting the virtual IP address of the virtual mail receiving server based on the target IP address.
In this embodiment, when the suspected spam meets the first preset condition, the target IP address of the suspected spam is obtained, and the virtual IP address of the virtual mail receiving server is corrected based on the target IP address. The IP address of the virtual mail receiving server may be updated according to the actual situation so as to confuse malicious actors. It is then judged whether the suspected spam exhibits malicious attack behavior in the virtual mail receiving server, and when it does, the suspected spam is identified as spam.
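The observation step on the virtual mail receiving server can be sketched as a check over a time window. The patent does not say how malicious attack behavior is detected, so the sketch assumes some monitoring component has already flagged each observed event; `ObservedEvent`, `confirmed_as_junk`, and the timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedEvent:
    seconds_after_delivery: float  # time since the mail reached the virtual server
    malicious: bool                # assumed flag set by an external monitor


def confirmed_as_junk(events: list[ObservedEvent], first_preset_time: float) -> bool:
    """True if any malicious event falls within the first preset time window."""
    return any(
        e.malicious and 0 <= e.seconds_after_delivery <= first_preset_time
        for e in events
    )


events = [ObservedEvent(5.0, False), ObservedEvent(42.0, True)]
print(confirmed_as_junk(events, first_preset_time=60.0))  # True
print(confirmed_as_junk(events, first_preset_time=30.0))  # False
```

The text does not state how a suspected mail is handled when no malicious behavior is observed within the window, so the sketch only reports whether the junk classification was confirmed.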
In order to prevent the mail sending server from continuously sending the spam, in some embodiments of the present application, after scanning the suspected spam into the spam, the method further includes:
acquiring the quantity A of junk mails sent by the mail sending server in a second preset time;
and setting the network speed of the mail sending server according to the quantity A of the junk mails.
In some embodiments of the present application, when setting the network speed of the mail sending server according to the number of spam, the method includes:
presetting a junk mail quantity matrix B for the mail sending server, and setting B (B1, B2, B3, B4), wherein B1 is a first preset junk mail quantity, B2 is a second preset junk mail quantity, B3 is a third preset junk mail quantity, B4 is a fourth preset junk mail quantity, and B1 < B2 < B3 < B4;
presetting a network speed matrix C of the mail sending server, and setting C (C1, C2, C3, C4, C5), wherein C1 is a first preset network speed, C2 is a second preset network speed, C3 is a third preset network speed, C4 is a fourth preset network speed, C5 is a fifth preset network speed, and C1 < C2 < C3 < C4 < C5;
setting the network speed of the mail sending server according to the relation between the number A of the junk mails sent by the mail sending server and the number of the preset junk mails:
when A < B1, selecting the first preset network speed C1 as the network speed of the mail sending server;
when B1 ≤ A < B2, selecting the second preset network speed C2 as the network speed of the mail sending server;
when B2 ≤ A < B3, selecting the third preset network speed C3 as the network speed of the mail sending server;
when B3 ≤ A < B4, selecting the fourth preset network speed C4 as the network speed of the mail sending server;
and when B4 ≤ A, selecting the fifth preset network speed C5 as the network speed of the mail sending server.
In this embodiment, the quantity A of junk mails sent by the mail sending server within the second preset time is obtained, and the network speed of the mail sending server is set according to the relation between A and each preset junk mail quantity.
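The five-way selection above is an interval lookup, which can be sketched with a binary search over the thresholds. The function name and the threshold values are hypothetical, and the speeds are represented by their labels C1 to C5 so that the sketch assumes nothing about their actual values.

```python
from bisect import bisect_right
from typing import Sequence, TypeVar

T = TypeVar("T")


def select_network_speed(a: int, thresholds: Sequence[int], speeds: Sequence[T]) -> T:
    """Pick one of C1..C5 by locating A among the thresholds B1..B4."""
    if len(speeds) != len(thresholds) + 1:
        raise ValueError("need exactly one more speed than thresholds")
    if list(thresholds) != sorted(set(thresholds)):
        raise ValueError("thresholds must be strictly increasing")
    # bisect_right counts how many thresholds are <= a:
    # 0 means A < B1, 1 means B1 <= A < B2, ..., 4 means B4 <= A.
    return speeds[bisect_right(thresholds, a)]


B = (10, 50, 100, 500)            # hypothetical preset junk mail quantities
C = ("C1", "C2", "C3", "C4", "C5")  # labels standing in for preset speeds

print(select_network_speed(5, B, C))    # C1
print(select_network_speed(10, B, C))   # C2
print(select_network_speed(99, B, C))   # C3
print(select_network_speed(500, B, C))  # C5
```

Using `bisect_right` rather than `bisect_left` makes a quantity equal to a threshold fall into the upper interval, matching the "less than or equal to" side of each condition.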
In summary, the embodiment of the invention receives a mail to be sent and parses it to obtain its IP address and mail information, the mail information comprising a mail subject and a mail body. Whether the mail to be sent is a suspected junk mail is judged according to the IP address. A suspected junk mail is preprocessed according to the mail subject and mail body, and feature processing of the preprocessed mail yields its trusted value. Whether the suspected junk mail meets a first preset condition is judged from the relation between its trusted value and a preset trusted value. A suspected junk mail that meets the condition is sent to a virtual mail receiving server, where it is judged whether malicious attack behavior occurs within a first preset time; if so, the suspected junk mail is identified as junk mail.
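The overall decision flow summarized above can be sketched as a single function. This is a simplified illustration, not the patented implementation: the inputs are assumed to have been computed by the earlier steps, the names `Outcome` and `scan_mail` are hypothetical, and the `UNCONFIRMED` outcome is an assumption for the case the text leaves unspecified.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    JUNK = "junk"
    NORMAL = "normal"
    UNCONFIRMED = "unconfirmed"  # assumption: the source does not name this case


def scan_mail(
    sender_blacklisted: bool,
    trusted_value: float,
    preset_trusted_value: float,
    observe_attack: Callable[[], bool],
) -> Outcome:
    """Apply the blacklist, trusted value, and virtual server checks in order."""
    if sender_blacklisted:
        return Outcome.JUNK    # blacklisted sending server
    if trusted_value < preset_trusted_value:
        return Outcome.NORMAL  # first preset condition not met: send as normal mail
    # First preset condition met: route to the virtual server and observe.
    if observe_attack():
        return Outcome.JUNK
    return Outcome.UNCONFIRMED


# The preset trusted value of 30.0 is a hypothetical setting.
print(scan_mail(False, 39.0, 30.0, lambda: True).value)  # junk
print(scan_mail(False, 12.0, 30.0, lambda: True).value)  # normal
```

Passing the observation as a callable means the virtual server is only consulted for mail that actually meets the first preset condition, mirroring the order of steps S120 to S140.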
In the description of the above embodiments, particular features, structures, materials, or characteristics may be combined in any suitable manner in any one or more embodiments or examples.
Although the invention has been described above with reference to embodiments, various modifications may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In particular, the features of the disclosed embodiments may be combined with each other in any manner as long as there is no structural conflict; these combinations are not described exhaustively in this specification only for the sake of brevity. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.
Those of ordinary skill in the art will appreciate that the above is only a preferred embodiment of the present invention and does not limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An intelligent spam scanning method, the method comprising:
receiving a mail to be sent, and analyzing the mail to be sent to obtain an IP address and mail information of the mail to be sent, wherein the mail information comprises a mail subject and a mail body;
judging whether the mail to be sent is a suspected junk mail or not according to the IP address, and when the mail to be sent is the suspected junk mail, preprocessing the suspected junk mail according to the mail subject and the mail body;
performing feature processing on the pre-processed suspected junk mail to obtain a trusted value of the suspected junk mail, and judging whether the suspected junk mail meets a first preset condition according to the relation between the trusted value of the suspected junk mail and a preset trusted value;
when the suspected junk mail meets the first preset condition, sending the suspected junk mail to a virtual mail receiving server, judging whether the suspected junk mail exhibits malicious attack behavior within a first preset time, and when the suspected junk mail exhibits malicious attack behavior, scanning the suspected junk mail as junk mail.
2. The intelligent spam scanning method according to claim 1, wherein determining whether the mail to be sent is a suspected spam according to the IP address comprises:
determining the mail sending server of the mail to be sent according to the IP address, and judging whether the mail sending server is in a blacklist;
if the mail sending server is in the blacklist, judging that the mail to be sent is junk mail;
and if the mail sending server is not in the blacklist, judging that the mail to be sent is suspected junk mail.
3. The intelligent spam scanning method of claim 1, wherein preprocessing the suspected spam according to the mail subject and the mail body comprises:
performing word segmentation on the mail subject and the mail body to obtain a plurality of segmented words, and determining the number of occurrences of each segmented word;
identifying and classifying the segmented words according to the relation between the number of occurrences and a preset number of occurrences;
when the number of occurrences is greater than the preset number of occurrences, identifying the segmented word as a high-frequency word;
and when the number of occurrences is less than or equal to the preset number of occurrences, identifying the segmented word as a low-frequency word.
4. The intelligent spam scanning method of claim 3, further comprising, after identifying and classifying the segmented words according to the relation between the number of occurrences and the preset number of occurrences:
the weight of the high frequency word is calculated according to the following formula:
Figure QLYQS_1
wherein φ is the weight of the high-frequency word, and X_{a,b} is the number of times that the high-frequency word a appears in the suspected spam b.
5. The intelligent spam scanning method according to claim 4, wherein performing feature processing on the pre-processed suspected spam to obtain a trusted value of the suspected spam comprises:
judging whether the high-frequency word is in a preset recognition library, and if the high-frequency word is in the preset recognition library, marking the high-frequency word as a spam keyword;
and obtaining the number of occurrences of the spam keywords, and calculating the trusted value of the suspected spam according to the number of occurrences of the spam keywords and the weight of the high-frequency word.
6. The intelligent spam scanning method of claim 5, wherein the trusted value of the suspected spam is calculated according to the following equation:
P=K×φ;
wherein P is the trusted value of the suspected junk mail, K is the number of occurrences of the spam keywords, and φ is the weight of the high-frequency word.
7. The intelligent spam scanning method according to claim 1, wherein determining whether the suspected spam meets a first preset condition according to a relationship between the trusted value of the suspected spam and a preset trusted value comprises:
if the trusted value of the suspected junk mail is greater than or equal to the preset trusted value, judging that the suspected junk mail accords with the first preset condition;
and if the trusted value of the suspected junk mail is smaller than the preset trusted value, judging that the suspected junk mail does not accord with the first preset condition, scanning the suspected junk mail as a normal mail, and sending the normal mail.
8. The intelligent spam scanning method of claim 1, further comprising, prior to sending the suspected spam to a virtual mail receiving server:
and acquiring the target IP address of the suspected junk mail, and correcting the virtual IP address of the virtual mail receiving server based on the target IP address.
9. The intelligent spam scanning method of claim 1, further comprising, after scanning the suspected spam as spam:
acquiring the quantity A of junk mails sent by the mail sending server in a second preset time;
and setting the network speed of the mail sending server according to the quantity A of the junk mails.
10. The intelligent spam scanning method of claim 9, wherein setting the network speed of the mail sending server according to the number A of spam comprises:
presetting a junk mail quantity matrix B for the mail sending server, and setting B (B1, B2, B3, B4), wherein B1 is a first preset junk mail quantity, B2 is a second preset junk mail quantity, B3 is a third preset junk mail quantity, B4 is a fourth preset junk mail quantity, and B1 < B2 < B3 < B4;
presetting a network speed matrix C for the mail sending server, and setting C (C1, C2, C3, C4, C5), wherein C1 is a first preset network speed, C2 is a second preset network speed, C3 is a third preset network speed, C4 is a fourth preset network speed, C5 is a fifth preset network speed, and C1 < C2 < C3 < C4 < C5;
setting the network speed of the mail sending server according to the relation between the number A of the junk mails sent by the mail sending server and the number of the preset junk mails:
when A < B1, selecting the first preset network speed C1 as the network speed of the mail sending server;
when B1 ≤ A < B2, selecting the second preset network speed C2 as the network speed of the mail sending server;
when B2 ≤ A < B3, selecting the third preset network speed C3 as the network speed of the mail sending server;
when B3 ≤ A < B4, selecting the fourth preset network speed C4 as the network speed of the mail sending server;
and when B4 ≤ A, selecting the fifth preset network speed C5 as the network speed of the mail sending server.
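
The interval lookup recited in claim 10 can be illustrated with a short Python sketch. It is an illustration only: the function name `select_network_speed` and the numeric values are hypothetical and do not come from the claims, and a binary search from the standard `bisect` module is just one possible way to map the count A onto the five intervals bounded by B1 to B4.

```python
from bisect import bisect_right
from typing import Sequence, TypeVar

T = TypeVar("T")


def select_network_speed(spam_count: int, thresholds: Sequence[int], speeds: Sequence[T]) -> T:
    """Pick the preset speed whose interval contains spam_count (the count A).

    thresholds holds B1..B4 in strictly increasing order;
    speeds holds C1..C5, one more entry than thresholds.
    """
    if len(speeds) != len(thresholds) + 1:
        raise ValueError("need exactly one more speed than thresholds")
    if any(lo >= hi for lo, hi in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # bisect_right counts the thresholds that are <= spam_count, so
    # A < B1 gives index 0 (C1), B1 <= A < B2 gives index 1 (C2), and so on,
    # up to B4 <= A giving index 4 (C5).
    return speeds[bisect_right(thresholds, spam_count)]


# Hypothetical presets, chosen only to exercise the lookup.
preset_counts = [10, 50, 100, 500]             # B1..B4
preset_speeds = ["C1", "C2", "C3", "C4", "C5"]  # labels standing in for speeds
chosen = select_network_speed(50, preset_counts, preset_speeds)  # B2 <= A < B3
```

Using `bisect_right` rather than `bisect_left` is what makes each lower bound inclusive, matching the "B1 ≤ A < B2" form of the claimed intervals.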
CN202310385460.XA 2023-04-11 2023-04-11 Intelligent type junk mail scanning method Pending CN116319654A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310385460.XA CN116319654A (en) 2023-04-11 2023-04-11 Intelligent type junk mail scanning method

Publications (1)

Publication Number Publication Date
CN116319654A true CN116319654A (en) 2023-06-23

Family

ID=86822379

Country Status (1)

Country Link
CN (1) CN116319654A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101217555A (en) * 2008-01-10 2008-07-09 厦门三五互联科技股份有限公司 An intelligent anti-waster and anti-virus gateway and the corresponding filtering method
CN107294834A (en) * 2016-03-31 2017-10-24 阿里巴巴集团控股有限公司 A kind of method and apparatus for recognizing spam
CN108418777A (en) * 2017-02-09 2018-08-17 中国移动通信有限公司研究院 A kind of fishing mail detection method, apparatus and system
US20190132358A1 (en) * 2014-06-11 2019-05-02 Accenture Global Services Limited Deception Network System
CN110149266A (en) * 2018-07-19 2019-08-20 腾讯科技(北京)有限公司 Spam filtering method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
石义;钱步仁;: "基于内容与行为特征的反垃圾邮件系统", 网络安全技术与应用, no. 04, 15 April 2009 (2009-04-15) *

Similar Documents

Publication Publication Date Title
US7882187B2 (en) Method and system for detecting undesired email containing image-based messages
US10204157B2 (en) Image based spam blocking
US8503717B2 (en) Detection of spam images
US8763114B2 (en) Detecting image spam
US10178115B2 (en) Systems and methods for categorizing network traffic content
US8214497B2 (en) Multi-dimensional reputation scoring
US7433923B2 (en) Authorized email control system
US8578051B2 (en) Reputation based load balancing
JP5047624B2 (en) A framework that enables the incorporation of anti-spam techniques
US20050050150A1 (en) Filter, system and method for filtering an electronic mail message
US20040128355A1 (en) Community-based message classification and self-amending system for a messaging system
US8179798B2 (en) Reputation based connection throttling
WO2007146701A2 (en) Methods and systems for exposing messaging reputation to an end user
KR20140116410A (en) Systems and methods for spam detection using character histograms
CN109039874B (en) Mail auditing method and device based on behavior analysis
US20060075099A1 (en) Automatic elimination of viruses and spam
CN113630397A (en) E-mail security control method, client and system
US20050198181A1 (en) Method and apparatus to use a statistical model to classify electronic communications
JP4670049B2 (en) E-mail filtering program, e-mail filtering method, e-mail filtering system
CN116319654A (en) Intelligent type junk mail scanning method
JP2006059313A (en) Filtering device for removing unsolicited mail
KR20060065403A (en) The spam filter capable of doing the reflex-studying according to the users' propensity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination