CN115567475A - Junk mail identification method, device, equipment and storage medium - Google Patents

Info

Publication number
CN115567475A
Authority
CN
China
Prior art keywords
mail
target
sender
receiver
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211179798.1A
Other languages
Chinese (zh)
Inventor
何宁华
李志涛
刘建虎
金永刚
刘萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing 263 Enterprise Communication Co ltd
Original Assignee
Beijing 263 Enterprise Communication Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing 263 Enterprise Communication Co ltd filed Critical Beijing 263 Enterprise Communication Co ltd
Priority to CN202211179798.1A priority Critical patent/CN115567475A/en
Publication of CN115567475A publication Critical patent/CN115567475A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application provides a spam identification method, apparatus, device, and storage medium, and relates to the field of network technologies. After a target mail has been identified as spam by a preset spam identification technology, the method determines, from the information in the target mail that characterizes the target receiver and the target sender, whether historical mail traffic exists between the two. If historical mail traffic exists, that is, the target receiver and the target sender have exchanged mail with each other, the target mail is identified as a normal mail; if no historical mail traffic exists, that is, the two have not yet exchanged mail with each other, the target mail is identified as spam. This reduces the spam misjudgment rate.

Description

Junk mail identification method, device, equipment and storage medium
Technical Field
The present application relates to the field of network technologies, and in particular, to a method, an apparatus, a device, and a storage medium for identifying spam.
Background
With the informatization of enterprises and the deep application of industrial internet, the quantity of junk mails is increasing day by day. Therefore, how to effectively identify spam is an urgent problem to be solved.
In the related art, spam identification technologies such as Internet Protocol (IP) address blacklist or IP address whitelist technology, reverse query technology, behavior identification technology, honeypot technology, cryptographic technology, statistical analysis technology, content identification technology, and the like are generally used to identify spam, but spam misjudgment still occurs.
Disclosure of Invention
The application provides a spam identification method, apparatus, device, and storage medium, which are used to reduce the misjudgment of normal mail as spam.
In a first aspect, the present application provides a method for identifying spam, including:
acquiring a target mail which is identified as a junk mail based on a preset junk mail identification technology, wherein the target mail comprises related information for representing a target receiver and a target sender;
determining whether historical mail traffic exists between a target receiver and a target sender;
if the target receiver and the target sender have historical mail traffic, identifying the target mail as a normal mail;
and if the target receiver does not have historical mail traffic with the target sender, identifying the target mail as a junk mail.
In one possible implementation, determining whether there is a historical mail exchange between the target recipient and the target sender includes:
determining whether the information of the target sender is contained in the mail traffic relationships of the target receiver, that is, the relationships in which historical mail traffic exists;
if yes, determining that the target receiver and the target sender have historical mail traffic;
if not, determining that no historical mail exchange exists between the target receiver and the target sender.
In one possible implementation, the mail traffic relationship includes a mail traffic relationship degree that characterizes the number of mail exchanges, and identifying the target mail as a normal mail includes:
determining the mail traffic relationship degree between the target receiver and the target sender based on the mail traffic relationship;
determining whether the mail traffic relationship degree between the target receiver and the target sender is smaller than a relationship degree threshold;
and if the mail traffic relationship degree between the target receiver and the target sender is greater than or equal to the relationship degree threshold, identifying the target mail as a normal mail.
In one possible implementation, the method for identifying spam further includes:
and if the mail traffic relationship degree between the target receiver and the target sender is smaller than the relationship degree threshold, identifying the target mail as a junk mail.
In one possible implementation, the mail traffic relationship is determined by:
acquiring sender mailbox information and receiver mailbox information of the mail based on a simple mail transmission protocol;
responding to the sent mail, and inquiring whether the mail receiving record contains the receiver mailbox information or not according to the sender mailbox information; if yes, determining a mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information;
responding to the received mail, and inquiring whether the mail sending record contains the sender mailbox information or not according to the recipient mailbox information; and if so, determining the mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information.
In one possible implementation, determining the mail traffic relationship may further include: if either query finds a match, increasing the mail traffic relationship degree corresponding to the sender mailbox information and the receiver mailbox information.
In one possible implementation, determining the mail traffic relationship may further include:
in response to the sent mail, updating a mail receiving record corresponding to the receiver mailbox information according to the sender mailbox information;
and in response to the received mail, updating the mail sending record corresponding to the sender mailbox information according to the receiver mailbox information.
In a second aspect, the present application provides an apparatus for identifying spam, including:
the acquisition module is used for acquiring a target mail which is identified as a junk mail based on a preset junk mail identification technology, wherein the target mail comprises related information used for representing a target receiver and a target sender;
the determining module is used for determining whether historical mail traffic exists between the target receiver and the target sender;
the identification module is used for identifying the target mail as a normal mail when historical mail traffic exists between the target receiver and the target sender, and for identifying the target mail as a junk mail when no historical mail traffic exists between the target receiver and the target sender.
In a possible implementation manner, the determining module is specifically configured to:
determining whether the information of the target sender is contained in a mail exchange relation with the target receiver, wherein the mail exchange relation has historical mail exchange;
determining that the historical mail traffic exists between the target receiver and the target sender when determining that the information of the target sender is contained in the mail traffic relation with the target receiver in which the historical mail traffic exists;
and when determining that the information of the target sender is not contained in the mail exchange relationship with the target receiver in which the historical mail exchange exists, determining that the historical mail exchange does not exist between the target receiver and the target sender.
In one possible implementation, the mail traffic relationship includes a mail traffic relationship degree that characterizes the number of mail exchanges. Correspondingly, the identification module may be specifically configured to:
determining the mail traffic relationship degree between the target receiver and the target sender based on the mail traffic relationship;
determining whether the mail traffic relationship degree between the target receiver and the target sender is smaller than a relationship degree threshold;
and when the mail traffic relationship degree between the target receiver and the target sender is greater than or equal to the relationship degree threshold, identifying the target mail as a normal mail.
In one possible implementation, the identification module may be further configured to: when the mail traffic relationship degree between the target receiver and the target sender is smaller than the relationship degree threshold, identify the target mail as a junk mail.
In one possible implementation, the mail traffic relationship is determined by:
acquiring sender mailbox information and receiver mailbox information of the mails based on a simple mail transmission protocol;
responding to the sent mail, and inquiring whether the mail receiving record contains the receiver mailbox information or not according to the sender mailbox information; if yes, determining a mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information;
responding to the received mail, and inquiring whether the mail sending record contains the sender mailbox information or not according to the recipient mailbox information; and if so, determining the mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information.
In one possible implementation, determining the mail traffic relationship may further include: when the query based on the sender mailbox information finds that the mail receiving record contains the recipient mailbox information, or when the query based on the recipient mailbox information finds that the mail sending record contains the sender mailbox information, increasing the mail traffic relationship degree corresponding to the sender mailbox information and the recipient mailbox information.
In one possible implementation, determining the mail traffic relationship may further include:
in response to the sent mail, updating a mail receiving record corresponding to the receiver mailbox information according to the sender mailbox information;
and responding to the received mail, and updating the mail sending record corresponding to the sender mailbox information according to the receiver mailbox information.
In a third aspect, the present application provides an electronic device, comprising:
at least one processor;
and a memory coupled to the at least one processor;
wherein the memory is used for storing computer-executable instructions, which are executed by the at least one processor, to enable the at least one processor to perform the method provided by the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer-executable instructions for implementing the method provided in the first aspect when executed.
In a fifth aspect, the present application provides a program product comprising computer executable instructions. When executed by a computer, the instructions implement the method provided by the first aspect.
According to the spam identification method, apparatus, device, and storage medium provided by the application, after a target mail has been identified as spam by a preset spam identification technology, whether historical mail traffic exists between the target receiver and the target sender is further determined from the information in the target mail that characterizes the two. If historical mail traffic exists, that is, the target receiver and the target sender have exchanged mail with each other, the target mail is identified as a normal mail; if no historical mail traffic exists, that is, the two have not yet exchanged mail with each other, the target mail is identified as spam. The spam misjudgment rate is thereby reduced.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic diagram of an application scenario provided in an embodiment of the present application;
Fig. 2 is a flowchart of a spam email identification method according to an embodiment of the present application;
Fig. 3 is a flowchart of a spam email identification method according to another embodiment of the present application;
Fig. 4 is a flowchart of determining a mail traffic relationship according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a spam email recognition framework according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an apparatus for identifying spam email according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
At present, spam is rampant: spammers use bulk mailers to send large volumes of mail containing viruses, advertisements, illegal content, and the like. The identification technologies currently applied to spam mainly include IP address blacklist or whitelist technology, reverse query technology, behavior identification technology, honeypot technology, cryptographic technology, statistical analysis technology, content identification technology, and the like. Specifically:
IP address blacklist or whitelist technology adds IP addresses that frequently send spam to an IP address blacklist, and all mail subsequently sent from those IP addresses is judged to be spam. Conversely, if an IP address is added to the whitelist, any mail received from that IP address is considered not to be spam. This technique tends to filter out normal mail as spam.
Reverse query technology rests on the observation that spam generally uses forged sender addresses and only very rarely real ones. It verifies the sender's mail address to establish whether the address is authentic, and then judges whether the mail is spam according to that authenticity.
Behavior identification technology identifies spam according to its sending behavior. Common examples are limiting the Simple Mail Transfer Protocol (SMTP) connection frequency, source tracing authentication technology, reputation verification technology, and the like.
In honeypot technology, the honeypot collects mailbox addresses published on the network into its own database and also stores common spam there. Mail received later is matched against the information in the honeypot database to judge whether it is spam.
Cryptographic technology verifies the legitimacy of the mail sender, which is attested by a certificate. Without a proper certificate, forged mail is easily identified.
Statistical analysis technology analyzes a large number of correctly labeled spam and non-spam mails to produce a probability database containing every word and the probability that each word appears in spam. With this database, the spam probability of a mail is easily calculated and the mail's legitimacy can be judged, but the misjudgment rate is also high.
Content identification technology is an intelligent filtering method based on a Bayesian statistical algorithm and keyword filtering.
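As an illustration of the statistical analysis technique described above, the following is a minimal sketch of building a word probability table from labeled mail. The function name, the whitespace tokenization, the add-one smoothing, and the equal weighting of the two classes are all assumptions made for this example; the application does not specify them.

```python
from collections import Counter


def word_spam_probabilities(spam_docs, ham_docs):
    """Estimate, for each word, the probability that a mail containing it is spam.

    Assumptions (not from the application): whitespace tokenization,
    add-one smoothing, and equal prior weight for spam and non-spam.
    """
    # Count in how many documents of each class a word appears.
    spam_counts = Counter(w for d in spam_docs for w in set(d.lower().split()))
    ham_counts = Counter(w for d in ham_docs for w in set(d.lower().split()))

    probabilities = {}
    for word in set(spam_counts) | set(ham_counts):
        p_word_given_spam = (spam_counts[word] + 1) / (len(spam_docs) + 2)
        p_word_given_ham = (ham_counts[word] + 1) / (len(ham_docs) + 2)
        probabilities[word] = p_word_given_spam / (p_word_given_spam + p_word_given_ham)
    return probabilities


table = word_spam_probabilities(
    ["win cash now", "cash prize"],
    ["meeting at noon", "project meeting notes"],
)
```

Words seen mostly in spam score above 0.5 and words seen mostly in normal mail score below it, which also shows why a legitimate advertisement full of "spam-like" words is easily misjudged.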
In the related art, these spam identification technologies are applied in combination and identify spam effectively, but misjudgments still occur. For example, commonly misjudged mail includes advertisement mail, system alarm mail, recruitment mail, order mail, and the like.
Based on the problem, the method and the device further perform secondary recognition on the junk mails recognized by the existing junk mail recognition technology according to the mail traffic relation between the sender and the receiver on the basis of the existing junk mail recognition, so that misjudgment of the junk mails is reduced. In addition, the mail received by the receiver is identified in a targeted manner from the perspective of the individual of the receiver, so that the method and the system also realize the personalized identification of whether the mail is the junk mail.
For ease of understanding, an application scenario of the embodiment of the present application is first described.
Fig. 1 is a schematic diagram of an application scenario provided in an embodiment of the present application. As shown in Fig. 1, the application scenario may include a plurality of terminal devices 11, a mailer 12, a server 13, and a database 14. Illustratively, the terminal device 11 may be a mobile phone, a tablet, a computer, and so on.
The mailer 12 sends mail, such as advertisement mail and commercial mail, to the plurality of terminal devices 11. The plurality of terminal devices 11 are used for receiving and/or sending mail. The server 13 is configured to collect the mail sending and receiving information among the plurality of terminal devices 11 and store it in the database 14. Specifically, the server 13 may further include a mail traffic relationship determination module, a spam identification module, a mail sending/receiving information collection and mail traffic relationship query module (not shown), and the like, for executing the relevant steps of the spam identification method provided in the embodiment of the present application.
The following describes the spam email recognition method provided by the present application in detail with reference to specific embodiments, taking a server as an execution subject.
Fig. 2 is a flowchart of a spam email identification method according to an embodiment of the present application. As shown in fig. 2, the spam email identification method includes the following steps:
S201, acquiring a target mail which is identified as a junk mail based on a preset junk mail identification technology, wherein the target mail comprises related information used for representing a target receiver and a target sender.
This step acquires a target mail that has been identified as a spam mail based on a preset spam mail identification technique. Optionally, the predetermined spam recognition technology may be one or more of the IP address blacklisting or IP address whitelisting technologies, reverse query technology, behavior recognition technology, honeypot technology, statistical analysis technology, content recognition technology, and the like, as described above. The spam email identification method of each spam email identification technology is similar to that described above, and is not described herein again.
Illustratively, the related information may be the mailbox addresses of the target recipient and the target sender.
S202, whether historical mail traffic exists between the target receiver and the target sender is determined.
If yes, that is, historical mail traffic exists between the target receiver and the target sender, step S203 is executed; if not, that is, no historical mail traffic exists between the target receiver and the target sender, step S204 is executed.
Optionally, based on the target recipient, if the target recipient receives the email from the target sender and replies to the email, it indicates that there is a historical email traffic between the target recipient and the target sender; if the target receiver receives the mail of the target sender and does not reply the mail, the fact that no historical mail exists between the target receiver and the target sender is indicated. For example, when the target receiver receives the mail of the target sender, the case of not replying to the mail may be to delete the mail or not to process the mail directly, and the like.
S203, identifying the target mail as a normal mail.
Optionally, the target mail is corrected to a normal mail. It should be noted that the target email may be a service email, an advertisement email, or the like.
It can be understood that mail which existing spam identification technology easily misjudges, such as advertisement mail, system alarm mail, recruitment mail, and order mail, may be regarded as normal mail by some recipients and as spam by others.
For example, for an advertisement mail, different recipients have different attitudes to the advertisement mail, some recipients receive the advertisement mail and directly delete the advertisement mail, and some recipients receive the advertisement mail and possibly reply the advertisement mail to the content of interest of the recipients. Therefore, when the target mail is the advertisement mail, the personalized recognition of whether the mail is the junk mail or not can be realized based on the historical mail traffic of the target receiver, and the misjudgment of the mail which is different from person to person is reduced.
And S204, identifying the target mail as a junk mail.
In the embodiment of the application, after a target mail has been identified as spam by a preset spam identification technology, whether historical mail traffic exists between the target receiver and the target sender is determined from the information in the target mail that characterizes the two. If historical mail traffic exists, that is, the target receiver and the target sender have exchanged mail with each other, the target mail is identified as a normal mail; if no historical mail traffic exists, that is, the two have not yet exchanged mail with each other, the target mail is identified as spam. The spam misjudgment rate is thereby reduced.
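The flow of steps S201 to S204 can be sketched as follows. This is only an illustration under stated assumptions: the function name `second_pass`, the `contacts` dictionary (recipient address mapped to the set of senders with whom historical mail traffic exists), and the lowercase normalization of addresses are hypothetical and not taken from the application.

```python
def second_pass(recipient, sender, contacts):
    """Re-check a mail that a first-stage filter has already flagged as spam.

    contacts: dict mapping a recipient address to the set of sender addresses
    with which that recipient has historical mail traffic (hypothetical structure).
    Returns "normal" or "spam".
    """
    recipient = recipient.strip().lower()
    sender = sender.strip().lower()
    # S202: does historical mail traffic exist for this recipient/sender pair?
    if sender in contacts.get(recipient, set()):
        return "normal"  # S203: correct the first-stage verdict
    return "spam"        # S204: keep the first-stage verdict


contacts = {"alice@example.com": {"shop@example.org"}}
verdict_alice = second_pass("alice@example.com", "shop@example.org", contacts)
verdict_bob = second_pass("bob@example.com", "shop@example.org", contacts)
```

Because the lookup is keyed by recipient, the same advertisement can be treated as normal for one recipient and as spam for another, which is the personalized behavior the embodiment describes.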
In some embodiments, the step S202 of determining whether there is a historical mail exchange between the target receiver and the target sender may specifically include the following steps: determining whether the information of the target sender is contained in a mail transaction relation with the target receiver, wherein the mail transaction relation has historical mail transaction; if yes, determining that historical mail traffic exists between the target receiver and the target sender; if not, determining that no historical mail exchange exists between the target receiver and the target sender. It is understood that the mail traffic is specific to the intended recipient.
On the basis of the above embodiment, optionally, the mail traffic relationship may include a mail traffic relationship degree that characterizes the number of mail exchanges. On this basis, how the embodiment shown in Fig. 2 identifies whether the target mail is a normal mail is described in detail below with reference to Fig. 3.
Fig. 3 is a flowchart of a spam email identification method according to another embodiment of the present application. As shown in Fig. 3, identifying whether the target email is a normal email may specifically include the following steps:
S301, determining the mail traffic relationship degree between the target receiver and the target sender based on the mail traffic relationship.
Specifically, the mail traffic relationship degree may be used to indicate the number of times a mail traffic relationship has been established. For example, taking the target recipient as the reference, if the target recipient receives a mail from the target sender and replies to it, a mail traffic relationship is determined to exist between the target recipient and the target sender, and the corresponding mail traffic relationship degree may be determined to be 1. It can be understood that, for the target receiver, each further mail exchange with the target sender increases the corresponding mail traffic relationship degree by 1.
It should be noted that the mail traffic relationship between the target recipient and the target sender in the embodiment of the present application does not include a mail that is automatically replied by the system.
S302, determining whether the mail traffic relationship degree between the target receiver and the target sender is smaller than the relationship degree threshold.
If not, that is, the mail traffic relationship degree between the target receiver and the target sender is greater than or equal to the relationship degree threshold, step S303 is executed.
Alternatively, the relationship degree threshold may be 1, or any integer greater than 1. For example, the specific threshold may be determined according to the operating conditions of the particular mail system, and the embodiment of the present application does not limit its size.
S303, identifying the target mail as a normal mail.
In the embodiment of the application, the mail traffic relationship degree between the target receiver and the target sender is determined based on the mail traffic relationship, and a target mail whose mail traffic relationship degree is greater than or equal to the relationship degree threshold is identified as a normal mail, so that the identification rate of junk mail is improved.
Optionally, if the mail traffic relationship degree between the target receiver and the target sender is smaller than the relationship degree threshold, the target mail is identified as a junk mail.
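Steps S301 to S303 amount to a threshold comparison, sketched below. The names `classify_by_degree` and `degrees`, and the choice to key the degree by a (recipient, sender) pair, are assumptions for illustration; the default threshold of 1 is one of the values the text permits.

```python
def classify_by_degree(recipient, sender, degrees, threshold=1):
    """Classify a first-stage spam verdict using the mail traffic relationship degree.

    degrees: dict mapping (recipient, sender) to the number of recorded
    mail exchanges (hypothetical structure); missing pairs count as 0.
    """
    degree = degrees.get((recipient, sender), 0)        # S301
    if degree >= threshold:                             # S302
        return "normal"                                 # S303
    return "spam"


degrees = {("alice@example.com", "shop@example.org"): 3}
loose = classify_by_degree("alice@example.com", "shop@example.org", degrees, threshold=1)
strict = classify_by_degree("alice@example.com", "shop@example.org", degrees, threshold=5)
```

Raising the threshold makes the correction more conservative: a single past reply is then no longer enough to overturn the first-stage spam verdict.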
The above embodiments refer to the mail traffic relationship; its determination is described in detail below with reference to Fig. 4.
Fig. 4 is a flowchart of determining a mail traffic relationship according to an embodiment of the present application. As shown in Fig. 4, determining the mail traffic relationship includes the following steps:
S401, based on SMTP, obtaining the sender mailbox information and the receiver mailbox information of the mail.
SMTP is a protocol that provides reliable and efficient email transmission. It is mainly used to transfer mail information between systems and to provide notification of incoming mail.
Alternatively, the sender mailbox information and the recipient mailbox information may be mailbox address information of the sender and mailbox address information of the recipient.
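One way to obtain the two mailbox addresses of step S401 is sketched below with Python's standard `email` package. Note the assumption: this reads the `From`, `To`, and `Cc` message headers, whereas a mail server could equally take the addresses from the SMTP envelope (`MAIL FROM` and `RCPT TO`); the application does not say which source is used. The function name `extract_mailboxes` is hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr


def extract_mailboxes(raw_message):
    """Return (sender_address, [recipient_addresses]) from a raw RFC 5322 message."""
    msg = message_from_string(raw_message)
    # parseaddr returns (display_name, address); keep only the address part.
    sender = parseaddr(msg.get("From", ""))[1].lower()
    recipient_headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr.lower() for _, addr in getaddresses(recipient_headers) if addr]
    return sender, recipients


raw = (
    "From: Shop <shop@example.org>\n"
    "To: Alice <alice@example.com>, bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)
sender, recipients = extract_mailboxes(raw)
```

Lowercasing the addresses is a simplification so that later lookups in the sending and receiving records match regardless of how the address was typed.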
S402, responding to the sent mail, and inquiring whether the mail receiving record contains the receiver mailbox information or not according to the sender mailbox information; and if so, determining the mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information.
It can be understood that, in response to sending a mail, if it is found that the mail receiving record of the sender includes the mailbox information of the recipient according to the mailbox information of the sender, it indicates that the sender receives the reply information of the recipient corresponding to the sent mail.
S403, responding to the received mail, and inquiring whether the mail sending record contains the sender mailbox information or not according to the receiver mailbox information; and if so, determining the mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information.
It can be understood that, in response to a received mail, if the query based on the recipient mailbox information finds that the recipient's mail sending record contains the sender mailbox information, the recipient has previously sent mail to this sender, so the received mail belongs to a two-way exchange.
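The two queries of steps S402 and S403 can be sketched as simple set lookups. The record layout is an assumption: `receive_records` maps an address to the set of addresses it has received mail from, and `send_records` maps an address to the set of addresses it has sent mail to. Both names and both functions are hypothetical.

```python
def has_relationship_on_send(sender, recipient, receive_records):
    """S402: on a sent mail, check whether the sender's mail receiving
    record already contains the recipient's mailbox."""
    return recipient in receive_records.get(sender, set())


def has_relationship_on_receive(sender, recipient, send_records):
    """S403: on a received mail, check whether the recipient's mail sending
    record already contains the sender's mailbox."""
    return sender in send_records.get(recipient, set())


# alice has previously received mail from shop, and has previously written to shop.
receive_records = {"alice@example.com": {"shop@example.org"}}
send_records = {"alice@example.com": {"shop@example.org"}}

# alice sends to shop: she has received from shop before, so a relationship exists.
on_send = has_relationship_on_send("alice@example.com", "shop@example.org", receive_records)
# alice receives from shop: she has sent to shop before, so a relationship exists.
on_receive = has_relationship_on_receive("shop@example.org", "alice@example.com", send_records)
```

Either check succeeding means mail has flowed in both directions between the two mailboxes, which is what the mail traffic relationship is meant to capture.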
Optionally, the mail receiving record and the mail sending record may be stored in a database. Illustratively, the database may be a real-time database or a persistent database. Specifically, the real-time database stores the mail receiving records for which a mail traffic relation exists, and these records can be updated in real time as individual mails are received; the persistent database stores all mail records so that the data can be accumulated and analyzed.
In some embodiments, the mail records stored in the real-time database may also be updated in real time. Specifically, the spam identification method may further include: in response to a sent mail, updating the mail receiving record corresponding to the recipient mailbox information according to the sender mailbox information; and in response to a received mail, updating the mail sending record corresponding to the recipient mailbox information according to the recipient mailbox information.
Illustratively, in response to a received mail, the recipient's mail receiving record table in the real-time database is queried according to the recipient mailbox address. If a matching receiving record exists, the counter in the table is incremented by 1; if not, the receiving record is written to the table for the first time and its counter is set to 1.
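The query-then-update behaviour described above can be sketched as follows. This is only an illustration: an in-memory SQLite database stands in for the real-time database, and the table name `receive_record`, its columns, and the choice to key each record on the (recipient, sender) pair are assumptions, since the application does not specify a schema.

```python
import sqlite3


def open_realtime_db():
    """Create an in-memory stand-in for the real-time database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE receive_record ("
        " recipient TEXT NOT NULL,"
        " sender TEXT NOT NULL,"
        " counter INTEGER NOT NULL,"
        " PRIMARY KEY (recipient, sender))"
    )
    return conn


def record_received_mail(conn, recipient, sender):
    """Increment the counter of an existing record, or create it with 1."""
    row = conn.execute(
        "SELECT counter FROM receive_record WHERE recipient = ? AND sender = ?",
        (recipient, sender),
    ).fetchone()
    if row is None:
        # First mail from this sender: write the record and set the counter to 1.
        counter = 1
        conn.execute(
            "INSERT INTO receive_record (recipient, sender, counter) VALUES (?, ?, ?)",
            (recipient, sender, counter),
        )
    else:
        # Record already exists: increase the counter by 1.
        counter = row[0] + 1
        conn.execute(
            "UPDATE receive_record SET counter = ? WHERE recipient = ? AND sender = ?",
            (counter, recipient, sender),
        )
    conn.commit()
    return counter
```

The function returns the new counter value so that a caller can see whether the record was created or updated.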
Based on the above embodiment of the spam identification method, it can be understood that the mail traffic relation degree in the mail traffic relation changes with the number of mails exchanged between the recipient and the sender. In some embodiments, when, in response to a sent mail, the mail receiving record is found to contain the recipient mailbox information according to the sender mailbox information, and/or when, in response to a received mail, the mail sending record is found to contain the sender mailbox information according to the recipient mailbox information, the mail traffic relation degree corresponding to the sender mailbox information and the recipient mailbox information is increased. Illustratively, the mail traffic relation degree may be incremented by 1 each time.
In the embodiment of the application, the sender mailbox information and the recipient mailbox information of a mail are acquired based on the Simple Mail Transfer Protocol (SMTP). In response to a sent mail, the mail receiving record is queried for the recipient mailbox information according to the sender mailbox information; in response to a received mail, the mail sending record is queried for the sender mailbox information according to the recipient mailbox information. The mail traffic relation corresponding to the sender mailbox information and the recipient mailbox information is determined from these queries, which ensures the accuracy of the mail traffic relation and improves the spam identification rate.
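The two lookups and the increment of the relation degree can be sketched as below. This is a minimal in-memory illustration under stated assumptions: the class `MailTrafficTracker`, its method names, and the use of an unordered address pair as the key of the relation degree are hypothetical and are not specified in the application.

```python
from collections import defaultdict


class MailTrafficTracker:
    """In-memory stand-in for the mail sending and receiving records."""

    def __init__(self):
        self.receive_record = defaultdict(set)  # mailbox -> addresses it received mail from
        self.send_record = defaultdict(set)     # mailbox -> addresses it sent mail to
        self.degree = defaultdict(int)          # frozenset({a, b}) -> relation degree

    def log_mail(self, sender, recipient):
        """Store one delivered mail in both record tables."""
        self.send_record[sender].add(recipient)
        self.receive_record[recipient].add(sender)

    def on_sent(self, sender, recipient):
        """Sent mail: does the sender's receiving record contain the recipient?"""
        found = recipient in self.receive_record.get(sender, ())
        return self._update_degree(found, sender, recipient)

    def on_received(self, sender, recipient):
        """Received mail: does the recipient's sending record contain the sender?"""
        found = sender in self.send_record.get(recipient, ())
        return self._update_degree(found, sender, recipient)

    def _update_degree(self, found, sender, recipient):
        if found:
            # Each successful lookup raises the relation degree by 1.
            self.degree[frozenset((sender, recipient))] += 1
        return found
```

A short usage example: after `log_mail("bob@example.org", "alice@example.com")`, a later call to `on_sent("alice@example.com", "bob@example.org")` finds Bob in Alice's receiving record and raises the degree for that pair, whereas a lookup for an unknown sender leaves the degree untouched.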
Combining the embodiments of the spam identification method and of the method for determining the mail traffic relation provided by the present application, the identification framework corresponding to the spam identification method is described in detail below with reference to Fig. 5.
Fig. 5 is a schematic structural diagram of a spam identification framework according to an embodiment of the present application. As shown in Fig. 5, the mail traffic relation determining module 51 is configured to determine the recipient-based mail traffic relation, and the spam identifying module 52 is configured to identify spam; the specific methods for determining the mail traffic relation and for identifying spam are similar to those described in the foregoing embodiments and are not repeated here. Within the mail traffic relation determining module 51, the mail sending and receiving information collection and mail traffic relation query module is configured to collect the recipient mailbox information and the sender mailbox information from SMTP and to determine the recipient-based mail traffic relation from them. The data stored in the persistent database and the real-time database are likewise as described in the above embodiments. Specifically, data destined for the persistent database that cannot be stored within a short time may be stored by means of a queue: the data is first sent to the queue and then written to the persistent database in sequence from the queue. The data storage component can monitor the queue to check whether data waiting to be stored remains in it. It can be understood that the mail traffic relation determining module 51, the spam identifying module 52, and the mail sending and receiving information collection and mail traffic relation query module are virtual devices in the server.
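The queue-based write path to the persistent database can be illustrated with a short sketch. It assumes a single writer thread and uses a plain list as a stand-in for the persistent database; the function name `store_via_queue` and the `None` sentinel are hypothetical choices made for this example only.

```python
import queue
import threading


def store_via_queue(records):
    """Push records through a FIFO queue into a stand-in persistent store."""
    pending = queue.Queue()
    persistent_store = []  # hypothetical stand-in for the persistent database

    def writer():
        while True:
            record = pending.get()
            if record is None:  # sentinel: no more data to store
                break
            persistent_store.append(record)  # written in arrival order

    thread = threading.Thread(target=writer)
    thread.start()
    for record in records:
        pending.put(record)  # producers enqueue instead of writing directly
    pending.put(None)
    thread.join()
    # pending.empty() reports whether any data is still waiting to be stored.
    return persistent_store, pending.empty()
```

Decoupling producers from the writer in this way lets mail handling continue while slower persistent writes are drained in order.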
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 6 is a schematic structural diagram of an apparatus for identifying spam email according to an embodiment of the present application. As shown in fig. 6, the spam recognition device 60 includes: an acquisition module 610, a determination module 620, and an identification module 630. Wherein:
The acquisition module 610 is configured to acquire a target mail identified as spam based on a preset spam identification technology, where the target mail includes related information characterizing a target recipient and a target sender.
The determining module 620 is configured to determine whether historical mail traffic exists between the target recipient and the target sender.
The identifying module 630 is configured to identify the target mail as a normal mail when historical mail traffic exists between the target recipient and the target sender, and to identify the target mail as spam when no historical mail traffic exists between the target recipient and the target sender.
In one possible implementation, the determining module 620 is specifically configured to: determine whether the information of the target sender is contained in the mail traffic relations of the target recipient for which historical mail traffic exists; when it is, determine that historical mail traffic exists between the target recipient and the target sender; and when it is not, determine that no historical mail traffic exists between the target recipient and the target sender.
In one possible implementation, the mail traffic relation includes a mail traffic relation degree that characterizes the number of mails exchanged. Correspondingly, the identifying module 630 may be specifically configured to: determine the mail traffic relation degree between the target recipient and the target sender based on the mail traffic relation; determine whether the mail traffic relation degree between the target recipient and the target sender is smaller than a relation degree threshold; and when the mail traffic relation degree is greater than or equal to the relation degree threshold, identify the target mail as a normal mail.
In one possible implementation, the identifying module 630 may be further configured to: when the mail traffic relation degree between the target recipient and the target sender is smaller than the relation degree threshold, identify the target mail as spam.
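The decision made by the determining and identifying modules can be sketched as a single function. This is a hedged illustration: the function name `recheck_flagged_mail`, the dictionary keyed on (recipient, sender), and the default threshold of 1 are assumptions, since the application does not fix a threshold value.

```python
def recheck_flagged_mail(sender, recipient, relation_degrees, threshold=1):
    """Re-check a mail that a preset spam filter has already flagged.

    relation_degrees maps (recipient, sender) to the mail traffic relation
    degree; a missing entry means no historical mail traffic exists.
    """
    degree = relation_degrees.get((recipient, sender), 0)
    if degree >= threshold:
        return "normal"  # enough historical traffic: override the spam flag
    return "spam"        # no or too little historical traffic: keep the flag
```

With the assumed default threshold of 1, the function reduces to the basic rule that any historical mail traffic marks the target mail as normal; a larger threshold requires more prior exchanges before the spam flag is overridden.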
In one possible implementation, the mail traffic relation is determined by: acquiring the sender mailbox information and the recipient mailbox information of a mail based on the Simple Mail Transfer Protocol (SMTP); in response to a sent mail, querying, according to the sender mailbox information, whether the mail receiving record contains the recipient mailbox information, and if so, determining the mail traffic relation corresponding to the sender mailbox information and the recipient mailbox information; and in response to a received mail, querying, according to the recipient mailbox information, whether the mail sending record contains the sender mailbox information, and if so, determining the mail traffic relation corresponding to the sender mailbox information and the recipient mailbox information.
In one possible implementation, determining the mail traffic relation may further include: when the mail receiving record is found to contain the recipient mailbox information according to the sender mailbox information, or the mail sending record is found to contain the sender mailbox information according to the recipient mailbox information, increasing the mail traffic relation degree corresponding to the sender mailbox information and the recipient mailbox information.
In one possible implementation, determining the mail traffic relation may further include: in response to a sent mail, updating the mail receiving record corresponding to the recipient mailbox information according to the sender mailbox information; and in response to a received mail, updating the mail sending record corresponding to the recipient mailbox information according to the recipient mailbox information.
The apparatus provided in the embodiment of the present application may be configured to perform the method steps of the foregoing method embodiment, and the specific implementation manner and the technical effect are similar, which are not described herein again.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 7, the electronic device 70 includes:
at least one processor 701; and
a memory 702 communicatively coupled to the at least one processor 701; wherein
the memory 702 stores instructions executable by the at least one processor 701 to cause the at least one processor 701 to perform the method for spam identification as described above.
For a specific implementation process of the processor 701, reference may be made to the above method embodiment, and a specific implementation manner and a technical effect are similar, which are not described herein again.
In particular, the processor 701 may include one or more processing units; for example, the processor 701 may be a Central Processing Unit (CPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the present application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory 702 may be used to store program instructions. The memory 702 may include a program storage area and a data storage area. The program storage area can store an operating system and an application program required by at least one function. The data storage area may store data created during use of the electronic device 70. Further, the memory 702 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 701 executes the various functional applications and data processing of the electronic device 70 by executing the program instructions stored in the memory 702.
It should be noted that the embodiments of the present application do not limit the number of memories 702 and processors 701; each may be one or more, and Fig. 7 illustrates one of each as an example. The memory 702 and the processor 701 may be connected in a wired or wireless manner, for example by a bus. In practice, the electronic device 70 may be a computer or a mobile terminal in various forms. The computer is, for example, a laptop computer, a desktop computer, a workstation, a server, a blade server, or a mainframe computer; the mobile terminal is, for example, a personal digital assistant, a cellular phone, a smart phone, a wearable device, or another similar computing device.
The electronic device of this embodiment may be configured to execute the technical solution in the foregoing method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
The embodiment of the present application provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and the computer-executable instructions are executed by a processor to implement the method steps in the foregoing method embodiments, and specific implementation manners and technical effects are similar, and are not described herein again.
The embodiment of the application also provides a program product that includes computer-executable instructions. When the computer-executable instructions are executed, the method steps in the above method embodiments are implemented; the specific implementation and technical effects are similar and are not described herein again.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A method for identifying spam mails, comprising the following steps:
acquiring a target mail which is identified as a junk mail based on a preset junk mail identification technology, wherein the target mail comprises related information for representing a target receiver and a target sender;
determining whether historical mail traffic exists between the target receiver and the target sender;
if the target receiver and the target sender have historical mail traffic, identifying the target mail as a normal mail;
and if the target receiver does not have historical mail traffic with the target sender, identifying the target mail as a junk mail.
2. The identification method according to claim 1, wherein the determining whether historical mail traffic exists between the target receiver and the target sender comprises:
determining whether the information of the target sender is contained in a mail traffic relation of the target receiver for which historical mail traffic exists;
if yes, determining that historical mail traffic exists between the target receiver and the target sender;
if not, determining that historical mail traffic does not exist between the target receiver and the target sender.
3. The identification method according to claim 2, wherein the mail traffic relation includes a mail traffic relation degree used for characterizing the number of mails exchanged, and the identifying the target mail as a normal mail comprises:
determining the mail traffic relation degree of the target receiver and the target sender based on the mail traffic relation;
determining whether the mail traffic relation degree of the target receiver and the target sender is smaller than a relation degree threshold value;
and if the mail traffic relation degree of the target receiver and the target sender is greater than or equal to the relation degree threshold value, identifying the target mail as a normal mail.
4. The identification method according to claim 3, further comprising:
and if the mail traffic relation degree of the target receiver and the target sender is smaller than the relation degree threshold value, identifying the target mail as a junk mail.
5. The identification method according to any one of claims 2 to 4, wherein the mail traffic relation is determined by:
acquiring sender mailbox information and receiver mailbox information of the mail based on the Simple Mail Transfer Protocol (SMTP);
responding to the sent mail, and inquiring whether the mail receiving record contains the receiver mailbox information or not according to the sender mailbox information; if yes, determining a mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information;
responding to the received mail, and inquiring whether the mail sending record contains the sender mailbox information or not according to the recipient mailbox information; and if so, determining the mail traffic relation corresponding to the sender mailbox information and the receiver mailbox information.
6. The identification method of claim 5, further comprising:
and if so, increasing the mail traffic relation degree corresponding to the sender mailbox information and the receiver mailbox information.
7. The identification method of claim 5, further comprising:
responding to the sent mail, and updating a mail receiving record corresponding to the receiver mailbox information according to the sender mailbox information;
and responding to the received mail, and updating a mail sending record corresponding to the receiver mailbox information according to the receiver mailbox information.
8. An apparatus for recognizing spam, comprising:
an acquisition module, used for acquiring a target mail which is identified as a junk mail based on a preset junk mail identification technology, wherein the target mail comprises related information used for representing a target receiver and a target sender;
the first determining module is used for determining whether historical mail traffic exists between the target receiver and the target sender;
the first identification module is used for identifying the target mail as a normal mail if the historical mail exchange exists between the target receiver and the target sender;
and the second identification module is used for identifying the target mail as a junk mail if the historical mail traffic does not exist between the target receiver and the target sender.
9. An electronic device, comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory is to store instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 7.
10. A computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor, are configured to implement the method of any one of claims 1 to 7.
CN202211179798.1A 2022-09-27 2022-09-27 Junk mail identification method, device, equipment and storage medium Pending CN115567475A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211179798.1A CN115567475A (en) 2022-09-27 2022-09-27 Junk mail identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115567475A true CN115567475A (en) 2023-01-03

Family

ID=84742094

Country Status (1)

Country Link
CN (1) CN115567475A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101035098A (en) * 2007-04-24 2007-09-12 北京网秦天下科技有限公司 Method for the mobile terminal to reject the spam via the query
CN101123589A (en) * 2006-08-10 2008-02-13 华为技术有限公司 A method and device for preventing from spam
CN101325561A (en) * 2007-06-12 2008-12-17 阿里巴巴集团控股有限公司 Method, apparatus and system for processing electronic mail

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination