US20050108340A1 - Method and apparatus for filtering email spam based on similarity measures - Google Patents


Info

Publication number
US20050108340A1
Authority
US
Grant status
Application
Patent type
Prior art keywords
message
spam
data
email
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10846723
Inventor
Matt Gleeson
David Hoogstrate
Sandy Jensen
Eli Mantel
Art Medlar
Ken Schneider
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Symantec Corp
Original Assignee
Symantec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00Arrangements for user-to-user messaging in packet-switching networks, e.g. e-mail or instant messages
    • H04L51/12Arrangements for user-to-user messaging in packet-switching networks, e.g. e-mail or instant messages with filtering and selective blocking capabilities
    • H04L51/06Message adaptation based on network or terminal capabilities
    • H04L51/063Message adaptation based on network or terminal capabilities with adaptation of content

Abstract

A method and system for filtering email spam based on similarity measures are described. In one embodiment, the method includes receiving an incoming email message, generating data characterizing the incoming email message based on the content of the incoming email message, and comparing the generated data with a set of data characterizing spam messages. The method further includes determining whether a resemblance between the data characterizing the incoming email message and any data item from the set of data characterizing spam messages exceeds a threshold.

Description

    RELATED APPLICATIONS
  • [0001]
    The present application claims priority to U.S. Provisional Application Ser. No. 60/471,242, filed May 15, 2003, which is incorporated herein in its entirety.
  • FIELD OF THE INVENTION
  • [0002]
    The present invention relates to filtering electronic mail (email); more particularly, the present invention relates to filtering email spam based on similarity measures.
  • BACKGROUND OF THE INVENTION
  • [0003]
    The Internet is growing in popularity, and more and more people are conducting business over the Internet, advertising their products and services by generating and sending electronic mass mailings. These electronic messages (emails) are usually unsolicited and regarded as nuisances by the recipients because they occupy much of the storage space needed for necessary and important data processing. For example, a mail server may have to refuse an important and/or desired email when its storage capacity is filled to the maximum with unwanted emails containing advertisements. Moreover, thin client systems such as set top boxes, PDAs, network computers, and pagers all have limited storage capacity. Unwanted emails in any one of such systems can tie up a finite resource for the user. In addition, a typical user wastes time by downloading voluminous but useless advertisement information. These unwanted emails are commonly referred to as spam.
  • [0004]
    Presently, there are products that are capable of filtering out unwanted messages. For example, a spam block method exists which keeps an index list of all spam agents (i.e., companies that generate mass unsolicited e-mails), and provides means to block any e-mail sent from a company on the list.
  • [0005]
    Another “junk mail” filter currently available employs filters based on predefined words and patterns. An incoming mail is designated as unwanted mail if its subject contains a known spam pattern.
  • [0006]
    However, as spam filtering grows in sophistication, so do the techniques of spammers in avoiding the filters. Examples of tactics employed by the recent generation of spammers include randomization, origin concealment, and filter evasion using HTML.
  • SUMMARY OF THE INVENTION
  • [0007]
    A method and system for filtering email spam based on similarity measures are described. According to one aspect, the method includes receiving an incoming email message, generating data characterizing the incoming email message based on the content of the incoming email message, and comparing the generated data with a set of data characterizing spam messages. The method further includes determining whether a resemblance between the data characterizing the incoming email message and any data item from the set of data characterizing spam messages exceeds a threshold.
  • [0008]
    Other features of the present invention will be apparent from the accompanying drawings and from the detailed description that follows.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • [0009]
    The present invention will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding only.
  • [0010]
    FIG. 1 is a block diagram of one embodiment of a system for controlling delivery of spam electronic mail.
  • [0011]
    FIG. 2 is a block diagram of one embodiment of a spam content preparation module.
  • [0012]
    FIG. 3 is a block diagram of one embodiment of a similarity determination module.
  • [0013]
    FIG. 4 is a flow diagram of one embodiment of a process for handling a spam message.
  • [0014]
    FIG. 5 is a flow diagram of one embodiment of a process for filtering email spam based on similarity measures.
  • [0015]
    FIG. 6A is a flow diagram of one embodiment of a process for creating a signature of an email message.
  • [0016]
    FIG. 6B is a flow diagram of one embodiment of a process for detecting spam using a signature of an email message.
  • [0017]
    FIG. 7 is a flow diagram of one embodiment of a process for a character-based comparison of documents.
  • [0018]
    FIG. 8 is a flow diagram of one embodiment of a process for determining whether two documents are similar.
  • [0019]
    FIG. 9 is a flow diagram of one embodiment of a process for reducing noise in an email message.
  • [0020]
    FIG. 10 is a flow diagram of one embodiment of a process for modifying an email message to reduce noise.
  • [0021]
    FIG. 11 is a block diagram of an exemplary computer system.
  • DETAILED DESCRIPTION OF THE PRESENT INVENTION
  • [0022]
    A method and apparatus for filtering email spam based on similarity measures are described. In the following description, numerous details are set forth. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention.
  • [0023]
    Some portions of the detailed descriptions which follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
  • [0024]
    It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
  • [0025]
    The present invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
  • [0026]
    The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.
  • [0027]
    A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory (“ROM”); random access memory (“RAM”); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.
  • [0000]
    Filtering Email Spam Based on Similarity Measures
  • [0028]
    FIG. 1 is a block diagram of one embodiment of a system for controlling delivery of spam electronic mail (email). The system includes a control center 102 coupled to a communications network 100 such as a public network (e.g., the Internet, a wireless network, etc.) or a private network (e.g., LAN, Intranet, etc.). The control center 102 communicates with multiple network servers 104 via the network 100. Each server 104 communicates with user terminals 106 using a private or public network.
  • [0029]
    The control center 102 is an anti-spam facility that is responsible for analyzing messages identified as spam, developing filtering rules for detecting spam, and distributing the filtering rules to the servers 104. A message may be identified as spam because it was sent by a known spam source (as determined, for example, using a “spam probe”, i.e., an email address specifically selected to make its way into as many spammer mailing lists as possible).
  • [0030]
    A server 104 may be a mail server that receives and stores messages addressed to users of corresponding user terminals 106. Alternatively, a server 104 may be a different server coupled to the mail server 104. Servers 104 are responsible for filtering incoming messages based on the filtering rules received from the control center 102.
  • [0031]
    In one embodiment, the control center 102 includes a spam content preparation module 108 that is responsible for generating data characterizing the content associated with a spam attack and sending this data to the servers 104. Each server 104 includes a similarity determination module 110 that is responsible for storing spam data received from the control center 102 and identifying incoming email messages resembling the spam content using the stored data.
  • [0032]
    In an alternative embodiment, each server 104 hosts both the spam content preparation module 108 that generates data characterizing the content associated with a spam attack and the similarity determination module 110 that uses the generated data to identify email messages resembling the spam content.
  • [0033]
    FIG. 2 is a block diagram of one embodiment of a spam content preparation module 200. The spam content preparation module 200 includes a spam content parser 202, a spam data generator 206, and a spam data transmitter 208.
  • [0034]
    The spam content parser 202 is responsible for parsing the body of email messages resulting from spam attacks (referred to as spam messages).
  • [0035]
    The spam data generator 206 is responsible for generating data characterizing a spam message. In one embodiment, data characterizing a spam message includes a list of hash values calculated for sets of tokens (e.g., characters, words, lines, etc.) composing the spam message. Data characterizing a spam message or any other email message is referred to herein as a message signature. Signatures of spam messages or any other email messages may contain various data identifying the message content and may be created using various algorithms that enable the use of similarity measures in comparing signatures of different email messages.
  • [0036]
    In one embodiment, the spam content preparation module 200 also includes a noise reduction algorithm 204 that is responsible for detecting data indicative of noise and removing the noise from spam messages prior to generating signatures of spam messages. Noise represents data invisible to a recipient that was added to a spam message to hide its spam nature.
  • [0037]
    In one embodiment, the spam content preparation module 200 also includes a message grouping algorithm (not shown) that is responsible for grouping messages originating from a single spam attack. Grouping may be performed based on specified characteristics of spam messages (e.g., included URLs, message parts, etc.). If grouping is used, the spam data generator 206 may generate a signature for a group of spam messages rather than for each individual message.
  • [0038]
    The spam data transmitter 208 is responsible for distributing signatures of spam messages to participating servers such as servers 104 of FIG. 1. In one embodiment, each server 104 periodically (e.g., every 5 minutes) initiates a connection (e.g., a secure HTTPS connection) with the control center 102. Using this pull-based connection, signatures are transmitted from the control center 102 to the relevant server 104.
  • [0039]
    FIG. 3 is a block diagram of one embodiment of a similarity determination module 300. The similarity determination module 300 includes an incoming message parser 302, a spam data receiver 306, a message data generator 310, a resemblance identifier 312, and a spam database 304.
  • [0040]
    The incoming message parser 302 is responsible for parsing the body of incoming email messages.
  • [0041]
    The spam data receiver 306 is responsible for receiving signatures of spam messages and storing them in the spam database 304.
  • [0042]
    The message data generator 310 is responsible for generating signatures of incoming email messages. In some embodiments, a signature of an incoming email message includes a list of hash values calculated for sets of tokens (e.g., characters, words, lines, etc.) composing the incoming email message. In other embodiments, a signature of an incoming email message includes various other data characterizing the content of the email message (e.g., a subset of token sets composing the incoming email message). As discussed above, signatures of email messages may be created using various algorithms that allow for use of similarity measures in comparing signatures of different email messages.
  • [0043]
    In one embodiment, the similarity determination module 300 also includes an incoming message cleaning algorithm 308 that is responsible for detecting data indicative of noise and removing the noise from the incoming email messages prior to generating their signatures, as will be discussed in more detail below.
  • [0044]
    The resemblance identifier 312 is responsible for comparing the signature of each incoming email message with the signatures of spam messages stored in the spam database 304 and determining, based on this comparison, whether an incoming email message is similar to any spam message.
  • [0045]
    In one embodiment, the spam database 304 stores signatures generated for spam messages before they undergo the noise reduction process (i.e., noisy spam messages) and signatures generated for these spam messages after they undergo the noise reduction process (i.e., spam messages with reduced noise). In this embodiment, the message data generator 310 first generates a signature of an incoming email message prior to noise reduction, and the resemblance identifier 312 compares this signature with the signatures of noisy spam messages. If this comparison indicates that the incoming email message is similar to one of these spam messages, then the resemblance identifier 312 marks this incoming email message as spam. Otherwise, the resemblance identifier 312 invokes the incoming message cleaning algorithm 308 to remove noise from the incoming email message. Then, the message data generator 310 generates a signature for the modified incoming message, which is then compared by the resemblance identifier 312 with the signatures of spam messages with reduced noise.
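    The two-stage comparison in this embodiment can be sketched as follows. This is a minimal illustration, assuming the cleaned comparison runs only when the first comparison finds no match. The function name `classify` and its callable parameters (`make_signature`, `remove_noise`, `resembles`) are hypothetical stand-ins for the message data generator 310, the cleaning algorithm 308, and the resemblance identifier 312; they are not taken from the application.

```python
from typing import Callable, Iterable


def classify(
    message: str,
    make_signature: Callable[[str], frozenset],
    remove_noise: Callable[[str], str],
    resembles: Callable[[frozenset, frozenset], bool],
    noisy_spam: Iterable[frozenset],
    clean_spam: Iterable[frozenset],
) -> bool:
    """Return True if the message resembles a stored spam signature."""
    # Stage 1: compare the unmodified message with signatures of noisy spam.
    raw_signature = make_signature(message)
    if any(resembles(raw_signature, spam) for spam in noisy_spam):
        return True
    # Stage 2: remove noise, regenerate the signature, and compare it
    # with signatures of spam messages with reduced noise.
    clean_signature = make_signature(remove_noise(message))
    return any(resembles(clean_signature, spam) for spam in clean_spam)
```

    Running the cheaper unmodified comparison first means the cleaning step is paid for only by messages that did not already match.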
  • [0046]
    FIG. 4 is a flow diagram of one embodiment of a process 400 for handling a spam message. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both. In one embodiment, processing logic resides at a control center 102 of FIG. 1.
  • [0047]
    Referring to FIG. 4, process 400 begins with processing logic receiving a spam message (processing block 402).
  • [0048]
    At processing block 404, processing logic modifies the spam message to reduce noise. One embodiment of a noise reduction algorithm will be discussed in more detail below in conjunction with FIGS. 9 and 10.
  • [0049]
    At processing block 406, processing logic generates a signature of the spam message. In one embodiment, a signature of the spam message includes a list of hash values calculated for sets of tokens (e.g., characters, words, lines, etc.) composing the spam message, as will be discussed in more detail below in conjunction with FIG. 6A. In other embodiments, a signature of the spam message includes various other data characterizing the content of the spam message.
  • [0050]
    At processing block 408, processing logic transfers the signature of the spam message to a server (e.g., a server 104 of FIG. 1), which uses the signature of the spam message to find incoming email messages resembling the spam message (block 410).
  • [0051]
    FIG. 5 is a flow diagram of one embodiment of a process 500 for filtering email spam based on similarity measures. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both. In one embodiment, processing logic resides at a server 104 of FIG. 1.
  • [0052]
    Referring to FIG. 5, process 500 begins with processing logic receiving an incoming email message (processing block 502).
  • [0053]
    At processing block 504, processing logic modifies the incoming message to reduce noise. One embodiment of a noise reduction algorithm will be discussed in more detail below in conjunction with FIGS. 9 and 10.
  • [0054]
    At processing block 506, processing logic generates a signature of the incoming message based on the content of the incoming message. In one embodiment, a signature of an incoming email message includes a list of hash values calculated for sets of tokens (e.g., characters, words, lines, etc.) composing the incoming email message, as will be discussed in more detail below in conjunction with FIG. 6A. In other embodiments, a signature of an incoming email message includes various other data characterizing the content of the email message.
  • [0055]
    At processing block 508, processing logic compares the signature of the incoming message with signatures of spam messages.
  • [0056]
    At processing block 510, processing logic determines that the resemblance between the signature of the incoming message and a signature of some spam message exceeds a threshold similarity measure. One embodiment of a process for determining the resemblance between two messages is discussed in more detail below in conjunction with FIG. 6B.
  • [0057]
    At processing block 512, processing logic marks the incoming email message as spam.
  • [0058]
    FIG. 6A is a flow diagram of one embodiment of a process 600 for creating a signature of an email message. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both. In one embodiment, processing logic resides at a server 104 of FIG. 1.
  • [0059]
    Referring to FIG. 6A, process 600 begins with processing logic dividing an email message into sets of tokens (processing block 602). Each set of tokens may include a predefined number of sequential units from the email message. The predefined number may be equal to, or greater than, 1. A unit may represent a character, a word or a line in the email message. In one embodiment, each set of tokens is combined with the number of occurrences of this set of tokens in the email message.
  • [0060]
    At processing block 604, processing logic calculates hash values for the sets of tokens. In one embodiment, a hash value is calculated by applying a hash function to each combination of a set of tokens and a corresponding token occurrence number.
  • [0061]
    At processing block 606, processing logic creates a signature for the email message using the calculated hash values. In one embodiment, the signature is created by selecting a subset of calculated hash values and adding a parameter characterizing the email message to the selected subset of calculated hash values. The parameter may specify, for example, the size of the email message, the number of calculated hash values, the keyword associated with the email message, the name of an attachment file, etc.
  • [0062]
    In one embodiment, a signature for an email message is created using a character-based document comparison mechanism that will be discussed in more detail below in conjunction with FIGS. 7 and 8.
  • [0063]
    FIG. 6B is a flow diagram of one embodiment of a process 650 for detecting spam using a signature of an email message. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both. In one embodiment, processing logic resides at a server 104 of FIG. 1.
  • [0064]
    Referring to FIG. 6B, process 650 compares data in a signature of an incoming email message with data in a signature of each spam message. The signature data includes a parameter characterizing the content of an email message and a subset of hash values generated for the tokens contained in the email message. The parameter may specify, for example, the size of the email message, the number of tokens in the email message, the keyword associated with the email message, the name of an attachment file, etc.
  • [0065]
    Processing logic begins with comparing a parameter in a signature of the incoming email message with a corresponding parameter in a signature of each spam message (processing block 652).
  • [0066]
    At decision box 654, processing logic determines whether any spam message signatures contain a parameter similar to the parameter of the incoming message signature. The similarity may be determined, for example, based on the allowed difference between the two parameters or the allowed ratio of the two parameters.
  • [0067]
    If none of the spam message signatures contain a parameter similar to the parameter of the incoming message signature, processing logic decides that the incoming email message is legitimate (i.e., it is not spam) (processing block 662).
  • [0068]
    Alternatively, if one or more spam message signatures have a similar parameter, processing logic determines whether the signature of the first spam message has hash values similar to the hash values in the signature of the incoming email (decision box 656). Based on the similarity threshold, the hash values may be considered similar if, for example, a certain number of them match or the ratio of matched and unmatched hash values exceeds a specified threshold.
  • [0069]
    If the first spam message signature has hash values similar to the hash values of the incoming email signature, processing logic decides that the incoming email message is spam (processing block 670). Otherwise, processing logic further determines if there are more spam message signatures with the similar parameter (decision box 658). If so, processing logic determines whether the next spam message signature has hash values similar to the hash values of the incoming email signature (decision box 656). If so, processing logic decides that the incoming email message is spam (processing block 670). If not, processing logic returns to decision box 658.
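    The decision flow of FIG. 6B can be sketched as a parameter pre-filter followed by a hash comparison over the remaining candidates. The `Signature` class, the function `is_spam`, and the default limits (`max_param_diff`, `min_matches`) are assumptions made for illustration; the description leaves the concrete similarity rule open, and this sketch uses the "certain number of matching hash values" variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    parameter: int          # e.g., message size or number of tokens
    hashes: frozenset       # selected subset of hash values


def is_spam(incoming, spam_signatures, max_param_diff=1, min_matches=3):
    """Return True if any spam signature resembles the incoming signature."""
    # Decision box 654: keep only signatures with a similar parameter.
    candidates = [
        spam for spam in spam_signatures
        if abs(spam.parameter - incoming.parameter) <= max_param_diff
    ]
    # Decision boxes 656 and 658: test the candidates one at a time.
    for candidate in candidates:
        if len(candidate.hashes & incoming.hashes) >= min_matches:
            return True     # processing block 670: spam
    return False            # processing block 662: legitimate
```

    The parameter check is cheap, so it discards most stored signatures before any hash sets are intersected.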
  • [0070]
    If processing logic determines that no other spam message signatures have a similar parameter, then it decides that the incoming email message is not spam (processing block 662).
  • [0000]
    Character-Based Document Comparison Mechanism
  • [0071]
    FIG. 7 is a flow diagram of one embodiment of a process 700 for a character-based comparison of documents. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both.
  • [0072]
    Referring to FIG. 7, process 700 begins with processing logic pre-processing a document (processing block 702). In one embodiment, the document is pre-processed by changing each upper case alphabetic character within the document to a lower case alphabetic character. For example, the message “I am Sam, Sam I am.” may be pre-processed into an expression “i.am.sam.sam.i.am”.
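    A minimal sketch of this pre-processing step follows. The text states only the conversion to lower case; the handling of spaces and punctuation (each run collapsed to a single "." and trailing separators dropped) is an assumption inferred from the example expression, and the function name `preprocess` is hypothetical.

```python
import re


def preprocess(document: str) -> str:
    """Lowercase the document and normalize separators."""
    lowered = document.lower()
    # Assumption: every run of non-alphanumeric characters becomes one ".".
    collapsed = re.sub(r"[^a-z0-9]+", ".", lowered)
    # Assumption: leading and trailing separators are dropped.
    return collapsed.strip(".")


print(preprocess("I am Sam, Sam I am."))  # prints i.am.sam.sam.i.am
```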
  • [0073]
    At processing block 704, processing logic divides the document into tokens, with each token including a predefined number of sequential characters from the document. In one embodiment, each token is combined with its occurrence number. This combination is referred to as a labeled shingle. For example, if the predefined number of sequential characters in the token is equal to 3, the expression specified above includes the following set of labeled shingles:
      • i.a1
      • .am1
      • am.1
      • m.s1
      • .sa1
      • sam1
      • am.2
      • m.s2
      • .sa2
      • sam2
      • am.3
      • m.i1
      • .i.1
      • i.a2
      • .am2
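    The construction of labeled shingles can be sketched as a sliding window with a running occurrence count per token. The function name `labeled_shingles` and the `width` parameter are hypothetical names chosen for this illustration.

```python
from collections import Counter


def labeled_shingles(text: str, width: int = 3) -> list[str]:
    """Return each width-character token suffixed with its occurrence number."""
    seen = Counter()
    shingles = []
    for start in range(len(text) - width + 1):
        token = text[start:start + width]
        seen[token] += 1                      # 1 for the first occurrence, 2 for the second, ...
        shingles.append(f"{token}{seen[token]}")
    return shingles


shingles = labeled_shingles("i.am.sam.sam.i.am")
print(len(shingles))   # prints 15
print(shingles[:3])    # prints ['i.a1', '.am1', 'am.1']
```

    Appending the occurrence number makes every labeled shingle unique within a document, so repeated text still contributes distinct hash values.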
  • [0089]
    In one embodiment, the shingles are represented as a histogram.
  • [0090]
    At processing block 706, processing logic calculates hash values for the tokens. In one embodiment, the hash values are calculated for the labeled shingles. For example, if a hashing function H(x) is applied to each labeled shingle illustrated above, the following results are produced:
      • H(i.a1)->458348732
      • H(.am1)->200404023
      • H(am.1)->692939349
      • H(m.s1)->220443033
      • H(.sa1)->554034022
      • H(sam1)->542929292
      • H(am.2)->629292229
      • H(m.s2)->702202232
      • H(.sa2)->322243349
      • H(sam2)->993923828
      • H(am.3)->163393269
      • H(m.i1)->595437753
      • H(.i.1)->843438583
      • H(i.a2)->244485639
      • H(.am2)->493869359
  • [0106]
    In one embodiment, processing logic then sorts the hash values as follows:
      • 163393269
      • 200404023
      • 220443033
      • 244485639
      • 322243349
      • 458348732
      • 493869359
      • 542929292
      • 554034022
      • 595437753
      • 629292229
      • 692939349
      • 702202232
      • 843438583
      • 993923828
  • [0122]
    At processing block 708, processing logic selects a subset of hash values from the calculated hash values. In one embodiment, processing logic selects X smallest values from the sorted hash values and creates from them a “sketch” of the document. For example, for X=4, the sketch can be expressed as follows:
      • [163393269 200404023 220443033 244485639].
  • [0124]
    At processing block 710, processing logic creates a signature of the document by adding to the sketch a parameter pertaining to the tokens of the document. In one embodiment, the parameter specifies the number of original tokens in the document. In the example above, the number of original tokens is 15. Hence, the signature of the document can be expressed as follows:
      • [15 163393269 200404023 220443033 244485639].
        Alternatively, the parameter may specify any other characteristic of the content of the document (e.g., the size of the document, the keyword associated with the document, etc.).
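    Processing blocks 706 through 710 can be sketched as follows. The description does not name the hashing function H(x), so this sketch substitutes the first four bytes of SHA-256 purely for illustration; its values will not match the example numbers above. The names `hash_shingle`, `make_signature`, and `sketch_size` are hypothetical.

```python
import hashlib


def hash_shingle(shingle: str) -> int:
    """Map a labeled shingle to a 32-bit integer (stand-in for H(x))."""
    digest = hashlib.sha256(shingle.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def make_signature(shingles: list[str], sketch_size: int = 4) -> list[int]:
    """Return [token count, X smallest hash values in ascending order]."""
    hashes = sorted(hash_shingle(s) for s in shingles)
    sketch = hashes[:sketch_size]           # the X smallest values form the sketch
    return [len(shingles)] + sketch         # prepend the token-count parameter


signature = make_signature(["i.a1", ".am1", "am.1", "m.s1", ".sa1"])
print(signature[0])  # prints 5
```

    Because both documents are reduced with the same deterministic rule, two documents that share most of their shingles tend to share most of their smallest hash values as well.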
  • [0126]
    FIG. 8 is a flow diagram of one embodiment of a process 800 for determining whether two documents are similar. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both.
  • [0127]
    Referring to FIG. 8, process 800 begins with processing logic comparing the token numbers specified in the signatures of documents 1 and 2, and determining whether the token number in the first signature is within the allowed range with respect to the token number from the second signature (decision box 802). For example, the allowed range may be a difference of 1 or less or a ratio of 90 percent or higher.
  • [0128]
    If the token number in the first signature is outside of the allowed range with respect to the token number from the second signature, processing logic decides that documents 1 and 2 are different (processing block 808). Otherwise, if the token number in the first signature is within the allowed range with respect to the token number from the second signature, processing logic determines whether the resemblance between hash values in signatures 1 and 2 exceeds a threshold (e.g., more than 95 percent of hash values are the same) (decision box 804). If so, processing logic decides that the two documents are similar (processing block 806). If not, processing logic decides that documents 1 and 2 are different (processing block 808).
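The two-stage comparison of process 800 might look roughly like the following Python sketch. The function names are hypothetical, the default limits are taken from the examples in the text (a difference of 1, a ratio of 90 percent, a threshold of 95 percent), and measuring resemblance as the shared fraction of hash values is an assumption, since the description does not fix a formula.

```python
def token_counts_in_range(n1, n2, max_diff=1, min_ratio=0.90):
    """Decision box 802: token counts must be close in difference or ratio."""
    if abs(n1 - n2) <= max_diff:
        return True
    larger = max(n1, n2)
    return larger > 0 and min(n1, n2) / larger >= min_ratio


def resemblance(sketch1, sketch2):
    """Assumed measure: shared hash values over the larger sketch size."""
    s1, s2 = set(sketch1), set(sketch2)
    if not s1 or not s2:
        return 0.0
    return len(s1 & s2) / max(len(s1), len(s2))


def are_similar(sig1, sig2, threshold=0.95):
    """Signatures are lists of the form [token_count, hash, hash, ...]."""
    n1, *sketch1 = sig1
    n2, *sketch2 = sig2
    if not token_counts_in_range(n1, n2):
        return False  # processing block 808
    return resemblance(sketch1, sketch2) > threshold  # decision box 804
```

Checking the token counts first is cheap and lets most non-matching signatures be rejected before any hash values are compared.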
  • [0000]
    Email Spam Filtering Using Noise Reduction
  • [0129]
    FIG. 9 is a flow diagram of one embodiment of a process 900 for reducing noise in an email message. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both.
  • [0130]
    Referring to FIG. 9, process 900 begins with processing logic detecting in an email message data indicative of noise (processing block 902). As discussed above, noise represents data that is invisible to a recipient of the email message and was added to the email message to avoid spam filtering. Such data may include, for example, formatting data (e.g., HTML tags), numeric character references, character entity references, URL data of predefined categories, etc. Numeric character references specify the code position of a character in the document character set. Character entity references use symbolic names so that authors need not remember code positions. For example, the character entity reference &aring; refers to the lowercase “a” character topped with a ring.
  • [0131]
    At processing block 904, processing logic modifies the content of the email message to reduce the noise. In one embodiment, the content modification includes removing formatting data, translating numeric character references and character entity references to their ASCII equivalents, and modifying URL data.
  • [0132]
    At processing block 906, processing logic compares the modified content of the email message with the content of a spam message. In one embodiment, the comparison is performed to identify an exact match. Alternatively, the comparison is performed to determine whether the two documents are similar.
  • [0133]
    FIG. 10 is a flow diagram of one embodiment of a process 1000 for modifying an email message to reduce noise. The process may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as run on a general purpose computer system or a dedicated machine), or a combination of both.
  • [0134]
    Referring to FIG. 10, process 1000 begins with processing logic searching an email message for formatting data (e.g., HTML tags) (processing block 1002).
  • [0135]
    At decision box 1004, processing logic determines whether the found formatting data qualifies as an exception. Typically, HTML formatting does not add anything to the information content of a message. However, a few exceptions exist. These exceptions are the tags that contain useful information for further processing of the message (e.g., tags <BODY>, <A>, <IMG>, and <FONT>). For example, the <FONT> and <BODY> tags are needed for “white on white” text elimination, and the <A> and <IMG> tags typically contain link information that may be used for passing data to other components of the system.
  • [0136]
    If the formatting data does not qualify as an exception, the formatting data is extracted from the email message (processing block 1006).
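As a rough illustration of blocks 1002 through 1006, the following Python sketch removes HTML tags unless they are on the exception list named above. The names `KEEP_TAGS` and `strip_formatting` are hypothetical, and the regular expression is a simplification: it does not handle comments or attribute values that contain a ">" character, so a real implementation would more likely use an HTML parser.

```python
import re

# Exception tags that carry information needed by later processing steps.
KEEP_TAGS = {"body", "a", "img", "font"}

# Matches an opening or closing tag and captures the tag name.
_TAG = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9]*)\b[^>]*>")


def strip_formatting(html_text):
    """Remove every tag whose name is not in KEEP_TAGS; keep the text."""
    def _replace(match):
        name = match.group(1).lower()
        return match.group(0) if name in KEEP_TAGS else ""
    return _TAG.sub(_replace, html_text)


sample = '<html><BODY bgcolor="#fff"><b>Buy</b> <a href="http://x.test">now</a><br></BODY></html>'
print(strip_formatting(sample))
```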
  • [0137]
    Next, processing logic converts each numeric character reference and character entity reference into a corresponding ASCII character (processing block 1008).
  • [0138]
    In HTML, numeric character references may take two forms:
      • 1. The syntax “&#D;”, where D is a decimal number, refers to the ISO 10646 decimal character number D; and
      • 2. The syntax “&#xH;” or “&#XH;”, where H is a hexadecimal number, refers to the ISO 10646 hexadecimal character number H. Hexadecimal numbers in numeric character references are case-insensitive.
        For example, randomized characters in the body may appear as the following expression:
    • Th&#101&#32&#83a&#118&#105n&#103&#115R&#101&#103 is &#116e&#114&#119&#97&#110&#116&#115&#32yo&#117.
      This expression has a meaning of the phrase “The SavingsRegister wants you.”
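The two numeric reference forms can be decoded with a small Python sketch such as the one below. The function name `decode_numeric_refs` is hypothetical, and treating the trailing semicolon as optional is an assumption made so that references written without it, as in the expression above, are also converted.

```python
import re

# "&#D;" (decimal) or "&#xH;" / "&#XH;" (hexadecimal); semicolon optional here.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?")


def decode_numeric_refs(text):
    """Replace each numeric character reference with the character it names."""
    def _replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            return match.group(0)  # leave out-of-range references untouched
    return _NUMERIC_REF.sub(_replace, text)


print(decode_numeric_refs("Th&#101&#32&#83a&#118&#105n&#103&#115"))
```

The sample input is only the opening portion of the expression above.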
  • [0142]
    Sometimes the conversion performed at processing block 1008 may need to be repeated. For example, the string “&#38;” corresponds to the string “&” in ASCII, the string “&#35;” corresponds to the string “#” in ASCII, the string “&#51;” corresponds to 3 in ASCII, the string “&#56;” corresponds to 8 in ASCII, and the string “&#59;” corresponds to the string “;” in ASCII. Hence, the combined string “&#38;&#35;&#51;&#56;&#59;”, when converted, results in the string “&#38;”, which in turn needs to be converted.
  • [0143]
    Accordingly, after the first conversion operation at processing block 1008, processing logic checks whether the converted data still includes numeric character references or character entity references (decision box 1010). If the check is positive, processing logic repeats the conversion operation at processing block 1008. Otherwise, processing logic proceeds to processing block 1012.
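The loop formed by block 1008 and decision box 1010 amounts to repeating the conversion until the text stops changing. A minimal Python sketch follows, using the standard library function `html.unescape` for the conversion step. The function name `decode_until_stable` and the `max_passes` limit are assumptions added for illustration, and `html.unescape` yields Unicode characters rather than strictly ASCII equivalents.

```python
import html


def decode_until_stable(text, max_passes=10):
    """Repeat reference conversion until no further change occurs."""
    for _ in range(max_passes):
        decoded = html.unescape(text)  # one conversion pass (block 1008)
        if decoded == text:            # nothing left to convert (box 1010)
            break
        text = decoded
    return text


print(decode_until_stable("&#38;&#35;&#51;&#56;&#59;"))
```

The pass limit keeps deeply nested encodings from consuming unbounded processing time.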
  • [0144]
    At processing block 1012, processing logic modifies URL data of predefined categories. These categories may include, for example, numerical character references contained in the URL that are converted by processing logic into corresponding ASCII characters. In addition, the URL “password” syntax may be used to add characters before an “@” in the URL hostname. These characters are ignored by the target web server but they add significant amounts of noise information to each URL. Processing logic modifies the URL data by removing these additional characters. Finally, processing logic removes the “query” part of the URL, following a string “?” at the end of the URL.
  • [0145]
    An example of a URL is as follows:
    • http%3a%2f%2flotsofjunk@www.foo.com%2fbar.html?muchmorejunk
      Processing logic modifies the above URL data into http://www.foo.com/bar.html.
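The URL modifications of block 1012 can be approximated in Python with the standard `urllib.parse` module. The function name `clean_url` is hypothetical, and this sketch assumes the encoded characters are percent-encoded, as in the example URL. Decoding before splitting is a simplification, because an encoded "?" or "@" inside the path would change how the URL is parsed.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def clean_url(raw_url):
    """Decode escapes, drop text before '@' in the host, drop the query."""
    decoded = unquote(raw_url)          # e.g. %3a -> ":", %2f -> "/"
    parts = urlsplit(decoded)
    host = parts.hostname or ""         # hostname excludes the "user@" part
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # An empty query string removes everything after "?".
    return urlunsplit((parts.scheme, host, parts.path, "", parts.fragment))


print(clean_url("http%3a%2f%2flotsofjunk@www.foo.com%2fbar.html?muchmorejunk"))
```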
      An Exemplary Computer System
  • [0147]
    FIG. 11 is a block diagram of an exemplary computer system 1100 that may be used to perform one or more of the operations described herein. In alternative embodiments, the machine may comprise a network router, a network switch, a network bridge, Personal Digital Assistant (PDA), a cellular telephone, a web appliance or any machine capable of executing a sequence of instructions that specify actions to be taken by that machine.
  • [0148]
    The computer system 1100 includes a processor 1102, a main memory 1104 and a static memory 1106, which communicate with each other via a bus 1108. The computer system 1100 may further include a video display unit 1110 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 1100 also includes an alpha-numeric input device 1112 (e.g., a keyboard), a cursor control device 1114 (e.g., a mouse), a disk drive unit 1116, a signal generation device 1120 (e.g., a speaker) and a network interface device 1122.
  • [0149]
    The disk drive unit 1116 includes a computer-readable medium 1124 on which is stored a set of instructions (i.e., software) 1126 embodying any one, or all, of the methodologies described above. The software 1126 is also shown to reside, completely or at least partially, within the main memory 1104 and/or within the processor 1102. The software 1126 may further be transmitted or received via the network interface device 1122. For the purposes of this specification, the term “computer-readable medium” shall be taken to include any medium that is capable of storing or encoding a sequence of instructions for execution by the computer and that cause the computer to perform any one of the methodologies of the present invention. The term “computer-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic disks, and carrier wave signals.
  • [0150]
    Whereas many alterations and modifications of the present invention will no doubt become apparent to a person of ordinary skill in the art after having read the foregoing description, it is to be understood that any particular embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore, references to details of various embodiments are not intended to limit the scope of the claims which in themselves recite only those features regarded as essential to the invention.

Claims (29)

  1. The method comprising:
    receiving an email message;
    generating data characterizing the email message based on content of the email message;
    comparing the data characterizing the email message with a set of data characterizing a plurality of spam messages; and
    determining whether a resemblance between the data characterizing the email message and any data item within the set of data characterizing the plurality of spam messages exceeds a threshold.
  2. The method of claim 1 further comprising:
    marking the email message as spam if the resemblance between the data characterizing the email message and any data item within the set of data characterizing the plurality of spam messages exceeds a threshold.
  3. The method of claim 1 further comprising:
    receiving data characterizing a new spam message; and
    storing the received data in a database.
  4. The method of claim 1 further comprising:
    upon receiving the email message, evaluating the email message for a presence of noise added to avoid spam filtering; and
    modifying content of the email message to reduce the noise.
  5. The method of claim 4 wherein evaluating the email message for the presence of noise comprises detecting at least one of formatting data, a numeric character reference, a character entity reference, and predefined URL data.
  6. The method of claim 1 wherein generating the data characterizing the email message comprises:
    dividing the email message into a plurality of tokens; and
    calculating a plurality of hash values for the plurality of tokens.
  7. The method of claim 6 wherein comparing the data characterizing the email message with the set of data characterizing the plurality of spam messages comprises:
    finding, in the set of data characterizing the plurality of spam messages, one or more data items having additional information similar to additional information contained in the data characterizing the plurality of spam messages; and
    comparing the subset of hash values in the data characterizing the email message with a subset of hash values in each found data item until finding a similar subset of hash values.
  8. The method of claim 3 further comprising:
    evaluating the new spam message for a presence of noise;
    modifying content of the new spam message to reduce the noise; and
    generating data characterizing the spam message based on content of the modified new spam message.
  9. The method comprising:
    receiving a spam message;
    generating data characterizing the spam message based on content of the spam message; and
    transferring the data characterizing the spam message to a server, the data characterizing the spam message being subsequently used to find incoming messages resembling the spam message.
  10. The method of claim 9 further comprising:
    upon receiving the spam message, evaluating the spam message for a presence of noise; and
    modifying content of the spam message to reduce the noise.
  11. The method of claim 9 wherein evaluating the spam message for the presence of noise comprises detecting at least one of formatting data, a numeric character reference, a character entity reference, and predefined URL data.
  12. The method of claim 9 wherein generating the data characterizing the spam message comprises:
    dividing the spam message into a plurality of tokens; and
    calculating a plurality of hash values for the plurality of tokens.
  13. A system comprising:
    an incoming message parser to receive an email message;
    a message data generator to generate data characterizing the email message based on content of the email message; and
    a resemblance identifier to compare the data characterizing the email message with a set of data characterizing a plurality of spam messages, and to determine whether a resemblance between the data characterizing the email message and any data item within the set of data characterizing the plurality of spam messages exceeds a threshold.
  14. The system of claim 13 further comprising:
    a database to store data characterizing a new spam message.
  15. The system of claim 13 further comprising:
    a message cleaning algorithm to evaluate the email message for a presence of noise added to avoid spam filtering, and to modify content of the email message to reduce the noise.
  16. A system comprising:
    a spam content parser to receive a spam message;
    a spam data generator to generate data characterizing the spam message based on content of the spam message; and
    a spam data transmitter to transfer the data characterizing the spam message to a server, the data characterizing the spam message being subsequently used to find incoming messages resembling the spam message.
  17. The system of claim 16 further comprising a noise reduction algorithm to evaluate the spam message for a presence of noise, and to modify content of the spam message to reduce the noise.
  18. The system of claim 17 wherein the noise reduction algorithm is to evaluate the spam message for the presence of noise by detecting at least one of formatting data, a numeric character reference, a character entity reference, and predefined URL data.
  19. An apparatus comprising:
    means for receiving an email message;
    means for generating data characterizing the email message based on content of the email message;
    means for comparing the data characterizing the email message with a set of data characterizing a plurality of spam messages; and
    means for determining whether a resemblance between the data characterizing the email message and any data item within the set of data characterizing the plurality of spam messages exceeds a threshold.
  20. The apparatus of claim 19 further comprising:
    means for receiving data characterizing a new spam message; and
    a database to store the received data.
  21. An apparatus comprising:
    means for receiving a spam message;
    means for generating data characterizing the spam message based on content of the spam message; and
    means for transferring the data characterizing the spam message to a server, the data characterizing the spam message being subsequently used to find incoming messages resembling the spam message.
  22. The apparatus of claim 21 further comprising:
    means for evaluating the spam message for a presence of noise; and
    means for modifying content of the spam message to reduce the noise.
  23. The apparatus of claim 21 wherein means for evaluating the spam message for the presence of noise comprises means for detecting at least one of formatting data, a numeric character reference, a character entity reference, and predefined URL data.
  24. A computer readable medium comprising executable instructions which when executed on a processing system cause said processing system to perform a method comprising:
    receiving an email message;
    generating data characterizing the email message based on content of the email message;
    comparing the data characterizing the email message with a set of data characterizing a plurality of spam messages; and
    determining whether a resemblance between the data characterizing the email message and any data item within the set of data characterizing the plurality of spam messages exceeds a threshold.
  25. The computer readable medium of claim 24 wherein the method further comprises:
    receiving data characterizing a new spam message; and
    storing the received data in a database.
  26. The computer readable medium of claim 24 wherein the method further comprises:
    upon receiving the email message, evaluating the email message for a presence of noise added to avoid spam filtering; and
    modifying content of the email message to reduce the noise.
  27. A computer readable medium comprising executable instructions which when executed on a processing system cause said processing system to perform a method comprising:
    receiving a spam message;
    generating data characterizing the spam message based on content of the spam message; and
    transferring the data characterizing the spam message to a server, the data characterizing the spam message being subsequently used to find incoming messages resembling the spam message.
  28. The computer readable medium of claim 27 wherein the method further comprises:
    upon receiving the spam message, evaluating the spam message for a presence of noise; and
    modifying content of the spam message to reduce the noise.
  29. The computer readable medium of claim 27 wherein evaluating the spam message for the presence of noise comprises detecting at least one of formatting data, a numeric character reference, a character entity reference, and predefined URL data.
US10846723 2003-05-15 2004-05-13 Method and apparatus for filtering email spam based on similarity measures Abandoned US20050108340A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US47124203 2003-05-15 2003-05-15
US10846723 US20050108340A1 (en) 2003-05-15 2004-05-13 Method and apparatus for filtering email spam based on similarity measures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10846723 US20050108340A1 (en) 2003-05-15 2004-05-13 Method and apparatus for filtering email spam based on similarity measures

Publications (1)

Publication Number Publication Date
US20050108340A1 (en) 2005-05-19

Family

ID=33476818

Family Applications (4)

Application Number Title Priority Date Filing Date
US10846723 Abandoned US20050108340A1 (en) 2003-05-15 2004-05-13 Method and apparatus for filtering email spam based on similarity measures
US10845819 Active 2027-02-24 US7831667B2 (en) 2003-05-15 2004-05-13 Method and apparatus for filtering email spam using email noise reduction
US10845648 Abandoned US20050132197A1 (en) 2003-05-15 2004-05-13 Method and apparatus for a character-based comparison of documents
US12941939 Active 2024-09-12 US8402102B2 (en) 2003-05-15 2010-11-08 Method and apparatus for filtering email spam using email noise reduction

Family Applications After (3)

Application Number Title Priority Date Filing Date
US10845819 Active 2027-02-24 US7831667B2 (en) 2003-05-15 2004-05-13 Method and apparatus for filtering email spam using email noise reduction
US10845648 Abandoned US20050132197A1 (en) 2003-05-15 2004-05-13 Method and apparatus for a character-based comparison of documents
US12941939 Active 2024-09-12 US8402102B2 (en) 2003-05-15 2010-11-08 Method and apparatus for filtering email spam using email noise reduction

Country Status (4)

Country Link
US (4) US20050108340A1 (en)
EP (1) EP1649645A2 (en)
JP (1) JP4598774B2 (en)
WO (1) WO2004105332A9 (en)

Cited By (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040003283A1 (en) * 2002-06-26 2004-01-01 Goodman Joshua Theodore Spam detector with challenges
US20040123157A1 (en) * 2002-12-13 2004-06-24 Wholesecurity, Inc. Method, system, and computer program product for security within a global computer network
US20050108339A1 (en) * 2003-05-15 2005-05-19 Matt Gleeson Method and apparatus for filtering email spam using email noise reduction
US20050188032A1 (en) * 2004-01-14 2005-08-25 Katsuyuki Yamazaki Mass mail detection system and mail server
US20060031307A1 (en) * 2004-05-18 2006-02-09 Rishi Bhatia System and method for filtering network messages
US20060095966A1 (en) * 2004-11-03 2006-05-04 Shawn Park Method of detecting, comparing, blocking, and eliminating spam emails
US20060149820A1 (en) * 2005-01-04 2006-07-06 International Business Machines Corporation Detecting spam e-mail using similarity calculations
US20070038705A1 (en) * 2005-07-29 2007-02-15 Microsoft Corporation Trees of classifiers for detecting email spam
US20070064704A1 (en) * 2002-06-04 2007-03-22 Fortinet, Inc. Methods and systems for a distributed provider edge
US20070180031A1 (en) * 2006-01-30 2007-08-02 Microsoft Corporation Email Opt-out Enforcement
US20070208850A1 (en) * 2006-03-01 2007-09-06 Fortinet, Inc. Electronic message and data tracking system
US20080104062A1 (en) * 2004-02-09 2008-05-01 Mailfrontier, Inc. Approximate Matching of Strings for Message Filtering
US20080104712A1 (en) * 2004-01-27 2008-05-01 Mailfrontier, Inc. Message Distribution Control
US20080114843A1 (en) * 2006-11-14 2008-05-15 Mcafee, Inc. Method and system for handling unwanted email messages
US20080259934A1 (en) * 2000-09-13 2008-10-23 Fortinet, Inc. Distributed virtual system to support managed, network-based services
US20080317231A1 (en) * 2004-11-18 2008-12-25 Fortinet, Inc. Managing hierarchically organized subscriber profiles
WO2009010634A1 (en) * 2007-07-17 2009-01-22 Airwide Solutions Oy Content tracking
US20090046728A1 (en) * 2000-09-13 2009-02-19 Fortinet, Inc. System and method for delivering security services
US20090073977A1 (en) * 2002-06-04 2009-03-19 Fortinet, Inc. Routing traffic through a virtual router-based network switch
US20090089266A1 (en) * 2007-09-27 2009-04-02 Microsoft Corporation Method of finding candidate sub-queries from longer queries
US20090089384A1 (en) * 2007-09-30 2009-04-02 Tsuen Wan Ngan System and method for detecting content similarity within email documents by sparse subset hashing
US20090089539A1 (en) * 2007-09-30 2009-04-02 Guy Barry Owen Bunker System and method for detecting email content containment
US20090089383A1 (en) * 2007-09-30 2009-04-02 Tsuen Wan Ngan System and method for detecting content similarity within emails documents employing selective truncation
US20090225754A1 (en) * 2004-09-24 2009-09-10 Fortinet, Inc. Scalable ip-services enabled multicast forwarding with efficient resource utilization
US20090319506A1 (en) * 2008-06-19 2009-12-24 Tsuen Wan Ngan System and method for efficiently finding email similarity in an email repository
US20090327430A1 (en) * 2008-06-27 2009-12-31 Microsoft Corporation Determining email filtering type based on sender classification
US20100057707A1 (en) * 2008-09-03 2010-03-04 Microsoft Corporation Query-oriented message characterization
US20100057933A1 (en) * 2008-09-03 2010-03-04 Microsoft Corporation Probabilistic mesh routing
US20100094887A1 (en) * 2006-10-18 2010-04-15 Jingjun Ye Method and System for Determining Junk Information
US20100095378A1 (en) * 2003-09-08 2010-04-15 Jonathan Oliver Classifying a Message Based on Fraud Indicators
US7711779B2 (en) 2003-06-20 2010-05-04 Microsoft Corporation Prevention of outgoing spam
US7720053B2 (en) 2002-06-04 2010-05-18 Fortinet, Inc. Service processing switch
US7739337B1 (en) * 2005-06-20 2010-06-15 Symantec Corporation Method and apparatus for grouping spam email messages
US20100162404A1 (en) * 2008-12-23 2010-06-24 International Business Machines Corporation Identifying spam avatars in a virtual universe (vu) based upon turing tests
US20100162403A1 (en) * 2008-12-23 2010-06-24 International Business Machines Corporation System and method in a virtual universe for identifying spam avatars based upon avatar multimedia characteristics
US7760684B2 (en) 2006-02-13 2010-07-20 Airwide Solutions, Inc. Measuring media distribution and impact in a mobile communication network
US7761743B2 (en) 2002-08-29 2010-07-20 Fortinet, Inc. Fault tolerant routing in a non-hot-standby configuration of a network routing system
US20100189016A1 (en) * 2001-06-28 2010-07-29 Fortinet, Inc. Identifying nodes in a ring network
US20100241508A1 (en) * 2007-07-17 2010-09-23 Airwide Solutions Oy Delivery of Advertisements in Mobile Advertising System
US7885207B2 (en) 2000-09-13 2011-02-08 Fortinet, Inc. Managing and provisioning virtual routers
US20110055332A1 (en) * 2009-08-28 2011-03-03 Stein Christopher A Comparing similarity between documents for filtering unwanted documents
US7912936B2 (en) 2000-09-13 2011-03-22 Nara Rajagopalan Managing interworking communications protocols
US7941490B1 (en) * 2004-05-11 2011-05-10 Symantec Corporation Method and apparatus for detecting spam in email messages and email attachments
US20110200044A1 (en) * 2002-11-18 2011-08-18 Fortinet, Inc. Hardware-accelerated packet multicasting in a virtual routing system
US8010609B2 (en) 2005-06-20 2011-08-30 Symantec Corporation Method and apparatus for maintaining reputation lists of IP addresses to detect email spam
US8028335B2 (en) 2006-06-19 2011-09-27 Microsoft Corporation Protected environments for protecting users against undesirable activities
US8065370B2 (en) 2005-11-03 2011-11-22 Microsoft Corporation Proofs to filter spam
US8069233B2 (en) 2000-09-13 2011-11-29 Fortinet, Inc. Switch management system and method
US8135778B1 (en) 2005-04-27 2012-03-13 Symantec Corporation Method and apparatus for certifying mass emailings
US8145710B2 (en) 2003-06-18 2012-03-27 Symantec Corporation System and method for filtering spam messages utilizing URL filtering module
US8224905B2 (en) 2006-12-06 2012-07-17 Microsoft Corporation Spam filtration utilizing sender activity data
US8271588B1 (en) 2003-09-24 2012-09-18 Symantec Corporation System and method for filtering fraudulent email messages
US8306040B2 (en) 2002-06-04 2012-11-06 Fortinet, Inc. Network packet steering via configurable association of processing resources and network interfaces
US8316094B1 (en) * 2010-01-21 2012-11-20 Symantec Corporation Systems and methods for identifying spam mailing lists
WO2012162676A2 (en) 2011-05-25 2012-11-29 Microsoft Corporation Dynamic rule reordering for message classification
US8495144B1 (en) * 2004-10-06 2013-07-23 Trend Micro Incorporated Techniques for identifying spam e-mail
US8700913B1 (en) 2011-09-23 2014-04-15 Trend Micro Incorporated Detection of fake antivirus in computers
US8848718B2 (en) 2002-06-04 2014-09-30 Google Inc. Hierarchical metering in a virtual router-based network switch
US8925087B1 (en) * 2009-06-19 2014-12-30 Trend Micro Incorporated Apparatus and methods for in-the-cloud identification of spam and/or malware
US8954458B2 (en) 2011-07-11 2015-02-10 Aol Inc. Systems and methods for providing a content item database and identifying content items
US9338132B2 (en) 2009-05-28 2016-05-10 International Business Machines Corporation Providing notification of spam avatars
US9407463B2 (en) * 2011-07-11 2016-08-02 Aol Inc. Systems and methods for providing a spam database and identifying spam communications

Families Citing this family (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8056131B2 (en) * 2001-06-21 2011-11-08 Cybersoft, Inc. Apparatus, methods and articles of manufacture for intercepting, examining and controlling code, data and files and their transfer
JP4754216B2 (en) 2002-09-04 2011-08-24 ノバルティス アーゲー The treatment of neurological diseases by dsRNA administration
US7984175B2 (en) 2003-12-10 2011-07-19 Mcafee, Inc. Method and apparatus for data capture and analysis system
US20050131876A1 (en) * 2003-12-10 2005-06-16 Ahuja Ratinder Paul S. Graphical user interface for capture system
US7814327B2 (en) 2003-12-10 2010-10-12 Mcafee, Inc. Document registration
US7774604B2 (en) * 2003-12-10 2010-08-10 Mcafee, Inc. Verifying captured objects before presentation
US8548170B2 (en) 2003-12-10 2013-10-01 Mcafee, Inc. Document de-registration
US8656039B2 (en) 2003-12-10 2014-02-18 Mcafee, Inc. Rule parser
US7899828B2 (en) 2003-12-10 2011-03-01 Mcafee, Inc. Tag data structure for maintaining relational data over captured objects
US8301702B2 (en) * 2004-01-20 2012-10-30 Cloudmark, Inc. Method and an apparatus to screen electronic communications
US7930540B2 (en) * 2004-01-22 2011-04-19 Mcafee, Inc. Cryptographic policy enforcement
US20050204005A1 (en) * 2004-03-12 2005-09-15 Purcell Sean E. Selective treatment of messages based on junk rating
US8171549B2 (en) * 2004-04-26 2012-05-01 Cybersoft, Inc. Apparatus, methods and articles of manufacture for intercepting, examining and controlling code, data, files and their transfer
US7434058B2 (en) * 2004-06-07 2008-10-07 Reconnex Corporation Generating signatures over a document
US7962591B2 (en) 2004-06-23 2011-06-14 Mcafee, Inc. Object classification in a capture system
US9154511B1 (en) 2004-07-13 2015-10-06 Dell Software Inc. Time zero detection of infectious messages
US7343624B1 (en) * 2004-07-13 2008-03-11 Sonicwall, Inc. Managing infectious messages as identified by an attachment
US7660865B2 (en) * 2004-08-12 2010-02-09 Microsoft Corporation Spam filtering with probabilistic secure hashes
US8560534B2 (en) 2004-08-23 2013-10-15 Mcafee, Inc. Database for a capture system
US7949849B2 (en) * 2004-08-24 2011-05-24 Mcafee, Inc. File system for a capture system
US8725705B2 (en) * 2004-09-15 2014-05-13 International Business Machines Corporation Systems and methods for searching of storage data with reduced bandwidth requirements
US7523098B2 (en) 2004-09-15 2009-04-21 International Business Machines Corporation Systems and methods for efficient data searching, storage and reduction
US8396897B2 (en) * 2004-11-22 2013-03-12 International Business Machines Corporation Method, system, and computer program product for threading documents using body text analysis
US7596700B2 (en) * 2004-12-22 2009-09-29 Storage Technology Corporation Method and system for establishing trusting environment for sharing data between mutually mistrusting entities
CA2493442C (en) * 2005-01-20 2014-12-16 Certicom Corp. Method and system of managing and filtering electronic messages using cryptographic techniques
WO2006108989A3 (en) * 2005-04-13 2007-02-15 France Telecom Method for controlling the sending of unsolicited voice information
GB0508296D0 (en) * 2005-04-25 2005-06-01 Messagelabs Ltd A method and system to improve speed of detection of related spam emails
US7516130B2 (en) * 2005-05-09 2009-04-07 Trend Micro, Inc. Matching engine with signature generation
JP4559295B2 (en) * 2005-05-17 2010-10-06 株式会社エヌ・ティ・ティ・ドコモ Data communication system and data communication method
JP5437627B2 (en) * 2005-05-26 2014-03-12 エックスコネクト グローバル ネットワークス リミティド Detection of spit in Voip call
US7907608B2 (en) 2005-08-12 2011-03-15 Mcafee, Inc. High speed packet capture
US7818326B2 (en) * 2005-08-31 2010-10-19 Mcafee, Inc. System and method for word indexing in a capture system and querying thereof
US7730011B1 (en) * 2005-10-19 2010-06-01 Mcafee, Inc. Attributes of captured objects in a capture system
US7657104B2 (en) * 2005-11-21 2010-02-02 Mcafee, Inc. Identifying image type in a capture system
CN1987909B (en) 2005-12-22 2012-08-15 腾讯科技(深圳)有限公司 Method, System and device for purifying Bayes spam
CN100556039C (en) 2006-01-13 2009-10-28 腾讯科技(深圳)有限公司 Method and system for removing misdicision of garbage E-mail
US7748022B1 (en) * 2006-02-21 2010-06-29 L-3 Communications Sonoma Eo, Inc. Real-time data characterization with token generation for fast data retrieval
US7627641B2 (en) * 2006-03-09 2009-12-01 Watchguard Technologies, Inc. Method and system for recognizing desired email
JP2007257308A (en) * 2006-03-23 2007-10-04 Canon Inc Document management device, document management system, control method, program and storage medium
US8504537B2 (en) 2006-03-24 2013-08-06 Mcafee, Inc. Signature distribution in a document registration system
US7958227B2 (en) 2006-05-22 2011-06-07 Mcafee, Inc. Attributes of captured objects in a capture system
US7689614B2 (en) 2006-05-22 2010-03-30 Mcafee, Inc. Query generation for a capture system
US8010689B2 (en) 2006-05-22 2011-08-30 Mcafee, Inc. Locational tagging in a capture system
KR100809416B1 (en) * 2006-07-28 2008-03-05 한국전자통신연구원 Apparatus and method of automatically generating signatures at network security systems
US7730316B1 (en) * 2006-09-22 2010-06-01 Fatlens, Inc. Method for document fingerprinting
US20090300012A1 (en) * 2008-05-28 2009-12-03 Barracuda Inc. Multilevel intent analysis method for email filtration
CN101594312B (en) 2008-05-30 2012-12-26 电子科技大学 Method for recognizing junk mail based on artificial immunity and behavior characteristics
US8205242B2 (en) 2008-07-10 2012-06-19 Mcafee, Inc. System and method for data mining and security policy management
US9253154B2 (en) 2008-08-12 2016-02-02 Mcafee, Inc. Configuration management for a capture/registration system
US8850591B2 (en) 2009-01-13 2014-09-30 Mcafee, Inc. System and method for concept building
US8706709B2 (en) 2009-01-15 2014-04-22 Mcafee, Inc. System and method for intelligent term grouping
US8473442B1 (en) 2009-02-25 2013-06-25 Mcafee, Inc. System and method for intelligent state management
US8447722B1 (en) 2009-03-25 2013-05-21 Mcafee, Inc. System and method for data mining and security policy management
US8667121B2 (en) 2009-03-25 2014-03-04 Mcafee, Inc. System and method for managing data and policies
KR20100107801A (en) * 2009-03-26 2010-10-06 삼성전자주식회사 Apparatus and method for antenna selection in wireless communication system
US20110015939A1 (en) * 2009-07-17 2011-01-20 Marcos Lara Gonzalez Systems and methods to create log entries and share a patient log using open-ended electronic messaging and artificial intelligence
US8458268B1 (en) * 2010-02-22 2013-06-04 Symantec Corporation Systems and methods for distributing spam signatures
US8806615B2 (en) 2010-11-04 2014-08-12 Mcafee, Inc. System and method for protecting specified data combinations
US9450781B2 (en) * 2010-12-09 2016-09-20 Alcatel Lucent Spam reporting and management in a communication network
US9384471B2 (en) * 2011-02-22 2016-07-05 Alcatel Lucent Spam reporting and management in a communication network
CN102655480B (en) * 2011-03-03 2015-12-02 腾讯科技(深圳)有限公司 Similar mail processing systems and methods
US8819156B2 (en) 2011-03-11 2014-08-26 James Robert Miner Systems and methods for message collection
US9419928B2 (en) 2011-03-11 2016-08-16 James Robert Miner Systems and methods for message collection
US9559868B2 (en) 2011-04-01 2017-01-31 Onavo Mobile Ltd. Apparatus and methods for bandwidth saving and on-demand data delivery for a mobile device
US8612436B1 (en) 2011-09-27 2013-12-17 Google Inc. Reverse engineering circumvention of spam detection algorithms
US20130246336A1 (en) 2011-12-27 2013-09-19 Mcafee, Inc. System and method for providing data protection workflows in a network environment
US8935783B2 (en) * 2013-03-08 2015-01-13 Bitdefender IPR Management Ltd. Document classification using multiscale text fingerprints
US20140334739A1 (en) * 2013-05-08 2014-11-13 Xyratex Technology Limited Methods of clustering computational event logs
RU2013144681A (en) 2013-10-03 2015-04-10 Общество С Ограниченной Ответственностью "Яндекс" E-mail processing system to determine its classification
US20150295869A1 (en) * 2014-04-14 2015-10-15 Microsoft Corporation Filtering Electronic Messages
US9928465B2 (en) * 2014-05-20 2018-03-27 Oath Inc. Machine learning and validation of account names, addresses, and/or identifiers
WO2015196410A1 (en) * 2014-06-26 2015-12-30 Google Inc. Optimized browser render process
KR101768181B1 (en) 2014-06-26 2017-08-16 구글 인코포레이티드 Optimized browser rendering process
EP3161668A4 (en) 2014-06-26 2018-03-28 Google Llc Batch-optimized render and fetch architecture
WO2017116741A1 (en) * 2015-12-31 2017-07-06 Taser International, Inc. Systems and methods for filtering messages

Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5377354A (en) * 1989-08-15 1994-12-27 Digital Equipment Corporation Method and system for sorting and prioritizing electronic mail messages
US5909677A (en) * 1996-06-18 1999-06-01 Digital Equipment Corporation Method for determining the resemblance of documents
US6023723A (en) * 1997-12-22 2000-02-08 Accepted Marketing, Inc. Method and system for filtering unwanted junk e-mail utilizing a plurality of filtering mechanisms
US6119124A (en) * 1998-03-26 2000-09-12 Digital Equipment Corporation Method for clustering closely resembling data objects
US6161130A (en) * 1998-06-23 2000-12-12 Microsoft Corporation Technique which utilizes a probabilistic classifier to detect "junk" e-mail by automatically updating a training and re-training the classifier based on the updated training set
US6192360B1 (en) * 1998-06-23 2001-02-20 Microsoft Corporation Methods and apparatus for classifying text and for building a text classifier
US6199103B1 (en) * 1997-06-24 2001-03-06 Omron Corporation Electronic mail determination method and system and storage medium
US6421709B1 (en) * 1997-12-22 2002-07-16 Accepted Marketing, Inc. E-mail filter and method thereof
US20020116463A1 (en) * 2001-02-20 2002-08-22 Hart Matthew Thomas Unwanted e-mail filtering
US6453327B1 (en) * 1996-06-10 2002-09-17 Sun Microsystems, Inc. Method and apparatus for identifying and discarding junk electronic mail
US6460050B1 (en) * 1999-12-22 2002-10-01 Mark Raymond Pace Distributed content identification system
US20020198950A1 (en) * 1997-11-25 2002-12-26 Leeds Robert G. Junk electronic mail detector and eliminator
US20030195937A1 (en) * 2002-04-16 2003-10-16 Kontact Software Inc. Intelligent message screening
US20040003283A1 (en) * 2002-06-26 2004-01-01 Goodman Joshua Theodore Spam detector with challenges
US20040044791A1 (en) * 2001-05-22 2004-03-04 Pouzzner Daniel G. Internationalized domain name system with iterative conversion
US20040083270A1 (en) * 2002-10-23 2004-04-29 David Heckerman Method and system for identifying junk e-mail
US6732157B1 (en) * 2002-12-13 2004-05-04 Networks Associates Technology, Inc. Comprehensive anti-spam system, method, and computer program product for filtering unwanted e-mail messages
US20040177110A1 (en) * 2003-03-03 2004-09-09 Rounthwaite Robert L. Feedback loop for spam prevention
US20040204988A1 (en) * 2001-11-16 2004-10-14 Willers Howard Francis Interactively communicating selectively targeted information with consumers over the internet
US20040210640A1 (en) * 2003-04-17 2004-10-21 Chadwick Michael Christopher Mail server probability spam filter
US20040221062A1 (en) * 2003-05-02 2004-11-04 Starbuck Bryan T. Message rendering for identification of content features
US6931433B1 (en) * 2000-08-24 2005-08-16 Yahoo! Inc. Processing of unsolicited bulk electronic communication
US20060168006A1 (en) * 2003-03-24 2006-07-27 Mr. Marvin Shannon System and method for the classification of electronic communication
US7275089B1 (en) * 2001-03-15 2007-09-25 Aws Convergence Technologies, Inc. System and method for streaming of dynamic weather content to the desktop

Family Cites Families (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0240649A (en) 1988-07-30 1990-02-09 Konica Corp Silver halide color photographic sensitive material
CA1321656C (en) 1988-12-22 1993-08-24 International Business Machines Corporation Method for restricting delivery and receipt of electronic message
JPH03117940A (en) 1989-09-25 1991-05-20 Internatl Business Mach Corp <Ibm> Management method for electronic mail
US5822527A (en) * 1990-05-04 1998-10-13 Digital Equipment Corporation Method and apparatus for information stream filtration using tagged information access and action registration
GB2271002B (en) 1992-09-26 1995-12-06 Digital Equipment Int Data processing system
US5634005A (en) * 1992-11-09 1997-05-27 Kabushiki Kaisha Toshiba System for automatically sending mail message by storing rule according to the language specification of the message including processing condition and processing content
US5917615A (en) * 1993-06-07 1999-06-29 Microsoft Corporation System and method for facsimile load balancing
JP2837815B2 (en) * 1994-02-03 1998-12-16 インターナショナル・ビジネス・マシーンズ・コーポレイション Interactive rule-based computer system
US5758257A (en) 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US5619648A (en) * 1994-11-30 1997-04-08 Lucent Technologies Inc. Message filtering techniques
US5675507A (en) * 1995-04-28 1997-10-07 Bobo, Ii; Charles R. Message storage and delivery system
JP3998710B2 (en) 1995-05-08 2007-10-31 クランベリー、プロパティーズ、リミテッド、ライアビリティー、カンパニー Rule-compliant electronic message management device
US5678041A (en) * 1995-06-06 1997-10-14 At&T System and method for restricting user access rights on the internet based on rating information stored in a relational database
US5696898A (en) * 1995-06-06 1997-12-09 Lucent Technologies Inc. System and method for database access control
US5845263A (en) * 1995-06-16 1998-12-01 High Technology Solutions, Inc. Interactive visual ordering system
US5826269A (en) * 1995-06-21 1998-10-20 Microsoft Corporation Electronic mail interface for a network server
US5889943A (en) * 1995-09-26 1999-03-30 Trend Micro Incorporated Apparatus and method for electronic mail virus detection and elimination
US5862325A (en) 1996-02-29 1999-01-19 Intermind Corporation Computer-based communication system and method using metadata defining a control structure
US5826022A (en) * 1996-04-05 1998-10-20 Sun Microsystems, Inc. Method and apparatus for receiving electronic mail
US5870548A (en) * 1996-04-05 1999-02-09 Sun Microsystems, Inc. Method and apparatus for altering sent electronic mail messages
US5809242A (en) * 1996-04-19 1998-09-15 Juno Online Services, L.P. Electronic mail system for displaying advertisement at local computer received from remote system while the local computer is off-line the remote system
US5884033A (en) * 1996-05-15 1999-03-16 Spyglass, Inc. Internet filtering system for filtering data transferred over the internet utilizing immediate and deferred filtering actions
US5864684A (en) * 1996-05-22 1999-01-26 Sun Microsystems, Inc. Method and apparatus for managing subscriptions to distribution lists
WO1997046962A1 (en) * 1996-06-07 1997-12-11 At & T Corp. Finding an e-mail message to which another e-mail message is a response
US5926812A (en) * 1996-06-20 1999-07-20 Mantra Technologies, Inc. Document extraction and comparison method with applications to automatic personalized database searching
US5790789A (en) * 1996-08-02 1998-08-04 Suarez; Larry Method and architecture for the creation, control and deployment of services within a distributed computer environment
US5978837A (en) * 1996-09-27 1999-11-02 At&T Corp. Intelligent pager for remotely managing E-Mail messages
US5930479A (en) * 1996-10-21 1999-07-27 At&T Corp Communications addressing system
US5796948A (en) * 1996-11-12 1998-08-18 Cohen; Elliot D. Offensive message interceptor for computers
JPH10240649A (en) 1996-12-27 1998-09-11 Canon Inc Device and system for processing electronic mail
US6146026A (en) * 1996-12-27 2000-11-14 Canon Kabushiki Kaisha System and apparatus for selectively publishing electronic-mail
US5995597A (en) * 1997-01-21 1999-11-30 Woltz; Robert Thomas E-mail processing system and method
CA2282502A1 (en) 1997-02-25 1998-08-27 Intervoice Limited Partnership E-mail server for message filtering and routing
US6189026B1 (en) * 1997-06-16 2001-02-13 Digital Equipment Corporation Technique for dynamically generating an address book in a distributed electronic mail system
US6023700A (en) * 1997-06-17 2000-02-08 Cranberry Properties, Llc Electronic mail distribution system for integrated electronic communication
JP3148152B2 (en) * 1997-06-27 2001-03-19 日本電気株式会社 Broadcast mail delivery method using the e-mail system
US7117358B2 (en) * 1997-07-24 2006-10-03 Tumbleweed Communications Corp. Method and system for filtering communication
US6073165A (en) * 1997-07-29 2000-06-06 Jfax Communications, Inc. Filtering computer network messages directed to a user's e-mail box based on user defined filters, and forwarding a filtered message to the user's receiver
US5999967A (en) * 1997-08-17 1999-12-07 Sundsted; Todd Electronic mail filtering by electronic stamp
US6199102B1 (en) * 1997-08-26 2001-03-06 Christopher Alan Cobb Method and system for filtering electronic messages
JP3439330B2 (en) * 1997-09-25 2003-08-25 日本電気株式会社 E-mail server
US6195686B1 (en) * 1997-09-29 2001-02-27 Ericsson Inc. Messaging application having a plurality of interfacing capabilities
US6381592B1 (en) * 1997-12-03 2002-04-30 Stephen Michael Reuning Candidate chaser
US6052709A (en) * 1997-12-23 2000-04-18 Bright Light Technologies, Inc. Apparatus and method for controlling delivery of unsolicited electronic mail
US5999932A (en) * 1998-01-13 1999-12-07 Bright Light Technologies, Inc. System and method for filtering unsolicited electronic mail messages using data matching and heuristic processing
US5968117A (en) * 1998-01-20 1999-10-19 Aurora Communications Exchange Ltd. Device and system to facilitate accessing electronic mail from remote user-interface devices
US6157630A (en) * 1998-01-26 2000-12-05 Motorola, Inc. Communications system with radio device and server
US6314454B1 (en) * 1998-07-01 2001-11-06 Sony Corporation Method and apparatus for certified electronic mail messages
US6226630B1 (en) * 1998-07-22 2001-05-01 Compaq Computer Corporation Method and apparatus for filtering incoming information using a search engine and stored queries defining user folders
US6275850B1 (en) * 1998-07-24 2001-08-14 Siemens Information And Communication Networks, Inc. Method and system for management of message attachments
US6112227A (en) * 1998-08-06 2000-08-29 Heiner; Jeffrey Nelson Filter-in method for reducing junk e-mail
US6654787B1 (en) * 1998-12-31 2003-11-25 Brightmail, Incorporated Method and apparatus for filtering e-mail
US6732149B1 (en) * 1999-04-09 2004-05-04 International Business Machines Corporation System and method for hindering undesired transmission or receipt of electronic messages
US6804667B1 (en) * 1999-11-30 2004-10-12 Ncr Corporation Filter for checking for duplicate entries in database
US20040073617A1 (en) * 2000-06-19 2004-04-15 Milliken Walter Clark Hash-based systems and methods for detecting and preventing transmission of unwanted e-mail
US7076527B2 (en) * 2001-06-14 2006-07-11 Apple Computer, Inc. Method and apparatus for filtering email
US7080123B2 (en) * 2001-09-20 2006-07-18 Sun Microsystems, Inc. System and method for preventing unnecessary message duplication in electronic mail
US8266215B2 (en) * 2003-02-20 2012-09-11 Sonicwall, Inc. Using distinguishing properties to classify messages
US20050108340A1 (en) * 2003-05-15 2005-05-19 Matt Gleeson Method and apparatus for filtering email spam based on similarity measures
US8145710B2 (en) * 2003-06-18 2012-03-27 Symantec Corporation System and method for filtering spam messages utilizing URL filtering module
US7941490B1 (en) * 2004-05-11 2011-05-10 Symantec Corporation Method and apparatus for detecting spam in email messages and email attachments
JP2006293573A (en) * 2005-04-08 2006-10-26 Yaskawa Electric Corp Electronic mail processor, electronic mail filtering method and electronic mail filtering program
US8010609B2 (en) * 2005-06-20 2011-08-30 Symantec Corporation Method and apparatus for maintaining reputation lists of IP addresses to detect email spam
US7739337B1 (en) * 2005-06-20 2010-06-15 Symantec Corporation Method and apparatus for grouping spam email messages

Patent Citations (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5377354A (en) * 1989-08-15 1994-12-27 Digital Equipment Corporation Method and system for sorting and prioritizing electronic mail messages
US6453327B1 (en) * 1996-06-10 2002-09-17 Sun Microsystems, Inc. Method and apparatus for identifying and discarding junk electronic mail
US5909677A (en) * 1996-06-18 1999-06-01 Digital Equipment Corporation Method for determining the resemblance of documents
US6199103B1 (en) * 1997-06-24 2001-03-06 Omron Corporation Electronic mail determination method and system and storage medium
US20020198950A1 (en) * 1997-11-25 2002-12-26 Leeds Robert G. Junk electronic mail detector and eliminator
US6023723A (en) * 1997-12-22 2000-02-08 Accepted Marketing, Inc. Method and system for filtering unwanted junk e-mail utilizing a plurality of filtering mechanisms
US6421709B1 (en) * 1997-12-22 2002-07-16 Accepted Marketing, Inc. E-mail filter and method thereof
US6349296B1 (en) * 1998-03-26 2002-02-19 Altavista Company Method for clustering closely resembling data objects
US6119124A (en) * 1998-03-26 2000-09-12 Digital Equipment Corporation Method for clustering closely resembling data objects
US6192360B1 (en) * 1998-06-23 2001-02-20 Microsoft Corporation Methods and apparatus for classifying text and for building a text classifier
US6161130A (en) * 1998-06-23 2000-12-12 Microsoft Corporation Technique which utilizes a probabilistic classifier to detect "junk" e-mail by automatically updating a training and re-training the classifier based on the updated training set
US6460050B1 (en) * 1999-12-22 2002-10-01 Mark Raymond Pace Distributed content identification system
US6931433B1 (en) * 2000-08-24 2005-08-16 Yahoo! Inc. Processing of unsolicited bulk electronic communication
US20020116463A1 (en) * 2001-02-20 2002-08-22 Hart Matthew Thomas Unwanted e-mail filtering
US7275089B1 (en) * 2001-03-15 2007-09-25 Aws Convergence Technologies, Inc. System and method for streaming of dynamic weather content to the desktop
US20040044791A1 (en) * 2001-05-22 2004-03-04 Pouzzner Daniel G. Internationalized domain name system with iterative conversion
US20040204988A1 (en) * 2001-11-16 2004-10-14 Willers Howard Francis Interactively communicating selectively targeted information with consumers over the internet
US20030195937A1 (en) * 2002-04-16 2003-10-16 Kontact Software Inc. Intelligent message screening
US20040003283A1 (en) * 2002-06-26 2004-01-01 Goodman Joshua Theodore Spam detector with challenges
US20040083270A1 (en) * 2002-10-23 2004-04-29 David Heckerman Method and system for identifying junk e-mail
US6732157B1 (en) * 2002-12-13 2004-05-04 Networks Associates Technology, Inc. Comprehensive anti-spam system, method, and computer program product for filtering unwanted e-mail messages
US20040177110A1 (en) * 2003-03-03 2004-09-09 Rounthwaite Robert L. Feedback loop for spam prevention
US20060168006A1 (en) * 2003-03-24 2006-07-27 Mr. Marvin Shannon System and method for the classification of electronic communication
US20040210640A1 (en) * 2003-04-17 2004-10-21 Chadwick Michael Christopher Mail server probability spam filter
US20040221062A1 (en) * 2003-05-02 2004-11-04 Starbuck Bryan T. Message rendering for identification of content features

Cited By (124)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7912936B2 (en) 2000-09-13 2011-03-22 Nara Rajagopalan Managing interworking communications protocols
US20080259934A1 (en) * 2000-09-13 2008-10-23 Fortinet, Inc. Distributed virtual system to support managed, network-based services
US8320279B2 (en) 2000-09-13 2012-11-27 Fortinet, Inc. Managing and provisioning virtual routers
US7818452B2 (en) 2000-09-13 2010-10-19 Fortinet, Inc. Distributed virtual system to support managed, network-based services
US7885207B2 (en) 2000-09-13 2011-02-08 Fortinet, Inc. Managing and provisioning virtual routers
US20110032942A1 (en) * 2000-09-13 2011-02-10 Fortinet, Inc. Fast path complex flow processing
US8069233B2 (en) 2000-09-13 2011-11-29 Fortinet, Inc. Switch management system and method
US20110176552A1 (en) * 2000-09-13 2011-07-21 Fortinet, Inc. Managing interworking communications protocols
US20110128891A1 (en) * 2000-09-13 2011-06-02 Fortinet, Inc. Managing and provisioning virtual routers
US20090046728A1 (en) * 2000-09-13 2009-02-19 Fortinet, Inc. System and method for delivering security services
US7890663B2 (en) 2001-06-28 2011-02-15 Fortinet, Inc. Identifying nodes in a ring network
US8208409B2 (en) 2001-06-28 2012-06-26 Fortinet, Inc. Identifying nodes in a ring network
US20100189016A1 (en) * 2001-06-28 2010-07-29 Fortinet, Inc. Identifying nodes in a ring network
US20070064704A1 (en) * 2002-06-04 2007-03-22 Fortinet, Inc. Methods and systems for a distributed provider edge
US8064462B2 (en) 2002-06-04 2011-11-22 Fortinet, Inc. Service processing switch
US8085776B2 (en) 2002-06-04 2011-12-27 Fortinet, Inc. Methods and systems for a distributed provider edge
US8111690B2 (en) 2002-06-04 2012-02-07 Google Inc. Routing traffic through a virtual router-based network switch
US8848718B2 (en) 2002-06-04 2014-09-30 Google Inc. Hierarchical metering in a virtual router-based network switch
US20100220732A1 (en) * 2002-06-04 2010-09-02 Fortinet, Inc. Service processing switch
US7720053B2 (en) 2002-06-04 2010-05-18 Fortinet, Inc. Service processing switch
US20090073977A1 (en) * 2002-06-04 2009-03-19 Fortinet, Inc. Routing traffic through a virtual router-based network switch
US8306040B2 (en) 2002-06-04 2012-11-06 Fortinet, Inc. Network packet steering via configurable association of processing resources and network interfaces
US20040003283A1 (en) * 2002-06-26 2004-01-01 Goodman Joshua Theodore Spam detector with challenges
US8046832B2 (en) 2002-06-26 2011-10-25 Microsoft Corporation Spam detector with challenges
US8819486B2 (en) 2002-08-29 2014-08-26 Google Inc. Fault tolerant routing in a non-hot-standby configuration of a network routing system
US7761743B2 (en) 2002-08-29 2010-07-20 Fortinet, Inc. Fault tolerant routing in a non-hot-standby configuration of a network routing system
US8412982B2 (en) 2002-08-29 2013-04-02 Google Inc. Fault tolerant routing in a non-hot-standby configuration of a network routing system
US20110200044A1 (en) * 2002-11-18 2011-08-18 Fortinet, Inc. Hardware-accelerated packet multicasting in a virtual routing system
US8644311B2 (en) 2002-11-18 2014-02-04 Fortinet, Inc. Hardware-accelerated packet multicasting in a virtual routing system
US20040123157A1 (en) * 2002-12-13 2004-06-24 Wholesecurity, Inc. Method, system, and computer program product for security within a global computer network
US7624110B2 (en) 2002-12-13 2009-11-24 Symantec Corporation Method, system, and computer program product for security within a global computer network
US8402102B2 (en) 2003-05-15 2013-03-19 Symantec Corporation Method and apparatus for filtering email spam using email noise reduction
US7831667B2 (en) 2003-05-15 2010-11-09 Symantec Corporation Method and apparatus for filtering email spam using email noise reduction
US20110055343A1 (en) * 2003-05-15 2011-03-03 Symantec Corporation Method and apparatus for filtering email spam using email noise reduction
US20050108339A1 (en) * 2003-05-15 2005-05-19 Matt Gleeson Method and apparatus for filtering email spam using email noise reduction
US8145710B2 (en) 2003-06-18 2012-03-27 Symantec Corporation System and method for filtering spam messages utilizing URL filtering module
US7711779B2 (en) 2003-06-20 2010-05-04 Microsoft Corporation Prevention of outgoing spam
US8191148B2 (en) * 2003-09-08 2012-05-29 Sonicwall, Inc. Classifying a message based on fraud indicators
US8984289B2 (en) 2003-09-08 2015-03-17 Sonicwall, Inc. Classifying a message based on fraud indicators
US20100095378A1 (en) * 2003-09-08 2010-04-15 Jonathan Oliver Classifying a Message Based on Fraud Indicators
US8661545B2 (en) 2003-09-08 2014-02-25 Sonicwall, Inc. Classifying a message based on fraud indicators
US8271588B1 (en) 2003-09-24 2012-09-18 Symantec Corporation System and method for filtering fraudulent email messages
US20050188032A1 (en) * 2004-01-14 2005-08-25 Katsuyuki Yamazaki Mass mail detection system and mail server
US7853654B2 (en) * 2004-01-14 2010-12-14 Kddi Corporation Mass mail detection system and mail server
US9454672B2 (en) 2004-01-27 2016-09-27 Dell Software Inc. Message distribution control
US8713110B2 (en) 2004-01-27 2014-04-29 Sonicwall, Inc. Identification of protected content in e-mail messages
US20080104712A1 (en) * 2004-01-27 2008-05-01 Mailfrontier, Inc. Message Distribution Control
US8886727B1 (en) 2004-01-27 2014-11-11 Sonicwall, Inc. Message distribution control
US20080104062A1 (en) * 2004-02-09 2008-05-01 Mailfrontier, Inc. Approximate Matching of Strings for Message Filtering
US9471712B2 (en) * 2004-02-09 2016-10-18 Dell Software Inc. Approximate matching of strings for message filtering
US7941490B1 (en) * 2004-05-11 2011-05-10 Symantec Corporation Method and apparatus for detecting spam in email messages and email attachments
US20060031307A1 (en) * 2004-05-18 2006-02-09 Rishi Bhatia System and method for filtering network messages
US7912905B2 (en) * 2004-05-18 2011-03-22 Computer Associates Think, Inc. System and method for filtering network messages
US20100142527A1 (en) * 2004-09-24 2010-06-10 Fortinet, Inc. Scalable IP-Services Enabled Multicast Forwarding with Efficient Resource Utilization
US8369258B2 (en) 2004-09-24 2013-02-05 Fortinet, Inc. Scalable IP-services enabled multicast forwarding with efficient resource utilization
US8213347B2 (en) 2004-09-24 2012-07-03 Fortinet, Inc. Scalable IP-services enabled multicast forwarding with efficient resource utilization
US7881244B2 (en) 2004-09-24 2011-02-01 Fortinet, Inc. Scalable IP-services enabled multicast forwarding with efficient resource utilization
US20090225754A1 (en) * 2004-09-24 2009-09-10 Fortinet, Inc. Scalable IP-services enabled multicast forwarding with efficient resource utilization
US8495144B1 (en) * 2004-10-06 2013-07-23 Trend Micro Incorporated Techniques for identifying spam e-mail
WO2006052583A3 (en) * 2004-11-03 2007-07-12 Shawn Park Method of detecting, comparing, blocking, and eliminating spam emails
WO2006052583A2 (en) * 2004-11-03 2006-05-18 Shawn Park Method of detecting, comparing, blocking, and eliminating spam emails
US20060095966A1 (en) * 2004-11-03 2006-05-04 Shawn Park Method of detecting, comparing, blocking, and eliminating spam emails
US7869361B2 (en) 2004-11-18 2011-01-11 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US7843813B2 (en) 2004-11-18 2010-11-30 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US20080317231A1 (en) * 2004-11-18 2008-12-25 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US7876683B2 (en) 2004-11-18 2011-01-25 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US20090007228A1 (en) * 2004-11-18 2009-01-01 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US7961615B2 (en) 2004-11-18 2011-06-14 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US20110235548A1 (en) * 2004-11-18 2011-09-29 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US8107376B2 (en) 2004-11-18 2012-01-31 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US20080317040A1 (en) * 2004-11-18 2008-12-25 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US20080320553A1 (en) * 2004-11-18 2008-12-25 Fortinet, Inc. Managing hierarchically organized subscriber profiles
US20060149820A1 (en) * 2005-01-04 2006-07-06 International Business Machines Corporation Detecting spam e-mail using similarity calculations
US8135778B1 (en) 2005-04-27 2012-03-13 Symantec Corporation Method and apparatus for certifying mass emailings
US8010609B2 (en) 2005-06-20 2011-08-30 Symantec Corporation Method and apparatus for maintaining reputation lists of IP addresses to detect email spam
US7739337B1 (en) * 2005-06-20 2010-06-15 Symantec Corporation Method and apparatus for grouping spam email messages
US20070038705A1 (en) * 2005-07-29 2007-02-15 Microsoft Corporation Trees of classifiers for detecting email spam
US7930353B2 (en) * 2005-07-29 2011-04-19 Microsoft Corporation Trees of classifiers for detecting email spam
US8065370B2 (en) 2005-11-03 2011-11-22 Microsoft Corporation Proofs to filter spam
US20070180031A1 (en) * 2006-01-30 2007-08-02 Microsoft Corporation Email Opt-out Enforcement
US7760684B2 (en) 2006-02-13 2010-07-20 Airwide Solutions, Inc. Measuring media distribution and impact in a mobile communication network
US7668920B2 (en) * 2006-03-01 2010-02-23 Fortinet, Inc. Electronic message and data tracking system
US20070208850A1 (en) * 2006-03-01 2007-09-06 Fortinet, Inc. Electronic message and data tracking system
US20110219086A1 (en) * 2006-03-01 2011-09-08 Fortinet, Inc. Electronic message and data tracking system
US8028335B2 (en) 2006-06-19 2011-09-27 Microsoft Corporation Protected environments for protecting users against undesirable activities
US8234291B2 (en) 2006-10-18 2012-07-31 Alibaba Group Holding Limited Method and system for determining junk information
US20100094887A1 (en) * 2006-10-18 2010-04-15 Jingjun Ye Method and System for Determining Junk Information
US9419927B2 (en) 2006-11-14 2016-08-16 Mcafee, Inc. Method and system for handling unwanted email messages
US8577968B2 (en) * 2006-11-14 2013-11-05 Mcafee, Inc. Method and system for handling unwanted email messages
US20080114843A1 (en) * 2006-11-14 2008-05-15 Mcafee, Inc. Method and system for handling unwanted email messages
US8224905B2 (en) 2006-12-06 2012-07-17 Microsoft Corporation Spam filtration utilizing sender activity data
US20100241508A1 (en) * 2007-07-17 2010-09-23 Airwide Solutions Oy Delivery of Advertisements in Mobile Advertising System
US20100185767A1 (en) * 2007-07-17 2010-07-22 Airwide Solutions Oy Content Tracking
WO2009010634A1 (en) * 2007-07-17 2009-01-22 Airwide Solutions Oy Content tracking
US7765204B2 (en) * 2007-09-27 2010-07-27 Microsoft Corporation Method of finding candidate sub-queries from longer queries
US20090089266A1 (en) * 2007-09-27 2009-04-02 Microsoft Corporation Method of finding candidate sub-queries from longer queries
US8275842B2 (en) 2007-09-30 2012-09-25 Symantec Operating Corporation System and method for detecting content similarity within email documents by sparse subset hashing
US20090089383A1 (en) * 2007-09-30 2009-04-02 Tsuen Wan Ngan System and method for detecting content similarity within emails documents employing selective truncation
US20090089384A1 (en) * 2007-09-30 2009-04-02 Tsuen Wan Ngan System and method for detecting content similarity within email documents by sparse subset hashing
US8037145B2 (en) 2007-09-30 2011-10-11 Symantec Operating Corporation System and method for detecting email content containment
US20090089539A1 (en) * 2007-09-30 2009-04-02 Guy Barry Owen Bunker System and method for detecting email content containment
US20090319506A1 (en) * 2008-06-19 2009-12-24 Tsuen Wan Ngan System and method for efficiently finding email similarity in an email repository
US8028031B2 (en) 2008-06-27 2011-09-27 Microsoft Corporation Determining email filtering type based on sender classification
US20090327430A1 (en) * 2008-06-27 2009-12-31 Microsoft Corporation Determining email filtering type based on sender classification
US20100057933A1 (en) * 2008-09-03 2010-03-04 Microsoft Corporation Probabilistic mesh routing
US8473455B2 (en) 2008-09-03 2013-06-25 Microsoft Corporation Query-oriented message characterization
US8898144B2 (en) 2008-09-03 2014-11-25 Microsoft Corporation Query-oriented message characterization
US8099498B2 (en) 2008-09-03 2012-01-17 Microsoft Corporation Probabilistic mesh routing
US20100057707A1 (en) * 2008-09-03 2010-03-04 Microsoft Corporation Query-oriented message characterization
US9704177B2 (en) * 2008-12-23 2017-07-11 International Business Machines Corporation Identifying spam avatars in a virtual universe (VU) based upon turing tests
US20100162404A1 (en) * 2008-12-23 2010-06-24 International Business Machines Corporation Identifying spam avatars in a virtual universe (vu) based upon turing tests
US20100162403A1 (en) * 2008-12-23 2010-06-24 International Business Machines Corporation System and method in a virtual universe for identifying spam avatars based upon avatar multimedia characteristics
US9697535B2 (en) * 2008-12-23 2017-07-04 International Business Machines Corporation System and method in a virtual universe for identifying spam avatars based upon avatar multimedia characteristics
US9338132B2 (en) 2009-05-28 2016-05-10 International Business Machines Corporation Providing notification of spam avatars
US8925087B1 (en) * 2009-06-19 2014-12-30 Trend Micro Incorporated Apparatus and methods for in-the-cloud identification of spam and/or malware
US20110055332A1 (en) * 2009-08-28 2011-03-03 Stein Christopher A Comparing similarity between documents for filtering unwanted documents
US8874663B2 (en) * 2009-08-28 2014-10-28 Facebook, Inc. Comparing similarity between documents for filtering unwanted documents
US8316094B1 (en) * 2010-01-21 2012-11-20 Symantec Corporation Systems and methods for identifying spam mailing lists
EP2715565A4 (en) * 2011-05-25 2015-07-15 Microsoft Technology Licensing Llc Dynamic rule reordering for message classification
WO2012162676A2 (en) 2011-05-25 2012-11-29 Microsoft Corporation Dynamic rule reordering for message classification
US9116879B2 (en) 2011-05-25 2015-08-25 Microsoft Technology Licensing, Llc Dynamic rule reordering for message classification
US9407463B2 (en) * 2011-07-11 2016-08-02 Aol Inc. Systems and methods for providing a spam database and identifying spam communications
US8954458B2 (en) 2011-07-11 2015-02-10 Aol Inc. Systems and methods for providing a content item database and identifying content items
US8700913B1 (en) 2011-09-23 2014-04-15 Trend Micro Incorporated Detection of fake antivirus in computers

Also Published As

Publication number Publication date Type
US7831667B2 (en) 2010-11-09 grant
WO2004105332A9 (en) 2005-12-15 application
US20050132197A1 (en) 2005-06-16 application
US20110055343A1 (en) 2011-03-03 application
US20050108339A1 (en) 2005-05-19 application
US8402102B2 (en) 2013-03-19 grant
JP4598774B2 (en) 2010-12-15 grant
EP1649645A2 (en) 2006-04-26 application
WO2004105332A3 (en) 2005-03-10 application
WO2004105332A2 (en) 2004-12-02 application
JP2007503660A (en) 2007-02-22 application

Similar Documents

Publication Publication Date Title
US6591291B1 (en) System and method for providing anonymous remailing and filtering of electronic mail
US7072944B2 (en) Method and apparatus for authenticating electronic mail
US7133898B1 (en) System and method for sorting e-mail using a vendor registration code and a vendor registration purpose code previously assigned by a recipient
US7580982B2 (en) Email filtering system and method
US20030195937A1 (en) Intelligent message screening
US20060004896A1 (en) Managing unwanted/unsolicited e-mail protection using sender identity
US20070078936A1 (en) Detecting unwanted electronic mail messages based on probabilistic analysis of referenced resources
US20060095524A1 (en) System, method, and computer program product for filtering messages
US20070136806A1 (en) Method and system for blocking phishing scams
US20060149823A1 (en) Electronic mail system and method
US6769016B2 (en) Intelligent SPAM detection system using an updateable neural analysis engine
US7779156B2 (en) Reputation based load balancing
US20050177599A1 (en) System and method for complying with anti-spam rules, laws, and regulations
US20060206713A1 (en) Associating a postmark with a message to indicate trust
US20060179113A1 (en) Network domain reputation-based spam filtering
US7409708B2 (en) Advanced URL and IP features
US20040243844A1 (en) Authorized email control system
US20030200334A1 (en) Method and system for controlling the use of addresses using address computation techniques
US6356935B1 (en) Apparatus and method for an authenticated electronic userid
US20060085505A1 (en) Validating inbound messages
US20050076241A1 (en) Degrees of separation for handling communications
US6941466B2 (en) Method and apparatus for providing automatic e-mail filtering based on message semantics, sender's e-mail ID, and user's identity
US20090138972A1 (en) Resisting the spread of unwanted code and data
US20070271346A1 (en) Method and System for Filtering Electronic Messages
US20060031313A1 (en) Method and system for segmentation of a message inbox

Legal Events

Date Code Title Description
AS Assignment
Owner name: SYMANTEC CORPORATION, CALIFORNIA
Free format text: MERGER;ASSIGNOR:BRIGHTMAIL, INC.;REEL/FRAME:016329/0690
Effective date: 20040618