WO2017162997A1 - Method of protecting a user from messages with links to malicious websites containing homograph attacks

Publication number
WO2017162997A1
WO2017162997A1 PCT/GB2017/000038 GB2017000038W WO2017162997A1 WO 2017162997 A1 WO2017162997 A1 WO 2017162997A1 GB 2017000038 W GB2017000038 W GB 2017000038W WO 2017162997 A1 WO2017162997 A1 WO 2017162997A1
Authority
WO
WIPO (PCT)
Prior art keywords
internet
domains
threat
domain
user
Prior art date
Application number
PCT/GB2017/000038
Other languages
English (en)
Inventor
Alexander John BARNETT
Samuel PRESLEY
Original Assignee
The Secretary Of State For Defence
Priority date
Filing date
Publication date
Application filed by The Secretary Of State For Defence
Publication of WO2017162997A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1483 Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/07 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail characterised by the inclusion of specific contents
    • H04L51/18 Commands or executable codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/212 Monitoring or handling of messages using filtering or selective blocking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Definitions

  • the present invention relates to protection of computers and users from electronic messages (such as emails etc) containing links to websites which contain malware or which fraudulently purport to be a genuine service in order to collect users' personal or financial information.
  • Spam filters are a well-known approach for protecting users from messages containing links to websites containing malware.
  • Techniques for filtering the messages include comparing them to templates, statistical techniques, and checking for links to websites that are known to contain malware.
  • Some spam filters go further in looking for messages that are unusual as compared to the type of messages which are commonly received.
  • The inventors have identified a problem with known filters, which is that although they can stop a large proportion of untargeted spam, they tend to be ineffective against highly targeted malicious messages (known as "phishing" or "spear phishing", depending on how targeted the attack is) which can cause considerable harm to the user, the organisation, or their computer systems.
  • a particular type of targeted attack of concern to the inventors is the homoglyph attack.
  • An example of a targeted homoglyph phishing attack would be to send a message to a member of staff of an organisation (e.g. a bank) containing a link that appears to be relevant to the staff of that organisation but with a small change that may not be immediately obvious to the recipient.
  • Well known examples are to replace the letter "o" with the number "0", perhaps using a font (in this case "Bell MT") where the number "0" looks more like a conventional "o", or to use two letters that in a particular font seem similar to a single letter (e.g. "rn" looking like "m").
  • Other examples of the technique include www.paypal.com and G00GLE.COM.
  • Cyrillic letters "e", "p" and "o", which look identical to the Latin letters "e", "p" and "o"
  • The inventors have identified a weakness in known message filters, which is that in order to become widely used they need to provide generic protection for all types of users, yet the difficulty with phishing attacks is that they are tailored to be of interest to users within an organisation (or a specific individual), and thus tend to be designed to be very similar to the type of messages that the user(s) would not be surprised to receive.
  • automated methods of finding unusual messages do not tend to provide adequate protection against targeted or highly targeted phishing attacks.
  • The malware-hosting website is often only registered shortly before the malicious email is sent to the victims being targeted, as this ensures that the malware-hosting website will not be a known threat website at the time of performing the attack.
  • apparatus for protecting a user against phishing attacks by identifying threat links in incoming electronic messages, and preventing a user from controlling a computer to follow such links, the apparatus comprising:
  • An internet-domain manager operable to store a list of non-threat internet-domains that are relevant to the user or an organisation of which the user is a member, each internet-domain comprising at least a second level domain of an internet address;
  • a content security manager operable to:
  • a content filter operable to directly or indirectly prevent the user from controlling a computer via a message client to follow such hyperlinks if they contain an internet-domain that has been assessed to be a threat;
  • the assessment further comprises assessing at least one characteristic of internet-domain registration information associated with the internet-domain, against at least one criterion, and basing the assessment at least in part on that output, at least in the case that the output of the digital image similarity assessment algorithm is above a predetermined threshold.
  • a typical characteristic to be used is the date of registration of the internet-domain, or more specifically how recent the registration was.
  • a recent registration may be defined as an indicator that the internet-domain is a threat web address.
  • a historical registration may be an indicator of the internet-domain not being a threat, however other factors may be taken into account. In the absence of the use of other factors this would indicate that the internet-domain is not a threat.
  • the assessment may comprise other checks (either subsequent or in combination), to determine whether the internet-domain is a threat or non-threat.
  • the internet-domain manager is operable to collect a list of internet-domains that are commonly included in a set of electronic messages of the aforesaid organisation, and not collect in that list those internet-domains that are rarely included in the set, where at least one criterion is provided to distinguish common from rare, the collected list of internet-domains being stored as the list of non-threat internet-domains.
  • the criteria could favour collection of internet-domains that occur rarely in electronic messages of the organisation as a whole, but are disproportionately present in messages sent to (and/or from) a minority of the users (i.e. message addresses) of the organisation.
  • the aforesaid collection of a list of internet-domains from the electronic messages of the organisation is performed using messages sent from message addresses controlled by the organisation (i.e. sent by members of the organisation such as the organisation's staff).
  • the internet-domain manager is operable to automatically collect a list of internet-domains that are commonly accessed by users of the organisation via web browsers of computers controlled by the organisation, the collected list of commonly accessed web addresses being stored in the list of non-threat web addresses.
  • Automated collection of commonly included and/or commonly accessed web addresses has the advantage that the user, or a superuser, need not manually collate this list of web addresses which provides a lower barrier to adoption by an organisation. It also typically provides for more reliable inclusion of all commonly referenced web addresses as compared to manual collection.
  • the content filter is operable to directly or indirectly prevent the user from controlling a computer running the message client to follow such hyperlinks. This can be achieved in several ways:
  • Deleting the message; or blocking the message, by not forwarding it to the target user's inbox.
  • the content filter blocks or deletes messages containing threat hyperlinks.
  • the message may be forwarded with a deactivated or deleted link, typically with a warning to the user regarding why the link has been removed or deactivated.
  • the messages are either emails or short message service messages (known as “text messages” or “sms's”) or both, however the method is equally applicable to other electronic message services, including bespoke messenger services, especially those suited to being sent from one organisation to another, and especially those which typically would include links, or which are typically used in conjunction with a message client that supports the use of hyperlinks to control an internet browser.
  • the internet-domain manager is operable to store a list of non-threat internet-domains that are relevant to an organisation of which the user is a member.
  • Application of the invention to an organisation rather than just an individual provides strong protection against some types of phishing attacks because attackers often target at the level of an organisation.
  • the list of non-threat internet-domains may be ones identified specifically with respect to the user. This provides enhanced protection against the most highly targeted phishing attacks.
  • both organisation-level and user-level lists of relevant internet-domains are collected and used.
  • In the step of assessing whether at least one of the suspect internet-domains is a threat, in the event that an internet domain of a hyperlink in a message is assessed to have image similarity to one of the non-threat internet domains, the presence of at least a sub-domain of that non-threat internet domain within the message is determined, and if that sub-domain of the non-threat internet domain is identified within the message this is used to contribute towards a finding that the message is a threat.
  • This feature is advantageous because a phishing attack typically uses the name of the genuine website, often repeatedly, and then includes a link to a similar-looking domain. For example, an email falsely purporting to be from the company Paypal Inc may repeatedly mention the term "Paypal" before including a link to, for example, Paypal.com. Therefore, having identified that the term "paypal" has image similarity to one of the trusted domains or subdomains ("paypal"), the level of confidence that the link is malicious is increased by the determination that the message contains the text "paypal" (irrespective of capitalisation). This may be implemented in different ways; for example, the presence of the text in the message may be used as a final check.
  • the required threshold for image similarity may be varied according to whether the trusted domain text is also in the message (if the text is present then a low threshold is used. If the text is absent then a high threshold is used).
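By way of illustration only, the variable threshold described above might be sketched as follows. The function name and the two threshold values (0.6 and 0.85) are assumptions chosen for the example and are not taken from the specification.

```python
def image_similarity_threshold(trusted_name, message_text, low=0.6, high=0.85):
    """Pick the image-similarity threshold for one trusted name.

    `low` and `high` are hypothetical values. A lower threshold is used when
    the trusted name also appears in the message text (case-insensitive),
    because that presence already raises suspicion.
    """
    present = trusted_name.lower() in message_text.lower()
    return low if present else high
```

A link whose image similarity exceeds the returned value would then be treated as suspect or as a threat, depending on the other checks in use.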
  • checking for the text in the message may be an additional check in combination with other checks, such as checking whether the linked-to internet domain was registered only recently.
  • account is taken of whether any non-threat internet domain to which the hyperlinked internet domain has image similarity is also present in the text of the message.
  • account is taken of how many times it is present, whether it is present as a word rather than a text string inside another word, and/or the number of times it is present relative to the number of words in the message.
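The counts mentioned above (whole-word occurrences, occurrences embedded inside other words, and occurrences relative to message length) could be gathered along the following lines. This is a minimal sketch; the function name, the returned dictionary keys and the use of a simple `\w+` tokeniser are all assumptions.

```python
import re

def mention_stats(name, message_text):
    """Count how a trusted name appears in a message (hypothetical helper)."""
    name = name.lower()
    words = re.findall(r"\w+", message_text.lower())
    # Occurrences as a word in its own right.
    whole = sum(1 for w in words if w == name)
    # Occurrences only as a text string inside a longer word.
    embedded = sum(1 for w in words if name in w and w != name)
    rate = whole / len(words) if words else 0.0
    return {"whole_word": whole, "embedded": embedded, "rate": rate}
```

For instance, the name "ample" appearing only inside "example" would contribute to `embedded` but not to `whole_word`, so it could be disregarded.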
  • a method for protecting a user against phishing attacks by identifying threat links in incoming electronic messages, and preventing a user from controlling a computer to follow such links comprising the steps of:
  • each internet-domain comprising at least a second level domain of an internet address
  • a content filter operable to directly or indirectly prevent the user from controlling a computer via a message client to follow such hyperlinks if they contain an internet-domain that has been assessed to be a threat;
  • a computer program operable to control a computer to perform the method of the second aspect.
  • Such computer program typically is recorded on a physical computer readable medium.
  • Figure 1 is an illustration of a method of preventing a user from controlling a computer to access a malicious link according to the prior art
  • Figure 2 is an illustration of a method of preventing a user from controlling a computer to access a malicious link according to one embodiment of the present invention
  • Figure 3 is a block diagram of a computer apparatus for protecting against phishing attacks according to a first embodiment
  • Figure 4 is a block diagram of a computer apparatus for protecting against phishing attacks according to a second embodiment.
  • a message filter such as that of the prior art operates by receiving messages (step 1), identifying whether any hyperlinks in a received message are on a blacklist (step 2) and if so, blocking the message, or if not then allowing the message and active hyperlink to be passed to and displayed to the user.
  • FIG. 2 illustrates a preferred embodiment of the present invention.
  • a list of non-threat internet domains is stored (step 6).
  • Each internet-domain must include at least a second-level domain (e.g. "paypal" or "hmrc").
  • Each internet-domain will also include the top level domain (e.g. ".com" or ".gov.uk").
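As a rough illustration of separating a link into sub-domains, second-level domain and top level domain, a sketch is given below. The set `MULTI_LABEL_SUFFIXES` is a deliberately tiny, hypothetical stand-in for a complete list of multi-label suffixes such as ".gov.uk", and the function name is likewise an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical sample only; a real system would need a complete suffix list.
MULTI_LABEL_SUFFIXES = {"gov.uk", "co.uk", "ac.uk"}

def split_domain(url):
    """Return (sub_domains, second_level_domain, top_level_domain)."""
    # urlsplit only recognises a host when the input contains "//".
    host = urlsplit(url if "//" in url else "//" + url).hostname or ""
    labels = host.lower().split(".")
    # Treat the last two labels as the suffix if they are in the sample list.
    n = 2 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 1
    if len(labels) <= n:
        return [], "", ".".join(labels)
    return labels[:-n - 1], labels[-n - 1], ".".join(labels[-n:])
```

Only the second and third elements would be stored in the non-threat list under the scheme described above; the path and any file extension are discarded by `urlsplit`.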
  • Any relevant third level domain could be stored too but typically these are not stored, especially not the file extension.
  • the list of non-threat internet domains can optionally be collected automatically by evaluating a set of messages and/or monitoring a flow of messages relevant to the organisation (step 5).
  • Monitoring inbound messages is typically acceptable because targeted phishing attacks will generally only be a very small proportion of an inbound flow of messages.
  • Monitoring outbound messages may however be preferred, as of the small number of targeted phishing attack messages received only a small proportion will be replied to or forwarded by recipients within the organisation.
  • The disadvantage of only monitoring outbound messages is that this may cause some internet domains that are commonly received to be omitted from the non-threat list due to the messages being of a nature which does not require a reply (e.g. emails from "noreply@" addresses).
  • Both inbound and outbound messages should be monitored, and preferably a higher threshold is set for inbound messages as compared to outbound messages (for example internet domains are deemed to be non-threat if they are referenced in over 1 in 100 inbound messages, or in over 1 in 1000 outbound messages).
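The example thresholds above (over 1 in 100 inbound messages, or over 1 in 1000 outbound messages) can be sketched as follows. The representation of each message as a set of the domains it references, and the function name, are assumptions made for the illustration.

```python
from collections import Counter

def harvest_non_threat(inbound, outbound, inbound_rate=1 / 100, outbound_rate=1 / 1000):
    """Collect domains referenced often enough to be deemed non-threat.

    `inbound` and `outbound` are lists of messages; each message is given
    as a collection of the internet domains it references (an assumption).
    """
    def frequent(messages, rate):
        # Count each domain at most once per message.
        counts = Counter(d for msg in messages for d in set(msg))
        total = len(messages)
        return {d for d, c in counts.items() if total and c / total > rate}

    return frequent(inbound, inbound_rate) | frequent(outbound, outbound_rate)
```

Further criteria, such as the consistency of the rate over time or the tendency of recipients to reply, would be applied on top of this simple frequency test.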
  • At least one additional criterion is used, for example the consistency of the rate at which an internet domain is referenced in messages.
  • Each internet domain will occur in messages a varying number of times each day (or each weekday, week, month or other time period) and this variation may follow the 'normal distribution' or may follow a skewed distribution (a criterion would be needed to distinguish normal and skewed distributions, of which many could be chosen).
  • For internet domains which occur in a normal distribution, the threshold could be lower (e.g. 1 in 2000) than for internet domains which occur in a skewed distribution (e.g. 1 in 500).
  • A criterion is defined to distinguish internet domains that tend to be in messages that recipients tend to reply to (and/or forward), and internet domains that tend to be in messages that recipients by comparison do not tend to reply to (and/or forward). For simplicity, this can be based only on replies (or forwards) which include the original message text, however the alternative is also possible.
  • automatic harvesting of non-threat internet domains can be performed initially based on a dataset of historical messages relevant to the organisation, or the automatic harvesting can be performed for an initial period of time on messages as they flow through (in and/or out of) the organisation.
  • While the list of non-threat internet domains can be left static, it is equally possible to continuously monitor electronic messages to keep the non-threat list of internet domains up to date. However, especially in the latter case, it is important to ensure that a sudden flood of phishing attacks will not cause the phishing attack internet domain to be added to the list of non-threat internet domains. This is best achieved by only (or preferentially) adding internet domains to the non-threat list if it is identified that recipients within the organisation have a high tendency to reply to messages with hyperlinks to such internet domains.
  • It is also possible to use recipient-specific lists of non-threat internet domains. This can best be achieved by applying the above methods to identify internet domains which are commonly contained in messages sent to a particular recipient within the organisation. This accounts for the fact that different users are likely to have subscribed to different newsletters or other message services.
  • The use of user specific lists helps to provide stronger protection against phishing messages targeted at specific users.
  • One way to generate a user specific list is to automatically identify web domains that commonly occur in messages sent particularly to that user, either on a historical dataset (e.g. their inbox) or on messages as they pass to the user and/or updated on a continuous basis. Of course care should be taken to protect such lists as they may contain private information.
  • messages are received (or continue to be received) at the organisation (step 7).
  • the messages typically are emails (but may additionally/alternatively be text messages also known as sms's, and additionally/alternatively may be another type of electronic message).
  • If the message contains no links to internet domains, or if all the links are to non-threat internet domains, then the message is not blocked/deleted/disabled.
  • the most efficient option is to first determine that all links in a message are non-threat ones, and to bypass further checks if this is the case.
  • the method is limited to active links in html format.
  • the method also includes identifying any plain text that would link to a website if it were copied and pasted into a browser (i.e. web addresses), treating this text as a hyperlink, and assessing that hyperlink as described above.
  • Such plain text can be identified by the inclusion of a "." separating two sections of characters, the latter of which is a known top level internet domain (e.g. "com").
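A minimal sketch of this plain-text detection is given below. The set `KNOWN_TLDS` is a hypothetical sample rather than a complete list of top level domains, and the regular expression is one possible choice among many.

```python
import re

# Hypothetical sample of known top level domains.
KNOWN_TLDS = {"com", "org", "net", "uk"}

# One or more "label." sections followed by a final alphabetic label.
# \w matches Unicode letters, so look-alike characters are captured too.
CANDIDATE = re.compile(r"\b((?:[\w-]+\.)+)([A-Za-z]{2,})\b")

def find_plain_text_domains(text):
    """Return dotted strings in `text` whose last section is a known TLD."""
    return [
        m.group(0)
        for m in CANDIDATE.finditer(text)
        if m.group(2).lower() in KNOWN_TLDS
    ]
```

Each string returned would then be treated as a hyperlink and assessed in the same way as an active link.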
  • The described system is focussed on protecting against phishing attacks using homoglyph links. Clearly, therefore, it can be implemented in conjunction with other filters, such as a spam filter and/or a content filter, so even if the described system does not block/delete/disable the message, the message might still get blocked/deleted/disabled at some point for other reasons.
  • By contrast, if the message does contain links to internet domains that are not in the non-threat list (step 8), then these need to be checked.
  • An optional first check (step 9) is to check whether there is text similarity between the internet domain of the link in the message, and any of the internet domains in the non-threat list.
  • YAHOO.COM could be confused with YAH00.COM. Therefore, to identify web domains which could be mistaken for non-threat web domains, a variety of text filters and checks could sensibly be implemented.
  • A simple example would be to require at least 50% of the letters of the second level domain to be both in common and in the right order (in this case Y, A and H, making up 60% of the letters, and they are found in the same order in the non-threat internet domain).
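One way to measure "letters in common and in the right order" is the length of the longest common subsequence of the two strings. The sketch below takes that approach as an assumption (the specification does not name a particular algorithm), and the function names are hypothetical. Homoglyph equivalence is not handled here.

```python
def ordered_common_fraction(suspect, trusted):
    """Fraction of the letters of `trusted` found in `suspect` in the same order.

    Computed as the longest common subsequence length divided by the
    length of `trusted`.
    """
    rows, cols = len(suspect), len(trusted)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if suspect[i] == trusted[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    return table[rows][cols] / cols if cols else 0.0

def is_text_similar(suspect, trusted, threshold=0.5):
    """Similar but not identical: identical strings are the genuine domain."""
    return suspect != trusted and ordered_common_fraction(suspect, trusted) >= threshold
```

With the strings "yah00" and "yahoo", three of the five letters (y, a, h) match in order, giving a fraction of 0.6, which meets the 50% requirement.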
  • "In common" means either being the same letter (and therefore being in the same language) or alternatively being one of a number of known homoglyphs in a list of known homoglyphs.
  • the list for example might include:
  • homoglyph includes typographic ligatures (situations where two letters can appear similar to a single letter). So preferably the list of known homoglyphs includes typographic ligatures. These might for example include:
  • Determination of text similarity can be performed in many ways. For example, there are approximately 188 algorithms available on GitHub related to the subject of text similarity. A basic approach would be to count the proportion of letters in common (as a fraction of, for example, the average number of letters in the two text strings). One of the examples listed on GitHub appears to use convolutional neural networks. Another attractive option is to treat the text strings as vectors of letters and measure the cosine of the angle between the two vectors. This conveniently produces a value between 0 and 1. Suitable cosine-similarity text evaluation algorithms are available on the internet, or can readily be written by the person skilled in the art (PSA).
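The cosine option can be illustrated with a simple bag-of-letters vector, where each string is represented by its letter counts. This representation is an assumption; it ignores letter order, so in practice it would probably be combined with an order-aware check.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between the letter-count vectors of two strings."""
    va, vb = Counter(a.lower()), Counter(b.lower())
    # Counter returns 0 for letters missing from vb.
    dot = sum(va[ch] * vb[ch] for ch in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Because letter counts are never negative, the result lies between 0 (no letters shared) and 1 (identical letter counts).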
  • It may also be desirable to vary the threshold according to the length of one of the text strings (e.g. second level internet domains), so with long text strings (10+ characters) a high degree of text similarity would be required (high threshold) but with short text strings (e.g. 3-5 characters) a lower degree of text similarity would be required.
  • This can be expressed as a function so the required threshold may vary smoothly but perhaps non-linearly with the number of characters.
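As one possible form of such a function, a logistic curve gives a threshold that rises smoothly and non-linearly with string length. The choice of curve and every parameter value below (`low`, `high`, `midpoint`, `steepness`) are assumptions for illustration; suitable values would be found by trial and error.

```python
import math

def similarity_threshold(length, low=0.5, high=0.9, midpoint=7.0, steepness=1.0):
    """Required text similarity for a string of `length` characters.

    Approaches `low` for short strings and `high` for long strings,
    passing through their average at `midpoint` characters.
    """
    return low + (high - low) / (1.0 + math.exp(-steepness * (length - midpoint)))
```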
  • Suitable thresholds or functions can be established by the PSA readily through trial and error. With all of these approaches it is desirable to treat known homoglyphs as equivalent for purposes of text similarity, but it is very important that they are not treated as equivalent for purposes of checking whether the text strings (internet domains or sub domains thereof) are identical.
  • Unicode (preferably the most up to date version) is a sensible option.
  • the step of identifying suspect internet-domains in hyperlinks in the incoming electronic messages includes first assessing a measure of text similarity between any such internet domains in hyperlinks and each of the non-threat internet domains, with respect to a criterion, and treating any internet domains in hyperlinks that meet the criterion as suspect, and any internet domains in hyperlinks that do not meet the criterion as not suspect, and the step of assessing whether at least one of the suspect internet-domains is a threat is applied to those internet domains in hyperlinks that are treated as suspect.
  • the assessment of a measure of text similarity includes providing a list of known pairs of single or double character homoglyphs, and in each instance of comparing an internet domain in a hyperlink with a non-threat internet domain any of the listed homoglyphs are identified and treated as identical for the purposes of measuring text similarity.
  • One way to perform the text assessment while treating listed homoglyphs as identical is to convert one of the homoglyphs (particularly the one in the internet domain in the incoming message) into its counterpart prior to text similarity assessment. Another way is to convert them both into an arbitrary code.
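The first approach (converting each listed homoglyph into its counterpart before comparison) might look like the sketch below. The table `HOMOGLYPHS` is a small hypothetical sample: the "0"/"o", "rn"/"m" and Cyrillic pairs follow examples in this description, while "1"/"l" and "vv"/"w" are added assumptions. The raw strings are still compared separately, so that homoglyphs are not treated as equivalent when testing for identity.

```python
# Hypothetical sample table mapping a look-alike to its counterpart.
HOMOGLYPHS = {
    "rn": "m",
    "vv": "w",       # assumed entry
    "0": "o",
    "1": "l",        # assumed entry
    "\u0435": "e",   # Cyrillic small ie
    "\u0440": "p",   # Cyrillic small er
    "\u043e": "o",   # Cyrillic small o
}

def normalise(domain):
    """Replace listed homoglyphs, longest entries first."""
    out = domain.lower()
    for glyph in sorted(HOMOGLYPHS, key=len, reverse=True):
        out = out.replace(glyph, HOMOGLYPHS[glyph])
    return out

def is_homoglyph_variant(suspect, trusted):
    """True when the strings differ as typed but match after normalisation."""
    return suspect.lower() != trusted.lower() and normalise(suspect) == normalise(trusted)
```

Normalising both strings keeps the comparison consistent for trusted names that happen to contain a listed pair.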
  • If any links in messages include a third level domain (and/or other sub-domain) then that sub-domain should be checked against the second level domains of the non-threat lists. In this case a direct match does not lead to the link being considered non-threat, but (barring other factors) will normally lead to the link being considered a phishing threat.
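The sub-domain check can be sketched as follows. The function name and the `suffix_labels` parameter (the number of labels making up the top level domain, e.g. 2 for ".co.uk") are assumptions for the illustration.

```python
def subdomain_impersonates(host, non_threat_slds, suffix_labels=1):
    """True if any label left of the second level domain is a trusted name.

    `non_threat_slds` holds second level domains from the non-threat list.
    """
    labels = host.lower().split(".")
    # Drop the top level domain labels and the second level domain itself.
    sub_domains = labels[:-(suffix_labels + 1)]
    return any(label in non_threat_slds for label in sub_domains)
```

For example, a host such as "paypal.secure-login.com" would be flagged because "paypal" appears as a sub-domain of an unrelated second level domain.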
  • any suspect hyperlinks are evaluated via digital image comparison.
  • the second level domain of the hyperlink is converted into a digital image (i.e. generally a rectangular image, generally with data defining black areas and white areas, and typically not defining colour).
  • the second level domain of each of the listed non-threat internet domains is either converted to a digital image or already provided as a digital image.
  • the same method of converting to digital image should be used on both second level domains.
  • the suspect link second level domain is converted from plain text to a digital image, and also a digital image is generated of its formatted appearance, and both digital images are compared to the digital images of the non-threat second level domains. Suitable methods of converting text to an image are easily selected and obtained (or else generated if necessary) by the skilled person.
  • Comparison of two digital images is performed on a computer by an image similarity assessment algorithm.
  • Many relevant algorithms already exist and can be selected and modified to suit this task. Searching for image comparison on github for example reveals a number of algorithms and even an algorithm for comparing image comparison algorithms.
  • an algorithm which assesses the degree to which the structural features of one image are present in the other in similar but not necessarily exactly matching locations. This provides a better estimate of whether a user is at risk of mis-reading the link as being a link to a common internet domain.
  • Such an algorithm can be written by the skilled person as an alternative to using and optionally modifying an available image comparison algorithm.
  • each small digital image will be a part of a single letter depicted in the original, and each part of each letter should be contained in at least one small digital image.
  • the similarity between images A and B is output as a measure of how many of the small images from A can be readily matched to a similar location of image B.
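A minimal sketch of this tile-matching measure is given below, under several assumptions: images are black-and-white bitmaps supplied as lists of equal-length strings ("#" for black, "." for white), image dimensions are multiples of the tile size, tiles are square and non-overlapping, and a tile counts as matched only if an identical tile is found in image B within `slack` pixels of its position in image A. Rendering text into such bitmaps is outside the scope of the sketch.

```python
def crop(image, top, left, size):
    """Cut a size x size tile out of a bitmap given as a list of strings."""
    return [row[left:left + size] for row in image[top:top + size]]

def tile_similarity(image_a, image_b, size=3, slack=1):
    """Fraction of non-blank tiles of A found at a similar location in B."""
    matched = total = 0
    for top in range(0, len(image_a) - size + 1, size):
        for left in range(0, len(image_a[0]) - size + 1, size):
            tile = crop(image_a, top, left, size)
            if not any("#" in row for row in tile):
                continue  # blank background carries no letter structure
            total += 1
            # Search B around the same position, allowing a small offset.
            found = any(
                crop(image_b, t, l, size) == tile
                for t in range(max(0, top - slack), top + slack + 1)
                for l in range(max(0, left - slack), left + slack + 1)
            )
            matched += found
    return matched / total if total else 0.0
```

A practical version would probably tolerate small pixel differences within a tile rather than requiring an exact match.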
  • If a suspect internet domain has already been identified as having text similarity to a non-threat internet domain, then it may be preferable to compare only the parts which lack text similarity, preferably along with any immediately neighbouring letters. So, for example, comparing A: "microsoft" to B: "microsoft" would involve first identifying that 8 of the 9 letters in A are present in B in the correct order, and thus the subdomains have text similarity. However, it is then only necessary to compare a digital image of the letters "mi" against a digital image of the letters "mi", for example as they would be displayed based on any formatting defined in the message (e.g. a specific font).
  • If the digital image similarity algorithm indicates high image similarity then the suspect hyperlink is either deemed a threat, or remains a suspect link for further analysis (step 12). If the digital image similarity algorithm indicates low image similarity then the suspect hyperlink is deemed a non-threat and the message is passed to the user's inbox or message client for viewing without being deleted, blocked or disabled (step 11).
  • a hyperlink that remains a suspect hyperlink following image similarity assessment is evaluated further, with a further assessment being based on at least one characteristic of the relevant internet domain's registration details.
  • the characteristic assessed is how recently the relevant domain was registered (i.e. recently or historically), based on whether this was longer ago than a threshold value. So, for example if the domain was registered less than a week prior to receipt of the message, then this in combination with the identified image similarity is used to deem the hyperlink a threat.
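The one-week example can be sketched as below. The function name and the returned labels are hypothetical, and the registration date is assumed to have been obtained already from a registration-information resource; no lookup is performed here.

```python
from datetime import datetime, timedelta

def assess_after_image_check(image_similar, registered_on, received_on,
                             recent=timedelta(days=7)):
    """Combine the image-similarity result with the domain's registration age."""
    if not image_similar:
        return "non-threat"
    if received_on - registered_on < recent:
        # Look-alike domain registered shortly before the message arrived.
        return "threat"
    # Long-registered look-alike: leave the decision to any further checks.
    return "further-checks"
```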
  • the user is then prevented from controlling a computer to access the hyperlink address - either by deleting the message, blocking the message, deleting or disabling the hyperlink or at the least marking the message as malicious in order to cause a further filter (such as a spam filter, or content filter or other type of filter) downstream to delete or block the message.
  • The internet domain's registration details are obtained when needed by reference to a suitable resource at a predetermined location on the internet, which may be a third-party website or may be a proprietary server providing a bespoke service for detecting phishing attacks.
  • an assessment is made of whether the non-threat internet domain (which has been assessed as having image similarity to an internet domain of a hyperlink) is mentioned in the text of the message. If the text string merely occurs incidentally (for example the internet domain "ample” might occur in a message which contains the word “example”) this could sensibly be disregarded. However the presence of the non-threat internet domain as a word within the message would be a strong indicator that the message is malicious. While this step has been discussed as a separate subsequent step, alternatively the steps are combined and the image similarity and one or more other criteria are assessed jointly via an algorithm which balances two or more input variables to provide an assessment of whether the message is a threat.
  • Such algorithms may vary widely, perhaps relying on many inputs such as the number of times the word is repeated, the amount of image similarity, the recentness of the URL registration, whether the non-threat domain is in an organisation-relevant non-threat list or a personal non-threat list etc, and/or whether the non-threat internet domain is one known to provide login-based access or merely freely accessible information.
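Purely as an illustration of balancing several inputs, a weighted sum is sketched below. The weights, the cap of three mentions, the one-week age cut-off and the function name are all assumptions; the specification leaves the form of the algorithm open.

```python
def threat_score(image_similarity, mentions, domain_age_days, login_site):
    """Combine several indicators into a score between 0 and 1.

    image_similarity: 0..1 output of the image comparison.
    mentions: whole-word occurrences of the trusted name in the message.
    domain_age_days: days between registration and receipt of the message.
    login_site: True if the imitated domain provides login-based access.
    """
    score = 0.5 * image_similarity
    score += 0.2 * min(mentions, 3) / 3          # repeated mentions, capped
    score += 0.2 * (1.0 if domain_age_days < 7 else 0.0)
    score += 0.1 * (1.0 if login_site else 0.0)
    return score
```

A message would then be deemed a threat when the score exceeds a chosen cut-off, which would itself be established by trial and error.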
  • the link has been described as a hyperlink - generally this includes non-enabled or inactive hyperlinks in the form of a URL in plain text (whether or not the message itself is in plain text format). This is preferable because some phishing messages may merely invite the user to copy and paste the URL into a browser address bar.
  • the method and apparatus provide for protecting against homoglyph phishing attacks by identifying threat links in incoming electronic messages such as emails. Such messages can then be blocked, deleted or edited to prevent the user being duped by them.
  • The method involves establishing a list of internet domains that are relevant to a particular organisation, and identifying incoming messages with internet links that may cause users to wrongly think those links are directed to those listed internet domains. This is achieved by applying an image similarity assessment algorithm to detect links which would have visual similarity to those listed internet domains.
  • the method may include only assessing image similarity on those links that have a predetermined amount of text similarity to those listed internet domains, and may also include further checks if the link is still a suspect link after the image similarity check.
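  • The two-stage check summarised above (a text-similarity pre-filter, then image similarity only on the links that pass it) can be sketched as follows. The thresholds, the function names and the use of `difflib.SequenceMatcher` for text similarity are assumptions for illustration. The `toy_image_similarity` function is a deliberately simple stand-in that folds a few look-alike characters; it is not an image comparison, and a real implementation would render both domains as images and compare those.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional


def text_similarity(a: str, b: str) -> float:
    """Cheap character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def find_impersonated_domain(
    link_domain: str,
    non_threat_domains: Iterable[str],
    image_similarity: Callable[[str, str], float],
    text_threshold: float = 0.7,    # assumed value
    image_threshold: float = 0.9,   # assumed value
) -> Optional[str]:
    """Return the listed domain the link appears to imitate, or None."""
    listed = list(non_threat_domains)
    if link_domain in listed:
        return None  # the link really does point at a listed domain
    for known in listed:
        if text_similarity(link_domain, known) < text_threshold:
            continue  # pre-filter: skip the costlier image comparison
        if image_similarity(link_domain, known) >= image_threshold:
            return known
    return None


# Stand-in only: maps a few visually confusable characters to one form.
_LOOKALIKES = str.maketrans({"1": "l", "0": "o", "\u0430": "a", "\u0435": "e", "\u043e": "o"})


def toy_image_similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.translate(_LOOKALIKES), b.translate(_LOOKALIKES)).ratio()


if __name__ == "__main__":
    listed = ["example.com", "intranet.example.org"]
    print(find_impersonated_domain("examp1e.com", listed, toy_image_similarity))
```

  • Passing the image comparison in as a callable keeps the pre-filter logic independent of whichever rendering and image-comparison technique is actually used.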
  • Figure 3 is a block diagram depicting an example of a computing system 14 that executes a link assessment engine 15 for assessing whether electronic messages contain threat links, as described above.
  • the computing system 14 includes a processor 16 that is communicatively coupled to a memory 15 and that executes computer-executable program code and/or accesses information stored in the memory 15.
  • Examples of the processor 16 include (but are not limited to) a microprocessor, an application-specific integrated circuit ("ASIC"), a field-programmable gate array (“FPGA”), or other processing device.
  • the processor 16 can include any number of processing devices, including one.
  • the memory 15 includes any suitable non-transitory computer-readable medium.
  • the computer-readable medium includes any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable instructions or other program code.
  • Non-limiting examples of a computer-readable medium include a CD-ROM, a DVD, a magnetic disk, a memory chip, a ROM, a RAM, an ASIC, optical storage, magnetic tape or other magnetic storage, or any other medium from which a computer processor can read instructions.
  • the computing system 14 may also include a number of external or internal devices such as input or output devices.
  • the computing system 14 is shown with an input/output ("I/O") interface 18 that can receive input from input devices or provide output to output devices.
  • a bus 17 can also be included in the computing system 14. The bus 17 can communicatively couple one or more components of the computing system 14.
  • the computing system 14 executes program code that configures the processor 16 to perform one or more of the operations described above.
  • the memory 15 stores this program code.
  • the program code includes, for example, the link assessment engine 15 or any other suitable engine, module, or application that can be used to perform one or more operations described herein.
  • the program code may be resident in the memory 15 or any suitable computer-readable medium and may be executed by the processor 16 or any other suitable processor.
  • the program code includes processor-specific instructions generated by a compiler and/or an interpreter from code written in any suitable computer-programming language, including, for example, C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, and ActionScript.
  • the computing system 14 receives electronic messages via I/O device 18 from message server 19, and typically sends them back to the same or another message server for onward transmission, or directly on to be picked up by the relevant message client.
  • Computer 20 is typically one of many computers, typically each owned by the organisation and typically each controlled by a user who is a member of the organisation.
  • The computer 20, which may be a smartphone or other portable device, similarly has a memory, a processor, a bus and an I/O device for receiving messages 21, and an I/O device, which is generally a screen but may be another visual (or even audio) display unit 23, for displaying the electronic message to a user 24 (dotted arrow) and allowing the user 24 to control the computer 20 to follow a hyperlink within the displayed message (second dotted arrow).
  • Figure 4 shows an alternative embodiment in which the computing system 14 and the computer 20 are provided in one computer system 14. This computer system both assesses links to identify targeted phishing attacks and displays messages to the user via a message client. The numbering of Figure 3 is used in Figure 4.
  • a computing device can include any suitable arrangement of components that provides a result conditioned on one or more inputs.
  • Suitable computing devices include multipurpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention relates to a method and apparatus for protecting against malicious homoglyph-type attacks by identifying threat links in incoming electronic messages such as emails. Such messages can then be blocked, deleted or edited to prevent the user controlling a computer from downloading code from those links. The method involves establishing a list of internet domains that are relevant to a user or an organisation, and identifying incoming messages with internet links that may cause users to wrongly think those links are directed to those listed internet domains. This is achieved by applying an image similarity assessment algorithm to detect links which would have visual similarity to those listed internet domains. The method may include only assessing image similarity on those links that have a predetermined amount of text similarity to those listed internet domains, and may also include further checks if the link is still a suspect link after the image similarity check.
PCT/GB2017/000038 2016-03-24 2017-03-23 Method for protecting a user against messages with links to malicious websites containing homograph attacks WO2017162997A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB1605004.9A GB201605004D0 (en) 2016-03-24 2016-03-24 A method of protecting a user from messages with links to malicious websites
GB1605004.9 2016-03-24

Publications (1)

Publication Number Publication Date
WO2017162997A1 true WO2017162997A1 (fr) 2017-09-28

Family

ID=56027318

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2017/000038 WO2017162997A1 (fr) 2016-03-24 2017-03-23 Method for protecting a user against messages with links to malicious websites containing homograph attacks

Country Status (2)

Country Link
GB (2) GB201605004D0 (fr)
WO (1) WO2017162997A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008086924A1 (fr) * 2007-01-16 2008-07-24 International Business Machines Corporation Method and device for detecting computer fraud
CN105357221A (zh) * 2015-12-04 2016-02-24 北京奇虎科技有限公司 Method and apparatus for identifying phishing websites

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2382361T3 (es) * 2005-01-14 2012-06-07 Bae Systems Plc Network-based security system
US7668921B2 (en) * 2006-05-30 2010-02-23 Xerox Corporation Method and system for phishing detection

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11665135B2 (en) 2018-05-22 2023-05-30 Proofpoint, Inc. Domain name processing systems and methods
EP3809299A4 (fr) * 2018-07-25 2022-03-16 Nippon Telegraph And Telephone Corporation Analysis device, analysis method, and analysis program
US11843633B2 (en) 2018-07-25 2023-12-12 Nippon Telegraph And Telephone Corporation Analysis device, analysis method, and analysis program
US11063897B2 (en) 2019-03-01 2021-07-13 Cdw Llc Method and system for analyzing electronic communications and customer information to recognize and mitigate message-based attacks
WO2022051663A1 (fr) * 2020-09-04 2022-03-10 Proofpoint, Inc. Domain name processing systems and methods
US11973799B2 (en) 2020-09-04 2024-04-30 Proofpoint, Inc. Domain name processing systems and methods

Also Published As

Publication number Publication date
GB2550657A (en) 2017-11-29
GB201704507D0 (en) 2017-05-03
GB201605004D0 (en) 2016-05-11

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17714503

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17714503

Country of ref document: EP

Kind code of ref document: A1