FR2953957A1

FR2953957A1 - Method for detecting domain name generated by e.g. robot network in Internet, involves determining third score based on first and second scores, and providing proposition for placing domain name if third score exceeds given threshold

Info

Publication number: FR2953957A1
Application number: FR0959032A
Authority: FR
Inventors: Laurent Clevy; Hakim Hacid
Original assignee: Alcatel Lucent SAS
Current assignee: Alcatel Lucent SAS
Priority date: 2009-12-16
Filing date: 2009-12-16
Publication date: 2011-06-17

Abstract

The method involves capturing responses from a domain name server (SND) to domain name requests issued from a telecommunications network. A first score is determined from information relating to the domain name system protocol contained in the responses, e.g. a date or a time to live. A second score is determined from lexical and syntactic analyses of the domain name string contained in the responses. A third score is determined from the first and second scores. A proposal to place the domain name in a blacklist is provided if the third score exceeds a given threshold. Independent claims are also included for the following: (1) a server for detecting a domain name generated by a network of malicious machines in a telecommunications network, comprising a response capturing unit; and (2) a computer program comprising instructions for performing a method for detecting a domain name generated by a network of malicious machines in a telecommunications network.

Description

DETECTION OF A DOMAIN NAME GENERATED BY A NETWORK OF MALICIOUS MACHINES

The present invention relates to the detection of domain names generated by a network of malicious machines in a telecommunications network.

A network of malicious machines, or "BotNet" ("roBot Network"), in a telecommunications network is a set of interconnected software robots installed on computers connected to the network. This type of network can be controlled remotely by hidden entities, notably for criminal purposes, for example to send "spam" emails or to carry out "denial of service" attacks. A malicious machine is, for example, a computer that has been infected by a virus and that acts on behalf of the network of malicious machines without the knowledge of its user. A malicious machine may also be called a "zombie" machine, and a network of malicious machines can be likened to a virus distributed across machines operating without the knowledge of their users.

A network of malicious machines can use generated domain names to hide the actual location of its command nodes.

A domain name is generally easy for a user to remember so that it can be used easily; that is, a user can reach a website, for example, by means of keywords that identify the domain name associated with the website. A domain name generated by a network of malicious machines is addressed directly to the infected computers rather than to users; that is, the domain name is not always understandable by a user, since it may be generated randomly or according to complex algorithms.

Owing to the development of telecommunications network architectures and of domain name generation algorithms, it is becoming increasingly difficult to detect the activity of a network of malicious machines, yet the detection of such networks is an important security problem that requires considerable research effort.

An object of the invention is to overcome the above drawbacks, in particular by proposing a system for detecting domain names generated by a network of malicious machines.

To achieve this object, a method for detecting a domain name generated by a network of malicious machines in a telecommunications network comprises the following steps: capturing responses from a domain name server to domain name requests issued from the telecommunications network; determining a first score based on information relating to the domain name system protocol contained in the captured responses; determining a second score based on a lexical analysis and a syntactic analysis of the domain name string contained in the captured responses; determining a third score based on the first score and the second score; and providing a proposal to place the domain name in a blacklist if the third score exceeds a given threshold.

Advantageously, the invention makes it possible to update blacklists of domain names generated by networks of malicious machines, and adds a dynamic capability to the detection of domain names generated by a network of malicious machines.

According to other features of the invention, the first score may be determined based on a date and a time to live relating to the registration of the domain name, and the second score may be determined by means of at least one dictionary among a natural language dictionary, a keyword dictionary and a brand name dictionary.

According to another feature of the invention, a user can accept or refuse the proposal to place the domain name in a blacklist.

A user can thus make the final decision and enrich the detection method, for example when the user refuses the proposal to place the domain name in a blacklist.

The invention also relates to a server for detecting a domain name generated by a network of malicious machines in a telecommunications network, comprising: means for capturing responses from a domain name server to domain name requests issued from the telecommunications network; means for determining a first score based on information relating to the domain name system protocol contained in the captured responses; means for determining a second score based on a lexical analysis and a syntactic analysis of the domain name string contained in the captured responses; means for determining a third score based on the first score and the second score; and means for providing a proposal to place the domain name in a blacklist if the third score exceeds a given threshold.

The invention further relates to a computer program suitable for being implemented in a server, said program comprising instructions which, when the program is executed in said server, carry out the steps of the method of the invention.

The present invention and the advantages it provides will be better understood from the following description, given with reference to the accompanying figures, in which:
- Figure 1 is a schematic block diagram of a communication system according to an embodiment of the invention, and
- Figure 2 is a flowchart of a domain name detection method according to an embodiment of the invention.

With reference to Figure 1, a communication system comprises at least one domain name server SND and one detection server SD, able to communicate with each other through a telecommunications network RT. The telecommunications network RT may be a wired network, a wireless network, or a combination of wired and wireless networks. In one example, the telecommunications network RT is a high-speed packet network of the IP ("Internet Protocol") type, such as the Internet or an intranet.

The detection server SD comprises a capture module CAP, an evaluation module EVA, an analysis module ANA, a decision module DEC and a user interface IU.

In the remainder of the description, the term "module" may denote a device, software, or a combination of computer hardware and software, configured to perform at least one particular task. In a variant, the capture module CAP, the evaluation module EVA and the analysis module ANA are each implemented in servers external to the detection server SD, or implemented in part in one or more servers external to the detection server SD.

For example, the detection server SD is included in a domain name server, in a firewall, or in an intrusion detection system.

The detection server SD is linked to an analysis database BDA and a list database BDL, which are either integrated in the detection server SD or each incorporated in a database management server connected to the server SD by a local or remote link. In particular, the analysis database BDA contains natural language, keyword and brand name dictionaries, and the list database BDL contains blacklists of domain names.

The capture module CAP captures responses, and optionally domain name requests, issued from the telecommunications network; more particularly, those that relate to a domain name to be analyzed and that are exchanged with at least one domain name server SND. The responses and requests contain, in particular, the domain name string.
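The description does not say how the capture module CAP reads the domain name out of a captured message. As a minimal sketch, assuming the module receives the raw bytes of a DNS message in the standard wire format (12-byte header followed by the question section), the queried name could be extracted as follows. The function name `question_name` is hypothetical, and name compression is deliberately not handled.

```python
import struct


def question_name(message: bytes) -> str:
    """Return the name in the first question of a raw DNS message.

    Assumes the standard DNS wire format: a 12-byte header, then the
    question name encoded as length-prefixed labels ending with a zero byte.
    """
    if len(message) < 12:
        raise ValueError("message shorter than a DNS header")
    # QDCOUNT (number of questions) is the 16-bit field at byte offset 4.
    (qdcount,) = struct.unpack_from("!H", message, 4)
    if qdcount < 1:
        raise ValueError("message carries no question")
    labels = []
    offset = 12  # the question section starts right after the header
    while True:
        length = message[offset]
        if length == 0:
            break  # the zero-length root label ends the name
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        offset += 1
        labels.append(message[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels)
```

A DNS response echoes the question it answers, so the same function applies to responses and to requests.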

The evaluation module EVA evaluates the behavior of the domain name system protocol with regard to so-called "Fast Flux" techniques. These techniques are used to conceal illegal activities and sites, in particular phishing sites. They exploit the technical characteristics of the DNS ("Domain Name System") protocol that allow several IP addresses to be assigned to the same domain name. More particularly, the DNS protocol is the protocol used to determine an IP address, and therefore a server, from a domain name. In order to distribute load among several servers, several IP addresses can be assigned to the same domain name. IP address assignments can be short-lived, so that the assigned addresses change very regularly.

In the case of a network of malicious machines, the IP address corresponding to a domain name can change regularly and point to compromised machines that redirect requests to the real servers, that is, the command nodes of the network of malicious machines.

To evaluate the behavior of the domain name system protocol, the evaluation module EVA can in particular determine the date and the "time to live" (TTL) value relating to the domain name registration, and determine to what extent the IP ("Internet Protocol") addresses returned by the DNS server are local. The evaluation module EVA can also determine the number of details available in the domain name registration. The evaluation module EVA determines a first score representing, for example, the probability that the domain name is a "malicious" domain, that is, one originating from a network of malicious machines using so-called "Fast Flux" techniques.
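The description names the indicators used by the evaluation module EVA but gives no formula for the first score. The sketch below is one possible illustration, not the patented computation: it counts how many indicators look suspicious and returns the fraction. All thresholds, the `DnsObservation` type, and the assumption that private (local) addresses count as suspicious are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from datetime import date
from ipaddress import ip_address
from typing import List


@dataclass
class DnsObservation:
    """Values assumed to have been extracted from captured DNS responses."""
    ttl_seconds: int            # time to live of the returned records
    registered_on: date         # registration date of the domain name
    addresses: List[str]        # IP addresses returned for the name
    registration_details: int   # number of details present in the registration


def first_score(obs: DnsObservation, today: date) -> float:
    """Return an illustrative S1 in [0, 1]; higher means more suspicious."""
    signals = [
        obs.ttl_seconds < 300,                        # assumed "short TTL" cutoff
        (today - obs.registered_on).days < 30,        # assumed "recent" cutoff
        len(set(obs.addresses)) >= 5,                 # many distinct addresses
        obs.registration_details < 3,                 # sparse registration
        any(ip_address(a).is_private for a in obs.addresses),  # local addresses
    ]
    # Each indicator carries equal weight in this sketch.
    return sum(signals) / len(signals)
```

A deployed evaluator would more plausibly learn the weights and cutoffs from labeled traffic; the equal weighting here only keeps the example short.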

The analysis module ANA performs a lexical analysis, a syntactic analysis and a semantic analysis on the domain name string. The lexical analysis identifies the characters contained in the domain name string; these characters may be letters from different languages, digits or other symbols. The syntactic analysis determines the structure of the domain name string, for example as one or more words or expressions, and checks whether these exist or have a meaning, for example by means of a natural language dictionary and/or a keyword dictionary and/or a brand name dictionary.

The semantic analysis determines to what extent the domain name is understandable and memorable for a user. The lexical, syntactic and semantic analyses can make use of natural language processing (NLP) techniques.

The analysis module ANA determines a second score representing, for example, the probability that the domain name string was generated by a network of malicious machines, based on the lexical, syntactic and semantic analyses of the domain name string.

In one example, the domain name is "biz4you.com". The lexical analysis identifies letters and digits commonly used in domain names. The syntactic analysis identifies the expression "biz" by means of a keyword dictionary, and the expressions "4" and "you" by means of a natural language dictionary. The semantic analysis identifies the meaning of the combination of the identified expressions: "biz" is the phonetic pronunciation of the beginning of the English word "business", and "4" is used for the English word "for". The domain name "biz4you.com" therefore has a meaning close to "business for you" and can be considered understandable by a user. By contrast, a domain name such as "4bizyou.com" would have no meaning according to the semantic analysis, and the analysis module would determine that there is a high probability that the domain name was generated by a network of malicious machines.
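The dictionary lookup in the "biz4you.com" example can be sketched as a search for a split of the name into known entries. This is only an illustration of the lexical and syntactic stage under stated assumptions: the two small sets stand in for the dictionaries of the analysis database BDA, the function names are hypothetical, and the binary score is a simplification of whatever graded score the module ANA would produce.

```python
from typing import List, Optional, Set

# Tiny stand-ins for the BDA dictionaries (assumed contents).
KEYWORDS: Set[str] = {"biz", "shop", "web"}
NATURAL_LANGUAGE: Set[str] = {"you", "for", "4", "my", "home"}


def segment(text: str, vocabulary: Set[str]) -> Optional[List[str]]:
    """Split text into dictionary entries, or return None if no full split exists."""
    if not text:
        return []
    # Try the longest prefix first, backtracking when the remainder fails.
    for end in range(len(text), 0, -1):
        head = text[:end]
        if head in vocabulary:
            rest = segment(text[end:], vocabulary)
            if rest is not None:
                return [head] + rest
    return None


def second_score(domain: str) -> float:
    """Return an illustrative S2: 0.0 if the name splits into known entries, else 1.0."""
    name = domain.split(".")[0].lower()  # analyze the part before the first dot
    words = segment(name, KEYWORDS | NATURAL_LANGUAGE)
    return 0.0 if words is not None else 1.0
```

With these dictionaries, "biz4you.com" splits into "biz", "4" and "you". Note that "4bizyou.com" also splits into known entries, so this stage alone does not reject it; distinguishing the two orderings is the job of the semantic analysis, which this sketch does not attempt.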

The decision module DEC determines a final score based on the first and second scores determined by the evaluation module EVA and the analysis module ANA. The final score represents, for example, the probability that the domain name is a "malicious" domain, that is, one originating from a network of malicious machines. If the final score exceeds a given threshold, the decision module DEC proposes placing the domain name in a blacklist.

The user interface IU allows a user to validate the proposal of the decision module DEC, that is, to place the domain name in a blacklist. For example, the user makes this decision with the help of the final score, and possibly with the help of the first score and the second score.

Optionally, if the user refuses the proposal of the decision module DEC to place the domain name in a blacklist, the interface asks the user which of the first and second scores appears incorrect. This correction is then transmitted to the evaluation and analysis modules in order to improve subsequent score determinations and to refine the detection of false positives.

With reference to Figure 2, a domain name detection method according to an embodiment of the invention comprises steps E1 to E5 executed in the communication system.

In step E1, the capture module CAP captures responses, and optionally domain name requests, issued from the telecommunications network RT, the responses coming from a domain name server SND and the requests being addressed to the domain name server SND. The responses and requests contain, in particular, the string of a domain name that is requested by entities of the telecommunications network. The capture module CAP transmits the captured requests and responses to the evaluation module EVA and the analysis module ANA.

In step E2, the evaluation module EVA extracts information relating to the behavior of the DNS protocol, with regard to so-called "Fast Flux" techniques, from the captured responses and optionally from the captured requests. In particular, from the information extracted from the captured responses, and optionally from the captured requests, the evaluation module EVA determines the date and the "time to live" (TTL) value relating to the domain name registration, as well as the number of details available in this registration, and determines to what extent the IP addresses returned by the DNS server are local. The evaluation module EVA determines a first score S1 based on the analysis of the information relating to the domain name system protocol extracted from the captured responses, and optionally from the captured requests. The first score S1 represents, for example, the probability that the domain name is a "malicious" domain.

In step E3, the analysis module ANA performs a lexical analysis and a syntactic analysis, and optionally a semantic analysis, on the domain name string. The analysis module ANA first performs a lexical analysis and a syntactic analysis on the domain name string. In particular, the analysis module ANA checks whether the domain name string contains words or expressions that exist or have a meaning, for example by means of a natural language dictionary and/or a keyword dictionary and/or a brand name dictionary.

The analysis module ANA then performs a semantic analysis on the domain name string, in particular if the lexical and syntactic analyses are positive, that is, if the domain name string contains words or expressions that exist in dictionaries. Indeed, if the domain name string contains words or expressions that do not exist in dictionaries, the analysis module can already determine a high probability that the domain name was generated by a network of malicious machines, without resorting to a semantic analysis. The analysis module ANA determines a second score S2 representing, for example, the probability that the domain name string was generated by a network of malicious machines, based on the lexical, syntactic and semantic analyses of the domain name string.

In step E4, the decision module DEC retrieves the first and second scores from the evaluation and analysis modules, and determines a final score Sf as a function of the first score S1 and the second score S2. For example, the decision module DEC computes the mean of the scores S1 and S2, optionally assigning coefficients to the scores.
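The mean with optional coefficients can be sketched as a weighted average. This is a minimal sketch under assumptions: the description does not give the coefficient values or require normalisation, so the weights `w1` and `w2` and the function name are hypothetical.

```python
def final_score(s1: float, s2: float, w1: float = 1.0, w2: float = 1.0) -> float:
    """Weighted mean of S1 and S2; equal weights give the plain average."""
    if w1 < 0 or w2 < 0 or w1 + w2 == 0:
        raise ValueError("weights must be non-negative and not both zero")
    return (w1 * s1 + w2 * s2) / (w1 + w2)
```

Dividing by the sum of the weights keeps Sf in the same range as S1 and S2, so that it can still be read as a probability when both inputs are probabilities.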

The decision module DEC compares the final score Sf with a given threshold Th. If the final score Sf exceeds the given threshold Th, the decision module DEC provides a proposal to place the domain name in a blacklist. In step E5, the user, via the user interface IU, accesses the proposal provided by the decision module DEC and can validate or reject it. If the proposal is validated, the domain name is included in a blacklist stored in the list database BDL.
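The threshold comparison and the user validation of step E5 could look like the following sketch. The `Proposal` type, the function names and the use of an in-memory set in place of the list database BDL are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    """Hypothetical record shown to the user through the user interface."""
    domain: str
    score: float


def propose_blacklisting(domain: str, sf: float, th: float) -> Optional[Proposal]:
    """Return a proposal only if the final score Sf strictly exceeds Th."""
    return Proposal(domain, sf) if sf > th else None


def review(proposal: Proposal, validated: bool, blacklist: set[str]) -> None:
    """Add the domain to the blacklist only when the user validates the proposal."""
    if validated:
        blacklist.add(proposal.domain)
```

Keeping the proposal separate from the blacklist update reflects the fact that the decision module only suggests an entry; the user makes the final decision.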

The invention described here relates to a method and a server for detecting domain names generated by a network of malicious machines. According to one implementation of the invention, the steps of the method are determined by the instructions of a computer program incorporated in a server, such as the detection server SD. The program comprises program instructions which, when said program is loaded and executed in the server, carry out the steps of the method of the invention. Consequently, the invention also applies to a computer program, in particular a computer program on or in an information medium, adapted to implement the invention. This program may use any programming language and be in the form of source code, object code, or code intermediate between source code and object code, such as a partially compiled form, or in any other form desirable for implementing the method according to the invention.

Claims

1. A method for detecting a domain name generated by a network of malicious machines in a telecommunications network (RT), the method comprising the steps of: capturing (E1) responses of a domain name server (SND) to domain name requests from the telecommunications network, determining (E2) a first score (S1) based on domain name system protocol information contained in the captured responses, determining (E3) a second score (S2) based on a lexical analysis and a syntactic analysis of the wording of the domain name contained in the captured responses, determining (E4) a third score (Sf) based on the first score (S1) and the second score (S2), and providing (E5) a proposal to place the domain name in a blacklist if the third score (Sf) exceeds a given threshold (Th).

2. The method according to claim 1, wherein domain name requests sent from the telecommunications network to the domain name server (SND) are captured, and the first score (S1) is further determined based on domain name system protocol information contained in the captured requests.

3. The method according to claim 1 or 2, wherein the first score (S1) is determined based on a date and a time to live relating to the registration of the domain name.

4. The method according to one of claims 1 to 3, wherein the second score (S2) is determined by means of at least one of a natural language dictionary, a dictionary of keywords and a dictionary of brand names.

5. The method according to one of claims 1 to 4, wherein the second score (S2) is further determined based on a semantic analysis of the wording of the domain name.

6. The method according to one of claims 1 to 5, wherein the third score (Sf) is determined as a mean of the first score (S1) and the second score (S2).

7. A method according to any one of claims 1 to 6, wherein the third score (Sf) represents a probability that the domain name is a domain from a malicious machine network.

8. A method according to any one of claims 1 to 7, wherein a user validates or refuses the proposal to place the domain name in a blacklist.

9. A server (SD) for detecting a domain name generated by a network of malicious machines in a telecommunications network (RT), comprising: means (CAP) for capturing responses of a domain name server (SND) to domain name requests from the telecommunications network, means (EVA) for determining a first score (S1) based on domain name system protocol information contained in the captured responses, means (ANA) for determining a second score (S2) based on a lexical analysis and a syntactic analysis of the wording of the domain name contained in the captured responses, means (DEC) for determining a third score (Sf) based on the first score (S1) and the second score (S2), and means (DEC) for providing a proposal to place the domain name in a blacklist if the third score (Sf) exceeds a given threshold (Th).

10. A computer program adapted to be implemented in a server (SD) for detecting a domain name generated by a network of malicious machines in a telecommunications network (RT), said program comprising instructions which, when the program is loaded and executed in said server, perform the following steps: capturing (E1) responses of a domain name server (SND) to domain name requests from the telecommunications network, determining (E2) a first score (S1) based on domain name system protocol information contained in the captured responses, determining (E3) a second score (S2) based on a lexical analysis and a syntactic analysis of the wording of the domain name contained in the captured responses, determining (E4) a third score (Sf) based on the first score (S1) and the second score (S2), and providing (E5) a proposal to place the domain name in a blacklist if the third score (Sf) exceeds a given threshold (Th).