ES2577143T3

ES2577143T3 - Method and system to detect malicious software

Info

Publication number: ES2577143T3
Application number: ES11805856.9T
Authority: ES
Inventors: Francisco Romero Bueno; Antonio Manuel Amaya Calvo
Original assignee: Telefonica SA
Current assignee: Telefonica SA
Priority date: 2011-10-14
Filing date: 2011-12-23
Publication date: 2016-07-13
Anticipated expiration: 2031-12-23
Also published as: WO2013053407A1; EP2767056A1; US20150052606A1; EP2767056B1

Abstract

A method for detecting malicious software, the detection being performed in an anomaly detection system (ADS) that analyzes the behavior of a network and looks for deviations from a normality, said normality indicating the common behavior of the users of the network and being defined before detection. The method comprises: - building a plurality of detection models for each of a plurality of different entities of the network, each detection model being adapted to the different entities of the network and to different algorithms, the different algorithms implementing different detection strategies and the plurality of detection models representing said normality; and - representing the plurality of detection models in a two-dimensional matrix, one dimension of the matrix corresponding to the number of different entities of the network and the other dimension corresponding to the number of different algorithms used.

Description


Method and system for detecting malicious software

Technical field

The present invention relates generally, in a first aspect, to a method for detecting malicious software, the detection being performed in an anomaly detection system (ADS) that analyzes the behavior of a network and looks for deviations from a normality, said normality indicating the common behavior of the users of the network and being defined before detection; and more particularly to a method comprising building a plurality of detection models, each adapted to different entities of the network and to different algorithms, the different algorithms implementing different detection strategies and the plurality of detection models representing said normality.

A second aspect of the invention relates to a system arranged to implement the method of the first aspect.

Prior art

Malicious software (malware) detection can be classified in many ways. One is the classic categorization that distinguishes host-based from network-based detection. The former looks for evidence of viruses, Trojans, etc. by analyzing processes, physical memory and various other host-level data, while the latter focuses on the communications made by such viruses or Trojans.

Several strategies emerged in the early days of network-based detection. The most basic was to block certain communications passing through a special network node, a firewall. This solution was useful until malware, in order to hide itself, began using widely used protocols such as HTTP and SMTP, which cannot be blocked without halting the business of ISPs and operators.

It then became obvious that analyzing a few fields of TCP/IP packets (protocol, source and destination ports, and IP addresses) was not enough, and that monitoring had to be extended to the complete TCP/IP header and even to the packet payload. This is how network intrusion detection systems (NIDS or IDS) were born [15]. IDSs are based on traffic signatures, that is, special strings that, when they appear inside network packets, indicate the presence of a specific piece of malware. Widely known IDSs are Snort [1] and Bro [2].

Although IDSs still play an important role in the monitoring processes of operators worldwide, since they account for a significant percentage of malware detections, researchers found that some malware remained undetectable to firewalls and IDSs. These undetectable viruses, worms, etc. used both popular protocols and a priori normal content in the packet payload. Some examples are spammers [3], denial-of-service (DoS) bots [4] and scanners [5]. It was then observed that the only way to detect these threats was to analyze the behavior of the network in order to find deviations from normality, that is, anomalies. Normality is defined through a mathematical representation of common reality, that is, a model, which is built in a stage prior to detection. Network anomaly detection systems (NADS) [14], or simply anomaly detection systems (ADS), such as Proventia [6], quickly proved useful in covering this monitoring gap, as well as in many other scenarios, such as the detection of zero-day malware (new malware for which IDSs have no valid signature) or encrypted traffic (signatures are of no use when the payload cannot be seen).

Anomaly detection is still an interesting research area that can offer security administrators many solutions. Innovative artificial intelligence (AI) algorithms are appearing, and the modeling target, which is sometimes more important than the detection algorithm itself, ranges from the coarsest granularity, the whole network, down to the finest, the individual end user.

Examples of anomaly detection systems can be found in documents US2007/289013 and US8015133.

Two types of problems can be distinguished in existing ADSs: those related to effectiveness and those related to efficiency.

Effectiveness problems are evident in the day-to-day operation of every company. Their deployed systems do not detect everything, and the things they should detect are not properly found because of obsolete strategies and mechanisms.

- Network-based behavior modeling

Models of the behavior of monitored networks are always based on aggregation: they count the number of packets/flows/bytes per unit of time and thereby define a baseline. Although this approach is sufficient to face major threats such as DoS attacks [4], which generate a large amount of data, it is of no use against more discreet and sophisticated attacks based on very small variations in certain traffic features that can only be appreciated at the level of the individual entity.

- Rudimentary detection algorithms

Condition-consequence rules and volumes/thresholds make up almost all the technology implemented in most successful NADS. These are very basic algorithms that are hard to maintain (new conditions must be considered whenever a threat evolves), show very low flexibility (a threshold of 1000000 bytes is exceeded as soon as 1000001 bytes are reached) and adapt poorly by themselves.
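The two limitations just described can be illustrated with a minimal Python sketch. It is not taken from any existing NADS: the function names, the host addresses and the byte counts are hypothetical, and only the 1000000-byte threshold comes from the text above.

```python
from typing import Dict

THRESHOLD_BYTES = 1_000_000  # fixed volume threshold taken from the example in the text


def volume_alarm(total_bytes: int, threshold: int = THRESHOLD_BYTES) -> bool:
    """Rudimentary rule: alarm when the aggregated volume exceeds the threshold."""
    return total_bytes > threshold


def aggregate(window: Dict[str, int]) -> int:
    """Network-wide aggregation: sum the bytes of all hosts in one time window."""
    return sum(window.values())


# Hypothetical per-host byte counts for two consecutive time windows.
before = {"10.0.0.1": 400_000, "10.0.0.2": 599_000, "10.0.0.3": 1_000}
after = {"10.0.0.1": 400_000, "10.0.0.2": 597_000, "10.0.0.3": 3_000}

# A single extra byte flips the rule: there is no tolerance around the threshold.
print(volume_alarm(1_000_000), volume_alarm(1_000_001))

# Host 10.0.0.3 tripled its traffic, yet the aggregated volume did not change.
print(aggregate(before) == aggregate(after))
```

In this sketch the rule fires at 1000001 bytes but not at 1000000, and the change in the behavior of one host is invisible in the aggregate, which is the kind of low-variation deviation that only a per-entity model could notice.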

Most of the efficiency-related problems of existing solutions are due to the lack of a common framework for monitoring purposes. Each solution is proprietary and largely closed to third parties, which forces organizations to install a new battery of solutions each time a new threat arises.

- No integration between vendors

Except for SIEM [9], a special class of systems that are not in charge of direct network monitoring but of aggregating and correlating events from different sources, current monitoring systems (including ADSs) do not allow integration with one another, nor with other tools such as sniffing tools, logging applications, etc. For example, it is very common for IDS/IPS/ADS to sniff the network traffic themselves, while many sniffing systems already perform the same task.

- Low level of customization/extension

One of the main architectural problems of monitoring systems, and of ADSs in particular, is their lack of flexibility. Current research and commercial anomaly detection solutions are very closed and proprietary, which makes almost any kind of customization/extension impossible. This low level of customization may seem normal in any other kind of software system, but not in the field of monitoring, where security operators need to face the continuous evolution of malware by means of new algorithms, strategies, data sources, etc.

- Multiple monitoring points

In addition, if an organization wants to use several NADS (because each one targets a different threat), each individual NADS will require a dedicated monitoring point from which to obtain the raw traffic. This would be a minor problem if providing monitoring points were not costly. On the one hand, physical tap solutions require cutting the cable for a few seconds, which is enough to stop critical applications, and the optical power splitting must be done very carefully so that the end nodes can continue to negotiate the link. On the other hand, logical taps such as port mirroring consume a large amount of processing capacity in the network nodes; additionally, each monitoring application needs a dedicated port on the node.

- Monolithic applications

Finally, existing solutions are designed to run on a single machine, which rules out the desirable feature of distributed processing. Monolithic applications usually require large amounts of resources such as CPU, RAM and disk to cope with high-speed networks.

Description of the invention

It is necessary to offer an alternative to the state of the art that covers the gaps found in it, particularly the lack of proposals that actually make it possible to detect as much malicious software as possible and to establish a common framework for monitoring, so that organizations do not need to install a new battery of solutions each time a new threat arises.


To this end, the present invention provides, in a first aspect, a method for detecting malicious software, the detection being performed in an anomaly detection system (ADS) that analyzes the behavior of a network and looks for deviations from a normality, said normality indicating the common behavior of the users of the network and being defined before detection.

Unlike known proposals, the method of the invention characteristically further comprises building a plurality of detection models, each adapted to different entities of the network and to different algorithms, the different algorithms implementing different detection strategies and the plurality of detection models representing said normality.

Other embodiments of the method of the first aspect of the invention are described in the appended claims 1 to 15 and in a later section containing the detailed description of various embodiments.

A second aspect of the present invention relates to a system for detecting malicious software, the detection being performed in an anomaly detection system (ADS) that analyzes the behavior of a network and looks for deviations from a normality, said normality indicating the common reality of the network and being defined before detection.

Unlike the known systems mentioned in the prior art section, the system of the second aspect of the invention characteristically comprises a probe module for monitoring the traffic of the network, connected to a controller module in charge of performing the detection. The controller module is provided with a plurality of detection models built by a compiler module, each detection model being adapted to different entities of the network and to different algorithms, the different algorithms implementing different detection strategies and the plurality of detection models representing said normality.

The system of the second aspect of the invention is adapted to implement the method of the first aspect.

Other embodiments of the system of the second aspect of the invention are described in the appended claims 16 to 22 and in a later section containing the detailed description of various embodiments.

Brief description of the drawings

The foregoing and other advantages and features will be more fully understood from the following detailed description of embodiments, with reference to the attached drawings, which are to be considered illustrative and non-limiting, in which:

Figure 1 shows the components and the interactions between them, which together define a possible architecture of the proposed system, according to an embodiment of the present invention.

Figure 2 shows the probe module, according to an embodiment of the present invention.

Figure 3 shows the monitoring module and the detector library, according to an embodiment of the present invention.

Figure 4 shows the compiler module and the compiler library, according to an embodiment of the present invention.

Figure 5 shows the logger module and the logger library, according to an embodiment of the present invention.

Figure 6 shows a possible sequence diagram between the monitoring module and the compiler module, in which both modules make use of the compiler-monitor interface, according to an embodiment of the present invention.

Figure 7 shows the flow chart of the DNS-based domain-flux detector, according to an embodiment of the present invention.

Figure 8 shows a graphical representation of clusters when suspicious traffic clustering is used, in which normal vectors are represented by "0", anomalies by "x" and clusters by circles, according to an embodiment of the present invention.

Figure 9 shows a two-feature graphical representation of the fast cluster membership algorithm, according to an embodiment of the present invention.

Figure 10 shows an example of a distributed physical deployment of the system components, according to an embodiment of the present invention.

Detailed description of various embodiments

The proposed invention relates to a hardware and software device (or a set of devices, if its components are ultimately distributed) that acts as a platform allowing much greater effectiveness and efficiency in the field of network anomaly detection.

The gain in effectiveness is obtained through the combination of two innovative approaches:

- The use of detection models on a per-user basis. Instead of creating a single model for the whole network, every entity within it is modeled.

- The use of detection models on a per-algorithm basis. Instead of having a single detection strategy, multiple detector instances run in parallel.

These two approaches can be seen as a model matrix in which N entities (E1...EN) and M algorithms (A1...AM) are considered, yielding NxM different models (M11...MNM).

The models and their respective detectors may be either state-of-the-art ADSs (which can be regarded as simple 1x1 matrices, since a single modeled entity and only one algorithm are used) or innovative detection algorithms, such as the two artificial-intelligence-based ones explained later.
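The NxM model matrix can be sketched in Python as follows. This is only an illustration under stated assumptions: the two toy algorithms (`mean_std_model`, `max_seen_model`), the entity names and the traffic histories are hypothetical and are not the detectors described in this document.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List, Sequence

# In this sketch a model is a predicate: is a new observation anomalous?
Model = Callable[[float], bool]
Algorithm = Callable[[Sequence[float]], Model]


def mean_std_model(history: Sequence[float]) -> Model:
    """Hypothetical algorithm A1: flag values more than 3 sigma from the mean."""
    mu, sigma = mean(history), pstdev(history)
    return lambda x: abs(x - mu) > 3 * sigma


def max_seen_model(history: Sequence[float]) -> Model:
    """Hypothetical algorithm A2: flag values above the largest one seen so far."""
    top = max(history)
    return lambda x: x > top


def build_model_matrix(histories: Dict[str, Sequence[float]],
                       algorithms: List[Algorithm]) -> List[List[Model]]:
    """Rows are entities (E1...EN), columns are algorithms (A1...AM)."""
    return [[algo(history) for algo in algorithms]
            for history in histories.values()]


# Hypothetical per-entity histories, e.g. flows per minute.
histories = {
    "host-a": [10, 12, 11, 9, 13],
    "host-b": [100, 110, 90, 105, 95],
    "host-c": [1, 2, 1, 1, 2],
}
matrix = build_model_matrix(histories, [mean_std_model, max_seen_model])
print(len(matrix), "x", len(matrix[0]), "models")
```

With three entities and two algorithms the matrix holds 3x2 models. Each cell is built only from the history of its own entity, so a value that is normal for one host can still be flagged for another, and different algorithms may disagree about the same observation.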

In addition, other architectural features of the proposed invention allow much greater efficiency in terms of:

- Distribution of the processing effort among several physical nodes.

- Easy integration, by architectural design, of existing monitoring-related software, such as network traffic processors.

- A single monitoring point in the network.

• Architecture components

The system components and the interaction between them are shown in Figure 1. The light gray boxes represent state-of-the-art (SoA) components, the dark gray box represents modified SoA, and the white boxes, which include the detector and compiler libraries, are innovative components within the architecture. The system components are detailed below:

- PROBE

It is the single monitoring point that provides network traffic traces to the rest of the system components.

The anomaly detection performed by multi-algorithm ADS applications is based on network traffic, specifically on looking for deviations from normality in the TCP/IP packets that cross the monitored network. To do this it is necessary (1) to capture the packets from the physical media and (2) to prepare the captured packets for the detection algorithms, which usually work with aggregated flows (a flow is defined, at least, by the transport protocol, TCP or UDP, the source and destination IP addresses and the source and destination ports). The PROBE is the component responsible for packet capture and for processing the packets into flows, as shown in Figure 2.
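The aggregation step can be sketched as follows. This is a minimal Python illustration, assuming that packets arrive as dictionaries with hypothetical keys (`proto`, `src_ip`, `src_port`, `dst_ip`, `dst_port`, `length`, `ts`) and that flows are kept unidirectional for simplicity; a real PROBE would also handle capture, timeouts and bidirectional matching.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple


class FlowKey(NamedTuple):
    """The minimum identification of a flow named in the text (5-tuple)."""
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass
class FlowStats:
    packets: int = 0
    byte_count: int = 0
    first_ts: float = float("inf")
    last_ts: float = float("-inf")


def aggregate(packets: Iterable[dict]) -> Dict[FlowKey, FlowStats]:
    """Group captured packets into flows and accumulate per-flow counters."""
    flows: Dict[FlowKey, FlowStats] = defaultdict(FlowStats)
    for p in packets:
        key = FlowKey(p["proto"], p["src_ip"], p["src_port"],
                      p["dst_ip"], p["dst_port"])
        stats = flows[key]
        stats.packets += 1
        stats.byte_count += p["length"]
        stats.first_ts = min(stats.first_ts, p["ts"])
        stats.last_ts = max(stats.last_ts, p["ts"])
    return dict(flows)
```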

PROBES can also be legacy ones, in which case it is only necessary to deploy an adapter that provides the interface between the PROBES and the MONITOR.

- MONITOR and DETECTORS LIBRARY

The MONITOR is the main controller of the system. It receives network traces from the PROBE and is responsible for (1) storing the traces if it is running in training mode and (2) invoking the DETECTORS, passing them a copy of the traces, if it is running in detection mode. The DETECTORS LIBRARY is a collection of DETECTORS, each of which implements a NADS algorithm.

Once the network traffic has been aggregated into flows, it is ready to be analyzed using a wide variety of anomaly detection algorithms. However, it should be remembered that the traffic can either be stored (in order to build the normal models) or used by the detection algorithms. Thus, some entity must (1) know whether the system is currently operating in detection mode or in storage mode and (2) forward the traffic to the storage component or to the detection entities depending on the operating mode. The proposed invention implements this by means of the MONITOR.
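The two responsibilities of the MONITOR can be sketched in a few lines of Python. The class and attribute names below are hypothetical, the storage back end is reduced to a list, and each detector is assumed to be a callable that returns a list of alerts.

```python
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    TRAINING = "training"    # store flows so that models can be built later
    DETECTION = "detection"  # hand flows to the detectors


class Monitor:
    def __init__(self, mode: Mode, store: List[dict],
                 detectors: List[Callable[[dict], List[str]]]):
        self.mode = mode
        self.store = store          # hypothetical storage component
        self.detectors = detectors  # hypothetical detector callables

    def on_flow(self, flow: dict) -> List[str]:
        """Route one flow according to the current operating mode."""
        if self.mode is Mode.TRAINING:
            self.store.append(flow)
            return []
        alerts: List[str] = []
        for detect in self.detectors:
            alerts.extend(detect(dict(flow)))  # each detector gets its own copy
        return alerts
```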

With regard to detection, since the algorithm portfolio cannot be complete (it is impossible to implement all existing algorithms, and some algorithms have not yet been discovered), it would be very convenient to have a mechanism for adjusting the detection system (adding, removing or modifying algorithms). The system architecture provides such a pluggable mechanism by means of the DETECTORS LIBRARY. The DETECTORS LIBRARY is a discrete collection of detection algorithms, the current collection, which has a defined interface with the MONITOR. A new detection algorithm, that is, a DETECTOR, can be included in the collection if it respects the interface defined with the MONITOR. Not all the DETECTORS within the library have to be used at all times, and it is possible to list the desired subset of algorithms for the MONITOR. This can be done, for example, by means of configuration files.
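One possible shape for such a pluggable library is a registry of classes that share a common interface, with the enabled subset taken from configuration. The sketch below is an assumption made for illustration: the `Detector` base class, the `register` decorator, the two toy detectors and the `enabled_detectors` configuration key are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class Detector(ABC):
    """Hypothetical interface every DETECTOR must respect."""
    name: str

    @abstractmethod
    def process(self, flow: dict) -> List[str]:
        """Return the alerts raised by this flow (possibly none)."""


REGISTRY: Dict[str, Type[Detector]] = {}


def register(cls: Type[Detector]) -> Type[Detector]:
    """Add a detector class to the library under its declared name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class BigFlowDetector(Detector):
    name = "big_flow"

    def process(self, flow: dict) -> List[str]:
        return ["big flow"] if flow.get("bytes", 0) > 50_000 else []


@register
class SynOnlyDetector(Detector):
    name = "syn_only"

    def process(self, flow: dict) -> List[str]:
        return ["SYN-only flow"] if flow.get("tcp_flags") == {"SYN"} else []


def load_enabled(config: dict) -> List[Detector]:
    """Instantiate only the detectors listed in the configuration."""
    return [REGISTRY[name]() for name in config["enabled_detectors"]]
```

Adding an algorithm then amounts to registering one more class, and removing it amounts to dropping its name from the configuration.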

The MONITOR then simply forwards, as stated previously and through the defined interface, the received flows to all the DETECTORS within the DETECTORS LIBRARY. Finally, the DETECTORS compare the incoming flows with the stored normal behavior models.

The DETECTORS LIBRARY is originally conceived as precisely that: a library that is natively incorporated into the MONITOR software (static or dynamic libraries in C/C++, JAR files in Java, etc.). The purpose of this decision is to improve the delivery of flows to the DETECTORS, but the developer can instead implement the library as a set of processes that run independently of the MONITOR, for example; the DETECTORS can even be placed on separate physical devices.

- COMPILER and COMPILERS LIBRARY

The COMPILER is responsible for invoking the COMPILERS when the system is running in training mode. It runs once the trace storage phase has finished. The COMPILERS LIBRARY is a collection of COMPILERS, each of which implements a behavior modeling mechanism.

The COMPILER is the module of the multi-algorithm ADS architecture involved in generating the normal behavior models and is therefore the central component of the proposed invention. As is known, anomaly detection algorithms generally compare the observed traffic with a normal model, looking for deviations between the two. Since the detection algorithms can be very different, their reference models will also be very different, even though the data used to generate them are the same (the captured and aggregated traffic). The COMPILER is responsible for driving this differentiated model generation through the COMPILERS LIBRARY, which contains one model generator, that is, a COMPILER, per detection algorithm in the system. The COMPILERS within the library can be enabled or disabled.

However, the COMPILER is an optional component, since some DETECTORS may not need a customized model (a default model is always used, regardless of the specific behavior of the users), the models may have been generated by other means, or the DETECTOR may not need a model at all.

- LOGGER and LOGGERS LIBRARY

The LOGGER receives alerts from the MONITOR and invokes the LOGGERS. The LOGGERS LIBRARY is a collection of LOGGERS, each of which implements a logging facility.

Finally, the collaborative alarms generated can be logged, a task for which the LOGGER component is introduced into the architecture (once again, an exhaustive logging portfolio is unfeasible, so a LOGGERS LIBRARY is used to dynamically add or remove logging facilities: files, databases, syslog, etc.). The existing LOGGERS can be enabled or disabled in the configuration.

• Architecture interfaces

The interfaces between the system components are described below:

- PROBE - MONITOR

This interface defines how the flows generated in the PROBE component are sent to the MONITOR component. The communications can basically implement two different schemes, depending on how proactive the PROBE is:

1. Proactive exchange: the PROBE sends the flows as soon as they are ready to be shared.

2. On-demand exchange: the flows are sent when the MONITOR requests them.

The use of one scheme or the other has an important impact on real-time performance. On the one hand, it is obvious that on-demand exchange is not compatible with real time, since data forwarding depends on the availability of the MONITOR. On the other hand, it might be thought that proactive exchange is always compatible with real time, but this may not be true if the MONITOR component in turn buffers the proactively received data.

In any case, the format of the exchanged flows must be established in order to guarantee correct interaction between the PROBES and the MONITOR. If legacy data sources are used, an adapter must be implemented in order to guarantee their availability. A non-exhaustive list of fields that can be exchanged between the PROBES and the MONITOR is the following:

- Layer 4 protocol (TCP, UDP, etc.).

- Timestamps of the first and last packets.

- Source and destination IP addresses of the flow.

- Source and destination TCP or UDP ports of the flow.

- Number of packets sent and received.

- Number of bytes sent and received.

- TCP state or, at least, the TCP flags involved in the flow.

- Certain application header data, such as DNS, HTTP, SMTP, SIP and others.
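The field list above could map onto an exchange record such as the one sketched below. The source does not prescribe a wire format; the JSON encoding, the `FlowRecord` name and the field names are assumptions chosen only to make the idea concrete.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class FlowRecord:
    """Hypothetical PROBE-to-MONITOR exchange record."""
    l4_proto: str                 # layer 4 protocol (TCP, UDP, ...)
    first_ts: float               # timestamp of the first packet
    last_ts: float                # timestamp of the last packet
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    pkts_sent: int
    pkts_recv: int
    bytes_sent: int
    bytes_recv: int
    tcp_flags: List[str] = field(default_factory=list)
    app_headers: Dict[str, str] = field(default_factory=dict)  # e.g. DNS, HTTP

    def to_wire(self) -> str:
        """Serialize to one JSON line (an assumed format)."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_wire(cls, line: str) -> "FlowRecord":
        """Rebuild a record from the JSON line produced by to_wire()."""
        return cls(**json.loads(line))
```

An adapter for a legacy PROBE would then only need to translate the legacy output into this record.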

- MONITOR - COMPILER

As indicated in the specific MONITOR and COMPILER sections, the communication between these two components of the architecture of the proposed invention takes place by means of databases. There are two reasons for this:

1. The large number of captured flows does not allow simple buffering.

2. Most anomaly detection algorithms need wide time windows to build their normal behavior models. Naturally, real-time model building is possible for some anomaly detection algorithms but, since this task is not critical, it will always be performed in non-real-time mode.

The database implemented must allow the MONITOR to store the built flows in the exchange format agreed between the PROBE and the MONITOR (see the previous section). These stored flows can in turn be queried by the COMPILER, which will generate the models and store them in another database.

- MONITOR - LOGGER

The alarms generated by the MONITOR are sent to the LOGGER component in a specific format, for which the candidate fields could be the following:

- Identifier of the attacker.

- Identifier of the attacked host.

- Protocol involved.

- Timestamp of the alert.

- Type of attack/type of anomaly.

- DETECTOR identifier.

- Feedback information, such as the original data that caused the alarm to be raised in the DETECTOR, an extended list of targets (if there is more than one), etc.
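As a rough sketch, the candidate fields and the fan-out to the enabled logging facilities might look as follows in Python. The `Alert` and `Logger` classes, the field names and the one-line text format are hypothetical; real facilities (files, databases, syslog) would replace the in-memory stream used here.

```python
import io
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Alert:
    """Hypothetical MONITOR-to-LOGGER alert record."""
    attacker_id: str
    victim_id: str
    protocol: str
    timestamp: float
    anomaly_type: str
    detector_id: str
    feedback: Dict[str, object] = field(default_factory=dict)


def make_stream_logger(stream) -> Callable[[Alert], None]:
    """Build a logging facility that writes one text line per alert."""
    def log(alert: Alert) -> None:
        stream.write(
            f"{alert.timestamp:.0f} {alert.detector_id} {alert.anomaly_type} "
            f"{alert.attacker_id}->{alert.victim_id}\n"
        )
    return log


class Logger:
    """Forwards each alert to the facilities enabled in the configuration."""

    def __init__(self, facilities: Dict[str, Callable[[Alert], None]],
                 enabled: List[str]):
        self._targets = [facilities[name] for name in enabled]

    def dispatch(self, alert: Alert) -> None:
        for log in self._targets:
            log(alert)
```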

• Detectors

The system provides for the use of advanced algorithms to improve the effectiveness of anomaly detection systems. These advanced algorithms come from the field of artificial intelligence, with neural networks and clustering algorithms being the main candidates for use.

Neural networks and other supervised machine learning algorithms [10] have repeatedly demonstrated their power in classification problems, owing to their ability to generalize solutions from a few training examples, their adaptability and their low false positive rate.

However, it is not always possible to use supervised algorithms, especially when there is no expert who can label or classify the training examples beforehand. When this happens, unsupervised algorithms [11] such as clustering are needed. Clustering is very useful for the automatic generation of traffic classes, behaviors, etc.

A couple of DETECTORS that form part of the DETECTORS LIBRARY are detailed here. These DETECTORS are examples of the advanced detection algorithms envisaged for the proposed multi-algorithm ADS. Other simple algorithms, such as volumetric traffic monitoring, absolute counts of packets or flows between nodes, and periodic flow detection, are not described since they are not innovative algorithms, although they can perfectly well be developed on top of the proposed invention.

Additional DETECTORS not documented herein may be added to the multi-algorithm ADS in the form of extensions of the present patent.

- DNS-based domain flux detector

This DETECTOR attempts to detect domain flux activities [12] in the monitored DNS traffic, which may indicate the presence of a bot [7].

The bots within a botnet usually issue DNS queries in order to discover their command and control (C&C) server; this allows the bot owners to change the actual location (the IP address) of the server without reconfiguring their bots. While fixed FQDNs are easy to detect and filter, bots that implement the domain flux technique dynamically generate a large number of FQDNs for the C&C server in order to synchronize with their owner. These dynamically generated FQDNs can be based on the current timestamp, or they can be a pseudorandomly generated set in which only one of the possibilities is valid. In any case, this technique produces many NX_DOMAIN (and other) responses when the DNS server is queried. This detector analyzes these anomalous responses.

Therefore, the DNS responses are analyzed and a set of features is extracted within a time interval. An example feature set is the following:

- Number of NX_DOMAIN responses.

- Number of FORMAT_ERROR responses.

- Number of REFUSED responses.

- Number of SERVER_FAILURE responses.

- Number of different FQDNs queried.

- Number of one-level FQDNs.

- Number of two-level FQDNs.

- Number of FQDNs with three or more levels.

- Number of suspicious top-level domains (TLDs).

- Number of second-level domains with a length of less than 6.

- Number of second-level domains with a length greater than 5 and less than 21.

- Number of second-level domains with a length greater than 20.

- Number of second-level domains with a proportion of vowels below 0.3%.

- Number of second-level domains with a proportion of digits above 0.5%.
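Part of this feature set can be computed as in the following Python sketch. It covers only the response-code counts, the number of distinct FQDNs, the level counts and the length buckets; the suspicious-TLD, vowel and digit features are left out for brevity. The input layout, the feature names and the choice to count structural features once per distinct FQDN are assumptions, not details given in the source.

```python
from typing import Dict, Iterable, Tuple

RCODES = ("NX_DOMAIN", "FORMAT_ERROR", "REFUSED", "SERVER_FAILURE")


def dns_features(responses: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """responses: (fqdn, rcode) pairs observed within one time interval."""
    feats = {rcode: 0 for rcode in RCODES}
    feats.update(distinct_fqdns=0, levels_1=0, levels_2=0, levels_3plus=0,
                 sld_len_lt6=0, sld_len_6_20=0, sld_len_gt20=0)
    seen = set()
    for fqdn, rcode in responses:
        if rcode in RCODES:
            feats[rcode] += 1           # error responses are counted every time
        name = fqdn.rstrip(".").lower()
        if name in seen:
            continue                    # assumption: structure counted once per FQDN
        seen.add(name)
        labels = name.split(".")
        if len(labels) == 1:
            feats["levels_1"] += 1
        elif len(labels) == 2:
            feats["levels_2"] += 1
        else:
            feats["levels_3plus"] += 1
        if len(labels) >= 2:
            sld_len = len(labels[-2])   # label immediately left of the TLD
            if sld_len < 6:
                feats["sld_len_lt6"] += 1
            elif sld_len <= 20:
                feats["sld_len_6_20"] += 1
            else:
                feats["sld_len_gt20"] += 1
    feats["distinct_fqdns"] = len(seen)
    return feats
```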

These features make up a feature vector that is evaluated using a neural network. This neural network is trained beforehand (supervised training) using example vectors that contain values for all the above features plus an additional field, a label. This label indicates whether or not an alarm should be raised when a similar vector is found in the traffic.

The DETECTOR therefore uses a single, previously built default model: the specific neural network configuration that results from the training phase.

A flow chart illustrating the proposed DETECTOR is provided in Figure 7; some of the variables it uses are described in the following table:

Variable name: Description

f: Flow

sid: Subscriber identifier

m: Model for the subscriber identifier (currently a default one)

td: Is the current flow related to a top domain?

ts: Timestamp of the current flow

tw: Start timestamp of the current time window

tw': Updated tw

s: Size of the time window

ctw: Is the current flow within the current time window?

a: Alert

v: Feature vector obtained from the current flow

gv: Global accumulated vector for the current time window

gv': Updated gv
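Figure 7 is not reproduced in this text, so the control flow below is only a guess reconstructed from the variable table: flows are accumulated into gv while they fall inside the current window, and when a flow arrives after the window has expired the accumulated vector is evaluated against the model, a possible alert is produced, and the window is restarted. The `WindowedDetector` class and the `evaluate` callable standing in for the model m are hypothetical, and the td check is omitted.

```python
from typing import Callable, Dict, List, Optional

Vector = List[int]


class WindowedDetector:
    def __init__(self, size: float, evaluate: Callable[[Vector], bool], dims: int):
        self.s = size                     # s: size of the time window
        self.evaluate = evaluate          # stands in for model m; True means "alert"
        self.dims = dims                  # length of the feature vectors
        self.tw: Dict[str, float] = {}    # tw, kept per subscriber (sid)
        self.gv: Dict[str, Vector] = {}   # gv, kept per subscriber (sid)

    def on_flow(self, sid: str, ts: float, v: Vector) -> Optional[str]:
        """Process the feature vector v of one flow; return an alert or None."""
        a = None
        tw = self.tw.setdefault(sid, ts)
        gv = self.gv.setdefault(sid, [0] * self.dims)
        ctw = ts < tw + self.s            # is the flow inside the current window?
        if not ctw:
            if self.evaluate(gv):         # evaluate the window that just closed
                a = f"anomaly for {sid} in window starting at {tw}"
            tw = ts                       # tw': the new window starts at this flow
            gv = [0] * self.dims
        self.tw[sid] = tw
        self.gv[sid] = [g + x for g, x in zip(gv, v)]  # gv' = gv + v
        return a
```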

• Suspicious traffic clustering

This DETECTOR is based on building a customized model of the values of certain features within a time interval. Unlike the previous DETECTOR, no expert is used to train it; that is, an unsupervised algorithm is used. Specifically, an expectation-maximization (EM) clustering algorithm [13] defines the traffic classes in which each user is commonly involved. Then, in a second stage, anomalies are detected if a traffic pattern different from the normal one appears. This process is shown graphically in the simplified two-feature picture of Figure 8, in which the circles represent data clusters, an "o" represents a normal vector and an "x" is an anomaly.

However, the key point is not the algorithm but the features of the vectors that feed the EM implementation. Features such as the number of packets/bytes/flows sent and received appear to be valid candidates, but they are not statistically stable over time. Stable features are needed instead; the best ones are precisely those related to abnormal behaviors. This DETECTOR considers at least the following (counts over a time interval):

- Number of flows whose destination is located in an unusual country.

- Number of flows whose destination is on a blacklist.

- Number of flows that use an unusual protocol.

- Number of suspicious TCP flows (SYN-only flows, SYN+ACK-only flows, FIN-only flows, FIN+ACK-only flows).

- Number of flows more than 10 KB long.

- Number of flows more than 50 KB long.

- Number of very small flows (less than half the MTU).

- Number of flows not generated by the modeled entity.

In the normal case, most of the above features should take values that are different from zero but very close to it. Such small non-zero values are not relevant, since a user may, for example, access one or two legitimate servers located in an unusual country; what matters is finding abnormal values.
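The eight counts listed above could be computed per entity and per interval roughly as in the following sketch. It is only an illustration under stated assumptions: the `Flow` record, its field names, the 1500-byte MTU and the reading of 1 KB as 1024 bytes are all hypothetical choices that the description does not fix, and `tcp_flags` is assumed to hold the bitwise OR of all TCP flags seen in the flow.

```python
from dataclasses import dataclass

MTU = 1500  # assumption: Ethernet MTU in bytes
KB = 1024   # assumption: 1 KB read as 1024 bytes

# TCP header flag bits
FIN, SYN, ACK = 0x01, 0x02, 0x10
# Flag combinations regarded as suspicious when they are the only flags in a flow
SUSPICIOUS_FLAG_SETS = {SYN, SYN | ACK, FIN, FIN | ACK}


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; field names are illustrative only."""
    src: str
    dst: str
    dst_country: str
    protocol: str
    n_bytes: int
    tcp_flags: int = 0  # OR of all TCP flags observed in the flow


def feature_vector(flows, entity, unusual_countries, unusual_protocols, blacklist):
    """Count, over one interval, the eight features listed in the text."""
    counts = [0] * 8
    for f in flows:
        counts[0] += f.dst_country in unusual_countries
        counts[1] += f.dst in blacklist
        counts[2] += f.protocol in unusual_protocols
        counts[3] += f.protocol == "tcp" and f.tcp_flags in SUSPICIOUS_FLAG_SETS
        counts[4] += f.n_bytes > 10 * KB
        counts[5] += f.n_bytes > 50 * KB
        counts[6] += f.n_bytes < MTU // 2
        counts[7] += f.src != entity  # flow not generated by the modeled entity
    return counts
```

One such vector per modeled entity and interval would then be passed to the clustering stage.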

Unusual countries and unusual protocols are defined by computing prior statistics on the most accessed countries and the most used protocols during a given time interval.
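As a minimal sketch of such a prior statistic, the set of "usual" countries or protocols can be taken as the most frequent values that together cover a chosen share of the observations, with everything else treated as unusual. The 95 % coverage threshold and the function names are assumptions made for illustration; the description does not state how the cut-off is chosen.

```python
from collections import Counter


def usual_values(observed, coverage=0.95):
    """Return the most frequent values that together cover `coverage` of the observations.

    `coverage` is a hypothetical tuning parameter, not a value from the text.
    """
    counts = Counter(observed)
    total = sum(counts.values())
    usual, covered = set(), 0
    for value, n in counts.most_common():
        if covered >= coverage * total:
            break
        usual.add(value)
        covered += n
    return usual


def is_unusual(value, usual):
    """A country or protocol is unusual if it is not in the usual set."""
    return value not in usual
```

The same helper serves for both destination countries and protocols, applied to the values seen during the preliminary interval.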

The flow diagram for the suspicious traffic clustering DETECTOR is the same as for the DNS-based domain-flux DETECTOR, but with a different evaluation function. In this case the evaluation function checks whether the accumulated vector matches some cluster within the model (normal behavior) or not (an anomaly). A fast cluster membership algorithm is proposed for this purpose.

Cluster membership functions can be complex, especially when large feature vectors are used. Here a fast evaluation function is presented, based on a simplification of the clusters obtained: for each cluster and for each dimension or feature, its limits are calculated, that is, the minimum and maximum values that the feature can take within that cluster. The cluster membership function is then as simple as checking whether each feature of the evaluated vector lies within the limits of the same feature of the cluster. The idea can be easily seen in a two-feature space, as shown in Figure 9.
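The simplified membership test described above can be sketched as follows. The sketch assumes that the clustering stage has already assigned each training vector to exactly one cluster; the function names and the list-of-lists representation are illustrative, not taken from the description.

```python
def cluster_limits(clusters):
    """For each cluster (a list of equal-length vectors), compute per-feature (min, max)."""
    return [
        [(min(column), max(column)) for column in zip(*vectors)]
        for vectors in clusters
    ]


def is_normal(vector, limits):
    """True if every feature of `vector` lies within the limits of at least one cluster."""
    return any(
        all(low <= x <= high for x, (low, high) in zip(vector, box))
        for box in limits
    )
```

Each cluster is thereby reduced to an axis-aligned box, so evaluating a vector costs at most one pair of comparisons per feature and per cluster. The price of this simplification is that a point in a corner of the box is accepted even if it lies far from the actual cluster members.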

■ Advantages of the invention

1. Effectiveness advantages

- Entity-based and algorithm-based behavior modeling

The proposed invention allows entity-based and algorithm-based behavior modeling, addressing statistically rare and sophisticated attacks that would not be detected on a network-wide basis. However, this multi-algorithm ADS also allows global behavior models to be defined.

- Advanced detection algorithms

The algorithms detailed above are based on complex artificial intelligence and machine learning algorithms that provide flexible, self-adapting, self-learning and accurate mechanisms for detecting both general anomalies and specific anomalous traffic behaviors, such as domain flux.

2. Efficiency advantages

- A common framework for developing monitoring applications

The architecture presented in this document is intended as a reference model for implementing collaborative detection systems, specifically collaborative anomaly detection applications. As can be seen in the prior art section, many attempts to define this class of applications have led to a confusion of concepts that makes it hard to tell whether an architecture has emerged from an algorithm or vice versa and, worse, makes it difficult to integrate both new algorithms and new architecture requirements. The proposed system gives developers a common framework for designing new anomaly detection applications and also for adding those new applications to existing ones in an aseptic way. This is a clear advantage for network administrators (mainly telecommunications operators) who want to add a new detection strategy or algorithm to their portfolio of detection systems without interfering with the existing ones.

- Processing distribution

All the components of this architecture can easily be distributed among several physical devices; the only requirement is that the NADS developer implements the interfaces between the modules using one of the many communication technologies available, such as sockets, web services, RPC, RMI, etc. Figure 10 shows an example of such a deployment distributed across four different servers.

Processing distribution also helps the scalability of the developed system, since a processing-intensive algorithm can be accommodated, for example, by adding a server dedicated to that detection algorithm and implementing a TCP/IP-based interface with the MONITOR.

- A single monitoring point for a wide variety of applications

The reader should bear in mind that monitoring points are limited by technical constraints: fiber taps weaken the optical signal when it is split; port mirroring on routers or switches consumes considerable resources that are taken away from traffic processing; or the business simply cannot be stopped by cutting the line, since in the end either a physical tap or a logical mirror must be installed. As can be seen, the multi-algorithm ADS needs only a single monitoring point, common to all the deployed applications.

- Integration of legacy probes

The PROBE component does not need to be developed from scratch. Many other widely known flow generation systems, such as NetFlow [8], can be used. The only requirement for such an integration is to include a normalization stage.
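A normalization stage of this kind might look like the sketch below, which maps one already-parsed NetFlow version 5 record onto an internal flow format. Everything here is an assumption made for illustration: the `NormalizedFlow` type is hypothetical, the input is assumed to be a dictionary keyed by NetFlow v5 field names (`srcaddr`, `dstaddr`, `dPkts`, `dOctets`, `first`, `last`, `prot`, `tcp_flags`) with addresses as 32-bit integers, and NetFlow v5 records are unidirectional, so pairing them into sent and received counters is left out.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedFlow:
    """Hypothetical internal flow format; not defined in the text."""
    protocol: int      # layer 4 protocol number
    first_ms: int      # time stamp of the first packet
    last_ms: int       # time stamp of the last packet
    src_ip: str
    dst_ip: str
    packets: int
    n_bytes: int
    tcp_flags: int


def normalize_netflow_v5(record: dict) -> NormalizedFlow:
    """Map a parsed NetFlow v5 record (assumed dict layout) to the internal format."""
    return NormalizedFlow(
        protocol=int(record["prot"]),
        first_ms=int(record["first"]),
        last_ms=int(record["last"]),
        src_ip=str(ipaddress.ip_address(record["srcaddr"])),
        dst_ip=str(ipaddress.ip_address(record["dstaddr"])),
        packets=int(record["dPkts"]),
        n_bytes=int(record["dOctets"]),
        tcp_flags=int(record["tcp_flags"]),
    )
```

Supporting a different legacy probe would then only require another function returning the same `NormalizedFlow` type, leaving the detectors untouched.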

A person skilled in the art could introduce changes and modifications in the embodiments described without departing from the scope of the invention as defined in the appended claims.

Acronyms

ADS Anomaly Detection System

AI Artificial Intelligence

CPU Central Processing Unit

C&C Command and Control

DNS Domain Name System

DoS Denial of Service

EM Expectation-Maximization

FQDN Fully Qualified Domain Name

HTTP Hypertext Transfer Protocol

IP Internet Protocol

IDS Intrusion Detection System

ISP Internet Service Provider

KB Kilobytes

MTU Maximum Transmission Unit

NADS Network Anomaly Detection System

NIDS Network Intrusion Detection System

RAM Random Access Memory

SIEM Security Information and Event Management

SMTP Simple Mail Transfer Protocol

SOA Service Oriented Architecture

TCP Transmission Control Protocol

TLD Top-Level Domain

UDP User Datagram Protocol

Bibliography

[1] Snort IDS.
http://www.snort.org/

[2] Bro IDS.
http://bro-ids.org/

[3] "Spambot" on Wikipedia.
http://en.wikipedia.org/wiki/Spambot

[4] "Denial-of-service attack" on Wikipedia.
http://en.wikipedia.org/wiki/Denial-of-service_attack

[5] "Port scanner" on Wikipedia.
http://en.wikipedia.org/wiki/Port_scanner

[6] Proventia ADS by IBM.
http://www-935.ibm.com/services/uk/index.wss/offering/iss/y1026942

[7] "Botnets" on Wikipedia.
http://en.wikipedia.org/wiki/Botnets

[8] Cisco IOS NetFlow.
http://www.cisco.com/en/US/products/ps6601/products_ios_protocol_group_home.html

[9] "SEM" on Wikipedia.
http://en.wikipedia.org/wiki/Security_event_manager

[10] "Supervised learning" on Wikipedia.
http://en.wikipedia.org/wiki/Supervised_learning

[11] "Unsupervised learning" on Wikipedia.
http://en.wikipedia.org/wiki/Unsupervised_learning

[12] "Fast-flux" on Wikipedia.
http://en.wikipedia.org/wiki/Fast_flux

[13] "Expectation-maximization algorithm" on Wikipedia.
http://en.wikipedia.org/wiki/Expectation-maximization_algorithm

[14] "Anomaly detection" on Wikipedia.
http://en.wikipedia.org/wiki/Anomaly_detection

[15] "Intrusion detection system" on Wikipedia.
http://en.wikipedia.org/wiki/Intrusion-detection_system

Claims

1. Method for detecting malicious software, said detection being performed in an anomaly detection system, or ADS, that analyzes the behavior of a network and looks for deviations from a normality, said normality indicating the common behavior of users of said network and being defined before said detection, said method comprising:

- constructing a plurality of detection models for each of a plurality of different entities of said network, each of said plurality of detection models being adapted to said different entities of said network and to different algorithms, said different algorithms implementing different detection strategies and said plurality of detection models representing said normality, and

- representing said plurality of detection models in a two-dimensional matrix, one dimension of said matrix corresponding to a number of said different entities of said network and the other dimension of said matrix corresponding to a number of said different algorithms employed.

2. Method according to claim 1, comprising monitoring the traffic of said network, said traffic comprising packets that traverse said network, and preparing said packets for said different algorithms, or for said different detection strategies, by aggregating said packets into flows.

3. Method according to claim 2, comprising storing said traffic in order to construct said plurality of detection models when said ADS is operating in storage mode.

4. Method according to claim 2 or 3, comprising:

- processing said flows according to at least part of said different algorithms when said ADS is operating in detection mode, said different algorithms being defined in a detector library, said detector library allowing algorithms to be at least added, deleted or modified; and

- comparing said processed flows with at least part of said plurality of detection models.

5. Method according to claim 4 when dependent on claim 3, comprising constructing said plurality of detection models according to a compiler library containing one model generator per algorithm, each model generator defining a compiler to be used with said stored traffic when operating in said storage mode.

6. Method according to claim 5, comprising having a default model generator contained in said compiler library for constructing at least part of said plurality of detection models.

7. Method according to claim 4, 5 or 6, comprising recording alarms generated when said malicious software is detected according to said comparison between said flows processed with said at least part of said different algorithms and said at least part of said plurality of detection models.

8. Method according to claim 7, wherein said alarms contain at least part of the information in the following non-closed list: attacker identifier, attacked host identifier, protocol involved, date and time stamp of the alert, type of attack or type of anomaly, algorithm identifier and feedback information.

9. Method according to claim 8, wherein said different algorithms are defined according to neural networks, supervised machine learning algorithms, clustering algorithms and/or simple algorithms from the following non-closed list: volumetric traffic monitoring, absolute counts of packets or flows between nodes and periodic detection of flows.

10. Method according to claim 9, comprising implementing a given algorithm according to a domain-flux detector based on the Domain Name System, or DNS, said given algorithm detecting domain-flux activities in monitored DNS traffic by analyzing DNS responses, said DNS response analysis comprising:

- extracting a plurality of features from said DNS responses;

- constructing a feature vector with said plurality of features;

- evaluating said feature vector using a neural network, said neural network having been previously trained with example vectors; and

- providing, by means of said neural network, a label indicating the advisability of raising an alarm according to said evaluation.

11. Method according to claim 10, wherein at least part of said plurality of features are contained in the following non-closed list: number of NX_DOMAIN responses, number of FORMAT_ERROR responses, number of REFUSED responses, number of SERVER_FAILURE responses, number of different FQDNs queried, number of one-level FQDNs, number of two-level FQDNs, number of FQDNs of three levels or more, number of suspicious top-level domains, number of second-level domains with length less than 6, number of second-level domains with length greater than 5 and less than 21, number of second-level domains with length greater than 20, number of second-level domains with number of vowels less than 0.3% and number of second-level domains with number of digits greater than 0.5%.

12. Method according to claim 8, comprising implementing a specific algorithm according to an expectation-maximization clustering algorithm, wherein said specific algorithm identifies traffic classes, groups said traffic classes into clusters and detects an anomaly if a traffic pattern appears that does not belong to any of said clusters.

13. Method according to claim 12, comprising feeding said specific algorithm with a feature vector of said flows, said features being stable over time and being contained in the following non-closed list: number of flows destined for an unusual country, number of flows whose destination is blacklisted, number of flows using an unusual protocol, number of suspicious TCP flows, number of flows more than 10 KB in length, number of flows more than 50 KB in length, number of flows smaller than half the maximum transmission unit and number of flows not generated by a modeled entity, wherein said suspicious TCP flows comprise SYN-only flows, SYN+ACK-only flows, FIN-only flows and FIN+ACK-only flows.

14. Method according to claim 13, comprising calculating the limits of each of said clusters or of each feature of said feature vector, and detecting an anomaly if a value of a feature of each feature vector is outside the limits of the corresponding cluster or corresponding feature of said feature vector.

15. System for detecting malicious software, said detection being performed in an anomaly detection system, or ADS, that analyzes the behavior of a network and looks for deviations from a normality, said normality indicating the common behavior of the users of said network and being defined before said detection, said system comprising a probe module for monitoring traffic of said network, connected to a controller module responsible for performing said detection, wherein said controller module is provided with a plurality of detection models constructed by means of a compiler module for each of a plurality of different entities of said network, each of said plurality of detection models being adapted to said different entities of said network and to different algorithms, said different algorithms implementing different detection strategies and said plurality of detection models representing said normality.

16. System according to claim 15, wherein a first interface between said probe module and said controller module allows traffic traces to be sent in the form of flows from said probe module to said controller module each time said flows are ready to be shared or when said controller module requests said flows.

17. System according to claim 16, wherein said probe module adapts said traffic traces to flows and makes available in said flows at least part of the following fields: layer 4 protocol, date and time stamps of the first and last packet, source and destination IP addresses of the flow, number of packets sent and received, number of bytes sent and received, TCP flags involved in the flow and application header data.

18. System according to claim 16 or 17, wherein said controller module is provided with a flow database in which at least said flows are stored and said compiler module is provided with a model database in which at least said plurality of detection models is stored.

19. System according to claim 18, wherein a second interface between said controller module and said compiler module is implemented in order to allow communication between said controller module and said flow database, between said flow database and said compiler module and between said compiler module and said model database.

20. System according to any of the preceding claims 15 to 19, wherein a register module connected to said controller module is provided to record alarms generated by said controller module when said malicious software is detected, said controller module and said register module communicating by means of a third interface.

21. System according to claim 20, wherein said probe module, said controller module, said compiler module and/or said register module are distributed across different physical servers or devices, said first interface, said second interface and/or said third interface using at least one of the communication technologies of the following non-closed list: sockets, web services, RPC commands and remote method invocation.