FR3095879A1 - DESIGN PROCESS OF A SIGNATURE GENERATOR OF AN ENTITY PERFORMING ACTIONS IN A COMPUTER ARCHITECTURE, ABNORMAL BEHAVIOR DETECTION PROCESS, COMPUTER PROGRAM AND CORRESPONDING SYSTEM

Info

Publication number: FR3095879A1
Application number: FR1904728A
Authority: FR
Inventors: Adrien BESSE; Loris BOUZONNET; Guillaume Morin; Nicolas WINCKLER
Original assignee: Bull SA
Current assignee: Bull SA
Priority date: 2019-05-06
Filing date: 2019-05-06
Publication date: 2020-11-13
Anticipated expiration: 2039-05-06
Also published as: EP3736743A1; FR3095879B1

Abstract

This method comprises: - obtaining a trainable machine (202) comprising at least: an upstream neural network (204A) intended to provide a signature (SG) of an entity (E) from a set of statistical characteristics (V) of dated actions of this entity (E) in a computer infrastructure (100), and a classifier (204B) intended to provide an entity prediction (E*) from the signature (SG) provided by the upstream neural network (204A); - obtaining training data comprising sets of statistical characteristics (V) of dated actions, each associated with an entity (E) having carried out these actions in the computer infrastructure; and - supervised training of the trainable machine (202) from the training data so that the trainable machine (202) correctly predicts the entities (E) associated with the sets of statistical characteristics (V) of the training data. According to the invention, the supervised training is further carried out so that the trainable machine (202) groups the signatures (SG) of each of the entities (E) associated with the sets of statistical characteristics (V) of the training data around a center associated with this entity (E), and the method further comprises providing the signature generator comprising the upstream neural network (204A) after training. Figure for the abstract: Fig. 2

Description

METHOD FOR DESIGNING A SIGNATURE GENERATOR OF AN ENTITY PERFORMING ACTIONS IN A COMPUTER ARCHITECTURE, METHOD FOR DETECTING ABNORMAL BEHAVIOR, CORRESPONDING COMPUTER PROGRAM AND SYSTEM

The present invention relates to a method for designing a signature generator of an entity performing actions in a computer architecture, a method for detecting abnormal behavior, and a corresponding computer program and system.

The invention applies more particularly to a method for designing a signature generator intended to provide a signature of an entity from a set of statistical characteristics of dated actions of this entity in a computer infrastructure, the method comprising:

- obtaining a trainable machine comprising at least:

-- an upstream neural network intended to provide a signature of an entity from a set of statistical characteristics of dated actions of this entity in a computer infrastructure, and

-- a classifier intended to provide an entity prediction from the signature provided by the upstream neural network;

- obtaining training data comprising sets of statistical characteristics of dated actions, each associated with an entity having carried out these actions in the computer infrastructure; and

- supervised training of the trainable machine from the training data so that the trainable machine correctly predicts the entities associated with the sets of statistical characteristics of the training data.

The article by Tuor et al. entitled "Predicting User Roles from Computer Logs Using Recurrent Neural Networks", published in "Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)", describes a method for designing an entity role predictor intended to provide a prediction of the role of an entity from a set of statistical characteristics of dated actions of this entity in a computer infrastructure, the method comprising:

- obtaining a trainable machine;

- obtaining training data comprising sets of statistical characteristics of dated actions of entities in the computer infrastructure, each set of statistical characteristics being associated with a role of the entity; and

- supervised training of the trainable machine from the training data so that the trainable machine correctly predicts the roles associated with the sets of characteristics of the training data.

Being able to predict the role of an entity makes it possible to characterize the usual behavior of that entity. However, this characterization is very coarse. Several entities can have the same role, and even if a very large number of roles is provided, this number will always remain very small compared with the range of possible behaviors. Thus, knowing the role of an entity does not make it possible to define its usual behavior precisely, in particular in order to determine when the behavior of the entity deviates significantly from this usual behavior. In other words, role classification is too weak a constraint for characterizing behaviors. In particular, it is impossible to discriminate between behaviors associated with the same role. Yet detecting such a deviation can make it possible to detect the presence of a threat in the computer architecture, such as the presence of malicious software or reprehensible behavior by a user, such as industrial espionage.

It may thus be desirable to provide a characterization of the behavior of an entity that makes it possible to overcome at least some of the aforementioned problems and constraints.

The subject of the invention is therefore a method for designing a signature generator intended to provide a signature of an entity from a set of statistical characteristics of dated actions of this entity in a computer infrastructure, the method comprising:

- supervised training of the trainable machine from the training data so that the trainable machine correctly predicts the entities associated with the sets of statistical characteristics of the training data;

the method being characterized in that the supervised training is further carried out so that the trainable machine groups the signatures of each of the entities associated with the sets of statistical characteristics of the training data around a center associated with this entity,

and in that it further comprises:

- providing the signature generator comprising the upstream neural network after training.

Thus, thanks to the invention, the signature generator is able to generate a signature that characterizes the behavior of the entity very finely, which makes it possible to detect even a slight behavioral deviation. In addition, thanks to the training on the grouping of signatures, the signatures of the different entities are sufficiently spaced from one another to minimize the risk of confusing a signature of an entity behaving unusually with a signature associated with the usual behavior of another entity, and therefore of failing to detect the unusual behavior of the first entity.

Optionally, the center is a barycenter, for example an isobarycenter, of the signatures of the entity with which this center is associated.
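As an illustration of the isobarycenter option, the following minimal sketch computes, for each entity, the equal-weight mean of its signatures. The function name `isobarycenters`, the representation of signatures as lists of floats and the sample values are assumptions made for this example only; they do not come from the description.

```python
from collections import defaultdict


def isobarycenters(signatures, entities):
    """Return, for each entity, the equal-weight mean of its signature vectors.

    signatures: list of signature vectors (lists of floats of equal length).
    entities:   list of entity labels, one per signature.
    """
    sums = {}
    counts = defaultdict(int)
    for sg, entity in zip(signatures, entities):
        if entity not in sums:
            sums[entity] = [0.0] * len(sg)
        # Accumulate the signature component by component.
        sums[entity] = [s + x for s, x in zip(sums[entity], sg)]
        counts[entity] += 1
    return {e: [s / counts[e] for s in sums[e]] for e in sums}


# Hypothetical two-dimensional signatures for two entities.
centers = isobarycenters(
    [[0.0, 2.0], [2.0, 4.0], [10.0, 10.0]],
    ["E1", "E1", "E2"],
)
print(centers)
```

A weighted barycenter would follow the same pattern with a per-signature weight in place of the implicit weight of one.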

Also optionally, the supervised training uses a total loss function using a first loss function associated with the entity prediction and a second loss function associated with the grouping of the signatures.

Also optionally, the total loss function uses a linear combination of the first and second loss functions.
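One possible reading of such a total loss is sketched below. The passage above only states that two losses are combined linearly; the concrete choices made here (softmax cross-entropy for the prediction loss, half the squared Euclidean distance to the center for the grouping loss, a single coefficient `lam` on the second term) are assumptions, as are all function names.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (assumed first loss: entity prediction)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def grouping_loss(signature, center):
    """Half squared distance to the entity's center (assumed second loss: grouping)."""
    return 0.5 * sum((s - c) ** 2 for s, c in zip(signature, center))


def total_loss(batch, centers, lam):
    """Linear combination L = L_pred + lam * L_group, averaged over the batch.

    batch:   list of (signature, logits, label) triples.
    centers: dict mapping each label to its center vector.
    lam:     coefficient of the grouping term (a hyperparameter).
    """
    n = len(batch)
    l_pred = sum(cross_entropy(logits, y) for _, logits, y in batch) / n
    l_group = sum(grouping_loss(sg, centers[y]) for sg, _, y in batch) / n
    return l_pred + lam * l_group


# Toy batch: one sample of entity 0 with uniform logits over two entities.
batch = [([1.0, 0.0], [0.0, 0.0], 0)]
centers = {0: [0.0, 0.0]}
loss = total_loss(batch, centers, lam=0.1)
```

With `lam = 0` the machine is trained on entity prediction alone; increasing `lam` pulls the signatures of each entity more strongly toward that entity's center.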

Also optionally, the linear combination has a coefficient for at least one of the first and second loss functions, this coefficient being a hyperparameter, and the method comprises:

- supervised training of the trainable machine from the training data for several sets of hyperparameters;

- for each set of hyperparameters, analysis of the signatures obtained to produce a performance evaluation; and

- selection of the hyperparameters giving the best performance evaluation.
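The three steps above can be sketched as a simple search loop. The training and evaluation routines are passed in as callables, because the passage does not specify them; `toy_train` and `separation_score` below are hypothetical stand-ins (one-dimensional signatures whose spread shrinks as the coefficient `lam` grows, scored by the ratio of inter-center distance to intra-entity spread) that exist only to make the sketch runnable.

```python
def select_hyperparameters(candidates, train_fn, score_fn):
    """Train once per hyperparameter set, score the signatures, keep the best."""
    best_hp, best_score = None, None
    for hp in candidates:
        signatures = train_fn(hp)      # supervised training for this set
        score = score_fn(signatures)   # performance evaluation of the signatures
        if best_score is None or score > best_score:
            best_hp, best_score = hp, score
    return best_hp, best_score


def toy_train(hp):
    """Hypothetical stand-in for training: returns signatures per entity."""
    spread = 1.0 / (1.0 + hp["lam"])
    return {"E1": [0.0 - spread, 0.0 + spread], "E2": [5.0 - spread, 5.0 + spread]}


def separation_score(signatures_by_entity):
    """Assumed metric: smallest inter-center distance over largest intra-entity spread."""
    centers = {e: sum(s) / len(s) for e, s in signatures_by_entity.items()}
    intra = max(abs(x - centers[e]) for e, s in signatures_by_entity.items() for x in s)
    ents = list(centers)
    inter = min(abs(centers[a] - centers[b]) for i, a in enumerate(ents) for b in ents[i + 1:])
    return inter / intra


best_hp, best_score = select_hyperparameters(
    [{"lam": 0.0}, {"lam": 0.5}, {"lam": 2.0}], toy_train, separation_score
)
```

Any other evaluation of the signatures could be substituted for `separation_score` without changing the loop.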

Also optionally, the trainable machine comprises a global neural network comprising N layers of neurons, the upstream neural network comprising the first N - k layers and the classifier comprising the last k layers, k being greater than or equal to one.
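The split of a global N-layer network into an upstream part and a classifier can be illustrated as follows. Layers are modeled as plain Python callables, which is an assumption made for brevity; the additional check `k < N`, which keeps at least one layer in the upstream network, is also an assumption and is not stated in the text.

```python
def split_network(layers, k):
    """Split N layers into the first N - k (upstream network) and the last k (classifier)."""
    n = len(layers)
    if not 1 <= k < n:  # k >= 1 is from the text; k < N is assumed here
        raise ValueError("k must satisfy 1 <= k < N")
    return layers[: n - k], layers[n - k:]


def run(layers, x):
    """Apply the layers in order."""
    for layer in layers:
        x = layer(x)
    return x


# Toy "layers" acting on a scalar, for illustration only.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
upstream, classifier = split_network(layers, k=1)

signature = run(upstream, 1)            # output of the first N - k layers
prediction = run(classifier, signature)  # output of the last k layers
```

After training, only `upstream` would be kept as the signature generator.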

Also optionally, the supervised training is further carried out so that the trainable machine performs one or more other tasks.

Also proposed is a method for detecting abnormal behavior of an entity, comprising:

- supplying at least one set of statistical characteristics of dated actions of the entity in a computer infrastructure to a signature generator designed by a method according to the invention;

- the signature generator providing a signature of the entity for each set of statistical characteristics;

- comparing a datum derived from the signature or signatures provided by the signature generator with at least one reference datum; and

- detecting normal or abnormal behavior of the entity from the comparison.
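The comparison and detection steps can be sketched as below. The passage above leaves open what the derived datum, the reference datum and the comparison are; this example assumes the mean of the recent signatures, a reference center for the entity, and a Euclidean distance checked against a threshold. The function name and the threshold value are hypothetical.

```python
import math


def is_abnormal(signatures, reference_center, threshold):
    """Flag the entity when the mean of its signatures lies far from the reference.

    signatures:       signature vectors produced by the signature generator.
    reference_center: assumed reference datum (e.g. a center seen in normal operation).
    threshold:        assumed maximum distance still considered normal.
    """
    dim = len(reference_center)
    mean_sg = [sum(sg[i] for sg in signatures) / len(signatures) for i in range(dim)]
    return math.dist(mean_sg, reference_center) > threshold


normal = is_abnormal([[0.0, 0.0], [0.2, 0.0]], [0.0, 0.0], threshold=1.0)
deviating = is_abnormal([[3.0, 4.0]], [0.0, 0.0], threshold=1.0)
```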

Also proposed is a computer program downloadable from a communication network and/or recorded on a computer-readable medium and/or executable by a processor, characterized in that it comprises instructions for executing the steps of a method according to the invention when said program is executed on a computer.

Also proposed is a system for designing a signature generator intended to provide a signature of an entity from a set of statistical characteristics of dated actions of this entity in a computer infrastructure, comprising:

- a trainable machine comprising at least:

- a training device designed to train the trainable machine in a supervised manner from training data comprising sets of statistical characteristics of dated actions, each associated with an entity having carried out these actions in the computer infrastructure, so that the trainable machine correctly predicts the entities associated with the sets of statistical characteristics of the training data;

the system being characterized in that the training device is further designed to train the trainable machine so that the trainable machine groups the signatures of each of the entities associated with the sets of statistical characteristics of the training data around a center associated with this entity, and in that the signature generator comprises the upstream neural network after training.

The invention will be better understood with the aid of the following description, given solely by way of example and made with reference to the appended drawings, in which:

FIG. 1 schematically represents the general structure of a computer infrastructure, monitoring means and means for extracting statistical characteristics, according to one embodiment of the invention,

FIG. 2 schematically represents the general structure of a system for designing an entity signature generator, according to one embodiment of the invention,

FIG. 3 illustrates the successive steps of a method for designing an entity signature generator, according to one embodiment of the invention,

FIG. 4 schematically represents the general structure of a system for detecting abnormal behavior of an entity, according to one embodiment of the invention, and

FIG. 5 illustrates the successive steps of a method for detecting abnormal behavior of an entity, according to one embodiment of the invention.

FIG. 1 illustrates a computer infrastructure 100 that can comprise, for example, a large number of computers linked together in a network.

Entities (five in the example described, bearing the references E1, E2, E3, E4, E5 respectively) use this computer infrastructure 100. Each entity can be an internal element of the computer infrastructure 100 (the case of entities E1, E3, E5), for example a computer program or a computer of the computer infrastructure 100, or an external element of the computer infrastructure 100 (the case of entities E2, E4), for example a (human) user or a computing device connected to the computer infrastructure 100. Thus, each entity E1, E2, E3, E4, E5 carries out, over time, various actions in the computer infrastructure 100, these actions being represented in FIG. 1 by arrows.

Furthermore, in the example described, each entity E1, E2, E3, E4, E5 is associated with a group. Each group therefore comprises one or more of the entities E1, E2, E3, E4, E5. For example, a first group comprises the entities E1 and E2, while a second group comprises the entities E3, E4, E5.

To monitor the actions of the entities E1, E2, E3, E4, E5, monitoring means 102 are provided. The monitoring means 102 are designed to monitor the entities E1, E2, E3, E4, E5 and to provide a history H of the actions that they perform in the computer infrastructure 100 during a monitoring interval of duration W₀. The history H associates each action with the entity that performed it and with the date (in the sense of date and time) of this action. An example of a history H is shown in the following table.

Action      Entity   Date
---------   ------   ------------------------------
Action A1   E3       Day 1 (Monday), 8:56 a.m.
Action A1   E3       Day 1 (Monday), 9:37 a.m.
Action A2   E1       Day 1 (Monday), 3:13 p.m.
Action A1   E3       Day 2 (Tuesday), 10:23 a.m.
Action A2   E3       Day 2 (Tuesday), 12:01 p.m.
Action A3   E2       Day 2 (Tuesday), 6:22 p.m.
Action A2   E2       Day 2 (Tuesday), 11:06 p.m.
Action A1   E1       Day 3 (Wednesday), 11:00 a.m.
Action A3   E2       Day 3 (Wednesday), 11:04 a.m.
Action A1   E3       Day 4 (Thursday), 9:07 a.m.
Action A2   E3       Day 4 (Thursday), 12:57 p.m.
Action A3   E1       Day 4 (Thursday), 1:04 p.m.
Action A2   E3       Day 4 (Thursday), 5:38 p.m.
…           …        …

The history H can also indicate other information for certain actions. For example, an action can correspond to the sending of an email and be associated, in addition to the sending date (corresponding to the date of the action), with the subject of the email, the recipient or recipients, the sender, and the number of attachments and their names.

Means 104 for extracting statistical characteristics are also provided. The extraction means 104 are designed, for each entity E1, E2, E3, E4, E5 and for each of a plurality of time intervals, called extraction intervals, contained in the monitoring interval, to calculate a predefined number of statistical characteristics of the actions of this entity that are listed in the history H and occurred during the extraction interval considered. Thus, the extraction means 104 provide, for each entity E1, E2, E3, E4, E5, as many sets of statistical characteristics as there are extraction intervals.

The statistical characteristics are, for example, calculated from respective predefined calculation rules.

In FIG. 1, the reference V represents a set of statistical characteristics provided by the extraction means 104.

Each set of statistical characteristics is, for example, in the form of a vector whose components are the respective statistical characteristics.

Preferably, the extraction intervals do not overlap, are adjacent, have the same duration W and are the same for all the entities E1, E2, E3, E4, E5. In the example described, the duration W₀ of the monitoring interval is twelve days, numbered from 1 to 12 below, and the duration W of the extraction intervals is one day. Thus, in the example described, the extraction means 104 provide, for each of the entities E1, E2, E3, E4, E5, twelve sets of statistical characteristics, corresponding respectively to the twelve days of the monitoring interval.
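To make the extraction step concrete, the sketch below turns the example history H into one feature vector per entity and per one-day extraction interval. The chosen characteristic (the number of occurrences of each action A1, A2, A3 during the day) is an assumption picked for simplicity, as are the tuple layout `(action, entity, day)` and the function name; times of day are dropped because they are not needed for this particular characteristic.

```python
from collections import Counter

ACTIONS = ["A1", "A2", "A3"]
ENTITIES = ["E1", "E2", "E3", "E4", "E5"]

# Rows of the example history H, reduced to (action, entity, day number).
history = [
    ("A1", "E3", 1), ("A1", "E3", 1), ("A2", "E1", 1),
    ("A1", "E3", 2), ("A2", "E3", 2), ("A3", "E2", 2), ("A2", "E2", 2),
    ("A1", "E1", 3), ("A3", "E2", 3),
    ("A1", "E3", 4), ("A2", "E3", 4), ("A3", "E1", 4), ("A2", "E3", 4),
]


def extract_feature_sets(history, entities, n_days):
    """Return, per entity, one vector of action counts per one-day interval."""
    counts = Counter((e, d, a) for a, e, d in history)
    return {
        e: [[counts[(e, d, a)] for a in ACTIONS] for d in range(1, n_days + 1)]
        for e in entities
    }


# W0 = 12 days and W = 1 day give twelve feature sets per entity.
features = extract_feature_sets(history, ENTITIES, n_days=12)
print(features["E3"][:4])
```

Each inner list plays the role of one set of statistical characteristics V; days beyond those listed in the table excerpt simply yield zero counts here.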

In what follows, the set of feature sets of an entity coming from consecutive extraction intervals will be called a sequence of sets of statistical characteristics. Thus, in the example described, the extraction means 104 provide, for each of the entities E1, E2, E3, E4, E5, a sequence of twelve sets of statistical characteristics.

A statistical characteristic is, for example: the average number of occurrences of an action per unit of time; the date of the first activity (useful, for example, for spotting a change in punctuality); an aggregation in the form of a count, a mean value or other higher-order statistical moments providing relevant information on the underlying distribution (variance, kurtosis, skewness, etc.). In some cases, a statistical characteristic can be derived from the distance between the past distribution of the variable and the new distribution (quantified via measures of similarity between distributions, such as the Kullback-Leibler divergence). Other statistical characteristics can be derived from the covariance between two activity variables over a time interval (thus giving access to finer statistical information that could not have been captured by statistics on each of the variables taken independently).
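Some of the characteristics listed above can be computed as in the following sketch. It uses population (not sample) moments, non-excess kurtosis, and a small `eps` floor in the Kullback-Leibler divergence to avoid division by zero; these conventions, like the function names, are assumptions and are not prescribed by the description.

```python
import math


def moments(xs):
    """Return (mean, variance, skewness, kurtosis) using population formulas."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        # Skewness and kurtosis are undefined for constant data; 0.0 is assumed here.
        return mean, 0.0, 0.0, 0.0
    std = math.sqrt(var)
    skew = sum(((x - mean) / std) ** 3 for x in xs) / n
    kurt = sum(((x - mean) / std) ** 4 for x in xs) / n
    return mean, var, skew, kurt


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def covariance(xs, ys):
    """Population covariance between two activity variables over the same interval."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


mean, var, skew, kurt = moments([1, 2, 3, 4, 5])
drift = kl_divergence([0.5, 0.5], [0.25, 0.75])  # past vs. new distribution
cov = covariance([1, 2, 3], [2, 4, 6])
```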

Figure 2 illustrates a system 200 for designing a signature generator.

The design system 200 first comprises a trainable machine 202 having hyperparameters, as well as parameters intended to be determined by learning once a set of hyperparameters has been chosen. The trainable machine 202 comprises at least an upstream neural network 204A and a downstream classifier 204B.

The upstream neural network 204A is intended to provide a signature SG of an entity E from a set of statistical characteristics V (such as the sets of characteristics provided by the extraction means 104) of dated actions of this entity. For the training of the trainable machine 202, the set of statistical characteristics V is associated with the entity E that carried out these actions, this entity E itself being associated, in the example described, with a group G.

The downstream classifier 204B is intended to provide an entity prediction E* from the signature SG provided by the upstream neural network 204A.

For example, the trainable machine 202 comprises a neural network 204, for example a deep neural network, comprising N layers of neurons. The upstream neural network 204A then comprises the first N-k layers and the downstream classifier 204B comprises the last k layers, k being an integer greater than or equal to one. For example, k is equal to one.

A "neural network" means a complex structure formed of a plurality of layers, each layer comprising a plurality of artificial neurons. An artificial neuron is an elementary processor that calculates a single output from the information it receives. Each neuron of a layer is connected to at least one neuron of a neighboring layer via an artificial synapse to which a synaptic coefficient, or weight, is assigned. The weight of each artificial synapse is determined from training data, and updated, during a training step of the neural network.

The neural network 204 is, for example, a long short-term memory (LSTM) recurrent neural network or a multilayer perceptron (MLP). A hyperparameter can be used to select the structure of the neural network 204, for example its number of layers of neurons.

The last layer of the upstream neural network 204A (i.e. the (N-k)-th layer of the neural network 204) has S neurons, so that the signature SG is composed of the outputs of these S neurons. Thus, the signature SG comprises S logits. Here, a "logit" is the result of the activation of a neuron, in this case a neuron of the (N-k)-th layer.
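
The split of the N-layer network 204 into an upstream part 204A and a downstream classifier 204B can be sketched as follows. This is a toy illustration under stated assumptions: the layer sizes, the placeholder weights, the tanh activation and the names `dense`, `toy_weights`, `run` and `split_network` are all hypothetical, and a real implementation would rely on a deep learning library.

```python
import math

def dense(weights, biases, activation):
    """Build one fully connected layer as a plain Python function."""
    def layer(x):
        return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(weights, biases)]
    return layer

def toy_weights(n_out, n_in):
    """Deterministic placeholder weights; a real network would learn these."""
    return [[0.1 * ((i + j) % 3 - 1) for j in range(n_in)] for i in range(n_out)]

def run(layers, x):
    """Apply the layers one after the other."""
    for layer in layers:
        x = layer(x)
    return x

def split_network(layers, k=1):
    """Return (first N-k layers, last k layers)."""
    if not 1 <= k < len(layers):
        raise ValueError("k must satisfy 1 <= k < N")
    return layers[:-k], layers[-k:]

S = 2  # number of neurons of the (N-k)-th layer, i.e. size of the signature
network_204 = [
    dense(toy_weights(4, 3), [0.0] * 4, math.tanh),   # layer 1
    dense(toy_weights(S, 4), [0.1] * S, math.tanh),   # layer N-k: outputs the signature
    dense(toy_weights(3, S), [0.0] * 3, lambda z: z), # layer N: one score per entity
]
upstream_204a, classifier_204b = split_network(network_204, k=1)

v = [0.5, -1.0, 2.0]                       # one set of statistical characteristics
signature = run(upstream_204a, v)          # SG, of length S
entity_scores = run(classifier_204b, signature)
```

Running the two parts in sequence gives the same result as running the whole network, which is why the upstream part can later be detached and used alone as the signature generator.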

The trainable machine 202 is intended to be trained in a supervised manner to carry out several tasks simultaneously (a technique known as "multi-task learning"). These tasks include at least the following two tasks, called main tasks.

The first main task is to correctly predict the entity associated with the set of statistical characteristics V provided as input. This entity prediction is denoted E*. This first main task is called the "entity prediction task" below.

The second main task is to group, in the space of the signature components, the signatures of a given entity around a center associated with this entity. This center is, for example, a barycenter, such as the isobarycenter, of the signatures of the entity. This second main task is called the "signature grouping task" below.

The tasks carried out by the trainable machine 202 can also comprise N other tasks, called "business tasks", each relating to an aspect relevant to the business. For each of these business tasks, the trainable machine 202 can include one or more additional elements which, from data provided by the upstream neural network 204A (for example the output of one of its layers, such as the layer giving the signature SG), provide the data on which the training is performed.

Business tasks can consist of predicting categorical or numerical variables. An example of a categorical business task is the classification of a "Web" variable relating to a connection activity, which can take the following values: 1. None, 2. Occasional, 3. Frequent. An example of a numerical business task is the prediction of the daily frequency of connection to machines.

In the example described, a single business task is used. This business task is to correctly predict the group G to which the entity whose signature SG is provided by the upstream neural network 204A belongs. This group prediction is denoted G*. Thus, the trainable machine 202 further comprises a group predictor element 206 designed to provide, from the signature SG, a prediction G* of the group to which the entity associated with the set of statistical characteristics V provided as input belongs. This business task is called the "group prediction task" below.

The element 206 comprises, for example, a neural network intended to be trained at the same time as the neural network 204.

Thus, in total, the trainable machine 202 is intended to be trained in a supervised manner to carry out N + 2 tasks simultaneously (N being an integer greater than or equal to 0).

A training device 207 is provided to carry out this training.

This training device 207 comprises, for each task, a module designed to calculate a loss function associated with this task, called a specific loss function.

Thus, the training device 207 first comprises a first module 208 for calculating a specific loss function L_ID associated with the entity prediction task. The specific loss function L_ID is, for example, representative of a distance between the identifiers ID and the predictions ID*, for example between the distribution of the true identifiers and that of the predicted identifiers. In the example described, the specific loss function L_ID is the cross-entropy loss function.

The training device 207 further comprises a second module 210 for calculating a specific loss function L_C associated with the signature grouping task. For example, the specific loss function L_C uses, for each entity, a distance (in the space of the S components of the signatures SG) between the signatures of this entity and a center associated with this entity, such as the isobarycenter of these signatures. Thus, the specific loss function L_C is representative of these distances. In the example described, the specific loss function L_C is the "center loss" function described in the article by Wen, Y., Zhang, K., Li, Z., Qiao, Y., entitled "A discriminative feature learning approach for deep face recognition" and published in Computer Vision - ECCV 2016 - 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VII (2016) 499–515.
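
A minimal sketch of such a loss is given below, assuming the squared Euclidean distance and the factor 1/2 used in the cited article. As a simplification, the centers are recomputed here as exact isobarycenters of the signatures supplied, whereas the cited article updates them progressively over mini-batches. The function names and the sample signatures are hypothetical.

```python
from collections import defaultdict

def isobarycenters(signatures, entities):
    """Return, for each entity, the unweighted mean of its signatures."""
    groups = defaultdict(list)
    for sg, entity in zip(signatures, entities):
        groups[entity].append(sg)
    return {entity: [sum(component) / len(sgs) for component in zip(*sgs)]
            for entity, sgs in groups.items()}

def center_loss(signatures, entities, centers):
    """Half the sum of squared distances between each signature and its entity's center."""
    total = 0.0
    for sg, entity in zip(signatures, entities):
        total += sum((x - c) ** 2 for x, c in zip(sg, centers[entity]))
    return 0.5 * total

# Hypothetical signatures with S = 2 components.
signatures = [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]]
entities = ["E1", "E1", "E2"]
centers = isobarycenters(signatures, entities)
l_c = center_loss(signatures, entities, centers)
```

The loss falls to zero when all the signatures of each entity coincide with that entity's center, which is the grouping behavior the second main task aims for.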

The training device 207 further comprises a third module 212 for calculating a specific loss function L_G associated with the group prediction task. The specific loss function L_G is representative of a distance between the group predictions G* and the groups G, for example between the distribution of the true groups and that of the predicted groups. In general, for business tasks, the cross-entropy loss function is used, for example, for each business task whose values are categorical, while the mean squared error loss function is used, for example, for business tasks whose values are numerical. In the example described, the business task is group prediction, whose values are categorical, so the cross-entropy loss function is chosen for the specific loss function L_G.
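
The two kinds of loss mentioned above can be sketched as follows for a single sample. The use of a softmax to turn raw scores into a distribution, the function names and the numerical values are assumptions made for the example.

```python
import math

def softmax(scores):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, true_index):
    """Cross-entropy between the one-hot true class and the predicted distribution."""
    return -math.log(softmax(scores)[true_index])

def mean_squared_error(predictions, targets):
    """Mean squared error for a numerical business task."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Categorical task: scores for three hypothetical groups, the true group being index 0.
l_g = cross_entropy([2.0, 0.5, -1.0], true_index=0)

# Numerical task: predicted versus observed daily connection frequencies (hypothetical).
l_freq = mean_squared_error([3.0, 5.0], [4.0, 5.0])
```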

In addition, the training device 207 comprises a fourth module 214 for calculating a total loss function L_total from the specific loss functions L_ID, L_C and L_G. In the example described, the total loss function L_total comprises a linear combination of the specific loss functions. It is given, for example, by: L_total = L_ID + αL_C + λL_G + γL_REG, where the coefficients α, λ and γ are hyperparameters and L_REG is a loss term added to regularize the model during training. L_REG depends only on the parameters of the model and constrains the model to remain within a certain space of solutions. An example of a regularization term is the L2 norm of the parameters of the layers of the neural network (called "weight decay" in the literature), which pushes all the weights towards 0 when it is minimized through the minimization of the total loss function L_total.
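
The linear combination above can be written directly as a function. In this sketch the regularization term is taken as the sum of the squared weights, a common form of L2 penalty; that choice, the function names and all numerical values are assumptions for illustration.

```python
def l2_penalty(weights):
    """Sum of squared parameters, used here as the regularization term L_REG."""
    return sum(w * w for w in weights)

def total_loss(l_id, l_c, l_g, l_reg, alpha, lam, gamma):
    """L_total = L_ID + alpha * L_C + lambda * L_G + gamma * L_REG."""
    return l_id + alpha * l_c + lam * l_g + gamma * l_reg

# Hypothetical values of the specific losses and of the hyperparameters.
weights = [0.5, -1.0, 2.0]
value = total_loss(
    l_id=1.2,
    l_c=4.0,
    l_g=0.7,
    l_reg=l2_penalty(weights),
    alpha=0.1,
    lam=1.0,
    gamma=0.01,
)
```

The coefficient α sets how strongly signature grouping is weighted relative to entity prediction, λ does the same for the business task, and γ controls the strength of the regularization.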

The training device 207 further comprises a module 216 for updating parameters of the trainable machine 202 from the total loss function L_total. These parameters comprise in particular the synaptic weights of the neural network or networks of the trainable machine 202 (the neural networks 204 and 206 in the example described).

With reference to figure 3, a method 300 for designing a signature generator will now be described.

During a step 302, the monitoring means 102 monitor the actions of the entities E1, E2, E3, E4, E5 during a monitoring time interval of duration W₀ and record their actions in the history H. In the example described, the duration W₀ is twelve days, numbered from one to twelve below.

During a step 304, the duration W is chosen. In the example described, the duration W is one day.

During a step 306, training data and first and second validation data are obtained.

To this end, from the history H, the extraction means 104 calculate, for each entity E1, E2, E3, E4, E5 and for each extraction interval, a set of statistical characteristics of the actions of this entity during the extraction interval considered. In the example described, these extraction intervals have the duration W, i.e. one day.

Thus, for the entity E1, the extraction means 104 calculate a set of statistical characteristics for each of the days 1 to 12. Similar calculations are carried out for the other entities E2, E3, E4, E5, so as to obtain twelve sets of characteristics for each entity E1, E2, E3, E4, E5.

A first part of the successive sets of characteristics of a first part of the entities is selected to form the training data. A second part of the successive sets of characteristics of this first part of the entities is selected to form the first validation data. In the example described, the sets of characteristics of the entities E1, E2, E3 for days 1 to 9 are selected as training data and the sets of characteristics of the entities E1, E2, E3 for days 10 to 12 are selected as first validation data.

In addition, successive sets of characteristics of the other entities are selected as second validation data. In the example described, the sets of characteristics of the entities E4, E5 for days 1 to 12 are selected as second validation data.
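
The selection described in step 306 can be sketched as follows, with placeholder strings standing in for the sets of characteristics. The function name `split` and the dictionary layout keyed by (entity, day) are assumptions made for the example.

```python
entities = ["E1", "E2", "E3", "E4", "E5"]
days = range(1, 13)

# Placeholder for the set of characteristics of each entity on each day.
feature_sets = {(e, d): f"V({e}, day {d})" for e in entities for d in days}

def split(feature_sets, training_entities, last_training_day):
    """Split into training data, first validation data and second validation data."""
    train, val1, val2 = {}, {}, {}
    for (entity, day), v in feature_sets.items():
        if entity not in training_entities:
            val2[(entity, day)] = v      # entities never seen during training
        elif day <= last_training_day:
            train[(entity, day)] = v     # early days of the training entities
        else:
            val1[(entity, day)] = v      # later days of the training entities
    return train, val1, val2

train, val1, val2 = split(feature_sets, {"E1", "E2", "E3"}, last_training_day=9)
```

With the values of the example described, this gives 27 sets of characteristics for training, 9 for the first validation data and 24 for the second validation data.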

During a step 308, values are chosen for the hyperparameters: for example, a structure of the neural network 204 is chosen from among several predefined structures, and values of the coefficients α, λ and γ are chosen.

During a step 309, the trainable machine 202 of figure 2 is obtained.

During a step 310, the trainable machine 202 is trained in a supervised manner from the training data to carry out the various tasks provided for, namely, in the example described: entity prediction, signature grouping and, where applicable, the business task or tasks.

To this end, in the example described, the following steps are carried out. The sets of statistical characteristics of the training data are provided one after the other to the trainable machine 202. For each set of statistical characteristics V provided, the upstream neural network 204A generates a signature SG, the downstream classifier 204B provides an entity prediction E* from the signature SG and the element 206 provides a group prediction G* from the signature SG. From the signatures SG, the entity predictions E* and the group predictions G* obtained from the training data, the modules 208, 210, 212 respectively calculate the loss functions L_ID, L_C, L_G and the module 214 calculates the total loss function L_total for the training data. The update module 216 then modifies the parameters of the trainable machine 202 according to the total loss function L_total, with the aim of reducing it. The preceding steps are then repeated, for example until the total loss function L_total no longer changes much (for example, until the improvement of the total loss function L_total falls below a predefined threshold).
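
The stopping rule described above (repeat until the improvement of the total loss falls below a predefined threshold) can be sketched generically as follows. The one-parameter quadratic loss and the gradient step stand in for the trainable machine 202 and the update module 216; they, and the names `train`, `toy_loss` and `toy_update`, are assumptions made only so that the loop can be run.

```python
def train(params, loss_fn, update_fn, threshold=1e-6, max_iterations=1000):
    """Repeat updates until the improvement of the loss falls below the threshold."""
    previous = loss_fn(params)
    for iteration in range(1, max_iterations + 1):
        params = update_fn(params)
        current = loss_fn(params)
        if previous - current < threshold:
            return params, current, iteration
        previous = current
    return params, previous, max_iterations

# Toy stand-in for the total loss: a single parameter w with minimum at w = 3.
def toy_loss(w):
    return (w - 3.0) ** 2

# Toy stand-in for the parameter update: one gradient descent step.
def toy_update(w, learning_rate=0.1):
    gradient = 2.0 * (w - 3.0)
    return w - learning_rate * gradient

w, final_loss, iterations = train(0.0, toy_loss, toy_update)
```

In an actual implementation, `loss_fn` would evaluate the total loss on the training data and `update_fn` would apply a gradient-based update to all the synaptic weights.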

Thus, the loss function L_ID guides the training towards predicting the entity, i.e. classifying the input sets of statistical characteristics into U classes corresponding to the U entities of the training data.

The loss function L_C guides the training towards grouping the signatures of a given entity, and thereby reducing the intra-class variation.

During a step 312, the upstream neural network 204A is validated using the first and second validation data.

To this end, in the example described, the sets of statistical characteristics of the first and second validation data are used. More precisely, the following steps are carried out. The sets of characteristics of the first and second validation data are provided one after the other to the upstream neural network 204A, which generates a signature SG each time. The signatures obtained are then analyzed to produce a performance evaluation of the upstream neural network 204A.

Steps 310 and 312 can be repeated with other hyperparameters, in particular with other values for the coefficients α, λ and γ. In this case, the hyperparameters giving the best performance evaluation at step 312 are selected.
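
Repeating steps 310 and 312 over several candidate values can be sketched as an exhaustive search. The description does not say how candidates are chosen or how the performance evaluation is computed, so the grid and the function `train_and_validate` below are purely hypothetical stand-ins.

```python
import itertools

def train_and_validate(alpha, lam, gamma):
    """Hypothetical stand-in for steps 310 and 312: returns a performance score.

    A real implementation would train the trainable machine with these
    coefficients and evaluate the signatures on the validation data.
    """
    return -((alpha - 0.1) ** 2 + (lam - 1.0) ** 2 + (gamma - 0.001) ** 2)

# Hypothetical candidate values for the coefficients of the total loss.
grid = {
    "alpha": [0.01, 0.1, 1.0],
    "lam": [0.5, 1.0],
    "gamma": [0.0001, 0.001],
}

best_score, best_hyperparameters = None, None
for alpha, lam, gamma in itertools.product(grid["alpha"], grid["lam"], grid["gamma"]):
    score = train_and_validate(alpha, lam, gamma)
    if best_score is None or score > best_score:
        best_score = score
        best_hyperparameters = {"alpha": alpha, "lam": lam, "gamma": gamma}
```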

During a step 314, the upstream neural network 204A is provided as the signature generator (designated below by the same reference 204A).

With reference to figure 4, a system 400 for detecting abnormal behavior of an entity will now be described.

The detection system 400 first comprises the IT infrastructure 100, the monitoring means 102 and the extraction means 104. It further comprises the signature generator 204A, obtained for example by the method of figure 3.

The detection system 400 further comprises means 402 for determining activity data, means 404 for predicting activity data and means 406 for detecting an anomaly.

With reference to figure 5, a method 500 for detecting abnormal behavior of an entity will now be described.

During a step 502, the monitoring means 102 monitor at least one entity E (one of the entities E1, E2, E3, E4, E5 or another entity) over a monitoring interval to provide a history H of the actions carried out by each monitored entity E in the IT infrastructure 100.

During a step 504, the extraction means 104 provide, from the history H, for each monitored entity E, a sequence of K+1 sets of statistical characteristics V₁…V_K, V_K+1, K being an integer greater than or equal to one.

During a step 506, for each monitored entity E, the means 402 determine activity data D from the last set of statistical characteristics V_K+1 of the sequence.

During a step 508, for each monitored entity E, the first K sets of statistical characteristics V₁…V_K of the sequence are provided to the signature generator 204A (after the training of figure 3), which provides in response as many signatures SG₁…SG_K of the monitored entity E.

During a step 510, the prediction means 404 provide a prediction of activity data D* from the signatures SG₁…SG_K. To this end, the prediction means 404 comprise a neural network previously trained in a supervised manner to predict the activity data of the last set of statistical characteristics of a sequence of K+1 sets of statistical characteristics from the first K sets of statistical characteristics of this sequence. For this training, sequences of sets of characteristics are used as input to the means 402 and to the signature generator 204A, the latter remaining unchanged (i.e. not being trained) during the training of the means 404. Any sequences of sets of characteristics can be used for this training (in particular those of entities different from those used for training the signature generator 204A), preferably with the exception of sequences comprising sets of statistical characteristics that were used to train the signature generator 204A.

During a step 512, the anomaly detection means 406 compare the activity data D with a reference datum, namely their prediction D* in the example described, and detect from this comparison either an anomaly (i.e. abnormal behavior) or an absence of anomaly (i.e. normal behavior).
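Steps 508 to 512 can be illustrated by the following minimal sketch. It is not the implementation of the described embodiment: the callables `signature_generator` and `predictor` are hypothetical stand-ins for the trained signature generator 204A and the prediction means 404, and the use of a Euclidean distance compared with a `threshold` is an assumption, since the description only states that D is compared with its prediction D*.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def detect_anomaly(
    feature_sets: Sequence[Vector],                    # V_1 ... V_K
    observed_activity: Vector,                         # activity data D
    signature_generator: Callable[[Vector], Vector],   # stand-in for 204A
    predictor: Callable[[Sequence[Vector]], Vector],   # stand-in for 404
    threshold: float,                                  # hypothetical decision threshold
) -> bool:
    """Return True when the behavior is considered abnormal."""
    # Step 508: one signature SG_i per set of statistical characteristics V_i.
    signatures = [signature_generator(v) for v in feature_sets]
    # Step 510: prediction D* of the activity data from the K signatures.
    predicted_activity = predictor(signatures)
    # Step 512: compare D with D* (assumed here: Euclidean distance vs. threshold).
    distance = math.dist(observed_activity, predicted_activity)
    return distance > threshold
```

In this sketch, the signature generator is only called, never updated, which mirrors the fact that 204A remains unchanged once trained.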

It is clear that a signature generator design method such as the one described above makes it possible to obtain a signature generator providing accurate signatures that can be used to detect abnormal (unusual) behavior of an entity.

It will further be appreciated that each of the elements 102, 104, 202, 204A, 204B, 206, 207, 208, 210, 212, 214, 216, 402, 404, 406 described above can be implemented in hardware, for example by micro-programmed or micro-wired functions in dedicated integrated circuits (without a computer program), and/or in software, for example by one or more computer programs intended to be executed by one or more computers each comprising, on the one hand, one or more memories for storing data files and one or more of these computer programs and, on the other hand, one or more processors associated with this or these memories and intended to execute the instructions of the computer program(s) stored in the memory or memories of that computer.

It will also be noted that the invention is not limited to the embodiments described above. It will indeed be apparent to those skilled in the art that various modifications can be made to the embodiments described above, in the light of the teaching that has just been disclosed to them. In the detailed presentation of the invention given above, the terms used must not be interpreted as limiting the invention to the embodiments set out in the present description, but must be interpreted as including all the equivalents that those skilled in the art can foresee by applying their general knowledge to the implementation of the teaching that has just been disclosed to them.

Claims

Method (300) for designing a signature generator (204A) intended to provide a signature (SG) of an entity (E) from a set of statistical characteristics (V) of dated actions of this entity (E) in a computing infrastructure (100), the method comprising:

obtaining (309) a trainable machine (202) comprising at least:
- an upstream neural network (204A) intended to provide a signature (SG) of an entity (E) from a set of statistical characteristics (V) of dated actions of this entity (E) in a computing infrastructure (100), and
- a classifier (204B) intended to provide an entity prediction (E*) from the signature (SG) provided by the upstream neural network (204A);
obtaining (306) training data comprising sets of statistical characteristics (V) of dated actions, each associated with an entity (E) having carried out these actions in the computing infrastructure (100); and
supervised training (310) of the trainable machine (202) from the training data so that the trainable machine (202) correctly predicts the entities (E) associated with the sets of statistical characteristics (V) of the training data;

the method (300) being characterized in that the supervised training is also carried out so that the trainable machine (202) groups the signatures (SG) of each of the entities (E) associated with the sets of statistical characteristics (V) of the training data around a center associated with this entity (E),
and in that it further comprises:

providing (314) the signature generator (204A) including the upstream neural network (204A) after training.

Method according to claim 1, in which the center is a barycentre, for example an isobarycentre, of the signatures (SG) of the entity (E) with which this center is associated.

A method according to claim 1 or 2, wherein the supervised training uses a total loss function (L_total) using a first loss function (L_ID) associated with the entity prediction and a second loss function (L_C) associated with the grouping of signatures (SG).

A method according to claim 3, wherein the total loss function (L_total) uses a linear combination of the first and second loss functions (L_ID, L_C).
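As an illustration only, a linear combination of this kind could take the form L_total = L_ID + λ·L_C. The sketch below assumes a mean cross-entropy for L_ID, a mean squared distance to the isobarycentre of each entity's signatures for L_C, and a single coefficient `lam` (λ); none of these specific forms is stated in the claims, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def identification_loss(probabilities, labels):
    """Assumed L_ID: mean cross-entropy, probabilities[i][e] being the
    predicted probability that sample i belongs to entity e."""
    return -sum(math.log(p[y]) for p, y in zip(probabilities, labels)) / len(labels)


def entity_centers(signatures, labels):
    """Isobarycentre (plain mean) of the signatures of each entity."""
    groups = defaultdict(list)
    for signature, label in zip(signatures, labels):
        groups[label].append(signature)
    return {
        label: [sum(component) / len(group) for component in zip(*group)]
        for label, group in groups.items()
    }


def grouping_loss(signatures, labels):
    """Assumed L_C: mean squared distance of each signature to its entity's center."""
    centers = entity_centers(signatures, labels)
    return sum(
        math.dist(s, centers[y]) ** 2 for s, y in zip(signatures, labels)
    ) / len(labels)


def total_loss(probabilities, signatures, labels, lam):
    """L_total = L_ID + lam * L_C, one possible linear combination."""
    return identification_loss(probabilities, labels) + lam * grouping_loss(signatures, labels)
```

With `lam` set to zero the sketch reduces to the entity prediction objective alone; larger values put more weight on grouping the signatures of a same entity around its center.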

A method according to claim 4, wherein the linear combination has a coefficient for at least one of the first and second loss functions (L_ID, L_C), this coefficient being a hyperparameter, the method further comprising:

supervised training (310) of the trainable machine (202) from the training data for multiple sets of hyperparameters;
for each set of hyperparameters, analyzing the signatures obtained to produce a performance evaluation; and
selecting the hyperparameters giving the best performance evaluation.
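The three steps above can be sketched as a simple search loop. This is an illustration under assumptions: `train` and `evaluate` are hypothetical callables standing in for the supervised training (310) and for the analysis of the signatures obtained, and a higher score is assumed to mean a better performance evaluation.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

Hyperparameters = Dict[str, float]


def select_hyperparameters(
    candidates: Iterable[Hyperparameters],
    train: Callable[[Hyperparameters], Any],   # hypothetical: returns a trained generator
    evaluate: Callable[[Any], float],          # hypothetical: scores the signatures obtained
) -> Tuple[Hyperparameters, float, Any]:
    """Train once per set of hyperparameters and keep the best-scoring one."""
    best = None
    for hyperparameters in candidates:
        generator = train(hyperparameters)     # supervised training for this set
        score = evaluate(generator)            # performance evaluation of its signatures
        if best is None or score > best[1]:
            best = (hyperparameters, score, generator)
    if best is None:
        raise ValueError("at least one set of hyperparameters is required")
    return best
```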

A method as claimed in any one of claims 1 to 5, wherein the trainable machine (202) comprises a global neural network (204) comprising N layers of neurons, the upstream neural network (204A) comprising the first N – k layers and the classifier (204B) having the last k layers, k being greater than or equal to one.
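The split into an upstream part and a classifier can be pictured as slicing an ordered list of layers. The sketch below is only an illustration: layers are modeled as plain Python callables, the helper names are hypothetical, and it additionally assumes k is strictly less than N so that the upstream network keeps at least one layer.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[float], float]


def split_network(layers: Sequence[Layer], k: int) -> Tuple[List[Layer], List[Layer]]:
    """Split N layers into the first N - k (upstream network) and the last k (classifier)."""
    if not 1 <= k < len(layers):
        raise ValueError("k must satisfy 1 <= k < N")
    return list(layers[:-k]), list(layers[-k:])


def run(layers: Sequence[Layer], x: float) -> float:
    """Apply the layers in order to the input x."""
    for layer in layers:
        x = layer(x)
    return x
```

Under this picture, the output of the upstream part plays the role of the signature, and feeding it to the classifier part gives the same result as running the global network end to end.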

A method according to any of claims 1 to 6, wherein the supervised training is further performed such that the trainable machine (202) performs one or more other tasks.

Method (500) for detecting abnormal behavior of an entity (E) comprising:

supplying (506) at least one set of statistical characteristics (V₁…V_K) of dated actions of the entity (E) in a computing infrastructure (100) to a signature generator (204A) designed by a method according to any one of claims 1 to 7;
supplying (508), by the signature generator (204A), a signature (SG₁…SG_K) of the entity (E) for each set of statistical characteristics (V₁…V_K);
comparing (512) a datum (D*) resulting from the signature(s) (SG₁…SG_K) supplied by the signature generator (204A) with at least one reference datum (D); and
detecting (512) a normal or abnormal behavior of the entity (E) from the comparison.

Computer program downloadable from a communication network and/or recorded on a computer-readable medium and/or executable by a processor, characterized in that it comprises instructions for the execution of the steps of a method according to any of claims 1 to 8, when said program is executed on a computer.

System (200) for designing a signature generator (204A) intended to provide a signature (SG) of an entity (E) from a set of statistical characteristics (V) of dated actions of this entity (E) in a computing infrastructure (100), comprising:

a trainable machine (202) comprising at least:
- an upstream neural network (204A) intended to provide a signature (SG) of an entity (E) from a set of statistical characteristics (V) of dated actions of this entity (E) in a computing infrastructure (100), and
- a classifier (204B) intended to provide an entity prediction (E*) from the signature (SG) provided by the upstream neural network (204A);
a training device (207) adapted to train the trainable machine (202) in a supervised manner from training data comprising sets of statistical characteristics (V) of dated actions, each associated with an entity (E) having performed these actions in the computing infrastructure (100), such that the trainable machine (202) correctly predicts the entities (E) associated with the sets of statistical characteristics (V) of the training data;

the system (200) being characterized in that the training device (207) is further adapted to train the trainable machine (202) such that the trainable machine (202) groups the signatures (SG) of each of the entities (E) associated with the sets of statistical characteristics (V) of the training data around a center associated with this entity (E), and in that the signature generator (204A) includes the upstream neural network (204A) after training.