FR3118244A1

FR3118244A1 - Method and device for diagnosing anomalies

Info

Publication number: FR3118244A1
Application number: FR2013938A
Authority: FR
Inventors: Pierre BLANCHART
Original assignee: Commissariat a lEnergie Atomique CEA; Commissariat a lEnergie Atomique et aux Energies Alternatives CEA
Current assignee: Commissariat a lEnergie Atomique et aux Energies Alternatives CEA
Priority date: 2020-12-22
Filing date: 2020-12-22
Publication date: 2022-06-24
Anticipated expiration: 2040-12-22
Also published as: FR3118244B1; WO2022135972A1

Abstract

The invention proposes an anomaly diagnosis device (100) comprising:
- a determination of the class to which an input datum corresponding to a query point belongs, by applying a model (10) using decision trees;
- a decomposition (14) of the leaves of the decision trees into pure boxes, each corresponding to a region of the input feature space of the input datum;
- a calculation of a counterfactual point (17) representing the virtual point closest to the query point that belongs to a target class, from the decomposition;
- a calculation of a change vector (18) representing the minimum changes to be applied to the input datum so that it is classified in the target class by the classification model, the change vector dX being obtained from the difference between the counterfactual point Y and the input query X.
Figure for the abstract: Fig. 1

Description

Method and device for diagnosing anomalies

The invention generally relates to the field of classification and in particular to a device and a method for diagnosing anomalies using one or more decision trees.

Existing anomaly detection devices commonly use machine learning based on learning models of the “ensemble of decision trees” type.

Such devices are conventionally used in industrial systems, such as systems for testing and diagnosing defective objects or equipment on production lines, fraud diagnosis systems in the banking or tax field, or systems for medical diagnosis or medical diagnosis support. In one example application of anomaly diagnosis devices to industrial systems, the manufacturing processes are complex and involve many steps, so that there is a high risk that an anomaly (defect) will be introduced before the end of the manufacturing process.

In such applications, an anomaly diagnosis device based on tree-ensemble machine learning models must be able to explain intelligibly the decisions of the anomaly detection system. For example, in medical diagnostic devices, “black box” learning models cannot be certified within medical devices, because it is not possible to relate the decision to the parameters of the model.

Thus, the ability to explain the model's decision-making relies on the interpretability of the outputs (decisions) delivered by models of the “ensemble of decision trees” type.

A definition of interpretability based on the notion of counterfactual explanations has been proposed, for example, in C. Molnar, “Interpretable Machine Learning: A Guide for Making Black Box Models Explainable”, 2019. A “counterfactual explanation” of a prediction delivered by a learning model describes the smallest change in the values of the model's input features that changes the prediction to a predefined output value.
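
The definition above can be read as a constrained minimization. The notation below is an assumption introduced for illustration only: f stands for the model, X for the input, c for the predefined output value and δ for a distance on the feature space; the cited definition does not fix these symbols, and the minimum need not be attained for every model.

```latex
Y^{*} \in \operatorname*{arg\,min}_{Y \,:\, f(Y) = c} \; \delta(X, Y),
\qquad
dX = Y^{*} - X
```

Under this reading, dX lists, feature by feature, the smallest change that moves the prediction from f(X) to c.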

To detect anomalies, binary classification models (in particular tree-ensemble models such as XGBoost or LightGBM) are generally trained to classify data into two classes, for example an “anomaly present” class versus an “anomaly absent” class. The anomaly diagnosis must therefore be able to provide a response that interprets the decision of the binary classification model to place the datum characterizing the anomaly in the “anomaly” class.

Two categories of solutions have been proposed in the state of the art to provide an interpretation of the decisions of classification models:
- Methods known as sensitivity analysis, which statistically quantify the influence of a feature or group of features on the output of the model. However, such methods, like variable selection techniques, only tell the user which features are important to the model, i.e. the features that influence the decision, without providing any explanation relative to a specific input. Consequently, such methods do not allow point-by-point explanation and are therefore not suitable for anomaly diagnosis.
- Point-to-point explanation techniques. Given an input to the model, such techniques are able to determine parameters indicating why the model made a decision regarding that input. Point-to-point explanation techniques include so-called “model-agnostic” methods, which treat the decision model as a black box and perform an analysis that is independent of the internal particularities of the model, and so-called “model-specific” methods, which deconstruct the model, taking its particularities into account, to provide an explanation of the decision taken.

Among model-agnostic methods, the most commonly used are “surrogate model” approaches, which approximate the model by a simpler, interpretable model, either globally (C. Molnar, “Interpretable Machine Learning: A Guide for Making Black Box Models Explainable”, 2019, section 5.6) or locally around the prediction to be explained (as described, for example, in M. T. Ribeiro, S. Singh, and C. Guestrin, “"Why should I trust you?": Explaining the predictions of any classifier”, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135-1144). The effectiveness of such methods depends strongly on the ability to approximate the model by an interpretable surrogate model. In surrogate-model approaches, the surrogates are simple models (for example decision trees, linear models, generalized additive models, etc.) that cannot approximate complex models satisfactorily, which is why global surrogate models are little used. A similar problem arises for local surrogate models. For example, the local approximation used in the Ribeiro et al. paper cited above is of the kernel smoothing type. This approximation is very sensitive to the size of the neighborhoods, and therefore very unstable with respect to the choice of the bandwidth parameter of the kernel used for smoothing. In general, it is difficult to find parameterizations of local surrogate models that guarantee a good approximation of the original model.
Moreover, such an approximation requires a parameterization around each point to be explained, which makes it complex. The result is great instability of the explanations obtained, depending on the parameterizations chosen.
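
The bandwidth sensitivity described above can be illustrated with a toy experiment. The sketch below is not the method of the cited paper: it assumes a hypothetical one-dimensional model `black_box` with a sharp boundary at 0.5, and fits a Gaussian-kernel-weighted straight line around a query point just below that boundary. The function names, the grid and the two bandwidth values are all illustrative choices.

```python
import math

def black_box(x):
    # Hypothetical 1-D model with a sharp decision boundary at x = 0.5.
    return 1.0 if x > 0.5 else 0.0

def local_linear_slope(x0, bandwidth, n=101):
    """Slope of a Gaussian-kernel-weighted least-squares line fitted around x0."""
    xs = [i / (n - 1) for i in range(n)]          # regular grid on [0, 1]
    ws = [math.exp(-((x - x0) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    ys = [black_box(x) for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw  # weighted mean of x
    my = sum(w * y for w, y in zip(ws, ys)) / sw  # weighted mean of y
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys)) / sw
    var = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)) / sw
    return cov / var

# Same query point, two neighborhood sizes.
narrow = local_linear_slope(0.45, bandwidth=0.01)
wide = local_linear_slope(0.45, bandwidth=0.10)
print(f"slope with bandwidth 0.01: {narrow:.6f}")
print(f"slope with bandwidth 0.10: {wide:.6f}")
```

With the narrow kernel the surrogate sees an almost flat model and reports a negligible influence of the feature; with the wide kernel it picks up the nearby boundary and reports a strong one. The explanation of the same point therefore changes with the bandwidth alone.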

Model-specific methods dedicated to decision-tree ensembles analyze, for a given input, the contribution of each feature to the final decision of the model. They provide a type of explanation that is poorly suited to anomaly diagnosis: like sensitivity-analysis methods, they mainly analyze the features that most influenced the model in its decision. However, the model tends to look at the same features regardless of the class of the input datum (“anomaly” or “normal”). The explanation produced therefore does not characterize the anomaly in terms of abnormal values of the input features; that is, it does not make explicit what has changed in the input measurements/features relative to a datum classified as “normal” by the model. It is therefore not possible with this type of method to obtain a targeted diagnosis of the defect in order to identify its exact cause. The only information the operator obtains is a ranking, in decreasing order of importance, of the features that the model used to produce its decision.

Thus, the solutions proposed for detecting anomalies with decision-tree ensemble models are blind to the decision-making process of the model (in particular the “model-agnostic” methods), and produce explanations that are imprecise (in particular global surrogate-model approaches) and/or unstable (local surrogate-model approaches) owing to the lack of knowledge needed to approximate the model properly. They are generally too imprecise with respect to the definition of interpretability given above. Moreover, they do not make it possible to characterize precisely the changes in the input features that caused the model's decision.

There is therefore a need for an improved anomaly diagnosis method and device capable of accurately characterizing the changes that caused the anomaly.

General definition of the invention

The invention improves the situation by proposing an anomaly diagnosis device comprising a classification unit configured to determine, in response to a received request comprising an input datum and a target class, the class to which the input datum belongs among a set of classes, by applying the input datum to a multiclass classification model, the input datum corresponding to a query point. The classification model uses an ensemble of decision trees, the input datum comprising a set of input features, and the input feature space having a given number of dimensions. Each decision tree comprises a set of nodes defining a tree structure from a root node to a set of leaves, each leaf of the decision tree comprising a set of scores associated with each class. Each decision tree is applied to the input datum to determine an elementary classification decision from the scores associated with the leaves of the decision tree, and the classification unit applies an aggregation function to the elementary classification decisions determined by each of the trees of the model to determine the class of the input datum.
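
As a minimal sketch of such a classification unit, the example below represents each tree directly by its leaves, each leaf being a box of the feature space with one score per class. Several points are assumptions made for illustration and are not stated in the text: the two-feature, two-class toy model, the half-open interval convention, the names `TREE_A`, `TREE_B`, `leaf_scores` and `classify`, and the choice of summation followed by argmax as the aggregation function.

```python
import math

INF = math.inf

# Each tree is given by its leaves: (box, scores). A box holds one half-open
# interval [lo, hi) per feature; scores holds one value per class.
# Hypothetical model: class 0 = "normal", class 1 = "anomaly".
TREE_A = [
    ([(-INF, 5.0), (-INF, INF)], [0.9, 0.1]),
    ([(5.0, INF), (-INF, INF)], [0.2, 0.8]),
]
TREE_B = [
    ([(-INF, INF), (-INF, 2.0)], [0.7, 0.3]),
    ([(-INF, INF), (2.0, INF)], [0.4, 0.6]),
]
MODEL = [TREE_A, TREE_B]

def contains(box, x):
    """True if the point x lies inside the box."""
    return all(lo <= xi < hi for (lo, hi), xi in zip(box, x))

def leaf_scores(tree, x):
    """Elementary decision of one tree: the scores of the leaf containing x."""
    for box, scores in tree:
        if contains(box, x):
            return scores
    raise ValueError("the leaves do not cover the query point")

def classify(model, x):
    """Aggregate the per-tree scores by summation (assumed), then take argmax."""
    total = [sum(col) for col in zip(*(leaf_scores(t, x) for t in model))]
    return max(range(len(total)), key=total.__getitem__), total

label_anomalous, _ = classify(MODEL, [6.0, 3.0])
label_normal, _ = classify(MODEL, [1.0, 1.0])
```

Writing leaves as boxes rather than as chains of threshold tests makes the geometric view used in the rest of the description explicit: every leaf is a region of the input feature space carrying a score vector.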

The anomaly diagnosis device further comprises:
- a decomposition unit configured to decompose at least part of the set of leaves associated with the decision trees of the model into multidimensional boxes, called pure boxes, from the input query, the pure boxes corresponding to a collection of intervals, each pure box corresponding to a region of the input feature space and being associated with a vector score determined from the scores associated with the leaves of the decision trees;
- a counterfactual point calculation unit configured to calculate a counterfactual point (Y) representing the virtual point closest to the query point that belongs to the target class, from the decomposition into pure boxes;
- a change vector calculation unit configured to compare the counterfactual point with the input query and determine the difference between the counterfactual point (Y) and the input query (X), which provides a change vector (dX) representing the minimum changes to be applied to the input datum for it to be classified in the target class by the classification model, the change vector (dX) having the same size as the input features.
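
The last unit in the list above reduces to a component-wise difference. The sketch below assumes that the query X and the counterfactual point Y are plain lists of numbers; the feature names, the numeric values and the helper `changed_features` are hypothetical and only serve to show how dX could be presented to an operator.

```python
def change_vector(x, y):
    """dX = Y - X, component by component (same size as the input features)."""
    if len(x) != len(y):
        raise ValueError("query and counterfactual must have the same size")
    return [yi - xi for xi, yi in zip(x, y)]

def changed_features(names, dx, tol=1e-9):
    """Keep only the features that must move, with the signed amount."""
    return {name: d for name, d in zip(names, dx) if abs(d) > tol}

X = [6.0, 3.0]   # hypothetical query point classified as "anomaly"
Y = [4.5, 3.0]   # hypothetical counterfactual point in the target class
dX = change_vector(X, Y)
report = changed_features(["temperature", "pressure"], dX)
print(report)
```

Unlike a feature-importance ranking, dX carries a sign and a magnitude per feature, so it states what would have to change in this particular input.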

In one embodiment, the decomposition unit can determine the pure-box decomposition in the form of a derived tree structure whose nodes at depth d contain the regions corresponding to the pure boxes, the depth d corresponding to a dimension of the input feature space.

In one embodiment, the plurality of decision trees of the model may comprise intersecting leaves, the decomposition unit being further configured to determine the decomposition of the model into pure boxes in the form of regions of maximum intersection, called maximum intersection boxes. A maximum intersection box represents a region of the model having a uniform score over the entire region, this score resulting from the aggregation, by the aggregation function, of the scores of the leaves that intersect to form the region.
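
A brute-force way to see what a maximum intersection box is: pick one leaf per tree, intersect the corresponding boxes, and sum the scores whenever the intersection is non-empty. This is only an illustrative sketch, not the decomposition procedure of the description; its cost grows exponentially with the number of trees. The toy model, the summation used as aggregation and all names are assumptions.

```python
import math
from itertools import product

INF = math.inf

# Hypothetical model: each tree is a list of leaves (box, scores per class).
MODEL = [
    [([(-INF, 5.0), (-INF, INF)], [0.9, 0.1]),
     ([(5.0, INF), (-INF, INF)], [0.2, 0.8])],
    [([(-INF, INF), (-INF, 2.0)], [0.7, 0.3]),
     ([(-INF, INF), (2.0, INF)], [0.4, 0.6])],
]

def intersect(box_a, box_b):
    """Intersection of two boxes (one interval per dimension), or None if empty."""
    out = []
    for (la, ha), (lb, hb) in zip(box_a, box_b):
        lo, hi = max(la, lb), min(ha, hb)
        if lo >= hi:
            return None
        out.append((lo, hi))
    return out

def max_intersection_boxes(model):
    """One leaf per tree; keep non-empty intersections with summed scores."""
    boxes = []
    for leaves in product(*model):
        region = [(-INF, INF)] * len(leaves[0][0])
        for box, _ in leaves:
            region = intersect(region, box)
            if region is None:
                break
        if region is not None:
            score = [sum(col) for col in zip(*(s for _, s in leaves))]
            boxes.append((region, score))
    return boxes

boxes = max_intersection_boxes(MODEL)
for region, score in boxes:
    print(region, score)
```

Because exactly one leaf of every tree covers each resulting region, the aggregated score is constant over it, which is what makes the box "pure".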

The decomposition unit can be configured to determine the maximum intersection boxes of the model by induction on the dimensions of the input feature space, by applying a plurality of iterations, each iteration corresponding to one of the dimensions, the decomposition unit providing the pure-box decomposition of the model in the form of maximum intersection boxes, dimension by dimension.

In one embodiment, the decomposition unit may be configured to determine, from a decomposition into maximum intersection boxes of the restriction of the model to its first d dimensions, a decomposition into maximum intersection boxes of the restriction of the model to its first d+1 dimensions, d designating a given dimension of the model. A restriction of the model to the first d dimensions designates the restriction, to the first d dimensions of the model, of the set of boxes corresponding to the leaves of the decision trees of the model.
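
One possible reading of this induction is sketched below: a node at depth d holds a box restricted to the first d dimensions together with the leaves still compatible with it, and the step to depth d+1 cuts dimension d at the interval endpoints of those leaves. This is an illustrative reconstruction under stated assumptions, not the procedure claimed in the description: leaves are given as boxes, each tree's leaves partition the feature space, scores are aggregated by summation, and the names `LEAVES` and `decompose` are hypothetical.

```python
import math

INF = math.inf

# Leaves of all trees pooled together: (box, scores per class).
LEAVES = [
    ([(-INF, 5.0), (-INF, INF)], [0.9, 0.1]),   # tree A
    ([(5.0, INF), (-INF, INF)], [0.2, 0.8]),    # tree A
    ([(-INF, INF), (-INF, 2.0)], [0.7, 0.3]),   # tree B
    ([(-INF, INF), (2.0, INF)], [0.4, 0.6]),    # tree B
]

def decompose(leaves, n_dims, d=0, prefix=()):
    """Refine along dimension d; yield (pure box, score) once depth n_dims is reached.

    leaves: the leaves whose first d intervals contain the partial box `prefix`.
    """
    if d == n_dims:
        score = [sum(col) for col in zip(*(s for _, s in leaves))]
        yield list(prefix), score
        return
    # Endpoints of the active leaves along dimension d define elementary intervals.
    cuts = sorted({e for box, _ in leaves for e in box[d]})
    for lo, hi in zip(cuts, cuts[1:]):
        # A leaf stays active if its interval contains the elementary interval.
        active = [(box, s) for box, s in leaves
                  if box[d][0] <= lo and hi <= box[d][1]]
        yield from decompose(active, n_dims, d + 1, prefix + ((lo, hi),))

pure_boxes = list(decompose(LEAVES, n_dims=2))
for box, score in pure_boxes:
    print(box, score)
```

The recursion tree built here plays the role of a derived tree structure: each level corresponds to one dimension, and the nodes at the last level are the pure boxes of the full model.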

In one embodiment, the maximum intersection boxes are maximum intersection boxes associated with the restriction of the model to the first d+1 dimensions.

Advantageously, the model can be decomposed into a set of pure boxes comprising at most a bounded number of elements (the bound itself is not reproduced in this text), N denoting the number of leaves of the decision trees of the model, the bound also depending on the dimension of the input feature space applied to the model.

In one embodiment, the counterfactual point calculation unit can be configured to calculate the counterfactual point by determining, for each pure box, the point of the surface of the pure box that lies at the minimum distance from the query point representing the input datum.

In one embodiment, the point Z of the surface of the pure box Bi that lies at the minimum distance from the query point X representing the input datum is given by:
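
The expression itself is not reproduced in this text. As a sketch only, and assuming axis-aligned boxes treated as closed intervals together with the Euclidean distance, the closest point of a box to X is obtained by clamping each coordinate of X into the corresponding interval. The function names and the sample values are illustrative.

```python
import math

def closest_point_on_box(x, box):
    """Clamp each coordinate of x into the interval [lo, hi] of the box."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, box)]

def distance(x, z):
    """Euclidean distance between two points (assumed metric)."""
    return math.sqrt(sum((xi - zi) ** 2 for xi, zi in zip(x, z)))

X = [6.0, 3.0]                                # hypothetical query point
B = [(-math.inf, 5.0), (2.0, math.inf)]       # hypothetical pure box
Z = closest_point_on_box(X, B)
print(Z, distance(X, Z))
```

A coordinate already inside its interval is left unchanged, so only the features that actually separate X from the box contribute to the distance. If the boxes are half-open, the clamped point sits on an excluded boundary and a small offset into the box would be needed; that detail is left out here.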

In one embodiment, the diagnostic device can be configured to calculate a lower bound and an upper bound for each node of the derived tree structure corresponding to the decomposition into pure boxes, each level of the derived tree structure corresponding to a dimension of the model. The lower bound is a lower bound on the distance to the query point X of all the pure boxes lying below the node, and the upper bound is an upper bound on the distance from the counterfactual point to the query point. The device is configured to stop the construction of the derived tree structure at the nodes for which the lower bound is greater than the upper bound.

In one embodiment, for a given node of the derived tree structure, the lower bound is equal to the distance from the query point to the pure box corresponding to this node.
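
The pruning rule can be sketched as a depth-first search in which the lower bound of a node is the squared distance from X to the partial box built so far, and the upper bound is the squared distance to the best target-class box found so far. This is an illustrative reconstruction, not the claimed implementation. Assumptions: Euclidean distance, boxes treated as closed, summation as aggregation, and a visiting order (closest interval first) chosen here for convenience. The names `LEAVES`, `gap` and `search` are hypothetical.

```python
import math

INF = math.inf

LEAVES = [  # leaves of all trees pooled together: (box, scores per class)
    ([(-INF, 5.0), (-INF, INF)], [0.9, 0.1]),
    ([(5.0, INF), (-INF, INF)], [0.2, 0.8]),
    ([(-INF, INF), (-INF, 2.0)], [0.7, 0.3]),
    ([(-INF, INF), (2.0, INF)], [0.4, 0.6]),
]

def gap(x, lo, hi):
    """Distance from the scalar x to the interval [lo, hi]."""
    return max(lo - x, 0.0, x - hi)

def search(leaves, x, target, d=0, sq_lower=0.0, point=(), best=None):
    """Build the derived tree depth-first, pruning with lower/upper bounds.

    sq_lower: squared distance from x to the partial box (first d dimensions).
    best: [squared upper bound, counterfactual point], updated in place.
    """
    if best is None:
        best = [INF, None]
    if sq_lower > best[0]:
        return best                       # prune: no box below can be closer
    if d == len(x):
        score = [sum(col) for col in zip(*(s for _, s in leaves))]
        if max(range(len(score)), key=score.__getitem__) == target:
            best[0], best[1] = sq_lower, list(point)
        return best
    cuts = sorted({e for box, _ in leaves for e in box[d]})
    # Visit the intervals closest to x[d] first to tighten the upper bound early.
    for lo, hi in sorted(zip(cuts, cuts[1:]), key=lambda iv: gap(x[d], *iv)):
        active = [(box, s) for box, s in leaves
                  if box[d][0] <= lo and hi <= box[d][1]]
        g = gap(x[d], lo, hi)
        z = min(max(x[d], lo), hi)        # closest coordinate inside the interval
        search(active, x, target, d + 1, sq_lower + g * g, point + (z,), best)
    return best

sq_dist, Y = search(LEAVES, [6.0, 3.0], target=0)
print(Y, math.sqrt(sq_dist))
```

Adding a dimension can only increase the distance to a box, so the partial distance is a valid lower bound for every pure box below the node; branches that already exceed the best known distance are never expanded.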

The counterfactual point for a query point X and a target class “j” can be determined from the pure boxes located at depth D of the tree structure calculated for the model according to the equation:
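
The equation is not reproduced in this text. A plausible reading, given the preceding paragraphs, is a minimization over the pure boxes whose aggregated score selects class j. The sketch below implements that reading by exhaustive search. It assumes the Euclidean distance, argmax of the score vector as the class of a box, and boxes treated as closed; `PURE_BOXES` and the function names are hypothetical.

```python
import math

INF = math.inf

# Hypothetical pure boxes at depth D: (box, aggregated score per class).
PURE_BOXES = [
    ([(-INF, 5.0), (-INF, 2.0)], [1.6, 0.4]),
    ([(-INF, 5.0), (2.0, INF)], [1.3, 0.7]),
    ([(5.0, INF), (-INF, 2.0)], [0.9, 1.1]),
    ([(5.0, INF), (2.0, INF)], [0.6, 1.4]),
]

def predicted_class(score):
    """Class of a pure box: index of its largest aggregated score (assumed)."""
    return max(range(len(score)), key=score.__getitem__)

def counterfactual(x, pure_boxes, target):
    """Closest point to x among the pure boxes whose class is `target`."""
    best_point, best_dist = None, INF
    for box, score in pure_boxes:
        if predicted_class(score) != target:
            continue
        # Closest point of the box to x: clamp each coordinate.
        z = [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, box)]
        dist = math.dist(x, z)
        if dist < best_dist:
            best_point, best_dist = z, dist
    return best_point, best_dist

Y, dist = counterfactual([6.0, 3.0], PURE_BOXES, target=0)
print(Y, dist)
```

Because the score is constant on each pure box, checking the class once per box is enough; the search never has to evaluate the model at intermediate points.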

There is further provided an anomaly diagnosis method comprising the steps of:
- in response to a received request comprising an input datum and a target class, determining the class to which the input datum belongs among a set of classes, by applying the input datum to a multiclass classification model, the input datum corresponding to a query point,
the classification model using a set of decision trees, the input datum comprising a set of input features, the input feature space having a given number of dimensions,
each decision tree comprising a set of nodes defining a tree structure from a root node to a set of leaves, each leaf of the decision tree comprising a set of scores associated with each class, each decision tree being applied to the input datum to determine an elementary classification decision from the scores associated with the leaves of the decision tree, an aggregation function being applied to the elementary classification decisions provided by each of the trees of the model to determine the class of the input datum;
- decomposing at least part of the set of leaves associated with the decision trees of the model into multidimensional boxes, called pure boxes, from the input query, the pure boxes corresponding to a collection of intervals, each pure box corresponding to a region of the input feature space and being associated with a vector score determined from the scores associated with the leaves of the decision trees;
- calculating a counterfactual point (Y) representing the virtual point closest to the query point that belongs to the target class, from the decomposition into pure boxes;
- determining the difference between the counterfactual point (Y) and the input query (X), which provides a change vector (dX) representing the minimum changes to be applied to the input datum for it to be classified in the target class by the classification model, the change vector (dX) having the same size as the input features.

The change vector provided by the anomaly diagnosis device and method makes it possible to interpret the decisions of the classification model at a point of the input feature space by comparing this point with the closest point belonging to another class (class membership being defined by the classification model).

The embodiments of the invention provide an exact and scalable solution to the algorithmic problem underlying the search for the nearest counterfactual point, for classification models of the "set of decision trees" type potentially comprising a very large number of trees.

Brief Description of Figures

Other characteristics, details and advantages of the invention will become apparent on reading the description given with reference to the appended drawings, which are given by way of example and which show, respectively:
- an anomaly diagnosis device, according to certain embodiments;
- an example of a decision tree with the associated decision space, according to certain embodiments;
- the counterfactual point for an example classification problem;
- a decomposition of multidimensional boxes into maximal-intersection boxes, according to one embodiment;
- the decomposition into maximal-intersection boxes of a set consisting of the union of two boxes, according to one embodiment;
- a property of intersecting boxes, according to an exemplary embodiment;
- the steps of the method of decomposition into pure boxes, in an exemplary embodiment;
- a diagram of an example of a derived tree structure obtained from the decomposition of the model, according to certain embodiments;
- a flowchart of the anomaly diagnosis method, according to the embodiments of the invention; and
- an example of a high-level representation of the operation of the decomposition method with backtracking, according to one embodiment.

In addition, the detailed description is supplemented by Appendix A. Appendix A includes examples of methods described in pseudo-code (Algorithms 1 to 5), which can be implemented in certain embodiments.

This Appendix is set apart for the sake of clarity and to make cross-referencing easier. It is an integral part of the description, and can therefore serve not only to provide a better understanding of the present invention, but also to contribute to its definition, where appropriate.

In Appendix A, notation conventions familiar to those skilled in the art are used, such as the conditional statement "if-then-else" and the loop statements "while-do" and "for-do". The algorithms involve various named variables.

This description may involve elements eligible for protection by author's rights and/or copyright. The rights holder has no objection to the identical reproduction by anyone of this patent document or of its description, as it appears in official records. Otherwise, the rights holder reserves all rights.

Detailed Description of the Application

The embodiments of the invention provide an anomaly diagnosis device and method based on an explicit and exact calculation of the decision regions of a multiclass classification model of the "set of trees" type, using a geometric characterization of the decision space of such a model, in response to a received request comprising an input datum (represented by a query point) and a target class. The anomaly diagnosis device and method determine the class to which the input datum belongs among a set of classes, by applying the input datum to the multiclass classification model.

From the geometric characterization of the decision space, the device and the method according to the embodiments of the invention determine a counterfactual point, which is the virtual point belonging to the target class that is closest to the query point in terms of Euclidean distance in the input feature space of the model.

Once the virtual point has been extracted, a change vector is determined. It represents the minimum changes to be applied to the input datum for it to be classified in the target class by the model, which provides the explanation of the decision of the model. Such a change vector can then be used to diagnose the anomaly.

The figure shows an anomaly diagnosis device 100 according to certain embodiments.

The anomaly diagnosis device 100 comprises a classification unit 11 configured to determine, in response to a request comprising an input datum and a target class, the class to which the input datum belongs among a set of classes, using a multiclass classification model 10 of the "set of decision trees" type.

The input datum corresponds to a query point X. The input datum comprises a set of input features. The input feature space has a given number of dimensions.

The anomaly diagnosis device 100 according to the embodiments of the invention makes it possible to determine an exact answer to a request of the type: "for a given point X belonging to one class, which "virtual" point of a different target class "j" is the closest in terms of Euclidean distance in the input feature space?".

As used here, a "virtual" point refers to a point of the input feature space that does not necessarily exist in the training set of the model 10, but that is constructed using the geometric characterization of the model.

The anomaly diagnosis model 10 is a binary classification model of the "set of decision trees" type, such as XGBoost or LightGBM.

Given multidimensional numerical input features characterizing an object, the classification model 10 makes it possible to classify this object among the classes of the set of classes. For example, if the classification model 10 is a binary classification model, the set of classes may comprise a "presence of anomaly" class and an "absence of anomaly" class.

The classification model 10 of the "set of decision trees" type is trained beforehand using training data.

Each decision tree of the classification model comprises a set of nodes defining a tree structure from a root node to a set of leaves. Each leaf of the decision tree comprises a set of scores associated with each class. The input datum is applied to each decision tree of the model, which provides an elementary classification decision from the scores associated with the leaves of the decision tree. The classification unit 11 applies an aggregation function to the elementary classification decisions provided by each of the decision trees of the model to determine the class of the input datum.

The diagnostic device 100 comprises a decomposition unit 14 for determining the geometric characterization of the model 10.

The decomposition unit 14 is configured to decompose at least part of the set of leaves associated with the decision trees of the model into multidimensional boxes, called "pure boxes", from the input query. The pure boxes correspond to a collection of intervals, each pure box corresponding to a region of the input feature space and being associated with a vector score determined from the scores associated with the leaves of the decision trees. The domain of definition of the model 10 of the "set of trees" type is exactly the union of the boxes corresponding to the leaves of the trees.
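As a rough picture of what a pure box may look like, the sketch below intersects the boxes of two leaves taken from two different trees and combines their score vectors. It rests on assumptions that are not stated in this passage: boxes are stored as lists of `(low, high)` bounds, a pure box is pictured as the intersection of one leaf box per tree, and the vector score is obtained by summing the leaf score vectors. The function names are hypothetical, and the actual decomposition procedure is the one given in the description and in Appendix A.

```python
import math


def intersect_boxes(box_a, box_b):
    """Intersection of two axis-aligned boxes, or None if it is empty.

    Each box is a list of (low, high) pairs, one per dimension (assumed layout).
    """
    result = []
    for (low_a, high_a), (low_b, high_b) in zip(box_a, box_b):
        low, high = max(low_a, low_b), min(high_a, high_b)
        if low >= high:
            return None  # the two boxes do not overlap along this dimension
        result.append((low, high))
    return result


def add_scores(scores_a, scores_b):
    """Component-wise sum of two score vectors (summation is an assumption)."""
    return [a + b for a, b in zip(scores_a, scores_b)]


# One leaf from a first tree and one leaf from a second tree (made-up values).
leaf_1 = ([(-math.inf, 0.5), (-math.inf, math.inf)], [1.0, 0.0])
leaf_2 = ([(0.25, math.inf), (-math.inf, 0.5)], [0.0, 0.5])

region = intersect_boxes(leaf_1[0], leaf_2[0])
score = add_scores(leaf_1[1], leaf_2[1])
print(region, score)
```

Inside the resulting region every point reaches the same two leaves, so the combined score is constant there, which is the property the term "pure" suggests.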

The diagnostic device 100 comprises a counterfactual point calculation unit 17 configured to calculate, from the decomposition into pure boxes, a counterfactual point (Y) representing the virtual point belonging to the target class that is closest to the query point in terms of Euclidean distance in the input feature space.

The diagnostic device 100 comprises a change vector determination unit 18 configured to determine the difference between the counterfactual point Y and the query point X, which provides a change vector dX representing the minimum changes to be applied to the input datum for it to be classified in the target class by the classification model. The change vector dX has the same size as the input features. The vector dX is representative of the underlying causes of the classification decision made by the model 10 for the input datum and can be used to interpret the classification decision of the model.
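The relation between the query point, the counterfactual point and the change vector can be illustrated with a few lines of code. The values below are made up, and the function name `change_vector` is hypothetical; the only thing taken from the description is that dX is the difference between Y and X and has the same size as the input features.

```python
def change_vector(x, y):
    """Return dX = Y - X, component by component."""
    if len(x) != len(y):
        raise ValueError("X and Y must have the same number of features")
    return [y_i - x_i for x_i, y_i in zip(x, y)]


x = [0.25, 1.5, 3.0]  # query point X (made-up values)
y = [0.25, 2.0, 2.5]  # counterfactual point Y (made-up values)

dx = change_vector(x, y)
# Indices of the features that would have to change to reach the target class.
changed = [index for index, delta in enumerate(dx) if delta != 0.0]
print(dx, changed)  # prints [0.0, 0.5, -0.5] [1, 2]
```

Reading the non-zero components of dX is what allows the vector to be used as an explanation: features with a zero component play no part in moving the input to the target class.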

The anomaly diagnosis device 100 according to the embodiments of the invention thus uses a geometric representation of a classification model 10 of the "set of decision trees" type, obtained from the decomposition of the model into pure boxes.

The anomaly diagnosis device 100 according to the embodiments of the invention advantageously provides a fast and exact answer to such an input query represented by the features X. It is in particular suited to the use of arbitrarily large classification models 10.

To facilitate understanding of the following description of certain embodiments and of the definitions used in connection with them, the following notations are defined in relation to the classification model 10:

D denotes the dimension of the input feature space applied to the model 10;
The input features presented as input to the model are represented by a vector X ∈ R^D;
K denotes the number of classes predefined in the training set;
F denotes the classification model 10. A model defines a mapping between the input feature space and a class.

Furthermore, a decision tree designates a binary tree comprising a root node from which a tree structure comprising a set of nodes extends. A node of the decision tree (including the root node) contains an input feature index and a threshold value. When a datum arrives at a given node of the decision tree, it goes into the left branch of the node if the value of the feature at the index associated with the node is lower than the threshold, and into the right branch of the node if that value is greater than the threshold associated with the node. This traversal of the tree by the datum, starting from the root node, is repeated until the datum reaches a leaf of the tree. A leaf of the decision tree comprises a score and a class. The score and the class associated with the leaf reached by the datum are then recorded and associated with the datum.
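The traversal described above can be sketched as follows. The `Node` layout and the function name `route` are hypothetical, and the example tree is made up. The description does not say which branch is taken when the feature value equals the threshold; the sketch assumes the right branch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    # Internal node: feature index and threshold, with two children.
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Leaf: a score and the class it votes for.
    score: Optional[float] = None
    cls: Optional[int] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def route(root: Node, x: Sequence[float]) -> Tuple[float, int]:
    """Walk from the root to a leaf and return its (score, class) pair."""
    node = root
    while not node.is_leaf():
        # Lower than the threshold: left branch. Otherwise: right branch
        # (equality going right is an assumption of this sketch).
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.score, node.cls


tree = Node(
    feature=0, threshold=0.5,
    left=Node(score=1.2, cls=0),
    right=Node(
        feature=1, threshold=0.4,
        left=Node(score=0.7, cls=1),
        right=Node(score=0.3, cls=0),
    ),
)
print(route(tree, [0.8, 0.1]))  # prints (0.7, 1)
```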

It should be noted that a feature index is not constrained to appear exactly once in a path between the root node and a leaf of the tree. An arbitrary number of nodes within the path, including zero, can thus correspond to this index.

Given a tree of a given depth, a leaf of the tree is characterized by a succession of tests, at most as many as the depth, one at each of the nodes located on the path between the root node and the leaf considered.

It is possible to set an arbitrary tree depth independently of the number of dimensions. When the depth of the tree is strictly greater than the number of dimensions, for a given leaf, several tests may be performed along the same dimension on the path between the root node and this leaf.

To test membership in a leaf, a set of interval tests of the following type can be performed along each dimension d:
"x_d ∈ [a, b]?", with a ∈ R ∪ {−∞} and b ∈ R ∪ {+∞}.

A leaf of a tree therefore geometrically represents a multidimensional "box", each face of which is perpendicular to a coordinate axis. Some faces do not exist if the test interval associated with a coordinate is open on one or both sides. In the remainder of the description, a leaf F_i of a tree will be denoted as a collection of D intervals, according to equation 1:

Furthermore, a decision tree will be denoted f hereinafter, and the i-th leaf of the tree will be denoted F_i.

Each leaf F_i of a tree is associated with a (score, class) pair denoted (S[F_i], Class[F_i]), with Class[F_i] ∈ {1, ..., K}, meaning that the leaf F_i votes for the class Class[F_i] with a score S[F_i] for this class, and 0 for the other classes.
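The view of a leaf as a box carrying a (score, class) pair can be sketched with a small data structure. The `Leaf` layout and method names are hypothetical. The sketch assumes half-open intervals, closed on the lower side, which matches a traversal where values lower than a threshold go left; the description itself does not fix this convention.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Leaf:
    intervals: List[Tuple[float, float]]  # one (low, high) pair per dimension
    score: float                          # score of the leaf
    cls: int                              # class voted for, in 1..K

    def contains(self, x: Sequence[float]) -> bool:
        """Interval test along every dimension (low <= x_d < high, assumed)."""
        return all(low <= value < high
                   for value, (low, high) in zip(x, self.intervals))

    def score_vector(self, num_classes: int) -> List[float]:
        """Vector of size K: the score for the voted class, 0 elsewhere."""
        vector = [0.0] * num_classes
        vector[self.cls - 1] = self.score
        return vector


# A leaf open on the left along the first dimension and open at the top
# along the second one (made-up values).
leaf = Leaf(intervals=[(-math.inf, 0.5), (0.4, math.inf)], score=1.2, cls=2)
print(leaf.contains([0.2, 0.9]), leaf.score_vector(3))
```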

The figure shows an example of a decision tree and the associated decision space. By construction, the "boxes" corresponding to the leaves of the same tree do not intersect one another. Intersections occur when models of the "set of trees" type are considered. The example tree shown generates only open boxes. The leaf "F1", for example, is open on the left along "d1" and at the bottom along "d2". The leaf "F6" is open on the right along "d1" and closed on the three other sides.

To obtain a box closed on all sides, at least two tests per coordinate are required, and these tests must induce neither a contradiction (for example, "x1 > 0.5 and x1 < 0.3") nor a redundant check (for example, "x1 > 0.3 and x1 > 0.5"). The leaf "F5", for example, has a redundant test, "x2 < 0.8", because "x2 < 0.4" is tested at the next node on the path from the root node to the leaf "F5". More generally, for a given dimension, a path of the tree has at most two "useful" tests (i.e., tests inducing neither a contradiction nor a redundant check), which define a closed interval for that dimension.
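The reduction of the tests met along a path to at most two useful bounds per dimension can be sketched as follows. The representation of a test as a `(feature_index, operator, threshold)` tuple and the name `path_to_box` are assumptions made for illustration. Redundant tests are absorbed by keeping the tightest bound, and a contradiction is reported as an error.

```python
import math


def path_to_box(tests, num_dims):
    """Reduce the tests of a root-to-leaf path to one interval per dimension.

    tests is a list of (feature_index, operator, threshold) tuples, where
    operator is "<" or ">" (hypothetical encoding of the node tests).
    """
    box = [(-math.inf, math.inf)] * num_dims
    for feature, operator, threshold in tests:
        low, high = box[feature]
        if operator == "<":
            high = min(high, threshold)  # a looser "<" test is redundant
        elif operator == ">":
            low = max(low, threshold)    # a looser ">" test is redundant
        else:
            raise ValueError(f"unknown operator: {operator}")
        if low >= high:
            raise ValueError("contradictory tests along the path")
        box[feature] = (low, high)
    return box


# Tests in the spirit of leaf "F5": "x2 < 0.8" is made redundant by "x2 < 0.4".
# Feature indices are zero-based here, so x2 has index 1.
print(path_to_box([(1, "<", 0.8), (1, "<", 0.4), (0, ">", 0.5)], num_dims=2))
```

Whatever the number of tests on a dimension, only the largest lower threshold and the smallest upper threshold survive, which is why a path carries at most two useful tests per dimension.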

In the remainder of the description, a simplified notation will be used to designate the score of each leaf in the form of a vector, whose component for the class of the leaf is the score of the leaf and whose other components are zero.

An input datum, corresponding to a query point X, applied to the classification model 10 follows a single path in a tree until it reaches a single leaf of the tree, which carries a score. To designate the score associated with the leaf of the tree that is reached for an input query X, the following notation is used:

Using leaf membership indicator functions, this function can be reformulated according to equation (2):

As used here, a "collection of decision trees" refers to a family of models that includes, for example, XGBoost models, LightGBM models, random forest models, and individual decision trees.

A classification model 10 of the "collection of decision trees" type is defined as a set of decision trees equipped with a vector aggregation function of the form:

The vector aggregation function is used to aggregate the individual decisions (elementary classification decisions) of each of the trees of the model into a single decision.

Given an input feature vector X, the output provided by the model F in response to an input query X, representing the decision made by the model on X, is calculated by:

For example, for classification models of the "XGBoost" type, the aggregation function can be of the form:

The output of the model is therefore a vector comprising a set of components, each component being associated with one class among the classes. The component of the vector associated with the k-th class comprises class membership information indicating whether or not the input datum belongs to that class (the "belief" of the model that the datum belongs to the class). The class membership information may for example be a probability. In one embodiment, the model F determines that a datum is classified in a class (i.e. the class membership information indicates that the input datum belongs to the class) if and only if:
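The aggregation and decision steps can be sketched in Python under two explicit assumptions that the surviving text does not confirm: the aggregation sums the per-tree score vectors and applies a softmax (a common choice for multi-class boosted trees), and a datum is assigned to the class whose component is the largest. The names `aggregate` and `predicted_class` are hypothetical.

```python
import math

def aggregate(leaf_scores):
    """Aggregate one score vector per tree into a single output vector.

    Assumed form: component-wise sum followed by a softmax, so that each
    component can be read as a class membership probability.
    """
    total = [sum(col) for col in zip(*leaf_scores)]
    m = max(total)                          # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in total]
    z = sum(exps)
    return [e / z for e in exps]

def predicted_class(output):
    """Assumed decision rule: index of the largest component."""
    return max(range(len(output)), key=output.__getitem__)

# Two trees, three classes: one score vector per tree for the leaf reached.
scores = [[0.2, -0.1, 0.0], [0.5, 0.3, -0.2]]
probs = aggregate(scores)
k = predicted_class(probs)
```

Since the softmax is monotonic, the predicted class is the same as the one obtained from the summed scores directly.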

The decomposition unit 14 is configured to determine a geometric reformulation of a model as a collection of leaves/boxes, each associated with a vector score. In the rest of the description, the notation given by the following formula (3) is used:

Consider a classification model (10), a query associated with an input datum, a class and a target class, the query consisting of "determining the virtual point of the target class that is closest, in Euclidean distance, to a given element of the class". The virtual point, representing the "counterfactual point of the target class associated with a point", is defined by the following formula (4):

To determine the response to the query, the anomaly diagnosis device 100 uses the decomposition unit 14 to determine the decision regions of the model in the space of the input features of the input datum. The decision regions of the model are the regions within which the model makes one and only one decision regarding the classification of the elements of the region. Such regions are also called "pure regions" of the model.

Fig. 2B illustrates the notion of a counterfactual point for a binary classification problem between a first class C1 and a second class C2. The figure shows the decision regions of the model for the two classes C1 and C2. For three points of class C1 denoted (P1; P2; P3), the points closest to the target class C2 in the decision space of the classification model are sought. These points are denoted respectively (CF1; CF2; CF3) on the diagram and correspond to the counterfactual points associated respectively with the points (P1; P2; P3).

To facilitate the understanding of the processing implemented by the decomposition unit 14, the following definitions and properties are provided in relation to the notion of "pure region".

A region W is a pure region associated with a model F if:

Furthermore, if Class(W)=k, the region W is called a "pure region of class k associated with the model F". In the rest of the description, the notation Class(W)=k is used to indicate that the pure region W is a pure region of class k.

The characterization of the decision space of a model F denotes a decomposition of the definition space of F into distinct pure regions, as implemented by the decomposition unit 14.

Given a model F with definition set E_F ⊂ R^D, a decomposition of E_F into pure regions, as implemented by the decomposition unit 14, denotes a decomposition such that:

- each region of the decomposition is a pure region associated with F;

- ;- ;

- .- .

It should be noted that such a decomposition is not unique.

The distance from a point X to a region W is defined by:

(5)

The point of W that attains this distance is then the point of the region closest to X in Euclidean distance.

Given a model and an associated decomposition into pure regions, in some embodiments the counterfactual point calculation unit 17 may determine the counterfactual point of the target class associated with a point, as defined by equation (4), by solving the following equation (6):

Advantageously, the decomposition unit 14 can be configured to determine a decomposition into pure regions of the definition set of the model F such that the distance to a pure region, as defined by equation (5), can be computed exactly and efficiently, which makes it possible to determine a geometric reformulation of a classification model F (10).
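When the pure regions are axis-aligned boxes, the distance of equation (5) has a closed form: the closest point of a box is obtained by clamping each coordinate of the query to the corresponding interval. The sketch below uses this to search for the counterfactual point as the closest projection over the pure boxes of the target class, and derives the change vector as the difference between the counterfactual point and the query. The function names and the `(box, class)` representation are assumptions made for this example, and boxes are treated as closed sets.

```python
import math

def project_onto_box(x, box):
    """Closest point of an axis-aligned box to x: clamp each coordinate."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, box)]

def counterfactual(x, pure_boxes, target):
    """Closest point to x among the pure boxes labelled with the target class.

    `pure_boxes` is a list of (box, class_label) pairs (hypothetical layout).
    Returns (point, distance), or (None, inf) if no box has the target class.
    """
    best, best_d = None, math.inf
    for box, cls in pure_boxes:
        if cls != target:
            continue
        y = project_onto_box(x, box)
        d = math.dist(x, y)          # Euclidean distance (Python 3.8+)
        if d < best_d:
            best, best_d = y, d
    return best, best_d

pure_boxes = [
    ([(0.0, 1.0), (0.0, 1.0)], 0),
    ([(1.0, 2.0), (0.0, 1.0)], 1),
    ([(0.0, 2.0), (1.0, 3.0)], 1),
]
x = [0.25, 0.5]
y, d = counterfactual(x, pure_boxes, target=1)
dx = [yi - xi for yi, xi in zip(y, x)]   # change vector dX = Y - X
```

The search is linear in the number of pure boxes. The projection lies on the boundary of the selected box, so in practice a small offset toward its interior may be needed for the model to return the target class; that detail is outside this sketch.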

For a model of the "collection of decision trees" type, the decomposition unit 14 can further be configured to determine a decomposition into pure regions in the form of multidimensional "boxes", as defined by equation (1) of annex (A). Such a decomposition is also called a "decomposition into pure boxes". It should be noted that the definition set of a model F of the "set of trees" type is the union of the boxes corresponding to the leaves of the trees.

Given a model F of the "set of trees" type, the decomposition unit 14 can thus determine a decomposition into pure boxes of the set of leaves of the classification model F.

It should further be noted that, because of the aggregation function g associated with a model and the potential intersection between leaves belonging to different trees, the boxes corresponding to the leaves are not necessarily pure regions associated with the model.

Thus, in one embodiment, the decomposition unit 14 can be configured to additionally perform a decomposition into maximal intersection boxes associated with the model F considered.

As used here, a region is called a "maximal intersection" region associated with a model F of the "set of trees" type if:

- there exists a subset of leaves ⊂ such that ,

- for any leaf ,

A maximal intersection region associated with a model of the "set of trees" type is a pure region associated with this model. Indeed, by construction, a maximal intersection region has a uniform score over the whole region, resulting from the aggregation of the scores of the leaves that intersect to form this region:

As used here, a "maximal intersection box" associated with a model F of the "set of trees" type denotes a maximal intersection region of F that is a box.

Given a model of the "set of trees" type comprising a set of leaves, a decomposition into maximal intersection boxes, as used here, denotes a decomposition such that:

- ∀ n ∈ {1, ..., N}, the n-th element of the decomposition is a maximal intersection box associated with the model;

- ;- ;

- , - ,

The figure illustrates a decomposition of multidimensional boxes into maximal intersection boxes, according to an exemplary embodiment. A model with three leaves is considered, represented by the boxes B1, B2, B3 in the left diagram. The right diagram shows a decomposition of the three leaves into 11 maximal intersection regions of the "box" type.

The figure illustrates the decomposition into maximal intersection boxes of a set consisting of the union of two boxes, B1 and B2, as performed by the decomposition unit according to certain embodiments. A first phase decomposes the boxes B1 and B2 of the definition set, which gives three maximal intersection regions R1, R2 and R3. A second phase then decomposes the maximal intersection regions R1, R2 and R3 into maximal intersection boxes W1, W2, W3, W4 and W5 (R1 is decomposed into W1 and W2; R2 is decomposed into W3; and R3 is decomposed into W4 and W5).

As illustrated in the diagram, the regions Ri are not necessarily boxes, but can also be disjoint assemblies of boxes, such as the regions R1 and R3, each of which breaks down into two maximal intersection boxes (W1 and W2 for R1; W4 and W5 for R3).

More generally, an intersection between two D-dimensional boxes has a decomposition comprising at most 2D+1 boxes.

Moreover, given a model F of the "set of trees" type comprising N leaves and whose definition set is included in R^D, it is possible to find a decomposition of the leaves of F into pure boxes with at most (2N-1)^D elements.

In one embodiment, the decomposition unit 14 can be configured to determine the maximal intersection boxes of F (which are also pure regions associated with F) by induction, dimension by dimension.

In one embodiment, the decomposition unit 14 can be configured to determine, from a decomposition into maximal intersection boxes of the restriction of the model F (F being a model of the "set of trees" type) to the first d dimensions, a decomposition into maximal intersection boxes of the restriction of F to the first d+1 dimensions (heredity property, i.e. the inductive step).

As used here, the restriction of a model F of the "set of trees" type to the first d dimensions designates the restriction, to the first d dimensions, of the set of boxes corresponding to the leaves of F.

Consider a model F of the "set of trees" type whose definition set is included in R^D and which comprises N leaves.

To demonstrate the heredity property that can be exploited by the decomposition unit 14, the initialization case D=1 is considered first.

When the dimension D is equal to one (D=1), the decomposition unit 14, in order to implement a decomposition into maximal intersection boxes, can determine the maximal intersections resulting from the intersection of N intervals. Although the problem of "determining the maximal intersections resulting from the intersection of N intervals" may seem combinatorial (given m intervals among N, one would have to check whether these intervals share a common intersection zone, which would amount to examining every potential intersection), in practice the combinatorial aspect can be avoided by observing that a maximal intersection zone begins or ends at a point if and only if at least one of the N intervals begins or ends at that point. For N intervals, there are at most N interval starts and N interval ends, which therefore creates at most 2N-1 maximal intersection regions (the last interval to end does not create a new maximal intersection zone but closes the last one, hence the "-1").

As a variant, to find the maximal intersection zones in the case D=1, the decomposition unit 14 can apply a method consisting of sorting the 2N values corresponding to the start and end of the N intervals, and creating a new maximal intersection zone each time an interval starts or ends. A Boolean matrix A of size N × (2N-1) can thus be created, such that its component (i,j), denoted A(i,j), is TRUE if the i-th interval is present in the j-th maximal intersection region, and FALSE otherwise. The matrix A may be very large. In general, however, A is very sparse, which allows it to be constructed explicitly whatever the number N of leaves of the model.
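The sweep over sorted interval endpoints can be sketched as follows. This is an illustrative Python version and not the pseudo-code of the annex: the name `intersect_1d` and the output layout are assumptions, and the sparse Boolean matrix A is represented by one set of interval indices per region rather than by an explicit N × (2N-1) array.

```python
def intersect_1d(intervals):
    """Maximal intersection regions of N one-dimensional intervals.

    `intervals` is a list of (start, end) pairs with start < end.
    Returns a list of ((lo, hi), indices), where `indices` is the frozenset of
    input intervals covering the region (column j of the matrix A).
    """
    events = []
    for i, (a, b) in enumerate(intervals):
        events.append((a, 1, i))   # start event
        events.append((b, 0, i))   # end event
    events.sort()                  # the 2N events in ascending order

    started = set()                # intervals started and not yet ended
    regions = []
    prev = None
    for point, is_start, i in events:
        # Close the region running from the previous event to this one.
        if started and prev is not None and point > prev:
            regions.append(((prev, point), frozenset(started)))
        if is_start:
            started.add(i)
        else:
            started.discard(i)
        prev = point
    return regions

regions = intersect_1d([(0, 2), (1, 3), (4, 5)])
```

With this layout, A(i,j) is TRUE exactly when `i in regions[j][1]`. Gaps that no interval covers (here between 3 and 4) produce no region.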

When the dimension D is strictly greater than one (D>1), a decomposition into maximal intersection boxes is considered for a dimension d < D, defined by:

From such a decomposition, the decomposition unit 14 can determine a decomposition into maximal intersection boxes at dimension d+1.

An arbitrary box of a given dimension, identified by its index, is denoted hereafter:

, ..., } , ..., }

Moreover, the n-th leaf of the model is denoted hereafter:

, ..., } , ..., }

The method for determining the maximal intersection zones can be implemented, for example, according to the Intersect1D procedure described in pseudo-code in Algorithm 1 of annex B.1.

Such a method is implemented to calculate the maximal intersection regions associated with N one-dimensional intervals. The n-th maximal intersection region constructed is defined by an interval and is formed by the intersection of those input intervals whose indices are contained in an associated index set.

The method for determining the maximal intersection zones takes as input a collection of N intervals.

The method for determining the maximal intersection zones also uses:

- the set P of events corresponding to the start or the end of an interval;

- the set R containing the indices of the intervals associated with the events of P;

– the index vector I and the sorted set resulting from sorting the set P in ascending order;

– a Boolean vector Q such that Q(n)==TRUE if and only if the event I(n) corresponds to the start of an interval, in which case the interval concerned is R[I][n];

– a Boolean vector of size N, denoted "Started", whose values are initialized to FALSE, such that Started[n]==TRUE if and only if the start event of the n-th interval has occurred and its end event has not yet occurred;

- a variable id.n initialized to 1 (id.n ← 1);

- a variable ind.unique initialized to 1 (ind.unique ← 1);

- a Boolean matrix A of size N × (2N-1) initialized to FALSE.

The method for determining the maximal intersection zones determines and outputs the maximal intersection zones:

- with and

- with .

Given a box, its restriction to the first d dimensions (d ≤ D) is the box formed by its first d intervals. By analogy, the restriction of a leaf to the first d dimensions is defined in the same way.

In the remainder of the description, consider the set comprising the indices of the leaves of the model, restricted to the first d dimensions, that have a non-empty intersection with a given box. The following property can be used to add a dimension:

If two boxes do not intersect along a dimension d, that is, if their intervals along that dimension are disjoint, then the two boxes have an empty intersection.

Such a geometric property follows directly from the fact that, for a given dimension, a box lies between two hyperplanes whose normal vector is the unit vector of the corresponding coordinate axis, passing respectively through the lower and upper bounds of the box along that dimension.

The figure illustrates such a property. In the left diagram, two boxes B1 and B2 intersect along dimension 1 but do not intersect along dimension 2. Conversely, in the right diagram, the boxes B1 and B2 intersect along dimension 2 but do not intersect along dimension 1. In both cases, the intersection of B1 and B2 is empty.

Thus, two boxes that do not intersect along one dimension have an empty intersection (even if they intersect along the other dimensions).
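This property translates directly into a per-dimension test, sketched below in Python. The function name `intersect_boxes` is hypothetical, and boxes that only touch along a face are treated as not intersecting, which is an assumption of this example.

```python
def intersect_boxes(b1, b2):
    """Intersection of two axis-aligned boxes, or None if it is empty.

    Each box is a list of (lo, hi) intervals, one per dimension. The loop stops
    at the first dimension along which the intervals are disjoint.
    """
    out = []
    for (lo1, hi1), (lo2, hi2) in zip(b1, b2):
        lo, hi = max(lo1, lo2), min(hi1, hi2)
        if lo >= hi:
            return None  # disjoint along this dimension: empty intersection
        out.append((lo, hi))
    return out

b1 = [(0, 2), (0, 1)]
b2 = [(1, 3), (2, 3)]    # overlaps b1 along dimension 1 only
b3 = [(1, 3), (0.5, 2)]  # overlaps b1 along both dimensions
empty = intersect_boxes(b1, b2)
overlap = intersect_boxes(b1, b3)
```

The early exit is what allows leaves to be discarded as soon as one dimension rules them out, without examining the remaining dimensions.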

In some embodiments, the decomposition unit 14 may determine the intersection boxes at dimension d+1 independently, starting from each of the maximal intersection boxes at dimension d. For a given maximal intersection box, the method of annex A1.1 (Algorithm 1 in pseudo-code) can for example be applied to the intervals, along the added dimension, of the leaves that intersect that box, in order to extract a decomposition into maximal intersection boxes of the set of leaves restricted to d+1 dimensions.

The decomposition into maximal intersection intervals of the boxes of this set restricted to dimension d, that is, of the corresponding intervals, is denoted hereafter:

Applying the method of annex A1.1, described in pseudo-code by Algorithm 1, gives:

, . , .

In the remainder of the description, the following notation is used to designate the maximal intersection boxes:

The boxes so obtained are maximal intersection boxes associated with the restriction of the model to the first d+1 dimensions.

The decomposition determined by the decomposition unit 14 from all the maximal intersection boxes is a decomposition into maximal intersection boxes of the leaves restricted to d+1 dimensions.

The decomposition unit 14 can therefore determine a decomposition into maximum intersection boxes, and hence a decomposition of the model into pure boxes, dimension by dimension, by building a derived tree structure whose nodes at depth d contain the pure regions and therefore correspond to a decomposition into pure regions of the leaves restricted to the first d dimensions. An example implementation of the method for decomposing the leaves of a model F into pure regions of the "box" type, which can be carried out by the decomposition unit 14, is described in pseudo-code by Algorithm 2 of appendix A1.2. Such a method proceeds by recurrence, dimension by dimension.

The method of decomposition into pure regions implemented by the decomposition unit 14 (for example according to Algorithm 2 of appendix A1.2) takes as input the tree-ensemble model defined by:

- N leaves;

- the scores associated with each of the leaves;

- a vector aggregation function.

The method of decomposition into pure regions implemented by the decomposition unit 14 (for example according to Algorithm 2 of appendix A1.2) outputs the set of pure "box" regions determined for the model, together with the set of scores associated with each of the pure regions.

The method of decomposition into pure regions allows the nodes located at the same depth to be computed concurrently.

The figure illustrates the steps of the method of decomposition into pure boxes (implemented for example according to Algorithm 2 of appendix A1.2), in an exemplary embodiment.

In the example of this figure, step E1 corresponds to the input applied to the diagnostic device 100. This input is a collection of multidimensional boxes Bi with their respective scores Si, representing the classification model 10. The input query comprises a query point R and a target class C2.

Step E2 corresponds to the decomposition into pure boxes, computed recursively across the dimensions as a decomposition into so-called "maximum intersection" boxes.

Step E3 corresponds to the determination of the scores of the pure boxes, obtained by applying the aggregation function "g" of the classification model 10 of the tree-ensemble type:

- G1 = G2 = g(S1);

- G3 = g(S1 + S2);

- G4 = G5 = g(S2).
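The scoring of step E3 can be sketched in Python as follows. The description does not specify the aggregation function g, so the softmax used here is purely a placeholder, and the numerical scores S1 and S2 are made-up values for illustration; only the pattern G = g(sum of the scores of the leaves covering the pure box) comes from the text.

```python
import math
from typing import List, Sequence


def g(summed_scores: Sequence[float]) -> List[float]:
    """Hypothetical aggregation function: softmax over summed class scores."""
    top = max(summed_scores)
    exps = [math.exp(score - top) for score in summed_scores]
    total = sum(exps)
    return [value / total for value in exps]


def add_scores(*vectors: Sequence[float]) -> List[float]:
    """Component-wise sum of the score vectors of the covering leaves."""
    return [sum(components) for components in zip(*vectors)]


# Made-up score vectors over two classes (index 0 = C1, index 1 = C2).
S1 = [2.0, 0.0]
S2 = [0.0, 3.0]

# Leaves covering each of the five pure boxes, as in step E3.
covering = [(S1,), (S1,), (S1, S2), (S2,), (S2,)]
G = [g(add_scores(*leaf_scores)) for leaf_scores in covering]

# Predicted class of each pure box: index of the largest aggregated score.
classes = [scores.index(max(scores)) for scores in G]
print(classes)
```

With these illustrative values, the first two pure boxes fall in class index 0 and the last three in class index 1, so three boxes would be candidates for a target class C2.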

Step E4 corresponds to the determination of the pure boxes belonging to the target class C2, among which the counterfactual point associated with R is sought.

Step E5 corresponds to the computation of the distances from the query point to each of the target boxes.

Step E6 corresponds to the determination of the counterfactual point as the point of the closest target box that is nearest to the query R.
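Steps E4 to E6 can be sketched in Python under two assumptions that are not stated in this passage: the pure boxes are axis-aligned products of intervals, and the distance is Euclidean. The box coordinates, the class labels and the helper names `closest_point` and `nearest_counterfactual` are illustrative only.

```python
import math
from typing import List, Optional, Sequence, Tuple

Box = Sequence[Tuple[float, float]]  # one (low, high) interval per feature


def closest_point(x: Sequence[float], box: Box) -> Tuple[List[float], float]:
    """Closest point of an axis-aligned box to x, with its Euclidean distance."""
    z = [min(max(x_d, low), high) for x_d, (low, high) in zip(x, box)]
    distance = math.sqrt(sum((x_d - z_d) ** 2 for x_d, z_d in zip(x, z)))
    return z, distance


def nearest_counterfactual(
    x: Sequence[float],
    pure_boxes: Sequence[Tuple[Box, str]],
    target: str,
) -> Optional[Tuple[List[float], float]]:
    """Exhaustive search over the pure boxes of the target class."""
    best = None
    for box, label in pure_boxes:
        if label != target:               # E4: keep only target-class boxes
            continue
        z, distance = closest_point(x, box)  # E5: distance to this box
        if best is None or distance < best[1]:
            best = (z, distance)          # E6: keep the closest point so far
    return best


# Illustrative pure boxes in two dimensions with their classes.
pure_boxes = [
    (((0.0, 1.0), (0.0, 2.0)), "C1"),
    (((1.0, 2.0), (0.0, 1.0)), "C1"),
    (((1.0, 2.0), (1.0, 2.0)), "C2"),
    (((1.0, 2.0), (2.0, 3.0)), "C2"),
    (((2.0, 3.0), (1.0, 3.0)), "C2"),
]
R = (0.5, 0.5)
counterfactual, distance = nearest_counterfactual(R, pure_boxes, "C2")
print(counterfactual, distance)
```

The point returned lies on the boundary of the closest target box; whether that boundary point is itself assigned to the target class depends on how the model treats interval endpoints, which this sketch does not address.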

The diagram of Figure 6 shows an example of a derived tree structure 60 obtained by applying the method of decomposing the leaves of a model F into pure regions of the "box" type, as implemented by the decomposition units 12 and 14, according to the implementation of appendix A1.2 (Algorithm 2), for a simplified model F.

In the example of Figure 6, the model is a two-dimensional model (dimensions d1 and d2) comprising two leaves (F1, F2). The first level 60-1 of the derived tree structure 60 (top right of Figure 6) corresponds to dimension d1 and to the application of procedure 1 to the leaves restricted to this dimension. The method for computing the maximum intersection regions (for example according to Algorithm 1 of appendix A1.1) is then applied independently within each of the elementary intervals formed in dimension d1, to the leaves contained in that interval and restricted to dimension d2. Each rounded rectangle 601 symbolically represents one execution of procedure 1 in the dimension associated with the level of the structure (60-1 or 60-2), on the subset of boxes propagated from the upper level 60-1 and restricted to that dimension (d1 or d2). Following the paths in the resulting tree structure 60 yields a decomposition of the leaves F1 and F2 into pure regions of the "box" type. The five pure regions extracted are shown as hatched rectangles 602 in Figure 6.
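The two-level construction described for Figure 6 can be sketched recursively in Python. The coordinates chosen for F1 and F2 are assumptions (the figure itself is not reproduced here), picked so that the two leaves overlap partially; the function `pure_boxes` is a simplified illustration and not the pseudo-code of Algorithm 2.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]
Box = Sequence[Interval]  # one (low, high) interval per dimension


def pure_boxes(
    leaves: Dict[str, Box], dim: int = 0, n_dims: int = 2
) -> List[Tuple[Tuple[Interval, ...], Tuple[str, ...]]]:
    """Decompose leaves into pure boxes, one dimension per recursion level.

    Each result pairs a box with the names of the leaves covering it.
    """
    if dim == n_dims:
        # All dimensions processed: the remaining leaves cover the whole box.
        return [((), tuple(sorted(leaves)))]
    # Elementary intervals of the current dimension.
    cuts = sorted({bound for box in leaves.values() for bound in box[dim]})
    result = []
    for low, high in zip(cuts, cuts[1:]):
        # Leaves propagated to the next level: those covering this interval.
        inside = {
            name: box
            for name, box in leaves.items()
            if box[dim][0] <= low and high <= box[dim][1]
        }
        if not inside:
            continue
        for sub_box, names in pure_boxes(inside, dim + 1, n_dims):
            result.append((((low, high),) + sub_box, names))
    return result


# Assumed coordinates for the two leaves of the simplified model.
leaves = {
    "F1": ((0.0, 2.0), (0.0, 2.0)),
    "F2": ((1.0, 3.0), (1.0, 3.0)),
}
regions = pure_boxes(leaves)
for box, names in regions:
    print(box, names)
```

For these coordinates the sketch produces five pure regions: two covered by F1 alone, one covered by both leaves, and two covered by F2 alone, which matches the count given for Figure 6.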

Figure 7 is a flowchart representing the anomaly diagnosis method according to the embodiments of the invention.

At step 700, a received query, comprising an input datum X and a target class j, is applied to the previously trained model F.

At step 702, a decomposition of the classification model into pure boxes is determined from the input query (for example according to Algorithm 2 of appendix A1.2). Each pure box is a collection of intervals corresponding to a region of the input feature space, and is associated with a vector score determined from the scores associated with the leaves of the decision trees.

At step 704, the counterfactual point CF(X, j), representing the virtual point closest to the query point X that belongs to the target class, is determined from the decomposition into pure boxes, according to equation 6.

At step 706, the difference between the counterfactual point Y and the input query X is determined. This provides a change vector dX representing the minimum changes to be applied to the input datum for it to be classified in the target class by the classification model, the change vector dX having the same size as the input features.
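Step 706 amounts to a component-wise subtraction, which can be sketched in Python as follows. The feature names and numerical values are invented for the example; only the relation dX = Y - X and the fact that dX has one component per input feature come from the description.

```python
from typing import Dict, List, Sequence


def change_vector(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Return dX = Y - X, with one component per input feature."""
    if len(x) != len(y):
        raise ValueError("X and Y must have the same number of features")
    return [y_d - x_d for x_d, y_d in zip(x, y)]


def changed_features(
    names: Sequence[str], dx: Sequence[float]
) -> Dict[str, float]:
    """Keep only the features that have to move to reach the target class."""
    return {name: delta for name, delta in zip(names, dx) if delta != 0.0}


# Hypothetical feature names and values.
feature_names = ["temperature", "pressure"]
X = [0.5, 3.0]  # query point
Y = [1.0, 3.0]  # counterfactual point
dX = change_vector(X, Y)
print(dX, changed_features(feature_names, dX))
```

In this made-up case only the first feature changes, so the diagnosis would point at that feature alone.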

By definition, the counterfactual point Y lies on the outer envelope of a region of the target class j. Indeed, for any interior point Y of a region R, it is possible to find a point of the envelope of R that is closer to the query X than the point Y. It suffices, for example, to select the point of the segment [X, Y] located at the intersection of this segment with the envelope of R.
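The argument can be written out as a short derivation. It assumes that the distance derives from a norm and that the query X lies outside the region R, which holds when X is not classified in the target class; the symbols Y' and t are introduced only for this illustration.

```latex
% Y is an interior point of R and X lies outside R, so the segment [X, Y]
% crosses the envelope of R at some point Y' strictly before reaching Y:
\[
  Y' = X + t\,(Y - X), \qquad 0 \le t < 1 .
\]
% By homogeneity of the norm:
\[
  \lVert Y' - X \rVert = t \,\lVert Y - X \rVert < \lVert Y - X \rVert ,
\]
% hence Y' is a point of the envelope of R that is strictly closer to X than Y.
```

An interior point therefore can never be the closest point of the target class, which is why the search can be restricted to the surfaces of the pure boxes.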

In one embodiment, step 704 of determining the counterfactual point may comprise, for each pure box determined in the decomposition step 702, determining the minimum distance from the query point X, representing the input datum, to the box, as well as the point Z of the surface of the box located at this minimum distance.

For example, for each box, the point Z can be computed by applying a surface point calculation method, such as Algorithm 3 described in pseudo-code in appendix A1.3, proceeding dimension by dimension.

In such an embodiment, the point Z located at the minimum distance between the query point representing the input datum and the pure box can be determined according to the following equation:

The method for computing the point Z (for example the method implemented by the algorithm of appendix A1.3), for a D-dimensional box, may comprise a plurality of iterations, each iteration corresponding to a dimension d. At each iteration corresponding to a dimension d, the minimum distance between X and the box restricted to its first d dimensions is computed.

The distance from a point to a region is defined by equation (5).
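The dimension-by-dimension computation of Z can be sketched in Python. Equation (5) is not reproduced in this passage, so the Euclidean distance used below is an assumption, as are the function name `point_z` and the example coordinates. The sketch also returns the partial distance obtained after each dimension, that is, the distance from X to the box restricted to its first d dimensions.

```python
import math
from typing import List, Sequence, Tuple

Box = Sequence[Tuple[float, float]]  # one (low, high) interval per dimension


def point_z(x: Sequence[float], box: Box) -> Tuple[List[float], List[float]]:
    """Closest point Z of the box to x, built one dimension at a time.

    Returns Z and the list of partial distances, where entry d-1 is the
    distance from x to the box restricted to its first d dimensions.
    """
    z: List[float] = []
    partial: List[float] = []
    squared = 0.0
    for x_d, (low, high) in zip(x, box):
        z_d = min(max(x_d, low), high)  # clamp the coordinate into [low, high]
        z.append(z_d)
        squared += (x_d - z_d) ** 2     # running squared Euclidean distance
        partial.append(math.sqrt(squared))
    return z, partial


X = (0.0, 6.0, 1.0)
box = ((3.0, 4.0), (1.0, 2.0), (0.0, 2.0))
Z, partial_distances = point_z(X, box)
print(Z, partial_distances)
```

The partial distances never decrease from one dimension to the next, which is what makes them usable as lower bounds when the decomposition is explored dimension by dimension.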

The method for computing the point Z (for example according to Algorithm 3 of appendix A1.3) takes as input:

- the box;

- the query point X.

The method for computing the point outputs the distance from the query point X to the box, in the sense of the definition of equation (5), represented by the point Z such that:

. .

The decomposition step 702 provides a decomposition into pure boxes in the form of a derived tree structure, whose nodes at a depth d contain the regions corresponding to the pure boxes, the depth d corresponding to a dimension of the input feature space. Each level of the derived tree structure corresponds to a dimension of the model.

In one embodiment, instead of building the derived tree structure corresponding to the decomposition into pure boxes exhaustively, its construction can be stopped at the nodes for which a lower bound is greater than an upper bound, according to a decomposition method with backtracking. The lower bound is a lower bound on the distance to the query point X of all the pure boxes located below the node. In one embodiment, for a given node of the derived tree structure, the lower bound is equal to the distance from the query point to the pure box corresponding to this node. The upper bound is an upper bound on the distance from the counterfactual point to the query point.

Thus, to answer a query X, instead of exhaustively determining the decomposition of the decision space of the model into pure regions, only a part of the derived tree structure, around the query point, can be built by applying the decomposition method with backtracking. Such an approach is particularly useful for large models F, for which the number of pure regions explodes with the number of leaves. The property used is that each node of the decomposition tree (derived tree structure) corresponds to a pure decomposition of the leaves indexed by the node, restricted to the current dimension (the depth of the node in the tree), and that this pure decomposition is compatible with the pure decomposition over all the leaves restricted to this dimension.

According to such a decomposition method with backtracking, for a query point X, only the nodes of the tree that contain the value of X in the dimension corresponding to the node are built. If these nodes are not sufficient to find a pure region of class j, the construction can be extended to neighboring nodes by going back up the decomposition structure dimension by dimension.

An example of a method for computing the upper bound on the distance to the nearest counterfactual point, with backtracking, is provided in pseudo-code in appendix B.2.4 (Algorithm 4).

Such a method for computing the upper bound takes as input the tree-ensemble model F defined by:

- N leaves;

- the scores associated with each of the leaves;

- the vector aggregation function g.

The method for computing the upper bound also takes as input:

- the input query X; and

- the target class, designated by the variable "target", with target ≠ Class(X).

The method for computing the upper bound determines and outputs the upper bound on the quantity dist(X; CF(X; target)).

The figure is an example of a high-level representation of the operation of the decomposition method with backtracking, corresponding to the application of Algorithm 4 of appendix B.2.4. In the example of this figure, the target class is class C2. The method builds a tree structure of the same type as the one built by the exhaustive decomposition method (for example according to Algorithm 2 of appendix A1.2), in the order indicated by the arrowed path, until a region belonging to class C2 is found. The dotted part of the structure is not built. The decomposition method with backtracking traverses the tree structure depth-first, as indicated by the arrowed path, whereas the exhaustive decomposition method (for example according to Algorithm 2 of appendix A1.2) builds the whole structure breadth-first.

Once a pure region belonging to the target class has been found, the search is repeated by building all the pure regions located at a distance from the query less than or equal to the distance from the query to the pure region found. This last step guarantees that the closest target pure region is found.

A method of the branch-and-bound type for computing the counterfactual point CF(X; target) can then be applied (step 704 of Figure 7), for example according to Algorithm 5 of appendix B.2.5 described in pseudo-code, using the computed upper bound. Such a branch-and-bound method builds only a partial decomposition of the decision space of the model, located inside a hypersphere centered on X whose radius is the upper bound on the quantity dist(X; CF(X; target)).

In this embodiment of step 704 of Figure 7, the method for computing the counterfactual point CF(X, target) takes as input the tree-ensemble model defined by:

- N leaves;

- the upper bound on the quantity dist(X; CF(X; target)).

The method for computing the counterfactual point CF(X, target) also takes as input:

- the input query X; and

- the target class, designated by the variable "target" in Algorithm 5, with target ≠ Class(X), Class(X) designating the class of the input datum X.

The method for computing the counterfactual point CF(X, j) determines and outputs the counterfactual point CF(X, j) in the target class j.

Such a branch-and-bound method for computing the counterfactual point CF(X, target), for example as described in pseudo-code by Algorithm 5 of appendix B.2.5, is similar to the method of Algorithm 2 of appendix A1.2, except that the exploration at a node is abandoned if the distance from the pure region defined by this node to the query (lower bound) exceeds the upper bound computed by Algorithm 4.

In another embodiment, the exploration at a node may also be abandoned if none of the boxes associated with this node votes for the target region. As target regions are found inside the hypersphere, the upper bound can be updated with the distance from the query to these target regions.
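The pruning rule and the update of the upper bound can be sketched in Python as a depth-first, dimension-by-dimension exploration. This is a simplified illustration rather than Algorithm 4 or Algorithm 5, which are not reproduced here. It assumes Euclidean distance, axis-aligned leaves, and an aggregation that sums the leaf scores and takes the class of largest score; the leaf coordinates, scores and function name are invented for the example.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Box = Sequence[Tuple[float, float]]  # one (low, high) interval per dimension


def branch_and_bound_cf(
    x: Sequence[float],
    leaves: Dict[str, Box],
    scores: Dict[str, Sequence[float]],
    target: int,
    upper: float = math.inf,
) -> Tuple[Optional[List[float]], float]:
    """Return the closest point found in the target class and its distance."""
    best_z: Optional[List[float]] = None
    best_d = upper

    def explore(dim: int, active: List[str], z: List[float], sq: float) -> None:
        nonlocal best_z, best_d
        lower = math.sqrt(sq)  # distance restricted to the first `dim` dimensions
        if lower > best_d:     # lower bound exceeds upper bound: abandon the node
            return
        if dim == len(x):
            summed = [sum(c) for c in zip(*(scores[name] for name in active))]
            is_target = summed.index(max(summed)) == target
            if is_target and (best_z is None or lower < best_d):
                best_z, best_d = z, lower  # tighten the upper bound
            return
        cuts = sorted({b for name in active for b in leaves[name][dim]})
        cells = list(zip(cuts, cuts[1:]))

        def gap(cell: Tuple[float, float]) -> float:
            return abs(x[dim] - min(max(x[dim], cell[0]), cell[1]))

        # Visit the elementary intervals closest to the query first.
        for low, high in sorted(cells, key=gap):
            inside = [
                name for name in active
                if leaves[name][dim][0] <= low and high <= leaves[name][dim][1]
            ]
            if inside:
                z_d = min(max(x[dim], low), high)
                explore(dim + 1, inside, z + [z_d], sq + (x[dim] - z_d) ** 2)

    explore(0, list(leaves), [], 0.0)
    return best_z, best_d


leaves = {"F1": ((0.0, 2.0), (0.0, 2.0)), "F2": ((1.0, 3.0), (1.0, 3.0))}
scores = {"F1": [1.0, 0.0], "F2": [0.0, 2.0]}
point, distance = branch_and_bound_cf((0.5, 0.5), leaves, scores, target=1)
print(point, distance)
```

Starting from an infinite upper bound, the first target region found sets the bound, and every node whose partial distance already exceeds it is skipped. Passing a finite `upper` plays the role of the bound obtained beforehand by the backtracking step.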

The anomaly diagnosis device and method according to the embodiments can be used in many applications.

In one example application of the invention, the anomaly diagnosis device and method can be used to detect and diagnose defects in manufacturing processes and production lines. The massive deployment of sensors on modern production lines allows very fine-grained monitoring of products during manufacture. At the same time, the complexity of manufacturing processes multiplies the number of manufacturing steps, and consequently increases the risk that defects are introduced before the end of the process. Such monitoring also makes it possible to train models capable of effectively detecting defects in manufactured objects. However, prior-art models do not allow the defect to be identified and diagnosed. Yet this information is needed for maintenance operations on the faulty production device or devices, so that a rapid and targeted maintenance operation can prevent the subsequent introduction of defects into the production process.

The anomaly diagnosis device and method according to the embodiments of the invention make it possible to identify and diagnose defects "on the fly" within complex production processes, from the vector dX representing the difference between the counterfactual point and the query point X, the components of the vector dX indicating the minimum changes that would have been required for the input datum to be classified in the "normal" class.

The vector dX determined by the anomaly diagnosis device and method provides output data that make it possible to interpret the decision of the binary classification model to place the datum characterizing the defect in the "abnormal" class (which corresponds in this example to the "defect" class), and thus provides a defect diagnosis. By providing a defect diagnosis, the embodiments of the invention allow a targeted and rapid intervention on a production line in order to correct the defect and prevent its reintroduction in future production. This saves time and minimizes production costs, since the production line no longer has to be shut down entirely for a long period in order to trace the cause of the defects.

The embodiments of the invention can also be used to detect abnormal behavior (for example the diagnosis of bank or tax fraud) and for decision-making (for example the granting of credit). They can further be used in any application that requires not only classifying input data but also determining data that explain or justify the decision of the model (for example to build evidence in fraud cases, or to justify to the applicant the decision taken on a credit request). In another application, the embodiments of the invention can be used in the medical field for the diagnosis of abnormalities in a patient, such as the diagnosis of tumors. In such an application, the decision determined using an artificial intelligence model only has value if it is accompanied by data explaining the decision, so that it can be validated or invalidated by a doctor. It should be emphasized that artificial intelligence models used in the medical field must provide explainable decisions in order to be approved.

For example, in an application of the invention to credit approval in the banking field, a model F of the tree ensemble type can be trained beforehand (step 700) on the granting of consumer credit. The anomaly diagnosis method can then be applied to diagnose (determine the reason for) each of the negative decisions (credit not granted) returned by the decision model F. The components of the vector dX represent the input characteristics that would have to change, at a minimum, for the customer to be eligible (i.e. for the data to be classified in the opposite, "normal" class). Such components can be translated into recommendations in high-level language.

It is assumed, for example, that the decision model F is an XGBoost binary classification model comprising 250 decision trees of depth 8, and that the input data set comprises 20 input characteristics of numerical and categorical types. An example of a bank credit request is considered. Examples of input characteristics associated with the request include, without limitation, the number of outstanding loans, the repayment history of previous loans, the amount borrowed, the purpose of the loan (type of purchase), the borrower's salary and savings, the type of employment contract, the time spent by the borrower in the current position, etc. The model F is applied to these input characteristics and returns a classification into one of two classes:

C1: decision to grant the credit ("normal" class), or

C2: decision to refuse the credit requested by the borrower ("abnormal" class).

Examples of diagnoses of credit refusal formulated automatically from the calculation of the counterfactual point in the "normal" class (which in this case is class C1, corresponding to a positive decision on the granting of credit) include, for example, cases 1, 2 and 3 below.

Case 1

A request "credit request for $701 made for client 2" is considered, for which the classification given by model F is "credit refused" (C2). The diagnosis corresponding to the components of the vector dX is as follows:
1: Input characteristic 1 corresponding to "Status of existing checking accounts":
Recommended: 0 ≤ ... < 200 – Client has: < 0.
2: Input characteristic 2 corresponding to "Credit history":
Recommended: all credits at this bank duly repaid – Client status: "existing credits duly repaid until now".
3: Input characteristic 3 corresponding to "Savings":
Recommended: 500 ≤ ... < 1000 – Client has: < 100.
4: Input characteristic 4 corresponding to "Debt ratio as a percentage of income":
Recommended at most: 2.5 – Client has: 3.
5: Input characteristic 5 corresponding to "Other debtors / guarantees":
Recommended: co-applicant – Client status: none.

Case 2

A request "credit request for $330 made for client 12" is considered, for which the classification given by model F is "credit refused" (C2). The diagnosis corresponding to the components of the vector dX is as follows:
1: Input characteristic 1 corresponding to "Status of existing checking accounts":
Recommended: 0 ≤ ... < 200 – Client has: < 0.
2: Input characteristic 2 corresponding to "Credit period in months":
Recommended at most: 15 – Client requested: 16.
4: Input characteristic 4 corresponding to "Current job held since":
Recommended: 4 ≤ ... < 7 years – Client status: 1 ≤ ... < 4 years.
5: Input characteristic 5 corresponding to "Other debtors / guarantees":
Recommended: co-applicant – Client status: none.

Case 3

A request "credit request for $221 made for client 18" is considered, for which the classification given by model F is "credit refused" (C2). The diagnosis corresponding to the components of the vector dX is as follows:
1: Input characteristic 1 corresponding to "Status of existing accounts":
Recommended: 0 ≤ ... < 200 – Client has: < 0.
3: Input characteristic 3 corresponding to "Purpose":
Recommended: car (used) – Client wants to buy: car (new).
4: Input characteristic 4 corresponding to "Current job held since":
Recommended: 4 ≤ ... < 7 years – Client status: 1 ≤ ... < 4 years.
6: Input characteristic 6 corresponding to "Telephone":
Recommended: yes, registered in the client's name – Client status: none.
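The way diagnoses such as those of cases 1 to 3 could be produced from a query and its counterfactual point can be sketched as below. This is a minimal illustration, not the implementation of the invention: the function `diagnose`, the dictionary representation of the points and the unchanged "Purpose" entry are assumptions introduced for the example, while the other values are those of case 2.

```python
from typing import Dict, List


def diagnose(query: Dict[str, str], counterfactual: Dict[str, str]) -> List[str]:
    """List the characteristics whose value differs between X and Y.

    Both points are given as {characteristic name: binned or categorical value}.
    Characteristics that are identical in X and Y are not reported.
    """
    lines = []
    for feature, current in query.items():
        recommended = counterfactual.get(feature, current)
        if recommended != current:
            lines.append(
                f"{feature}: recommended {recommended}; client has {current}"
            )
    return lines


# Values of case 2, plus one unchanged characteristic added for illustration.
query = {
    "Status of existing checking accounts": "< 0",
    "Credit period in months": "16",
    "Current job held since": "1 <= ... < 4 years",
    "Other debtors / guarantees": "none",
    "Purpose": "car (new)",
}
counterfactual = {
    "Status of existing checking accounts": "0 <= ... < 200",
    "Credit period in months": "15",
    "Current job held since": "4 <= ... < 7 years",
    "Other debtors / guarantees": "co-applicant",
    "Purpose": "car (new)",
}

lines = diagnose(query, counterfactual)
for line in lines:
    print(line)
```

Only the four characteristics that differ are reported, which mirrors the fact that the diagnoses above list only the non-zero components of dX.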

The anomaly diagnosis device and method according to the embodiments of the invention can also be used in applications for debugging the model 10 (model anomaly diagnosis). Given an input datum X with label i that is wrongly classified in class j by the model (j ≠ i), model debugging around the operating point X consists in analyzing the "knowledge" that the model has acquired about the elements of class i close to X, and what additional knowledge would be necessary for X to be correctly classified in class i by the model. This additional knowledge is represented by the change vector dX calculated by the comparator 18.

The knowledge that the model has of class i can be determined by using representative elements (forming prototypes) of the decision regions associated with class i that are closest to X.

Model debugging according to such an embodiment can be used to diagnose weaknesses of the learning models 10 around operating points that are ambiguous for the model. The resulting diagnosis can in particular make it possible to correct the weaknesses of the models by acting on the training data set, for example by:
- removing adversarial examples that disrupt learning;
- removing ambiguities between classes by adding examples in the areas where the model confuses classes.

The reliability of the diagnosis obtained from the output (change vector dX) of the anomaly diagnosis device and method can be characterized by the statistical robustness of the diagnosis, that is to say its repeatability either on similar models or on similar input data.

To evaluate the repeatability of the diagnosis on similar models, instead of training several different models, it is possible to simulate nearby models by randomly ignoring pure decision regions of the model. In practice, by using the method for determining the counterfactual point (for example according to algorithm 5 of annex A.1.5), it is possible to answer the query: which are the M closest counterfactual examples belonging to M distinct pure regions? By analyzing the statistical invariants within this set of M counterfactual examples, it is then possible to extract a measure of the robustness of the diagnosis.
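The query for the M closest counterfactual examples in M distinct pure regions can be illustrated with the following sketch. It is not algorithm 5 of the annex: it assumes that the pure boxes are already available as closed intervals per dimension with a class label, it scans them exhaustively, and it uses the Euclidean distance. The names `closest_point`, `m_nearest_counterfactuals` and `change_frequency`, as well as the toy boxes, are hypothetical. The "invariant" computed here is simply the fraction of the M change vectors in which each characteristic changes.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]
Box = Tuple[List[Interval], str]  # (one closed interval per dimension, class label)


def closest_point(x: List[float], intervals: List[Interval]) -> List[float]:
    """Closest point of an axis-aligned box to x: clamp each coordinate."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, intervals)]


def m_nearest_counterfactuals(x: List[float], boxes: List[Box],
                              target: str, m: int) -> List[List[float]]:
    """Return one counterfactual per pure box of the target class, nearest first."""
    candidates = []
    for intervals, label in boxes:
        if label != target:
            continue
        z = closest_point(x, intervals)
        candidates.append((math.dist(x, z), z))
    candidates.sort(key=lambda c: c[0])
    return [z for _, z in candidates[:m]]


def change_frequency(x: List[float], counterfactuals: List[List[float]],
                     tol: float = 1e-9) -> List[float]:
    """Fraction of counterfactuals in which each characteristic differs from x."""
    if not counterfactuals:
        raise ValueError("at least one counterfactual is required")
    n = len(counterfactuals)
    return [sum(abs(z[k] - x[k]) > tol for z in counterfactuals) / n
            for k in range(len(x))]


# Toy two-dimensional decomposition (made-up boxes).
boxes: List[Box] = [
    ([(1.0, 2.0), (-1.0, 1.0)], "normal"),
    ([(2.0, 3.0), (0.5, 1.0)], "normal"),
    ([(3.0, 4.0), (-2.0, 2.0)], "normal"),
    ([(-1.0, 0.5), (-1.0, 1.0)], "abnormal"),
]
x = [0.0, 0.0]
nearest = m_nearest_counterfactuals(x, boxes, target="normal", m=3)
freq = change_frequency(x, nearest)
```

In this toy case the first characteristic changes in all three counterfactuals while the second changes in only one, which would suggest that the first characteristic is the stable part of the diagnosis.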

To evaluate the repeatability of the diagnosis on similar input data, the neighbors of the query point X inside a ball of given radius are considered, together with the difference to their respective counterfactual points; this makes it possible to characterize the stability/repeatability of the invariant. Techniques for characterizing the robustness of a diagnosis can then be applied, for example as described in D. Alvarez-Melis and T. S. Jaakkola, "On the robustness of interpretability methods," arXiv preprint arXiv:1806.08049, 2018.
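One possible way to quantify this stability, in the spirit of a local Lipschitz estimate, is sketched below: for each neighbor X' of X inside the ball, the ratio ‖dX − dX'‖ / ‖X − X'‖ is computed and the largest ratio is kept. This is an assumption about how the measure could be set up, not a formula from the description. The function `local_stability`, the explicit list of neighbors and the toy `explain` function (projection onto a single made-up target box) are hypothetical.

```python
import math
from typing import Callable, List

Vector = List[float]


def local_stability(x: Vector, neighbours: List[Vector], radius: float,
                    explain: Callable[[Vector], Vector]) -> float:
    """Largest ratio ||dX - dX'|| / ||X - X'|| over neighbours within the ball.

    `explain` maps a point to its change vector. A small value means that
    nearby inputs receive similar diagnoses.
    """
    dx = explain(x)
    worst = 0.0
    for n in neighbours:
        gap = math.dist(x, n)
        if gap == 0.0 or gap > radius:
            continue  # skip X itself and points outside the ball
        worst = max(worst, math.dist(dx, explain(n)) / gap)
    return worst


# Toy explainer: a single target box, change vector = clamp(point) - point.
TARGET_BOX = [(1.0, 2.0), (0.0, 1.0)]


def explain(point: Vector) -> Vector:
    z = [min(max(p, lo), hi) for p, (lo, hi) in zip(point, TARGET_BOX)]
    return [zi - pi for zi, pi in zip(z, point)]


x = [0.0, 0.5]
neighbours = [[0.1, 0.5], [0.0, 0.6], [5.0, 5.0]]  # the last one is outside the ball
score = local_stability(x, neighbours, radius=0.5, explain=explain)
```

In practice the neighbors would be drawn from real or sampled data around X, and `explain` would call the counterfactual search on the actual model decomposition.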

The anomaly diagnosis device and method 100 according to the embodiments of the invention can be used in different applications, for example for fault diagnosis. They make it possible to analyze relatively small changes in the input characteristics (given by the vector dX), so that the origin of the anomaly (a fault, for example) can be determined very precisely, while providing quantitative information on the "magnitude" of the observed changes.

The anomaly diagnosis device and method 100 according to the embodiments of the invention advantageously exploit the property of models of the "decision tree ensemble" type that their decision space can be constructed explicitly, exhaustively and exactly.

The embodiments of the invention have several advantages, including:
- they allow an interpretation of the decision of the model at a given point;
- they are specific to the decision model;
- they are free of imprecision and instability;
- they are non-parametric;
- they are adaptable to arbitrary model sizes;
- they apply to a wide variety of models (XGBoost, LightGBM, ...) that are commonly used for anomaly diagnosis, whatever the nature of the input characteristics (numerical and/or categorical industrial data);
- the determination of the pure decision regions of the model 10 can be used in many other applications, for example for prototyping the knowledge of the model or for model debugging.

The diagnostic device 100 can be implemented on a distributed computing unit comprising a plurality of physical cores (for example 16 cores). The embodiments of the invention are particularly suited to such a distributed implementation and to porting to multi-core hardware technologies, which makes it possible to obtain a rapid response to a user request regardless of the size of the classification model considered. The diagnostic device 100 can be implemented in different languages, such as C++.

More generally, those skilled in the art will understand that the diagnostic device 100 or subsystems of the device according to the embodiments of the invention can be implemented in various ways in hardware, in software, or in a combination of hardware and software, in particular in the form of program code that can be distributed as a program product in various forms. In particular, the program code may be distributed using computer-readable media, which may include computer-readable storage media and communication media. The methods described in this description may in particular be implemented in the form of computer program instructions executable by one or more processors in a computing device. These computer program instructions may also be stored in a computer-readable medium.

Furthermore, the invention is not limited to the embodiments described above by way of non-limiting example. It encompasses all the variant embodiments that may be envisaged by those skilled in the art. In particular, the invention is not limited to the application examples described above, such as fault diagnosis. The embodiments of the invention can be applied, for example, to diagnosis in medical artificial intelligence, to the diagnosis of fraud, and more generally to the analysis of multi-class decision-making.

ANNEX A

A.1.1. Algorithm 1

A.1.2. Algorithm 2

A.1.3. Algorithm 3

A.1.4. Algorithm 4

A.1.5. Algorithm 5

Claims

1. Anomaly diagnostic device (100) comprising:
- a classification unit (11) configured to determine, in response to a request received comprising an input datum and a target class, the class to which the input datum belongs among a set of classes, by applying the input datum to a multiclass classification model, the input datum corresponding to a query point,
said classification model (10) using a set of decision trees, the input data comprising a set of input features, the space of input features having a given number of dimensions,
each decision tree comprising a set of nodes defining a tree structure from a root node to a set of leaves, each leaf of the decision tree comprising a set of scores associated with each class, each decision tree being applied to the input datum for determining an elementary classification decision from the scores associated with the leaves of the decision tree, said classification unit (11) applying an aggregation function to the elementary classification decisions determined by each of the trees of the model to determine the class of the input data;
- a decomposition unit (14) configured to decompose at least part of the set of leaves associated with the decision trees of the model into multidimensional boxes, called pure boxes, from the input request, said pure boxes corresponding to a collection of intervals, each pure box corresponding to a region of the input feature space, associated with a vector score determined from the scores associated with the leaves of said decision trees;
- a counterfactual point calculation unit (17) configured to calculate a counterfactual point (Y) representing the virtual point closest to the query point belonging to the target class, from the decomposition into pure boxes;
- a change vector calculation unit (18) configured to compare the counterfactual point with the input query and determine the difference between the counterfactual point (Y) and the input query (X), thereby providing a change vector (dX) representing the minimum changes to be applied to the input data for the input data to be classified in the target class by the classification model, the change vector (dX) having the same dimension as the input characteristics.

2. Device according to claim 1, characterized in that the decomposition unit (14) determines the decomposition into pure boxes in the form of a derived tree structure, whose nodes at a depth d contain the regions corresponding to the pure boxes, the depth d corresponding to a dimension of the input feature space.

3. Device according to one of the preceding claims, characterized in that the decision trees of the model comprise intersecting leaves, and in that the decomposition unit (14) is further configured to determine the decomposition of the model into pure boxes in the form of regions of maximum intersection, called maximum intersection boxes, a maximum intersection box representing a region of the model having a uniform score over the whole of the region, resulting from the aggregation of the scores of the leaves which intersect to form said region, by applying said aggregation function.

4. Device according to claim 3, characterized in that the decomposition unit (14) is configured to determine the maximum intersection boxes of the model by induction on the dimensions of the space of the input characteristics, by applying a plurality of iterations, each iteration corresponding to one of said dimensions, the decomposition unit (14) supplying the decomposition into pure boxes of the model in the form of boxes of maximum intersection dimension by dimension.

5. Device according to one of claims 3 and 4, characterized in that the decomposition unit (14) is configured to determine, from a decomposition into maximum intersection boxes of the restriction of the model (10) to the first d dimensions of the model, a decomposition into maximum intersection boxes of the restriction of the model to the first d+1 dimensions of the model, d denoting a given dimension of the model, a restriction of the model to the first d dimensions denoting the restriction of the set of boxes corresponding to the leaves of the decision trees of the model to the first d dimensions of the model.

6. Device according to one of claims 4 to 5, characterized in that the maximum intersection boxes are maximum intersection boxes associated with the restriction of the model to the first d+1 dimensions.

7. Device according to one of the preceding claims, characterized in that the model is broken down into a set of pure boxes comprising at most elements, N denoting the number of leaves of the decision trees of the model (10) and representing the dimension of the input feature space applied to the model (10).

8. Device according to one of the preceding claims, characterized in that the unit for calculating the counterfactual point is configured to calculate the counterfactual point, by determining, for each pure box, a point on the surface of the pure box located at a minimum distance between the query point representing the input data and the pure box.

9. Device according to claim 8, characterized in that the point Z of the surface of the pure box Bi located at a minimum distance between the query point X representing the input data and the pure box Bi is given by:

10. Device according to one of the preceding claims, characterized in that the device is configured to calculate a lower bound and an upper bound for each node of said derived tree structure corresponding to the decomposition into pure boxes, each level of the derived tree structure corresponding to a dimension of the model, the lower bound representing a lower bound on the distance to the query point X of all the pure boxes located below said node, the upper bound representing the upper bound on the distance from the counterfactual point to the query point, the device being configured to stop the construction of said derived tree structure at the nodes for which the lower bound is greater than the upper bound.

11. Device according to claim 10, characterized in that for a given node of the derived tree structure, the lower bound is equal to the distance from the query point to the pure box corresponding to this node.

12. Device according to claim 11, characterized in that the counterfactual point for a query point X and a target class "j" is determined from the pure boxes located at depth D of the tree structure calculated for the model according to the equation:

13. Anomaly diagnosis method (100) comprising the steps of:
- in response to a request received comprising an input datum and a target class, determining the class to which the input datum belongs among a set of classes, by applying the input datum to a multiclass classification model, the input data corresponding to a query point,
said classification model (10) using a set of decision trees, the input data comprising a set of input features, the input feature space having a given number of dimensions, each decision tree comprising a set of nodes defining a tree structure from a root node to a set of leaves, each leaf of the decision tree comprising a set of scores associated with each class, each decision tree being applied to the input data for determining an elementary classification decision from the scores associated with the leaves of the decision tree, an aggregation function being applied to the elementary classification decisions provided by each of the trees of the model to determine the class of the input data;
- decomposing at least part of the set of leaves associated with the decision trees of the model into multidimensional boxes, called pure boxes, from the input query, said pure boxes corresponding to a collection of intervals, each pure box corresponding to a region of the input feature space, associated with a vector score determined from the scores associated with the leaves of said decision trees;
- calculating a counterfactual point (Y) representing the virtual point closest to the query point belonging to the target class, from the decomposition into pure boxes;
- determining the difference between the counterfactual point (Y) and the input query (X), which provides a change vector (dX) representing the minimum changes to be applied to the input data so that the input data is classified in the target class by the classification model, the change vector (dX) having the same dimension as the input characteristics.