EP3596685A1 - Machine-learning detection of anomalies in a set of banking transactions by optimization of average precision - Google Patents
- Publication number
- EP3596685A1 (application EP18712980.4A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- transactions
- transaction
- model
- meta
- risk
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4016—Transaction verification involving fraud or risk level assessment in transaction processing
Definitions
- Fraud in payment transactions, mainly those involving bank transactions, is a significant and growing phenomenon, particularly as a result of the generalization of online transactions over telecommunication networks.
- Other types of anomalies can also occur (errors, for instance).
- The first case has the advantage of being able to block a fraudulent transaction before it takes place, but it is subject to a strong constraint on processing time, since the mechanism delays the finalization of the payment transaction and therefore negatively impacts the user's experience.
- The second case allows more time, and thus makes it possible to put in place more complete and finer processing.
- The object of the present invention is to provide a solution at least partially overcoming the aforementioned drawbacks. More particularly, the invention aims to provide tools for determining a set of transactions presenting a certain risk of being anomalous (fraud or other phenomena), which can then be presented to a human operator.
- The present invention proposes a method for the detection of anomalies in a set of payment transactions, consisting in:
- establishing a meta-model consisting of a set of models, each optimized on a training set to determine, for each transaction, a risk of being an anomaly, said meta-model being established by the "gradient boosting" technique so as to optimize a differentiable function expressing the average precision of said meta-model;
- The invention comprises one or more of the following features, which can be used separately, in partial combination with one another, or in total combination with one another:
- said subset is presented to one or more human experts, and said threshold is determined according to the number of transactions that can be processed by said one or more human experts;
- prior to the establishment of the meta-model, a subsampling step (E2) is applied to said set of transactions in order to improve the balance between anomalous transactions and legitimate transactions;
- said subsampling step consists in optimizing an F2 measure;
- the optimization of said F2 measure consists in minimizing a differentiable function expressing said F2 measure;
- said function is expressed by an equation in which:
- $y_i$ is equal to 1 if transaction $x_i$ is anomalous, 0 otherwise;
- $I(\cdot)$ is the indicator function, equal to 1 if the condition is true, 0 otherwise;
- $N$ is the number of transactions in the training set;
- $n$ is the rank of transaction $x_i$ in the ranking of all transactions predicted by model $F$.
- Another object of the invention is a computer program comprising instructions which, when executed by a processor of a computer system, result in the implementation of a method as previously described.
- Another object of the invention is a device for the detection of anomalies comprising means enabling the implementation of the previously described method.
- Figure 1 shows schematically an example of the flow of the method according to one embodiment of the invention.
- The cardinality of this subset may be predetermined, since it may correspond to the number of transactions that human operators can process over a given period (for example, a day).
- The problem solved by the invention therefore consists in quickly finding the k transactions presenting the highest risk of being anomalies, where k is the number of transactions that the human operators can process.
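As a minimal sketch of this top-k selection, assuming that each transaction has already been assigned a risk by some model, the k riskiest transactions can be extracted without sorting the whole set. The function name `top_k_by_risk` and the sample identifiers and risks are hypothetical.

```python
import heapq

def top_k_by_risk(scored, k):
    """Return the k (transaction_id, risk) pairs with the highest risk, riskiest first."""
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Hypothetical scored transactions: (identifier, risk assigned by a model).
scored = [("t1", 0.02), ("t2", 0.91), ("t3", 0.40), ("t4", 0.75), ("t5", 0.10)]

# k stands for the number of transactions the human operators can review.
shortlist = top_k_by_risk(scored, k=2)
print(shortlist)  # [('t2', 0.91), ('t4', 0.75)]
```

`heapq.nlargest` runs in roughly O(N log k) time, which suits the case where k is small compared with the number of transactions.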
- A preprocessing step can be implemented. This step is referenced E1 in the figure.
- This preprocessing consists in preparing the data corresponding to the transactions so that the subsequent steps can process them properly.
- This data includes both data already contained in the transactions and data external to them.
- A first operation consists in formatting the data present in the submitted transactions so that they can be processed by the "machine learning" algorithm to which they are then subjected.
- The date of the transaction can be transformed into several data items, or "features": day, month, year, hour, minute, etc.
- a second operation is to associate new features to the transactions. These new features can be created from the history of the parties to the transaction, including the holder of a payment card used for the transaction: average amounts spent, previously visited stores, etc.
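The two preprocessing operations can be sketched as follows. This is only an illustration: the transaction layout, the field names (`timestamp`, `amount`) and the derived feature `amount_over_avg` are assumptions, not taken from the source.

```python
from datetime import datetime
from statistics import mean

def date_features(timestamp):
    """First operation: split an ISO-8601 timestamp into separate features."""
    dt = datetime.fromisoformat(timestamp)
    return {"day": dt.day, "month": dt.month, "year": dt.year,
            "hour": dt.hour, "minute": dt.minute}

def history_features(amount, past_amounts):
    """Second operation: derive features from the cardholder's history."""
    avg = mean(past_amounts) if past_amounts else 0.0
    return {"avg_past_amount": avg,
            # Ratio of the current amount to the historical average (0 if no history).
            "amount_over_avg": amount / avg if avg else 0.0}

# Hypothetical transaction and cardholder history.
tx = {"timestamp": "2017-03-16T14:05:00", "amount": 120.0}
features = {**date_features(tx["timestamp"]),
            **history_features(tx["amount"], [40.0, 60.0, 80.0])}
print(features)
```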
- This step E2 can be omitted from the overall method according to the invention, but it improves both performance and processing time.
- The number of anomalous transactions is certainly too high, but it nevertheless represents a very low proportion of the total transaction volume (for example, around 0.2%). The transaction population is therefore highly imbalanced, and this imbalance creates significant problems for most learning mechanisms.
- One of the objectives of the invention is to take into account this specificity and to propose a solution to remedy it.
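A short arithmetic sketch shows why such an imbalance is troublesome. The population size of 100,000 transactions is an assumption, as is the trivial model that labels every transaction as legitimate. Only the 0.2% rate comes from the text.

```python
n_total = 100_000                      # assumed population size
n_anomalies = round(n_total * 0.002)   # 0.2% of transactions are anomalous

# A trivial model that labels every transaction as legitimate.
tp = 0                         # anomalies correctly flagged
fn = n_anomalies               # anomalies missed
tn = n_total - n_anomalies     # legitimate transactions correctly passed
fp = 0                         # legitimate transactions wrongly flagged

accuracy = (tp + tn) / n_total
recall = tp / (tp + fn)
print(accuracy, recall)
```

The trivial model is right on 99.8% of transactions while detecting no anomaly at all, which is why recall and precision are more informative than the overall rate of correct decisions on this kind of data.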
- Step E2 consists in discarding a certain number of transactions that can be judged not to be anomalous (that is to say, which are "legitimate"), in order, on the one hand, to reduce the number of transactions in the training set and, on the other hand, to improve the balance between anomalous transactions and legitimate transactions.
- Step E2 is a binary classification step that assigns each transaction of the submitted training set to a class "anomalous transaction" or to a class "legitimate transaction". It can aim at optimizing an F2 measure combining the recall rate and the precision measured for this subsampling step.
- The recall rate for a given class is defined as the ratio between the number of transactions correctly assigned to that class and the number of transactions actually in that class. Precision is defined as the ratio between the number of transactions correctly assigned to that class and the total number of transactions assigned to it.
- The "true positive" (TP), "false positive" (FP) and "false negative" (FN) quantities can be expressed in terms of the scores provided by a model F established for this binary classification step with two classes "+1" and "0".
- An F2 measure is preferred for the emphasis it places on recall rather than precision.
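The F2 measure is the standard F-beta measure with beta = 2. A minimal sketch, computed from TP, FP and FN counts, is given below. The counts used in the example are invented for illustration.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta measure from true positive, false positive and false negative counts.

    Equivalent to (1 + beta^2) * P * R / (beta^2 * P + R),
    with precision P = tp / (tp + fp) and recall R = tp / (tp + fn).
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

# Two hypothetical classifiers with swapped precision and recall.
high_recall = f_beta(tp=80, fp=120, fn=20)     # precision 0.4, recall 0.8
high_precision = f_beta(tp=80, fp=20, fn=120)  # precision 0.8, recall 0.4
print(high_recall, high_precision)
```

With beta = 2 the high-recall classifier scores about 0.67 against about 0.44 for the high-precision one, whereas the F1 measure (beta = 1) would rate them equally. This matches the goal of the subsampling step, which must avoid discarding anomalous transactions.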
- The subsampling step makes it possible to discard a large number of "legitimate" transactions while keeping as many anomalous transactions as possible for the next step E3.
- The optimization of the F2 measure consists in minimizing a differentiable function expressing said F2 measure.
- This approximation can be used as an objective function in a classical optimization process.
- This optimization process can be a gradient descent and can, for example, use the "gradient boosting" technique, in the same way as step E3.
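The text does not reproduce the differentiable function used for the F2 measure. One common way to build such a surrogate, shown here purely as an assumption, is to replace the hard TP, FP and FN counts with sums of sigmoid outputs of the model scores and to minimize one minus the resulting "soft" F2. The function names are hypothetical, and the score list is assumed to be non-empty.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_f2_loss(scores, labels, beta=2.0):
    """1 - soft F-beta: hard counts are replaced by sums of sigmoid outputs."""
    probs = [sigmoid(s) for s in scores]
    tp = sum(p for p, y in zip(probs, labels) if y == 1)        # soft true positives
    fp = sum(p for p, y in zip(probs, labels) if y == 0)        # soft false positives
    fn = sum(1.0 - p for p, y in zip(probs, labels) if y == 1)  # soft false negatives
    b2 = beta * beta
    return 1.0 - (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

labels = [1, 1, 0, 0]
good_scores = [4.0, 3.0, -3.0, -4.0]   # anomalies scored high
bad_scores = [-4.0, -3.0, 3.0, 4.0]    # anomalies scored low
print(soft_f2_loss(good_scores, labels), soft_f2_loss(bad_scores, labels))
```

Because every term is a smooth function of the scores, this loss has a gradient everywhere and can be handed to a gradient-based optimizer, unlike the count-based F2 measure.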
- This step E3 consists in establishing a meta-model formed of a set of models, each optimized on a training set, by the "gradient boosting" technique, so as to optimize a differentiable function expressing the average precision of said meta-model.
- The method used in the context of the invention is an ensemble learning method, that is to say one based on a global model, or meta-model, formed of a set of "individual" models. Each individual, or "base", model is built and optimized from a training set.
- Each model performs a prediction.
- The final prediction, performed by the meta-model, is a combination of the individual predictions. Different combinations are possible: majority vote, weighted majority vote, threshold vote, unanimity, etc.
- The combination can be made with a weighted majority vote.
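A weighted majority vote can be sketched as follows, assuming each individual model outputs a 0/1 prediction and carries a non-negative weight. The function name and the example weights are hypothetical.

```python
def weighted_majority_vote(predictions, weights):
    """Combine 0/1 predictions: return 1 if the models voting 1 hold
    more than half of the total weight, 0 otherwise."""
    weight_for_anomaly = sum(w for p, w in zip(predictions, weights) if p == 1)
    return 1 if weight_for_anomaly > sum(weights) / 2 else 0

# One heavily weighted model flags the transaction, two light ones do not.
print(weighted_majority_vote([1, 0, 0], [0.6, 0.2, 0.1]))  # 1
# With equal weights, the same votes give a plain majority vote.
print(weighted_majority_vote([1, 0, 0], [1.0, 1.0, 1.0]))  # 0
```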
- Each model learns autonomously and iteratively, and is evaluated against a target result which, in the context of the invention, is the optimization of a function expressing the average precision of the models.
- The ensemble technique used is a "boosting" technique, and more particularly "gradient boosting", since it involves optimizing a function.
- The basic idea is to identify the transactions that the models have learned poorly and to focus on them, giving their learning priority over other transactions in the following iterations of the learning process.
- In the AdaBoost algorithm, the principle consists of assigning weights to the examples of the training set and, at each iteration, updating these weights by increasing those of the misclassified examples and decreasing those of the correctly classified examples.
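The reweighting principle can be illustrated with one round of the classical AdaBoost update. The function name and the four-example training set are hypothetical, and the weights are assumed to sum to 1 on input.

```python
import math

def adaboost_reweight(weights, correct):
    """One AdaBoost round.

    weights: current example weights, summing to 1.
    correct: correct[i] is True if example i was classified correctly.
    Returns the renormalized weights and the weight alpha of the weak learner.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < error < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified examples are scaled up, well classified ones scaled down.
    updated = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha

weights = [0.25, 0.25, 0.25, 0.25]
new_weights, alpha = adaboost_reweight(weights, [True, True, True, False])
print(new_weights, alpha)
```

After the update, the single misclassified example carries half of the total weight, so the next weak learner is pushed to classify it correctly.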
- The use of the "boosting" technique to perform a gradient descent optimization is well known and is described, for example, in the article by J. H. Friedman, "Greedy function approximation: a gradient boosting machine", Annals of Statistics, 2001, pages 1189-1232.
- The invention does not lie in a new boosting or gradient boosting algorithm, but in how such algorithms are used. From a practical point of view, an embodiment of the invention may be a method, implemented in software, that uses such an algorithm as an autonomous functional module, which may for example be provided by a library.
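To make the idea of a reusable gradient boosting module with a pluggable objective concrete, here is a toy sketch. It is a simplification and not Friedman's full algorithm: it uses one-split regression stumps on a single feature and a fixed learning rate instead of a line search. All names, the logistic loss chosen for the demonstration, and the data are assumptions.

```python
import math

def fit_stump(xs, targets):
    """Fit a one-split regression stump on 1-D inputs by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right) if right else lv
        sse = sum((t - (lv if x <= threshold else rv)) ** 2
                  for x, t in zip(xs, targets))
        if best is None or sse < best[0]:
            best = (sse, threshold, lv, rv)
    _, threshold, lv, rv = best
    return lambda x: lv if x <= threshold else rv

def gradient_boost(xs, ys, negative_gradient, n_rounds=50, learning_rate=0.1):
    """Each new stump fits the negative gradient of the loss at the current scores."""
    stumps = []

    def predict(x):
        return sum(learning_rate * stump(x) for stump in stumps)

    for _ in range(n_rounds):
        scores = [predict(x) for x in xs]
        stumps.append(fit_stump(xs, negative_gradient(ys, scores)))
    return predict

def logistic_negative_gradient(ys, scores):
    """Negative gradient of the logistic loss: label minus predicted probability."""
    return [y - 1.0 / (1.0 + math.exp(-s)) for y, s in zip(ys, scores)]

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]   # a single hypothetical feature
ys = [0, 0, 0, 1, 1, 1]                  # 1 = anomalous, 0 = legitimate
model = gradient_boost(xs, ys, logistic_negative_gradient)
print(model(2.0), model(11.0))
```

The boosting loop only ever calls `negative_gradient`, so swapping in the gradient of another differentiable objective changes what the ensemble optimizes without touching the loop itself.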
- The problem to be solved with the gradient boosting algorithm is to improve the set of the k "best" anomalous transactions, where k is the number of transactions that expert users can process. An objective function based on ranks (or rankings) is therefore particularly suitable.
- Each transaction x belongs either to a class "+1", corresponding to anomalous transactions, or to a class "0", corresponding to "legitimate" transactions.
- F is a model that outputs a risk, that is, a probability that a transaction belongs to the class "+1".
- I () represents the indicator function.
- N is the number of transactions in the training set S. This training set can be written $S = \{(x_i, y_i)\}_{i=1}^{N}$, in which transaction $x_i$ is associated with a class $y_i$.
- The expression $\sum_{j=1}^{N} I\big(F(x_j) \ge F(x_i)\big)$ therefore defines the number of transactions that have a risk greater than or equal to that of transaction $x_i$.
- The average precision AP can then be obtained from this quantity.
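The exact expression of AP is not reproduced in this text. As an assumption consistent with the definitions given (indicator function $I$, model $F$, classes $y_i$, training set of size $N$), the usual rank-based form of the average precision is shown below, where $r_i$ denotes the number of transactions whose risk is at least that of $x_i$. The notation of the original document may differ.

```latex
r_i = \sum_{j=1}^{N} I\big(F(x_j) \ge F(x_i)\big),
\qquad
AP = \frac{1}{\sum_{i=1}^{N} y_i}
     \sum_{i=1}^{N} \frac{y_i}{r_i}
     \sum_{j=1}^{N} y_j \, I\big(F(x_j) \ge F(x_i)\big)
```

In words: for each anomalous transaction, take the fraction of anomalous transactions among those ranked at or above it, then average these fractions over all anomalous transactions.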
- An idea of the invention therefore consists in approximating this expression of the average precision by a differentiable function. It is this differentiable function that is optimized by the gradient boosting algorithm.
- The invention can be implemented using a "gradient boosting" algorithm known per se, but modified by the introduction of a specific function to be minimized, which is a differentiable function expressing the average precision of the model.
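The text does not say which differentiable approximation is used. A common choice, adopted here only as an assumption, is to replace each indicator I(F(x_j) >= F(x_i)) with a sigmoid of the score difference. The sketch below compares the exact rank-based average precision with such a surrogate. The function names, the `sharpness` parameter and the sample data are hypothetical, and at least one anomalous transaction is assumed.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def average_precision(scores, labels):
    """Exact AP with indicator-based ranks (tied scores share a rank)."""
    total = 0.0
    for s_i, y_i in zip(scores, labels):
        if y_i == 1:
            rank = sum(1 for s_j in scores if s_j >= s_i)
            hits = sum(y_j for s_j, y_j in zip(scores, labels) if s_j >= s_i)
            total += hits / rank
    return total / sum(labels)

def smooth_average_precision(scores, labels, sharpness=50.0):
    """Differentiable surrogate: indicators become sigmoids of score differences."""
    total = 0.0
    for i, (s_i, y_i) in enumerate(zip(scores, labels)):
        if y_i != 1:
            continue
        rank, hits = 1.0, 1.0  # the j == i term of each sum is exactly 1
        for j, (s_j, y_j) in enumerate(zip(scores, labels)):
            if j != i:
                soft = sigmoid(sharpness * (s_j - s_i))
                rank += soft
                hits += y_j * soft
        total += hits / rank
    return total / sum(labels)

scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
exact = average_precision(scores, labels)
smooth = smooth_average_precision(scores, labels)
print(exact, smooth)
```

A larger `sharpness` brings the surrogate closer to the exact value but makes its gradients steeper. Since a boosting algorithm minimizes its objective, the quantity handed to it would be the negative of this surrogate, or one minus it.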
- The meta-model is trained so as to maximize the average precision.
- It can then be used in prediction to assign a risk to transactions.
- This predetermined threshold may have been learned during the learning phase. Its learning can be empirical, and the threshold can be constant. It is also possible to vary it according to certain parameters such as the date, because certain calendar events are likely to influence the rates of anomalies and fraud (holidays, weekends, etc.). On those events where fraud is more prevalent, the threshold will be increased to obtain a constant number of "at risk" transactions (assuming that human resources remain constant).
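One simple, empirical way to set such a threshold, given here as an assumption rather than as the method of the source, is to take the k-th highest risk observed on a reference set of transactions, where k is the review capacity. The function name and the risk values are hypothetical.

```python
def threshold_for_capacity(risks, k):
    """Threshold such that, ties aside, exactly k risks are >= the threshold."""
    if k <= 0 or not risks:
        return float("inf")
    ranked = sorted(risks, reverse=True)
    return ranked[min(k, len(ranked)) - 1]

# Hypothetical risks on an ordinary day and on a holiday with more fraud.
ordinary_day = [0.05, 0.10, 0.20, 0.60, 0.70]
holiday = [0.30, 0.50, 0.65, 0.80, 0.90]
k = 2  # number of transactions the experts can review

t_ordinary = threshold_for_capacity(ordinary_day, k)
t_holiday = threshold_for_capacity(holiday, k)
print(t_ordinary, t_holiday)  # 0.6 0.8
```

The threshold rises on the holiday, yet in both cases exactly k transactions reach it, which keeps the workload of the human experts constant.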
- the present invention is not limited to the examples and to the embodiment described and shown, but it is capable of numerous variants accessible to those skilled in the art.
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Computer Security & Cryptography (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1752142A FR3064095B1 (fr) | 2017-03-16 | 2017-03-16 | Detection par apprentissage automatique d'anomalies dans un ensemble de transactions bancaires par optimisation de la precision moyenne |
PCT/FR2018/050544 WO2018167404A1 (fr) | 2017-03-16 | 2018-03-09 | Detection par apprentissage automatique d'anomalies dans un ensemble de transactions bancaires par optimisation de la precision moyenne |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3596685A1 (fr) | 2020-01-22 |
Family
ID=59153036
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP18712980.4A Withdrawn EP3596685A1 (fr) | 2017-03-16 | 2018-03-09 | Détection par apprentissage automatique d'anomalies dans un ensemble de transactions bancaires par optimisation de la précision moyenne |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP3596685A1 (fr) |
CN (1) | CN110678890A (fr) |
FR (1) | FR3064095B1 (fr) |
WO (1) | WO2018167404A1 (fr) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112199414B (zh) * | 2020-09-25 | 2023-03-21 | 桦蓥(上海)信息科技有限责任公司 | 一种金融交易数据的综合分析方法 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7865427B2 (en) * | 2001-05-30 | 2011-01-04 | Cybersource Corporation | Method and apparatus for evaluating fraud risk in an electronic commerce transaction |
- 2017-03-16: FR, application FR1752142A, patent FR3064095B1 (fr), not active (Expired - Fee Related)
- 2018-03-09: CN, application CN201880024752.8A, patent CN110678890A (zh), active (Pending)
- 2018-03-09: WO, application PCT/FR2018/050544, patent WO2018167404A1 (fr), status unknown
- 2018-03-09: EP, application EP18712980.4A, patent EP3596685A1 (fr), not active (Withdrawn)
Also Published As
Publication number | Publication date |
---|---|
WO2018167404A1 (fr) | 2018-09-20 |
FR3064095A1 (fr) | 2018-09-21 |
FR3064095B1 (fr) | 2019-06-14 |
CN110678890A (zh) | 2020-01-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019016106A1 (fr) | Systeme d'apprentissage machine pour diverses applications informatiques | |
EP3018615B1 (fr) | Procede de comparaison de donnees ameliore | |
US10825109B2 (en) | Predicting entity outcomes using taxonomy classifications of transactions | |
WO2018192348A1 (fr) | Procédé et dispositif de traitement de paquets de données, et serveur | |
CN110069545B (zh) | 一种行为数据评估方法及装置 | |
EP0863473A1 (fr) | Procédé de planification de requêtes d'un satellite par recuit simulé contraint | |
EP3574462A1 (fr) | Detection automatique de fraudes dans un flux de transactions de paiement par reseaux de neurones integrant des informations contextuelles | |
FR2871256A1 (fr) | Commande de flot de dispositif de stockage | |
FR3064095B1 (fr) | Detection par apprentissage automatique d'anomalies dans un ensemble de transactions bancaires par optimisation de la precision moyenne | |
FR3048840A1 (fr) | ||
CN108776652A (zh) | 一种基于新闻语料的行情预测方法 | |
CN116596657A (zh) | 贷款风险评估方法、装置、存储介质及电子设备 | |
EP3752948A1 (fr) | Procédé de traitement automatique pour l'anonymisation d'un jeu de données numériques | |
WO2018115616A1 (fr) | Moteur de regles universel et optimise pour le traitement de documents de gestion | |
EP4154189A1 (fr) | Procédés d'utilisation sécurisée d'un premier réseau de neurones sur une donnée d'entrée, et d'apprentissage de paramètres d'un deuxième réseau de neurones | |
EP3502904B1 (fr) | Procédé d'amélioration du temps d'exécution d'une application informatique | |
WO2021110763A1 (fr) | Méthode mise en œuvre par ordinateur pour l'allocation d'une pièce comptable à un couple de comptes débiteur/créditeur et l'écriture comptable | |
Peng et al. | Credit scoring model in imbalanced data based on cnn-atcn | |
EP4028954A1 (fr) | Apprentissage en continu pour la détection automatique de fraudes sur un service accessible sur réseau de télécommunication | |
EP4432175A1 (fr) | Procédé de paramétrage d'une chaîne de traitement de données | |
FR3099614A1 (fr) | Mécanisme de détection de fraudes dans un environnement antagoniste | |
FR3143802A3 (fr) | Détection d’anomalies dans les données billettiques de transport en commun | |
Zhang et al. | Personal Loan Default Prediction Based on LightGBM Model and Zhima Credit | |
CN114140153A (zh) | 银行智能营销方法、装置、计算机设备及存储介质 | |
EP3627329A1 (fr) | Procédé de détermination du type de séquence temporelle d'accès mémoire se deroulant lors d'une exécution d'une application informatique |
Legal Events
Code | Title | Description |
---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20190926 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the european patent | Extension state: BA ME |
DAV | Request for validation of the european patent (deleted) | |
DAX | Request for extension of the european patent (deleted) | |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
17Q | First examination report despatched | Effective date: 20200831 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20210112 |