CN114358912A - Risk weight fusion anomaly detection method based on federated learning - Google Patents


Info

Publication number
CN114358912A
CN114358912A
Authority
CN
China
Prior art keywords
model
client
machine learning
training
bank
Prior art date
Legal status
Granted
Application number
CN202111362361.7A
Other languages
Chinese (zh)
Other versions
CN114358912B (en)
Inventor
王楠
张大林
刘娟
Current Assignee
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date
Filing date
Publication date
Application filed by Beijing Jiaotong University filed Critical Beijing Jiaotong University
Priority to CN202111362361.7A priority Critical patent/CN114358912B/en
Publication of CN114358912A publication Critical patent/CN114358912A/en
Application granted
Publication of CN114358912B publication Critical patent/CN114358912B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an anomaly detection method based on risk weight fusion under federated learning. The method comprises the following steps: each banking institution participating in federated learning acts as a client, and each client builds its own machine learning model; in each iteration round, after the machine learning model of each client has been trained on its local data sample set, the parameter update information and risk weight information of the current training round are extracted and uploaded to a central server; the central server performs secure aggregation by fusing all received parameter update information with the risk weight information of each client, then distributes the joint model parameter update information to each client, and each client updates the parameters of its local machine learning model according to the received joint model update information. By aggregating with risk weights, the method strengthens the extraction of the advantageous features of the corresponding participants and improves the precision and recall of abnormal-data detection in anomaly detection applications such as financial fraud identification.

Description

Risk weight fusion anomaly detection method based on federated learning
Technical Field
The invention relates to the technical field of financial data security detection, and in particular to an anomaly detection method based on risk weight fusion under federated learning.
Background
With the popularization of the internet, financial services have become deeply integrated into people's daily lives and provide convenient services; at the same time, financial fraud keeps expanding by exploiting new technologies and causes huge losses to financial institutions and consumers. Traditional risk-control approaches based on statistics and rules cannot effectively detect constantly changing, multi-channel fraud patterns. Machine learning and deep learning techniques provide a new approach to financial transaction fraud detection, and their effectiveness has been verified in scenarios across many fields. However, the detection performance of a machine learning model usually depends on large data sets, while financial data is inherently sensitive and private and cannot be directly pooled and processed across institutions, which limits the application of machine learning technology in the financial field. It is therefore of great value to study how to train machine learning models while guaranteeing data privacy and security.
Privacy-preserving schemes for detecting and using financial data include secure multi-party computation, homomorphic encryption and differential privacy, which protect data privacy from the perspectives of secure multi-party data interaction, data encryption and data perturbation, respectively; however, the data still leaves the local environment, so a leakage risk remains. Federated learning, in which data stays local to each institution and only model parameters are aggregated, protects the privacy and security of user data in an innovative way. Researchers further classify federated learning, according to its range of application, into horizontal federated learning, vertical federated learning and federated transfer learning.
One prior-art method that applies federated learning to financial fraud detection is as follows: federated learning is applied to credit card fraud detection, and through joint training on the data of multiple institutions the AUC metric is improved by 10% compared with traditional fraud detection methods. Applying vertical federated learning combined with a logistic regression algorithm with bounded constraints to credit score prediction significantly improves the AUC and Kolmogorov-Smirnov (KS) statistics, owing to the richer data brought by federated learning.
The above prior-art approach to applying federated learning to financial fraud detection has the following disadvantages: it treats the models of all institutions as equals and performs simple parameter aggregation by averaging, without any specific analysis of the individual characteristics of each institution's data set and model. As a result, the training and convergence of the central model is slow, the model is insensitive to fraud samples, and the precision and recall of fraud identification are low.
Disclosure of Invention
The embodiment of the invention provides an anomaly detection method based on risk weight fusion under federated learning, in order to effectively improve the performance of financial fraud detection.
In order to achieve the purpose, the invention adopts the following technical scheme.
An anomaly detection method based on risk weight fusion under federated learning comprises the following steps:
taking each banking institution participating in federated learning as a client, each client establishing a machine learning model with the same structure, and a central parameter server issuing the initial parameters of each machine learning model;
in each iteration round, after the machine learning model of each client is iteratively trained on its local data sample set, extracting the parameter update information and the risk weight information of the current training round and uploading them to the central server;
the central server performing secure aggregation by fusing all received parameter update information with the risk weight information of each client, then distributing the joint model parameter update information to each client; each client updating the parameters of its local machine learning model according to the received joint model update information, and each client performing anomaly detection on local data with its own machine learning model.
Preferably, the step of taking each banking institution participating in federated learning as a client, each client establishing a machine learning model with the same structure, and the central parameter server issuing the initial parameters of each machine learning model, includes:
a group of local banking institutions is set as the participants of federated learning, and each banking institution is regarded as a client. The clients each establish their own machine learning model; the models have the same structure and the same initialization parameters issued by the central parameter server. During training the clients rely on the central parameter server to perform secure aggregation and synchronization of parameter information. Each client independently uses its own machine learning model to perform anomaly detection on local data, and the machine learning models of all clients form a central joint anomaly detection model through communication iterations.
Preferably, in each iteration, after the machine learning model of each client is iteratively trained on the local data sample set, extracting the parameter update information and risk weight information of the current training round and uploading them to the central server includes:
C fixed banks are set to participate in the joint model training, and each banking institution holds a local data sample set. The data sample set of the c-th banking institution is $D_c = \{(x_i^c, y_i^c)\}_{i=1}^{n_c}$, where $x_i^c$ is a feature vector, $y_i^c$ is a label, $n_c$ denotes the size of the data set of the c-th banking institution participating in federated learning, and C is the total number of banking institutions;
in each training iteration $t = 1, 2, \ldots$, each banking institution trains its machine learning model on its own data sample set and computes the parameter update information $\Delta w_c^t$ of the current round; it also computes the risk weight information of the current round, namely the model accuracy $P_{acc}^c$ and the fraud-severity weight adjustment value $s_c$, and uploads $\Delta w_c^t$, $P_{acc}^c$ and $s_c$ to the central parameter server.
Preferably, each banking institution performing machine learning model training on its own data sample set, computing the parameter update information $\Delta w_c^t$ of the current training round, computing the risk weight information of the current round (the model accuracy $P_{acc}^c$ and the fraud-severity weight adjustment value $s_c$), and uploading $\Delta w_c^t$, $P_{acc}^c$ and $s_c$ to the central parameter server for secure aggregation, includes:
1) $\Delta w_c^t$ denotes the parameter update information obtained in the t-th training round by the c-th banking institution, and is computed as:
$$\Delta w_c^t = w_c^t - w_{t-1} \quad (1)$$
where $w_{t-1}$ denotes the joint model parameters of the previous round and $w_c^t$ denotes the new model parameters generated by the local machine learning model training iterations performed by the c-th bank in this round; each client completes multiple local iterations, and $w_c^t$ denotes the parameters after all gradient updates of this round of joint training are completed;
for each gradient descent step, the local machine learning model parameters are updated according to the loss evaluated on the data sample set and the learning rate $\eta$:
$$w_c^t \leftarrow w_c^t - \eta \, \nabla L_c(x_c, y_c; w_c^t) \quad (2)$$
where $\nabla L_c(x_c, y_c; w_c^t)$ denotes the gradient, with respect to the current parameters, of the average loss of the model on the c-th bank's local data set; the loss value $L_c(x_c, y_c; w_t)$ is the average of the per-sample losses over all samples of the bank's data set:
$$L_c(x_c, y_c; w_t) = \frac{1}{n_c} \sum_{i \in D_c} l(x_i, y_i; w_t) \quad (3)$$
where $l(x_i, y_i; w_t)$ denotes the loss of a single sample, $D_c$ denotes the sample space of the c-th bank's data set, and $n_c$ denotes the number of samples in the c-th bank's data set;
2) computing the model accuracy $P_{acc}^c$: the model accuracy $P_{acc}^c$ represents the proportion of correctly predicted samples, out of the total number of samples, of the local machine learning model obtained by a bank in this training round, and is computed as:
$$P_{acc}^c = \frac{TP_c + TN_c}{TP_c + FP_c + TN_c + FN_c} \quad (4)$$
where $TP_c$ denotes the number of fraud samples correctly predicted by the model, $FP_c$ the number of samples mispredicted as fraud, $TN_c$ the number of benign samples correctly predicted by the model, and $FN_c$ the number of samples mispredicted as benign;
computing the fraud severity value $s_c$ of each client's local data set: each banking institution computes a fraud-sample severity value for its local data set, $S = (s_1, s_2, \ldots, s_C)$; the fraud severity parameter $s_c$ is positively correlated with the size of the data set, the proportion of fraud samples it contains and the fraud amount;
if C banking institutions participate in the federated training, then when the central parameter server aggregates the parameter gradient information, the fraud-severity weight applied to the local machine learning model update of the c-th banking institution is computed as:
$$r_c = \frac{s_c}{\sum_{j=1}^{C} s_j} \quad (6)$$
in formula (6), the numerator $s_c$ denotes the fraud severity level of the c-th bank's data set, the denominator $\sum_{j=1}^{C} s_j$ denotes the sum of the fraud severity levels of all bank data sets participating in the federated training, and $r_c$ is the fraud severity weight of the c-th bank's data set;
3) after each round of client training, the central parameter server computes, for each client, the machine learning model update information fused with the risk weight information, namely:
$$\Delta \hat{w}_c^t = p_c^t \cdot \Delta w_c^t \quad (7)$$
in formula (7), $p_c^t$ denotes the risk weight of the c-th banking institution in this training round; fusing it with the parameter update information $\Delta w_c^t$ of the local machine learning model generates the fraud detection model parameter update $\Delta \hat{w}_c^t$ that is finally aggregated.
Preferably, the central server performing secure aggregation by fusing all received parameter update information with the risk weight information of each client, then distributing the joint model parameter update information to each client, and each client updating the parameters of its local machine learning model according to the received joint model update information, includes:
in the joint training, the learning objective of the central joint anomaly detection model is:
$$\min_w \; l(x, y; w), \qquad l(x, y; w) = \frac{1}{n} \sum_{i=1}^{n} l(x_i, y_i; w) \quad (8)$$
equation (8) represents minimizing the average loss of the joint model over all bank data sets, where $l(x, y; w)$ denotes the average loss of the joint model over all data sets, $n$ denotes the total number of samples aggregated over all bank data sets, and $l(x_i, y_i; w)$ denotes the loss value of the joint model on the i-th sample;
according to the distribution of data set sizes across the local machine learning models, the optimization objective function of the central joint anomaly detection model is:
$$F(w) = \sum_{c=1}^{C} \frac{n_c}{n} \, L_c(x_c, y_c; w) \quad (9)$$
in formula (9), $n_c$ denotes the number of local data set samples owned by the c-th bank, $n$ denotes the sum of the numbers of data set samples of all banks, and $L_c(x_c, y_c; w)$ denotes the average loss value of the joint model over the samples of the c-th bank's data set, defined as:
$$L_c(x_c, y_c; w) = \frac{1}{n_c} \sum_{i \in D_c} l(x_i, y_i; w) \quad (10)$$
where $n_c$ denotes the number of local data set samples owned by the c-th bank and $l(x_i, y_i; w)$ denotes the loss value of the joint model on the i-th sample of the c-th bank's data set;
the central server performs secure aggregation of the received parameter update information and risk weight information uploaded by each client to obtain the joint model parameter update, computed as:
$$w_t = w_{t-1} + \sum_{c=1}^{C} p_c^t \, \Delta w_c^t \quad (11)$$
in formula (11), $w_{t-1}$ is the joint model parameter of the previous training round, $p_c^t$ denotes the risk weight of the c-th bank's local machine learning model in this round, $\Delta w_c^t$ denotes the model parameter update information generated by the c-th bank in this iteration, and $w_t$ is the joint model parameter after this round of update;
the central server distributes the joint model parameter update information to each client, and each client updates its local machine learning model according to the received joint model parameter update information, obtaining the updated central joint anomaly detection model; the above process is iterated for T rounds until the overall central joint anomaly detection model reaches the convergence criterion.
According to the technical scheme provided by the embodiment of the invention, the method strengthens the extraction of the advantageous features of the corresponding participants through risk-weighted aggregation and suppresses the extraction of their weaker features, thereby improving the model training iteration process and the performance of the final model. In particular, in anomaly detection fields such as financial fraud identification, it improves the precision and recall of abnormal data detection.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is an implementation schematic diagram of an anomaly detection method based on risk weight fusion of federal learning according to an embodiment of the present invention;
fig. 2 is a processing flow chart of an anomaly detection method based on risk weight fusion of federal learning according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples in conjunction with the drawings, and the embodiments are not to be construed as limiting the embodiments of the present invention.
The embodiment of the invention is based on a risk weight aggregation algorithm framework under federated learning (FL). It solves the data silo problem in the training process: the business-sensitive data of each institution is trained locally, and a central joint anomaly detection model is constructed with the federated learning framework on the premise of guaranteeing data privacy, thereby realizing fused learning over multi-source sensitive data models. In the training process, the local machine learning model of each institution is built on a convolutional neural network (CNN), and a risk weight aggregation method (termed FedRWA) is proposed to aggregate the parameters of the local machine learning models, so that the central joint anomaly detection model can accurately capture the fraud risk characteristics of each local data set and the model training effect is improved.
An implementation schematic diagram of the risk weight fusion anomaly detection method based on federated learning provided by the embodiment of the present invention is shown in fig. 1, and the specific processing flow is shown in fig. 2. The method includes the following processing steps:
Step S10: each banking institution participating in federated learning is taken as a client; each client establishes a machine learning model with the same network structure, and the central parameter server initializes the parameters of the machine learning model and issues them to each client.
Step S20: in each iteration round, after the machine learning model of each client has been trained for k local iterations on the local data sample set, the parameter update information is extracted, fused with the local risk weight information, encrypted and uploaded to the central server.
Step S30: after the central server performs secure aggregation of all received parameter update information and the risk weight information of each client, it distributes the joint model update information to each client, and each client updates the parameters of its local machine learning model according to the received joint model update information.
The iterative training process of steps S20 and S30 is repeated until the machine learning models of the clients converge. Each client then performs local anomaly detection with its trained machine learning model.
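For illustration only, the following sketch shows one way steps S10–S30 could be organized as code. It is a minimal single-process simulation, assuming a logistic-regression client model in place of the CNN described later and an assumed multiplicative fusion of the accuracy and severity weights; encryption and secure aggregation are omitted, and none of the function names come from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_train(w, X, y, eta=0.1, iters=5):
    """Step S20 (sketch): k local gradient steps of logistic regression on one bank's data."""
    w = w.copy()
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)   # gradient of the average log-loss
        w -= eta * grad
    return w

def fed_rwa_round(w_prev, banks, severities, eta=0.1, iters=5):
    """One FedRWA communication round (steps S20-S30); illustrative only."""
    deltas, accs = [], []
    for X, y in banks:
        w_c = local_train(w_prev, X, y, eta, iters)
        deltas.append(w_c - w_prev)                              # parameter update of this round
        accs.append(np.mean((sigmoid(X @ w_c) > 0.5) == y))      # local model accuracy
    r = np.asarray(severities, dtype=float)
    r = r / r.sum()                                              # fraud-severity weights
    p = np.asarray(accs) * r
    p = p / p.sum()                                              # assumed fusion into risk weights
    return w_prev + sum(pc * d for pc, d in zip(p, deltas))      # weighted aggregation

# Toy usage: three simulated banks with severity levels 1, 3 and 5.
rng = np.random.default_rng(0)
banks = [(rng.normal(size=(200, 4)), rng.integers(0, 2, size=200)) for _ in range(3)]
w = np.zeros(4)
for t in range(10):
    w = fed_rwa_round(w, banks, severities=[1, 3, 5])
```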
Step S10 specifically includes: a group of local banking institutions is set as the participants of federated learning, and each banking institution is regarded as a client. The clients each establish their own machine learning model; the models have the same structure and the same initialization parameters, and differ only in the specific parameter values reached during the training iterations. Each client is connected to and communicates with the central parameter server over a communication network, and the clients rely on the central parameter server to perform secure aggregation and synchronization of parameter information during training.
Each client independently uses its own machine learning model to perform anomaly detection on local data, and through the communication iterations of the subsequent steps the machine learning models of all clients form a central joint anomaly detection model.
Step S20 specifically includes: C fixed banks are set to participate in the joint model training, and each banking institution holds a local data sample set. The data sample set of the c-th banking institution is $D_c = \{(x_i^c, y_i^c)\}_{i=1}^{n_c}$, where $x_i^c$ is a feature vector, $y_i^c$ is a label, $n_c$ denotes the size of the data set of the c-th banking institution participating in federated learning, and C is the total number of banking institutions.
Data sets held by different banking institutions tend to have different characteristics: different data set sizes, different numbers of fraud labels, different degrees of fraud severity, and so on. Therefore, a federated training parameter aggregation strategy based on risk weight adjustment is introduced to better capture the fraud sample characteristics of the different banking institutions. It mainly comprises the following parts:
1) Weight adjustment strategy based on the model accuracy $P_{acc}^c$:
when the joint model aggregates parameter information, the model accuracy $P_{acc}^c$ obtained by testing each local machine learning model on its local data in this training round is used as a weight. This effectively extracts the information of well-trained local machine learning models and accelerates the iterative convergence of the overall model.
2) Weight adjustment strategy $r_c$ based on fraud severity:
because each banking institution suffers fraud attacks of different numbers, frequencies and amounts in its daily business, the collected data sets contain fraudulent-user samples in different proportions and of different severities. A bank can still cope with the risk of small, simple fraud, but has zero tolerance for fraud with serious impact, such as large-amount, organized group fraud. A risk weight adjustment strategy $r_c$ based on fraud severity is therefore proposed, so that during training the joint model absorbs the severe-fraud data samples more intensively, creating a shared model capable of identifying severe fraudulent behavior from which all banking institutions participating in the federated training benefit.
The fraud severity of the data set samples is divided into risk levels, or weight rules are set, according to the actual business scenario of the financial institution. One possible division defines five risk levels for each banking institution's data sample set:
Level 1: the data set contains no fraud samples, or only a small number of mild fraud samples such as short-term overdue samples;
Level 2: the data set contains a small number of fraud samples with small amounts, belonging to the everyday credit risk that banking operations can cope with;
Level 3: the data set contains more fraud samples with relatively small amounts, but the bank needs to pay attention to and guard against the fraud risk;
Level 4: the data set contains more fraud samples and cases with larger fraud amounts, which have a certain impact on banking business and need to be analyzed and prevented as a priority;
Level 5: the data set contains a large number of severe cases such as group fraud with large amounts and high frequency, which seriously affect the corresponding business and require tracing, investigation and prevention.
The local machine learning model of each banking institution is trained on its own data sample set, and the fraud detection model parameter update information of each banking institution is determined from the parameter update obtained by the local model together with its risk weight. In each communication iteration $t = 1, 2, \ldots$, each banking institution trains its machine learning model on its own data sample set, computes the parameter update information $\Delta w_c^t$ of the current round, and computes the risk weight information of the current round, including the local model accuracy $P_{acc}^c$ and the data set fraud severity $s_c$; $\Delta w_c^t$, $P_{acc}^c$ and $s_c$ are then uploaded to the central parameter server as the fraud detection model parameter update information.
1) $\Delta w_c^t$ denotes the parameter update information obtained in the t-th training round by the c-th banking institution, as shown in the formula:
$$\Delta w_c^t = w_c^t - w_{t-1} \quad (1)$$
where $w_{t-1}$ denotes the joint model parameters of the previous round and $w_c^t$ denotes the new model parameters generated by the local machine learning model training iterations performed by the c-th bank in this round. Depending on the mini-batch size B of stochastic gradient descent and the number of passes E over the data set during local training, $w_c^t$ denotes the parameters after all gradient updates of a complete training round are finished.
For each gradient descent step, the local machine learning model parameters are updated according to the loss evaluated on the private data set and the learning rate $\eta$, as shown in the formula:
$$w_c^t \leftarrow w_c^t - \eta \, \nabla L_c(x_c, y_c; w_c^t) \quad (2)$$
where $\nabla L_c(x_c, y_c; w_c^t)$ denotes the gradient, with respect to the current parameters, of the average loss of the model on the c-th bank's local data set. The loss value $L_c(x_c, y_c; w_t)$ is the average of the per-sample losses over all samples of the bank's data set, as shown in the formula:
$$L_c(x_c, y_c; w_t) = \frac{1}{n_c} \sum_{i \in D_c} l(x_i, y_i; w_t) \quad (3)$$
where $l(x_i, y_i; w_t)$ denotes the loss of a single sample, $D_c$ denotes the sample space of the c-th bank's data set, and $n_c$ denotes the number of samples in the c-th bank's data set.
2) The risk weight information of each client in each training round is explained as follows:
a) Computing the model accuracy $P_{acc}^c$: the model accuracy $P_{acc}^c$ represents the proportion of correctly predicted samples, out of the total number of samples, of the local machine learning model obtained by a bank in this training round. The specific calculation formula is:
$$P_{acc}^c = \frac{TP_c + TN_c}{TP_c + FP_c + TN_c + FN_c} \quad (4)$$
where $TP_c$ denotes the number of fraud samples correctly predicted by the model, $FP_c$ the number of samples mispredicted as fraud, $TN_c$ the number of benign samples correctly predicted by the model, and $FN_c$ the number of samples mispredicted as benign.
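A minimal sketch of formula (4): computing the per-round accuracy of a bank's local model from its confusion-matrix counts. The function name and the example counts are assumptions made for the illustration.

```python
def model_accuracy(tp, fp, tn, fn):
    """Formula (4): proportion of correctly predicted samples (fraud and benign)."""
    return (tp + tn) / (tp + fp + tn + fn)

# e.g. 40 fraud samples caught, 10 false alarms, 930 benign correct, 20 fraud missed
p_acc = model_accuracy(tp=40, fp=10, tn=930, fn=20)   # 0.97
```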
b) Computing the fraud (or anomaly) severity level $s_c$ and the weight $r_c$ of each client's local data set:
let the fraud-sample severity of each banking institution's data set be $S = (s_1, s_2, \ldots, s_C)$. If C banking institutions in total participate in the federated training, then when the joint model aggregates the parameter gradient information, the fraud-severity weight applied to the local machine learning model update of the c-th banking institution is computed as:
$$r_c = \frac{s_c}{\sum_{j=1}^{C} s_j} \quad (6)$$
in formula (6), the numerator $s_c$ denotes the fraud severity level of the c-th bank's data set, and the denominator $\sum_{j=1}^{C} s_j$ denotes the sum of the fraud severity levels of all bank data sets participating in the federated training; $r_c$ is then the fraud severity weight of the c-th bank's data set.
For example, if 3 banks participate in the federated training and their local data set fraud severities are rated as levels 1, 3 and 5, then the weight applied to the 2nd bank's local machine learning model during aggregation is $r_2 = 3/(1+3+5) = 1/3$.
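The severity weights of formula (6) can be reproduced directly; the sketch below (with an assumed helper name) recovers the worked example in which three banks report severity levels 1, 3 and 5, giving the second bank a weight of 1/3.

```python
def severity_weights(s):
    """Formula (6): r_c = s_c / sum_j s_j for each participating bank."""
    total = sum(s)
    return [sc / total for sc in s]

r = severity_weights([1, 3, 5])   # [1/9, 1/3, 5/9]; r[1] == 1/3 as in the example
```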
3) After each round of client training, the central parameter server computes and aggregates the local machine learning model update information fused with the risk weight information, namely:
$$\Delta \hat{w}_c^t = p_c^t \cdot \Delta w_c^t \quad (7)$$
in formula (7), $p_c^t$ denotes the risk weight of the c-th banking institution in this training round; fusing it with the parameter update information $\Delta w_c^t$ of the local machine learning model generates the fraud detection model parameter update $\Delta \hat{w}_c^t$ of the final joint model to be aggregated.
Step S30 specifically includes: in the joint training, the learning objective of the central joint anomaly detection model is:
$$\min_w \; l(x, y; w), \qquad l(x, y; w) = \frac{1}{n} \sum_{i=1}^{n} l(x_i, y_i; w) \quad (8)$$
equation (8) represents minimizing the average loss of the joint model over all bank data sets, where $l(x, y; w)$ denotes the average loss of the joint model over all data sets, $n$ denotes the total number of samples aggregated over all bank data sets, and $l(x_i, y_i; w)$ denotes the loss value of the joint model on the i-th sample.
According to the distribution of data set sizes across the local machine learning models, the optimization objective function of the central joint anomaly detection model can be rewritten as:
$$F(w) = \sum_{c=1}^{C} \frac{n_c}{n} \, L_c(x_c, y_c; w) \quad (9)$$
in formula (9), $n_c$ denotes the number of local data set samples owned by the c-th bank and $n$ denotes the sum of the numbers of data set samples of all banks. $L_c(x_c, y_c; w)$ denotes the average loss value of the joint model over the samples of the c-th bank's data set, defined as:
$$L_c(x_c, y_c; w) = \frac{1}{n_c} \sum_{i \in D_c} l(x_i, y_i; w) \quad (10)$$
where $n_c$ denotes the number of local data set samples owned by the c-th bank and $l(x_i, y_i; w)$ denotes the loss value of the joint model on the i-th sample of the c-th bank's data set.
The central server performs secure aggregation of the received parameter update information and risk weight information uploaded by each client to obtain the joint model parameter update, distributes it to each client, and each client updates its local machine learning model according to the received joint model parameter update information, yielding the updated central joint anomaly detection model:
$$w_t = w_{t-1} + \sum_{c=1}^{C} p_c^t \, \Delta w_c^t \quad (11)$$
in formula (11), $w_{t-1}$ is the joint model parameter of the previous training round, $p_c^t$ denotes the risk weight of the c-th bank's local machine learning model in this round, and $\Delta w_c^t$ denotes the model parameter update information generated by the c-th bank in this iteration; $w_t$ is the joint model parameter after this round of update.
The above process is iterated for T rounds until the overall central joint anomaly detection model reaches the convergence criterion.
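The server-side update of formula (11) is sketched below: given the risk weight and update reported for each bank, the joint parameters of the previous round are moved by the weighted sum of the updates. How $p_c^t$ is formed from $P_{acc}^c$ and $r_c$ is an assumption (here a renormalized product), since the text only states that the two are fused.

```python
import numpy as np

def aggregate(w_prev, deltas, accs, sev_weights):
    """Formula (11): w_t = w_{t-1} + sum_c p_c^t * delta_w_c^t (illustrative weight fusion)."""
    p = np.asarray(accs) * np.asarray(sev_weights)   # assumed fusion of P_acc^c and r_c
    p = p / p.sum()                                  # risk weights p_c^t
    return w_prev + sum(pc * d for pc, d in zip(p, deltas))
```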
The above algorithm is referred to as the risk-weight-adjusted federated aggregation algorithm (Risk Weight Federated Aggregation, FedRWA).
In summary, compared with the prior art in which all participants are treated equally during joint model training, the method provided by the invention performs a targeted analysis of each participant's model and data set characteristics during joint model training, strengthens the extraction of the advantageous features of the corresponding participants through risk-weighted aggregation, and suppresses the extraction of their weaker features, thereby improving the model training iteration process and the performance of the final model. In particular, in anomaly detection fields such as financial fraud identification, it improves the precision and recall of abnormal data detection.
The embodiment of the invention weights the aggregation of the multiple model parameters according to the concrete performance of each institution's model during training, such as the verification accuracy $P_{acc}^c$ on local data. Therefore, when the model parameters are aggregated, the parameter update information of the better-trained institution models can be extracted effectively, the iterative convergence of the overall model training is accelerated, and the training time is shortened.
The embodiment of the invention analyzes the statistical characteristics and fraud severity characteristics of the fraud samples in each institution's data set, defines a graded fraud-risk-weight model, and aggregates the parameters of each institution's model with a bias according to the different risk weights. By accurately extracting the fraud sample characteristics of each institution's data set during parameter aggregation, the recognition capability of the central joint anomaly detection model for fraud samples is effectively improved, so that the financial fraud detection effect is improved and severe fraud samples can be detected effectively.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner; for identical or similar parts the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus and system embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant details, reference may be made to the partial description of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. An anomaly detection method based on risk weight fusion under federated learning, characterized by comprising the following steps:
taking each banking institution participating in federated learning as a client, each client establishing a machine learning model with the same structure, and a central parameter server issuing the initial parameters of each machine learning model;
in each iteration round, after the machine learning model of each client is iteratively trained on its local data sample set, extracting the parameter update information and the risk weight information of the current training round and uploading them to the central server;
the central server performing secure aggregation by fusing all received parameter update information with the risk weight information of each client, then distributing the joint model parameter update information to each client, each client updating the parameters of its local machine learning model according to the received joint model update information, and each client performing anomaly detection on local data with its own machine learning model.
2. The method according to claim 1, wherein taking each banking institution participating in federated learning as a client, each client establishing a machine learning model with the same structure, and the central parameter server issuing the initial parameters of each machine learning model, comprises:
setting a group of local banking institutions as the participants of federated learning and regarding each banking institution as a client, the clients each establishing their own machine learning model, the models having the same structure and the same initialization parameters issued by the central parameter server; during training the clients relying on the central parameter server to perform secure aggregation and synchronization of parameter information, each client independently using its own machine learning model to perform anomaly detection on local data, and the machine learning models of all clients forming a central joint anomaly detection model through communication iterations.
3. The method according to claim 1, wherein, in each iteration, after the machine learning model of each client is iteratively trained on the local data sample set, extracting the parameter update information and risk weight information of the current training round and uploading them to the central server comprises:
setting C fixed banks to participate in the joint model training, each banking institution holding a local data sample set, the data sample set of the c-th banking institution being $D_c = \{(x_i^c, y_i^c)\}_{i=1}^{n_c}$, where $x_i^c$ is a feature vector, $y_i^c$ is a label, $n_c$ denotes the size of the data set of the c-th banking institution participating in federated learning, and C is the total number of banking institutions;
in each training iteration $t = 1, 2, \ldots$, each banking institution training its machine learning model on its own data sample set, computing the parameter update information $\Delta w_c^t$ of the current round, computing the risk weight information of the current round, namely the model accuracy $P_{acc}^c$ and the fraud-severity weight adjustment value $s_c$, and uploading $\Delta w_c^t$, $P_{acc}^c$ and $s_c$ to the central parameter server.
4. The method according to claim 3, wherein each banking institution performing machine learning model training on its own data sample set, computing the parameter update information $\Delta w_c^t$ of the current training round, computing the risk weight information of the current round (the model accuracy $P_{acc}^c$ and the fraud-severity weight adjustment value $s_c$), and uploading $\Delta w_c^t$, $P_{acc}^c$ and $s_c$ to the central parameter server for secure aggregation, comprises:
1) $\Delta w_c^t$ denotes the parameter update information obtained in the t-th training round by the c-th banking institution, and is computed as:
$$\Delta w_c^t = w_c^t - w_{t-1} \quad (1)$$
where $w_{t-1}$ denotes the joint model parameters of the previous round and $w_c^t$ denotes the new model parameters generated by the local machine learning model training iterations performed by the c-th bank in this round; each client completes multiple local iterations, and $w_c^t$ denotes the parameters after all gradient updates of this round of joint training are completed;
for each gradient descent step, the local machine learning model parameters are updated according to the loss evaluated on the data sample set and the learning rate $\eta$:
$$w_c^t \leftarrow w_c^t - \eta \, \nabla L_c(x_c, y_c; w_c^t) \quad (2)$$
where $\nabla L_c(x_c, y_c; w_c^t)$ denotes the gradient, with respect to the current parameters, of the average loss of the model on the c-th bank's local data set; the loss value $L_c(x_c, y_c; w_t)$ is the average of the per-sample losses over all samples of the bank's data set:
$$L_c(x_c, y_c; w_t) = \frac{1}{n_c} \sum_{i \in D_c} l(x_i, y_i; w_t) \quad (3)$$
where $l(x_i, y_i; w_t)$ denotes the loss of a single sample, $D_c$ denotes the sample space of the c-th bank's data set, and $n_c$ denotes the number of samples in the c-th bank's data set;
2) computing the model accuracy $P_{acc}^c$: the model accuracy $P_{acc}^c$ represents the proportion of correctly predicted samples, out of the total number of samples, of the local machine learning model obtained by a bank in this training round, and is computed as:
$$P_{acc}^c = \frac{TP_c + TN_c}{TP_c + FP_c + TN_c + FN_c} \quad (4)$$
where $TP_c$ denotes the number of fraud samples correctly predicted by the model, $FP_c$ the number of samples mispredicted as fraud, $TN_c$ the number of benign samples correctly predicted by the model, and $FN_c$ the number of samples mispredicted as benign;
computing the fraud severity value $s_c$ of each client's local data set: each banking institution computing a fraud-sample severity value for its local data set, $S = (s_1, s_2, \ldots, s_C)$, the fraud severity parameter $s_c$ being positively correlated with the size of the data set, the proportion of fraud samples it contains and the fraud amount;
if C banking institutions participate in the federated training, then when the central parameter server aggregates the parameter gradient information, the fraud-severity weight applied to the local machine learning model update of the c-th banking institution is computed as:
$$r_c = \frac{s_c}{\sum_{j=1}^{C} s_j} \quad (6)$$
in formula (6), the numerator $s_c$ denotes the fraud severity level of the c-th bank's data set, the denominator $\sum_{j=1}^{C} s_j$ denotes the sum of the fraud severity levels of all bank data sets participating in the federated training, and $r_c$ is the fraud severity weight of the c-th bank's data set;
3) after each round of client training, the central parameter server computes, for each client, the machine learning model update information fused with the risk weight information, namely:
$$\Delta \hat{w}_c^t = p_c^t \cdot \Delta w_c^t \quad (7)$$
in formula (7), $p_c^t$ denotes the risk weight of the c-th banking institution in this training round; fusing it with the parameter update information $\Delta w_c^t$ of the local machine learning model generates the fraud detection model parameter update $\Delta \hat{w}_c^t$ that is finally aggregated.
5. The method according to claim 3 or 4, wherein the central server performing secure aggregation by fusing all received parameter update information with the risk weight information of each client, then distributing the joint model parameter update information to each client, and each client updating the parameters of its local machine learning model according to the received joint model update information, comprises:
in the joint training, the learning objective of the central joint anomaly detection model is:
$$\min_w \; l(x, y; w), \qquad l(x, y; w) = \frac{1}{n} \sum_{i=1}^{n} l(x_i, y_i; w) \quad (8)$$
equation (8) represents minimizing the average loss of the joint model over all bank data sets, where $l(x, y; w)$ denotes the average loss of the joint model over all data sets, $n$ denotes the total number of samples aggregated over all bank data sets, and $l(x_i, y_i; w)$ denotes the loss value of the joint model on the i-th sample;
according to the distribution of data set sizes across the local machine learning models, the optimization objective function of the central joint anomaly detection model is:
$$F(w) = \sum_{c=1}^{C} \frac{n_c}{n} \, L_c(x_c, y_c; w) \quad (9)$$
in formula (9), $n_c$ denotes the number of local data set samples owned by the c-th bank, $n$ denotes the sum of the numbers of data set samples of all banks, and $L_c(x_c, y_c; w)$ denotes the average loss value of the joint model over the samples of the c-th bank's data set, defined as:
$$L_c(x_c, y_c; w) = \frac{1}{n_c} \sum_{i \in D_c} l(x_i, y_i; w) \quad (10)$$
where $n_c$ denotes the number of local data set samples owned by the c-th bank and $l(x_i, y_i; w)$ denotes the loss value of the joint model on the i-th sample of the c-th bank's data set;
the central server performing secure aggregation of the received parameter update information and risk weight information uploaded by each client to obtain the joint model parameter update, computed as:
$$w_t = w_{t-1} + \sum_{c=1}^{C} p_c^t \, \Delta w_c^t \quad (11)$$
in formula (11), $w_{t-1}$ is the joint model parameter of the previous training round, $p_c^t$ denotes the risk weight of the c-th bank's local machine learning model in this round, $\Delta w_c^t$ denotes the model parameter update information generated by the c-th bank in this iteration, and $w_t$ is the joint model parameter after this round of update;
the central server distributing the joint model parameter update information to each client, each client updating its local machine learning model according to the received joint model parameter update information to obtain the updated central joint anomaly detection model; the above process being iterated for T rounds until the overall central joint anomaly detection model reaches the convergence criterion.
CN202111362361.7A 2021-11-17 2021-11-17 Abnormality detection method for risk weight fusion based on federal learning Active CN114358912B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111362361.7A CN114358912B (en) 2021-11-17 2021-11-17 Abnormality detection method for risk weight fusion based on federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111362361.7A CN114358912B (en) 2021-11-17 2021-11-17 Abnormality detection method for risk weight fusion based on federal learning

Publications (2)

Publication Number Publication Date
CN114358912A true CN114358912A (en) 2022-04-15
CN114358912B CN114358912B (en) 2024-10-15

Family

ID=81095574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111362361.7A Active CN114358912B (en) 2021-11-17 2021-11-17 Abnormality detection method for risk weight fusion based on federal learning

Country Status (1)

Country Link
CN (1) CN114358912B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105240A (en) * 2019-12-12 2020-05-05 中国科学院深圳先进技术研究院 Resource-sensitive combined financial fraud detection model training method and detection method
CN113112027A (en) * 2021-04-06 2021-07-13 杭州电子科技大学 Federal learning method based on dynamic adjustment model aggregation weight
CN113033712A (en) * 2021-05-21 2021-06-25 华中科技大学 Multi-user cooperative training people flow statistical method and system based on federal learning
CN113609521A (en) * 2021-07-27 2021-11-05 广州大学 Federated learning privacy protection method and system based on countermeasure training

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JING JIANG ET AL: "Decentralized Knowledge Acquisition for Mobile Internet Applications", World Wide Web (2020), pages 2653-2669 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114723071A (en) * 2022-04-26 2022-07-08 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Federal learning method and device based on client classification and information entropy
CN114723071B (en) * 2022-04-26 2023-04-07 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Federal learning method and device based on client classification and information entropy
CN114785605A (en) * 2022-04-28 2022-07-22 中国电信股份有限公司 Method, device and equipment for determining network anomaly detection model and storage medium
CN114785605B (en) * 2022-04-28 2023-12-12 中国电信股份有限公司 Determination method, device, equipment and storage medium of network anomaly detection model
CN114912705A (en) * 2022-06-01 2022-08-16 南京理工大学 Optimization method for heterogeneous model fusion in federated learning
CN114926154A (en) * 2022-07-20 2022-08-19 江苏华存电子科技有限公司 Protection switching method and system for multi-scene data identification
WO2024039301A1 (en) * 2022-08-19 2024-02-22 National University Of Singapore Federation of scoring systems
CN115170565A (en) * 2022-09-06 2022-10-11 浙商银行股份有限公司 Image fraud detection method and device based on automatic neural network architecture search
CN115170565B (en) * 2022-09-06 2022-12-27 浙商银行股份有限公司 Image fraud detection method and device based on automatic neural network architecture search
CN116703553A (en) * 2023-08-07 2023-09-05 浙江鹏信信息科技股份有限公司 Financial anti-fraud risk monitoring method, system and readable storage medium
CN116703553B (en) * 2023-08-07 2023-12-05 浙江鹏信信息科技股份有限公司 Financial anti-fraud risk monitoring method, system and readable storage medium

Also Published As

Publication number Publication date
CN114358912B (en) 2024-10-15

Similar Documents

Publication Publication Date Title
CN114358912B (en) Abnormality detection method for risk weight fusion based on federal learning
Cao et al. Fltrust: Byzantine-robust federated learning via trust bootstrapping
Nicholls et al. Financial cybercrime: A comprehensive survey of deep learning approaches to tackle the evolving financial crime landscape
Wang et al. Beyond inferring class representatives: User-level privacy leakage from federated learning
US20220358516A1 (en) Advanced learning system for detection and prevention of money laundering
US11263644B2 (en) Systems and methods for detecting unauthorized or suspicious financial activity
CN113362160B (en) Federal learning method and device for credit card anti-fraud
US20060202012A1 (en) Secure data processing system, such as a system for detecting fraud and expediting note processing
CN109344583B (en) Threshold determination and body verification method and device, electronic equipment and storage medium
Nune et al. Novel artificial neural networks and logistic approach for detecting credit card deceit
CN110084609B (en) Transaction fraud behavior deep detection method based on characterization learning
US20230186311A1 (en) Fraud Detection Methods and Systems Based on Evolution-Based Black-Box Attack Models
Khodabakhshi et al. Fraud detection in banking using knn (k-nearest neighbor) algorithm
Upreti et al. Enhanced algorithmic modelling and architecture in deep reinforcement learning based on wireless communication Fintech technology
Wang Analysis of financial business model towards big data and its applications
Kaur et al. Analysis on Credit Card Fraud Detection and Prevention using Data Mining and Machine Learning Techniques
CN111260372B (en) Resource transfer user group determination method, device, computer equipment and storage medium
Rajkumar et al. Intelligent System for Fraud Detection in Online Banking using Improved Particle Swarm Optimization and Support Vector Machine
CN116523602A (en) Financial product potential user recommendation method for multi-party semi-supervised learning
Chaudhry et al. Fraud Detection and Prevention for a Secure Financial Future Using Artificial Intelligence
Parthasarathy et al. Comparative case study of machine learning classification techniques using imbalanced credit card fraud datasets
US11853278B2 (en) Systems and methods for combining graph embedding and random forest classification for improving classification of distributed ledger activities
Sanni et al. A Predictive Cyber Threat Model for Mobile Money Services
CN115659387A (en) Neural-channel-based user privacy protection method, electronic device and medium
US11935331B2 (en) Methods and systems for real-time electronic verification of content with varying features in data-sparse computer environments

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant