CN113159918B - Bank client group mining method based on federal group penetration - Google Patents
Bank client group mining method based on federal group penetration
- Publication number
- CN113159918B (application CN202110380531.8A)
- Authority
- CN
- China
- Prior art keywords
- bank
- client
- group
- customer
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/02—Banking, e.g. interest calculation or account maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- General Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Finance (AREA)
- Health & Medical Sciences (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Technology Law (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a bank client group mining method based on federal group penetration, in which a coordinator acts as the data aggregator. The method comprises the following steps: the bank ends compute a privacy-preserving intersection over their client networks to obtain the overlapping clients; each bank end computes a local client similarity matrix, encrypts it with homomorphic encryption, and sends it to the coordination end; the coordination end aggregates the local similarity matrices into an encrypted global similarity matrix and sends it to each bank end; each bank end finds all k-cliques on the overlapping client network formed by the overlapping clients to obtain a k-clique set; each bank end performs k-clique penetration on the overlapping client network according to the decrypted global similarity matrix and the k-clique set; and each bank end divides the bank client groups according to the k-clique penetration result and outputs the mining result. By combining the client information of multiple banks while protecting the data privacy of bank clients, the invention improves the accuracy of bank client group mining.
Description
Technical Field
The invention relates to the technical field of federated learning, and in particular to a bank client group mining method based on federal group penetration.
Background
Banks need to understand customer behavior and mine value around customer needs, and they need more customer information to make that analysis accurate. In recent years, privacy leaks on the internet have occurred one after another, drawing growing attention from users, and governments are paying more and more attention to network security. The European Union put the General Data Protection Regulation (GDPR) into effect in 2018 to protect users' data privacy, and many countries, including China and the United States, have successively formulated and refined privacy protection regulations that penalize privacy leaks. Federated learning is a decentralized, privacy-preserving, distributed machine learning framework proposed by Google; it supports distributed parallel processing of large-scale data and, through local computation and encrypted transmission, ensures that confidential data at a bank end is not leaked during computation. Research on a bank client group mining method based on federal group penetration is therefore important. Such a method mines bank client groups by combining the client information of multiple banks while protecting the data privacy of bank clients, so the client data held by the banks can be fully used without violating privacy protection law. This makes client group mining more accurate and in turn supports high-quality bank customer profiles, precise advertisement placement, and the detection of financial crime.
Disclosure of Invention
The invention aims to provide a bank client group mining method based on federal group penetration, which can more accurately divide bank client groups while protecting bank client privacy.
In order to achieve the purpose, the technical scheme of the invention is as follows: a bank customer group mining method based on federal group penetration provides a system which comprises a bank end overlapping customer identification module, a bank end customer similarity calculation module, a coordination end customer similarity aggregation module, a bank end customer network k group discovery module, a bank end customer network k group penetration module and a bank end customer group division module; the system carries out bank customer group mining according to the following steps:
Step S1: the bank end overlapping customer identification module reads, at each bank end P_h, the bank client network G(V, E, R, A), where V is the client set, E is the edge set, R is the feature set, and A is the customer feature matrix; one bank end is selected at random to generate an RSA key pair and sends the RSA public key to all other bank ends; the selected bank end encrypts its client IDs with the RSA public key and computes, pairwise, the intersection with the client IDs that each other bank end has encrypted with the same RSA public key; the selected bank end then takes the common intersection of these pairwise intersections to obtain the overlapping client set and sends it to the other bank ends, so that each bank end P_h obtains the common overlapping client set X_h;
Step S2: the bank end customer similarity calculation module randomly selects one bank end to generate a key pair for a homomorphic encryption algorithm and sends the key pair to all other bank ends; each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm, and sends it to the coordination end, where a_h is the feature vector of a client and the coordination end is the data aggregator that aggregates the encrypted data sent by the bank ends P_h; the coordination end receives the encrypted dimensions |a_h| from each bank end P_h, adds them in their encrypted state to obtain the global customer feature matrix dimension, and sends it to each bank end P_h; each bank end P_h decrypts the global customer feature matrix dimension with the private key of the homomorphic encryption algorithm and, from that dimension and its customer feature matrix A_h, computes the local similarity between overlapping clients to obtain the local similarity matrix S_h of the overlapping clients; each bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm and sends the encrypted S_h to the coordination end;
Step S3: the coordination end customer similarity aggregation module receives, at the coordination end, the encrypted local similarity matrix S_h sent by each bank end P_h; the coordination end adds the matrices S_h in their encrypted state to obtain the encrypted global client similarity matrix and sends it to each bank end P_h;
Step S4: the bank end customer network k group discovery module finds, at each bank end P_h, all k-cliques on the overlapping client network formed by the overlapping clients, obtaining a k-clique set;
Step S5: the bank end customer network k group penetration module decrypts, at each bank end P_h, the global client similarity matrix sent by the coordination end with the private key of the homomorphic encryption algorithm; each bank end P_h then performs k-clique penetration according to the decrypted global similarity matrix and the k-clique set to obtain a clique graph;
Step S6: the bank end customer group division module computes, at each bank end P_h, the connected branches of the clique graph; the node set of each connected branch is one bank customer group, and the set of connected branches is the group division C of the overlapping clients X_h on the bank client network G; the final group division result C of the bank client network is output.
In an embodiment of the present invention, the step S1 specifically includes the following steps:
Step S11: randomly select a bank end P_i (i ∈ h) to generate an RSA key pair and send the RSA public key to the other bank ends P_j (j ∈ h, j ≠ i);
Step S12: bank end P_i encrypts the clients V_i of its bank client network G_i with the RSA public key, computes the intersection with each other bank end P_j under privacy protection, and decrypts the result with the RSA private key to obtain X_{i,j};
Step S13: bank end P_i takes the common intersection of the obtained intersection clients X_{i,j} to obtain the overlapping client set X_i = ∪{X_{i,j}};
Step S14: bank end P_i sends the overlapping client set X_i to the other bank ends P_j, so that all bank ends P_h obtain the overlapping client set X_h = X_i.
In an embodiment of the present invention, the step S2 specifically includes the following steps:
Step S21: bank end P_i generates a key pair for the homomorphic encryption algorithm and sends the key pair to the other bank ends P_j;
Step S22: each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordination end, where E() is the encryption function;
Step S23: the coordination end receives the encrypted local customer feature matrix dimension E(|a_h|) sent by each bank end P_h;
Step S24: the coordination end adds the values E(|a_h|) to obtain the encrypted global customer feature matrix dimension;
Step S26: bank end P_h receives the encrypted global customer feature matrix dimension from the coordination end and decrypts it with the private key of the homomorphic encryption algorithm to obtain the global customer feature matrix dimension, where D() is the decryption function;
Step S27: bank end P_h computes the local similarity matrix S_h of the overlapping clients X_h from the global customer feature matrix dimension and its customer feature matrix A_h, where the similarity between overlapping clients is computed as shown in formula (4);
where a_i and a_j are the attribute vectors of overlapping clients v_i and v_j, ⊕ is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity of clients v_i and v_j;
Step S28: bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm to obtain E(S_h) and sends E(S_h) to the coordination end.
In an embodiment of the present invention, the step S3 specifically includes the following steps:
Step S31: the coordination end receives the encrypted local similarity matrix E(S_h) sent by each bank end P_h;
Step S32: the coordination end adds the local similarity matrices E(S_h) to obtain the encrypted global similarity matrix.
In an embodiment of the present invention, the step S4 specifically includes the following steps:
Step S41: bank end P_h computes, on G_h, the overlapping client network formed by the overlapping client set X_h;
Step S42: bank end P_h uses a k-clique discovery algorithm to find the k-cliques of the overlapping client network, obtaining a k-clique set, where a k-clique is a sub-network of k overlapping clients in which every overlapping client has an association relationship with all the other overlapping clients.
In an embodiment of the present invention, the step S5 specifically includes the following steps:
Step S51: bank end P_h decrypts the encrypted global similarity matrix with the private key of the homomorphic encryption algorithm to obtain the global similarity matrix;
Step S52: bank end P_h constructs a clique graph, taking each k-clique in the k-clique set as a node;
Step S53: bank end P_h computes the similarity between every two k-cliques from the decrypted global similarity matrix and, if the similarity is greater than the set threshold α, adds an edge between the two k-cliques to the clique graph, where the threshold α is 0.8 and the similarity between two k-cliques is computed as shown in formula (5);
where v_i and v_j are clients in the overlapping client network, C_p and C_q are k-cliques, s(v_i, v_j) is the similarity of clients v_i and v_j, s(C_p, C_q) is the similarity of k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if clients v_i and v_j have an association relationship in the overlapping client network and 0 otherwise.
In an embodiment of the present invention, the step S6 specifically includes the following steps:
Step S62: bank end P_h merges all the clients of each connected branch into one client group to obtain the bank customer group set C;
Step S63: bank end P_h writes the clients v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});
Step S64: output the vector set {R_i}, 0<i<m, where m is the number of customer groups and each row represents one customer group.
The invention also provides a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions when executed by the processor being capable of performing the method steps as described above.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention provides a bank client group mining method based on federal group penetration, which can more accurately mine the relation between bank clients while protecting the privacy of the bank clients.
(2) The invention provides a new cluster similarity measurement index, simultaneously considers network topological structure and characteristic information, and improves the accuracy of bank customer group division by cluster penetration.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in fig. 1, the present embodiment provides a bank client group mining method based on federal group penetration, and provides a system, which includes a bank-side overlapping client identification module, a bank-side client similarity calculation module, a coordination-side client similarity aggregation module, a bank-side client network k group discovery module, a bank-side client network k group penetration module, and a bank-side client group division module; the system carries out bank customer group mining according to the following steps:
Step S1: the bank end overlapping customer identification module reads, at each bank end P_h, the bank client network G(V, E, R, A), where V is the client set, E is the edge set, R is the feature set, and A is the customer feature matrix. One bank end is selected at random to generate an RSA key pair and sends the RSA public key to all other bank ends. The selected bank end encrypts its client IDs with the RSA public key and computes, pairwise, the intersection with the client IDs that each other bank end has encrypted with the same RSA public key. The selected bank end then takes the common intersection of these pairwise intersections to obtain the overlapping client set and sends it to the other bank ends, so that each bank end P_h obtains the common overlapping client set X_h;
Step S2: the bank end customer similarity calculation module randomly selects one bank end to generate a key pair for the homomorphic encryption algorithm and sends the key pair to all other bank ends. Each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm, and sends it to the coordination end, where a_h is the feature vector of a client and the coordination end is the data aggregator that aggregates the encrypted data sent by the bank ends P_h. The coordination end receives the encrypted dimensions |a_h| from each bank end P_h, adds them in their encrypted state to obtain the global customer feature matrix dimension, and sends it to each bank end P_h. Each bank end P_h decrypts the global customer feature matrix dimension with the private key of the homomorphic encryption algorithm and, from that dimension and its customer feature matrix A_h, computes the local similarity between overlapping clients to obtain the local similarity matrix S_h of the overlapping clients. Each bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm and sends the encrypted S_h to the coordination end;
Step S3: the coordination end customer similarity aggregation module receives, at the coordination end, the encrypted local similarity matrix S_h sent by each bank end P_h. The coordination end adds the matrices S_h in their encrypted state to obtain the encrypted global client similarity matrix and sends it to each bank end P_h;
Step S4: the bank end customer network k group discovery module finds, at each bank end P_h, all k-cliques on the overlapping client network formed by the overlapping clients, obtaining a k-clique set;
Step S5: the bank end customer network k group penetration module decrypts, at each bank end P_h, the global client similarity matrix sent by the coordination end with the private key of the homomorphic encryption algorithm; each bank end P_h then performs k-clique penetration according to the decrypted global similarity matrix and the k-clique set to obtain a clique graph;
Step S6: the bank end customer group division module computes, at each bank end P_h, the connected branches of the clique graph. The node set of each connected branch is one bank customer group, and the set of connected branches is the group division C of the overlapping clients X_h on the bank client network G. The final group division result C of the bank client network is output.
Further, the step S1 specifically includes the following steps:
Step S11: randomly select a bank end P_i (i ∈ h) to generate an RSA key pair and send the RSA public key to the other bank ends P_j (j ∈ h, j ≠ i);
Step S12: bank end P_i encrypts the clients V_i of its bank client network G_i with the RSA public key, computes the intersection with each other bank end P_j under privacy protection, and decrypts the result with the RSA private key to obtain X_{i,j};
Step S13: bank end P_i takes the common intersection of the obtained intersection clients X_{i,j} to obtain the overlapping client set X_i = ∪{X_{i,j}};
Step S14: bank end P_i sends the overlapping client set X_i to the other bank ends P_j, so that all bank ends P_h obtain the overlapping client set X_h = X_i.
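Steps S11 to S14 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the toy key built from two Mersenne primes, the SHA-256 hashing of client IDs, and the names `encrypt_id` and `overlapping_clients` are hypothetical and do not come from the patent. Deterministic textbook RSA is used so that equal IDs give equal ciphertexts; a production system would use a hardened private set intersection protocol. The text writes the combination step as X_i = ∪{X_{i,j}} while calling it a common intersection; the sketch keeps only clients present at every bank, which is an assumption about the intended meaning.

```python
import hashlib

# Toy RSA public key (assumption): N is the product of two known Mersenne
# primes, far too small for real use; E is the common exponent 65537.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537


def encrypt_id(client_id: str) -> int:
    """Hash the ID to an integer below N, then apply textbook RSA encryption."""
    digest = int.from_bytes(hashlib.sha256(client_id.encode()).digest(), "big")
    return pow(digest % N, E, N)


def overlapping_clients(selected_ids, other_banks_ids):
    """Return the clients of the selected bank that every other bank also holds."""
    # The selected bank P_i keeps a ciphertext -> ID table for its own clients.
    table = {encrypt_id(cid): cid for cid in selected_ids}
    common = set(table)
    for ids in other_banks_ids:
        # Each other bank P_j contributes ciphertexts only, never raw IDs.
        encrypted_j = {encrypt_id(cid) for cid in ids}
        pairwise = set(table) & encrypted_j   # X_{i,j} in encrypted form
        common &= pairwise                    # combine the pairwise results
    return {table[c] for c in common}


bank_a = ["c01", "c02", "c03", "c04"]
bank_b = ["c02", "c03", "c04", "c09"]
bank_c = ["c03", "c04", "c07"]
overlap = overlapping_clients(bank_a, [bank_b, bank_c])
print(sorted(overlap))  # ['c03', 'c04']
```

Here the selected bank maps matching ciphertexts back to its own IDs through a lookup table instead of decrypting them, which sidesteps the hash inversion that a literal decryption step would require.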
Further, the step S2 specifically includes the following steps:
Step S21: bank end P_i generates a key pair for the homomorphic encryption algorithm and sends the key pair to the other bank ends P_j;
Step S22: each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordination end, where E() is the encryption function;
Step S23: the coordination end receives the encrypted local customer feature matrix dimension E(|a_h|) sent by each bank end P_h;
Step S24: the coordination end adds the values E(|a_h|) to obtain the encrypted global customer feature matrix dimension;
Step S26: bank end P_h receives the encrypted global customer feature matrix dimension from the coordination end and decrypts it with the private key of the homomorphic encryption algorithm to obtain the global customer feature matrix dimension, where D() is the decryption function;
Step S27: bank end P_h computes the local similarity matrix S_h of the overlapping clients X_h from the global customer feature matrix dimension and its customer feature matrix A_h, where the similarity between overlapping clients is computed as shown in formula (7);
where a_i and a_j are the attribute vectors of overlapping clients v_i and v_j, ⊕ is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity of clients v_i and v_j;
Step S28: bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm to obtain E(S_h) and sends E(S_h) to the coordination end.
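The encrypted addition in steps S22 to S26 can be sketched with an additively homomorphic scheme. The patent does not name the homomorphic encryption algorithm; the sketch below assumes Paillier with a toy key built from two Mersenne primes, and the function names `encrypt`, `add_encrypted`, and `decrypt` are hypothetical. It shows the coordination end combining ciphertexts without ever seeing a plaintext dimension.

```python
import math
import secrets

# Toy Paillier key (assumption): fixed Mersenne primes, not secure for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                      # public key
N2 = N * N
PHI = (P - 1) * (Q - 1)        # private key
MU = pow(PHI, -1, N)           # modular inverse, requires Python 3.8+


def encrypt(m: int) -> int:
    """E(m) = (1 + m*N) * r^N mod N^2 with a fresh random r coprime to N."""
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying two ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N2


def decrypt(c: int) -> int:
    """D(c) = L(c^PHI mod N^2) * MU mod N, where L(x) = (x - 1) // N."""
    return (pow(c, PHI, N2) - 1) // N * MU % N


# Each bank end encrypts its local feature dimension |a_h| (made-up values).
local_dims = {"P1": 5, "P2": 8, "P3": 3}
ciphertexts = [encrypt(d) for d in local_dims.values()]

# The coordination end adds the ciphertexts; it holds no private key.
total = ciphertexts[0]
for c in ciphertexts[1:]:
    total = add_encrypted(total, c)

# A bank end decrypts the aggregate to get the global feature dimension.
global_dim = decrypt(total)
print(global_dim)  # 16
```

The same ciphertext addition would apply entry by entry to the similarity matrices E(S_h), provided the real-valued similarities are first scaled to integers, which is a further assumption beyond the text.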
Further, the step S3 specifically includes the following steps:
Step S31: the coordination end receives the encrypted local similarity matrix E(S_h) sent by each bank end P_h;
Step S32: the coordination end adds the local similarity matrices E(S_h) to obtain the encrypted global similarity matrix.
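The reason each bank end normalizes by the global feature dimension becomes visible in a small plaintext sketch. The similarity formula itself is not reproduced in this text, so the form used below is an assumption: the number of positions where two binary feature vectors agree (XOR equal to 0), divided by the global dimension. Under that assumption the element-wise sum of the local matrices equals the fraction of matching features across all banks. The names `local_similarity` and `aggregate` and the data are hypothetical, and in the method itself the addition is carried out on ciphertexts rather than plaintext values.

```python
def local_similarity(features, global_dim):
    """Pairwise similarity from one bank's binary features (assumed form).

    features: one 0/1 feature list per overlapping client, all the same length.
    Each entry counts the positions where two clients agree (XOR == 0) and
    divides by the global feature dimension rather than the local one.
    """
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            matches = sum(1 for x, y in zip(features[i], features[j]) if x ^ y == 0)
            sim[i][j] = matches / global_dim
    return sim


def aggregate(matrices):
    """Element-wise sum of the local matrices, as the coordination end does."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


# Two banks, three overlapping clients; bank 1 holds 2 features, bank 2 holds 3.
bank1 = [[1, 0], [1, 1], [0, 1]]
bank2 = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
global_dim = 2 + 3
S = aggregate([local_similarity(bank1, global_dim),
               local_similarity(bank2, global_dim)])
for row in S:
    print([round(x, 2) for x in row])
```

In this example clients 0 and 1 agree on 4 of the 5 features held across both banks, so their aggregated similarity is 4/5.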
Further, the step S4 specifically includes the following steps:
Step S41: bank end P_h computes, on G_h, the overlapping client network formed by the overlapping client set X_h;
Step S42: bank end P_h uses a k-clique discovery algorithm to find the k-cliques of the overlapping client network, obtaining a k-clique set, where a k-clique is a sub-network of k overlapping clients in which every overlapping client has an association relationship with all the other overlapping clients.
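The k-clique definition in step S42 can be checked directly by brute force. The patent does not say which k-clique discovery algorithm is used; the sketch below simply tests every combination of k clients, which is only practical for small networks. The function name `find_k_cliques` and the example network are hypothetical.

```python
from itertools import combinations


def find_k_cliques(adjacency, k):
    """Return every set of k nodes in which all pairs are connected."""
    nodes = sorted(adjacency)
    cliques = []
    for candidate in combinations(nodes, k):
        # A k-clique requires an edge between every pair of its members.
        if all(v in adjacency[u] for u, v in combinations(candidate, 2)):
            cliques.append(frozenset(candidate))
    return cliques


# Overlapping client network as an adjacency map (made-up example).
overlap_network = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
}
cliques = find_k_cliques(overlap_network, 3)
print(len(cliques))  # 4
```

Clients a, b, c, and d are fully connected, so every three of them form a 3-clique, while e is linked only to d and belongs to none.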
Further, the step S5 specifically includes the following steps:
Step S51: bank end P_h decrypts the encrypted global similarity matrix with the private key of the homomorphic encryption algorithm to obtain the global similarity matrix;
Step S52: bank end P_h constructs a clique graph, taking each k-clique in the k-clique set as a node;
Step S53: bank end P_h computes the similarity between every two k-cliques from the decrypted global similarity matrix and, if the similarity is greater than the set threshold α, adds an edge between the two k-cliques to the clique graph, where the threshold α is 0.8 and the similarity between two k-cliques is computed as shown in formula (8);
where v_i and v_j are clients in the overlapping client network, C_p and C_q are k-cliques, s(v_i, v_j) is the similarity of clients v_i and v_j, s(C_p, C_q) is the similarity of k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if clients v_i and v_j have an association relationship in the overlapping client network and 0 otherwise.
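The thresholding in step S53 can be sketched as follows. The clique similarity formula is not reproduced in this text, so `clique_similarity` below is an assumed stand-in: the mean of s(v_i, v_j) · Ind(v_i, v_j) over all pairs of distinct clients taken from the two k-cliques. Only the threshold α = 0.8 and the rule "add an edge when the similarity exceeds α" come from the description; every name and data value is hypothetical.

```python
def clique_similarity(cp, cq, sim, linked):
    """Assumed stand-in: mean of s(vi, vj) * Ind(vi, vj) over all pairs of
    distinct clients with vi in cp and vj in cq."""
    pairs = [(u, v) for u in cp for v in cq if u != v]
    if not pairs:
        return 0.0
    total = sum(sim[u][v] for u, v in pairs if linked(u, v))
    return total / len(pairs)


def build_clique_graph(cliques, sim, linked, alpha=0.8):
    """Nodes are clique indices; an edge joins two cliques whose similarity > alpha."""
    edges = set()
    for p in range(len(cliques)):
        for q in range(p + 1, len(cliques)):
            if clique_similarity(cliques[p], cliques[q], sim, linked) > alpha:
                edges.add((p, q))
    return edges


# Made-up global similarities: clients a-d are close to each other (0.9).
clients = "abcdefg"
close = set("abcd")
sim = {u: {v: (0.9 if u in close and v in close else 0.1) for v in clients}
       for u in clients}
links = {frozenset(p) for p in ["ab", "ac", "ad", "bc", "bd", "cd", "ef", "eg", "fg"]}


def linked(u, v):
    """Ind(u, v): True when the two clients have an association relationship."""
    return frozenset((u, v)) in links


cliques = [frozenset("abc"), frozenset("bcd"), frozenset("efg")]
edges = build_clique_graph(cliques, sim, linked)
print(edges)  # {(0, 1)}
```

Passing the similarity table and the indicator as arguments keeps the graph construction independent of whichever clique similarity formula is actually used.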
Further, the step S6 specifically includes the following steps:
Step S62: bank end P_h merges all the clients of each connected branch into one client group to obtain the bank customer group set C;
Step S63: bank end P_h writes the clients v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});
Step S64: output the vector set {R_i}, 0<i<m, where m is the number of customer groups and each row represents one customer group.
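Steps S62 to S64 amount to finding the connected branches of the clique graph and flattening each one into a row of clients. A minimal sketch, assuming the clique graph is given as a list of k-cliques plus a set of index pairs; the union-find approach and the names `connected_components` and `customer_groups` are illustrative choices, not taken from the patent.

```python
def connected_components(num_nodes, edges):
    """Group node indices 0..num_nodes-1 into connected branches (union-find)."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    branches = {}
    for node in range(num_nodes):
        branches.setdefault(find(node), []).append(node)
    return list(branches.values())


def customer_groups(cliques, edges):
    """Merge the clients of all cliques in one branch into one sorted row."""
    rows = []
    for branch in connected_components(len(cliques), edges):
        members = set().union(*(cliques[i] for i in branch))
        rows.append(sorted(members))
    return rows


# Clique graph with three k-cliques and one edge between cliques 0 and 1.
cliques = [{"a", "b", "c"}, {"b", "c", "d"}, {"e", "f", "g"}]
rows = customer_groups(cliques, edges={(0, 1)})
for row in rows:
    print(row)
# ['a', 'b', 'c', 'd']
# ['e', 'f', 'g']
```

Each printed row corresponds to one row vector R_i, so the number of rows is the number of customer groups m.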
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the protection scope of the technical solution of the present invention.
Claims (6)
1. A bank client group mining method based on federal group penetration is characterized in that a bank client group mining system based on federal group penetration is provided, and the system comprises a bank end overlapping client identification module, a bank end client similarity calculation module, a coordination end client similarity aggregation module, a bank end client network k group discovery module, a bank end client network k group penetration module and a bank end client group division module; the system carries out bank customer group mining according to the following steps:
step S1, the bank end overlapping client identification module, at each bank end P_h, reads a bank client network G(V, E, R, A), wherein V represents the client set, E represents the edge set, R represents the feature set, and A is the client feature matrix; one bank end is randomly selected to generate an RSA key pair and sends the RSA public key to the other bank ends; the selected bank end encrypts its client IDs with the RSA public key and computes, for each other bank end, the intersection with that bank end's client IDs encrypted with the same RSA public key; the selected bank end computes the common intersection of the obtained intersections to obtain the overlapping client set and sends it to the other bank ends, so that each bank end P_h obtains the common overlapping client set X_h;
step S2, the bank end client similarity calculation module randomly selects one bank end to generate a key pair of a homomorphic encryption algorithm and send the key pair to the other bank ends; each bank end P_h calculates the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm, and sends it to the coordination end; wherein a_h is a customer feature vector, and the coordination end is the data aggregation party that aggregates the encrypted data sent by all bank ends P_h; the coordination end receives the encrypted customer feature matrix dimensions |a_h| sent by the bank ends P_h, adds the |a_h| in their encrypted state to obtain the global customer feature matrix dimension, and sends the global customer feature matrix dimension to each bank end P_h; each bank end P_h decrypts the global customer feature matrix dimension with the private key of the homomorphic encryption algorithm and, based on the global customer feature matrix dimension and the customer feature matrix A_h, calculates the local similarity between the overlapping customers to obtain the local similarity matrix S_h of the overlapping customers; each bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm and sends S_h to the coordination end;
step S3, the coordination end client similarity aggregation module receives, at the coordination end, the local similarity matrix S_h sent by each bank end P_h; the coordination end adds the S_h in their encrypted state to obtain the encrypted client global similarity matrix and sends the global similarity matrix to each bank end P_h;
step S4, the bank end client network k group discovery module, at each bank end P_h, finds all k groups on the overlapping client network formed by the overlapping clients to obtain a k group set; wherein a k group is a sub-client network consisting of k overlapping clients in which each overlapping client has an association relation with all the other overlapping clients (that is, a k-clique);
step S5, the bank end client network k group penetration module, at each bank end P_h, decrypts the client global similarity matrix sent by the coordination end with the private key of the homomorphic encryption algorithm; each bank end P_h performs k group penetration according to the decrypted global similarity matrix and the k group set to obtain a clique graph;
step S6, the bank end client group division module calculates the connected branches of the clique graph at each bank end P_h; the node set in each connected branch is a bank customer group, and the set of connected branches is the customer group set C of the overlapping customer set X_h on the bank customer network G; the customer group set C of the final bank customer network is output;
the step S1 specifically includes the following steps:
step S11, a bank end P_i is randomly selected to generate an RSA key pair and send the RSA public key to the other bank ends P_j;
step S12, bank end P_i encrypts the clients V_i of its bank client network G_i with the RSA public key, computes the intersection with each other bank end P_j, and obtains X_{i,j} by decrypting with the RSA private key;
step S13, bank end P_i computes the common intersection of the obtained intersections X_{i,j} to obtain the common overlapping client set X_i = ∪{X_{i,j}};
step S14, bank end P_i sends the common overlapping client set X_i to the other bank ends P_j, so that each bank end P_h obtains the common overlapping client set X_h = X_i;
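Steps S11 to S14 can be illustrated with a minimal Python sketch. It assumes textbook (unpadded, deterministic) RSA so that equal client IDs produce equal ciphertexts, integer client IDs smaller than the modulus, and toy primes that are far too small for real use. The names `encrypt_ids` and `overlapping_customers` are hypothetical. The sketch follows the claim wording and takes the common intersection of the pairwise intersections X_{i,j}.

```python
# Toy RSA parameters (assumption: illustration only, far too small to be secure).
P, Q, E = 104729, 1299709, 65537
N = P * Q                            # public modulus
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept by the selected bank end P_i


def encrypt_ids(client_ids):
    """Encrypt integer client IDs with the shared RSA public key (E, N)."""
    return {pow(cid, E, N) for cid in client_ids}


def overlapping_customers(own_ids, encrypted_sets_from_others):
    """Run at the selected bank end P_i: intersect ciphertexts, then decrypt."""
    own_encrypted = encrypt_ids(own_ids)
    # Pairwise intersections X_{i,j}, still in encrypted form.
    pairwise = [own_encrypted & other for other in encrypted_sets_from_others]
    # Common intersection across all other bank ends.
    common = set.intersection(*pairwise)
    # Decrypt with the RSA private key to recover the overlapping client IDs.
    return {pow(c, D, N) for c in common}


bank_i = {1001, 1002, 1003, 1004}
bank_j = {1002, 1003, 1004, 1005}
bank_k = {1003, 1004, 1006}
overlap = overlapping_customers(bank_i, [encrypt_ids(bank_j), encrypt_ids(bank_k)])
print(sorted(overlap))  # prints [1003, 1004]
```

Deterministic encryption is what makes ciphertext comparison possible here; a production private set intersection protocol would normally add blinding or hashing, which this sketch leaves out.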
The step S2 specifically includes the following steps:
step S21, bank end P_i generates a key pair of the homomorphic encryption algorithm and sends the key pair to the other bank ends P_j;
step S22, each bank end P_h calculates the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordination end; wherein E() is the encryption function;
step S23, the coordination end receives the encrypted local customer feature matrix dimension E(|a_h|) sent by each bank end P_h;
step S24, the coordination end adds the E(|a_h|) to obtain the global customer feature matrix dimension;
step S25, the coordination end sends the encrypted global customer feature matrix dimension to each bank end P_h;
step S26, each bank end P_h receives the encrypted global customer feature matrix dimension sent by the coordination end and decrypts it with the private key of the homomorphic encryption algorithm to obtain the global customer feature matrix dimension; wherein D() is the decryption function;
step S27, each bank end P_h calculates the local similarity matrix S_h of the overlapping customers according to the decrypted global customer feature matrix dimension and the customer feature matrix A_h; wherein the similarity between overlapping customers is calculated as shown in formula (1);
wherein a_i is the attribute vector of overlapping client v_i, a_j is the attribute vector of overlapping client v_j, ⊕ is the XOR operation, |a_i| is the dimension of the feature vector a_i, and s(a_i, a_j) represents the similarity of clients v_i and v_j;
step S28, bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm to obtain E(S_h) and sends E(S_h) to the coordination end.
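Steps S22 to S27 rely on adding values while they are still encrypted. The claim does not name a specific homomorphic scheme, so the sketch below assumes a toy Paillier cryptosystem (additively homomorphic) with small fixed primes. Formula (1) is not reproduced in this text; `local_similarity` therefore assumes binary feature vectors and counts the positions where the XOR is 0, divided by the decrypted global dimension. Both choices, and all function names, are assumptions made for illustration only.

```python
import random
from fractions import Fraction
from math import gcd

# Toy Paillier parameters (assumption: illustration only, not secure).
P, Q = 104729, 1299709
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1)      # private key part
MU = pow(LAM, -1, N)         # private key part


def encrypt(m):
    """E(m) with generator g = N + 1 and a fresh random r coprime to N."""
    r = random.randrange(1, N)
    while gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def add_encrypted(c1, c2):
    """Multiplying Paillier ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2


def decrypt(c):
    """D(c) using the private key (LAM, MU)."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def local_similarity(a_i, a_j, global_dim):
    """Assumed form: matching binary features divided by the global dimension."""
    matches = sum(1 for x, y in zip(a_i, a_j) if (x ^ y) == 0)
    return Fraction(matches, global_dim)


# Two bank ends hold 3 and 2 features; the coordination end sums the ciphertexts.
dims = [3, 2]
encrypted_total = encrypt(dims[0])
for d in dims[1:]:
    encrypted_total = add_encrypted(encrypted_total, encrypt(d))
global_dim = decrypt(encrypted_total)

s_12 = local_similarity([1, 0, 1], [1, 1, 1], global_dim)
print(global_dim, s_12)  # prints 5 2/5
```

Dividing by the global dimension rather than the local one means that, under this assumed form, the local similarities of all bank ends add up to the fraction of matching features across the full feature set.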
2. The bank customer group mining method based on federal group penetration as claimed in claim 1, wherein the step S3 specifically comprises the following steps:
step S31, the coordination end receives the encrypted local similarity matrix E(S_h) sent by each bank end P_h;
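The aggregation that the coordination end performs in step S3 amounts to an element-wise sum of the local similarity matrices. The sketch below shows that sum on plaintext matrices for readability; in the claimed method the same addition is applied to the encrypted matrices E(S_h) under the homomorphic encryption algorithm. The function name `aggregate_similarity` and the sample values are hypothetical.

```python
from fractions import Fraction


def aggregate_similarity(local_matrices):
    """Element-wise sum of equally sized square matrices, one per bank end."""
    size = len(local_matrices[0])
    return [
        [sum(matrix[row][col] for matrix in local_matrices) for col in range(size)]
        for row in range(size)
    ]


# Hypothetical local similarity matrices of two bank ends for two overlapping clients.
s_1 = [[Fraction(3, 5), Fraction(2, 5)], [Fraction(2, 5), Fraction(3, 5)]]
s_2 = [[Fraction(2, 5), Fraction(1, 5)], [Fraction(1, 5), Fraction(2, 5)]]

global_similarity = aggregate_similarity([s_1, s_2])
print(global_similarity[0][1])  # prints 3/5
```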
3. The bank customer group mining method based on federal group penetration as claimed in claim 2, wherein the step S4 specifically comprises the following steps:
step S41, bank end P_h computes, on G_h, the overlapping client network formed by the overlapping clients
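A k group as defined in step S4 is a set of k overlapping clients that are all pairwise associated. The claim text here does not specify a search algorithm, so the following brute-force enumeration is only an assumed, minimal illustration suited to small networks; `find_k_groups` and the sample network are hypothetical.

```python
from itertools import combinations


def find_k_groups(nodes, edges, k):
    """Return every set of k nodes in which all pairs are joined by an edge."""
    linked = {frozenset(edge) for edge in edges}
    k_groups = []
    for candidate in combinations(sorted(nodes), k):
        if all(frozenset(pair) in linked for pair in combinations(candidate, 2)):
            k_groups.append(candidate)
    return k_groups


# Hypothetical overlapping client network with five clients.
nodes = {1, 2, 3, 4, 5}
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]

print(find_k_groups(nodes, edges, 3))  # prints [(1, 2, 3), (2, 3, 4)]
```

The enumeration checks every k-subset, so its cost grows combinatorially with the number of clients; it is meant to make the definition concrete rather than to scale.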
4. The bank customer group mining method based on federal group penetration as claimed in claim 3, wherein the step S5 specifically comprises the following steps:
step S52, each bank end P_h constructs a clique graph, taking each k group in the k group set as a node;
step S53, each bank end P_h calculates the similarity between two k groups according to the decrypted global similarity matrix and, if the similarity is greater than a preset threshold α, adds an edge between the two k groups in the clique graph; wherein the threshold α is 0.8, and the similarity between two k groups is calculated as shown in the following formula;
wherein v_i and v_j are customers in the overlapping customer network, C_p and C_q are k groups, s(v_i, v_j) is the similarity of customers v_i and v_j, s(C_p, C_q) is the similarity of k groups C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j have an association relation in the overlapping customer network, and 0 otherwise.
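The formula for s(C_p, C_q) is not reproduced in this text. As a sketch only, the code below assumes that the similarity of two k groups is the average of s(v_i, v_j) over the client pairs (v_i in C_p, v_j in C_q) for which Ind(v_i, v_j) = 1, and then adds a clique-graph edge whenever that value exceeds α = 0.8. The averaging rule, the function names and the sample data are assumptions, not the claimed formula.

```python
from itertools import combinations

ALPHA = 0.8  # threshold stated in step S53


def group_similarity(c_p, c_q, sim, linked):
    """Assumed rule: mean similarity over associated pairs (Ind = 1)."""
    pairs = [(u, v) for u in c_p for v in c_q if frozenset((u, v)) in linked]
    if not pairs:
        return 0.0
    return sum(sim[frozenset(pair)] for pair in pairs) / len(pairs)


def build_clique_graph(k_groups, sim, alpha=ALPHA):
    """Nodes are k groups (by index); an edge joins groups with similarity > alpha."""
    linked = set(sim)  # simplification: similarities are given for associated pairs only
    graph_edges = []
    for p, q in combinations(range(len(k_groups)), 2):
        if group_similarity(k_groups[p], k_groups[q], sim, linked) > alpha:
            graph_edges.append((p, q))
    return graph_edges


# Hypothetical decrypted global similarities for associated client pairs.
raw = {(1, 2): 0.9, (1, 3): 0.85, (2, 3): 0.95, (2, 4): 0.9, (3, 4): 0.8,
       (4, 5): 0.3, (5, 6): 0.9, (5, 7): 0.9, (6, 7): 0.9}
sim = {frozenset(pair): value for pair, value in raw.items()}
k_groups = [(1, 2, 3), (2, 3, 4), (5, 6, 7)]

print(build_clique_graph(k_groups, sim))  # prints [(0, 1)]
```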
5. The bank customer group mining method based on federal group penetration as claimed in claim 4, wherein the step S6 specifically comprises the following steps:
step S62, each bank end P_h merges all the clients of each connected branch into one client group to obtain the bank customer group set C;
step S63, each bank end P_h writes the clients v_{i,j} of each group C_i in the bank customer group set C in the form of a row vector R_i = (v_{i,j});
step S64, the vector set {R_i}, 0<i<m, is output, where m is the number of customer groups and each row represents one customer group.
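Steps S62 to S64 can be sketched as a connected-component computation on the clique graph, followed by merging the clients of each component into one row. The union-find approach, the function name `customer_groups`, and the sample input are assumptions chosen for brevity.

```python
def customer_groups(k_groups, clique_graph_edges):
    """Merge the clients of every connected branch of the clique graph."""
    parent = list(range(len(k_groups)))

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in clique_graph_edges:
        parent[find(p)] = find(q)

    merged = {}
    for index, group in enumerate(k_groups):
        merged.setdefault(find(index), set()).update(group)
    # One sorted row vector R_i per customer group.
    return sorted(sorted(clients) for clients in merged.values())


# Hypothetical k groups and clique-graph edges (indices into k_groups).
k_groups = [(1, 2, 3), (2, 3, 4), (5, 6, 7)]
rows = customer_groups(k_groups, [(0, 1)])
for row in rows:
    print(row)
```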
6. A computer-readable storage medium storing computer program instructions executable by a processor, wherein the computer program instructions, when executed by the processor, implement the method steps of any of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110380531.8A CN113159918B (en) | 2021-04-09 | 2021-04-09 | Bank client group mining method based on federal group penetration |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110380531.8A CN113159918B (en) | 2021-04-09 | 2021-04-09 | Bank client group mining method based on federal group penetration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113159918A (en) | 2021-07-23 |
CN113159918B (en) | 2022-06-07 |
Family
ID=76889211
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110380531.8A Expired - Fee Related CN113159918B (en) | 2021-04-09 | 2021-04-09 | Bank client group mining method based on federal group penetration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113159918B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114387064B (en) * | 2022-01-13 | 2024-07-19 | 福州大学 | Electronic commerce platform potential customer recommendation method and system based on comprehensive similarity |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110572253A (en) * | 2019-09-16 | 2019-12-13 | 济南大学 | Method and system for enhancing privacy of federated learning training data |
CN111309788A (en) * | 2020-03-08 | 2020-06-19 | 山西大学 | Community structure discovery method and system for bank customer transaction network |
CN111666460A (en) * | 2020-05-27 | 2020-09-15 | 中国平安财产保险股份有限公司 | User portrait generation method and device based on privacy protection and storage medium |
CN111967910A (en) * | 2020-08-18 | 2020-11-20 | 中国银行股份有限公司 | User passenger group classification method and device |
CN112199702A (en) * | 2020-10-16 | 2021-01-08 | 鹏城实验室 | Privacy protection method, storage medium and system based on federal learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9369273B2 (en) * | 2014-02-26 | 2016-06-14 | Raytheon Bbn Technologies Corp. | System and method for mixing VoIP streaming data for encrypted processing |
Non-Patent Citations (1)
Title |
---|
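A privacy-preserving distributed association rule mining method; Gui Qiong et al.; Microelectronics & Computer (《微电子学与计算机》); 2009-09-05 (No. 09); full text *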
Also Published As
Publication number | Publication date |
---|---|
CN113159918A (en) | 2021-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | When homomorphic encryption marries secret sharing: Secure large-scale sparse logistic regression and applications in risk control | |
WO2021114927A1 (en) | Method and apparatus for multiple parties jointly performing feature assessment to protect privacy security | |
JP6180177B2 (en) | Encrypted data inquiry method and system capable of protecting privacy | |
Liu et al. | Toward highly secure yet efficient KNN classification scheme on outsourced cloud data | |
Ni et al. | On the security of an efficient dynamic auditing protocol in cloud storage | |
US20130339728A1 (en) | Secure product-sum combination system, computing apparatus, secure product-sum combination method and program therefor | |
Alarood et al. | IES: Hyper-chaotic plain image encryption scheme using improved shuffled confusion-diffusion | |
Erkin et al. | Privacy-preserving distributed clustering | |
CN111741020B (en) | Public data set determination method, device and system based on data privacy protection | |
TWI835300B (en) | A data matching method, device, equipment and medium | |
CN111914264A (en) | Index creation method and device, and data verification method and device | |
CN105553980A (en) | Safety fingerprint identification system and method based on cloud computing | |
CN113159918B (en) | Bank client group mining method based on federal group penetration | |
CN111490995A (en) | Model training method and device for protecting privacy, data processing method and server | |
Wang et al. | Image encryption algorithm based on lattice hash function and privacy protection | |
CN112380404B (en) | Data filtering method, device and system | |
CN111475690B (en) | Character string matching method and device, data detection method and server | |
CN114239018A (en) | Method and system for determining number of shared data for protecting privacy data | |
CN117077209B (en) | Large-scale data hiding trace query method | |
CN112132578B (en) | Efficient transaction processing method, tracking method and device based on block chain | |
CN109409111B (en) | Encrypted image-oriented fuzzy search method | |
CN115599959A (en) | Data sharing method, device, equipment and storage medium | |
Zhu et al. | A privacy preserving algorithm for mining distributed association rules | |
Ghunaim et al. | Secure kNN query of outsourced spatial data using two-cloud architecture | |
Li et al. | Privacy preservation of location information based on MinHash algorithm in online ride-hailing services |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | |
Granted publication date: 20220607 |