CN113159918B - Bank client group mining method based on federal group penetration - Google Patents

Bank client group mining method based on federal group penetration


Publication number
CN113159918B
Authority
CN
China
Prior art keywords
bank
client
group
customer
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202110380531.8A
Other languages
Chinese (zh)
Other versions
CN113159918A (en
Inventor
郭昆
魏明洋
郭文忠
刘西蒙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202110380531.8A priority Critical patent/CN113159918B/en
Publication of CN113159918A publication Critical patent/CN113159918A/en
Application granted granted Critical
Publication of CN113159918B publication Critical patent/CN113159918B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/01Customer relationship services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Finance (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a bank customer group mining method based on federal group penetration, in which a coordinator (the coordinating end) acts as the data aggregator. The method comprises the following steps: the bank ends compute a privacy-preserving intersection over their customer networks to obtain the overlapping customers; each bank end computes a local customer similarity matrix, encrypts it with a homomorphic encryption technique, and sends it to the coordinating end; the coordinating end aggregates the local similarity matrices into an encrypted global similarity matrix and sends it to each bank end; each bank end finds all k-cliques on the overlapping customer network formed by the overlapping customers to obtain a k-clique set; each bank end performs k-clique percolation on the overlapping customer network according to the decrypted global similarity matrix and the k-clique set; and each bank end divides the bank customers into groups according to the k-clique percolation result and outputs the mining result. The invention can combine the customer information of several banks to improve the accuracy of bank customer group mining while protecting the data privacy of bank customers.

Description

Bank client group mining method based on federal group penetration
Technical Field
The invention relates to the technical field of federated learning, and in particular to a bank customer group mining method based on federal group penetration (that is, federated clique percolation).
Background
A bank needs to understand customer behavior and mine value around customer needs, and it needs more customer information to analyze customers accurately. In recent years, privacy leaks on the internet have occurred repeatedly and have drawn growing attention from users, while governments have placed increasing emphasis on network security. The European Union brought the General Data Protection Regulation (GDPR) into effect in 2018 to protect the data privacy of users, and many countries, including China and the United States, have successively formulated and refined privacy protection regulations that penalize privacy leaks. Federated learning is a decentralized, privacy-preserving, distributed machine learning framework proposed by Google. It supports distributed parallel processing of large-scale data, and through local computation and encrypted transmission it ensures that the secret data held at a bank end is not leaked during computation. Research on a bank customer group mining method based on federal group penetration is therefore important. Such a method mines bank customer groups by combining the customer information of several banks while protecting the data privacy of bank customers. It makes full use of the customer data the banks hold without violating privacy protection law, improves the accuracy of customer group mining, and in turn supports building high-quality customer profiles, targeted advertising, and the detection of financial crime.
Disclosure of Invention
The invention aims to provide a bank client group mining method based on federal group penetration, which can more accurately divide bank client groups while protecting bank client privacy.
To achieve this purpose, the technical scheme of the invention is as follows: a bank customer group mining method based on federal group penetration, which provides a system comprising a bank-end overlapping customer identification module, a bank-end customer similarity calculation module, a coordinating-end customer similarity aggregation module, a bank-end customer network k-clique discovery module, a bank-end customer network k-clique percolation module, and a bank-end customer group division module; the system mines bank customer groups according to the following steps:
Step S1: at each bank end P_h, the bank-end overlapping customer identification module reads the bank customer network G(V, E, R, A), where V is the customer set, E is the edge set, R is the feature set, and A is the customer feature matrix; one bank end is selected at random to generate an RSA key pair and sends the RSA public key to all other bank ends; the selected bank end encrypts its customer IDs with the RSA public key and computes, for each other bank end, the intersection with the customer IDs that bank end has encrypted with the RSA public key; the selected bank end then takes the common intersection of the obtained intersections to obtain the overlapping customer set and sends it to the other bank ends, so that each bank end P_h obtains the common overlapping customer set X_h;
Step S2: the bank-end customer similarity calculation module selects one bank end at random to generate the key pair of a homomorphic encryption algorithm and sends the key pair to all other bank ends; each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm, and sends it to the coordinating end, where a_h is the feature vector of a customer and the coordinating end is the data aggregator that aggregates the encrypted data sent by the bank ends P_h; the coordinating end receives the encrypted dimension |a_h| from each bank end P_h, adds the values in the encrypted state to obtain the global customer feature matrix dimension, and sends it to each bank end P_h; each bank end P_h decrypts the global customer feature matrix dimension with the private key of the homomorphic encryption algorithm and, from this dimension and its customer feature matrix A_h, computes the local similarity between overlapping customers to obtain the local similarity matrix S_h of the overlapping customers; each bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm and sends the encrypted S_h to the coordinating end;
Step S3: at the coordinating end, the coordinating-end customer similarity aggregation module receives the encrypted local similarity matrix S_h sent by each bank end P_h; the coordinating end adds the S_h in the encrypted state to obtain the encrypted global customer similarity matrix and sends it to each bank end P_h;
Step S4: at each bank end P_h, the bank-end customer network k-clique discovery module finds all k-cliques on the overlapping customer network formed by the overlapping customers to obtain a k-clique set;
Step S5: at each bank end P_h, the bank-end customer network k-clique percolation module decrypts the global customer similarity matrix sent by the coordinating end with the private key of the homomorphic encryption algorithm; each bank end P_h performs k-clique percolation according to the decrypted global similarity matrix and the k-clique set to obtain a clique graph;
Step S6: the bank-end customer group division module computes the connected components of the clique graph at each bank end P_h; the node set of each connected component is one bank customer group, and the set of connected components is the group division C of the overlapping customers X_h on the bank customer network G; the final group division C of the bank customer network is output.
In an embodiment of the present invention, the step S1 specifically includes the following steps:
Step S11: a bank end P_i (i ∈ h) is selected at random to generate an RSA key pair and sends the RSA public key to the other bank ends P_j (j ∈ h, j ≠ i);
Step S12: bank end P_i encrypts the customers V_i of its bank customer network G_i with the RSA public key, computes the privacy-preserving intersection with each other bank end P_j, and decodes and decrypts the result with the RSA private key to obtain X_{i,j};
Step S13: bank end P_i takes the common intersection of the obtained intersection customers X_{i,j} to obtain the overlapping customer set X_i = ∪{X_{i,j}};
Step S14: bank end P_i sends the overlapping customer set X_i to the other bank ends P_j, so that all bank ends P_h obtain the overlapping customer set X_h = X_i.
In an embodiment of the present invention, the step S2 specifically includes the following steps:
Step S21: bank end P_i generates the key pair of the homomorphic encryption algorithm and sends the key pair to the other bank ends P_j;
Step S22: each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordinating end, where E() is the encryption function;
Step S23: the coordinating end receives the encrypted local customer feature matrix dimension E(|a_h|) sent by each bank end P_h;
Step S24: the coordinating end adds the values E(|a_h|) in the encrypted state to obtain the encrypted global customer feature matrix dimension;
Step S25: the coordinating end sends the encrypted global customer feature matrix dimension to each bank end P_h;
Step S26: bank end P_h receives the encrypted global customer feature matrix dimension from the coordinating end and decrypts it with the private key of the homomorphic encryption algorithm to obtain the global customer feature matrix dimension, where D() is the decryption function;
Step S27: bank end P_h computes the local similarity matrix S_h of the overlapping customers X_h from the global customer feature matrix dimension and the customer feature matrix A_h; the similarity between overlapping customers is computed by formula (4):
[Formula (4): image not reproduced]
where a_i and a_j are the attribute vectors of overlapping customers v_i and v_j, the operator between them is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity between customers v_i and v_j;
Step S28: bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm to obtain E(S_h) and sends E(S_h) to the coordinating end.
In an embodiment of the present invention, the step S3 specifically includes the following steps:
Step S31: the coordinating end receives the encrypted local similarity matrix E(S_h) sent by each bank end P_h;
Step S32: the coordinating end adds the local similarity matrices E(S_h) in the encrypted state to obtain the encrypted global similarity matrix;
Step S33: the coordinating end sends the encrypted global similarity matrix to each bank end P_h.
In an embodiment of the present invention, the step S4 specifically includes the following steps:
Step S41: on G_h, bank end P_h computes the overlapping customer network formed by the overlapping customer set X_h;
Step S42: bank end P_h uses a k-clique discovery algorithm to find the k-cliques of the overlapping customer network and obtains the k-clique set; a k-clique is a customer sub-network of k overlapping customers in which each overlapping customer has an association relationship with all the other overlapping customers.
In an embodiment of the present invention, the step S5 specifically includes the following steps:
Step S51: bank end P_h decrypts the encrypted global similarity matrix with the private key of the homomorphic encryption algorithm to obtain the global similarity matrix;
Step S52: bank end P_h constructs a clique graph, taking each k-clique in the k-clique set as a node;
Step S53: from the decrypted global similarity matrix, bank end P_h computes the similarity between every two k-cliques and, if the similarity is greater than the set threshold α, adds an edge between the two k-cliques to the clique graph. The threshold α is 0.8, and the similarity between two k-cliques is computed by formula (5):
[Formula (5): two images not reproduced]
where v_i and v_j are customers in the overlapping customer network, C_p and C_q are k-cliques, s(v_i, v_j) is the similarity between customers v_i and v_j, s(C_p, C_q) is the similarity between k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j have an association relationship in the overlapping customer network and 0 otherwise.
In an embodiment of the present invention, the step S6 specifically includes the following steps:
Step S61: bank end P_h computes the connected components of the clique graph;
Step S62: bank end P_h merges all the customers of each connected component into one customer group to obtain the bank customer group set C;
Step S63: bank end P_h writes the customers v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});
Step S64: the vector set {R_i}, 0<i<m, is output, where m is the number of customer groups and each row represents one customer group.
The invention also provides a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions when executed by the processor being capable of performing the method steps as described above.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a bank customer group mining method based on federal group penetration that mines the relationships between bank customers more accurately while protecting their privacy.
(2) The invention provides a new clique similarity measure that considers both the network topology and the feature information, so that clique percolation divides bank customer groups more accurately.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in FIG. 1, this embodiment provides a bank customer group mining method based on federal group penetration and provides a system comprising a bank-end overlapping customer identification module, a bank-end customer similarity calculation module, a coordinating-end customer similarity aggregation module, a bank-end customer network k-clique discovery module, a bank-end customer network k-clique percolation module, and a bank-end customer group division module; the system mines bank customer groups according to the following steps:
Step S1: at each bank end P_h, the bank-end overlapping customer identification module reads the bank customer network G(V, E, R, A), where V is the customer set, E is the edge set, R is the feature set, and A is the customer feature matrix. One bank end is selected at random to generate an RSA key pair and sends the RSA public key to all other bank ends. The selected bank end encrypts its customer IDs with the RSA public key and computes, for each other bank end, the intersection with the customer IDs that bank end has encrypted with the RSA public key. The selected bank end then takes the common intersection of the obtained intersections to obtain the overlapping customer set and sends it to the other bank ends, so that each bank end P_h obtains the common overlapping customer set X_h;
Step S2: the bank-end customer similarity calculation module selects one bank end at random to generate the key pair of the homomorphic encryption algorithm and sends the key pair to all other bank ends. Each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm, and sends it to the coordinating end, where a_h is the feature vector of a customer and the coordinating end is the data aggregator that aggregates the encrypted data sent by the bank ends P_h. The coordinating end receives the encrypted dimension |a_h| from each bank end P_h, adds the values in the encrypted state to obtain the global customer feature matrix dimension, and sends it to each bank end P_h. Each bank end P_h decrypts the global customer feature matrix dimension with the private key of the homomorphic encryption algorithm and, from this dimension and its customer feature matrix A_h, computes the local similarity between overlapping customers to obtain the local similarity matrix S_h of the overlapping customers. Each bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm and sends the encrypted S_h to the coordinating end;
Step S3: at the coordinating end, the coordinating-end customer similarity aggregation module receives the encrypted local similarity matrix S_h sent by each bank end P_h. The coordinating end adds the S_h in the encrypted state to obtain the encrypted global customer similarity matrix and sends it to each bank end P_h;
Step S4: at each bank end P_h, the bank-end customer network k-clique discovery module finds all k-cliques on the overlapping customer network formed by the overlapping customers to obtain a k-clique set;
Step S5: at each bank end P_h, the bank-end customer network k-clique percolation module decrypts the global customer similarity matrix sent by the coordinating end with the private key of the homomorphic encryption algorithm; each bank end P_h performs k-clique percolation according to the decrypted global similarity matrix and the k-clique set to obtain a clique graph;
Step S6: the bank-end customer group division module computes the connected components of the clique graph at each bank end P_h. The node set of each connected component is one bank customer group, and the set of connected components is the group division C of the overlapping customers X_h on the bank customer network G. The final group division C of the bank customer network is output.
Further, the step S1 specifically includes the following steps:
Step S11: a bank end P_i (i ∈ h) is selected at random to generate an RSA key pair and sends the RSA public key to the other bank ends P_j (j ∈ h, j ≠ i);
Step S12: bank end P_i encrypts the customers V_i of its bank customer network G_i with the RSA public key, computes the privacy-preserving intersection with each other bank end P_j, and decodes and decrypts the result with the RSA private key to obtain X_{i,j};
Step S13: bank end P_i takes the common intersection of the obtained intersection customers X_{i,j} to obtain the overlapping customer set X_i = ∪{X_{i,j}};
Step S14: bank end P_i sends the overlapping customer set X_i to the other bank ends P_j, so that all bank ends P_h obtain the overlapping customer set X_h = X_i.
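Steps S11 to S14 do not spell out the intersection protocol. As a minimal sketch, assuming the widely used RSA blind-signature form of private set intersection, the exchange could look as follows. The function names, the SHA-256 hashing, and the toy key built from two Mersenne primes are illustrative assumptions and not taken from the patent; both parties are simulated in one process, and in this variant it is the blinding party that recognises the matches, which simplifies the roles described above.

```python
import hashlib
import math
import secrets

# Toy RSA key from two known Mersenne primes (illustration only; a real
# deployment would generate a fresh key of 2048 bits or more).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by P_i


def h(customer_id: str) -> int:
    """Hash a customer ID into the range [0, N)."""
    digest = hashlib.sha256(customer_id.encode()).digest()
    return int.from_bytes(digest, "big") % N


def tag(value: int) -> str:
    """Second hash applied to signatures before they are compared."""
    return hashlib.sha256(str(value).encode()).hexdigest()


def blind(ids):
    """P_j blinds its hashed IDs with random factors, using the public key."""
    blinded, factors = [], []
    for cid in ids:
        r = secrets.randbelow(N - 2) + 2
        while math.gcd(r, N) != 1:
            r = secrets.randbelow(N - 2) + 2
        blinded.append(h(cid) * pow(r, E, N) % N)
        factors.append(r)
    return blinded, factors


def sign(values):
    """P_i raises each value to the private exponent."""
    return [pow(v, D, N) for v in values]


def pairwise_intersection(ids_i, ids_j):
    """X_{i,j}: customer IDs held by both P_i and P_j."""
    blinded, factors = blind(ids_j)                 # sent P_j -> P_i
    signed_blinded = sign(blinded)                  # sent P_i -> P_j
    own_tags = {tag(s) for s in sign([h(c) for c in ids_i])}
    common = set()
    for cid, s, r in zip(ids_j, signed_blinded, factors):
        unblinded = s * pow(r, -1, N) % N           # equals h(cid)^D mod N
        if tag(unblinded) in own_tags:
            common.add(cid)
    return common


def overlapping_customers(ids_i, others):
    """Common intersection of all pairwise intersections (steps S13, S14)."""
    common = set(ids_i)
    for ids_j in others:
        common &= pairwise_intersection(ids_i, ids_j)
    return common
```

Calling `overlapping_customers(ids_of_P_i, [ids_of_P_j1, ids_of_P_j2])` returns the set that every bank end would then adopt as X_h. The sketch follows the wording "common intersection" in step S13.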
Further, the step S2 specifically includes the following steps:
Step S21: bank end P_i generates the key pair of the homomorphic encryption algorithm and sends the key pair to the other bank ends P_j;
Step S22: each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordinating end, where E() is the encryption function;
Step S23: the coordinating end receives the encrypted local customer feature matrix dimension E(|a_h|) sent by each bank end P_h;
Step S24: the coordinating end adds the values E(|a_h|) in the encrypted state to obtain the encrypted global customer feature matrix dimension;
Step S25: the coordinating end sends the encrypted global customer feature matrix dimension to each bank end P_h;
Step S26: bank end P_h receives the encrypted global customer feature matrix dimension from the coordinating end and decrypts it with the private key of the homomorphic encryption algorithm to obtain the global customer feature matrix dimension, where D() is the decryption function;
Step S27: bank end P_h computes the local similarity matrix S_h of the overlapping customers X_h from the global customer feature matrix dimension and the customer feature matrix A_h; the similarity between overlapping customers is computed by formula (7):
[Formula (7): image not reproduced]
where a_i and a_j are the attribute vectors of overlapping customers v_i and v_j, the operator between them is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity between customers v_i and v_j;
Step S28: bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm to obtain E(S_h) and sends E(S_h) to the coordinating end.
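Formula (7) is not legible in this text, so the exact similarity cannot be restated here. Purely as an illustration of step S27, the sketch below assumes binary (0/1) attribute vectors and one plausible XOR-based form: the number of attribute positions on which two customers agree, divided by the global feature dimension. Under this assumption the local values of all bank ends sum to the fraction of matching attributes over the combined feature set. The function name and the data layout are hypothetical.

```python
def local_similarity_matrix(features, global_dim):
    """Assumed XOR-based local similarity for one bank end.

    features:   dict mapping each overlapping customer to its list of
                0/1 attributes held by this bank end (the rows of A_h).
    global_dim: decrypted global customer feature matrix dimension.
    Returns the customer order and the matrix S_h as nested lists.
    """
    customers = sorted(features)
    matrix = []
    for vi in customers:
        row = []
        for vj in customers:
            # XOR is 1 exactly where the two attribute vectors differ.
            mismatches = sum(x ^ y for x, y in zip(features[vi], features[vj]))
            matches = len(features[vi]) - mismatches
            row.append(matches / global_dim)
        matrix.append(row)
    return customers, matrix
```

For example, with three local attributes and a global dimension of 5, two customers that agree on two attributes receive a local similarity of 2/5 at this bank end.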
Further, the step S3 specifically includes the following steps:
Step S31: the coordinating end receives the encrypted local similarity matrix E(S_h) sent by each bank end P_h;
Step S32: the coordinating end adds the local similarity matrices E(S_h) in the encrypted state to obtain the encrypted global similarity matrix;
Step S33: the coordinating end sends the encrypted global similarity matrix to each bank end P_h.
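The text does not name the homomorphic encryption algorithm. As one possible reading of steps S28 and S31 to S33, together with the later decryption at the bank ends, the sketch below assumes an additively homomorphic Paillier scheme, in which multiplying ciphertexts adds the underlying plaintexts. The toy key built from two Mersenne primes, the fixed-point factor `SCALE`, and all function names are illustrative assumptions.

```python
import math
import secrets

# Toy Paillier key (illustration only; real keys are 2048 bits or more).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                                            # public key
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)    # private key: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                 # valid for generator N + 1
SCALE = 10**6                                        # fixed-point scale for values in [0, 1]


def encrypt(m: int) -> int:
    """E(m) = (1 + m*N) * r^N mod N^2 for a random r coprime to N."""
    r = secrets.randbelow(N - 2) + 2
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 2) + 2
    return (1 + m * N) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """D(c) = L(c^LAM mod N^2) * MU mod N, with L(u) = (u - 1) // N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def encrypt_matrix(s):
    """Bank end: encode each similarity as a scaled integer and encrypt it."""
    return [[encrypt(round(x * SCALE)) for x in row] for row in s]


def aggregate(encrypted_matrices):
    """Coordinating end: element-wise ciphertext product, i.e. plaintext sum."""
    total = encrypted_matrices[0]
    for mat in encrypted_matrices[1:]:
        total = [[a * b % N2 for a, b in zip(ra, rb)]
                 for ra, rb in zip(total, mat)]
    return total


def decrypt_matrix(c):
    """Bank end: decrypt the aggregate and undo the fixed-point scaling."""
    return [[decrypt(x) / SCALE for x in row] for row in c]
```

The coordinating end only ever calls `aggregate`, so it handles ciphertexts without holding the private values `LAM` and `MU`.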
Further, the step S4 specifically includes the following steps:
Step S41: on G_h, bank end P_h computes the overlapping customer network formed by the overlapping customer set X_h;
Step S42: bank end P_h uses a k-clique discovery algorithm to find the k-cliques of the overlapping customer network and obtains the k-clique set. A k-clique is a customer sub-network of k overlapping customers in which each overlapping customer has an association relationship with all the other overlapping customers.
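The text does not say which k-clique discovery algorithm is used. As a minimal sketch of steps S41 and S42, the brute-force enumeration below restricts the edge set to the overlapping customers and then tests every k-subset for pairwise adjacency. It is meant only to make the definition concrete and would be too slow for large networks; the function names are hypothetical.

```python
from itertools import combinations


def induced_edges(edges, overlap):
    """Step S41: keep only the edges whose endpoints are both overlapping customers."""
    return {frozenset((u, v)) for u, v in edges
            if u in overlap and v in overlap and u != v}


def find_k_cliques(overlap, edges, k):
    """Step S42: every k-subset of overlapping customers that is fully connected."""
    adjacency = induced_edges(edges, overlap)
    cliques = []
    for group in combinations(sorted(overlap), k):
        if all(frozenset(pair) in adjacency for pair in combinations(group, 2)):
            cliques.append(frozenset(group))
    return cliques
```

With edges 1-2, 2-3, 1-3, 3-4, 4-5, 3-5 and k = 3, the function returns the two cliques {1, 2, 3} and {3, 4, 5}.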
Further, the step S5 specifically includes the following steps:
Step S51: bank end P_h decrypts the encrypted global similarity matrix with the private key of the homomorphic encryption algorithm to obtain the global similarity matrix;
Step S52: bank end P_h constructs a clique graph, taking each k-clique in the k-clique set as a node;
Step S53: from the decrypted global similarity matrix, bank end P_h computes the similarity between every two k-cliques and, if the similarity is greater than the set threshold α, adds an edge between the two k-cliques to the clique graph. The threshold α is 0.8, and the similarity between two k-cliques is computed by formula (8):
[Formula (8): two images not reproduced]
where v_i and v_j are customers in the overlapping customer network, C_p and C_q are k-cliques, s(v_i, v_j) is the similarity between customers v_i and v_j, s(C_p, C_q) is the similarity between k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j have an association relationship in the overlapping customer network and 0 otherwise.
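Formula (8) is not legible in this text, so the clique similarity below is an assumption rather than the patent's definition: it averages s(v_i, v_j) over those cross pairs for which Ind(v_i, v_j) = 1. Only the thresholding with α = 0.8 and the construction of the clique graph in steps S52 and S53 are taken from the description. The names `clique_similarity`, `build_clique_graph`, `sim`, and `linked` are hypothetical.

```python
from itertools import combinations

ALPHA = 0.8  # threshold value stated in step S53


def clique_similarity(cp, cq, sim, linked):
    """Assumed form of s(C_p, C_q): mean s(v_i, v_j) over linked cross pairs.

    sim:    nested dict, sim[v_i][v_j] is the decrypted global similarity.
    linked: function playing the role of Ind(v_i, v_j), returning True or False.
    """
    total, count = 0.0, 0
    for vi in cp:
        for vj in cq:
            if vi != vj and linked(vi, vj):
                total += sim[vi][vj]
                count += 1
    return total / count if count else 0.0


def build_clique_graph(cliques, sim, linked, alpha=ALPHA):
    """Nodes are clique indices; an edge joins two cliques whose similarity exceeds alpha."""
    edges = set()
    for p, q in combinations(range(len(cliques)), 2):
        if clique_similarity(cliques[p], cliques[q], sim, linked) > alpha:
            edges.add((p, q))
    return edges
```

Any other definition of s(C_p, C_q) can be substituted for `clique_similarity` without changing `build_clique_graph`.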
Further, the step S6 specifically includes the following steps:
Step S61: bank end P_h computes the connected components of the clique graph;
Step S62: bank end P_h merges all the customers of each connected component into one customer group to obtain the bank customer group set C;
Step S63: bank end P_h writes the customers v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});
Step S64: the vector set {R_i}, 0<i<m, is output, where m is the number of customer groups and each row represents one customer group.
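Steps S61 to S64 amount to a connected-components computation on the clique graph followed by a merge of the member customers. A minimal sketch using a union-find structure is given below; the function name and the input layout (a list of cliques plus a set of index pairs) are assumptions made for illustration.

```python
def customer_groups(cliques, clique_edges):
    """Merge the cliques of each connected component into one customer group.

    cliques:      list of sets of customers (the nodes of the clique graph).
    clique_edges: iterable of (p, q) index pairs (the edges of the clique graph).
    Returns one sorted tuple of customers per group, i.e. the row vectors R_i.
    """
    parent = list(range(len(cliques)))

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in clique_edges:
        parent[find(p)] = find(q)

    merged = {}
    for index, clique in enumerate(cliques):
        merged.setdefault(find(index), set()).update(clique)
    return sorted(tuple(sorted(members)) for members in merged.values())
```

A customer that belongs to cliques in two different connected components appears in both rows, so the resulting groups may overlap.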
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing describes preferred embodiments of the present invention. Other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change, or adaptation of the above embodiments made according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.

Claims (6)

1. A bank client group mining method based on federal group penetration is characterized in that a bank client group mining system based on federal group penetration is provided, and the system comprises a bank end overlapping client identification module, a bank end client similarity calculation module, a coordination end client similarity aggregation module, a bank end client network k group discovery module, a bank end client network k group penetration module and a bank end client group division module; the system carries out bank customer group mining according to the following steps:
step S1, the bank end overlapping customer identification module, at each bank end P_h, respectively reads a bank client network G(V, E, R, A), wherein V represents a client set, E represents an edge set, R represents a feature set, and A is a client feature matrix; one bank end is randomly selected to generate an RSA encryption algorithm key pair and sends the RSA public key to the other bank ends; the selected bank end encrypts the client IDs by using the RSA public key, and respectively calculates the intersection with the client IDs that each other bank end has encrypted by using the RSA public key; the selected bank end calculates the public intersection of the obtained intersections to obtain an overlapped client set, and sends the overlapped client set to the other bank ends, whereby each bank end P_h obtains a common overlapping customer set X_h;
Step S2, the bank end client similarity calculation module randomly selects one bank end to generate a key pair of a homomorphic encryption algorithm, and sends the key pair to the other bank ends; each bank end P_h calculates the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| by using the public key of the homomorphic encryption algorithm, and sends it to a coordination end; wherein a_h is the feature vector of a client, and the coordination end is the data aggregation party, which aggregates the encrypted data sent by all bank ends P_h; the coordination end receives the encrypted client feature matrix dimension |a_h| sent by each bank end P_h, adds the |a_h| values in the encrypted state to obtain the global customer feature matrix dimension, and sends the global customer feature matrix dimension to each bank end P_h; each bank end P_h decrypts the global customer feature matrix dimension by using the private key of the homomorphic encryption algorithm, and calculates the local similarity between the overlapped clients based on the global customer feature matrix dimension and the customer feature matrix A_h to obtain a local similarity matrix S_h of the overlapped clients; each bank end P_h encrypts S_h by using the public key of the homomorphic encryption algorithm and sends S_h to the coordination end;
step S3, the coordination end client similarity aggregation module receives, at the coordination end, the local similarity matrix S_h sent by each bank end P_h; the coordination end adds the S_h in the encrypted state to obtain an encrypted client global similarity matrix, and sends the global similarity matrix to each bank end P_h;
Step S4, the bank end client network k group discovery module finds, at each bank end P_h, all k groups on the overlapped client network consisting of the overlapped clients to obtain a k group set; wherein a k group is a sub-client network consisting of k overlapped clients, and each overlapped client in the sub-client network has an association relation with all other overlapped clients in it;
step S5, the bank end client network k group penetration module, at each bank end P_h, decrypts the client global similarity matrix sent by the coordination end by using the private key of the homomorphic encryption algorithm; each bank end P_h performs k group penetration according to the decrypted global similarity matrix and the k group set to obtain a clique graph (Figure FDA0003574110400000011);
Step S6, the bank end client group division module calculates the connected branches of the clique graph (Figure FDA0003574110400000012) of each bank end P_h; the node set in each connected branch is a bank customer group, and the set of connected branches is the customer group set C of the overlapped customer set X_h on the bank customer network G; the client group set C of the final bank client network is output;
the step S1 specifically includes the following steps:
step S11, a bank end P_i is randomly selected to generate an RSA encryption algorithm key pair and sends the RSA public key to the other bank ends P_j;
Step S12, bank end P_i encrypts the clients V_i of its bank client network G_i by using the RSA public key, solves the intersection with each other bank end P_j respectively, and obtains X_{i,j} by decryption with the RSA private key;
Step S13, bank end P_i obtains the public intersection of the obtained intersections X_{i,j} to obtain the common overlapping client set X_i = ∪{X_{i,j}};
Step S14, bank end P_i sends the common overlapping customer set X_i to the other bank ends P_j, and each bank end P_h obtains the common overlapping customer set X_h = X_i;
The step S2 specifically includes the following steps:
step S21, bank end P_i generates a homomorphic encryption algorithm key pair, and sends the key pair to the other bank ends P_j;
Step S22, each bank end P_h calculates the dimension |a_h| of the customer feature matrix A_h, encrypts |a_h| by using the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordination end; wherein E() is an encryption function;
step S23, the coordination end receives the encrypted local customer feature matrix dimension E(|a_h|) sent by each bank end P_h;
Step S24, the coordination end adds the E(|a_h|) to obtain the global customer feature matrix dimension (Figure FDA0003574110400000021);
Step S25, the coordination end sends the encrypted global customer feature matrix dimension (Figure FDA0003574110400000022) to each bank end P_h;
Step S26, each bank end P_h receives (Figure FDA0003574110400000023) sent by the coordination end, and decrypts (Figure FDA0003574110400000024) by using the private key of the homomorphic encryption algorithm to obtain (Figure FDA0003574110400000025); wherein D() is a decryption function;
step S27, each bank end P_h calculates the local similarity matrix S_h of the overlapping customers according to (Figure FDA0003574110400000026) and the customer feature matrix A_h; wherein the similarity between overlapped clients is calculated as shown in formula (1);
Figure FDA0003574110400000027
wherein a_i is the attribute vector of overlapping client v_i, a_j is the attribute vector of overlapping client v_j, (Figure FDA0003574110400000028) is the XOR operation, |a_i| is the dimension of the feature vector a_i, and s(a_i, a_j) represents the similarity of clients v_i and v_j;
step S28, bank end P_h encrypts S_h by using the public key of the homomorphic encryption algorithm to obtain E(S_h), and sends E(S_h) to the coordination end.
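Steps S21 to S26 depend only on the additive property of the homomorphic scheme: the coordination end combines ciphertexts without ever seeing an individual |a_h|. The claim does not name a scheme, so the following is a minimal sketch that assumes Paillier encryption with toy primes. The helper names (`keygen`, `encrypt`, `add`, `decrypt`), the bank labels, and the dimensions are hypothetical, and the parameters are far too small to be secure.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier key generation from two distinct primes (assumed scheme)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """E(m) = (n + 1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

def decrypt(n, priv, c):
    """D(c) = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

# One bank end generates the key pair (S21); toy primes for illustration only.
n, priv = keygen(10007, 10009)

# Each bank end encrypts its local feature dimension (S22); values are made up.
local_dims = {"P1": 5, "P2": 7, "P3": 4}
ciphertexts = [encrypt(n, d) for d in local_dims.values()]

# The coordination end adds in the encrypted state (S24).
total = ciphertexts[0]
for c in ciphertexts[1:]:
    total = add(n, total, c)

# Each bank end decrypts the global dimension (S26).
global_dim = decrypt(n, priv, total)
print(global_dim)
```

In this sketch the coordination end only ever handles `ciphertexts` and `total`; the private key stays with the bank ends, which matches the roles described in the claim.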
2. The bank customer group mining method based on federal group penetration as claimed in claim 1, wherein the step S3 specifically comprises the following steps:
step S31, the coordination end receives the encrypted local similarity matrix E(S_h) sent by each bank end P_h;
Step S32, the coordination end adds the E(S_h) to obtain the global similarity matrix (Figure FDA0003574110400000029);
Step S33, the coordination end sends the global similarity matrix (Figure FDA0003574110400000031) to each bank end P_h.
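Why each bank end normalises by the global feature dimension becomes visible once the local matrices are summed. The formula images are not reproduced in this text, so the sketch below assumes that the local similarity counts matching binary features (the complement of the XOR) and divides by the global dimension. Under that assumption the element-wise sum of the local matrices equals the similarity computed over the concatenated feature vectors. Encryption is omitted for clarity, and the names `local_similarity` and `aggregate` as well as the sample data are hypothetical.

```python
from fractions import Fraction

def local_similarity(features, global_dim):
    """Assumed form: matching binary features divided by the global dimension.

    features holds one equal-length 0/1 vector per overlapping client.
    """
    size = len(features)
    matrix = [[Fraction(0)] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            matches = sum(1 for x, y in zip(features[i], features[j]) if (x ^ y) == 0)
            matrix[i][j] = Fraction(matches, global_dim)
    return matrix

def aggregate(matrices):
    """Element-wise sum, as the coordination end would do on ciphertexts."""
    size = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(size)] for i in range(size)]

# Two bank ends hold different features of the same three overlapping clients.
bank_a = [[1, 0, 1], [1, 1, 1], [0, 0, 0]]
bank_b = [[0, 1], [0, 1], [1, 1]]
global_dim = len(bank_a[0]) + len(bank_b[0])

global_sim = aggregate([
    local_similarity(bank_a, global_dim),
    local_similarity(bank_b, global_dim),
])

# Reference value: similarity over the concatenated feature vectors.
joined = [a + b for a, b in zip(bank_a, bank_b)]
direct = local_similarity(joined, global_dim)
print(global_sim == direct)
```

Exact fractions are used so that the comparison between the aggregated and the directly computed matrix is not affected by floating-point rounding.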
3. The bank customer group mining method based on federal group penetration as claimed in claim 2, wherein the step S4 specifically comprises the following steps:
step S41, bank end P_h calculates, on G_h, the overlapping client network (Figure FDA0003574110400000032) consisting of the overlapping clients;
Step S42, bank end P_h uses a k-clique discovery algorithm to find the k groups of the overlapping client network (Figure FDA0003574110400000033), obtaining a k group set.
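Steps S41 and S42 can be sketched as follows. The claim does not specify which k-clique discovery algorithm is used, so this example assumes a brute-force enumeration that is only practical for small networks. The function names and the sample edges are hypothetical.

```python
from itertools import combinations

def overlapping_network(edges, overlap):
    """S41: keep only edges whose two endpoints are both overlapping clients."""
    return {
        frozenset((u, v))
        for u, v in edges
        if u in overlap and v in overlap and u != v
    }

def find_k_cliques(edges, overlap, k):
    """S42 (brute force): every k-subset whose members are all pairwise linked."""
    sub_edges = overlapping_network(edges, overlap)
    cliques = []
    for nodes in combinations(sorted(overlap), k):
        if all(frozenset(pair) in sub_edges for pair in combinations(nodes, 2)):
            cliques.append(frozenset(nodes))
    return cliques

# Client "z" is not an overlapping client, so its edge is dropped in S41.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("b", "d"), ("d", "e"), ("a", "z")]
overlap = {"a", "b", "c", "d", "e"}

k_groups = find_k_cliques(edges, overlap, k=3)
print(k_groups)
```

The enumeration visits every k-subset of the overlapping clients, so its cost grows combinatorially; a production system would more likely use a dedicated clique enumeration algorithm.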
4. The bank customer group mining method based on federal group penetration as claimed in claim 3, wherein the step S5 specifically comprises the following steps:
step S51, each bank end P_h decrypts (Figure FDA0003574110400000034) by using the private key of the homomorphic encryption algorithm to obtain (Figure FDA0003574110400000035);
Step S52, each bank end P_h constructs a clique graph (Figure FDA0003574110400000036) by taking each k group in the k group set as a node;
Step S53, each bank end P_h calculates the similarity between every two k groups according to the decrypted (Figure FDA0003574110400000037), and if the similarity is greater than a preset threshold α, adds an edge between the two k groups in (Figure FDA0003574110400000038); wherein the threshold α is 0.8, and the similarity between two k groups is calculated as shown in the following formulas;
Figure FDA0003574110400000039
Figure FDA00035741104000000310
wherein v_i and v_j are clients in the overlapping client network (Figure FDA00035741104000000311), C_p and C_q are k groups, s(v_i, v_j) is the similarity of clients v_i and v_j, s(C_p, C_q) is the similarity of k groups C_p and C_q, and the function Ind(v_i, v_j) returns 1 if clients v_i and v_j have an association relationship in (Figure FDA00035741104000000312), and returns 0 otherwise.
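The thresholding in steps S52 and S53 can be sketched independently of the exact similarity formulas, which are contained in images not reproduced in this text. In the sketch below, `build_clique_graph` takes the clique similarity as a caller-supplied function. The function `mean_linked_similarity` is a hypothetical stand-in that combines s(v_i, v_j) with the indicator Ind(v_i, v_j); it is an assumption for illustration and not the claimed formula.

```python
from itertools import combinations

ALPHA = 0.8  # threshold value stated in the claim

def build_clique_graph(cliques, clique_similarity, alpha=ALPHA):
    """Nodes are k groups (by index); an edge joins two groups whose
    similarity exceeds alpha."""
    graph = {idx: set() for idx in range(len(cliques))}
    for p, q in combinations(range(len(cliques)), 2):
        if clique_similarity(cliques[p], cliques[q]) > alpha:
            graph[p].add(q)
            graph[q].add(p)
    return graph

def mean_linked_similarity(sim, linked):
    """Hypothetical stand-in: average of s(u, v) * Ind(u, v) over cross pairs."""
    def score(cp, cq):
        pairs = [(u, v) for u in cp for v in cq if u != v]
        if not pairs:
            return 0.0
        total = sum(
            sim[frozenset((u, v))]
            for u, v in pairs
            if frozenset((u, v)) in linked  # Ind(u, v) == 1
        )
        return total / len(pairs)
    return score

# Made-up decrypted similarities for the linked client pairs.
linked = {frozenset(e) for e in ["ab", "ac", "bc", "bd", "cd", "xy", "xz", "yz"]}
sim = {edge: 1.0 for edge in linked}

cliques = [frozenset("abc"), frozenset("bcd"), frozenset("xyz")]
clique_graph = build_clique_graph(cliques, mean_linked_similarity(sim, linked))
print(clique_graph)
```

Passing the similarity in as a function keeps the graph construction unchanged if the stand-in is replaced by the formulas from the original figures.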
5. The bank customer group mining method based on federal group penetration as claimed in claim 4, wherein the step S6 specifically comprises the following steps:
step S61, each bank end P_h calculates the connected branches of the clique graph (Figure FDA00035741104000000313);
step S62, each bank end P_h merges all the clients of each connected branch into one client group to obtain a bank client group set C;
step S63, each bank end P_h writes the clients v_{i,j} of each group C_i in the bank customer group set C in the form of a row vector R_i = (v_{i,j});
Step S64, the vector set {R_i} is output, 0<i<m, where m is the number of customer groups and each row represents one customer group.
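Steps S61 to S64 amount to a connected-component search on the clique graph followed by a union of the member clients. The following is a minimal sketch under the assumption that the clique graph is stored as an adjacency dictionary keyed by k group index; the function names and the sample data are hypothetical.

```python
def connected_branches(graph):
    """S61: connected components of the clique graph via depth-first search."""
    seen, branches = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, branch = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            branch.add(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        branches.append(branch)
    return branches

def customer_groups(cliques, graph):
    """S62 to S64: merge the clients of each branch into one row vector."""
    rows = []
    for branch in connected_branches(graph):
        members = set().union(*(cliques[idx] for idx in branch))
        rows.append(tuple(sorted(members)))
    return rows

# Three k groups; the first two are linked in the clique graph.
cliques = [frozenset("abc"), frozenset("bcd"), frozenset("xyz")]
clique_graph = {0: {1}, 1: {0}, 2: set()}

rows = customer_groups(cliques, clique_graph)
for row in rows:
    print(row)
```

Each printed row corresponds to one vector R_i, so the number of rows equals the number of customer groups m.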
6. A computer-readable storage medium, having stored thereon computer program instructions executable by a processor, the computer program instructions being capable of, when executed by the processor, implementing the method steps of any of claims 1-5.
CN202110380531.8A 2021-04-09 2021-04-09 Bank client group mining method based on federal group penetration Expired - Fee Related CN113159918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110380531.8A CN113159918B (en) 2021-04-09 2021-04-09 Bank client group mining method based on federal group penetration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110380531.8A CN113159918B (en) 2021-04-09 2021-04-09 Bank client group mining method based on federal group penetration

Publications (2)

Publication Number Publication Date
CN113159918A CN113159918A (en) 2021-07-23
CN113159918B true CN113159918B (en) 2022-06-07

Family

ID=76889211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110380531.8A Expired - Fee Related CN113159918B (en) 2021-04-09 2021-04-09 Bank client group mining method based on federal group penetration

Country Status (1)

Country Link
CN (1) CN113159918B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114387064B (en) * 2022-01-13 2024-07-19 福州大学 Electronic commerce platform potential customer recommendation method and system based on comprehensive similarity

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110572253A (en) * 2019-09-16 2019-12-13 济南大学 Method and system for enhancing privacy of federated learning training data
CN111309788A (en) * 2020-03-08 2020-06-19 山西大学 Community structure discovery method and system for bank customer transaction network
CN111666460A (en) * 2020-05-27 2020-09-15 中国平安财产保险股份有限公司 User portrait generation method and device based on privacy protection and storage medium
CN111967910A (en) * 2020-08-18 2020-11-20 中国银行股份有限公司 User passenger group classification method and device
CN112199702A (en) * 2020-10-16 2021-01-08 鹏城实验室 Privacy protection method, storage medium and system based on federal learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9369273B2 (en) * 2014-02-26 2016-06-14 Raytheon Bbn Technologies Corp. System and method for mixing VoIP streaming data for encrypted processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
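A privacy-preserving distributed association rule mining method; Gui Qiong et al.; Microelectronics & Computer (《微电子学与计算机》); 20090905 (No. 09); full text *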

Also Published As

Publication number Publication date
CN113159918A (en) 2021-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220607