CN113688354B - Chi-square box dividing method based on safe multiparty calculation - Google Patents

Publication number
CN113688354B
CN113688354B CN202110999974.5A CN202110999974A CN113688354B CN 113688354 B CN113688354 B CN 113688354B CN 202110999974 A CN202110999974 A CN 202110999974A CN 113688354 B CN113688354 B CN 113688354B
Authority
CN
China
Prior art keywords
group
packet
data
samples
grouping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110999974.5A
Other languages
Chinese (zh)
Other versions
CN113688354A (en)
Inventor
何道敬
孙黎彤
杜润萌
张民
张熙
廖清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN202110999974.5A
Publication of CN113688354A
Application granted
Publication of CN113688354B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/602: Providing cryptographic facilities or services
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245: Protecting personal data, e.g. for financial or medical purposes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioethics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computational Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a chi-square binning method based on secure multi-party computation, which provides a new way of computing chi-square values for federated learning feature engineering. The method does not need to encrypt all of the feature data and send it to the data application party for feature preprocessing. Instead, the data provider first groups the feature data by category, mixes in false groups, marks the class of each group (real or false), encrypts the class marks, and sends them to the data application party. Encrypting only the group class marks greatly reduces the amount of data that must be encrypted, and the data application party does not need to decrypt all of the feature data, so a large resource cost is avoided. The data provider sends the group information of the feature data to the data application party; what the data application party obtains is the group information only, without the actual content of the feature data. False group information is added to the group information, and the real and false groups are coded and marked.

Description

Chi-square box dividing method based on safe multiparty calculation
Technical Field
The invention belongs to the field of federated learning, and particularly relates to a chi-square binning method based on secure multi-party computation.
Background
Federated learning does not model raw data directly; a data set must be constructed before learning begins. The task of converting raw data into a data set is known as feature engineering.
Feature selection is an important step in feature engineering. When a classification model is built, continuous variables generally need to be discretized; after discretization the model is more stable and the risk of overfitting is reduced. During feature selection a binning operation, that is, the discretization of continuous feature data, is therefore often performed. Binning has many benefits: it is more robust to abnormal data and reduces the interference of abnormal data with modeling; after discretization each bin of a feature has its own weight, which introduces nonlinearity into a logistic regression model and improves its expressive power; missing values of a feature can be brought into the model as a separate class; and the inner products of the sparse vectors produced by discretization are fast to compute, and the results are easy to store and extend. For accurate discretization, the data is first divided into intervals by category. If two adjacent intervals have very similar class distributions they can be merged; otherwise they should remain separate. A low chi-square value indicates that two adjacent intervals have similar class distributions. The chi-square value is therefore calculated after the feature data has been binned: the smaller the chi-square value, the more similar the distributions, and the two intervals can be merged into one bin.
During feature discretization and the evaluation of feature predictive power in federated learning feature preprocessing, the party that lacks label data sends its own feature data to the party that holds the labels so that the features can be preprocessed jointly.
Most existing federated learning frameworks take one of three approaches. In the first, the data provider encrypts the entire feature matrix with a public key to meet privacy requirements and sends the ciphertext matrix to the data application party, which decrypts the data with the private key after receiving it. On large data sets this clearly causes significant resource consumption and performance degradation. In the second, desensitized data is transmitted directly for computation, which cannot protect the privacy of the data and does not meet legal standards. In the third, each participant trains independently and the training results are then fused, so the value of the data cannot be fully exploited.
Disclosure of Invention
The invention aims to provide a novel chi-square binning method based on secure multi-party computation. For accurate discretization, the data is first divided into intervals by category. If two adjacent intervals have very similar class distributions they can be merged; otherwise they should remain separate. A low chi-square value indicates that two adjacent intervals have similar class distributions. The chi-square value is calculated after the feature data has been binned: the smaller the chi-square value, the more similar the distributions, and the two intervals can be merged into one bin.
The specific technical scheme for achieving this aim is as follows:
A chi-square binning method based on secure multi-party computation comprises the following steps:
Step 1: The data provider generates a public key $pk$ and a private key $sk$ with a homomorphic encryption system. For the feature data $X = \{x_0, x_1, \ldots, x_{n-1}\}$, $id \in [0, n-1]$, the ids of the samples that belong to the same category in $X$ are placed in one interval, which gives $s$ groups denoted $x_t$, $t \in [0, s-1]$, where $n$ and $s$ are positive integers. The class of each real group $x_t$ is marked as 1 and encrypted with the public key $pk$, denoted $E_x = E(1)$, which yields the real group information $Group_t(x_t, E_x)$;
Step 2: False groups are constructed. The ids of the feature data $X$ are randomly divided into $s$ grouping intervals, so that the number of intervals equals the number of real groups; the intervals are denoted $x_v$, $v \in [0, s-1]$. The class of each false group is marked as 0 and encrypted with the public key $pk$, denoted $E_x = E(0)$, which yields the false group information $Group_v(x_v, E_x)$;
Step 3: The real group information and the false group information are concatenated by rows and the rows are shuffled to obtain the group information $Group_X$. The data provider sends the group information $Group_X(x_i, E_x)$ to the data application party;
Step 4: The data application party maps the group information $Group_X(x_i, E_x)$ to the label data $Y = \{y_0, y_1, \ldots, y_i, \ldots, y_{n-1}\}$, $id \in [0, n-1]$, by id to obtain the label data $y_i$ corresponding to each grouping interval $x_i$. The label data $y_i$ corresponding to each grouping interval $x_i$ is summed to obtain the number of responding samples $Group_y$ in the interval. From the total number of samples $Group_s$ in the interval, the number of non-responding samples is calculated as $Group_n = Group_s - Group_y$. The number of responding samples $Group_y$, the number of non-responding samples $Group_n$, and the total number of samples $Group_s$ of all grouping intervals, together with the group class mark $E_x$ of each interval, are sent to the data provider;
Step 5: The data provider decrypts the group class mark $E_x$ with the private key to obtain the decrypted class mark $D_x$. If $D_x = 1$ the group is a real group; if $D_x = 0$ the group is a false group, and the false group information is deleted;
Step 6: From the number of responding samples $Group_y$, the number of non-responding samples $Group_n$, and the total number of samples $Group_s$ of each real grouping interval, the data provider calculates the expected sample number $E_{ij}$ of the $j$-th class of the $i$-th group, $i \in [0, 2s-1]$, where $j \in [0, 2)$ denotes the two classes, responding samples and non-responding samples. From the expected sample numbers $E_{ij}$ and the actual sample numbers $A_{ij}$ of two adjacent real groups, the chi-square value $\chi^2$ of the two adjacent real groups is calculated;
Step 7: The data provider sets a limit on the number of bins. Based on the chi-square values of adjacent groups, the two adjacent groups with the smallest chi-square value are merged, and the chi-square values of the adjacent groups are recalculated after the merge. Merging stops when the number of bins reaches the limit, which gives the chi-square binning result.
In step 1, the real group $x_t$ contains only the ids of the feature data, $id \in [0, n-1]$, and not the actual values of the feature data, which avoids leaking the actual values.
In step 2, the ids of the feature data $X$ are randomly divided into $s$ grouping intervals in order to construct false groups, which are mixed in with the real groups to protect the real group information.
In step 3, the group information $Group_X(x_i, E_x)$ mixes the false group information with the real group information, and the classes of the false groups and real groups are encrypted, which protects the privacy of the feature data.
In step 4, the number of responding samples $Group_y$ is obtained as follows. The group information $x_i$ contains ids of the feature data; matching these ids against the ids of the label data $Y$ gives the label values corresponding to $x_i$. For example, if $x_i = [0, 2]$ in the $i$-th group information, the corresponding label values are $[y_0, y_2]$. Because the label value of a responding sample is 1 and the label value of a non-responding sample is 0, summing the label values corresponding to the group information gives the number of responding samples $Group_y$ of the group.
In step 4, the number of non-responding samples $Group_n$ is obtained as follows. The number of samples in each group is the number of ids in the group information $x_i$, that is, the length of $x_i$, which gives the number of samples $Group_s$ of the group. Subtracting the number of responding samples from the number of samples of the group gives the number of non-responding samples $Group_n$.
In step 6, the expected sample number $E_{ij}$ of the $j$-th class of the $i$-th group is calculated as:
$$E_{ij} = \frac{R_i \times C_j}{N}$$
where $R_i$ is the sum of the sample numbers of the two classes of the $i$-th group, i.e. $R_i = Group_s^{(i)}$; $C_j$ is the number of samples of class $j$ in the two adjacent groups, so that when $j$ denotes the responding class, $C_j = Group_y^{(i)} + Group_y^{(i+1)}$; and $N$ is the total number of samples of the two adjacent groups, i.e. $N = Group_s^{(i)} + Group_s^{(i+1)}$.
In step 6, the chi-square value $\chi^2$ is calculated as:
$$\chi^2 = \sum_{i}\sum_{j} \frac{(A_{ij} - E_{ij})^2}{E_{ij}}$$
where the sums run over the two adjacent groups and the two classes; $A_{ij}$ is the actual sample number of the $j$-th class of the $i$-th group, so that if $j$ denotes the responding samples of the $i$-th group, $A_{ij} = Group_y^{(i)}$; and $E_{ij}$ is the expected sample number of the $j$-th class of the $i$-th group.
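The two formulas of step 6 can be sketched in a few lines of Python. This is an illustrative sketch, not the patent's implementation: the function names and the `(responding, non_responding)` tuple layout are assumptions, and so is the choice to skip cells whose expected count is zero (the text does not say how an empty class column is handled).

```python
def expected_counts(pair):
    """pair: two adjacent groups, each given as (responding, non_responding)."""
    row_totals = [sum(row) for row in pair]        # R_i = Group_s of each group
    col_totals = [sum(col) for col in zip(*pair)]  # C_j = class totals over both groups
    n = sum(row_totals)                            # N = samples in both groups
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(pair):
    """Sum of (A_ij - E_ij)^2 / E_ij over both groups and both classes."""
    total = 0.0
    for observed_row, expected_row in zip(pair, expected_counts(pair)):
        for a, e in zip(observed_row, expected_row):
            if e > 0:  # assumption: a class absent from both groups contributes nothing
                total += (a - e) ** 2 / e
    return total


# Identical class distributions give 0; fully separated ones give a large value.
print(chi_square([(5, 5), (5, 5)]))
print(chi_square([(10, 0), (0, 10)]))
```

A small value marks a pair of adjacent groups as candidates for merging, which is how step 7 uses it.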
Beneficial effects of the invention
In terms of security, the invention protects the data privacy of chi-square binning in the feature engineering stage of federated learning. The feature data is grouped, and the data indexes (ids) of the samples in the same category form the real group information; false group information is added; the class of the real groups is marked as 1 and the class of the false groups as 0; the 0 and 1 class codes are encrypted; and the real group information is mixed with the false group information before being sent to the data application party. The data application party does not learn the specific values of the grouped feature data. It knows only the ids corresponding to the feature data, and these are mixed with false groups, which protects the privacy of the feature data.
In terms of operating efficiency, the invention does not need to encrypt all feature values before sending them to the data application party. Only the group class marks of the feature data are encrypted, which avoids the computational cost of encrypting and decrypting large amounts of data; the efficiency gain is most pronounced on large data sets.
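One property the class-mark encryption needs is worth illustrating: the encryption must be randomized, so that two encryptions of the same mark look different; otherwise the data application party could separate the rows into two sets simply by comparing ciphertexts. The text only says "a homomorphic encryption system", so the sketch below picks Paillier as an assumption. It is a toy with small fixed primes, the function names are hypothetical, and it is not secure or suitable for real use.

```python
import math
import secrets

# Toy parameters (assumption): two known Mersenne primes, far too small for real use.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p=P, q=Q):
    """Return (public key n, private key (lam, mu)) for Paillier with g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1)
    mu = pow(lam, -1, n)  # modular inverse, needs Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n using fresh randomness r each time."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n, private_key, c):
    """Recover m from ciphertext c."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


n, sk = keygen()
c1, c2 = encrypt(n, 1), encrypt(n, 1)
print(c1 != c2)                               # same mark, different ciphertexts
print(decrypt(n, sk, c1), decrypt(n, sk, c2))  # both decrypt to the mark 1
```

Only the holder of the private key (the data provider here) can tell which rows carry mark 1, which is what step 5 relies on.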
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The invention is described in further detail below with reference to the following specific example and the drawing. Except where specifically noted below, the procedures, conditions, and experimental methods for carrying out the invention are common general knowledge in the art, and the invention is not particularly limited in these respects.
Examples
The data provider holds the feature data X = {0,2,2,4,5,6,6,6} and the data application party holds the label data Y = {0,1,1,1,0,0,1,1}. Taking the calculation of the chi-square binning result for the data provider's feature data X as an example, the steps of the chi-square binning method based on secure multi-party computation are as follows:
First, the data provider places the ids of the samples with the same category of the feature data X in one interval. The grouping result is $x_t$ = [0], [1,2], [3], [4], [5,6,7], 5 groups in total. These are marked as real groups, and the group class is encrypted with the public key $pk$ as $E_x = E(1)$, which yields the real group information $Group_t(x_t, E_x)$, whose contents are as follows:
x_t        E_x
[0]        E(1)
[1,2]      E(1)
[3]        E(1)
[4]        E(1)
[5,6,7]    E(1)
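The real grouping of this example can be reproduced with a short Python sketch. It assumes that "same category" means "same feature value" and that groups are ordered by value; the function name `real_groups` is hypothetical.

```python
from collections import defaultdict


def real_groups(features):
    """Collect the ids of samples that share a feature value, ordered by value."""
    ids_by_value = defaultdict(list)
    for sample_id, value in enumerate(features):
        ids_by_value[value].append(sample_id)
    return [ids_by_value[value] for value in sorted(ids_by_value)]


X = [0, 2, 2, 4, 5, 6, 6, 6]
print(real_groups(X))  # [[0], [1, 2], [3], [4], [5, 6, 7]]
```

Only the ids leave this function; the feature values themselves are never placed in the group information.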
Second, false groups are constructed. The ids of the feature data X are randomly divided into $s$ intervals, and the grouping result is $x_v$ = [0,1,2], [3,4], [5], [6], [7], 5 groups in total, so the number of groups equals the number of real groups. These groups are marked as false groups, and the group class is encrypted with the public key $pk$ as $E_x = E(0)$, which yields the false group information $Group_v(x_v, E_x)$, whose contents are as follows:
x_v        E_x
[0,1,2]    E(0)
[3,4]      E(0)
[5]        E(0)
[6]        E(0)
[7]        E(0)
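One way to draw such a random division is to pick s - 1 distinct cut points among the ids. The sketch below assumes the false groups are contiguous id intervals, as they are in this example; the text itself only says the ids are divided randomly. The function name `false_groups` and the `rng` parameter are hypothetical, and the sketch does not check whether a false group happens to coincide with a real one.

```python
import random


def false_groups(n, s, rng=random):
    """Split ids 0..n-1 into s non-empty contiguous intervals at random cut points."""
    if not 1 <= s <= n:
        raise ValueError("need 1 <= s <= n")
    cuts = sorted(rng.sample(range(1, n), s - 1))  # s - 1 distinct interior cut points
    bounds = [0] + cuts + [n]
    return [list(range(lo, hi)) for lo, hi in zip(bounds, bounds[1:])]


# 8 ids split into 5 false groups, matching the number of real groups.
print(false_groups(8, 5))
```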
Then the real group information $Group_t(x_t, E_x)$ and the false group information $Group_v(x_v, E_x)$ are concatenated by rows and the rows are shuffled to obtain the group information $Group_X(x_i, E_x)$, which is sent to the data application party. The contents of the group information $Group_X(x_i, E_x)$ are as follows:
x_i        E_x
[0,1,2]    E(0)
[3,4]      E(0)
[0]        E(1)
[5]        E(0)
[1,2]      E(1)
[3]        E(1)
[6]        E(0)
[7]        E(0)
[4]        E(1)
[5,6,7]    E(1)
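The mixing of step 3 and the later filtering of step 5 can be sketched as follows. To keep the sketch short, the encryption is replaced by a stand-in: `FlagVault` hands out random tokens and remembers which mark each token hides. This is an assumption made for illustration only and is not encryption. Sorting the surviving groups by their first id is also an assumption; it restores feature order here only because the ids in this example are already in feature order.

```python
import random
import secrets


class FlagVault:
    """Stand-in for E(.) and D(.): NOT encryption, only opaque to the other party."""

    def __init__(self):
        self._marks = {}

    def seal(self, mark):
        token = secrets.token_hex(8)
        self._marks[token] = mark
        return token

    def open(self, token):
        return self._marks[token]


def mix(real, false, vault, rng=random):
    """Step 3: concatenate real (mark 1) and false (mark 0) rows, then shuffle."""
    rows = [(ids, vault.seal(1)) for ids in real]
    rows += [(ids, vault.seal(0)) for ids in false]
    rng.shuffle(rows)
    return rows


def keep_real(rows, vault):
    """Step 5: keep rows whose opened mark is 1 and restore their order."""
    kept = [ids for ids, token in rows if vault.open(token) == 1]
    return sorted(kept, key=lambda ids: ids[0])  # assumption: id order = feature order


real = [[0], [1, 2], [3], [4], [5, 6, 7]]
false = [[0, 1, 2], [3, 4], [5], [6], [7]]
vault = FlagVault()
rows = mix(real, false, vault)
print(len(rows), keep_real(rows, vault) == real)
```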
Then the data application party maps the group information $Group_X$ to the label data Y = {0,1,1,1,0,0,1,1} by id to obtain the label values corresponding to each grouping interval, sums the label data $y_i$ corresponding to each grouping interval $x_i$ to obtain the number of responding samples $Group_y$ in the interval, and from the total number of samples $Group_s$ in the interval calculates the number of non-responding samples $Group_n = Group_s - Group_y$.
[Table: label values, $Group_y$, $Group_n$ and $Group_s$ for each grouping interval $x_i$]
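The counts the data application party computes for the ten rows of this example can be reproduced with a short Python sketch. The function name `count_group` is hypothetical; the rule itself (sum the 0/1 labels, take the length, subtract) is the one stated in step 4.

```python
def count_group(ids, labels):
    """Return (Group_y, Group_n, Group_s) for one grouping interval."""
    responding = sum(labels[i] for i in ids)  # labels are 1 for responding samples
    total = len(ids)
    return responding, total - responding, total


Y = [0, 1, 1, 1, 0, 0, 1, 1]
mixed = [[0, 1, 2], [3, 4], [0], [5], [1, 2], [3], [6], [7], [4], [5, 6, 7]]
for ids in mixed:
    group_y, group_n, group_s = count_group(ids, Y)
    print(ids, [Y[i] for i in ids], group_y, group_n, group_s)
```

The application party performs the same computation on real and false rows alike, since it cannot tell them apart.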
Then the number of responding samples $Group_y$, the number of non-responding samples $Group_n$, and the total number of samples $Group_s$ of all grouping intervals, together with the group class mark $E_x$ of each interval, are sent to the data provider.
The data provider decrypts the group class marks $E_x$ with the private key $sk$ to obtain the real group information; a group whose decrypted class mark is 1 is a real group. From the number of responding samples $Group_y$, the number of non-responding samples $Group_n$, and the total number of samples $Group_s$ of each real grouping interval, the expected sample number $E_{ij}$ of the $j$-th class of the $i$-th group is calculated, where $j \in [0, 2)$ denotes the two classes, responding samples and non-responding samples. Taking the two adjacent real grouping intervals [0] and [1,2] as an example, the chi-square value of the two groups is calculated. The information of the two adjacent real groups is as follows:
Group no.    Group    Group_y    Group_n    R_i (Group_s)
0            [0]      0          1          1
1            [1,2]    2          0          2
             C_j      2          1          3
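Grouping interval [0] has $Group_y^{(0)} = 0$ responding samples, $Group_s^{(0)} = 1$ samples in total, and $Group_n^{(0)} = 1$ non-responding samples. Taking $j = 0$ as the responding class and $j = 1$ as the non-responding class, the expected sample numbers of this group are:
$$E_{00} = \frac{R_0 \times C_0}{N} = \frac{1 \times 2}{3} = \frac{2}{3}, \qquad E_{01} = \frac{R_0 \times C_1}{N} = \frac{1 \times 1}{3} = \frac{1}{3}$$
The expected sample numbers of group [1,2] are:
$$E_{10} = \frac{R_1 \times C_0}{N} = \frac{2 \times 2}{3} = \frac{4}{3}, \qquad E_{11} = \frac{R_1 \times C_1}{N} = \frac{2 \times 1}{3} = \frac{2}{3}$$
From the expected sample numbers $E_{ij}$ and the actual sample numbers $A_{ij}$ of the two adjacent real groups, the chi-square value $\chi^2$ of the two adjacent real groups is finally calculated:
$$\chi^2 = \frac{(0 - \frac{2}{3})^2}{\frac{2}{3}} + \frac{(1 - \frac{1}{3})^2}{\frac{1}{3}} + \frac{(2 - \frac{4}{3})^2}{\frac{4}{3}} + \frac{(0 - \frac{2}{3})^2}{\frac{2}{3}} = \frac{2}{3} + \frac{4}{3} + \frac{1}{3} + \frac{2}{3} = 3$$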
The data provider sets a limit on the number of bins. Based on the chi-square values of adjacent groups, the two adjacent groups with the smallest chi-square value $\chi^2$ are merged, and the chi-square values of the adjacent groups are recalculated after the merge. Merging stops when the number of bins reaches the limit, which gives the chi-square binning result.
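The merge loop can be sketched on the five real groups of this example. The bin limit of 3, the function names, the tie-breaking rule (first minimum), and the handling of a class that is absent from both groups (it contributes nothing) are all assumptions made for illustration; the text does not fix them. Exact fractions are used so the printed chi-square values carry no rounding error.

```python
from fractions import Fraction


def chi_square(a, b):
    """Chi-square of two adjacent groups, each given as (responding, non_responding)."""
    col_totals = [a[0] + b[0], a[1] + b[1]]  # C_j
    n = sum(col_totals)                      # N
    total = Fraction(0)
    for row in (a, b):
        for observed, col_total in zip(row, col_totals):
            expected = Fraction(sum(row) * col_total, n)  # E_ij = R_i * C_j / N
            if expected:  # assumption: an empty class column contributes nothing
                total += (observed - expected) ** 2 / expected
    return total


def chi_merge(bins, max_bins):
    """bins: list of (ids, (responding, non_responding)) in feature order."""
    bins = list(bins)
    while len(bins) > max_bins:
        scores = [chi_square(bins[k][1], bins[k + 1][1]) for k in range(len(bins) - 1)]
        k = scores.index(min(scores))  # assumption: first minimum wins on ties
        (ids_a, (ya, na)), (ids_b, (yb, nb)) = bins[k], bins[k + 1]
        bins[k:k + 2] = [(ids_a + ids_b, (ya + yb, na + nb))]
    return bins


real = [([0], (0, 1)), ([1, 2], (2, 0)), ([3], (1, 0)), ([4], (0, 1)), ([5, 6, 7], (2, 1))]
print(chi_square(real[0][1], real[1][1]))  # 3
print(chi_merge(real, 3))
```

Under these assumptions the groups [1,2] and [3] merge first (their chi-square value is 0), then [4] and [5,6,7], leaving the three bins [0], [1,2,3] and [4,5,6,7].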

Claims (8)

1. A chi-square binning method based on secure multi-party computation, characterized by comprising the following steps:
Step 1: The data provider generates a public key $pk$ and a private key $sk$ with a homomorphic encryption system. For the feature data $X = \{x_0, x_1, \ldots, x_{n-1}\}$, $id \in [0, n-1]$, the ids of the samples that belong to the same category in $X$ are placed in one interval, which gives $s$ groups denoted $x_t$, $t \in [0, s-1]$, where $n$ and $s$ are positive integers. The class of each real group $x_t$ is marked as 1 and encrypted with the public key $pk$, denoted $E_x = E(1)$, which yields the real group information $Group_t(x_t, E_x)$;
Step 2: False groups are constructed. The ids of the feature data $X$ are randomly divided into $s$ grouping intervals, so that the number of intervals equals the number of real groups; the intervals are denoted $x_v$, $v \in [0, s-1]$. The class of each false group is marked as 0 and encrypted with the public key $pk$, denoted $E_x = E(0)$, which yields the false group information $Group_v(x_v, E_x)$;
Step 3: The real group information and the false group information are concatenated by rows and the rows are shuffled to obtain the group information $Group_X$. The data provider sends the group information $Group_X(x_i, E_x)$ to the data application party;
Step 4: The data application party maps the group information $Group_X(x_i, E_x)$ to the label data $Y = \{y_0, y_1, \ldots, y_i, \ldots, y_{n-1}\}$, $id \in [0, n-1]$, by id to obtain the label data $y_i$ corresponding to each grouping interval $x_i$. The label data $y_i$ corresponding to each grouping interval $x_i$ is summed to obtain the number of responding samples $Group_y$ in the interval. From the total number of samples $Group_s$ in the interval, the number of non-responding samples is calculated as $Group_n = Group_s - Group_y$. The number of responding samples $Group_y$, the number of non-responding samples $Group_n$, and the total number of samples $Group_s$ of all grouping intervals, together with the group class mark $E_x$ of each interval, are sent to the data provider;
Step 5: The data provider decrypts the group class mark $E_x$ with the private key to obtain the decrypted class mark $D_x$. If $D_x = 1$ the group is a real group; if $D_x = 0$ the group is a false group, and the false group information is deleted;
Step 6: From the number of responding samples $Group_y$, the number of non-responding samples $Group_n$, and the total number of samples $Group_s$ of each real grouping interval, the data provider calculates the expected sample number $E_{ij}$ of the $j$-th class of the $i$-th group, $i \in [0, 2s-1]$, where $j \in [0, 2)$ denotes the two classes, responding samples and non-responding samples. From the expected sample numbers $E_{ij}$ and the actual sample numbers $A_{ij}$ of two adjacent real groups, the chi-square value $\chi^2$ of the two adjacent real groups is calculated;
Step 7: The data provider sets a limit on the number of bins. Based on the chi-square values of adjacent groups, the two adjacent groups with the smallest chi-square value are merged, and the chi-square values of the adjacent groups are recalculated after the merge. Merging stops when the number of bins reaches the limit, which gives the chi-square binning result.
2. The chi-square binning method based on secure multi-party computation according to claim 1, wherein the real group $x_t$ of step 1 contains only the ids of the feature data, $id \in [0, n-1]$, and not the actual values of the feature data, which avoids leaking the actual values.
3. The chi-square binning method based on secure multi-party computation according to claim 1, wherein in step 2 the ids of the feature data $X$ are randomly divided into $s$ grouping intervals in order to construct false groups, which are mixed in with the real groups to protect the real group information.
4. The chi-square binning method based on secure multi-party computation according to claim 1, wherein the group information $Group_X(x_i, E_x)$ of step 3 mixes the false group information with the real group information, and the classes of the false groups and real groups are encrypted, which protects the privacy of the feature data.
5. The chi-square binning method based on secure multi-party computation according to claim 1, wherein the number of responding samples $Group_y$ of step 4 is obtained as follows: the group information $x_i$ contains ids of the feature data; matching these ids against the ids of the label data $Y$ gives the label values corresponding to $x_i$; if $x_i = [0, 2]$ in the $i$-th group information, the corresponding label values are $[y_0, y_2]$; because the label value of a responding sample is 1 and the label value of a non-responding sample is 0, summing the label values corresponding to the group information gives the number of responding samples $Group_y$ of the group.
6. The chi-square binning method based on secure multi-party computation according to claim 1, wherein the number of non-responding samples $Group_n$ of step 4 is obtained as follows: the number of samples in each group is the number of ids in the group information $x_i$, that is, the length of $x_i$, which gives the number of samples $Group_s$ of the group; subtracting the number of responding samples from the number of samples of the group gives the number of non-responding samples $Group_n$.
7. The chi-square binning method based on secure multi-party computation according to claim 1, wherein the expected sample number $E_{ij}$ of the $j$-th class of the $i$-th group of step 6 is calculated as:
$$E_{ij} = \frac{R_i \times C_j}{N}$$
where $R_i$ is the sum of the sample numbers of the two classes of the $i$-th group, i.e. $R_i = Group_s^{(i)}$; $C_j$ is the number of samples of class $j$ in the two adjacent groups, so that when $j$ denotes the responding class, $C_j = Group_y^{(i)} + Group_y^{(i+1)}$; and $N$ is the total number of samples of the two adjacent groups, i.e. $N = Group_s^{(i)} + Group_s^{(i+1)}$.
8. The chi-square binning method based on secure multi-party computation according to claim 1, wherein the chi-square value $\chi^2$ of step 6 is calculated as:
$$\chi^2 = \sum_{i}\sum_{j} \frac{(A_{ij} - E_{ij})^2}{E_{ij}}$$
where the sums run over the two adjacent groups and the two classes; $A_{ij}$ is the actual sample number of the $j$-th class of the $i$-th group, so that if $j$ denotes the responding samples of the $i$-th group, $A_{ij} = Group_y^{(i)}$; and $E_{ij}$ is the expected sample number of the $j$-th class of the $i$-th group.
CN202110999974.5A 2021-08-27 2021-08-27 Chi-square box dividing method based on safe multiparty calculation Active CN113688354B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110999974.5A CN113688354B (en) 2021-08-27 2021-08-27 Chi-square box dividing method based on safe multiparty calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110999974.5A CN113688354B (en) 2021-08-27 2021-08-27 Chi-square box dividing method based on safe multiparty calculation

Publications (2)

Publication Number    Publication Date
CN113688354A (en)     2021-11-23
CN113688354B (en)     2023-06-09

Family

ID=78583726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110999974.5A Active CN113688354B (en) 2021-08-27 2021-08-27 Chi-square box dividing method based on safe multiparty calculation

Country Status (1)

Country Link
CN (1) CN113688354B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114021198B (en) * 2021-12-29 2022-04-08 支付宝(杭州)信息技术有限公司 Method and device for determining common data for protecting data privacy
CN114329127B (en) * 2021-12-30 2023-06-20 北京瑞莱智慧科技有限公司 Feature binning method, device and storage medium
CN114398671B (en) * 2021-12-30 2023-07-11 翼健(上海)信息科技有限公司 Privacy calculation method, system and readable storage medium based on feature engineering IV value
CN115951165A (en) * 2022-12-06 2023-04-11 南方电网数字电网研究院有限公司 Fault diagnosis system construction method and device based on multi-source sensor of power equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103826152A (en) * 2012-11-16 2014-05-28 中兴通讯股份有限公司 Method, device and system for using set-top box to realize multi-party conference call
WO2015094545A1 (en) * 2013-12-18 2015-06-25 Mun Johnathan System and method for modeling and quantifying regulatory capital, key risk indicators, probability of default, exposure at default, loss given default, liquidity ratios, and value at risk, within the areas of asset liability management, credit risk, market risk, operational risk, and liquidity risk for banks
CN111079283A (en) * 2019-12-13 2020-04-28 四川新网银行股份有限公司 Method for processing information saturation unbalanced data
CN111539009A (en) * 2020-06-05 2020-08-14 支付宝(杭州)信息技术有限公司 Supervised feature binning method and device for protecting private data

Also Published As

Publication number Publication date
CN113688354A (en) 2021-11-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant