WO2019085656A1 - Data statistics method and apparatus - Google Patents

Data statistics method and apparatus

Info

Publication number
WO2019085656A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
parameter
statistical
identifier
party
Prior art date
Application number
PCT/CN2018/105482
Other languages
French (fr)
Chinese (zh)
Inventor
王华忠
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2019085656A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Definitions

  • the present disclosure relates to the field of network technologies, and in particular, to a data statistics method and apparatus.
  • the present disclosure provides a data statistics method and apparatus for implementing two-party secure computing on the basis of protecting the data privacy of two data owners.
  • a data statistics method is provided, where the method is applied to perform data statistics by combining data of a local data party and a cooperative data party, where the local data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers; the method includes:
  • generating a first parameter and a second parameter corresponding to each data identifier; if the first data corresponding to the data identifier does not participate in data statistics, the second parameter is equal to the first parameter; otherwise, the second parameter is calculated according to the first parameter and the first data;
  • receiving a partner calculation value returned by the cooperative data party, where the partner calculation value is obtained by the cooperative data party according to the selected first parameter or second parameter: if the second data corresponding to the data identifier participates in data statistics, the cooperative data party selects the second parameter; otherwise, the cooperative data party selects the first parameter;
  • removing the calculated value of each first parameter from the partner calculation value to obtain the statistical value.
  • a data statistics method is provided, where the method is used for performing data statistics between a local data party and a statistical data party, where the statistical data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the local data party has second data corresponding to the same data identifiers; the method includes:
  • if the second data corresponding to the data identifier is data that participates in data statistics locally, selecting the second parameter corresponding to the data identifier; otherwise, selecting the first parameter corresponding to the data identifier;
  • a data statistics apparatus configured to perform data statistics by combining data of a local data party and a cooperative data party, where the local data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers; the device includes:
  • a parameter generating module configured to generate a first parameter and a second parameter corresponding to each data identifier; if the first data corresponding to the data identifier does not participate in data statistics, the second parameter is equal to the first parameter; otherwise, the second parameter is calculated according to the first parameter and the first data;
  • a data sending module configured to send each data identifier, and the first parameter and the second parameter corresponding to the data identifier, to the collaborative data party;
  • a data receiving module configured to receive a partner calculation value returned by the cooperative data party, where the partner calculation value is obtained by the cooperative data party according to the selected first parameter or second parameter: if the second data corresponding to the data identifier participates in data statistics, the cooperative data party selects the second parameter; otherwise, the cooperative data party selects the first parameter;
  • a statistical processing module configured to remove the calculated value of each first parameter from the calculated value of the partner, to obtain the statistical value.
  • a data statistics apparatus configured to perform data statistics between a local data party and a statistical data party, where the statistical data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the local data party has second data corresponding to the same data identifiers; the device includes:
  • a parameter receiving module configured to receive a data identifier sent by the statistical data party, and a first parameter and a second parameter corresponding to the data identifier; where, when the first data corresponding to the data identifier participates in data statistics, the second parameter is calculated according to the first parameter and the first data; otherwise, the second parameter is equal to the first parameter;
  • a parameter selection module configured to: if the second data corresponding to the data identifier is data that participates in data statistics locally, select the second parameter corresponding to the data identifier; otherwise, select the first parameter corresponding to the data identifier;
  • a statistical calculation module configured to perform statistical calculation according to the selected first parameter and second parameter, to obtain a partner calculation value;
  • a value sending module configured to send the partner calculation value to the statistical data side, so that the statistical data side removes the calculated value of each first parameter according to the partner calculation value, to obtain the statistical value.
  • a data statistics device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, the processor implementing the instructions to:
  • generating a first parameter and a second parameter corresponding to each data identifier; if the first data corresponding to the data identifier does not participate in data statistics, the second parameter is equal to the first parameter; otherwise, the second parameter is calculated according to the first parameter and the first data;
  • receiving a partner calculation value returned by the cooperative data party, where the partner calculation value is obtained by the cooperative data party according to the selected first parameter or second parameter: if the second data corresponding to the data identifier participates in data statistics, the cooperative data party selects the second parameter; otherwise, the cooperative data party selects the first parameter;
  • removing the calculated value of each first parameter from the partner calculation value to obtain the statistical value.
  • a data statistics device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, the processor implementing the instructions to:
  • if the second data corresponding to the data identifier is data that participates in data statistics locally, selecting the second parameter corresponding to the data identifier; otherwise, selecting the first parameter corresponding to the data identifier;
  • in the data statistics method and apparatus of one or more embodiments of the present specification, the first parameter and the second parameter are generated to confuse the real data, so that the cooperative data party does not learn the real data of the local end when the parameters are transmitted to it. The partner calculation value returned by the cooperative data party is in turn determined according to the data filtering condition of the cooperative data party, and the local end does not know which data the cooperative data party selected. The data of the two parties is thereby jointly calculated while the data privacy of both parties is protected.
  • FIG. 1 is a flowchart of a data statistics method provided by one or more embodiments of the present specification
  • FIG. 3 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of the present disclosure
  • FIG. 4 is a schematic structural diagram of a data statistics apparatus provided by one or more embodiments of the present specification.
  • data can be stored in a vertical mode, that is, multiple data owners can have different attribute information of the same entity.
  • for example, the car insurance score of a natural person may be held by one institution, while the claim amount of the same natural person is held by another institution.
  • This vertical mode of data storage may result in multiple data owners being involved in some statistical calculations, so that multiple data owners need to cooperate to complete one piece of data statistics.
  • during this cooperation, each company's confidential data must not be disclosed.
  • there can be two data sources: data source A and data source B. Data source A can be a data organization, and data source B can be an insurance institution. These two data sources can store different information about the same car owner.
  • Data source A: assume that data source A stores the car insurance score of each car owner.
  • the car insurance score can be the score obtained by performing accurate profiling and risk analysis on the car owner. The higher the car insurance score, the lower the risk.
  • Table 1: the data structure used on the data source A side to store the car insurance scores is as follows:
  • Data source B: assume that data source B stores the claim information of each car owner.
  • the claim information of a car owner may include the number of claims, the claim amount, and the like.
  • Table 2: an example of the data structure stored for each car owner on the data source B side is as follows:
  • the data statistics processing can be completed jointly based on the data of the data source A and the data source B.
  • the statistical task can be "compute the sum of the number of claims for female users whose car insurance score is greater than 500 points." Whether "the car insurance score is greater than 500 points" must be determined from the data of data source A, while the "female user" and "number of claims" data are stored in data source B. Therefore, this statistical task requires data cooperation between data source A and data source B.
  • the data source that holds the data to be aggregated may be referred to as the statistical data party, and the other data source may be referred to as the cooperative data party. In the example above, data source B is the statistical data party and data source A is the cooperative data party.
  • the statistical data party and the cooperative data party may separately store different information about the same car owner. The car owner information stored by the statistical data party that is to participate in the statistics (for example, the number of claims) may be referred to as first data, and the car owner information stored by the cooperative data party that participates in the statistics (for example, the car insurance score) may be referred to as second data.
  • the ID number idcard_no included in both data source A and data source B may be referred to as a data identifier. The statistical data party (for example, data source B) stores the first data corresponding to each data identifier, and the cooperative data party (for example, data source A) stores the second data corresponding to the same data identifier.
  • FIG. 1 illustrates a flow of a data statistics method, which may include:
  • In step 100, the statistical data party generates a first parameter and a second parameter corresponding to each data identifier.
  • the first parameter may be a random number, or the first parameter may also be a value calculated from a random number, such as one-half of a random number.
  • the value of the second parameter may be determined according to the data filtering condition. If the first data corresponding to the data identifier satisfies the local data filtering condition and is therefore data participating in the data statistics, the second parameter may be calculated according to the first parameter and the first data. For example, the first parameter and the first data may be summed to obtain the second parameter. If the first data corresponding to the data identifier does not satisfy the local data filtering condition, the second parameter may be set to be equal to the first parameter.
  • the manner in which the second parameter is generated is not limited to the manner in which the first data and the first parameter are summed, and other calculation methods may be used.
  • In step 102, the statistical data party sends each local data identifier, and the first parameter and the second parameter corresponding to the data identifier, to the cooperative data party.
  • In step 104, the cooperative data party selects a parameter: if the second data corresponding to the data identifier is data that participates in the data statistics locally, the second parameter corresponding to the data identifier is selected; otherwise, the first parameter corresponding to the data identifier is selected.
  • the cooperative data party performs parameter selection in this step, and the selected parameters participate in the processing of the subsequent step 106.
  • the cooperative data party may make this selection according to its local data filtering condition. If the second data corresponding to the data identifier satisfies the filtering condition and is therefore data participating in the data statistics, the second parameter may be selected; otherwise, if the second data corresponding to the data identifier does not satisfy the filtering condition and does not participate in the statistics, the first parameter may be selected.
  • In step 106, the cooperative data party performs statistical calculation on the selected first parameters and second parameters to obtain a partner calculation value.
  • if the statistical value to be obtained is a sum, the selected first parameters and second parameters may be added together; of course, for other statistical methods, other corresponding forms of calculation may be applied to the selected parameters.
  • In step 108, the cooperative data party sends the partner calculation value to the statistical data party.
  • In step 110, the statistical data party removes the calculated value of the first parameters from the partner calculation value to obtain the statistical value. For example, the sum of the respective first parameters may be subtracted from the partner calculation value.
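Steps 100 to 110 can be sketched as three small Python functions for the summation case. This is only an illustrative sketch: the function names, the toy identifiers `id1` to `id3`, the toy values, and the range of the random number are assumptions introduced here, and the transmission in steps 102 and 108 is modeled as ordinary function calls rather than network messages.

```python
import secrets


def generate_parameters(first_data, participating_ids):
    """Step 100 (statistical data party): map each identifier to (first, second)."""
    params = {}
    for ident, value in first_data.items():
        first = secrets.randbelow(1_000_000)  # random first parameter; the range is an arbitrary choice
        # Second parameter: first + data if the row participates, else equal to first.
        second = first + value if ident in participating_ids else first
        params[ident] = (first, second)
    return params


def partner_calculation(params, selected_ids):
    """Steps 104-106 (cooperative data party): select one parameter per identifier and sum."""
    return sum(second if ident in selected_ids else first
               for ident, (first, second) in params.items())


def remove_first_parameters(partner_value, params):
    """Step 110 (statistical data party): subtract the sum of all first parameters."""
    return partner_value - sum(first for first, _ in params.values())


# Hypothetical toy data: id1 and id2 pass the statistical party's filter,
# id2 and id3 pass the cooperative party's filter.
first_data = {"id1": 4, "id2": 9, "id3": 2}
params = generate_parameters(first_data, participating_ids={"id1", "id2"})
partner_value = partner_calculation(params, selected_ids={"id2", "id3"})
statistic = remove_first_parameters(partner_value, params)
print(statistic)  # 9: only id2 passes both filters
```

The random first parameters cancel in step 110, so the result does not depend on the random values drawn.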
  • Oblivious Transfer (OT) is a privacy-preserving two-party communication protocol that enables the communicating parties to transmit messages in a manner in which the selection is obfuscated, so that the service receiver obtains a message supplied by the service sender in an oblivious manner, which protects the receiver's privacy from being known by the sender.
  • the statistical data party can send all the data identifiers and the corresponding first parameters and second parameters to the cooperative data party.
  • although the statistical data party has set the second parameter to different values according to its local data filtering condition, from the perspective of the cooperative data party all data identifiers are received, so the filtering result of the statistical data party is not leaked.
  • the statistical data party confuses its own real data by means of the two parameters: the first parameter and the second parameter transmitted to the cooperative data party are not the real first data, so the data privacy is not leaked either.
  • the partner calculation value that the statistical data party receives reflects the data-filtered selection made by the cooperative data party, but the statistical data party cannot distinguish which data was selected by the cooperative data party. Therefore, the privacy of the cooperative data party's data is also protected.
  • idcard_no   gender   times   amount
    1234567     male     3       5000
    2345678     Female   7       23000
    3456789     Female   6       16000
  • FIG. 2 illustrates a flow of summation statistics that combines data source A and data source B, which may include:
  • In step 200, data source B generates a random number for each row of data and generates M0 and M1 based on its data filtering condition.
  • the column corresponding to the number of claims, times, is the statistical column, and 3, 7, and 6 are the first data in the statistical column.
  • the car owners whose idcard_no values are 2345678 and 3456789 meet the filtering condition, so their first data participate in this data statistics, while the car owner whose idcard_no is 1234567 does not meet the filtering condition and does not participate in the data statistics.
  • the first parameter and the second parameter corresponding to each idcard_no can then be generated: the first parameter M0 may be the random number t corresponding to the idcard_no, and the second parameter M1 may be the sum of that random number and the first data b that participates in the statistics, that is, t + b; for a row that does not participate, M1 is equal to M0.
  • the M0 and M1 generated in this step confuse the real statistical column data by means of the random value. Even if the cooperative data party receives the M0 and M1 corresponding to an idcard_no, it cannot know the real statistical column data corresponding to that data identifier. For example, even if t2 and t2 + 7 corresponding to the data identifier 2345678 are received, the true value 7 of b cannot be known.
  • the above-mentioned random numbers t1, t2, and t3, which respectively correspond to the data identifiers, may be different.
  • In step 202, data source B sends the data identifier of each row of data, and the M0 and M1 corresponding to the data identifier, to data source A.
  • In step 204, data source A makes a selection according to its local data filtering condition: if the second data corresponding to the data identifier participates in the data statistics, it selects M1; otherwise, it selects M0.
  • data source A can determine whether the second data (score in Table 3) corresponding to each data identifier idcard_no is greater than 500 points according to the filtering condition "the car insurance score is greater than 500 points". If the score corresponding to the idcard_no is greater than 500, "t+b" in Table 5 is selected. Otherwise, if the score corresponding to the idcard_no is not greater than 500, "t" in Table 5 is selected.
  • taking idcard_no 1234567 as an example, the car insurance score corresponding to this data identifier is 490, which does not satisfy the filtering condition "the car insurance score is greater than 500 points", so the M0 corresponding to 1234567 in Table 5 can be selected, that is, t1 is selected.
  • taking idcard_no 2345678 as an example, the corresponding car insurance score is 501, which satisfies the filtering condition "the car insurance score is greater than 500 points", so the M1 corresponding to 2345678 in Table 5 can be selected, that is, t2 + 7.
  • similarly, for idcard_no 3456789, t3 + 6 will be selected.
  • In step 206, data source A accumulates the selected parameters to obtain an accumulated value; the accumulated value is the partner calculation value.
  • In step 208, data source A sends the accumulated value to data source B.
  • In step 210, data source B subtracts the sum of the M0 values from the accumulated value to obtain the statistical value.
  • the statistical value is the sum of the plurality of first data; in this example, (t1 + (t2 + 7) + (t3 + 6)) - (t1 + t2 + t3) = 13, which is the sum of the number of claims.
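The FIG. 2 flow can be traced with the example rows in a short Python sketch. The scores 490 and 501 come from the text above; the score 520 for idcard_no 3456789 is a hypothetical value chosen only so that it exceeds 500, as the text requires. The variable names and the range of the random numbers are likewise assumptions.

```python
import secrets

# Data source B (statistical data party): example rows with the claim counts.
table_b = [
    {"idcard_no": "1234567", "gender": "male", "times": 3, "amount": 5000},
    {"idcard_no": "2345678", "gender": "Female", "times": 7, "amount": 23000},
    {"idcard_no": "3456789", "gender": "Female", "times": 6, "amount": 16000},
]

# Data source A (cooperative data party): car insurance scores.
# 520 is a hypothetical score; the text only implies it is greater than 500.
table_a = {"1234567": 490, "2345678": 501, "3456789": 520}

# Step 200: B draws a random t per row and builds (M0, M1) = (t, t + b) for
# female owners, or (t, t) for rows that fail B's filter.
masks = {}
for row in table_b:
    t = secrets.randbelow(1_000_000)
    b = row["times"] if row["gender"].lower() == "female" else 0
    masks[row["idcard_no"]] = (t, t + b)

# Step 202: B sends identifiers with M0 and M1 to A (modeled as sharing `masks`).

# Steps 204-206: A selects M1 where the score is greater than 500, else M0,
# and accumulates the selected values.
accumulated = sum(m1 if table_a[ident] > 500 else m0
                  for ident, (m0, m1) in masks.items())

# Steps 208-210: A returns the accumulated value; B subtracts the sum of M0.
statistic = accumulated - sum(m0 for m0, _ in masks.values())
print(statistic)  # 13 = 7 + 6
```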
  • the data statistics method of one or more embodiments of the present disclosure may also be applied to other statistical calculation scenarios.
  • the statistical value may also be an average value of multiple first data.
  • the processing flow shown in FIG. 2 can also be adopted, except that different first parameters and second parameters can be adopted.
  • for example, the first parameter may be a random number and the second parameter may be the first parameter plus one-half of the first data: for the data identifier 2345678, the generated M0 may be t2 and the generated M1 may be "t2 + 7/2".
  • alternatively, the first parameter may be generated as one-half of a random number, such as "t2/2", and the corresponding second parameter may be "(t2 + 7)/2".
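Both parameter choices for the average can be checked with a small Python sketch using the two participating rows of the example (claim counts 7 and 6). The helper name `run` and the random range are assumptions. The factor one-half matches the two participating rows of this example; how the divisor would be agreed in general is not stated in the source, so the sketch does not generalize it.

```python
from fractions import Fraction
import secrets

# First data of the two rows that participate in the example.
first_data = {"2345678": 7, "3456789": 6}


def run(make_pair):
    """Build (M0, M1) per row, let the partner select M1 for every row, then remove M0."""
    pairs = {ident: make_pair(Fraction(secrets.randbelow(1000)), b)
             for ident, b in first_data.items()}
    accumulated = sum(m1 for _, m1 in pairs.values())          # cooperative party's sum
    return accumulated - sum(m0 for m0, _ in pairs.values())   # statistical party's step


# Variant 1: M0 = t, M1 = t + b/2.
variant_1 = run(lambda t, b: (t, t + Fraction(b, 2)))
# Variant 2: M0 = t/2, M1 = (t + b)/2.
variant_2 = run(lambda t, b: (t / 2, (t + b) / 2))

print(variant_1, variant_2)  # 13/2 13/2, the average of 7 and 6
```

Exact fractions are used instead of floating-point numbers so that the random terms cancel exactly.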
  • the device may include: a parameter generating module 31, a data sending module 32, a data receiving module 33, and a statistical processing module 34.
  • a parameter generating module 31 configured to generate a first parameter and a second parameter corresponding to each data identifier; if the first data corresponding to the data identifier does not participate in data statistics, the second parameter is equal to the first parameter, otherwise, The second parameter is calculated according to the first parameter and the first data;
  • the data sending module 32 is configured to send each data identifier, and the first parameter and the second parameter corresponding to the data identifier, to the cooperative data party;
  • the data receiving module 33 is configured to receive a partner calculation value returned by the cooperation data party, where the partner calculation value is obtained by the cooperation data party according to the selected first parameter or the second parameter, if the data identifier corresponds to the second data Participating in data statistics, the cooperative data party selects the second parameter; otherwise, the cooperative data party selects the first parameter;
  • the statistical processing module 34 is configured to remove the calculated value of each first parameter from the calculated value of the partner, to obtain the statistical value.
  • the plurality of first data are located in the same statistical column of the local data source.
  • the parameter generating module 31 is specifically configured to obtain the second parameter by summing the first parameter and the first data.
  • the statistical processing module 34, when removing the calculated value of each first parameter from the partner calculation value, is specifically configured to subtract the sum of the first parameters from the accumulated value, where the accumulated value is accumulated by the cooperative data party according to the selected first parameter or second parameter.
  • the parameter generating module 31 is configured so that, when the second parameter is obtained from the first parameter and the first data because the first data corresponding to the data identifier satisfies the data filtering condition used to determine participation in the data statistics, and the statistical value is the sum of the plurality of first data, the first parameter is a random number and the second parameter is the sum of the random number and the first data.
  • the parameter generating module 31 is configured so that, when the second parameter is obtained from the first parameter and the first data because the first data corresponding to the data identifier satisfies the data filtering condition used to determine participation in the data statistics, and the statistical value is an average of the plurality of first data, the second parameter is the first parameter plus one-half of the first data.
  • the device may include: a parameter receiving module 41, a parameter selection module 42, a statistical calculation module 43, and a value sending module 44.
  • the parameter receiving module 41 is configured to receive a data identifier sent by the statistic data side, and a first parameter and a second parameter corresponding to the data identifier, where, when the first data corresponding to the data identifier participates in the data statistics, The second parameter is calculated according to the first parameter and the first data; otherwise, the second parameter is equal to the first parameter;
  • the parameter selection module 42 is configured to: if the second data corresponding to the data identifier is the data of the local participation data statistics, select the second parameter corresponding to the data identifier; otherwise, select the first parameter corresponding to the data identifier;
  • the statistical calculation module 43 is configured to perform statistical calculation according to the selected first parameter and the second parameter to obtain a calculated value of the partner;
  • the value sending module 44 is configured to send the partner calculation value to the statistical data party, so that the statistical data party removes the calculated value of each first parameter from the partner calculation value and obtains the statistical value.
  • each step may be implemented in software, hardware, or a combination thereof; for example, a person skilled in the art may implement it as software code, that is, as computer-executable instructions capable of implementing the logic function corresponding to the step.
  • the executable instructions can be stored in a memory and executed by a processor in the device.
  • one or more embodiments of the present specification simultaneously provide a data statistics device for performing data statistics in conjunction with data of a local data party and a cooperative data party, the local data party having statistics to be calculated.
  • the apparatus can include a processor, a memory, and computer instructions stored on the memory and operative on the processor, the processor executing the instructions for implementing the steps of:
  • generating a first parameter and a second parameter corresponding to each data identifier; if the first data corresponding to the data identifier does not participate in data statistics, the second parameter is equal to the first parameter; otherwise, the second parameter is calculated according to the first parameter and the first data;
  • receiving a partner calculation value returned by the cooperative data party, where the partner calculation value is obtained by the cooperative data party according to the selected first parameter or second parameter: if the second data corresponding to the data identifier participates in data statistics, the cooperative data party selects the second parameter; otherwise, the cooperative data party selects the first parameter;
  • removing the calculated value of each first parameter from the partner calculation value to obtain the statistical value.
  • one or more embodiments of the present specification further provide a data statistics device, configured to perform data statistics between a local data party and a statistical data party, where the statistical data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the local data party has second data corresponding to the same data identifiers.
  • the apparatus can include a processor, a memory, and computer instructions stored on the memory and operative on the processor, the processor executing the instructions for implementing the steps of:
  • if the second data corresponding to the data identifier is data that participates in data statistics locally, selecting the second parameter corresponding to the data identifier; otherwise, selecting the first parameter corresponding to the data identifier;
  • the apparatus or module illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function.
  • a typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, and a game control.
  • one or more embodiments of the present specification can be provided as a method, system, or computer program product.
  • one or more embodiments of the present specification can take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware.
  • one or more embodiments of the present specification can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • the computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus, where the instruction apparatus implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • One or more embodiments of the present specification can be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • One or more embodiments of the present specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are connected through a communication network.
  • program modules can be located in both local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Evolutionary Biology (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • Algebra (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Telephonic Communication Services (AREA)
  • Complex Calculations (AREA)

Abstract

Provided are a data statistics method and apparatus, the method comprising: generating a first parameter and a second parameter to correspond to each data identifier; if a piece of first data corresponding to a data identifier does not participate in data statistics, the second parameter is equal to the first parameter, and otherwise, the second parameter is calculated according to the first parameter and the piece of first data; sending each data identifier and the corresponding first parameter and second parameter to a cooperative data party; receiving cooperative party calculation values returned by the cooperative data party, the cooperative party calculation values being obtained by the cooperative data party according to selected first parameters or second parameters; removing calculation values of various first parameters from the cooperative party calculation values, and obtaining required statistical values.

Description

Data statistics method and apparatus
Technical field
The present disclosure relates to the field of network technologies, and in particular, to a data statistics method and apparatus.
Background
In the era of big data, there are many data silos. For example, the data of one natural person may be scattered across different enterprises, and enterprises do not fully trust one another because of competition and user privacy concerns. This creates an obstacle to statistical work that involves data cooperation between enterprises. How to use the data held by both parties to complete statistical calculations while fully protecting each enterprise's core data privacy has become an urgent problem, and there is currently no good solution.
Summary of the invention
In view of this, the present disclosure provides a data statistics method and apparatus that implement secure two-party computation while protecting the data privacy of both data owners.
Specifically, one or more embodiments of this specification are implemented through the following technical solutions:
In a first aspect, a data statistics method is provided. The method is applied to perform data statistics by combining data of a local data party and a cooperative data party. The local data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. The method includes:
generating, for each data identifier, a first parameter and a second parameter, where if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated from the first parameter and the first data;
sending each data identifier, together with the first parameter and the second parameter corresponding to the data identifier, to the cooperative data party;
receiving a cooperative party calculation value returned by the cooperative data party, where the cooperative party calculation value is obtained by the cooperative data party from the selected first parameters or second parameters: if the second data corresponding to a data identifier participates in the data statistics, the cooperative data party selects the second parameter, and otherwise it selects the first parameter; and
removing the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.
In a second aspect, a data statistics method is provided. The method is used to perform data statistics between a local data party and a statistical data party. The statistical data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the local data party has second data corresponding to the same data identifiers. The method includes:
receiving the data identifiers sent by the statistical data party, together with the first parameter and the second parameter corresponding to each data identifier, where if the first data corresponding to a data identifier participates in the data statistics, the second parameter is calculated from the first parameter and the first data, and otherwise the second parameter is equal to the first parameter;
selecting the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is data that locally participates in the data statistics, and otherwise selecting the first parameter corresponding to that data identifier;
performing a statistical calculation on the selected first parameters and second parameters to obtain a cooperative party calculation value; and
sending the cooperative party calculation value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.
In a third aspect, a data statistics apparatus is provided. The apparatus is used to perform data statistics by combining data of a local data party and a cooperative data party. The local data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. The apparatus includes:
a parameter generation module, configured to generate, for each data identifier, a first parameter and a second parameter, where if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated from the first parameter and the first data;
a data sending module, configured to send each data identifier, together with the first parameter and the second parameter corresponding to the data identifier, to the cooperative data party;
a data receiving module, configured to receive a cooperative party calculation value returned by the cooperative data party, where the cooperative party calculation value is obtained by the cooperative data party from the selected first parameters or second parameters: if the second data corresponding to a data identifier participates in the data statistics, the cooperative data party selects the second parameter, and otherwise it selects the first parameter; and
a statistical processing module, configured to remove the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.
In a fourth aspect, a data statistics apparatus is provided. The apparatus is used to perform data statistics between a local data party and a statistical data party. The statistical data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the local data party has second data corresponding to the same data identifiers. The apparatus includes:
a parameter receiving module, configured to receive the data identifiers sent by the statistical data party, together with the first parameter and the second parameter corresponding to each data identifier, where if the first data corresponding to a data identifier participates in the data statistics, the second parameter is calculated from the first parameter and the first data, and otherwise the second parameter is equal to the first parameter;
a parameter selection module, configured to select the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is data that locally participates in the data statistics, and otherwise to select the first parameter corresponding to that data identifier;
a statistical calculation module, configured to perform a statistical calculation on the selected first parameters and second parameters to obtain a cooperative party calculation value; and
a value sending module, configured to send the cooperative party calculation value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.
In a fifth aspect, a data statistics device is provided. The device includes a memory, a processor, and computer instructions that are stored in the memory and executable on the processor, and the processor implements the following steps when executing the instructions:
generating, for each data identifier, a first parameter and a second parameter, where if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated from the first parameter and the first data;
sending each data identifier, together with the first parameter and the second parameter corresponding to the data identifier, to the cooperative data party;
receiving a cooperative party calculation value returned by the cooperative data party, where the cooperative party calculation value is obtained by the cooperative data party from the selected first parameters or second parameters: if the second data corresponding to a data identifier participates in the data statistics, the cooperative data party selects the second parameter, and otherwise it selects the first parameter; and
removing the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.
In a sixth aspect, a data statistics device is provided. The device includes a memory, a processor, and computer instructions that are stored in the memory and executable on the processor, and the processor implements the following steps when executing the instructions:
receiving the data identifiers sent by the statistical data party, together with the first parameter and the second parameter corresponding to each data identifier, where if the first data corresponding to a data identifier participates in the data statistics, the second parameter is calculated from the first parameter and the first data, and otherwise the second parameter is equal to the first parameter;
selecting the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is data that locally participates in the data statistics, and otherwise selecting the first parameter corresponding to that data identifier;
performing a statistical calculation on the selected first parameters and second parameters to obtain a cooperative party calculation value; and
sending the cooperative party calculation value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.
With the data statistics method and apparatus of one or more embodiments of this specification, first parameters and second parameters are generated to obfuscate the real data, so that when these parameters are sent to the cooperative data party, the cooperative data party does not learn the local party's real data. In addition, the cooperative party calculation value returned by the cooperative data party is determined according to that party's own data filter condition, and the local party does not learn which data the cooperative data party selected. Secure two-party computation over the combined data of both parties is thus achieved while protecting the data privacy of both data owners.
Brief description of the drawings
To describe the technical solutions in one or more embodiments of this specification or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in one or more embodiments of this specification, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a data statistics method provided by one or more embodiments of this specification;
FIG. 2 is a flowchart of data summation statistics provided by one or more embodiments of this specification;
FIG. 3 is a schematic structural diagram of a data statistics apparatus provided by one or more embodiments of this specification;
FIG. 4 is a schematic structural diagram of a data statistics apparatus provided by one or more embodiments of this specification.
Detailed description
To help a person skilled in the art better understand the technical solutions in one or more embodiments of this specification, these technical solutions are described clearly and completely below with reference to the accompanying drawings. The described embodiments are merely some, not all, of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on one or more embodiments of this specification without creative effort shall fall within the protection scope of the present disclosure.
In the era of big data, data may be stored in a vertical mode, in which multiple data owners hold different attributes of the same entity. For example, the car insurance score of a natural person may be held by one institution while that person's claim amount is held by another. With this vertical storage, some statistical calculations involve multiple data owners, which must cooperate to complete a single statistical task. However, because of competition between enterprises or privacy protection concerns, the enterprises cannot disclose their own data secrets.
The examples of the present disclosure aim to perform data statistics based on the data of different data owners without disclosing the data privacy of either owner. The method is described in detail below using an example application scenario.
Application scenario:
In one example, there are two data sources, data source A and data source B. Suppose data source A is a data institution and data source B is an insurance institution; the two data sources may each store different information about the same car owner.
Data source A: suppose data source A stores the car insurance score of each car owner. The car insurance score may be a score obtained after accurate profiling and risk analysis of the car owner; a higher score indicates lower risk. Table 1 shows an example of the data structure in which data source A stores car insurance scores:
Table 1: Data structure of data source A
| Column name | Type | Description | Example |
| --- | --- | --- | --- |
| idcard_no | string | ID card number | ******197309119564 |
| score | int | Car insurance score | 510 |
Data source B: suppose data source B stores the claim information of each car owner, for example the number of claims and the claim amount. Table 2 shows an example of the data structure of each car owner stored at data source B:
Table 2: Data structure of data source B
[Table 2 appears only as an image in the source: Figure PCTCN2018105482-appb-000001]
In this application scenario, the data statistics can be completed jointly from the data of data source A and data source B. For example, the statistical task may be "compute the sum of the number of claims of female users whose car insurance score is greater than 500". The condition "car insurance score greater than 500" must be evaluated against the data of data source A, while "female user" and "number of claims" are stored in data source B, so this task requires the data of data source A and data source B to be used together.
In the description of the data statistics method in one or more embodiments of this specification, the data source that holds the data to be aggregated is called the statistical data party, and the other data source is called the cooperative data party. For example, in the task "compute the sum of the number of claims of female users whose car insurance score is greater than 500", the number of claims is the data to be aggregated, so data source B is the statistical data party and data source A is the cooperative data party.
The statistical data party and the cooperative data party may each store different information about the same car owner. The car owner information stored at the statistical data party that is to be aggregated (for example, the number of claims) is called the first data, and the car owner information stored at the cooperative data party that takes part in the statistics (for example, the car insurance score) is called the second data. In addition, the ID card number idcard_no, which appears in both data source A and data source B, is called the data identifier. The statistical data party (for example, data source B) stores the first data corresponding to a data identifier, and the cooperative data party (for example, data source A) stores the second data corresponding to the same data identifier.
FIG. 1 illustrates the flow of a data statistics method, which may include the following steps.
In step 100, the statistical data party generates a first parameter and a second parameter for each data identifier.
For example, the first parameter may be a random number, or a value calculated from a random number, such as one half of the random number.
The value of the second parameter may be determined by the data filter condition. If the first data corresponding to a data identifier satisfies the local data filter condition and therefore participates in the data statistics, the second parameter may be calculated from the first parameter and the first data, for example as the sum of the first parameter and the first data. If the first data corresponding to the data identifier does not satisfy the local data filter condition, the second parameter may be set equal to the first parameter. In an actual implementation, the second parameter is not limited to the sum of the first data and the first parameter; other calculations may also be used.
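Step 100 can be sketched as follows for the summation case. This is a minimal illustration rather than the patent's implementation: the function name `generate_parameters`, the dictionary layout, and the `bound` on the random range are all assumptions introduced here.

```python
import secrets


def generate_parameters(first_data, participates, bound=2**64):
    """Return {identifier: (first_param, second_param)} for the statistical data party.

    first_data:   maps identifier -> value in the statistical column
    participates: maps identifier -> True if the row passes the local filter
    bound:        hypothetical size of the range the random numbers are drawn from
    """
    params = {}
    for ident, value in first_data.items():
        mask = secrets.randbelow(bound)  # first parameter: a fresh random number
        if participates[ident]:
            # Participating row: second parameter = first parameter + first data.
            params[ident] = (mask, mask + value)
        else:
            # Non-participating row: second parameter equals the first parameter.
            params[ident] = (mask, mask)
    return params
```

For a participating identifier the two parameters differ by exactly the first data; for a non-participating identifier they are identical.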
In step 102, the statistical data party sends the local data identifiers, together with the first parameter and the second parameter corresponding to each data identifier, to the cooperative data party.
In step 104, the cooperative data party selects parameters: if the second data corresponding to a data identifier is data that locally participates in the data statistics, it selects the second parameter corresponding to that data identifier; otherwise, it selects the first parameter corresponding to that data identifier.
For example, after receiving the data identifiers and the corresponding first and second parameters from the statistical data party, the cooperative data party selects parameters in this step, and the selected parameters are used in the subsequent step 106.
The cooperative data party makes the selection according to its local data filter condition. If the second data corresponding to a data identifier satisfies the filter condition and therefore participates in the data statistics, the second parameter is selected; if the second data does not satisfy the filter condition and does not participate in the data statistics, the first parameter is selected.
In step 106, the cooperative data party performs a statistical calculation on the selected first parameters and second parameters to obtain the cooperative party calculation value. For example, when the statistical value to be obtained is a sum, the selected first parameters and second parameters are added together; for other kinds of statistics, the corresponding calculation is applied to the selected parameters instead.
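Steps 104 and 106 can be sketched for the summation case as below. The function name and argument layout are hypothetical, and treating an identifier that the cooperative data party does not know as non-participating is an assumption made here, not something the text specifies.

```python
def cooperative_party_value(received, second_data_participates):
    """Select one parameter per identifier (step 104) and sum the selection (step 106).

    received:                 identifier -> (first_param, second_param)
    second_data_participates: identifier -> True if the second data passes the
                              cooperative data party's local filter
    """
    total = 0
    for ident, (first_param, second_param) in received.items():
        # Assumption: identifiers missing from the local data do not participate.
        if second_data_participates.get(ident, False):
            total += second_param
        else:
            total += first_param
    return total
```

With `received = {"x": (10, 13), "y": (20, 27)}` and only `"y"` participating, the function adds 10 and 27.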
In step 108, the cooperative data party sends the cooperative party calculation value to the statistical data party.
In step 110, the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value. For example, it may subtract the sum of all first parameters from the cooperative party calculation value.
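For the summation case, the reason step 110 yields the desired value can be written out as follows. The notation is introduced here for illustration and does not appear in the source: $t_i$ is the first parameter and $b_i$ the first data for identifier $i$, $p_i \in \{0,1\}$ indicates whether the first data passes the statistical data party's filter, $q_i \in \{0,1\}$ indicates whether the second data passes the cooperative data party's filter, $V$ is the cooperative party calculation value, and $S$ is the statistical value.

```latex
\begin{aligned}
\text{second parameter for } i &= t_i + p_i\, b_i, \\
\text{value selected for } i &= t_i + q_i\, p_i\, b_i, \\
V &= \sum_i \bigl( t_i + p_i\, q_i\, b_i \bigr), \\
S &= V - \sum_i t_i = \sum_i p_i\, q_i\, b_i .
\end{aligned}
```

Only rows that satisfy both parties' filter conditions contribute their $b_i$ to $S$, which is the sum the task asks for.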
The example flow of FIG. 1 uses the oblivious transfer (OT) protocol. OT is a privacy-preserving two-party communication protocol that lets the two parties transfer messages in a way that obscures the selection: the receiver of the service obtains certain messages input by the sender obliviously, so that the receiver's privacy is protected from the sender.
For example, in FIG. 1 the statistical data party sends all data identifiers and the corresponding first and second parameters to the cooperative data party. The statistical data party has already set the second parameters to different values according to its local data filter condition, but from the cooperative data party's point of view all data identifiers are received, so the statistical data party's filter result is not disclosed. Furthermore, the statistical data party obfuscates its real data by means of the two parameters: the first and second parameters sent to the cooperative data party are not the real first data, so no data privacy is leaked. From the statistical data party's point of view, the cooperative party calculation value it receives reflects the selection the cooperative data party made after filtering its data, but the statistical data party cannot tell which data the cooperative data party selected, so the cooperative data party's data privacy is protected as well.
Based on the data structure shown in Table 1, suppose data source A holds the car insurance score data in Table 3, where idcard_no is the car owner's ID card number and score is the car owner's car insurance score.
Table 3: Data of data source A
| idcard_no | score |
| --- | --- |
| 1234567 | 490 |
| 2345678 | 501 |
| 3456789 | 530 |
Based on the data structure shown in Table 2, suppose data source B holds the data in Table 4:
Table 4: Data of data source B
| idcard_no | gender | times | amount |
| --- | --- | --- | --- |
| 1234567 | male | 3 | 5000 |
| 2345678 | female | 7 | 23000 |
| 3456789 | female | 6 | 16000 |
Based on Table 3 and Table 4, the following computes the sum of the number of claims of female users whose car insurance score is greater than 500. The data to be aggregated, the number of claims, is stored in data source B; the times column of Table 4 can be called the "statistical column", meaning the values in this column are to be summed. The filter condition "car insurance score greater than 500" resides in data source A (the second data serves as a filter condition for obtaining the statistical value), and the filter condition "female" resides in data source B, so filter conditions may exist at both data sources. By cooperating on data, data source A and data source B can sum the number of claims and thereby obtain the statistical value.
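As a reference point only, the target value can be computed directly if both tables are pooled in one place, which is precisely what the method is meant to avoid. The sketch below copies the rows of Table 3 and Table 4; the variable and function names are hypothetical.

```python
# Table 3 (data source A): idcard_no -> car insurance score.
scores = {"1234567": 490, "2345678": 501, "3456789": 530}

# Table 4 (data source B): idcard_no -> gender, number of claims, claim amount.
claims = {
    "1234567": {"gender": "male", "times": 3, "amount": 5000},
    "2345678": {"gender": "female", "times": 7, "amount": 23000},
    "3456789": {"gender": "female", "times": 6, "amount": 16000},
}


def plain_sum_of_claim_times(scores, claims, threshold=500):
    """Non-private baseline: sum `times` for female owners with score > threshold."""
    return sum(
        row["times"]
        for ident, row in claims.items()
        if row["gender"] == "female" and scores.get(ident, 0) > threshold
    )


print(plain_sum_of_claim_times(scores, claims))  # 13 (7 + 6)
```

The privacy-preserving flow should arrive at the same value without either data source revealing its rows to the other.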
FIG. 2 illustrates the flow of summation statistics that combines data source A and data source B, which may include the following steps.
In step 200, data source B generates a random number for each row of data and generates M0 and M1 according to its data filter condition.
In this step, taking the data in Table 4 as an example, the column holding the number of claims, times, is the statistical column. The values 3, 7, and 6 are the first data in the statistical column.
A random number is generated for each row of data. Suppose the random number for 1234567 is t1, the random number for 2345678 is t2, and the random number for 3456789 is t3.
According to the local data filter condition "female user", the car owners with idcard_no 2345678 and 3456789 meet the condition, and their first data participate in this data statistics; the car owner with idcard_no 1234567 does not meet the filter condition and does not participate. Accordingly, denoting each first data in the statistical column by b, a first parameter and a second parameter can be generated for each idcard_no. The first parameter may be the random number corresponding to the idcard_no, and for a participating row the second parameter may be the sum of that random number and the first data b corresponding to the idcard_no.
As in the example of Table 5 below, a random number t is generated for each row of data, and the true value in the statistical column for that row is denoted b. Each row is traversed: if the row satisfies the local filter condition, M0 = t and M1 = t + b are generated; otherwise, M0 = M1 = t is generated.
Table 5  M0 and M1 for each row of data
idcard_no    M0    M1
1234567      t1    t1
2345678      t2    t2+7
3456789      t3    t3+6
The M0 and M1 generated in this step use random values to obscure the real statistics column data: even if the cooperating data party receives the M0 and M1 corresponding to an idcard_no, it cannot know the real statistics column value b for that data identifier. For example, even on receiving t2 and t2+7 for data identifier 2345678, it cannot know the true value of b, which is 7.
In addition, the random numbers t1, t2 and t3 corresponding to the individual data identifiers may differ from one another.
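The parameter generation of step 200 can be sketched as follows. This is a minimal illustration rather than the implementation described in the specification: the function name `make_masked_params`, the `gender` field, the 32-bit range of the random numbers, and the pairing of the value 3 with idcard_no 1234567 are all assumptions made for the example. The sketch models only the arithmetic of M0 and M1, not how the two values are protected in transit.

```python
import secrets

def make_masked_params(rows, passes_filter, value_of):
    """Return one (identifier, M0, M1) tuple per row, as in step 200."""
    masked = []
    for row in rows:
        t = secrets.randbelow(2**32)      # fresh random number for this row (range assumed)
        b = value_of(row)                 # true value in the statistics column
        m0 = t
        m1 = t + b if passes_filter(row) else t   # M1 equals M0 when the row is filtered out
        masked.append((row["idcard_no"], m0, m1))
    return masked

# Rows in the style of Table 4; the "gender" field is a hypothetical column name.
table4 = [
    {"idcard_no": "1234567", "gender": "male", "times": 3},
    {"idcard_no": "2345678", "gender": "female", "times": 7},
    {"idcard_no": "3456789", "gender": "female", "times": 6},
]

masked = make_masked_params(
    table4,
    passes_filter=lambda r: r["gender"] == "female",
    value_of=lambda r: r["times"],
)
```

Each tuple in `masked` corresponds to one row of Table 5, with a concrete random integer in place of t1, t2 and t3.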
In step 202, data source B sends the data identifier of each row, together with the M0 and M1 corresponding to that data identifier, to data source A.
In step 204, data source A applies its local data filter condition: if the second data corresponding to a data identifier participates in the statistic, it selects M1; otherwise, it selects M0.
For example, data source A can use the filter condition "auto insurance score greater than 500" to decide whether the second data (score in Table 3) corresponding to each data identifier idcard_no is greater than 500. If the score corresponding to the idcard_no is greater than 500, "t+b" in Table 5 is selected; otherwise, that is, if the score is not greater than 500, "t" in Table 5 is selected.
Take idcard_no 1234567 as an example: the auto insurance score for this data identifier is 490, which does not satisfy the filter condition "auto insurance score greater than 500", so the M0 corresponding to 1234567 in Table 5 is selected, namely t1. As another example, for idcard_no 2345678 the auto insurance score in Table 3 is 501, which satisfies the filter condition, so the M1 corresponding to 2345678 in Table 5 is selected, namely t2+7. Likewise, for idcard_no 3456789, t3+6 is selected.
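The selection rule of step 204 can be illustrated with a short sketch. The function name `select_params`, the concrete mask values 11, 22 and 33 standing in for t1, t2 and t3, and the score 520 for idcard_no 3456789 (this passage only implies that it exceeds 500) are assumptions for the example. The sketch also assumes that every identifier received from data source B exists in data source A.

```python
def select_params(masked_rows, second_data, passes_filter):
    """Pick M1 when the local second data passes the filter, otherwise M0."""
    selected = []
    for ident, m0, m1 in masked_rows:
        selected.append(m1 if passes_filter(second_data[ident]) else m0)
    return selected

# (identifier, M0, M1) as received from data source B, with t1=11, t2=22, t3=33.
masked_rows = [("1234567", 11, 11), ("2345678", 22, 29), ("3456789", 33, 39)]

# Scores in the style of Table 3; 490 and 501 come from the text, 520 is assumed.
scores = {"1234567": 490, "2345678": 501, "3456789": 520}

selected = select_params(masked_rows, scores, lambda score: score > 500)
```

With these inputs `selected` holds t1, t2+7 and t3+6, matching the three choices described above.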
In step 206, data source A accumulates the selected values to obtain an accumulated value.
For example, data source A can add up the selected parameters to obtain an accumulated value, such as M = t1 + (t2+7) + (t3+6). This accumulated value is the cooperating-party computed value.
In step 208, data source A sends the accumulated value to data source B.
In step 210, data source B subtracts the sum of the M0 values from the accumulated value to obtain the statistical value.
In this step, after receiving the accumulated value, data source B subtracts the sum of all the random numbers M0 from it; the result is the sum of the number of claims to be computed. For example, M - (t1 + t2 + t3) = 13 is the final statistical value, where M is the accumulated value.
In this example, after receiving the accumulated value, data source B cannot know whether data source A selected M0 or M1 for a particular row, since it receives only a single accumulated value; likewise, data source A cannot know which data data source B filtered into the statistic, since it receives only the two parameters. This approach therefore does not leak either party's detailed data during the computation, and it completes the two-party sum efficiently.
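The reason the subtraction in step 210 recovers the desired sum can be written out explicitly. The indicator symbols below are introduced here for illustration and do not appear in the specification: c_i is 1 when row i satisfies data source B's filter and 0 otherwise, a_i is the same for data source A's filter, and s_i is the value data source A selects for row i.

```latex
\begin{aligned}
\mathrm{M0}_i &= t_i, \qquad \mathrm{M1}_i = t_i + c_i\, b_i, \\
s_i &= \begin{cases} \mathrm{M1}_i, & a_i = 1 \\ \mathrm{M0}_i, & a_i = 0 \end{cases}
      \;=\; t_i + a_i\, c_i\, b_i, \\
M &= \sum_i s_i \;=\; \sum_i t_i + \sum_i a_i\, c_i\, b_i, \\
M - \sum_i \mathrm{M0}_i &= \sum_i a_i\, c_i\, b_i .
\end{aligned}
```

Only rows that pass both filters contribute their b_i. In the running example the three rows contribute 0, 7 and 6, which gives 13.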
The flow shown in FIG. 2 above takes as its example a statistical value that is the sum of multiple first data, such as the total number of claims. In other examples, the data statistics method of one or more embodiments of this specification can also be applied to other statistical computations; for instance, the statistical value may be the average of multiple first data.
Taking "the average number of claims for female users whose auto insurance score is greater than 500" as an example, the processing flow shown in FIG. 2 can still be used, except that different first and second parameters may be adopted. For instance, when a row does not satisfy data source B's own filter condition, the first and second parameters generated for the data identifier may be M0 = M1 = t; when a row does satisfy that filter condition, the second parameter generated for the data identifier may be the first parameter plus one half of the first data.
For example, for data identifier 2345678 in Table 5, the generated M0 may be t2 and the generated M1 may be "t2+7/2". Alternatively, the first parameter may be generated as one half of the random number, such as "t2/2", with "(t2+7)/2" as the corresponding second parameter. Table 6 below shows the first of these variants:
Table 6  M0 and M1 when computing an average
idcard_no    M0    M1
1234567      t1    t1
2345678      t2    t2+7/2
3456789      t3    t3+6/2
After data source B receives the accumulated value M sent by data source A, and assuming data source A selected M1 for the last two rows (data identifiers 2345678 and 3456789), the statistical value can again be computed as M - (t1 + t2 + t3) = 7/2 + 6/2 = 6.5.
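The averaging variant can be checked with a small sketch. The function name `masked_for_average` and the `divisor` argument are hypothetical; the specification uses the fixed factor of one half, which in this example coincides with the two rows that pass both filters. `Fraction` is used so that 7/2 and 6/2 are exact, and the pairing of the value 3 with idcard_no 1234567 is assumed.

```python
from fractions import Fraction
import secrets

def masked_for_average(values, passes, divisor):
    """Return (identifier, M0, M1) with M1 = M0 + b/divisor for rows that pass."""
    rows = []
    for ident, b in values.items():
        t = secrets.randbelow(2**32)            # random first parameter (range assumed)
        m1 = t + Fraction(b, divisor) if passes[ident] else t
        rows.append((ident, t, m1))
    return rows

values = {"1234567": 3, "2345678": 7, "3456789": 6}              # first data at data source B
passes_b = {"1234567": False, "2345678": True, "3456789": True}  # "female" filter result
rows = masked_for_average(values, passes_b, divisor=2)

# Data source A selects M1 for the identifiers that pass its own filter.
selected_by_a = {"2345678", "3456789"}
M = sum(m1 if ident in selected_by_a else m0 for ident, m0, m1 in rows)

# Data source B removes the sum of the first parameters.
average = M - sum(m0 for _, m0, _ in rows)
```

Here `average` equals 13/2, the 6.5 stated in the text.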
To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus. As shown in FIG. 3, the apparatus may include a parameter generation module 31, a data sending module 32, a data receiving module 33 and a statistical processing module 34.
The parameter generation module 31 is configured to generate a first parameter and a second parameter for each data identifier; if the first data corresponding to the data identifier does not participate in the statistic, the second parameter equals the first parameter; otherwise, the second parameter is calculated from the first parameter and the first data.
The data sending module 32 is configured to send each data identifier, together with the first parameter and second parameter corresponding to that data identifier, to the cooperating data party.
The data receiving module 33 is configured to receive the cooperating-party computed value returned by the cooperating data party, this value being obtained by the cooperating data party from the first or second parameters it selected; if the second data corresponding to a data identifier participates in the statistic, the cooperating data party selects the second parameter; otherwise, it selects the first parameter.
The statistical processing module 34 is configured to remove the computed value of the first parameters from the cooperating-party computed value to obtain the statistical value.
In one example, the multiple first data are located in the same statistics column of the local data source.
In one example, when calculating the second parameter from the first parameter and the first data, the parameter generation module 31 is specifically configured to obtain the second parameter by summing the first parameter and the first data. When removing the computed value of the first parameters from the cooperating-party computed value, the statistical processing module 34 is specifically configured to subtract the sum of the first parameters from an accumulated value, the accumulated value being obtained by the cooperating data party by accumulating the first or second parameters it selected.
In one example, when obtaining the second parameter by summing the first parameter and the first data, the parameter generation module 31 is specifically configured as follows: if the first data corresponding to the data identifier satisfies the data filter condition used to determine which data participates in the statistic, then, when the statistical value is the sum of multiple first data, the first parameter is a random number and the second parameter is the sum of that random number and the first data.
In one example, when obtaining the second parameter by summing the first parameter and the first data, the parameter generation module 31 is specifically configured as follows: if the first data corresponding to the data identifier satisfies the data filter condition used to determine which data participates in the statistic, then, when the statistical value is the average of multiple first data, the second parameter is the first parameter plus one half of the first data.
To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus. As shown in FIG. 4, the apparatus may include a parameter receiving module 41, a parameter selection module 42, a statistical calculation module 43 and a value sending module 44.
The parameter receiving module 41 is configured to receive the data identifiers sent by the statistics data party, together with the first parameter and second parameter corresponding to each data identifier; when the first data corresponding to a data identifier participates in the statistic, the second parameter is calculated from the first parameter and the first data; otherwise, the second parameter equals the first parameter.
The parameter selection module 42 is configured to select the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is data that participates in the statistic locally, and otherwise to select the first parameter corresponding to that data identifier.
The statistical calculation module 43 is configured to perform a statistical calculation on the selected first and second parameters to obtain the cooperating-party computed value.
The value sending module 44 is configured to send the cooperating-party computed value to the statistics data party, so that the statistics data party removes the computed value of the first parameters from the cooperating-party computed value and obtains the statistical value.
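One possible way to map the two apparatuses onto software is sketched below for the sum statistic only. The class and method names are hypothetical, the grouping of several modules into one method is a simplification made here, and the network transfer between the parties is replaced by a direct function call.

```python
import secrets

class StatisticsParty:
    """Rough counterpart of modules 31 to 34 (sum statistic only)."""

    def __init__(self, first_data, participates):
        self.first_data = first_data        # identifier -> first data
        self.participates = participates    # identifier -> local filter result
        self._first_params = {}

    def generate_and_send(self):
        """Modules 31 and 32: build the message for the cooperating party."""
        message = []
        for ident, b in self.first_data.items():
            m0 = secrets.randbelow(2**32)   # first parameter (range assumed)
            m1 = m0 + b if self.participates[ident] else m0
            self._first_params[ident] = m0
            message.append((ident, m0, m1))
        return message

    def receive_and_finish(self, partner_value):
        """Modules 33 and 34: remove the first parameters from the partner value."""
        return partner_value - sum(self._first_params.values())

class CooperativeParty:
    """Rough counterpart of modules 41 to 44."""

    def __init__(self, participates):
        self.participates = participates    # identifier -> local filter result

    def handle(self, message):
        """Receive, select per identifier, accumulate, and return the value."""
        return sum(m1 if self.participates[ident] else m0
                   for ident, m0, m1 in message)

b_side = StatisticsParty(
    {"1234567": 3, "2345678": 7, "3456789": 6},
    {"1234567": False, "2345678": True, "3456789": True},
)
a_side = CooperativeParty({"1234567": False, "2345678": True, "3456789": True})
result = b_side.receive_and_finish(a_side.handle(b_side.generate_and_send()))
```

With the example data, `result` is 13, the same value obtained in step 210.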
For convenience of description, the above apparatus is described in terms of separate functional modules. Of course, when implementing one or more embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware.
The steps in the flows of the above method embodiments are not limited to the order shown in the flowcharts. In addition, each step may be implemented in software, hardware, or a combination of the two; for example, a person skilled in the art may implement it as software code, that is, as computer-executable instructions that realize the logical function corresponding to the step. When implemented in software, the executable instructions may be stored in a memory and executed by a processor in the device.
For example, corresponding to the above method, one or more embodiments of this specification also provide a data statistics device for performing data statistics on the combined data of a local data party and a cooperating data party. The local data party has multiple first data for which a statistical value is to be computed, the multiple first data correspond to different data identifiers, and the cooperating data party has multiple second data corresponding to those data identifiers. The device may include a processor, a memory, and computer instructions stored in the memory and executable on the processor; by executing the instructions, the processor implements the following steps:
generating a first parameter and a second parameter for each data identifier; if the first data corresponding to the data identifier does not participate in the statistic, the second parameter equals the first parameter; otherwise, the second parameter is calculated from the first parameter and the first data;
sending each data identifier, together with the first parameter and second parameter corresponding to that data identifier, to the cooperating data party;
receiving the cooperating-party computed value returned by the cooperating data party, this value being obtained by the cooperating data party from the first or second parameters it selected; if the second data corresponding to a data identifier participates in the statistic, the cooperating data party selects the second parameter; otherwise, it selects the first parameter;
removing the computed value of the first parameters from the cooperating-party computed value to obtain the statistical value.
For example, corresponding to the above method, one or more embodiments of this specification further provide a data statistics device for performing data statistics between a local data party and a statistics data party. The statistics data party has multiple first data for which a statistical value is to be computed, the multiple first data correspond to different data identifiers, and the local data party has second data corresponding to the same data identifiers. The device may include a processor, a memory, and computer instructions stored in the memory and executable on the processor; by executing the instructions, the processor implements the following steps:
receiving the data identifiers sent by the statistics data party, together with the first parameter and second parameter corresponding to each data identifier; when the first data corresponding to a data identifier participates in the statistic, the second parameter is calculated from the first parameter and the first data; otherwise, the second parameter equals the first parameter;
if the second data corresponding to a data identifier is data that participates in the statistic locally, selecting the second parameter corresponding to that data identifier; otherwise, selecting the first parameter corresponding to that data identifier;
performing a statistical calculation on the selected first and second parameters to obtain the cooperating-party computed value;
sending the cooperating-party computed value to the statistics data party, so that the statistics data party removes the computed value of the first parameters from the cooperating-party computed value and obtains the statistical value.
The apparatuses or modules set out in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementing device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
Those skilled in the art will understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
It should also be noted that the terms "include", "comprise" and any of their variants are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the server-side device embodiment is substantially similar to the method embodiment, it is described relatively briefly, and the relevant parts of the method embodiment description may be consulted.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In certain implementations, multitasking and parallel processing are also possible or may be advantageous.
The above are merely preferred embodiments of one or more embodiments of this specification and are not intended to limit the present disclosure. Any modification, equivalent replacement or improvement made within the spirit and principles of the present disclosure shall fall within its scope of protection.

Claims (12)

  1. A data statistics method, applied to performing data statistics on the combined data of a local data party and a cooperating data party, wherein the local data party has multiple first data for which a statistical value is to be computed, the multiple first data correspond to different data identifiers, and the cooperating data party has multiple second data corresponding to the data identifiers, the method comprising:
    generating a first parameter and a second parameter for each data identifier; if the first data corresponding to the data identifier does not participate in the statistic, the second parameter equals the first parameter; otherwise, the second parameter is calculated from the first parameter and the first data;
    sending each data identifier, together with the first parameter and second parameter corresponding to that data identifier, to the cooperating data party;
    receiving a cooperating-party computed value returned by the cooperating data party, the cooperating-party computed value being obtained by the cooperating data party from the first or second parameters it selected, wherein, if the second data corresponding to a data identifier participates in the statistic, the cooperating data party selects the second parameter, and otherwise the cooperating data party selects the first parameter;
    removing the computed value of the first parameters from the cooperating-party computed value to obtain the statistical value.
  2. The method according to claim 1, wherein
    the second parameter being calculated from the first parameter and the first data comprises:
    the second parameter being obtained by summing the first parameter and the first data;
    and removing the computed value of the first parameters from the cooperating-party computed value comprises:
    the cooperating-party computed value being an accumulated value obtained by the cooperating data party by accumulating the first or second parameters it selected, and subtracting the sum of the first parameters from the accumulated value.
  3. The method according to claim 2, wherein
    the second parameter being obtained by summing the first parameter and the first data comprises:
    if the first data corresponding to the data identifier satisfies the data filter condition used to determine which data participates in the statistic, then, when the statistical value is the sum of multiple first data, the first parameter is a random number and the second parameter is the sum of the random number and the first data.
  4. The method according to claim 2, wherein
    the second parameter being obtained by summing the first parameter and the first data comprises:
    if the first data corresponding to the data identifier satisfies the data filter condition used to determine which data participates in the statistic, then, when the statistical value is the average of multiple first data, the second parameter is the first parameter plus one half of the first data.
  5. A data statistics method for performing data statistics between a local data party and a statistics data party, wherein the statistics data party has multiple first data for which a statistical value is to be computed, the multiple first data correspond to different data identifiers, and the local data party has second data corresponding to the same data identifiers, the method comprising:
    receiving the data identifiers sent by the statistics data party, together with the first parameter and second parameter corresponding to each data identifier, wherein, when the first data corresponding to a data identifier participates in the statistic, the second parameter is calculated from the first parameter and the first data, and otherwise the second parameter equals the first parameter;
    if the second data corresponding to a data identifier is data that participates in the statistic locally, selecting the second parameter corresponding to that data identifier; otherwise, selecting the first parameter corresponding to that data identifier;
    performing a statistical calculation on the selected first and second parameters to obtain a cooperating-party computed value;
    sending the cooperating-party computed value to the statistics data party, so that the statistics data party removes the computed value of the first parameters from the cooperating-party computed value and obtains the statistical value.
  6. A data statistics apparatus for performing data statistics on the combined data of a local data party and a cooperating data party, wherein the local data party has multiple first data for which a statistical value is to be computed, the multiple first data correspond to different data identifiers, and the cooperating data party has multiple second data corresponding to the data identifiers, the apparatus comprising:
    a parameter generation module, configured to generate a first parameter and a second parameter for each data identifier; if the first data corresponding to the data identifier does not participate in the statistic, the second parameter equals the first parameter; otherwise, the second parameter is calculated from the first parameter and the first data;
    a data sending module, configured to send each data identifier, together with the first parameter and second parameter corresponding to that data identifier, to the cooperating data party;
    a data receiving module, configured to receive a cooperating-party computed value returned by the cooperating data party, the cooperating-party computed value being obtained by the cooperating data party from the first or second parameters it selected, wherein, if the second data corresponding to a data identifier participates in the statistic, the cooperating data party selects the second parameter, and otherwise the cooperating data party selects the first parameter;
    a statistical processing module, configured to remove the computed value of the first parameters from the cooperating-party computed value to obtain the statistical value.
  7. The apparatus according to claim 6, wherein:
    the parameter generation module, when calculating the second parameter from the first parameter and the first data, is specifically configured to obtain the second parameter by summing the first parameter and the first data;
    the statistical processing module, when removing the calculated value of each first parameter from the partner calculated value, is specifically configured to subtract the sum of the first parameters from an accumulated value, the accumulated value being obtained by the cooperative data party by accumulating the selected first parameters or second parameters.
  8. The apparatus according to claim 7, wherein:
    the parameter generation module, when obtaining the second parameter by summing the first parameter and the first data, is specifically configured so that: if the first data corresponding to the data identifier satisfies a data filtering condition used to determine the data participating in the statistics, then, when the statistical value is the sum of the plurality of pieces of first data, the first parameter is a random number and the second parameter is the sum of the random number and the first data.
  9. The apparatus according to claim 7, wherein:
    the parameter generation module, when obtaining the second parameter by summing the first parameter and the first data, is specifically configured so that: if the first data corresponding to the data identifier satisfies a data filtering condition used to determine the data participating in the statistics, then, when the statistical value is the average of the plurality of pieces of first data, the second parameter is the first parameter plus one half of the first data.
  10. A data statistics apparatus, configured to perform data statistics between a local data party and a statistical data party, wherein the statistical data party has a plurality of pieces of first data for which a statistical value is to be calculated, the plurality of pieces of first data correspond to different data identifiers respectively, and the local data party has second data corresponding to the same data identifiers; the apparatus comprises:
    a parameter receiving module, configured to receive a data identifier sent by the statistical data party, together with a first parameter and a second parameter corresponding to the data identifier, wherein when the first data corresponding to the data identifier participates in the data statistics, the second parameter is calculated from the first parameter and the first data; otherwise, the second parameter is equal to the first parameter;
    a parameter selection module, configured to select the second parameter corresponding to the data identifier if the second data corresponding to the data identifier is data that locally participates in the data statistics, and otherwise to select the first parameter corresponding to the data identifier;
    a statistical calculation module, configured to perform a statistical calculation on the selected first parameters and second parameters to obtain a partner calculated value;
    a value sending module, configured to send the partner calculated value to the statistical data party, so that the statistical data party removes the calculated value of each first parameter from the partner calculated value to obtain the statistical value.
  11. A data statistics device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the instructions, implements the following steps:
    generating a first parameter and a second parameter for each data identifier, wherein if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter; otherwise, the second parameter is calculated from the first parameter and the first data;
    sending each data identifier, together with the first parameter and the second parameter corresponding to the data identifier, to a cooperative data party;
    receiving a partner calculated value returned by the cooperative data party, wherein the partner calculated value is obtained by the cooperative data party from the selected first parameters or second parameters: if the second data corresponding to the data identifier participates in the data statistics, the cooperative data party selects the second parameter; otherwise, the cooperative data party selects the first parameter;
    removing the calculated value of each first parameter from the partner calculated value to obtain the statistical value.
  12. A data statistics device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the instructions, implements the following steps:
    receiving a data identifier sent by a statistical data party, together with a first parameter and a second parameter corresponding to the data identifier, wherein when the first data corresponding to the data identifier participates in the data statistics, the second parameter is calculated from the first parameter and the first data; otherwise, the second parameter is equal to the first parameter;
    if the second data corresponding to the data identifier is data that locally participates in the data statistics, selecting the second parameter corresponding to the data identifier; otherwise, selecting the first parameter corresponding to the data identifier;
    performing a statistical calculation on the selected first parameters and second parameters to obtain a partner calculated value;
    sending the partner calculated value to the statistical data party, so that the statistical data party removes the calculated value of each first parameter from the partner calculated value to obtain the statistical value.
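
The sum-statistic variant described in the claims above (a random first parameter, a second parameter equal to the first parameter plus the first data, accumulation by the cooperative data party, and subtraction of the first parameters by the statistical data party) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the applicant's implementation: the function names, the range of the random numbers, the sample data, and both filtering conditions are hypothetical and do not come from the claims.

```python
import secrets


def generate_parameters(first_data, participates):
    """Statistical data party: build (first_param, second_param) per identifier.

    `participates` is a hypothetical filter deciding whether a piece of
    first data takes part in the statistics.
    """
    params = {}
    for data_id, value in first_data.items():
        # Random first parameter; the range is an assumption for this sketch.
        first_param = secrets.randbelow(10**9)
        if participates(value):
            second_param = first_param + value  # first parameter plus first data
        else:
            second_param = first_param  # second parameter equals first parameter
        params[data_id] = (first_param, second_param)
    return params


def partner_calculate(params, second_data, participates):
    """Cooperative data party: pick one parameter per identifier and accumulate."""
    total = 0
    for data_id, (first_param, second_param) in params.items():
        if data_id in second_data and participates(second_data[data_id]):
            total += second_param  # second data participates
        else:
            total += first_param  # second data does not participate
    return total


def recover_statistic(partner_value, params):
    """Statistical data party: subtract the sum of all first parameters."""
    return partner_value - sum(first for first, _ in params.values())


def run_example():
    # Hypothetical data: amounts held by one party, cities held by the other.
    first_data = {"u1": 100, "u2": 250, "u3": 40}
    second_data = {"u1": "Hangzhou", "u2": "Beijing", "u3": "Hangzhou"}

    params = generate_parameters(first_data, lambda amount: amount >= 50)
    partner_value = partner_calculate(
        params, second_data, lambda city: city == "Hangzhou"
    )
    return recover_statistic(partner_value, params)


if __name__ == "__main__":
    print(run_example())
```

In this sketch only `u1` satisfies both filtering conditions, so the recovered sum is that record's amount. The cooperative data party sees only two numbers per identifier, and the statistical data party sees only the accumulated value.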
PCT/CN2018/105482 2017-10-31 2018-09-13 Data statistics method and apparatus WO2019085656A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711046886.3 2017-10-31
CN201711046886.3A CN109726363B (en) 2017-10-31 2017-10-31 Data statistical method and device

Publications (1)

Publication Number Publication Date
WO2019085656A1 (en) 2019-05-09

Family

ID=66294427

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/105482 WO2019085656A1 (en) 2017-10-31 2018-09-13 Data statistics method and apparatus

Country Status (3)

Country Link
CN (1) CN109726363B (en)
TW (1) TWI689828B (en)
WO (1) WO2019085656A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116108494B (en) * 2023-04-12 2023-06-20 蓝象智联(杭州)科技有限公司 Multiparty joint data statistics method for protecting privacy

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102594889A (en) * 2012-02-17 2012-07-18 广东电网公司电力科学研究院 Data-call-based data synchronization and analysis system
CN105023086A (en) * 2015-01-07 2015-11-04 泰华智慧产业集团股份有限公司 Digital city management data sharing system based on cloud calculation
US20160070795A1 (en) * 2014-05-21 2016-03-10 Knowlege Synthesis Systems and method for searching and analyzing big data
CN105430055A (en) * 2015-11-02 2016-03-23 武大吉奥信息技术有限公司 Large data exchange system based on distributed and multi-level junction
CN107291764A (en) * 2016-04-05 2017-10-24 中兴通讯股份有限公司 A kind of big data exchange method and device, system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI370660B (en) * 2009-02-24 2012-08-11 Ind Tech Res Inst Method and system for coding/decoding, and encryption/decryption method used therein
EP2507708B1 (en) * 2009-12-04 2019-03-27 Cryptography Research, Inc. Verifiable, leak-resistant encryption and decryption
US9202078B2 (en) * 2011-05-27 2015-12-01 International Business Machines Corporation Data perturbation and anonymization using one way hash
SG10201502401XA (en) * 2015-03-26 2016-10-28 Huawei Internat Pte Ltd Method of obfuscating data

Also Published As

Publication number Publication date
TWI689828B (en) 2020-04-01
CN109726363A (en) 2019-05-07
CN109726363B (en) 2020-05-29
TW201918910A (en) 2019-05-16

Similar Documents

Publication Publication Date Title
WO2019085650A1 (en) Data statistics method and apparatus
WO2023045503A1 (en) Feature processing method and device based on differential privacy
WO2021000575A1 (en) Data interaction method and apparatus, and electronic device
WO2019085677A1 (en) Garbled circuit-based data calculation method, apparatus, and device
TW202103154A (en) Data processing method and apparatus, and electronic device
TWI706362B (en) Data processing method, device and server based on blockchain
WO2021000574A1 (en) Data interaction method and apparatus, server, and electronic device
WO2019085656A1 (en) Data statistics method and apparatus
US11194824B2 (en) Providing oblivious data transfer between computing devices
US20200364582A1 (en) Performing data processing based on decision tree
WO2019085665A1 (en) Data statistics method and apparatus
US10924273B2 (en) Data exchange for multi-party computation
Scheindlin Judicial Fact-Finding and the Trial Court Judge
CN112232639A (en) Statistical method and device and electronic equipment
CN110851487A (en) Data statistical method and device
TWI706370B (en) Data statistics method and device
CN115758441A (en) Method and device for determining private data intersection of multiple parties
CN117494150A (en) Data processing method and device, electronic equipment and storage medium
Cheng-Guo et al. The interval Shapley value of an M/M/1 service system
CN114723441A (en) Method, device and equipment for constraining behaviors of demander and participator
CN116563009A (en) Multiparty loan query method, device, equipment, medium and program product
JP2017059975A (en) Trust calculation device, apparatus and program for calculating trust

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18873322

Country of ref document: EP

Kind code of ref document: A1