WO2019085665A1 - Data statistics method and apparatus - Google Patents

Info

Publication number
WO2019085665A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
identifier
party
local
extreme value
Prior art date
Application number
PCT/CN2018/105938
Other languages
English (en)
French (fr)
Inventor
王华忠
Original Assignee
阿里巴巴集团控股有限公司
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2019085665A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40: Network security protocols
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08: Insurance
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08: Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords

Definitions

  • The present disclosure relates to the field of network technologies, and in particular to a data statistics method and apparatus.
  • The present disclosure provides a data statistics method and apparatus that implement two-party secure computation while protecting the data privacy of both data owners.
  • A data statistics method is provided. The method performs data statistics by combining data of a local data party and a cooperative data party, where the local data party has a plurality of first data whose extreme value is to be obtained, the plurality of first data correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. The method includes:
  • the extreme value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of the data identifiers, among those corresponding to the plurality of first data, that the cooperative data party selects because they correspond to second data participating in the data statistics;
  • A data statistics method is provided. The method performs data statistics by combining data of a local data party and a statistical data party, where the statistical data party has a plurality of first data whose extreme value is to be obtained, the plurality of first data correspond to different data identifiers, and the local data party has a plurality of second data corresponding to the data identifiers. The method includes:
  • receiving data identifiers and sorting numbers sent by the statistical data party, where the data identifiers correspond to the plurality of first data with which the statistical data party participates in the data statistics, and the sorting numbers identify the sorting positions among the plurality of first data;
  • sending the extreme value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme value sorting number, the corresponding first data as the extreme value.
  • A data statistics method is provided. The method performs data statistics between a local data party and a cooperative data party, where the local data party stores first data corresponding to data identifiers and the cooperative data party stores second data corresponding to the same data identifiers; the method is applied to obtain an extreme value among a plurality of first data. The method includes:
  • performing local private key processing, according to a key exchange protocol, on the data identifiers corresponding to the plurality of first data that locally participate in the data statistics, to obtain a plurality of local processing identifiers;
  • obtaining, according to the extreme value sorting number, the corresponding first data as the extreme value.
  • A fourth aspect provides a data statistics method. The method performs data statistics between a local data party and a statistical data party, where the statistical data party has first data corresponding to data identifiers and the local data party stores second data corresponding to the same data identifiers; the method is applied to obtain an extreme value among a plurality of first data. The method includes:
  • the peer processing identifier is obtained by the statistical data party performing peer private key processing, according to the key exchange protocol, on the data identifier of first data participating in the data statistics;
  • the sorting number identifies the sorting position of the first data;
  • performing local private key processing, according to the key exchange protocol, on the data identifiers corresponding to the plurality of second data that locally participate in the data statistics, to obtain a plurality of local processing identifiers;
  • A fifth aspect provides a data statistics apparatus that performs data statistics by combining data of a local data party and a cooperative data party, where the local data party has a plurality of first data whose extreme value is to be obtained, the plurality of first data correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. The apparatus includes:
  • a data sending module, configured to send the data identifiers and sorting numbers corresponding to the plurality of first data to the cooperative data party, where the sorting numbers identify the sorting positions among the plurality of first data;
  • a sequence number receiving module, configured to receive an extreme value sorting number returned by the cooperative data party, where the extreme value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of the data identifiers, among those corresponding to the plurality of first data, that the cooperative data party selects because they correspond to second data participating in the data statistics;
  • a data determining module, configured to obtain, according to the extreme value sorting number, the first data of the local data party corresponding to the extreme value sorting number.
  • A data statistics apparatus is provided that performs data statistics by combining data of a local data party and a statistical data party, where the statistical data party has a plurality of first data whose extreme value is to be obtained, the plurality of first data correspond to different data identifiers, and the local data party has a plurality of second data corresponding to the data identifiers. The apparatus includes:
  • a data receiving module, configured to receive data identifiers and sorting numbers sent by the statistical data party, where the data identifiers correspond to the plurality of first data with which the statistical data party participates in the data statistics, and the sorting numbers identify the sorting positions among the plurality of first data;
  • an intersection determining module, configured to determine an identifier intersection according to the data identifiers corresponding to the plurality of second data with which the local data party participates in the data statistics and the data identifiers of the plurality of first data;
  • a sequence number determining module, configured to obtain an extreme value sorting number from the sorting numbers corresponding to the data identifiers in the identifier intersection;
  • a sequence number sending module, configured to send the extreme value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme value sorting number, the corresponding first data as the extreme value.
  • A data statistics apparatus is provided that performs data statistics between a local data party and a cooperative data party, where the local data party stores first data corresponding to data identifiers and the cooperative data party stores second data corresponding to the same data identifiers; the apparatus is applied to obtain an extreme value among a plurality of first data. The apparatus includes:
  • a private key processing module, configured to perform local private key processing, according to a key exchange protocol, on the data identifiers corresponding to the plurality of first data that locally participate in the data statistics, to obtain a plurality of local processing identifiers;
  • a sequence number sending module, configured to send the local processing identifiers and sorting numbers corresponding to the plurality of first data to the cooperative data party, so that the cooperative data party performs peer private key processing on the local processing identifiers to generate first key processing identifiers and stores the correspondence between the first key processing identifiers and the sorting numbers, where the sorting numbers identify the sorting positions among the plurality of first data;
  • an identifier receiving module, configured to receive peer processing identifiers sent by the cooperative data party, where the peer processing identifiers are obtained by the cooperative data party performing peer private key processing on the data identifiers of the second data participating in the data statistics;
  • a key cooperation module, configured to perform local private key processing on the peer processing identifiers to generate second key processing identifiers, and to send the second key processing identifiers to the cooperative data party;
  • a sequence number receiving module, configured to receive an extreme value sorting number sent by the cooperative data party, where the extreme value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers;
  • an extreme value determining module, configured to obtain, according to the extreme value sorting number, the corresponding first data as the extreme value.
  • A data statistics apparatus is provided that performs data statistics between a local data party and a statistical data party, where the statistical data party has first data corresponding to data identifiers and the local data party has second data corresponding to the same data identifiers; the apparatus is applied to obtain an extreme value among a plurality of first data. The apparatus includes:
  • a data receiving module, configured to receive peer processing identifiers and sorting numbers sent by the statistical data party, where each peer processing identifier is obtained by the statistical data party performing peer private key processing on a data identifier according to the key exchange protocol, the data identifier corresponds to first data participating in the data statistics, and the sorting number identifies the sorting position of the first data;
  • a key processing module, configured to perform a local private key operation on each peer processing identifier according to the key exchange protocol, generate a first key processing identifier, and store the correspondence between the first key processing identifier and the sorting number;
  • an identifier processing module, configured to perform local private key processing, according to the key exchange protocol, on the data identifiers corresponding to the plurality of second data that locally participate in the data statistics, to obtain a plurality of local processing identifiers;
  • a cooperation processing module, configured to send the local processing identifiers to the statistical data party and to receive second key processing identifiers returned by the statistical data party, where the second key processing identifiers are obtained by the statistical data party performing peer private key processing on the local processing identifiers;
  • an extreme value obtaining module, configured to obtain the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers, and to determine the extreme value sorting number among those sorting numbers;
  • an extreme value sending module, configured to send the extreme value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme value sorting number, the corresponding first data as the extreme value.
  • A data statistics device is provided, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor executes the instructions to implement the following:
  • the extreme value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of the data identifiers, among those corresponding to the plurality of first data, that the cooperative data party selects because they correspond to second data participating in the data statistics;
  • In the data statistics method and apparatus of one or more embodiments of this specification, a sorting number is sent to the peer during extreme value statistics, so that only the sorting number is exposed to the peer. This achieves the extreme value statistics while effectively protecting the data security of both parties involved, realizing two-party secure computation while protecting the data privacy of both data owners.
  • FIG. 1 is a flowchart of a data statistics method provided by one or more embodiments of the present specification
  • FIG. 2 is a flowchart of a data statistics method provided by one or more embodiments of the present specification
  • FIG. 3 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of the present disclosure
  • FIG. 4 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of the present disclosure.
  • FIG. 6 is a schematic structural diagram of a data statistics apparatus provided by one or more embodiments of the present specification.
  • Data can be stored in a vertical mode; that is, multiple data owners hold different attribute information about the same entity.
  • For example, the same natural person's car insurance score is held by one institution, while that person's claim amount is held by another institution.
  • With this vertical mode of data storage, some statistical calculations involve multiple data owners, which need to cooperate to complete the data statistics.
  • During this cooperation, each party's confidential data must not be disclosed.
  • Data source A can be a data organization, and data source B can be an insurance institution. These two data sources can store different information about the same vehicle owner.
  • Data source A: assume that data source A stores the car insurance score of each vehicle owner. The car insurance score can be a score obtained by profiling and risk analysis of the vehicle owner; the higher the car insurance score, the lower the risk.
  • Table 1: the data structure in which data source A stores car insurance scores is as follows:
  • Data source B: assume that data source B stores the claim information of each vehicle owner, which may include the number of claims, the claim amount, and the like.
  • Table 2: the data structure of the vehicle owner information stored by data source B is as follows:
  • Extreme value statistics can be completed based on the data of data source A and data source B together.
  • For example, suppose a statistical job requires "the maximum insurance score among female users with more than 5 claims". The "maximum insurance score" part means this is an extreme value statistic over the data in data source A, where the extreme value is a maximum or a minimum. The "female users with more than 5 claims" part means the data in data source B serves as the filtering condition: the maximum insurance score is taken over the users that satisfy the filtering condition.
  • The maximum or minimum value under such a filtering condition may be referred to as a "conditional extreme value".
  • idcard_no   Gender   Times   Amount
    1234567     Male     3       5000
    2345678     Female   7       23000
    3456789     Female   6       16000
  • Data source A, which holds the "insurance score" data on which the statistic is computed, may be referred to as the statistical data party, and the other data source B may be referred to as the cooperative data party.
  • The two data sources store different information about the same vehicle owner. The vehicle owner information stored in data source A that participates in the current extreme value statistics (for example, the insurance score) may be referred to as the first data, and the vehicle owner information stored in data source B (for example, gender, number of claims, and claim amount) may be referred to as the second data.
  • The ID number idcard_no included in both data source A and data source B may be referred to as the data identifier; that is, data source A stores the first data corresponding to the data identifier, and data source B stores the second data corresponding to the data identifier.
  • FIG. 1 illustrates the flow of a data statistics method. As shown in FIG. 1, the method may include:
  • In step 100, the statistical data party sends the data identifiers and sorting numbers corresponding to the plurality of first data to the cooperative data party; the sorting numbers identify the sorting positions among the plurality of first data.
  • The plurality of first data in this step are the data with which the statistical data party participates in the data statistics, and may be selected according to a data filtering condition on the statistical data party's side.
  • The statistical data party may sort the plurality of first data participating in the statistics by size and determine the sorting number of each first data from the sorting result.
  • In step 102, the cooperative data party determines the identifier intersection according to the data identifiers corresponding to the plurality of second data with which it participates in the data statistics and the data identifiers of the plurality of first data.
  • The cooperative data party can select, according to a local filtering condition, the second data that participate in the current data statistics and obtain the corresponding data identifiers.
  • The intersection of the two sets of data identifiers is referred to as the identifier intersection, and it may include at least one data identifier. For each data identifier in the intersection, the corresponding first data is data with which the statistical data party participates in the data statistics, and the corresponding second data is data with which the cooperative data party participates in the data statistics.
  • In step 104, the cooperative data party obtains the extreme value sorting number from the sorting numbers corresponding to the data identifiers in the identifier intersection.
  • For example, the cooperative data party can compare the sorting numbers corresponding to the data identifiers in the identifier intersection and obtain the extreme value sorting number, such as the largest or the smallest sorting number.
  • In step 106, the cooperative data party sends the extreme value sorting number to the statistical data party.
  • In step 108, the statistical data party obtains, according to the extreme value sorting number, the local first data corresponding to that sorting number.
  • In the data statistics method of this example, sorting numbers are sent to the peer so that the peer can return the maximum or minimum sorting number after applying its own data filtering. This achieves the extreme value statistics without exposing the real data values of the statistical data party, protecting the data privacy of both data owners.
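  • The flow of steps 100 to 108 can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: both parties are simulated in one process, the function names are hypothetical, and the pairing of the scores 501 and 530 with particular identifiers is an assumption made only for the example.

```python
def assign_sort_numbers(first_data):
    """Statistical data party: rank identifiers by value, ascending, starting at 1."""
    ordered = sorted(first_data, key=first_data.get)
    return {ident: n for n, ident in enumerate(ordered, start=1)}


def extreme_sort_number(received, local_ids, want_max=True):
    """Cooperative data party: intersect identifiers, pick the extreme sorting number."""
    numbers = [n for ident, n in received.items() if ident in local_ids]
    if not numbers:
        return None  # empty identifier intersection
    return max(numbers) if want_max else min(numbers)


def first_data_for(first_data, sort_numbers, extreme):
    """Statistical data party: map the returned sorting number back to its first data."""
    for ident, n in sort_numbers.items():
        if n == extreme:
            return first_data[ident]
    return None


# Hypothetical inputs: which identifier holds 501 and which holds 530 is assumed.
scores = {"1234567": 490, "2345678": 501, "3456789": 530}
filtered_ids = {"2345678", "3456789"}  # identifiers passing the cooperative party's filter

sort_numbers = assign_sort_numbers(scores)                 # step 100
extreme = extreme_sort_number(sort_numbers, filtered_ids)  # steps 102 to 104
result = first_data_for(scores, sort_numbers, extreme)     # steps 106 to 108
```

  • Only `sort_numbers` crosses from the statistical data party to the cooperative data party, and only `extreme` travels back; the score values themselves stay local. In this plain variant the identifiers are still sent in the clear.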
  • The data identifiers may be encrypted according to a key exchange protocol when they are transmitted between the local data party and the cooperative data party.
  • For example, the statistical data party can process a data identifier with its local private key and then send it to the peer, which continues by processing it with the peer's private key.
  • The statistical data party may also receive a data identifier that the cooperative data party has processed with the peer private key, continue by processing it with the local private key, and return it to the cooperative data party. After both parties have processed the data identifiers under the key exchange protocol, the data identifiers are not exposed in plaintext, which provides security protection.
  • FIG. 2 illustrates the flow of a data statistics method that, based on Tables 3 and 4, computes the maximum insurance score among female users with more than 5 claims. This example combines extreme value statistics with key exchange processing.
  • The method may include:
  • In step 200, the statistical data party generates, for the plurality of first data that locally participate in the data statistics, a sorting number corresponding to each first data.
  • In this example, data source A is the statistical data party, which stores the car insurance scores whose extreme value is to be determined.
  • The column holding the scores may be referred to as the statistical column, and each car insurance score may be referred to as a first data.
  • Data source A may compute the maximum over the statistical column score in Table 3, that is, the maximum of the three car insurance scores 490, 501, and 530. These three scores can be referred to as "three first data that locally participate in the data statistics".
  • Alternatively, data source A may compute the extreme value over only part of the car insurance scores, selected by a predetermined data filtering condition. For example, it may find the maximum of the two car insurance scores 501 and 530.
  • Data source A may sort the plurality of first data that locally participate in the data statistics in order of size and generate the sorting number corresponding to each first data according to the sorting result.
  • For example, the three car insurance scores in Table 3 are arranged in ascending order as 490 < 501 < 530. The sorting numbers of the car insurance scores are shown in Table 5 below; a sorting number identifies the sorting position among the first data.
  • The sorting numbers can be generated online or offline. Generating the sorting number of each first data in the statistical column offline reduces the workload of online statistical calculation and improves its efficiency.
  • In step 202, the statistical data party performs local private key processing, according to the key exchange protocol, on the data identifiers corresponding to the plurality of first data participating in the data statistics, obtaining a plurality of first processing identifiers.
  • For example, the ID number idcard_no corresponding to a car insurance score in Table 3 may be referred to as the data identifier corresponding to the first data.
  • The key exchange protocol can be, for example, Diffie-Hellman key exchange, referred to as "D-H".
  • For example, idcard_no can be hashed to obtain $H(k)$.
  • Data source A can generate its own private key $\alpha$ under the key exchange protocol and perform local private key processing.
  • The processing can be an exponentiation of $H(k)$ by $\alpha$, giving $H(k)^{\alpha}$, which can be referred to as a first processing identifier.
  • After the processing in this step, data source A has the sorting number and the first processing identifier for each first data participating in the current extreme value statistics. As shown in Table 6 below, $H(k)^{\alpha}$, that is, Hash(idcard_no)$^{\alpha}$, is the first processing identifier, and N is the sorting number. Taking the first vehicle owner in Table 3 as an example, the owner's car insurance score is 490, the corresponding sorting number is 1, and the data identifier corresponding to the score 490 is 1234567. After the data identifier is hashed and processed with the local private key, the first processing identifier $H(1234567)^{\alpha}$ is obtained.
  • Table 6: first processing identifiers and sorting numbers
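  • As a rough illustration of how one row of Table 6 could be produced, the sketch below hashes an identifier and raises it to a private exponent. The modulus `P`, the use of SHA-256, and the helper names are assumptions for illustration; the text does not specify a group or hash function, and the toy modulus here is not a vetted security parameter.

```python
import hashlib
import secrets

# Toy modulus (the Mersenne prime 2**127 - 1). Assumed for illustration only.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Compute H(k): hash the identifier and map it into the range [2, P - 1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 2) + 2


def blind(value: int, private_key: int) -> int:
    """Local private key processing: raise the value to the private exponent mod P."""
    return pow(value, private_key, P)


alpha = secrets.randbelow(P - 3) + 2  # data source A's private key, in [2, P - 2]

# First row of Table 6: identifier 1234567 has sorting number 1.
table_6 = [(blind(hash_to_group("1234567"), alpha), 1)]
```

  • Data source B sees only the blinded value and the sorting number N; recovering idcard_no from the blinded value would require knowing `alpha`.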
  • In step 204, the statistical data party sends the first processing identifiers and sorting numbers corresponding to the plurality of first data to the cooperative data party.
  • For example, data source A can send the data in Table 6 to data source B.
  • In step 206, the cooperative data party performs a local private key operation on each first processing identifier according to the key exchange protocol, generates a first key processing identifier, and stores the correspondence between the first key processing identifier and the sorting number.
  • For example, data source B may generate its local private key $\beta$ according to the key exchange protocol and use $\beta$ to perform the local private key operation, that is, an exponentiation, on the first processing identifier $H(k)^{\alpha}$, yielding $H(k)^{\alpha\beta}$.
  • The value $H(k)^{\alpha\beta}$ may be referred to as a first key processing identifier.
  • Table 7: first key processing identifiers and sorting numbers
  • In step 208, the cooperative data party performs local private key processing, according to the key exchange protocol, on the data identifiers corresponding to the plurality of second data that locally participate in the data statistics, obtaining a plurality of second processing identifiers.
  • Data source B can also determine the plurality of second data that locally participate in the data statistics. These may be all of its data, or the data obtained by local filtering according to a predetermined filtering condition.
  • For example, if the predetermined filtering condition is "female users with more than 5 claims", filtering Table 4 by this condition selects the last two rows of Table 4 to participate in the statistics.
  • The entries "female, 7, 23000" and "female, 6, 16000" in those rows may be referred to as the second data, and the data identifiers corresponding to the two second data are 2345678 and 3456789, respectively.
  • Data source B can hash these data identifiers to obtain $H(k)$ and then, according to the key exchange protocol, raise $H(k)$ to the power $\beta$, where $\beta$ is the private key of data source B, obtaining $H(k)^{\beta}$.
  • This $H(k)^{\beta}$ can be referred to as a second processing identifier, as shown in Table 8 below:
  • Hash(idcard_no)$^{\beta}$
    $H(2345678)^{\beta}$
    $H(3456789)^{\beta}$
  • In step 210, the cooperative data party sends the second processing identifiers to the statistical data party.
  • For example, data source B can send the data in Table 8 above to data source A.
  • In step 212, the statistical data party performs local private key processing on each second processing identifier to generate a second key processing identifier.
  • After receiving Hash(idcard_no)$^{\beta}$ from Table 8, data source A can apply its local private key $\alpha$ to generate the second key processing identifier Hash(idcard_no)$^{\beta\alpha}$, as shown in Table 9 below.
  • In step 214, the statistical data party sends the second key processing identifiers to the cooperative data party.
  • In step 216, the cooperative data party obtains the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers, and determines the extreme value sorting number among those sorting numbers.
  • Data source B may take the intersection of the second key processing identifiers in Table 9 and the first key processing identifiers in Table 7. Equal values of Hash(idcard_no)$^{\alpha\beta}$ and Hash(idcard_no)$^{\beta\alpha}$ correspond to the same idcard_no, and that idcard_no represents a vehicle owner who satisfies both the filtering condition of the statistical data party and the filtering condition of the cooperative data party.
  • Combining the intersection with the correspondence between first key processing identifiers and sorting numbers in Table 7 gives the sorting number of each first key processing identifier in the intersection. Table 10 below lists the intersection and the corresponding sorting numbers.
  • The extreme value sorting number among the sorting numbers corresponding to the intersection can then be determined. For example, when the maximum insurance score is to be obtained, the extreme value sorting number is the largest sorting number. In this example, the extreme value sorting number is 3.
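  • The equality test in step 216 relies on exponentiation being commutative in the exponents. A short derivation, writing $\alpha$ and $\beta$ for the private keys of data source A and data source B and assuming the exponentiations are carried out modulo a prime $p$ (the modulus is an assumption here; the text only names a Diffie-Hellman-style key exchange):

```latex
\[
\begin{aligned}
\left(H(k)^{\alpha}\right)^{\beta} \bmod p
  &= H(k)^{\alpha\beta} \bmod p \\
  &= H(k)^{\beta\alpha} \bmod p
   = \left(H(k)^{\beta}\right)^{\alpha} \bmod p
\end{aligned}
\]
```

  • A first key processing identifier and a second key processing identifier derived from the same idcard_no are therefore equal, even though each party has only ever seen the other's identifiers in blinded form.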
  • In step 218, the cooperative data party sends the extreme value sorting number to the statistical data party.
  • In step 220, the statistical data party obtains, according to the extreme value sorting number, the corresponding first data as the extreme value.
  • In the data statistics method of this example, the local party's data is protected because only the sorting number is exposed to the peer, and the key exchange protocol protects the privacy of the fields used for filtering.
  • This scheme achieves the extreme value statistics while protecting the data security of both parties involved. For example, in the example above, the insurance institution cannot learn the specific insurance score of the vehicle owner with a given idcard_no, and the data organization cannot learn information such as the number of claims that vehicle owner has at the insurance institution.
  • To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus, applied to data statistics that combine the data of a local data party and a cooperative data party. The local data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers.
  • The apparatus may include: a data sending module 31, a sequence number receiving module 32, and a data determining module 33.
  • The data sending module 31 is configured to send the data identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, where the sorting numbers identify the sorted positions among the plurality of first data;
  • the sequence number receiving module 32 is configured to receive an extreme-value sorting number returned by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of those identifiers, selected from the data identifiers corresponding to the plurality of first data, that also correspond to second data with which the cooperative data party participates in the data statistics;
  • the data determining module 33 is configured to obtain, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.
  • To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus, applied to data statistics that combine the data of a local data party and a statistical data party. The statistical data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the local data party has a plurality of second data corresponding to the data identifiers.
  • The apparatus may include: a data receiving module 41, an intersection determining module 42, a sequence number determining module 43, and a sequence number sending module 44.
  • The data receiving module 41 is configured to receive the data identifiers and sorting numbers sent by the statistical data party, where the data identifiers are the identifiers corresponding to the plurality of first data with which the statistical data party participates in the data statistics, and the sorting numbers identify the sorted positions among the plurality of first data;
  • the intersection determining module 42 is configured to determine an identifier intersection according to the data identifiers corresponding to the plurality of second data with which the local data party participates in the data statistics and the data identifiers of the plurality of first data;
  • the sequence number determining module 43 is configured to obtain an extreme-value sorting number according to the sorting numbers corresponding to the data identifiers in the identifier intersection;
  • the sequence number sending module 44 is configured to send the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.
  • To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus, which may include: a private key processing module 51, a sequence number sending module 52, an identifier receiving module 53, a key cooperation module 54, a sequence number receiving module 55, and an extreme value determining module 56.
  • The private key processing module 51 is configured to perform local private key processing, according to a key exchange protocol, on the data identifiers respectively corresponding to the plurality of first data that participate in the data statistics locally, to obtain a plurality of first processing identifiers;
  • the sequence number sending module 52 is configured to send the first processing identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, so that the cooperative data party generates first key processing identifiers by performing peer private key processing on the first processing identifiers and stores the correspondence between the first key processing identifiers and the sorting numbers, where the sorting numbers identify the sorted positions among the plurality of first data;
  • the identifier receiving module 53 is configured to receive second processing identifiers sent by the cooperative data party, where the second processing identifiers are obtained by the cooperative data party performing peer private key processing on the data identifiers of the second data that participate in the data statistics;
  • the key cooperation module 54 is configured to perform local private key processing on the second processing identifiers to generate second key processing identifiers, and to send the second key processing identifiers to the cooperative data party;
  • the sequence number receiving module 55 is configured to receive an extreme-value sorting number sent by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers;
  • the extreme value determining module 56 is configured to obtain, according to the extreme-value sorting number, the corresponding first data as the extreme value.
  • In one example, the apparatus may further include:
  • a sequence number generating module, configured to sort the plurality of first data that participate in the data statistics locally by magnitude and, according to the sorting result, generate the sorting numbers respectively corresponding to the plurality of first data;
  • a data filtering module, configured to select, according to a predetermined data filter condition, the plurality of first data that participate in the data statistics locally.
  • To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus, which may include: a data receiving module 61, a key processing module 62, an identifier processing module 63, a cooperation processing module 64, an extreme value obtaining module 65, and an extreme value sending module 66.
  • The data receiving module 61 is configured to receive first processing identifiers and sorting numbers sent by the statistical data party, where the first processing identifiers are obtained by the statistical data party performing peer private key processing on data identifiers according to a key exchange protocol, the data identifiers correspond to the first data that participate in the data statistics, and the sorting numbers identify the sorted positions of the first data;
  • the key processing module 62 is configured to perform a local private key operation on the first processing identifiers according to the key exchange protocol to generate first key processing identifiers, and to store the correspondence between the first key processing identifiers and the sorting numbers;
  • the identifier processing module 63 is configured to perform local private key processing, according to the key exchange protocol, on the data identifiers respectively corresponding to the plurality of second data that participate in the data statistics locally, to obtain a plurality of second processing identifiers;
  • the cooperation processing module 64 is configured to send the second processing identifiers to the statistical data party and to receive second key processing identifiers returned by the statistical data party, where the second key processing identifiers are obtained by the statistical data party performing peer private key processing on the second processing identifiers;
  • the extreme value obtaining module 65 is configured to obtain the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers, and to determine the extreme-value sorting number among them;
  • the extreme value sending module 66 is configured to send the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.
  • The steps described above may be implemented in software, hardware, or a combination of the two. For example, a person skilled in the art may implement them as software code, namely computer-executable instructions capable of implementing the logical function corresponding to each step.
  • When implemented in software, the executable instructions can be stored in a memory and executed by a processor in the device.
  • For example, corresponding to the above method, one or more embodiments of this specification also provide a data statistics device, which may include a processor, a memory, and computer instructions stored in the memory and runnable on the processor; by executing the instructions, the processor implements the following steps:
  • sending the data identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, where the sorting numbers identify the sorted positions among the plurality of first data;
  • receiving an extreme-value sorting number returned by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of those identifiers, selected from the data identifiers corresponding to the plurality of first data, that also correspond to second data with which the cooperative data party participates in the data statistics;
  • obtaining, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.
  • The apparatuses or modules described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function.
  • A typical implementation device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • A person skilled in the art should understand that one or more embodiments of this specification can be provided as a method, a system, or a computer program product.
  • Accordingly, one or more embodiments of this specification can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware.
  • Moreover, one or more embodiments of this specification can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus; the instruction apparatus implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • One or more embodiments of this specification can be described in the general context of computer-executable instructions executed by a computer, such as program modules.
  • Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • One or more embodiments of this specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communication network.
  • In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Telephonic Communication Services (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

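Embodiments of this specification provide a data statistics method and apparatus. The method includes: sending, to a cooperative data party, the data identifiers and sorting numbers respectively corresponding to a plurality of first data that participate in the data statistics at the local end, where the sorting numbers identify the sorted positions among the plurality of first data; receiving an extreme-value sorting number returned by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection; and obtaining, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.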

Description

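Data statistics method and apparatus
Technical Field
The present disclosure relates to the field of network technologies, and in particular to a data statistics method and apparatus.
Background
In the era of big data there are a great many data silos. For example, the data of one natural person may be stored across different enterprises, and because of competition and user-privacy concerns the enterprises do not fully trust one another, which obstructs statistical work that requires data cooperation between enterprises. How to use the data held by both parties to complete certain statistical computations while fully protecting each enterprise's core data privacy, without leaking either party's private data, has become an urgent problem. At present there is no good solution.
Summary
In view of this, the present disclosure provides a data statistics method and apparatus that implement secure two-party computation while protecting the data privacy of the two data owners.
Specifically, one or more embodiments of this specification are implemented through the following technical solutions: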
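In a first aspect, a data statistics method is provided. The method is applied to data statistics that combine the data of a local data party and a cooperative data party. The local data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. The method includes:
sending the data identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, where the sorting numbers identify the sorted positions among the plurality of first data;
receiving an extreme-value sorting number returned by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of those identifiers, selected from the data identifiers corresponding to the plurality of first data, that also correspond to second data with which the cooperative data party participates in the data statistics;
obtaining, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.
In a second aspect, a data statistics method is provided. The method is applied to data statistics that combine the data of a local data party and a statistical data party. The statistical data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the local data party has a plurality of second data corresponding to the data identifiers. The method includes:
receiving data identifiers and sorting numbers sent by the statistical data party, where the data identifiers are the identifiers corresponding to the plurality of first data with which the statistical data party participates in the data statistics, and the sorting numbers identify the sorted positions among the plurality of first data;
determining an identifier intersection according to the data identifiers corresponding to the plurality of second data with which the local data party participates in the data statistics and the data identifiers of the plurality of first data;
obtaining an extreme-value sorting number according to the sorting numbers corresponding to the data identifiers in the identifier intersection;
sending the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.
In a third aspect, a data statistics method is provided. The method is used to perform data statistics between a local data party and a cooperative data party. The local data party stores first data corresponding to a data identifier, and the cooperative data party stores second data corresponding to the same data identifier. The method is applied to obtaining an extreme value among a plurality of first data, and includes:
performing local private key processing, according to a key exchange protocol, on the data identifiers respectively corresponding to the plurality of first data that participate in the data statistics locally, to obtain local processing identifiers;
sending the local processing identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, so that the cooperative data party generates first key processing identifiers by performing peer private key processing on the local processing identifiers and stores the correspondence between the first key processing identifiers and the sorting numbers, where the sorting numbers identify the sorted positions among the plurality of first data;
receiving peer processing identifiers sent by the cooperative data party, where the peer processing identifiers are obtained by the cooperative data party performing peer private key processing on the data identifiers of the second data that participate in the data statistics;
performing local private key processing on the peer processing identifiers to generate second key processing identifiers, and sending the second key processing identifiers to the cooperative data party;
receiving an extreme-value sorting number sent by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers;
obtaining, according to the extreme-value sorting number, the corresponding first data as the extreme value.
In a fourth aspect, a data statistics method is provided. The method is used to perform data statistics between a local data party and a statistical data party. The statistical data party has first data corresponding to a data identifier, and the local data party stores second data corresponding to the same data identifier. The method is applied to obtaining an extreme value among a plurality of first data, and includes:
receiving peer processing identifiers and sorting numbers sent by the statistical data party, where the peer processing identifiers are obtained by the statistical data party performing peer private key processing, according to a key exchange protocol, on the data identifiers of the first data that participate in the data statistics, and the sorting numbers identify the sorted positions of the first data;
performing a local private key operation on the peer processing identifiers according to the key exchange protocol to generate first key processing identifiers, and storing the correspondence between the first key processing identifiers and the sorting numbers;
performing local private key processing, according to the key exchange protocol, on the data identifiers respectively corresponding to the plurality of second data that participate in the data statistics locally, to obtain a plurality of local processing identifiers;
sending the local processing identifiers to the statistical data party, and receiving second key processing identifiers returned by the statistical data party, where the second key processing identifiers are obtained by the statistical data party performing peer private key processing on the local processing identifiers;
obtaining the sorting numbers corresponding to the identifier intersection of the first key processing identifiers and the second key processing identifiers, and determining the extreme-value sorting number among them;
sending the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.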
第五方面,提供一种数据统计装置,所述装置用于联合本地数据方和合作数据方的数据进行数据统计,本地数据方具有待求取极值的多个第一数据,所述多个第一数据分别对应不同的数据标识,合作数据方具有所述数据标识对应的多个第二数据,所述装置包括:
数据发送模块,用于将所述多个第一数据分别对应的数据标识和排序号,发送至所述合作数据方,所述排序号用于标识多个第一数据之间的排序位置;
序号接收模块,用于接收所述合作数据方返回的极值排序号,所述极值排序号是合作数据方由标识交集中的各个数据标识对应的多个排序号中获取,所述标识交集是由所述多个第一数据对应的多个数据标识中选择的对应合作数据方参与数据统计的第二数据的标识;
数据确定模块,用于根据所述极值排序号,获取对应所述极值排序号的本地数据方的第一数据。
第六方面,提供一种数据统计装置,所述装置应用于联合本地数据方和统计数据方的数据进行数据统计,统计数据方具有待求取极值的多个第一数据,所述多个第一数据分别对应不同的数据标识,本地数据方具有所述数据标识对应的多个第二数据,所述装置包括:
数据接收模块,用于接收统计数据方发送的数据标识和排序号,所述数据标识是统计数据方参与数据统计的多个第一数据对应的标识,所述排序号用于标识多个第一数据之间的排序位置;
交集确定模块,用于根据本地数据方参与数据统计的多个第二数据对应的数据标识、以及所述多个第一数据的数据标识,确定标识交集;
序号确定模块,用于根据所述标识交集中的各个数据标识对应的排序号,获取极值排序号;
序号发送模块,用于将所述极值排序号发送至统计数据方,以使得统计数据方根据极值排序号得到对应的作为极值的第一数据。
第七方面,提供一种数据统计装置,所述装置用于在本地数据方与合 作数据方之间进行数据统计,所述本地数据方存储数据标识对应的第一数据,所述合作数据方存储同一所述数据标识对应的第二数据;并且,所述方法应用于在多个第一数据中获取极值;所述装置包括:
私钥处理模块,用于将本地参与数据统计的多个第一数据分别对应的数据标识,根据密钥交换协议进行本地私钥处理,得到多个本地处理标识;
序号发送模块,用于将所述多个第一数据分别对应的本地处理标识和排序号,发送至所述合作数据方,以使得所述合作数据方对所述本地处理标识进行对端私钥处理后生成第一密钥处理标识,并存储所述第一密钥处理标识和排序号的对应关系,所述排序号用于标识所述多个第一数据之间的排序位置;
标识接收模块,用于接收所述合作数据方发送的对端处理标识,所述对端处理标识是所述合作数据方对参与数据统计的第二数据的数据标识进行对端私钥处理得到;
密钥合作模块,用于对所述对端处理标识进行本地私钥处理后,生成第二密钥处理标识,并将所述第二密钥处理标识发送至所述合作数据方;
序号接收模块,用于接收所述合作数据方发送的极值排序号,所述极值排序号是所述合作数据方由第一密钥处理标识和第二密钥处理标识的标识交集对应的各个排序号中获得;
极值确定模块,用于根据极值排序号,得到对应的作为极值的第一数据。
第八方面,提供一种数据统计装置,所述装置用于在本地数据方与合作数据方之间进行数据统计,所述本地数据方具有数据标识对应的第一数据,所述合作数据方具有同一所述数据标识对应的第二数据;并且,所述方法应用于在多个第一数据中获取极值;所述装置包括:
数据接收模块,用于接收所述统计数据方发送的对端处理标识和排序号,所述对端处理标识是所述统计数据方对数据标识根据密钥交换协议进行对端私钥处理得到,所述数据标识对应参与数据统计的第一数据,所述 排序号用于标识所述第一数据的排序位置;
密钥处理模块,用于对所述对端处理标识根据密钥交换协议进行本地私钥运算,生成第一密钥处理标识,并存储第一密钥处理标识和排序号的对应关系;
标识处理模块,用于将本地参与数据统计的多个第二数据分别对应的数据标识,根据密钥交换协议进行本地私钥处理,得到多个本地处理标识;
合作处理模块,用于将所述本地处理标识发送至统计数据方,并接收所述统计数据方返回的第二密钥处理标识,所述第二密钥处理标识是所述统计数据方对本地处理标识进行对端私钥处理得到;
极值获取模块,用于获取所述第一密钥处理标识和第二密钥处理标识的交集对应的各个排序号,并确定所述各个排序号中的极值排序号;
极值发送模块,用于将所述极值排序号发送至所述统计数据方,以使得所述统计数据方根据所述极值排序号得到对应的作为极值的第一数据。
第九方面,提供一种数据统计设备,所述设备包括存储器、处理器,以及存储在存储器上并可在处理器上运行的计算机指令,所述处理器执行指令时实现以下步骤:
将所述多个第一数据分别对应的数据标识和排序号,发送至所述合作数据方,所述排序号用于标识所述多个第一数据之间的排序位置;
接收所述合作数据方返回的极值排序号,所述极值排序号是合作数据方由标识交集中的各个数据标识对应的多个排序号中获取,所述标识交集是由所述多个第一数据对应的多个数据标识中选择的对应合作数据方参与数据统计的第二数据的标识;
根据所述极值排序号,获取对应所述极值排序号的本地数据方的第一数据。
本说明书一个或多个实施例的数据统计方法和装置,通过在极值统计时将排序号发送至对端,使得只暴露一个排序号给对端,既实现了对极值的统计,又有效保护了参与统计的双方的数据安全,实现了在保护两个数 据拥有方的数据隐私的基础上,实现两方安全计算。
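Brief Description of the Drawings
To describe the technical solutions in one or more embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in one or more embodiments of this specification, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a data statistics method according to one or more embodiments of this specification;
FIG. 2 is a flowchart of a data statistics method according to one or more embodiments of this specification;
FIG. 3 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of this specification;
FIG. 4 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of this specification;
FIG. 5 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of this specification;
FIG. 6 is a schematic structural diagram of a data statistics apparatus according to one or more embodiments of this specification.
Detailed Description
To help a person skilled in the art better understand the technical solutions in one or more embodiments of this specification, those solutions are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on one or more embodiments of this specification without creative effort shall fall within the protection scope of the present disclosure.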
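In the era of big data, data may be stored in a vertical mode: several data owners hold different attributes of the same entity. For example, a natural person's auto insurance score is held by one institution while the same person's claim amounts are held by another. With vertically stored data, some statistical computations involve several data owners, who must cooperate to complete a single computation. However, because of competition between enterprises or privacy concerns, each enterprise's data secrets must not be leaked.
The examples of the present disclosure aim to compute statistics over the data of different data owners without leaking the private data of any owner. The method is described in detail below using an illustrative application scenario, but it is not limited to that scenario.
Application scenario:
Take statistics on auto insurance scores as an example, with two data sources: data source A and data source B. Assume that data source A is a data institution and data source B is an insurance institution, and that the two sources each store different information about the same vehicle owner.
Data source A: assume that data source A stores each vehicle owner's auto insurance score. The auto insurance score may be a score obtained after precise profiling and risk analysis of the vehicle owner; a higher score indicates lower risk. As shown in Table 1, data source A stores auto insurance scores in the following data structure:
Table 1 Data structure of data source A
Column name    Type      Description             Example
idcard_no      string    ID card number          ******197309119564
score          int       Auto insurance score    510
Data source B: assume that data source B stores each vehicle owner's claim information, which may include, for example, the number of claims and the claim amount. As shown in Table 2, data source B stores the following data structure for each vehicle owner:
Table 2 Data structure of data source B
[Table 2 is provided as an image in the original: Figure PCTCN2018105938-appb-000001]
Based on the above application scenario, an extreme-value statistic on auto insurance scores can be computed jointly from the data of data source A and data source B.
For example, assume that the requirement is to "find the largest insurance score among female users with more than 5 claims". The phrase "largest insurance score" shows that this is an extreme-value statistic (a maximum or a minimum) over the data in data source A, while "female users with more than 5 claims" shows that the data in data source B serves as the filter condition: the largest insurance score must be found among the users who satisfy that condition. Finding a maximum or minimum subject to such a filter condition can be called a "conditional extreme value".
Based on the data structure shown in Table 1, assume that data source A holds the auto insurance score data in Table 3 below, where idcard_no is the vehicle owner's ID card number and score is the owner's auto insurance score.
Table 3 Data of data source A
idcard_no    score
1234567      490
2345678      501
3456789      530
Based on the data structure shown in Table 2, assume that data source B holds the data in Table 4 below:
Table 4 Data of data source B
idcard_no    gender    times    amount
1234567                3        5000
2345678      female    7        23000
3456789      female    6        16000
Suppose the largest insurance score among female users with more than 5 claims is to be computed from Table 3 and Table 4. The statistic of interest, the insurance score, is stored in data source A; the score column in Table 3 can be called the "statistics column", that is, the column over which the extreme value (here the maximum) is computed. The filter fields "number of claims" and "female" are both stored in data source B, so data source A and data source B must cooperate to compute the extreme value of the insurance score.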
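As a point of reference, the conditional extreme value can be written down directly if both tables are imagined to sit in one place. The sketch below is only such a plaintext baseline, not the privacy-preserving method; the dictionary layout and the function name `conditional_max` are assumptions made for illustration, and the gender of the first row is left as `None` because it is not legible in Table 4.

```python
# Table 3 (data source A) and Table 4 (data source B) as plain dictionaries.
# The gender of "1234567" is None because the source table does not show it.
table3 = {"1234567": 490, "2345678": 501, "3456789": 530}
table4 = {
    "1234567": {"gender": None, "times": 3, "amount": 5000},
    "2345678": {"gender": "F", "times": 7, "amount": 23000},
    "3456789": {"gender": "F", "times": 6, "amount": 16000},
}


def conditional_max(scores, claims):
    """Largest score among owners who are female and have more than 5 claims."""
    eligible = [
        scores[ident]
        for ident, row in claims.items()
        if ident in scores and row["gender"] == "F" and row["times"] > 5
    ]
    return max(eligible) if eligible else None


expected = conditional_max(table3, table4)
print(expected)
```

A joint computation between the two parties should arrive at the same value without either side seeing the other's table.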
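In the following description of the data statistics method, data source A, which holds the statistic of interest (the insurance score), is called the statistical data party, and the other data source B is called the cooperative data party. As mentioned above, the two data sources each store different information about the same vehicle owner. The vehicle owner information in data source A that participates in this extreme-value computation (for example, the insurance score) is called the first data, and the vehicle owner information in data source B that participates in the data statistics (for example, gender, number of claims in the last year, and claim amount) is called the second data. In addition, the ID card number idcard_no, which appears in both data source A and data source B, is called the data identifier: data source A stores the first data corresponding to a data identifier, and data source B stores the second data corresponding to the same data identifier.
FIG. 1 illustrates the flow of a data statistics method. As shown in FIG. 1, the method may include:
In step 100, the statistical data party sends the data identifiers and sorting numbers respectively corresponding to a plurality of first data to the cooperative data party, where the sorting numbers identify the sorted positions among the plurality of first data.
The plurality of first data in this step may be the data with which the statistical data party participates in the data statistics, and may be selected according to the statistical data party's data filter condition. For example, the statistical data party may sort the participating first data by magnitude in advance and determine the sorting number of each first data from the sorting result.
In step 102, the cooperative data party determines an identifier intersection according to the data identifiers corresponding to the plurality of second data with which it participates in the data statistics and the data identifiers of the plurality of first data.
In this step, the cooperative data party may select, according to its local filter condition, the second data that will participate in this data statistics, and obtain the data identifiers corresponding to those second data. Together with the data identifiers received from the statistical data party in step 100, the intersection of the two sets of data identifiers is called the identifier intersection. The identifier intersection may include at least one data identifier; for each data identifier in it, the corresponding first data participates in the data statistics on the statistical data party's side, and the corresponding second data participates on the cooperative data party's side.
In step 104, the cooperative data party obtains an extreme-value sorting number according to the sorting numbers corresponding to the data identifiers in the identifier intersection.
In this step, the cooperative data party may compare the sorting numbers corresponding to the data identifiers in the identifier intersection and obtain the extreme-value sorting number, such as the largest or the smallest sorting number.
In step 106, the cooperative data party sends the extreme-value sorting number to the statistical data party.
In step 108, the statistical data party obtains, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.
In the data statistics method of this example, sorting numbers are sent to the peer, and the peer only needs to return the largest or smallest sorting number. This applies the peer's data filter and computes the extreme value without exposing the statistical data party's actual data, protecting the data privacy of both data owners.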
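Steps 100 to 108 can be sketched as three small functions, one per role. This is a minimal illustration under stated assumptions: the function names are hypothetical, identifiers are exchanged in the clear as in FIG. 1, and the identifiers passing the cooperative data party's filter are taken to be 2345678 and 3456789.

```python
def assign_ranks(scores):
    """Statistical data party: give each identifier a 1-based rank by ascending value."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return {ident: rank for rank, (ident, _) in enumerate(ordered, start=1)}


def pick_extreme_rank(ranks, partner_ids, want_max=True):
    """Cooperative data party: intersect identifiers, return the extreme rank or None."""
    hits = [rank for ident, rank in ranks.items() if ident in partner_ids]
    if not hits:
        return None
    return max(hits) if want_max else min(hits)


def lookup_by_rank(scores, ranks, rank):
    """Statistical data party: map the returned rank back to its first data."""
    for ident, candidate in ranks.items():
        if candidate == rank:
            return scores[ident]
    raise KeyError(rank)


scores_a = {"1234567": 490, "2345678": 501, "3456789": 530}  # held by data source A
filtered_ids_b = {"2345678", "3456789"}  # identifiers passing B's filter (assumed)

ranks = assign_ranks(scores_a)                      # step 100: sent with the identifiers
extreme = pick_extreme_rank(ranks, filtered_ids_b)  # steps 102-104, run by B
result = lookup_by_rank(scores_a, ranks, extreme)   # step 108, run by A
print(extreme, result)
```

Only `ranks` crosses from A to B and only `extreme` crosses back, so B never sees a score and A never sees B's filter fields.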
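In another example, to protect the data privacy of the two data owners more strongly, the data identifiers can be encrypted according to a key exchange protocol when data is transmitted between the local data party and the cooperative data party. For example, the statistical data party may process the data identifiers with its local private key and send the result to the peer, so that the peer further processes the data identifiers with the peer's private key. The statistical data party may also receive data identifiers that the cooperative data party has processed with the peer's private key, process them further with its own private key, and return them to the cooperative data party. Because both parties process the data identifiers under the key exchange protocol, the data identifiers are not exposed, which provides stronger protection.
FIG. 2 illustrates the flow of a data statistics method. Based on Table 3 and Table 4, the flow can compute the largest insurance score among female users with more than 5 claims, and the example combines the extreme-value computation with key exchange processing. As shown in FIG. 2, the method may include:
In step 200, the statistical data party generates, for the plurality of first data that participate in the data statistics locally, the sorting numbers respectively corresponding to the first data.
In this example, data source A is the statistical data source and stores the data score whose extreme value is to be found. As shown in Table 3, the column containing score can be called the statistics column, and each auto insurance score in it can be called a first data.
In one example, data source A may compute the maximum of the statistics column score in Table 3, that is, the maximum of the three auto insurance scores 490, 501 and 530. These three scores can be called the "three first data that participate in the data statistics locally".
In another example, data source A may also select a subset of the auto insurance scores according to a predetermined data filter condition and compute the extreme value over that subset. For example, it may compute the maximum of the two scores 501 and 530.
In this step, once the plurality of first data that participate in the data statistics locally have been determined, data source A can sort them by magnitude and, according to the sorting result, generate the sorting number corresponding to each first data.
For example, the three auto insurance scores in Table 3, in ascending order, are 490 < 501 < 530. The sorting numbers of the scores are therefore as shown in Table 5 below. The smaller the score, the smaller its sorting number; that is, the sorting numbers identify the sorted positions among the first data.
Table 5 Sorting numbers and corresponding first data
First data    Sorting number
490           1
501           2
530           3
The sorting numbers may be generated online or offline. Generating the sorting numbers of the first data in the statistics column offline and in advance reduces the work required during online computation and improves its efficiency.
In step 202, the statistical data party performs local private key processing, according to a key exchange protocol, on the data identifiers respectively corresponding to the plurality of first data that participate in the data statistics locally, to obtain a plurality of first processing identifiers.
In this step, the ID card number idcard_no that corresponds to the auto insurance score in Table 3 can be called the data identifier corresponding to the first data. To keep the detailed data of both data source A and data source B from leaking, the data identifiers can be processed using a key exchange protocol (for example, the Diffie-Hellman key exchange, abbreviated "D-H").
For example, idcard_no can be hashed to obtain H(k). Data source A can also generate its own private key α under the key exchange protocol and perform local private key processing, which may consist of raising H(k) to the power α, giving H(k)^α. This H(k)^α can be called the first processing identifier.
Taking the case where data source A contributes all the first data in the statistics column to this extreme-value computation, after this step data source A has the sorting number and the first processing identifier of each participating first data. As shown in Table 6 below, H(k)^α, that is Hash(idcard_no)^α, is the first processing identifier and N is the sorting number. Take the first vehicle owner in Table 3 as an example: the owner's auto insurance score is 490, the corresponding sorting number is 1, and the data identifier corresponding to the score 490 is 1234567; hashing this identifier and applying local private key processing yields the first processing identifier H(1234567)^α.
Table 6 First processing identifiers and sorting numbers
Hash(idcard_no)^α    N
H(1234567)^α         1
H(2345678)^α         2
H(3456789)^α         3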
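The construction of Table 6 can be sketched as "hash, then exponentiate". The source does not name a hash function or a group, so everything concrete here is an assumption for illustration: SHA-256 as the hash, the Mersenne prime 2**127 - 1 as a toy modulus, and the helper names `hash_to_group`, `new_private_key` and `first_processing_ids`. These parameters are not a vetted cryptographic choice.

```python
import hashlib
import math
import secrets

# Toy modulus for illustration only (assumption): the Mersenne prime 2**127 - 1.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in [1, P - 1] via SHA-256."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)


def new_private_key() -> int:
    """Pick a random exponent coprime with P - 1, so x -> x**key mod P is one-to-one."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def first_processing_ids(ranked_ids, alpha):
    """Return Table 6 rows: (H(idcard_no)^alpha mod P, sorting number)."""
    return [(pow(hash_to_group(ident), alpha, P), n) for ident, n in ranked_ids]


alpha = new_private_key()  # data source A's private key
table6 = first_processing_ids(
    [("1234567", 1), ("2345678", 2), ("3456789", 3)], alpha
)
for blinded, n in table6:
    print(hex(blinded), n)
```

Requiring the exponent to be coprime with P - 1 is a design choice of this sketch: it makes the exponentiation a bijection on the nonzero residues, so distinct hashed identifiers cannot collapse to the same processed value.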
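In step 204, the statistical data party sends the first processing identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party.
In this step, data source A can send the data in Table 6 to data source B.
In step 206, the cooperative data party performs a local private key operation on the first processing identifiers according to the key exchange protocol to generate first key processing identifiers, and stores the correspondence between the first key processing identifiers and the sorting numbers.
In this step, after receiving the data in Table 6, data source B can generate its own local private key β according to the key exchange protocol and use β to perform a local private key operation, that is an exponentiation, on the first processing identifier H(k)^α, giving H(k)^(αβ). This H(k)^(αβ) can be called the first key processing identifier. After the exponentiation by β in this step, Table 6 becomes Table 7, as follows:
Table 7 First key processing identifiers and sorting numbers
Hash(idcard_no)^(αβ)    N
H(1234567)^(αβ)         1
H(2345678)^(αβ)         2
H(3456789)^(αβ)         3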
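The reason doubly processed identifiers can later be compared is that exponentiation commutes. As a worked identity, assuming the exponentiation is carried out modulo a public prime p agreed by both parties (the modulus p is an assumption; the text only says "exponentiation"):

```latex
\[
\bigl(H(k)^{\alpha}\bigr)^{\beta}
\;\equiv\; H(k)^{\alpha\beta}
\;\equiv\; \bigl(H(k)^{\beta}\bigr)^{\alpha}
\pmod{p}
\]
```

The value that B derives from an identifier A processed first therefore equals the value that A derives from an identifier B processed first, whenever both started from the same idcard_no; neither party needs to learn the other's private key, and different identifiers would match only in the event of a collision.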
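In step 208, the cooperative data party performs local private key processing, according to the key exchange protocol, on the data identifiers respectively corresponding to the plurality of second data that participate in the data statistics locally, to obtain a plurality of second processing identifiers.
In this step, data source B can also determine the plurality of second data that participate in the data statistics locally. These may be all of its data, or the data obtained by local filtering according to a predetermined filter condition.
For example, if the predetermined filter condition is "female users with more than 5 claims", the data in Table 4 can be filtered by this condition, and the last two rows of Table 4 are found to participate in the statistics. The entries "female, 7, 23000" and "female, 6, 16000" can be called second data. The data identifiers corresponding to these two second data are 2345678 and 3456789 respectively.
Data source B can hash each of these data identifiers to obtain H(k) and then, according to the key exchange protocol, raise H(k) to the power β, where β is data source B's private key, giving H(k)^β. This H(k)^β can be called the second processing identifier, as shown in Table 8 below:
Table 8 Second processing identifiers
Hash(idcard_no)^β
H(2345678)^β
H(3456789)^β
In step 210, the cooperative data party sends the second processing identifiers to the statistical data party.
In this step, data source B can send the data in Table 8 to data source A.
In step 212, the statistical data party performs local private key processing on the second processing identifiers to generate second key processing identifiers.
For example, after receiving Hash(idcard_no)^β in Table 8, data source A can apply its local private key to generate the second key processing identifier Hash(idcard_no)^(βα), as shown in Table 9 below.
Table 9 Second key processing identifiers
Hash(idcard_no)^(βα)
H(2345678)^(βα)
H(3456789)^(βα)
In step 214, the statistical data party sends the second key processing identifiers to the cooperative data party.
In step 216, the cooperative data party obtains the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers, and determines the extreme-value sorting number among them.
In this step, data source B can intersect the second key processing identifiers in Table 9 with the first key processing identifiers in Table 7. Equal values of Hash(idcard_no)^(βα) and Hash(idcard_no)^(αβ) correspond to the same idcard_no; that is, the vehicle owner represented by that idcard_no satisfies both the filter condition on the data contributed by the statistical data source and the filter condition on the data contributed by the cooperative data source. From the intersection, combined with the correspondence between first key processing identifiers and sorting numbers in Table 7, the sorting number of each first key processing identifier in the intersection can be obtained. Table 10 below is assumed to contain the intersection and the corresponding sorting numbers.
Table 10 Intersection and corresponding sorting numbers
Hash(idcard_no)^(αβ)    Hash(idcard_no)^(βα)    N
H(2345678)^(αβ)         H(2345678)^(βα)         2
H(3456789)^(αβ)         H(3456789)^(βα)         3
According to Table 10, the extreme-value sorting number among the sorting numbers corresponding to the intersection can be determined. For example, when the maximum insurance score is sought, the extreme-value sorting number is the largest sorting number. In this step the extreme-value sorting number is 3.
In step 218, the cooperative data party sends the extreme-value sorting number to the statistical data party.
For example, data source B can send the extreme-value sorting number N=3 to data source A.
In step 220, the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.
For example, on receiving the extreme-value sorting number N=3, data source A can determine from Table 5 that the first data corresponding to sorting number 3 is 530; that is, 530 is the largest insurance score sought.
In the data statistics method of this example, only sorting numbers are sent to the peer during the extreme-value computation, so that only a sorting number is exposed to the peer and the local data is effectively protected; in addition, a key exchange protocol protects the privacy of all filter fields. This scheme both computes the extreme value and protects the data of the two participating parties. In the above example, the insurance institution cannot learn the specific insurance score of the owner of a given idcard_no, and the data institution cannot learn information such as the number of claims that owner has made at the insurance institution.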
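Steps 200 to 220 can be traced end to end in a few lines. The sketch below runs both parties in one process purely to show the data flow; SHA-256, the toy modulus 2**127 - 1, and the names `h`, `keygen`, `table6` through `table9` are assumptions for illustration rather than parameters given in the text, and the toy group is not suitable for real use.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # toy prime modulus (assumption, illustration only)


def h(identifier: str) -> int:
    """Hash an identifier into [1, P - 1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)


def keygen() -> int:
    """Random exponent coprime with P - 1, so exponentiation is one-to-one."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


scores_a = {"1234567": 490, "2345678": 501, "3456789": 530}  # Table 3, held by A
ids_b = ["2345678", "3456789"]  # identifiers that pass B's filter on Table 4
alpha, beta = keygen(), keygen()  # private keys of A and B

# Steps 200-204 (A): rank the scores, process identifiers with alpha, send Table 6.
ordered = sorted(scores_a, key=scores_a.get)
rank_of = {ident: n for n, ident in enumerate(ordered, start=1)}
table6 = [(pow(h(ident), alpha, P), rank_of[ident]) for ident in scores_a]

# Step 206 (B): apply beta and keep the mapping to sorting numbers (Table 7).
table7 = {pow(x, beta, P): n for x, n in table6}

# Steps 208-210 (B): process its own filtered identifiers with beta (Table 8).
table8 = [pow(h(ident), beta, P) for ident in ids_b]

# Steps 212-214 (A): apply alpha and return the result (Table 9).
table9 = [pow(y, alpha, P) for y in table8]

# Step 216 (B): intersect Table 9 with Table 7 and take the largest sorting number.
matched = [table7[z] for z in table9 if z in table7]
extreme_rank = max(matched)

# Steps 218-220 (A): map the returned sorting number back to the score.
score_of_rank = {n: scores_a[ident] for ident, n in rank_of.items()}
max_score = score_of_rank[extreme_rank]
print(extreme_rank, max_score)
```

In this trace A sends only processed identifiers with sorting numbers and later receives a single sorting number, while B sends only processed identifiers; raw ID card numbers, scores and claim fields stay on their own side.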
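To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus, applied to data statistics that combine the data of a local data party and a cooperative data party. The local data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. As shown in FIG. 3, the apparatus may include: a data sending module 31, a sequence number receiving module 32, and a data determining module 33.
The data sending module 31 is configured to send the data identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, where the sorting numbers identify the sorted positions among the plurality of first data;
the sequence number receiving module 32 is configured to receive an extreme-value sorting number returned by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of those identifiers, selected from the data identifiers corresponding to the plurality of first data, that also correspond to second data with which the cooperative data party participates in the data statistics;
the data determining module 33 is configured to obtain, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.
To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus, applied to data statistics that combine the data of a local data party and a statistical data party. The statistical data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the local data party has a plurality of second data corresponding to the data identifiers. As shown in FIG. 4, the apparatus may include: a data receiving module 41, an intersection determining module 42, a sequence number determining module 43, and a sequence number sending module 44.
The data receiving module 41 is configured to receive the data identifiers and sorting numbers sent by the statistical data party, where the data identifiers are the identifiers corresponding to the plurality of first data with which the statistical data party participates in the data statistics, and the sorting numbers identify the sorted positions among the plurality of first data;
the intersection determining module 42 is configured to determine an identifier intersection according to the data identifiers corresponding to the plurality of second data with which the local data party participates in the data statistics and the data identifiers of the plurality of first data;
the sequence number determining module 43 is configured to obtain an extreme-value sorting number according to the sorting numbers corresponding to the data identifiers in the identifier intersection;
the sequence number sending module 44 is configured to send the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.
To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus. As shown in FIG. 5, the apparatus may include: a private key processing module 51, a sequence number sending module 52, an identifier receiving module 53, a key cooperation module 54, a sequence number receiving module 55, and an extreme value determining module 56.
The private key processing module 51 is configured to perform local private key processing, according to a key exchange protocol, on the data identifiers respectively corresponding to the plurality of first data that participate in the data statistics locally, to obtain a plurality of first processing identifiers;
the sequence number sending module 52 is configured to send the first processing identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, so that the cooperative data party generates first key processing identifiers by performing peer private key processing on the first processing identifiers and stores the correspondence between the first key processing identifiers and the sorting numbers, where the sorting numbers identify the sorted positions among the plurality of first data;
the identifier receiving module 53 is configured to receive second processing identifiers sent by the cooperative data party, where the second processing identifiers are obtained by the cooperative data party performing peer private key processing on the data identifiers of the second data that participate in the data statistics;
the key cooperation module 54 is configured to perform local private key processing on the second processing identifiers to generate second key processing identifiers, and to send the second key processing identifiers to the cooperative data party;
the sequence number receiving module 55 is configured to receive an extreme-value sorting number sent by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers;
the extreme value determining module 56 is configured to obtain, according to the extreme-value sorting number, the corresponding first data as the extreme value.
In one example, the apparatus may further include:
a sequence number generating module, configured to sort the plurality of first data that participate in the data statistics locally by magnitude and, according to the sorting result, generate the sorting numbers respectively corresponding to the plurality of first data;
a data filtering module, configured to select, according to a predetermined data filter condition, the plurality of first data that participate in the data statistics locally.
To implement the above method, one or more embodiments of this specification further provide a data statistics apparatus. As shown in FIG. 6, the apparatus may include: a data receiving module 61, a key processing module 62, an identifier processing module 63, a cooperation processing module 64, an extreme value obtaining module 65, and an extreme value sending module 66.
The data receiving module 61 is configured to receive first processing identifiers and sorting numbers sent by the statistical data party, where the first processing identifiers are obtained by the statistical data party performing peer private key processing on data identifiers according to a key exchange protocol, the data identifiers correspond to the first data that participate in the data statistics, and the sorting numbers identify the sorted positions of the first data;
the key processing module 62 is configured to perform a local private key operation on the first processing identifiers according to the key exchange protocol to generate first key processing identifiers, and to store the correspondence between the first key processing identifiers and the sorting numbers;
the identifier processing module 63 is configured to perform local private key processing, according to the key exchange protocol, on the data identifiers respectively corresponding to the plurality of second data that participate in the data statistics locally, to obtain a plurality of second processing identifiers;
the cooperation processing module 64 is configured to send the second processing identifiers to the statistical data party and to receive second key processing identifiers returned by the statistical data party, where the second key processing identifiers are obtained by the statistical data party performing peer private key processing on the second processing identifiers;
the extreme value obtaining module 65 is configured to obtain the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers, and to determine the extreme-value sorting number among them;
the extreme value sending module 66 is configured to send the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.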
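For convenience of description, the above apparatuses are described in terms of separate modules divided by function. When one or more embodiments of this specification are implemented, the functions of the modules may of course be implemented in one or more pieces of software and/or hardware.
The steps in the flows shown in the above method embodiments are not restricted to the order shown in the flowcharts. In addition, the steps described may be implemented in software, hardware, or a combination of the two. For example, a person skilled in the art may implement them as software code, namely computer-executable instructions capable of implementing the logical function corresponding to each step. When implemented in software, the executable instructions can be stored in a memory and executed by a processor in the device.
For example, corresponding to the above method, one or more embodiments of this specification also provide a data statistics device, which may include a processor, a memory, and computer instructions stored in the memory and runnable on the processor; by executing the instructions, the processor implements the following steps:
sending the data identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, where the sorting numbers identify the sorted positions among the plurality of first data;
receiving an extreme-value sorting number returned by the cooperative data party, where the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of those identifiers, selected from the data identifiers corresponding to the plurality of first data, that also correspond to second data with which the cooperative data party participates in the data statistics;
obtaining, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.
The apparatuses or modules described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
A person skilled in the art should understand that one or more embodiments of this specification can be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus; the instruction apparatus implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
It should also be noted that the terms "include", "comprise" and any of their variants are intended to cover non-exclusive inclusion, so that a process, method, product or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, product or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, product or device that includes the element.
One or more embodiments of this specification can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described progressively; for parts that are the same or similar across embodiments, the embodiments may be read against one another, and each embodiment focuses on its differences from the others. In particular, the server-side device embodiment is described briefly because it is substantially similar to the method embodiment; for the related parts, refer to the description of the method embodiment.
Specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
The above are merely preferred embodiments of one or more embodiments of this specification and are not intended to limit the present disclosure. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.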

Claims (13)

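  1. A data statistics method, applied to data statistics that combine the data of a local data party and a cooperative data party, wherein the local data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers, the method comprising:
    sending the data identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, wherein the sorting numbers identify the sorted positions among the plurality of first data;
    receiving an extreme-value sorting number returned by the cooperative data party, wherein the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the data identifiers in an identifier intersection, and the identifier intersection consists of those identifiers, selected from the data identifiers corresponding to the plurality of first data, that also correspond to second data with which the cooperative data party participates in the data statistics;
    obtaining, according to the extreme-value sorting number, the first data of the local data party that corresponds to the extreme-value sorting number.
  2. The method according to claim 1, wherein
    sending the data identifiers respectively corresponding to the plurality of first data to the cooperative data party comprises:
    generating a local private key according to a key exchange protocol;
    performing local private key processing on the data identifiers using the local private key, and then sending them to the cooperative data party;
    and the method further comprises:
    receiving data identifiers that have undergone peer private key processing and are sent by the cooperative data party;
    performing local data party private key processing on the received data identifiers, and then returning them to the cooperative data party.
  3. A data statistics method, applied to data statistics that combine the data of a local data party and a statistical data party, wherein the statistical data party has a plurality of first data whose extreme value is to be found, the plurality of first data respectively correspond to different data identifiers, and the local data party has a plurality of second data corresponding to the data identifiers, the method comprising:
    receiving data identifiers and sorting numbers sent by the statistical data party, wherein the data identifiers are the identifiers corresponding to the plurality of first data with which the statistical data party participates in the data statistics, and the sorting numbers identify the sorted positions among the plurality of first data;
    determining an identifier intersection according to the data identifiers corresponding to the plurality of second data with which the local data party participates in the data statistics and the data identifiers of the plurality of first data;
    obtaining an extreme-value sorting number according to the sorting numbers corresponding to the data identifiers in the identifier intersection;
    sending the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.
  4. A data statistics method, used to perform data statistics between a local data party and a cooperative data party, wherein the local data party stores first data corresponding to a data identifier, the cooperative data party stores second data corresponding to the same data identifier, and the method is applied to obtaining an extreme value among a plurality of first data, the method comprising:
    performing local private key processing, according to a key exchange protocol, on the data identifiers respectively corresponding to the plurality of first data that participate in the data statistics locally, to obtain local processing identifiers;
    sending the local processing identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, so that the cooperative data party generates first key processing identifiers by performing peer private key processing on the local processing identifiers and stores the correspondence between the first key processing identifiers and the sorting numbers, wherein the sorting numbers identify the sorted positions among the plurality of first data;
    receiving peer processing identifiers sent by the cooperative data party, wherein the peer processing identifiers are obtained by the cooperative data party performing peer private key processing on the data identifiers of the second data that participate in the data statistics;
    performing local private key processing on the peer processing identifiers to generate second key processing identifiers, and sending the second key processing identifiers to the cooperative data party;
    receiving an extreme-value sorting number sent by the cooperative data party, wherein the extreme-value sorting number is obtained by the cooperative data party from the sorting numbers corresponding to the intersection of the first key processing identifiers and the second key processing identifiers;
    obtaining, according to the extreme-value sorting number, the corresponding first data as the extreme value.
  5. The method according to claim 4, wherein
    the plurality of first data are located in the same statistics column of the local data party.
  6. The method according to claim 4, wherein before sending the local processing identifiers and sorting numbers respectively corresponding to the plurality of first data to the cooperative data party, the method further comprises:
    sorting the plurality of first data that participate in the data statistics locally by magnitude;
    generating, according to the sorting result, the sorting numbers respectively corresponding to the plurality of first data.
  7. The method according to claim 4, further comprising: selecting, according to a predetermined data filter condition, the plurality of first data that participate in the data statistics locally.
  8. A data statistics method, used to perform data statistics between a local data party and a statistical data party, wherein the statistical data party has first data corresponding to a data identifier, the local data party stores second data corresponding to the same data identifier, and the method is applied to obtaining an extreme value among a plurality of first data, the method comprising:
    receiving peer processing identifiers and sorting numbers sent by the statistical data party, wherein the peer processing identifiers are obtained by the statistical data party performing peer private key processing, according to a key exchange protocol, on the data identifiers of the first data that participate in the data statistics, and the sorting numbers identify the sorted positions of the first data;
    performing a local private key operation on the peer processing identifiers according to the key exchange protocol to generate first key processing identifiers, and storing the correspondence between the first key processing identifiers and the sorting numbers;
    performing local private key processing, according to the key exchange protocol, on the data identifiers respectively corresponding to the plurality of second data that participate in the data statistics locally, to obtain a plurality of local processing identifiers;
    sending the local processing identifiers to the statistical data party, and receiving second key processing identifiers returned by the statistical data party, wherein the second key processing identifiers are obtained by the statistical data party performing peer private key processing on the local processing identifiers;
    obtaining the sorting numbers corresponding to the identifier intersection of the first key processing identifiers and the second key processing identifiers, and determining the extreme-value sorting number among them;
    sending the extreme-value sorting number to the statistical data party, so that the statistical data party obtains, according to the extreme-value sorting number, the corresponding first data as the extreme value.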
  9. 一种数据统计装置,所述装置用于联合本地数据方和合作数据方的 数据进行数据统计,本地数据方具有待求取极值的多个第一数据,所述多个第一数据分别对应不同的数据标识,合作数据方具有所述数据标识对应的多个第二数据,所述装置包括:
    数据发送模块,用于将所述多个第一数据分别对应的数据标识和排序号,发送至所述合作数据方,所述排序号用于标识多个第一数据之间的排序位置;
    序号接收模块,用于接收所述合作数据方返回的极值排序号,所述极值排序号是合作数据方由标识交集中的各个数据标识对应的多个排序号中获取,所述标识交集是由所述多个第一数据对应的多个数据标识中选择的对应合作数据方参与数据统计的第二数据的标识;
    数据确定模块,用于根据所述极值排序号,获取对应所述极值排序号的本地数据方的第一数据。
  10. 一种数据统计装置,所述装置应用于联合本地数据方和统计数据方的数据进行数据统计,统计数据方具有待求取极值的多个第一数据,所述多个第一数据分别对应不同的数据标识,本地数据方具有所述数据标识对应的多个第二数据,所述装置包括:
    数据接收模块,用于接收统计数据方发送的数据标识和排序号,所述数据标识是统计数据方参与数据统计的多个第一数据对应的标识,所述排序号用于标识多个第一数据之间的排序位置;
    交集确定模块,用于根据本地数据方参与数据统计的多个第二数据对应的数据标识、以及所述多个第一数据的数据标识,确定标识交集;
    序号确定模块,用于根据所述标识交集中的各个数据标识对应的排序号,获取极值排序号;
    序号发送模块,用于将所述极值排序号发送至统计数据方,以使得统计数据方根据极值排序号得到对应的作为极值的第一数据。
  11. 一种数据统计装置,所述装置用于在本地数据方与合作数据方之间进行数据统计,所述本地数据方存储数据标识对应的第一数据,所述合 作数据方存储同一所述数据标识对应的第二数据;并且,所述方法应用于在多个第一数据中获取极值;所述装置包括:
    私钥处理模块,用于将本地参与数据统计的多个第一数据分别对应的数据标识,根据密钥交换协议进行本地私钥处理,得到多个本地处理标识;
    序号发送模块,用于将所述多个第一数据分别对应的本地处理标识和排序号,发送至所述合作数据方,以使得所述合作数据方对所述本地处理标识进行对端私钥处理后生成第一密钥处理标识,并存储所述第一密钥处理标识和排序号的对应关系,所述排序号用于标识所述多个第一数据之间的排序位置;
    标识接收模块,用于接收所述合作数据方发送的对端处理标识,所述对端处理标识是所述合作数据方对参与数据统计的第二数据的数据标识进行对端私钥处理得到;
    密钥合作模块,用于对所述对端处理标识进行本地私钥处理后,生成第二密钥处理标识,并将所述第二密钥处理标识发送至所述合作数据方;
    序号接收模块,用于接收所述合作数据方发送的极值排序号,所述极值排序号是所述合作数据方由第一密钥处理标识和第二密钥处理标识的标识交集对应的各个排序号中获得;
    极值确定模块,用于根据极值排序号,得到对应的作为极值的第一数据。
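  12. A data statistics apparatus, the apparatus being used to perform data statistics between a local data party and a cooperating data party, wherein the local data party has first data corresponding to a data identifier, and the cooperating data party has second data corresponding to the same data identifier; and the method is applied to obtaining an extreme value among multiple first data; the apparatus comprising:
    a data receiving module, configured to receive a peer-processed identifier and a ranking number sent by the statistics data party, the peer-processed identifier being obtained by the statistics data party performing peer private-key processing on a data identifier according to a key exchange protocol, the data identifier corresponding to first data participating in the data statistics, and the ranking number being used to identify the sorted position of the first data;
    a key processing module, configured to perform a local private-key operation on the peer-processed identifier according to the key exchange protocol to generate a first key-processed identifier, and to store the correspondence between the first key-processed identifier and the ranking number;
    an identifier processing module, configured to perform local private-key processing, according to the key exchange protocol, on the data identifiers respectively corresponding to multiple second data that participate locally in the data statistics, to obtain multiple locally processed identifiers;
    a cooperation processing module, configured to send the locally processed identifiers to the statistics data party and to receive second key-processed identifiers returned by the statistics data party, the second key-processed identifiers being obtained by the statistics data party performing peer private-key processing on the locally processed identifiers;
    an extreme-value obtaining module, configured to obtain the ranking numbers corresponding to the intersection of the first key-processed identifiers and the second key-processed identifiers, and to determine the extreme-value ranking number among those ranking numbers;
    an extreme-value sending module, configured to send the extreme-value ranking number to the statistics data party, so that the statistics data party obtains, according to the extreme-value ranking number, the corresponding first data as the extreme value.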
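  13. A data statistics device, the device comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the instructions:
    sending the data identifiers and ranking numbers respectively corresponding to the multiple first data to the cooperating data party, the ranking numbers being used to identify the sorted positions among the multiple first data;
    receiving an extreme-value ranking number returned by the cooperating data party, the extreme-value ranking number being obtained by the cooperating data party from the multiple ranking numbers corresponding to the data identifiers in an identifier intersection, the identifier intersection consisting of those identifiers, selected from the multiple data identifiers corresponding to the multiple first data, that correspond to second data of the cooperating data party participating in the data statistics;
    obtaining, according to the extreme-value ranking number, the first data of the local data party corresponding to the extreme-value ranking number.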
PCT/CN2018/105938 2017-10-31 2018-09-17 Data statistics method and apparatus WO2019085665A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711046647.8 2017-10-31
CN201711046647.8A CN109726581B (zh) 2017-10-31 2017-10-31 Data statistics method and apparatus

Publications (1)

Publication Number Publication Date
WO2019085665A1 (zh) 2019-05-09

Family

ID=66293827

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/105938 WO2019085665A1 (zh) 2017-10-31 2018-09-17 Data statistics method and apparatus

Country Status (3)

Country Link
CN (1) CN109726581B (zh)
TW (1) TWI704469B (zh)
WO (1) WO2019085665A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414567B (zh) * 2019-07-01 2020-08-04 阿里巴巴集团控股有限公司 Data processing method, apparatus and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100036810A1 (en) * 2008-08-08 2010-02-11 Oracle International Corporation Automated topology-based statistics monitoring and performance analysis
CN102314460A (zh) * 2010-07-07 2012-01-11 阿里巴巴集团控股有限公司 Data analysis method, system and server
WO2014077807A1 (en) * 2012-11-14 2014-05-22 Hewlett-Packard Development Company, L.P. Updating statistics in distributed databases
CN104111958A (zh) * 2013-04-22 2014-10-22 中国移动通信集团山东有限公司 Data query method and apparatus

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7197513B2 (en) * 2000-12-08 2007-03-27 Aol Llc Distributed image storage architecture
US20090012841A1 (en) * 2007-01-05 2009-01-08 Yahoo! Inc. Event communication platform for mobile device users
CN102084386A (zh) * 2008-03-24 2011-06-01 姜旻秀 Keyword advertising method using meta-information associated with digital content, and associated system
US9148483B1 (en) * 2010-09-30 2015-09-29 Fitbit, Inc. Tracking user physical activity with multiple devices
CN102158534B (zh) * 2011-02-09 2015-04-01 中兴通讯股份有限公司 Query method and apparatus
CN102457571B (zh) * 2011-09-15 2014-11-05 中标软件有限公司 Method for balanced data distribution in cloud storage
CN103246980B (zh) * 2012-02-02 2017-05-03 阿里巴巴集团控股有限公司 Information output method and server
US9665840B2 (en) * 2014-03-21 2017-05-30 Oracle International Corporation High performance ERP system ensuring desired delivery sequencing of output messages
US9800651B2 (en) * 2014-04-04 2017-10-24 Ca, Inc. Application-specific assessment of cloud hosting suitability
US10524177B2 (en) * 2014-05-30 2019-12-31 Apple Inc. Methods and apparatus to manage data connections for multiple subscriber identities in a wireless communication device
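CN104580403B (zh) * 2014-12-24 2017-03-01 腾讯科技(深圳)有限公司 Data statistics method and system thereof, user terminal, and application server
CN104935628B (zh) * 2015-04-20 2018-01-12 电子科技大学 Method for migrating multiple associated virtual machines between multiple data centers
CN106209761A (zh) * 2015-05-29 2016-12-07 松下电器(美国)知识产权公司 Similar information retrieval method, terminal device, and similar information retrieval system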

Also Published As

Publication number Publication date
TW201918909A (zh) 2019-05-16
TWI704469B (zh) 2020-09-11
CN109726581B (zh) 2020-04-14
CN109726581A (zh) 2019-05-07

Similar Documents

Publication Publication Date Title
WO2019085650A1 (zh) Data statistics method and apparatus
US11394773B2 (en) Cryptographic currency block chain based voting system
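TWI745861B (zh) Data processing method, apparatus and electronic device
TWI730622B (zh) Data processing method, apparatus and electronic device
TWI728639B (zh) Data processing method, apparatus and electronic device
WO2021000575A1 (zh) Data interaction method, apparatus and electronic device
TWI706362B (zh) Blockchain-based data processing method, apparatus and server
TWI718614B (zh) Blockchain-based data processing method, apparatus and server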
US20230068770A1 (en) Federated model training method and apparatus, electronic device, computer program product, and computer-readable storage medium
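WO2019085665A1 (zh) Data statistics method and apparatus
WO2019085656A1 (zh) Data statistics method and apparatus
CN112084384A (zh) Method and apparatus for secure statistics performed jointly by multiple parties
CN114595470A (zh) Data processing method and apparatus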
US11921787B2 (en) Identity-aware data management
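TWI706370B (zh) Data statistics method and apparatus
CN116244650B (zh) Feature binning method, apparatus, electronic device and computer-readable storage medium
CN112818406B (zh) Prediction method and apparatus for a scorecard model
CN115758441A (zh) Method and apparatus for determining the intersection of private data of multiple parties
CN116842567A (zh) Privacy protection method for two-party frequent-item data mining
CN114091417A (zh) Table splitting method, apparatus, device and medium
CN117494150A (zh) Data processing method, apparatus, electronic device and storage medium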
WO2021230771A2 (en) Method of piece data synchronization describing a single entity and stored in different databases
JPWO2022192152A5 (zh)

Legal Events

Date Code Title Description
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 18874852; Country of ref document: EP; Kind code of ref document: A1