WO2017092696A1 - Method for safe integration of big data without leaking privacy - Google Patents

Method for safe integration of big data without leaking privacy

Info

Publication number
WO2017092696A1
Authority
WO
WIPO (PCT)
Prior art keywords
data set
party
data
fused
merged
Prior art date
Application number
PCT/CN2016/108245
Other languages
French (fr)
Chinese (zh)
Inventor
周雍恺
柴洪峰
何朔
何东杰
刘国宝
才华
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-12-02
Filing date
2016-12-01
Publication date
2017-06-08
Application filed by 中国银联股份有限公司
Publication of WO2017092696A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6272 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database by registering files or documents with a third party

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Storage Device Security (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A method for safe integration of big data, comprising: a first party and a second party negotiating associated fields, the data items required by each party, and a sorting rule; screening out, on the basis of the required data items, a first to-be-integrated data set and a second to-be-integrated data set from a first data set and a second data set respectively; sorting the first to-be-integrated data set and the second to-be-integrated data set according to the sorting rule, and removing from each of them the data corresponding to the associated fields; submitting the first to-be-integrated data set and the second to-be-integrated data set to a third-party computing platform so as to form an integrated data set; and the third-party computing platform generating a result data set by analysis and calculation on the integrated data set. The invention effectively prevents private data from being leaked while accomplishing the integration of big data, and facilitates information sharing on the premise of ensuring data security.

Description

Big data security fusion method without revealing privacy
Technical field
The invention relates to a big data security fusion method.
Background
With the introduction of the national "Internet Plus" strategy, the need for big data fusion across industries is becoming increasingly urgent. On the one hand, institutions welcome big data sharing: fusing different types of data can produce new analysis results, and the value of the data is multiplied as a result. On the other hand, both parties worry about private data leaking during fusion. The final analysis result is often only a statistical conclusion, yet the fusion computation itself requires every record to be exposed in full detail to the other party. This problem has become a major obstacle to big data collaboration and sharing between industries.
Therefore, those skilled in the art desire a reliable big data security fusion method that effectively shields private data.
Summary of the invention
An object of the present invention is to provide a big data security fusion method that effectively shields private data.
To achieve the above object, the present invention provides the following technical solution:
A big data security fusion method for fusing a first data set stored by a first party with a second data set stored by a second party, the method comprising the following steps: a) the first party and the second party negotiate an associated field, the data items each requires, and a sorting rule; b) based on the data items each requires, a first to-be-fused data set and a second to-be-fused data set are filtered out of the first data set and the second data set respectively; c) the first and second to-be-fused data sets are each sorted according to the sorting rule, and the data corresponding to the associated field is removed from each of them; d) the first party and the second party respectively submit the first and second to-be-fused data sets to a third-party computing platform to form a fused data set; e) the third-party computing platform analyses and calculates on the fused data set to generate a result data set.
Preferably, the third-party computing platform is independent of both the first party and the second party.
Preferably, after the analysis and calculation is completed, the first to-be-fused data set and the second to-be-fused data set are deleted from the computing system.
The big data security fusion method provided by the embodiments of the present invention achieves big data fusion while effectively preventing the leakage of private data. It promotes information sharing on the premise of ensuring data security, and broadens the breadth and depth of application of big data fusion technology. In addition, the method is simple to implement and low in cost, which favours its adoption in the industry.
Brief description of the drawings
FIG. 1 is a schematic flowchart of the big data security fusion method according to a first embodiment of the present invention.
Detailed description
It should be noted that, in each embodiment disclosed herein, the first party stores the first data set in a first database, and the second party stores the second data set in a second database.
The first and second data sets record different information, for example the activity information of multiple users on different occasions. The two data sets share some information, for example the users' identity information, which can be extracted as the associated field.
The present invention provides various embodiments for performing big data fusion on the first and second data sets.
As shown in FIG. 1, a first embodiment of the present invention provides a big data security fusion method comprising the following steps:
Step S10: The first party and the second party negotiate the associated field, the data items each requires, and the sorting rule.
Specifically, the first party and the second party hold a negotiation session and agree on the associated field, the data items each requires, and the sorting rule.
The data items each requires comprise the data items that the first party expects to obtain indirectly from the second party through the data fusion, and the data items that the second party expects to obtain indirectly from the first party. From these required data items, the negotiation session can determine which users' information each party is interested in, and the parties further agree on the identity information of those users.
The associated field represents the intersection of the information in the first and second data sets. It can be taken directly from any one or more of the following: the user's identity information; information on the card held by the user; and/or other identification information that uniquely identifies the user.
The sorting rule determines the order in which the to-be-fused data sets are sorted in the subsequent fusion process. Once determined, the sorting rule cannot be changed arbitrarily; it can only be changed through a further negotiation session. Sorting according to the agreed rule also fixes the correspondence between the data items in the first and second to-be-fused data sets.
The negotiation session can be initiated by either the first party or the second party, with the other party responding. Alternatively, the negotiation session may be initiated by an independent entity module distinct from the first party and the second party: after receiving its instruction, the first party and the second party conduct the negotiation session directly and notify the entity module once the session is complete.
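The outcome of the negotiation in step S10 can be pictured as a small immutable record. The following is a minimal sketch under stated assumptions, not the patent's implementation: the names `Agreement`, `negotiate`, `user_id`, `card_spend` and `store_visits` are hypothetical, and the agreed users are assumed to be simply those present in both parties' lists. The frozen dataclass models the requirement that the sorting rule cannot change without a new negotiation session.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Tuple


@dataclass(frozen=True)
class Agreement:
    """Outcome of one negotiation session (step S10); all names are illustrative."""
    associated_field: str               # shared key, e.g. a user identifier
    items_from_first: Tuple[str, ...]   # first-party items the second party wants
    items_from_second: Tuple[str, ...]  # second-party items the first party wants
    user_ids: Tuple[str, ...]           # users both parties agreed to include
    sort_ascending: bool = True         # sorting rule: order by the associated field


def negotiate(first_users, second_users, items_from_first, items_from_second,
              associated_field="user_id"):
    """Assumption: the agreed users are those that appear in both lists."""
    common = tuple(sorted(set(first_users) & set(second_users)))
    return Agreement(associated_field, tuple(items_from_first),
                     tuple(items_from_second), common)


agreement = negotiate(["u3", "u1", "u2"], ["u2", "u4", "u3"],
                      items_from_first=["card_spend"],
                      items_from_second=["store_visits"])
print(agreement.user_ids)

try:
    agreement.sort_ascending = False  # an unnegotiated change is rejected
except FrozenInstanceError:
    print("sorting rule is fixed until the parties renegotiate")
```

A changed rule would be expressed by running `negotiate` again and producing a new `Agreement`, which mirrors the further negotiation session described above.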
Step S20: Based on the data items each requires, a first to-be-fused data set and a second to-be-fused data set are filtered out of the first data set and the second data set respectively.
Specifically, based on the required data items determined in the negotiation session, the first to-be-fused data set can be filtered out of the first data set, and the second to-be-fused data set can be filtered out of the second data set. It will be understood that the first and second to-be-fused data sets contain the same number of data items, and that every data item in the first to-be-fused data set has a corresponding data item in the second to-be-fused data set, and vice versa.
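The filtering in step S20 can be sketched as follows. This is an illustration only: the function `filter_to_be_fused`, the field names and the sample records are hypothetical, and each data set is assumed to hold at most one record per user so that the two filtered sets end up the same length.

```python
def filter_to_be_fused(data_set, user_ids, associated_field, wanted_items):
    """Keep only the agreed users, and only the associated field plus wanted items."""
    allowed = set(user_ids)
    keep = [associated_field, *wanted_items]
    return [{key: row[key] for key in keep}
            for row in data_set if row[associated_field] in allowed]


# Hypothetical sample data held separately by each party.
first_data_set = [
    {"user_id": "u3", "card_spend": 120, "branch": "A"},
    {"user_id": "u1", "card_spend": 75, "branch": "B"},
    {"user_id": "u2", "card_spend": 40, "branch": "A"},
]
second_data_set = [
    {"user_id": "u2", "store_visits": 5, "city": "X"},
    {"user_id": "u4", "store_visits": 1, "city": "Y"},
    {"user_id": "u3", "store_visits": 9, "city": "X"},
]
agreed_users = ["u2", "u3"]  # assumed outcome of the negotiation session

first_tbf = filter_to_be_fused(first_data_set, agreed_users, "user_id", ["card_spend"])
second_tbf = filter_to_be_fused(second_data_set, agreed_users, "user_id", ["store_visits"])

# Both to-be-fused data sets must cover the same users.
assert len(first_tbf) == len(second_tbf)
```

The associated field is still present at this point; it is needed for sorting and is only removed in the next step.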
Step S30: The first and second to-be-fused data sets are each sorted according to the sorting rule, and the data corresponding to the associated field is removed from each of them.
Step S30 specifically comprises a sorting step and a culling step.
In one specific implementation, the sorting step may include: the first party and the second party each sort their own to-be-fused data set according to the sorting rule.
The culling step may include: the first party and the second party each remove the data corresponding to the associated field from their own to-be-fused data set.
After the culling step, the first and second to-be-fused data sets no longer contain user identity information, so the private information is effectively shielded. After the sorting step, the data items in the two to-be-fused data sets are in a definite one-to-one correspondence.
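A minimal sketch of the sorting and culling steps, assuming the agreed sorting rule is "ascending by the associated field" and that records are plain dictionaries. The function name `sort_and_cull` and the sample values are hypothetical.

```python
def sort_and_cull(to_be_fused, associated_field):
    """Sort by the associated field, then drop that field from every record."""
    ordered = sorted(to_be_fused, key=lambda row: row[associated_field])
    return [{key: value for key, value in row.items() if key != associated_field}
            for row in ordered]


# Each party runs this locally on its own to-be-fused data set.
first_tbf = [{"user_id": "u3", "card_spend": 120}, {"user_id": "u2", "card_spend": 40}]
second_tbf = [{"user_id": "u2", "store_visits": 5}, {"user_id": "u3", "store_visits": 9}]

first_out = sort_and_cull(first_tbf, "user_id")
second_out = sort_and_cull(second_tbf, "user_id")
print(first_out)   # [{'card_spend': 40}, {'card_spend': 120}]
print(second_out)  # [{'store_visits': 5}, {'store_visits': 9}]
```

Because both parties sort by the same key before removing it, position `i` in one output refers to the same user as position `i` in the other, even though neither output names that user.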
Step S40: The first party and the second party respectively submit the first and second to-be-fused data sets to a computing platform set up by a third party, so as to form a fused data set.
Specifically, the first party submits the first to-be-fused data set obtained after the sorting and culling steps to the third party's computing platform over a dedicated communication line, and the second party does likewise. The third-party computing platform is independent of both the first party and the second party.
Then, following the order produced by the sorting step, the data items in the first to-be-fused data set are combined one-to-one with the data items in the second to-be-fused data set to generate new data items, which together form the fused data set.
The resulting fused data set contains user activity information from both the first party and the second party but no user identity information, so the third party cannot tell which user performed these activities.
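The one-to-one combination on the third-party platform amounts to a positional merge. The sketch below assumes dictionary records whose field names do not collide between the two parties; the function name `fuse` and the sample values are hypothetical.

```python
def fuse(first_rows, second_rows):
    """Combine records position by position; no identity field is needed."""
    if len(first_rows) != len(second_rows):
        raise ValueError("to-be-fused data sets must have the same length")
    # Assumes the two parties use distinct field names, so nothing is overwritten.
    return [{**a, **b} for a, b in zip(first_rows, second_rows)]


first_submitted = [{"card_spend": 40}, {"card_spend": 120}]
second_submitted = [{"store_visits": 5}, {"store_visits": 9}]

fused = fuse(first_submitted, second_submitted)
print(fused)
# [{'card_spend': 40, 'store_visits': 5}, {'card_spend': 120, 'store_visits': 9}]
```

The length check matters because a positional merge silently misaligns records if one party submits more rows than the other.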
Step S50: The third-party computing platform analyses and calculates on the fused data set to generate a result data set.
In step S50, the third-party computing platform can analyse and calculate on the fused data set to generate a result data set. The result data set may be a statistical analysis result that is entirely different from the first and second to-be-fused data sets. The result data set can be fed back to the first party and the second party, neither of which can restore the original data from it.
Further, after the analysis and calculation is completed, the third-party computing platform may delete the first and second to-be-fused data sets, which further helps protect data security and privacy.
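As a rough illustration of step S50, the sketch below computes aggregate statistics over a fused data set and then discards the submitted inputs. The choice of statistics, the function name `analyse_and_delete` and the field names are assumptions; clearing in-memory lists only stands in for whatever deletion procedure a real platform would use.

```python
from statistics import mean


def analyse_and_delete(fused, first_rows, second_rows):
    """Return aggregate statistics, then discard the submitted inputs."""
    result = {
        "records": len(fused),
        "mean_card_spend": mean(row["card_spend"] for row in fused),
        "mean_store_visits": mean(row["store_visits"] for row in fused),
    }
    # Stand-in for deleting the two to-be-fused data sets from the platform.
    first_rows.clear()
    second_rows.clear()
    return result


first_rows = [{"card_spend": 40}, {"card_spend": 120}]
second_rows = [{"store_visits": 5}, {"store_visits": 9}]
fused = [{**a, **b} for a, b in zip(first_rows, second_rows)]

result = analyse_and_delete(fused, first_rows, second_rows)
print(result)
```

Only `result` would be fed back to the two parties. With so few records an average reveals a good deal about individuals, so the claim that the original data cannot be restored is more plausible for large data sets than for this toy example.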
The big data security fusion method provided by this embodiment shields the users' identity information while achieving big data fusion, thereby effectively preventing the leakage of private data. The method is safe, reliable and simple to implement.
In a further improved implementation of the above embodiment, step S10 may also include: the first party proposes to the second party the fields in the first data set that involve user privacy information or otherwise need protection. Correspondingly, step S30 further includes: removing the data corresponding to those fields from the first to-be-fused data set.
Similarly, the second party may also propose to the first party the fields in the second data set that involve user privacy information or otherwise need protection.
This improved implementation provides enhanced protection of user privacy information and is particularly suitable for settings with high data protection requirements.
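The improved implementation extends the culling step so that it drops any protected fields as well as the associated field. A minimal sketch, assuming the rows are already sorted and that the protected fields were declared during negotiation; `cull_fields`, `home_address` and the sample values are hypothetical.

```python
def cull_fields(rows, associated_field, protected_fields=()):
    """Drop the associated field and any fields declared as needing protection."""
    drop = {associated_field, *protected_fields}
    return [{key: value for key, value in row.items() if key not in drop}
            for row in rows]


# Rows already sorted by the associated field (hypothetical sample data).
sorted_rows = [
    {"user_id": "u2", "card_spend": 40, "home_address": "addr-2"},
    {"user_id": "u3", "card_spend": 120, "home_address": "addr-3"},
]

safe_rows = cull_fields(sorted_rows, "user_id", protected_fields=["home_address"])
print(safe_rows)  # [{'card_spend': 40}, {'card_spend': 120}]
```

Passing an empty `protected_fields` reproduces the basic culling step, so the same function covers both the first embodiment and the improved one.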
The above description covers only preferred embodiments of the present invention and is not intended to limit its scope of protection. Those skilled in the art may make various modifications without departing from the spirit of the invention and the appended claims.

Claims (5)

  1. A big data security fusion method for fusing a first data set stored by a first party with a second data set stored by a second party, the method comprising the following steps:
    a) the first party and the second party negotiate an associated field, the data items each requires, and a sorting rule;
    b) based on the data items each requires, a first to-be-fused data set and a second to-be-fused data set are filtered out of the first data set and the second data set respectively;
    c) the first to-be-fused data set and the second to-be-fused data set are each sorted according to the sorting rule, and the data corresponding to the associated field is removed from each of them;
    d) the first party and the second party respectively submit the first to-be-fused data set and the second to-be-fused data set to a third-party computing platform to form a fused data set;
    e) the third-party computing platform analyses and calculates on the fused data set to generate a result data set.
  2. The method according to claim 1, wherein the third-party computing platform is independent of both the first party and the second party.
  3. The method according to claim 1, wherein step e) further comprises:
    after the analysis and calculation is completed, deleting the first to-be-fused data set and the second to-be-fused data set from the computing system.
  4. The method according to claim 1, wherein the first data set and the second data set respectively record different activity information of a plurality of users, and the associated field comprises:
    identity information of the user;
    information on the card held by the user; and/or
    identification information that uniquely identifies the user.
  5. The method according to claim 4, wherein step a) further comprises:
    the first party proposing to the second party a field in the first data set that involves user privacy information;
    and step c) further comprises:
    removing the data corresponding to the field involving user privacy information from the first to-be-fused data set.
PCT/CN2016/108245 2015-12-02 2016-12-01 Method for safe integration of big data without leaking privacy WO2017092696A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510868103.4A CN105590066B (en) 2015-12-02 2015-12-02 The safe fusion method of big data of privacy is not revealed
CN201510868103.4 2015-12-02

Publications (1)

Publication Number Publication Date
WO2017092696A1 true WO2017092696A1 (en) 2017-06-08

Family

ID=55929639

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/108245 WO2017092696A1 (en) 2015-12-02 2016-12-01 Method for safe integration of big data without leaking privacy

Country Status (3)

Country Link
CN (1) CN105590066B (en)
TW (1) TWI664538B (en)
WO (1) WO2017092696A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111506921A (en) * 2020-04-17 2020-08-07 浙江同花顺智能科技有限公司 Data processing method, system, device, terminal and storage medium

Families Citing this family (6)

Publication number Priority date Publication date Assignee Title
CN105590066B (en) * 2015-12-02 2018-08-10 中国银联股份有限公司 The safe fusion method of big data of privacy is not revealed
CN109726580B (en) 2017-10-31 2020-04-14 阿里巴巴集团控股有限公司 Data statistical method and device
CN108683657B (en) * 2018-05-11 2021-03-02 试金石信用服务有限公司 Data security access method and device, terminal equipment and readable storage medium
US11138327B2 (en) 2018-12-27 2021-10-05 Industrial Technology Research Institute Privacy data integration method and server
CN109492435B (en) * 2019-01-10 2022-03-08 贵州财经大学 Privacy disclosure risk assessment method, device and system based on data open sharing
CN110674125B (en) * 2019-09-24 2022-05-17 北京明略软件系统有限公司 Filtering method and filtering device for data to be fused and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101834858A (en) * 2010-04-16 2010-09-15 北京工业大学 Trust and replacement-based privacy information protection method in data sharing
CN102867022A (en) * 2012-08-10 2013-01-09 上海交通大学 System for anonymizing set type data by partially deleting certain items
CN104679827A (en) * 2015-01-14 2015-06-03 北京得大信息技术有限公司 Big data-based public information association method and mining engine
CN105590066A (en) * 2015-12-02 2016-05-18 中国银联股份有限公司 Big data safe integration method capable of protecting privacy

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1728138A1 (en) * 2004-03-16 2006-12-06 Grid Analytics Llc System and method for aggregation and analysis of information from multiple disparate sources while assuring source and record anonymity using an exchange hub
US8607353B2 (en) * 2010-07-29 2013-12-10 Accenture Global Services Gmbh System and method for performing threat assessments using situational awareness
CN102638791B (en) * 2012-04-11 2014-09-10 南京邮电大学 Protection method for fusion integrity of sensor network data
US9594823B2 (en) * 2012-08-22 2017-03-14 Bitvore Corp. Data relationships storage platform
CN103425780B (en) * 2013-08-19 2016-08-17 曙光信息产业股份有限公司 The querying method of a kind of data and device
CN104866775A (en) * 2015-06-12 2015-08-26 四川友联信息技术有限公司 Bleaching method for financial data

Also Published As

Publication number Publication date
TW201727516A (en) 2017-08-01
CN105590066B (en) 2018-08-10
TWI664538B (en) 2019-07-01
CN105590066A (en) 2016-05-18

Similar Documents

Publication Publication Date Title
WO2017092696A1 (en) Method for safe integration of big data without leaking privacy
TWI684108B (en) Data statistics method and device
CN107784111B (en) Data mining method, device, equipment and storage medium
CN103259788B (en) For Formal Modeling and the verification method of security protocol
CN106951796B (en) Desensitization method and device for data privacy protection
US20210288964A1 (en) System, method and computer-readable medium for utilizing a shared computer system
CN107908979A (en) For the method and electronic equipment for being configured and being endorsed in block chain
CN107680003A (en) The node tree generation method and device of project supervision task
Huai et al. Towards automating model explanations with certified robustness guarantees
Rossi et al. Challenges of protecting confidentiality in social media data and their ethical import
Zhang et al. Privacyasst: Safeguarding user privacy in tool-using large language model agents
CN106228453A (en) A kind of method and apparatus obtaining user's occupational information
CN117171779A (en) Data processing device based on intersection protection
US8607355B2 (en) Social network privacy using morphed communities
US9235616B2 (en) Systems and methods for partial workflow matching
Wiraguna et al. The Implementation of Electronic Contract on Business to Business (B2B) Electronic Transaction
Bhattacharyya et al. Cloud computing for suitable data management and security within organisations
CN107798249A (en) The dissemination method and terminal device of behavioral pattern data
WO2019085665A1 (en) Data statistics method and apparatus
Al-Fedaghi Engineering privacy revisited
CN112541540A (en) Data fusion method, device, equipment and storage medium
CN108881235A (en) Identify the method and system of account
CN117236420B (en) Method and system for debugging vertical federation learning abnormal data based on data subset
Sun et al. Knowledge engineering and management
Dinur et al. Cyber Security Cryptography and Machine Learning

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 16870006

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 16870006

Country of ref document: EP

Kind code of ref document: A1