CN111078725A - Visual query method and device for multi-data-source combined data - Google Patents

Publication number
CN111078725A
Authority
CN
China
Prior art keywords
data
random number
participants
participant
visual query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911282807.8A
Other languages
Chinese (zh)
Other versions
CN111078725B (en)
Inventor
王智勇
魏雅婷
周舒悦
陈为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201911282807.8A
Publication of CN111078725A
Application granted
Publication of CN111078725B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a visual query method and device for multi-data-source combined data, belonging to the field of visual query and privacy protection. The method comprises the following steps: 1) establishing a server among the N participants; 2) upon a visual request, each participant calculates its local data $D_i$, $i \in [1, N]$; 3) every two participants exchange random number vectors, and each participant fuses its local data $D_i$ with the random number vectors it has received and sent, then uploads its fusion result to the server; 4) the server merges the fusion results uploaded by the N participants and returns the merged result to the front end for visual analysis. The data of multiple clients are merged without iteration: a single round of processing yields an accurate visual query result while ensuring that the private data are not revealed.

Description

Visual query method and device for multi-data-source combined data
Technical Field
The invention relates to the field of visual query and privacy protection, in particular to a visual query method and device for multi-data-source combined data.
Background
In the era of big data, attention to data privacy and security has become a worldwide trend, and collecting sensitive data on a large scale therefore carries great risk. With the General Data Protection Regulation (GDPR) taking effect on May 25, 2018, the European Union is paying more attention to the personal privacy and data security of users in all countries. However, in most industries data exist as isolated islands. Because of industry competition, privacy and security concerns, complex administrative procedures and similar problems, data integration is hard to achieve even among different departments of the same company, and in practice the cost of integrating data scattered across locations and organizations is enormous. How to merge sensitive data while guaranteeing privacy is therefore an important research topic.
To address the data island problem, Chinese patent publication No. CN103338198A discloses a method for solving network security and data island problems using a Linux system, comprising the following steps: (1) building a Linux intermediate layer: installing a Linux operating system platform on an x86 server with dual network cards, setting the IP address and gateway of each network card according to the respective network segments of the internal and external networks, and installing Oracle database software on the Linux platform; (2) configuring the Linux intermediate layer: blocking the network ports in the Linux system that are not needed for Oracle data exchange, and configuring the listener program (TNS) of the installed Oracle software so that it can exchange data with the database platforms of the relevant production intranet and extranet; (3) the production intranet and the production extranet each obtain information by accessing the Oracle database. In step (3), this is done as follows: a production intranet server A and an extranet server B are set up for data interaction between the intranet and the extranet, Oracle software is installed on both, a listener program (TNS) is configured, and both communicate via DB_LINK with the Oracle software installed on server C in the Linux intermediate layer. Servers A, B and C communicate as follows: the database content is mirrored to the Linux intermediate layer server C, and database information is queried through the mirror.
However, the above patent only enables mutual data access between an intranet and an extranet, and that access is one-to-one. In visual query, people often need to perform visual analysis locally on the merged data of multiple data sources, and they need relatively accurate query results. Because of the data island problem, merging different data sources while both preserving privacy and keeping the merged result accurate is a difficult task. Most existing methods resemble differential privacy or federated learning. The former adds a large amount of randomization to each data source, which sharply reduces the usability of the data; for some complex queries in particular, the randomization can largely mask the real result. The latter requires many iterations and repeated training, is costly, and does not guarantee absolute accuracy.
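To illustrate the accuracy concern raised above, the following minimal sketch adds Laplace noise to each source's count before summing, as a local differential privacy scheme might. The function names, the `epsilon` value and the counts are hypothetical and chosen only for illustration; none of them come from the patent or the cited prior art.

```python
import random
from typing import Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_total(counts: Sequence[int], epsilon: float, rng: random.Random) -> float:
    """Sum per-source counts after each source perturbs its own count.

    Assumes a count query with sensitivity 1, so the noise scale is 1/epsilon.
    """
    scale = 1.0 / epsilon
    return sum(c + laplace_noise(scale, rng) for c in counts)


if __name__ == "__main__":
    rng = random.Random(0)
    counts = [12, 7, 30]  # hypothetical per-source counts
    print("true total :", sum(counts))
    print("noisy total:", noisy_total(counts, epsilon=0.1, rng=rng))
```

With a small `epsilon` the noise scale is large relative to the counts, so the merged value can drift far from the true total. This is the loss of usability that the paragraph above describes.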
Disclosure of Invention
The invention aims to provide a visual query method and device for multi-data-source combined data, which can combine and query the data of multiple data sources on the premise of ensuring privacy and ensure the accuracy of the data.
To achieve the above object, in a first aspect, the present invention provides a visual query method for multi-data-source combined data, comprising the following steps:
step 1) establishing a server among N participants;
step 2) each participant calculates its local data $D_i$, $i \in [1, N]$, according to the visual request;
step 3) every two participants exchange random number vectors, and each participant fuses its local data $D_i$ with the random number vectors it has received and sent; the N participants then upload their local fusion results to the server;
step 4) the server merges the fusion results uploaded by the N participants and returns the merged result to the front end for visual analysis.
Wherein, step 3) includes:
step 3-1) each participant locally generates N-1 different random number vectors and distributes them to the other N-1 participants, one to each;
step 3-2) each participant locally calculates, for each random number vector it sent, the difference between that vector and the random number vector received from the participant it was sent to, and sums all the differences;
step 3-3) each participant adds the sum obtained in step 3-2) to its local data $D_i$ and uploads the resulting fusion result to the server.
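Steps 3-1) to 3-3) and the merge in step 4) can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `masked_uploads` and `merge`, the use of bounded Python integers (chosen so that the cancellation is exact, with no floating-point rounding), and the `bound` parameter are all hypothetical. The patent does not specify how the random numbers are drawn.

```python
import random
from typing import Dict, List, Tuple

Vector = List[int]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def sub(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def masked_uploads(local_data: List[Vector], rng: random.Random,
                   bound: int = 10**6) -> List[Vector]:
    """Return the fused result that each participant would upload."""
    n = len(local_data)
    dim = len(local_data[0])
    # Step 3-1: r[(i, j)] is the random vector participant i sends to j (j != i).
    r: Dict[Tuple[int, int], Vector] = {
        (i, j): [rng.randrange(-bound, bound) for _ in range(dim)]
        for i in range(n) for j in range(n) if i != j
    }
    uploads = []
    for i in range(n):
        fused = list(local_data[i])
        for j in range(n):
            if j == i:
                continue
            # Step 3-2: difference between the vector sent to j and the one
            # received from j, accumulated over all other participants.
            fused = add(fused, sub(r[(i, j)], r[(j, i)]))
        # Step 3-3: fused now holds local data plus the summed differences.
        uploads.append(fused)
    return uploads


def merge(uploads: List[Vector]) -> Vector:
    """Step 4: the server adds all uploads element-wise."""
    total = [0] * len(uploads[0])
    for upload in uploads:
        total = add(total, upload)
    return total


if __name__ == "__main__":
    data = [[5, 0, 2], [7, 1, 0], [9, 4, 4]]  # hypothetical local data
    uploads = masked_uploads(data, random.Random(1))
    print(merge(uploads))  # element-wise sum of the local data: [21, 5, 6]
```

Each difference added by one participant is subtracted by its counterpart, so the merged vector equals the element-wise sum of the local data regardless of which random values were drawn.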
With this technical scheme, the data of multiple clients are merged without iteration; a single round of processing yields an accurate visual query result, and the privacy of the data is protected from disclosure. The scheme is of value in the fields of visual query and privacy protection.
In a second aspect, the present invention provides a visual query device for multi-data-source combined data, including:
an acquisition module, used for acquiring the data uploaded by N participants;
a processing module, used for fusing the data uploaded by the N participants to obtain the exact value of the merged local data of all participants, and for returning that value to the front end for visual analysis.
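The two modules might be organized as in the following minimal sketch. The class and method names are hypothetical, and the rule that fusion waits for all N uploads is an assumption added here because the random terms cancel only when every participant's upload is included in the sum.

```python
from typing import Dict, List

Vector = List[int]


class AcquisitionModule:
    """Collects one upload per participant until all N have arrived."""

    def __init__(self, n_participants: int) -> None:
        self.n_participants = n_participants
        self.uploads: Dict[int, Vector] = {}

    def receive(self, participant_id: int, upload: Vector) -> None:
        # Store a copy so later changes by the caller do not affect the server.
        self.uploads[participant_id] = list(upload)

    def complete(self) -> bool:
        return len(self.uploads) == self.n_participants


class ProcessingModule:
    """Fuses the uploads element-wise; the result goes to the front end."""

    def fuse(self, acquisition: AcquisitionModule) -> Vector:
        if not acquisition.complete():
            raise ValueError("waiting for uploads from all participants")
        return [sum(column) for column in zip(*acquisition.uploads.values())]
```

For example, after `receive` has been called once for each of three participants, `fuse` returns a single vector whose entries are the column sums of the three uploads.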
In a third aspect, the present invention provides a visual query system for multi-data-source combined data, including:
a memory storing computer-executable instructions and data used or produced when executing the computer-executable instructions;
a processor communicatively coupled to the memory and configured to execute the computer-executable instructions stored in the memory;
wherein the computer-executable instructions, when executed, perform the visual query method for multi-data-source combined data of the first aspect.
In a fourth aspect, the present invention provides a storage medium including a program or instructions which, when executed, perform the visual query method for multi-data-source combined data of the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
the visual query method for multi-data-source combined data does not need repeated iterative training as federated learning does, nor does it protect privacy by limiting the accuracy of the query result; an accurate visual query result can be obtained, and the privacy of the data sources is not revealed at any point in the process.
Drawings
FIG. 1 is a schematic diagram of a visualization query method for merging data of multiple data sources according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the following describes the visual query method for multiple data sources combined data according to the present invention with reference to the following embodiments and accompanying drawings.
Examples
Referring to FIG. 1, N data owners (e.g., enterprises) $F_i$, $i = 1, \ldots, N$, each hold geographic data of the same dimensionality. By framing a region on a map, one wishes to obtain the exact merged data volume of the N data sources within the framed latitude and longitude range for further visual analysis. This example includes the following steps:
S101: A region is framed on the map at the front end, and the framed latitude and longitude range is sent as a request parameter to the clients of all participants (data owners).
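The request parameter of S101 and the local computation it triggers on each client can be sketched as follows. This is only an illustration under stated assumptions: the `BoundingBox` type, the `local_counts` function and the choice of a small grid of per-cell counts as the local data vector are hypothetical, since the patent does not fix the form of the local data beyond it being a vector of the same dimension on every client.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Framed range sent by the front end as the request parameter."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def local_counts(points: Iterable[Tuple[float, float]], box: BoundingBox,
                 rows: int = 2, cols: int = 2) -> List[int]:
    """Count a participant's points per grid cell inside the box.

    Returns a flat vector of length rows * cols that plays the role of the
    participant's local data. Assumes the box has positive width and height.
    """
    counts = [0] * (rows * cols)
    width = box.max_lon - box.min_lon
    height = box.max_lat - box.min_lat
    for lon, lat in points:
        if not box.contains(lon, lat):
            continue
        # Clamp so points on the upper edges fall into the last cell.
        col = min(int((lon - box.min_lon) / width * cols), cols - 1)
        row = min(int((lat - box.min_lat) / height * rows), rows - 1)
        counts[row * cols + col] += 1
    return counts
```

Because every client receives the same box and uses the same grid, all local vectors have the same length, which is what allows them to be masked with random vectors of matching dimension and summed on the server.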
S102: Each client calculates its local data $D_i$ for the requested latitude and longitude range and locally generates N-1 different random number vectors $R_{i,j}$ (R has the same dimension as D), then distributes these N-1 random number vectors to the other N-1 clients, one to each, as shown in (a) of FIG. 1. Here i is the subscript of the current client, $i \in [1, N]$, and j is the subscript of the target client that receives the random number vector $R_{i,j}$, $j \in [1, N]$ and $j \neq i$. For example, the random number vector sent by client 1 to client 3 is $R_{1,3}$.
S103: Each client $Client_i$ has locally generated N-1 random number vectors $R_{i,j}$, $j \in [1, N]$ and $j \neq i$, and has received the random number vectors $R_{j,i}$, $j \in [1, N]$ and $j \neq i$, sent by the other N-1 clients, so it holds (N-1) × 2 random number vectors in total, and the two groups correspond one to one. For example, $R_{1,3}$ and $R_{3,1}$ are the random number vector sent by client 1 to client 3 and the one sent by client 3 to client 1, respectively. Equivalently, each client holds N-1 pairs of random number vectors locally; client 1, for instance, holds the pairs $(R_{1,2}, R_{2,1}), (R_{1,3}, R_{3,1}), \ldots, (R_{1,N}, R_{N,1})$. The two vectors of each pair are subtracted, the N-1 differences are summed, the sum is added to the current client's local data $D_i$, and the result is uploaded to the server. Specifically:
First, calculate the difference between the random number vector that $Client_i$ sends to $Client_j$ and the one that $Client_j$ sends to $Client_i$, $I_{i,j} = R_{i,j} - R_{j,i}$, $j \in [1, N]$ and $j \neq i$, and sum the differences to obtain
$$\sum_{j=1,\, j \neq i}^{N} I_{i,j} = \sum_{j=1,\, j \neq i}^{N} \left( R_{i,j} - R_{j,i} \right)$$
The data uploaded to the server, $D\_upload_i$, is the current client's data $D_i$ plus the above sum, as shown in (b) of FIG. 1.
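As a worked illustration of the upload computation, consider N = 3 clients with scalar data. All numbers below are made up for the example and do not come from the patent.

```latex
\begin{aligned}
&D_1 = 5,\; D_2 = 7,\; D_3 = 9, \qquad
R_{1,2} = 4,\; R_{2,1} = 1,\; R_{1,3} = 6,\; R_{3,1} = 2,\; R_{2,3} = 3,\; R_{3,2} = 8 \\
&D\_upload_1 = 5 + (4 - 1) + (6 - 2) = 12 \\
&D\_upload_2 = 7 + (1 - 4) + (3 - 8) = -1 \\
&D\_upload_3 = 9 + (2 - 6) + (8 - 3) = 10 \\
&D\_upload_1 + D\_upload_2 + D\_upload_3 = 21 = D_1 + D_2 + D_3
\end{aligned}
```

None of the three uploaded values equals the owner's true value, yet their sum is exactly the sum of the true values.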
S104: The server receives the data uploaded by the N clients and merges them, as shown in (c) of FIG. 1. The merged result is
$$D\_sum = \sum_{i=1}^{N} D\_upload_i = \sum_{i=1}^{N} D_i + \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \left( R_{i,j} - R_{j,i} \right) = \sum_{i=1}^{N} D_i$$
The random number terms cancel out, and the result is the exact value of the merged data of the N clients.
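The cancellation can be checked directly. The following derivation is an illustrative sketch that uses only the symbols defined in this example; it regroups the double sum by unordered pairs of clients, each of which contributes a difference and its negative.

```latex
\begin{aligned}
I_{j,i} &= R_{j,i} - R_{i,j} = -I_{i,j}, \\
\sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} I_{i,j}
  &= \sum_{1 \le i < j \le N} \left( I_{i,j} + I_{j,i} \right) = 0, \\
D\_sum &= \sum_{i=1}^{N} \Bigl( D_i + \sum_{\substack{j=1 \\ j \neq i}}^{N} I_{i,j} \Bigr)
  = \sum_{i=1}^{N} D_i .
\end{aligned}
```

The argument relies on all N uploads being present: a sum over only some of the clients still contains the differences shared with the missing clients, which do not cancel.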
S105: The server returns the exact value D_sum of the visual query result to the front end, and the front end performs visual analysis. The whole process needs only a single round, and an accurate visual query result is obtained while data privacy is guaranteed.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (8)

1. A visual query method for multi-data-source combined data, characterized by comprising the following steps:
step 1) establishing a server among N participants;
step 2) each participant calculates its local data $D_i$, $i \in [1, N]$, according to the visual request;
step 3) every two participants exchange random number vectors, and each participant fuses its local data $D_i$ with the random number vectors it has received and sent; the N participants then upload their local fusion results to the server;
step 4) the server merges the fusion results uploaded by the N participants and returns the merged result to the front end for visual analysis.
2. The visual query method for multi-data-source combined data according to claim 1, wherein step 3) comprises:
step 3-1) each participant locally generates N-1 different random number vectors and distributes them to the other N-1 participants, one to each;
step 3-2) each participant locally calculates, for each random number vector it sent, the difference between that vector and the random number vector received from the participant it was sent to, and sums all the differences;
step 3-3) each participant adds the sum obtained in step 3-2) to its local data $D_i$ and uploads the resulting fusion result to the server.
3. The method for visual query of multiple data sources combined data according to claim 2, wherein in step 3-1), the dimensionality of the random number vector generated locally by each participant is the same as the dimensionality of the local data.
4. The method for visual query of multiple data sources combined data according to claim 2, wherein step 3-1) further comprises that each participant locally forms N-1 random number pairs, each random number pair consisting of a random number vector and a random number vector sent by the participant who correspondingly receives the random number vector.
5. The visual query method for multi-data source merged data according to claim 1, wherein in step 4), the server merges the merged results uploaded by the N participants and cancels the random number part, and the obtained result is an accurate value of the merged local data of each participant.
6. A visual query device for multi-data-source combined data, characterized by comprising:
an acquisition module, used for acquiring the data uploaded by N participants;
a processing module, used for fusing the data uploaded by the N participants to obtain the exact value of the merged local data of all participants, and for returning that value to the front end for visual analysis.
7. A visual query system for merging data from multiple data sources, comprising:
a memory storing computer-executable instructions and data for use or production in executing the computer-executable instructions;
and a processor communicatively coupled to the memory and configured to execute computer-executable instructions stored by the memory;
the computer-executable instructions, when executed, perform the visual query method for multi-data-source combined data of any one of claims 1 to 5.
8. A storage medium comprising a program or instructions which, when executed, perform a method for visual query of multiple data sources combined data according to any one of claims 1 to 5.
CN201911282807.8A 2019-12-13 2019-12-13 Visual query method and device for multi-data-source combined data Active CN111078725B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911282807.8A CN111078725B (en) 2019-12-13 2019-12-13 Visual query method and device for multi-data-source combined data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911282807.8A CN111078725B (en) 2019-12-13 2019-12-13 Visual query method and device for multi-data-source combined data

Publications (2)

Publication Number Publication Date
CN111078725A (en) 2020-04-28
CN111078725B (en) 2022-07-12

Family

ID=70314293

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911282807.8A Active CN111078725B (en) 2019-12-13 2019-12-13 Visual query method and device for multi-data-source combined data

Country Status (1)

Country Link
CN (1) CN111078725B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553485A (en) * 2020-04-30 2020-08-18 深圳前海微众银行股份有限公司 View display method, device, equipment and medium based on federal learning model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120072353A1 (en) * 2010-02-11 2012-03-22 Christopher Boone Enhanced system and method for multipath contactless transactions
CN105574078A (en) * 2015-12-02 2016-05-11 上海华兴数字科技有限公司 Data analysis system and method for excavator
CN105760477A (en) * 2016-02-15 2016-07-13 中国建设银行股份有限公司 Data query method and system for multiple data sources and associated equipment therefore
US20160378867A1 (en) * 2015-06-23 2016-12-29 Drastin, Inc. Systems and Methods for Instant Crawling, Curation of Data Sources, and Enabling Ad-hoc Search
US20190026630A1 (en) * 2016-03-28 2019-01-24 Sony Corporation Information processing apparatus and information processing method

Also Published As

Publication number Publication date
CN111078725B (en) 2022-07-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant