CN114692200A - Privacy protection distributed graph data feature decomposition method and system - Google Patents

Info

Publication number
CN114692200A
Authority
CN
China
Prior art keywords
encryption
computing terminal
target
degree
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210341719.6A
Other languages
Chinese (zh)
Other versions
CN114692200B (en
Inventor
郑宜峰
王松磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology filed Critical Shenzhen Graduate School Harbin Institute of Technology
Priority to CN202210341719.6A priority Critical patent/CN114692200B/en
Publication of CN114692200A publication Critical patent/CN114692200A/en
Application granted granted Critical
Publication of CN114692200B publication Critical patent/CN114692200B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6263 Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a privacy-preserving distributed graph data feature decomposition method and system. In the method, randomly sampled graph nodes holding local graph data encrypt their own degree information and send it to a first computing terminal and a second computing terminal. The two computing terminals cooperatively compute, in the ciphertext domain, first encryption degree distribution information and second encryption degree distribution information, from which each graph node can determine the target interval to which its degree belongs and select a suitable sampling sensitivity for sampling noise. According to the noise, the graph node adds false edges with weight 0 to the real graph adjacency matrix, which is represented sparsely as a set of triples; it then encrypts the triple set containing the false edges and sends the ciphertexts to the first computing terminal and the second computing terminal respectively for encrypted feature decomposition. On the premise of protecting node privacy, the sparsity of the graph data is preserved and the effectiveness of the feature decomposition is guaranteed.

Description

Privacy protection distributed graph data feature decomposition method and system
Technical Field
The invention relates to the technical field of information security, and in particular to a privacy-preserving distributed graph data feature decomposition method and system.
Background
Graph data can describe complex interrelationships between entities, and a wide variety of analysis tasks can be performed on such information-rich data. These tasks become more challenging when the graph data is distributed, meaning that each entity can obtain only part of the data about the entire graph (called local graph data). For example, in a phonebook network, each user is a graph node, and each user's phonebook represents the contacts (i.e., edges in the graph data) between that user and other users. If the phonebook network is modeled as a graph, no entity can directly obtain the whole graph; each user knows only part of the connection relationships (i.e., the contact information contained in his or her own phonebook).
Collecting such distributed graph data for graph analysis raises significant privacy concerns (e.g., no one would want to share his or her phonebook). If the local graph data owned by each user is not protected, users may be unwilling to participate in such analysis. Privacy protection mechanisms are therefore needed for analysis tasks on distributed graph data, so that valuable graph analysis can be performed without compromising each user's sensitive and private local graph data.
Among graph analysis tasks, feature decomposition (eigendecomposition) is a very common basic task. Feature decomposition of graph data operates on the adjacency matrix of the graph to produce eigenvalues and eigenvectors, which provide basic information for various graph analysis tasks such as community structure detection, discovery of important community members, graph partitioning, and web page ranking. At present, however, there is no feature decomposition scheme that provides privacy protection for distributed graph data.
Thus, there is a need for improvements and enhancements in the art.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a method and a system for decomposing the characteristics of distributed graph data with privacy protection, and aims to solve the problem that a characteristic decomposition scheme for realizing privacy protection on the distributed graph data does not exist in the prior art.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
in a first aspect of the present invention, a privacy-preserving distributed graph data feature decomposition method is provided, where the method includes:
generating an initial set by a target graph node in a global graph according to local graph data, wherein the initial set comprises a plurality of groups of triples, and each group of triples comprises a node mark of the target graph node, a node mark of an adjacent graph node of the target graph node and the weight of a connecting edge of the target graph node and the adjacent graph node;
the target graph node encrypts the degree of the target graph node based on function secret sharing to obtain first encryption degree information and second encryption degree information, the first encryption degree information is sent to a first computing terminal, and the second encryption degree information is sent to a second computing terminal;
the first computing terminal and the second computing terminal generate first encryption degree distribution information and second encryption degree distribution information of global graph data according to the first encryption degree information and the second encryption degree information sent by the target graph nodes;
the target graph node determines a target interval to which the degree of the target graph node belongs according to the received first encryption degree distribution information and second encryption degree distribution information, determines a target sampling sensitivity according to boundary information of the target interval, samples noise from a Laplace distribution according to the target sampling sensitivity, and adds false triples to the initial set according to the noise to generate a target set, wherein the weight value of each false triple is 0;
the target graph node encrypts the target set based on additive secret sharing to obtain a first encryption set and a second encryption set, sends the first encryption set to a first computing terminal, and sends the second encryption set to a second computing terminal;
and the first computing terminal and the second computing terminal carry out feature decomposition on the global graph data according to the first encryption set and the second encryption set corresponding to each node in the global graph.
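The node-side steps of the method above (building the triple set, padding it with weight-0 false triples, and splitting it into two additive shares) can be illustrated with a minimal sketch. Several details are assumptions made only for illustration and are not stated in the text: the ring size `MODULUS`, the privacy parameter `epsilon`, the rule that the number of false triples is the Laplace sample rounded and clamped at zero, and all function names.

```python
import random

MODULUS = 2 ** 64  # assumed ring size for additive secret sharing


def build_initial_set(node_id, neighbors):
    """Turn local graph data {neighbor: weight} into (node, neighbor, weight) triples."""
    return [(node_id, nbr, weight) for nbr, weight in sorted(neighbors.items())]


def add_false_triples(initial_set, node_id, all_nodes, sensitivity, epsilon, rng):
    """Append weight-0 triples; their number is a rounded, non-negative Laplace sample."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential variables is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    count = max(0, round(noise))  # assumption: negative noise adds no false edges
    taken = {nbr for _, nbr, _ in initial_set} | {node_id}
    candidates = [n for n in all_nodes if n not in taken]
    rng.shuffle(candidates)
    return initial_set + [(node_id, n, 0) for n in candidates[:count]]


def share_value(value, rng):
    """Split an integer into two additive shares modulo MODULUS."""
    first = rng.randrange(MODULUS)
    return first, (value - first) % MODULUS


def share_set(triples, rng):
    """Share every component of every triple; returns one set per computing terminal."""
    first_set, second_set = [], []
    for triple in triples:
        pairs = [share_value(component, rng) for component in triple]
        first_set.append(tuple(p[0] for p in pairs))
        second_set.append(tuple(p[1] for p in pairs))
    return first_set, second_set


def reconstruct_set(first_set, second_set):
    """Recombine the two shared sets (only for checking the sketch)."""
    return [tuple((a + b) % MODULUS for a, b in zip(t1, t2))
            for t1, t2 in zip(first_set, second_set)]


rng = random.Random(2022)
initial = build_initial_set(1, {2: 5, 3: 1})
target = add_false_triples(initial, 1, range(1, 11), sensitivity=2.0, epsilon=1.0, rng=rng)
first_set, second_set = share_set(target, rng)
print(reconstruct_set(first_set, second_set) == target)
```

Each share on its own is a uniformly random ring element, so neither computing terminal can tell real triples from false ones by looking at its own set.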
The privacy-protected distributed graph data feature decomposition method includes, before the target graph node encrypts the degree of the target graph node based on function secret sharing, the steps of:
the first computing terminal and/or the second computing terminal randomly selects part of nodes in all nodes of the global graph to send encryption requests;
and after the target graph node receives the encryption request, the target graph node encrypts the degree of the target graph node based on function secret sharing.
The privacy-protected distributed graph data feature decomposition method, wherein the target graph node encrypts the degree of the target graph node based on function secret sharing to obtain first encryption degree information and second encryption degree information, includes:
and the target graph node acquires the first encryption degree information and the second encryption degree information output by a first preset algorithm in function secret sharing, wherein the input of the first preset algorithm comprises the degree of the target graph node.
The privacy-protected distributed graph data feature decomposition method, in which the first and second computing terminals generate first and second encryption degree distribution information of global graph data according to the first and second encryption degree information sent by the plurality of target graph nodes, includes:
the first computing terminal inputs the first encryption degree information of the target graph node and a target scale into a second preset algorithm in function secret sharing to obtain first encryption degree comparison information between the degree of the target graph node and the target scale, and the second computing terminal inputs the second encryption degree information of the target graph node and the target scale into the second preset algorithm to obtain second encryption degree comparison information between the degree of the target graph node and the target scale;
wherein the sum of the first encryption degree comparison information and the second encryption degree comparison information between the degree of the target graph node and the target scale is 1 when the degree of the target graph node is equal to the target scale, and is 0 otherwise;
the first computing terminal obtains first encryption histogram information and the second computing terminal obtains second encryption histogram information, wherein the first encryption histogram information includes first encryption graph node quantity information corresponding to each target scale, each piece of first encryption graph node quantity information being the sum of all the first encryption degree comparison information corresponding to one target scale, and the second encryption histogram information includes second encryption graph node quantity information corresponding to each target scale, each piece of second encryption graph node quantity information being the sum of all the second encryption degree comparison information corresponding to one target scale;
the first computing terminal acquires the first encryption degree comparison information between the degrees of the target graph nodes and each target scale as first encryption degree histogram information, and the second computing terminal acquires the second encryption degree comparison information between the degrees of the target graph nodes and each target scale as second encryption degree histogram information;
and the first computing terminal and the second computing terminal determine the first encryption degree distribution information and the second encryption degree distribution information according to the first encryption degree histogram information and the second encryption degree histogram information.
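The histogram step above can be illustrated with a simplified sketch. It does not implement function secret sharing: as a stand-in, each node additively shares a one-hot indicator vector over the target scales, which reproduces the stated property that the two comparison shares sum to 1 exactly when the degree equals the scale. The ring size, the maximum degree bound `max_degree`, and all function names are assumptions for illustration.

```python
import random

MODULUS = 2 ** 64  # assumed ring size


def share_degree_indicator(degree, max_degree, rng):
    """Split the one-hot indicator of `degree` over scales 0..max_degree into two share vectors."""
    first, second = [], []
    for scale in range(max_degree + 1):
        bit = 1 if scale == degree else 0
        s1 = rng.randrange(MODULUS)
        first.append(s1)
        second.append((bit - s1) % MODULUS)
    return first, second


def accumulate(histogram_share, indicator_share):
    """Terminal-side step: add one node's share vector into the local histogram share."""
    return [(h + s) % MODULUS for h, s in zip(histogram_share, indicator_share)]


def secure_histogram(degrees, max_degree, rng):
    """Simulate both terminals; each returned vector is one share of the degree histogram."""
    hist1 = [0] * (max_degree + 1)
    hist2 = [0] * (max_degree + 1)
    for degree in degrees:
        s1, s2 = share_degree_indicator(degree, max_degree, rng)
        hist1 = accumulate(hist1, s1)
        hist2 = accumulate(hist2, s2)
    return hist1, hist2


def reconstruct(hist1, hist2):
    """Recombine the two histogram shares (only for checking the sketch)."""
    return [(a + b) % MODULUS for a, b in zip(hist1, hist2)]


h1, h2 = secure_histogram([1, 3, 3, 0, 2, 3], max_degree=4, rng=random.Random(7))
print(reconstruct(h1, h2))
```

In this stand-in a node sends one ring element per scale to each terminal; function secret sharing serves the same purpose with compact keys instead of full vectors.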
The privacy-protected distributed graph data feature decomposition method, wherein each bit value in the first encryption degree distribution information and the second encryption degree distribution information is 0 or 1, and wherein the first computing terminal and the second computing terminal determining the first encryption degree distribution information and the second encryption degree distribution information according to the first encryption degree histogram information and the second encryption degree histogram information includes:
the first computing terminal and the second computing terminal determine the number of target nodes in each interval according to the number of the target graph nodes sending the encryption degree information and the preset interval number;
the first computing terminal sequentially adds each first encryption graph node quantity information in the first encryption degree histogram information to a first accumulator according to the size sequence of the corresponding target scale, and the second computing terminal sequentially adds each second encryption graph node quantity information in the second encryption degree histogram information to a second accumulator according to the size sequence of the corresponding target scale;
after the first encryption graph node quantity information and the second encryption graph node quantity information are respectively added into the first accumulator and the second accumulator, the first computing terminal obtains a first encryption comparison result according to the first accumulator and generates a new one-bit numerical value in the first encryption degree distribution information according to the first encryption comparison result, the second computing terminal obtains a second encryption comparison result according to the second accumulator and generates a new one-bit numerical value in the second encryption degree distribution information according to the second encryption comparison result, wherein when the sum of the first accumulator and the second accumulator is not less than the target node quantity, the XOR gate operation result of the first encryption comparison result and the second encryption comparison result is 1, and when the sum of the first accumulator and the second accumulator is less than the target node quantity, the exclusive-or gate operation result of the first encryption comparison result and the second encryption comparison result is 0;
the first computing terminal inverts the latest one-bit numerical value in the first encryption degree distribution information to obtain an inversion bit, the first computing terminal obtains a first secret share based on additive secret sharing calculation, the second computing terminal obtains a second secret share based on additive secret sharing calculation, wherein the sum of the first secret share and the second secret share is the product of a first value and a second value, the first value is the exclusive-or gate operation result of the inversion bit and the latest one-bit in the second encryption degree distribution information, and the second value is the sum of the first accumulator and the second accumulator;
the first computing terminal updates the value of the first accumulator to the first secret share and adds the next first encryption graph node quantity information to the first accumulator, and the second computing terminal updates the value of the second accumulator to the second secret share and adds the next second encryption graph node quantity information to the second accumulator.
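The accumulator logic above is easier to follow in plaintext. The sketch below is a plaintext reference of what the two terminals jointly compute on shares: a running count is compared with the per-interval target, a 1 bit is emitted when the target is reached, and the accumulator is multiplied by the inverted bit so that it resets at an interval boundary. The integer division used for the target and the helper `interval_of`, which shows one possible way a node could read its interval from the bits, are assumptions.

```python
def degree_distribution_bits(histogram, num_intervals):
    """Plaintext reference: one bit per target scale, 1 marks the upper end of an interval."""
    total_nodes = sum(histogram)
    target = total_nodes // num_intervals  # assumed: nodes per interval via integer division
    bits = []
    accumulator = 0
    for count in histogram:  # scales visited in increasing order
        accumulator += count
        bit = 1 if accumulator >= target else 0
        bits.append(bit)
        accumulator = (1 - bit) * accumulator  # reset once the target is reached
    return bits


def interval_of(degree, bits):
    """Hypothetical helper: (low, high) scale bounds of the interval containing `degree`."""
    low = 0
    for high, bit in enumerate(bits):
        if bit == 1:
            if degree <= high:
                return low, high
            low = high + 1
    return low, len(bits) - 1


bits = degree_distribution_bits([1, 1, 1, 3, 0], num_intervals=2)
print(bits, interval_of(1, bits), interval_of(3, bits))
```

In the protocol itself the comparison yields XOR shares of each bit and the reset is a secret-shared multiplication, so neither terminal sees the running count.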
The privacy-protected distributed graph data feature decomposition method, wherein the first computing terminal obtaining a first encryption comparison result according to the first accumulator and generating a new one-bit value in the first encryption degree distribution information according to the first encryption comparison result, and the second computing terminal obtaining a second encryption comparison result according to the second accumulator and generating a new one-bit value in the second encryption degree distribution information according to the second encryption comparison result, includes:
a third computing terminal generates a first key, a second key, a first random number and a second random number according to a third preset algorithm of function secret sharing and the number of target nodes, sends the first key and the first random number to the first computing terminal, and sends the second key and the second random number to the second computing terminal;
the first computing terminal sends the sum of the first random number and the first accumulator to the second computing terminal, and the second computing terminal sends the sum of the second random number and the second accumulator to the first computing terminal, so that the first computing terminal and the second computing terminal each obtain a scrambling input value, the scrambling input value being the sum of the first random number, the second random number, the first accumulator and the second accumulator;
the first computing terminal inputs the scrambling input value and the first key to a fourth preset algorithm of function secret sharing to obtain a first encryption bit, and the second computing terminal inputs the scrambling input value and the second key to the fourth preset algorithm of function secret sharing to obtain a second encryption bit, wherein when the sum of the first accumulator and the second accumulator is less than the number of target nodes, the XOR gate operation result of the first encryption bit and the second encryption bit is 1, and when the sum of the first accumulator and the second accumulator is not less than the number of target nodes, the XOR gate operation result of the first encryption bit and the second encryption bit is 0;
the first computing terminal takes the first encryption bit as the new bit value in the first encryption degree distribution information and the second computing terminal flips the second encryption bit and takes it as the new bit value in the second encryption degree distribution information, or the first computing terminal flips the first encryption bit and takes it as the new bit value in the first encryption degree distribution information and the second computing terminal takes the second encryption bit as the new bit value in the second encryption degree distribution information.
The privacy-protected distributed graph data feature decomposition method, wherein the first computing terminal obtaining a first encryption comparison result according to the first accumulator and generating a new one-bit value in the first encryption degree distribution information according to the first encryption comparison result, and the second computing terminal obtaining a second encryption comparison result according to the second accumulator and generating a new one-bit value in the second encryption degree distribution information according to the second encryption comparison result, includes:
the first computing terminal acquires a first random number, the second computing terminal acquires a second random number, and the sum of the first random number and the second random number is the number of the target nodes;
the first computing terminal obtains first bit data, the second computing terminal obtains second bit data, the first bit data is bit data corresponding to the difference between the first accumulator and the first random number, and the second bit data is bit data corresponding to the difference between the second accumulator and the second random number;
the first computing terminal and the second computing terminal input the bit data they respectively hold into a parallel prefix adder circuit and perform XOR gate and AND gate calculations to obtain the most significant bit of the first bit data and the most significant bit of the second bit data, respectively, wherein when the sum of the first accumulator and the second accumulator is less than the number of target nodes, the XOR gate operation result of the most significant bits of the first bit data and the second bit data is 1, and when the sum of the first accumulator and the second accumulator is not less than the number of target nodes, the XOR gate operation result of the most significant bits of the first bit data and the second bit data is 0;
the first computing terminal takes the most significant bit of the first bit data as the new bit value in the first encryption degree distribution information and the second computing terminal flips the most significant bit of the second bit data and takes it as the new bit value in the second encryption degree distribution information, or the first computing terminal flips the most significant bit of the first bit data and takes it as the new bit value in the first encryption degree distribution information and the second computing terminal takes the most significant bit of the second bit data as the new bit value in the second encryption degree distribution information.
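The idea behind this comparison is that, in two's complement, the most significant bit of (accumulator sum minus target) is 1 exactly when the accumulator sum is smaller than the target. The following plaintext sketch shows that idea with a Kogge-Stone style parallel prefix adder built only from XOR and AND operations. It runs on cleartext bits rather than on secret-shared bits, and the word length `BITS` and all names are assumptions for illustration.

```python
import random

BITS = 32                 # assumed word length
MASK = (1 << BITS) - 1


def to_bits(value):
    """Little-endian bit list of value modulo 2**BITS."""
    return [(value >> i) & 1 for i in range(BITS)]


def msb_of_sum(a_bits, b_bits):
    """MSB of (a + b) mod 2**BITS using only XOR and AND (parallel prefix carries)."""
    generate = [x & y for x, y in zip(a_bits, b_bits)]
    propagate = [x ^ y for x, y in zip(a_bits, b_bits)]
    first_propagate = propagate[:]  # needed for the final sum bit
    distance = 1
    while distance < BITS:
        new_g, new_p = generate[:], propagate[:]
        for i in range(distance, BITS):
            # Generate and propagate are mutually exclusive, so XOR can replace OR.
            new_g[i] = generate[i] ^ (propagate[i] & generate[i - distance])
            new_p[i] = propagate[i] & propagate[i - distance]
        generate, propagate = new_g, new_p
        distance *= 2
    carry_into_msb = generate[BITS - 2]
    return first_propagate[BITS - 1] ^ carry_into_msb


def less_than(acc1, acc2, target, rng):
    """Return 1 if (acc1 + acc2) < target, where acc1 and acc2 are additive shares."""
    r1 = rng.randrange(1 << BITS)
    r2 = (target - r1) & MASK           # r1 + r2 = target (mod 2**BITS)
    first_data = (acc1 - r1) & MASK     # held by the first terminal
    second_data = (acc2 - r2) & MASK    # held by the second terminal
    return msb_of_sum(to_bits(first_data), to_bits(second_data))


rng = random.Random(5)
acc1 = rng.randrange(1 << BITS)
acc2 = (5 - acc1) & MASK  # the two shares add up to 5
print(less_than(acc1, acc2, 7, rng), less_than(acc1, acc2, 3, rng))
```

When such a circuit is evaluated on XOR-shared bits, XOR gates can be computed locally while each AND gate requires interaction, which is why the logarithmic depth of a parallel prefix adder is attractive.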
The privacy-protected distributed graph data feature decomposition method, wherein the feature decomposition of the global graph data is performed by the first computing terminal and the second computing terminal according to the first encryption set and the second encryption set corresponding to each node in the global graph, and includes:
the first computing terminal obtains a first encryption adjacent matrix according to the first encryption set corresponding to each node in the global graph, and the second computing terminal obtains a second encryption adjacent matrix according to the second encryption set corresponding to each node in the global graph;
the first computing terminal and the second computing terminal perform dimensionality reduction on the sum of the first encryption adjacent matrix and the second encryption adjacent matrix based on additive secret sharing to obtain a dimensionality reduction matrix;
the first computing terminal and the second computing terminal execute a QR algorithm on the dimension reduction matrix based on additive secret sharing to obtain encrypted eigenvalues and encrypted eigenvectors of the global graph data;
for square root operation in the dimension reduction process, the first computing terminal and the second computing terminal iteratively calculate through a second computing formula based on additive secret sharing to obtain the reciprocal of the square root;
the second calculation formula is:
Figure BDA0003579650210000061
where y'_n denotes the reciprocal of the square root obtained in the n-th iteration, and x' denotes the number whose square root is to be taken.
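The formula image is not reproduced in this text. One well-known iteration that matches the description (an iterative computation of the reciprocal square root from y'_n and x' using only additions and multiplications, which suits additive secret sharing) is the Newton-Raphson update y_{n+1} = y_n (3 - x y_n^2) / 2. Whether this is exactly the patent's second calculation formula is an assumption; the plaintext sketch below only illustrates that iteration, and the function name and starting value are hypothetical.

```python
def reciprocal_sqrt(x, y0, iterations=20):
    """Newton-Raphson iteration for 1/sqrt(x); converges when 0 < y0 < sqrt(3 / x)."""
    y = y0
    for _ in range(iterations):
        y = 0.5 * y * (3.0 - x * y * y)  # additions and multiplications only
    return y


inverse_root = reciprocal_sqrt(4.0, y0=0.1)
print(inverse_root, 4.0 * inverse_root)  # second value approximates sqrt(4)
```

Multiplying the result by x gives the square root itself, so no division or direct root extraction is needed on the shared values.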
The privacy-protected distributed graph data feature decomposition method, wherein the first computing terminal and the second computing terminal executing a QR algorithm on the dimension reduction matrix based on additive secret sharing includes:
in the i-th iteration of the QR algorithm, the first computing terminal obtains a first encryption matrix and the second computing terminal obtains a second encryption matrix, the sum of the first encryption matrix and the second encryption matrix being a target matrix, and the target matrix being the matrix formed by the elements at positions (i, i), (i, i+1), (i+1, i) and (i+1, i+1) of the plaintext Givens rotation matrix used in the i-th iteration;
for matrix multiplication in the QR algorithm, the first computing terminal and the second computing terminal use the first encryption matrix and the second encryption matrix as the two secret shares of the Givens rotation matrix in the QR algorithm and implement multiplication under additive secret sharing based on a randomly generated multiplication tuple matrix.
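Multiplying two additively shared matrices with the help of a random multiplication tuple is commonly done in the style of Beaver triples, and the sketch below assumes that this is what the multiplication tuple matrix refers to. It works on small integer matrices modulo an assumed ring size; real Givens entries would first need a fixed-point encoding, which is omitted here. The dealer role is simulated inside the function, and all names are hypothetical.

```python
import random

MODULUS = 2 ** 64  # assumed ring size


def mat_add(a, b):
    return [[(x + y) % MODULUS for x, y in zip(r, s)] for r, s in zip(a, b)]


def mat_sub(a, b):
    return [[(x - y) % MODULUS for x, y in zip(r, s)] for r, s in zip(a, b)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) % MODULUS
             for j in range(len(b[0]))] for i in range(len(a))]


def random_matrix(rows, cols, rng):
    return [[rng.randrange(MODULUS) for _ in range(cols)] for _ in range(rows)]


def share_matrix(m, rng):
    """Split a matrix into two additive shares."""
    first = random_matrix(len(m), len(m[0]), rng)
    return first, mat_sub(m, first)


def shared_matmul(x_shares, y_shares, rng):
    """Return additive shares of X @ Y given additive shares of X and Y."""
    (x1, x2), (y1, y2) = x_shares, y_shares
    n, k, m = len(x1), len(y1), len(y1[0])
    # Simulated dealer: random tuple (A, B, C) with C = A @ B, handed out as shares.
    a, b = random_matrix(n, k, rng), random_matrix(k, m, rng)
    (a1, a2), (b1, b2) = share_matrix(a, rng), share_matrix(b, rng)
    c1, c2 = share_matrix(mat_mul(a, b), rng)
    # Both terminals open the masked values E = X - A and F = Y - B.
    e = mat_add(mat_sub(x1, a1), mat_sub(x2, a2))
    f = mat_add(mat_sub(y1, b1), mat_sub(y2, b2))
    # X @ Y = E @ F + E @ B + A @ F + C; only the first terminal adds E @ F.
    z1 = mat_add(mat_add(mat_add(mat_mul(e, f), mat_mul(e, b1)), mat_mul(a1, f)), c1)
    z2 = mat_add(mat_add(mat_mul(e, b2), mat_mul(a2, f)), c2)
    return z1, z2


rng = random.Random(11)
x_shares = share_matrix([[1, 2], [3, 4]], rng)
y_shares = share_matrix([[5, 6], [7, 8]], rng)
z1, z2 = shared_matmul(x_shares, y_shares, rng)
print(mat_add(z1, z2))
```

Because a Givens rotation differs from the identity only in a 2 × 2 block, sharing just that block keeps each multiplication in the QR iteration small.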
In a second aspect of the present invention, a privacy-protected distributed graph data feature decomposition system is provided, where the system includes a target graph node, a first computing terminal, and a second computing terminal, where the target graph node, the first computing terminal, and the second computing terminal are configured to execute relevant steps in the privacy-protected distributed graph data feature decomposition method provided in the first aspect of the present invention.
Compared with the prior art, the invention provides a privacy-preserving distributed graph data feature decomposition method and system. In the method, randomly sampled graph nodes holding local graph data encrypt their own degree information and send it to a first computing terminal and a second computing terminal. The two computing terminals cooperatively compute, in the ciphertext domain, first encryption degree distribution information and second encryption degree distribution information, from which each graph node can determine the target interval to which its degree belongs and select a suitable sampling sensitivity for sampling noise. According to the noise, the graph node adds false edges with weight 0 to the real graph adjacency matrix, which is represented sparsely as a set of triples. The graph node encrypts the triple set containing the false edges to obtain a first encryption set and a second encryption set, and sends them to the first computing terminal and the second computing terminal respectively. The two computing terminals then perform feature decomposition based on the first encryption set and the second encryption set. On the premise of protecting node privacy, the sparsity of the graph data is preserved and the effectiveness of the feature decomposition is guaranteed, thereby achieving privacy-preserving feature decomposition of distributed graph data.
Drawings
FIG. 1 is a flow diagram of an embodiment of a privacy-preserving distributed graph data feature decomposition method provided by the present invention;
FIG. 2 is a schematic diagram of an application scenario of an embodiment of a distributed graph data feature decomposition method for privacy protection provided in the present invention;
FIG. 3 is a density function of Laplace distributions for different sensitivities;
FIG. 4 is a schematic diagram of a secure degree histogram estimation algorithm in an embodiment of a distributed graph data feature decomposition method for privacy protection provided by the present invention;
FIG. 5 is a schematic diagram of a secure degree distribution information generation algorithm in an embodiment of a distributed graph data feature decomposition method for privacy protection provided by the present invention;
FIG. 6 is a schematic diagram of a parallel prefix addition circuit in an embodiment of a privacy preserving distributed graph data feature decomposition method provided by the present invention;
FIG. 7 is a schematic diagram of a partial graph data encryption algorithm in an embodiment of a privacy preserving distributed graph data feature decomposition method provided by the present invention;
FIG. 8 is a schematic illustration of the Arnoldi algorithm in plaintext;
FIG. 9 is a diagram of a plain-text Lanczos algorithm;
FIG. 10 is a schematic diagram of an Arnoldi algorithm of a ciphertext domain in an embodiment of a privacy-preserving distributed graph data feature decomposition method provided by the present invention;
FIG. 11 is a diagram illustrating an iterative process of a QR algorithm in the prior art;
FIG. 12 is a diagram illustrating a QR algorithm of a ciphertext domain in an embodiment of a distributed graph data feature decomposition method of privacy protection provided by the present invention;
FIG. 13 is a schematic diagram of an iterative process of a QR algorithm in an embodiment of the privacy-preserving distributed graph data feature decomposition method provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example one
This embodiment provides a privacy-preserving distributed graph data feature decomposition method, which aims to perform feature decomposition on distributed graph data in the ciphertext domain in a privacy-preserving manner. Feature decomposition of graph data operates on the adjacency matrix of the graph to produce eigenvalues and eigenvectors. As shown in FIG. 2, the method provided in this embodiment involves three types of entities: N users U_i (i ∈ [1, N]), two cloud servers (computing terminals) CS_1 and CS_2, and an analyst. Each user holds part of the graph data (for example, the address book held by each user in an address book scenario), and together these parts form a complete graph. In the graph, each user represents a graph node, and the relationships between users are represented by edges (e.g., contact entries in an address book scenario). The size of the local data held by each user indicates the degree of the corresponding node (e.g., the number of contacts of each user in the address book scenario).
The distributed graph can be represented as an adjacency matrix A of size N × N, where each row A[i, :] (i ∈ [1, N]) represents the local graph data held by user U_i. For example, in an unweighted graph, A[i, j] = 1 may indicate that there is a relationship between users U_i and U_j; in a weighted graph, A[i, j] = v may indicate that there is a relationship between U_i and U_j with intimacy v. The users allow the cloud servers to perform analysis tasks on the combination of their local graph data (i.e., the complete graph data). However, due to privacy concerns, no user U_i wants its private local graph data (the entries A[i, j] of its row) to be disclosed at any point during feature decomposition, since this would leak the user's sensitive information. A privacy-preserving feature decomposition method on distributed graphs is therefore provided.
In the method provided by this embodiment, the cloud computing service is provided by two participants (denoted $CS_1$ and $CS_2$) that come from different trust domains. In a real-world industrial scenario, these can be two competing cloud providers. $CS_1$ and $CS_2$ cooperatively perform the feature decomposition task. Throughout the process they cannot obtain any user's sensitive information, nor can they obtain the plaintext result of the feature decomposition, because that result is also encrypted. Both cloud service providers are "honest but curious" and non-colluding. That is, each cloud server, acting as a computing terminal, faithfully executes the designed security protocol, while independently trying to infer users' sensitive data from the data collection and feature decomposition processes. In this setting, the method provided by the present invention comprises the following two parts:
1. Secure distributed graph data collection: at this stage, the cloud servers collect the encrypted local graph data $A[i,:]$ ($i \in [1, N]$) of each user $U_i$ to form a complete encrypted adjacency matrix that supports the subsequent feature decomposition. While collecting the graph data, the method provided by this embodiment preserves the sparsity of the graph data well, which greatly reduces the computation and communication overhead of the subsequent feature decomposition on the encrypted adjacency matrix.
2. Secure feature decomposition: after collecting the complete encrypted adjacency matrix, the cloud servers cooperatively perform feature decomposition in the ciphertext domain. Specifically, dimensionality reduction of the matrix is first completed in the ciphertext domain, and the QR algorithm is then run in the ciphertext domain to obtain all eigenvalues and eigenvectors of the reduced small matrix.
The method provided by this embodiment adopts two cryptographic techniques: additive secret sharing and function secret sharing. Both techniques are described below.
Additive secret sharing
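An additive secret sharing (ASS) of a private value x is denoted as $[\![x]\!]$. It has two forms:
1. Arithmetic secret sharing: $[\![x]\!]^A = (\langle x\rangle_1, \langle x\rangle_2)$, where $x = \langle x\rangle_1 + \langle x\rangle_2$ (with arithmetic performed in the underlying ring), and $\langle x\rangle_1$ and $\langle x\rangle_2$ are held by the two computing participants, respectively.
2. Boolean secret sharing: $[\![b]\!]^B = (\langle b\rangle_1, \langle b\rangle_2)$, where $b = \langle b\rangle_1 \oplus \langle b\rangle_2$, and $\langle b\rangle_1$ and $\langle b\rangle_2$ are held by the two computing participants, respectively.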
With the secret sharing described above, two computation participants can perform linear and multiplicative computations securely without obtaining plaintext data.
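The two sharing forms can be illustrated with a minimal Python sketch. The ring size `2^32` and all function names are assumptions made for illustration only; the text does not fix a bit width.

```python
import secrets

K = 32            # assumed bit width of the ring (hypothetical choice)
MOD = 1 << K

def share_arith(x):
    """Split x into two arithmetic shares with <x>_1 + <x>_2 = x (mod 2^K)."""
    s2 = secrets.randbelow(MOD)      # uniformly random share
    s1 = (x - s2) % MOD              # the other share hides x behind s2
    return s1, s2

def reconstruct_arith(s1, s2):
    """Recombine two arithmetic shares."""
    return (s1 + s2) % MOD

def share_bool(b):
    """Split a bit b into two Boolean shares with <b>_1 XOR <b>_2 = b."""
    s2 = secrets.randbits(1)
    return b ^ s2, s2

def reconstruct_bool(s1, s2):
    """Recombine two Boolean shares."""
    return s1 ^ s2
```

Each share on its own is uniformly distributed, which is why a single computing participant learns nothing about x or b from the share it holds.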
1) Secure linear computation: linear computation on secret-shared values requires only local computation by the two parties. That is, if α, β, γ are plaintext constants and $[\![x]\!]^A$ and $[\![y]\!]^A$ are secret-shared values, then
$$[\![\alpha x + \beta y + \gamma]\!]^A = \left(\alpha\langle x\rangle_1 + \beta\langle y\rangle_1 + \gamma,\ \ \alpha\langle x\rangle_2 + \beta\langle y\rangle_2\right)$$
Each party can perform this computation locally using the ciphertext shares it holds.
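2) Secure multiplication computation: computing the product of two secret-shared values requires one round of communication between the two parties. That is, to compute $[\![z]\!]^A = [\![x \cdot y]\!]^A$, the two parties need to share a multiplication triple $([\![u]\!]^A, [\![v]\!]^A, [\![w]\!]^A)$ with $w = u \cdot v$ in advance. Each party $P_i$ then locally computes $\langle e\rangle_i = \langle x\rangle_i - \langle u\rangle_i$ and $\langle f\rangle_i = \langle y\rangle_i - \langle v\rangle_i$, and the parties send $\langle e\rangle_i$ and $\langle f\rangle_i$ to each other to obtain e and f in the clear. Finally, the product ciphertext share held by $P_i$, $i \in \{0, 1\}$, is
$$\langle z\rangle_i = i \times e \times f + f \times \langle u\rangle_i + e \times \langle v\rangle_i + \langle w\rangle_i$$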
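As a rough illustration of the two operations above, the following Python sketch simulates both parties in one process. The ring size `2^32`, the in-process "dealer" that produces the multiplication triple, and all function names are assumptions for illustration. The parties are indexed 0 and 1 so that exactly one of them adds the term e × f, matching the indexing of the product formula.

```python
import secrets

MOD = 1 << 32  # assumed ring size; the text does not fix one

def share(x):
    """Return the two arithmetic shares of x as a list [share_0, share_1]."""
    r = secrets.randbelow(MOD)
    return [(x - r) % MOD, r]

def reveal(sh):
    """Open a shared value by adding both shares."""
    return sum(sh) % MOD

def linear(a, b, c, x_sh, y_sh):
    """Shares of a*x + b*y + c, computed locally; only party 0 adds the public constant c."""
    return [(a * x_sh[i] + b * y_sh[i] + (c if i == 0 else 0)) % MOD for i in (0, 1)]

def multiply(x_sh, y_sh):
    """Shares of x*y using one pre-shared multiplication triple (u, v, w = u*v)."""
    # Offline phase: a hypothetical dealer hands out shares of the triple.
    u, v = secrets.randbelow(MOD), secrets.randbelow(MOD)
    u_sh, v_sh, w_sh = share(u), share(v), share(u * v % MOD)
    # Online phase: one round in which the parties exchange shares of e and f.
    e = reveal([(x_sh[i] - u_sh[i]) % MOD for i in (0, 1)])
    f = reveal([(y_sh[i] - v_sh[i]) % MOD for i in (0, 1)])
    # Local recombination, following <z>_i = i*e*f + f*<u>_i + e*<v>_i + <w>_i.
    return [(i * e * f + f * u_sh[i] + e * v_sh[i] + w_sh[i]) % MOD for i in (0, 1)]
```

Correctness of the product follows from expanding e·f + f·u + e·v + w with e = x − u, f = y − v and w = u·v, which leaves x·y.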
Linear and multiplication operations in Boolean secret sharing are similar to those in arithmetic sharing, except that XOR ($\oplus$) is used instead of addition and AND ($\wedge$) is used instead of multiplication.
Function secret sharing
Function secret sharing (FSS) is an extension of additive secret sharing that can accomplish secure function computation with less communication. FSS therefore has a large performance advantage over ordinary secret sharing in high-latency networks. In general, a two-party FSS scheme for a private function f consists of the following two abstract algorithms:
1. $(k_1, k_2) \leftarrow \mathrm{Gen}(1^\lambda, f)$: given a security parameter λ and a function description f, output two FSS keys $k_1, k_2$, one for each computing participant.
2. $\langle f(x)\rangle_i \leftarrow \mathrm{Eval}(k_i, x)$: given an FSS key $k_i$ and an evaluation point x, output an additive secret share $\langle f(x)\rangle_i$ of the evaluation result.
FSS guarantees that an attacker who learns only one of the two FSS keys obtains no information about the target function f or the computed output f(x).
As shown in FIG. 1, the privacy-preserving distributed graph data feature decomposition method provided by this embodiment includes the steps of:
S100: a target graph node in the global graph generates an initial set from its local graph data, wherein the initial set comprises a plurality of triples, and each triple comprises the node label of the target graph node, the node label of an adjacent graph node of the target graph node, and the weight of the edge connecting the target graph node and the adjacent graph node.
The target graph node can be any node in the global graph. For the feature decomposition task on distributed graph data, the local graph data of each user must first be collected to form a complete encrypted adjacency matrix A of the graph, on which the subsequent feature decomposition in the ciphertext domain is performed. Specifically, each row $A[i,:]$ of the adjacency matrix represents the local graph data of one user. At this stage each user $U_i$ shares its local graph data $A[i,:]$ with the two cloud servers $CS_1$ and $CS_2$. For efficiency, each user $U_i$ encrypts $A[i,:]$ by means of ASS. However, having each node simply apply this technique to its entire local graph data would incur high overhead, because distributed graph data is typically sparse: most elements of each row $A[i,:]$ are 0 and only a small part of the data is meaningful (for example, each user's phone book contains only a small number of phone numbers). This embodiment therefore uses a sparse representation of the matrix, the basic idea being to process and submit only the (encrypted) values of the non-zero elements and their positions. In particular, each user $U_i$ stores only the positions and weights of the non-zero elements: $\{(i, j, A[i,j])\}$. Each element in the set is a matrix triple $(i, j, A[i,j])$, where i is the node label of the target graph node and j is the node label of one of its adjacent nodes. That is, each element represents an edge between node (i.e., user) $U_i$ and node $U_j$, $A[i,j]$ is the weight of that edge, and the number of elements in the set is the degree of node $U_i$. Each user $U_i$ then encrypts the weights $A[i,j]$ using ASS. Specifically, given an edge weight $A[i,j]$, the user generates a random number r from the same ring as the weight; the two arithmetic ciphertext shares of the weight $A[i,j]$ are then $\langle A[i,j]\rangle_1 = A[i,j] - r$ and $\langle A[i,j]\rangle_2 = r$. Finally, user $U_i$ sends $\{(i, j, \langle A[i,j]\rangle_1)\}$ and $\{(i, j, \langle A[i,j]\rangle_2)\}$ to the first computing terminal $CS_1$ and the second computing terminal $CS_2$, respectively. However, because the number of non-zero elements equals the number of edges (i.e., the degree), this simple encryption scheme reveals the degree of each node to the computing terminals. Existing literature indicates that, based on this information, a computing terminal can infer various private information about user $U_i$. Moreover, if the distributed graph is unweighted (i.e., the elements of the adjacency matrix are 0 or 1), encrypting only the edges with non-zero weights is meaningless: the mere existence of an edge reveals that its weight is 1, so the computing terminals can recover the complete adjacency matrix of the graph. The challenge is therefore how to protect the degree information of each user $U_i$ while using the triple-based encoding of the sparse matrix, without affecting the effectiveness of the subsequent feature decomposition.
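The naive triple encoding described above can be sketched in Python as follows. The ring size `2^32`, the dictionary representation of a row, and the function name `share_row` are assumptions for illustration. The sketch also makes the leakage visible: the length of each message equals the node degree.

```python
import secrets

MOD = 1 << 32  # assumed ring size (not specified in the text)

def share_row(i, row):
    """Secret-share the non-zero entries of A[i, :].

    row maps a neighbour label j to the weight A[i, j] (non-zero entries only).
    Returns the triple lists destined for CS_1 and CS_2.
    """
    to_cs1, to_cs2 = [], []
    for j, weight in row.items():
        r = secrets.randbelow(MOD)                 # fresh randomness per edge
        to_cs1.append((i, j, (weight - r) % MOD))  # <A[i,j]>_1 = A[i,j] - r
        to_cs2.append((i, j, r))                   # <A[i,j]>_2 = r
    return to_cs1, to_cs2
```

Positions (i, j) travel in the clear while only the weights are shared, so a server that counts the triples learns the degree of user i.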
To address this challenge, the present embodiment provides a method that strikes a balance (trade-off) between protecting user degree information and preserving matrix sparsity. In particular, each user $U_i$ adds some false edges of weight 0 (i.e., (i, j, 0)) at random empty positions in $\{(i, j, A[i,j])\}$, and then applies ASS to encrypt the weights of the real edges and the false edges together. With ASS, even if the same value (e.g., 0) is encrypted multiple times, the resulting ciphertexts are indistinguishable from one another. The cloud servers therefore cannot distinguish real edges from false edges, and the effectiveness of the subsequent secure feature decomposition is not affected (because the weight of a false edge is 0). At the same time, the degree information of user $U_i$ is protected (because false edges have been added). The remaining challenge is to choose an appropriate number of false edges so as to balance sparsity against privacy. Specifically, too many false edges weaken the sparsity of the collected ciphertext adjacency matrix and increase the subsequent system overhead, while too few false edges give poor privacy protection.
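A minimal sketch of the padding step, assuming nodes are labelled 1..N and that the number of false edges `n_fake` has already been chosen (how to choose it is a separate question). The function name and the shuffling of the output are illustrative choices, not taken from the text. The weights would then be secret-shared exactly like real weights.

```python
import random

def pad_with_fake_edges(i, row, n_nodes, n_fake, rng=random):
    """Return triples for the real edges of node i plus n_fake zero-weight edges.

    row maps neighbour label j -> non-zero weight A[i, j].
    False edges are placed at randomly chosen positions where A[i, j] == 0.
    """
    empty = [j for j in range(1, n_nodes + 1) if j not in row]
    fake = rng.sample(empty, min(n_fake, len(empty)))   # cannot exceed the empty slots
    triples = [(i, j, w) for j, w in row.items()] + [(i, j, 0) for j in fake]
    rng.shuffle(triples)   # avoid revealing which triples are real through ordering
    return triples
```

Because a false edge has weight 0, it contributes nothing to any later matrix product, which is why the padded row yields the same decomposition as the original one.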
In one possible implementation, each user $U_i$ samples a noise value $n_i$ from a discrete Laplace distribution (Definition 2) and shares only its own noisy local data.
The Laplace distribution is one of the most popular noise distributions. It can be defined as follows:
A discrete random variable obeys the Laplace distribution Lap(ε, δ, Δ) when its probability density function satisfies the following equation.
Figure BDA0003579650210000111
Where μ is the mean of the laplace distribution.
Figure BDA0003579650210000112
Where Δ is the sensitivity of the function f:
$$\Delta = \max |f(x) - f(x')|$$
which can be used to measure how much a single entity's data can change the data output in the worst case.
According to the definition of the Laplace distribution, if no further measures are taken, each user $U_i$ samples a noise value $n_i$ from the discrete Laplace distribution (Definition 2). In this case, the sensitivity Δ of the Laplace distribution should be set to $\Delta = d_{max} - d_{min}$, where $d_{max}$ and $d_{min}$ are respectively the maximum and minimum possible node degrees in the distributed graph. The user then adds $n_i$ false edges of weight 0 (i.e., (i, j, 0)) at random positions in its local graph data. Finally, the user applies ASS to encrypt the weights of the real edges and the false edges together and sends the ciphertexts to the cloud servers. In this way, the degree of each node is protected, and so is the existence of each edge.
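The noise sampling step might look like the sketch below. It is only an illustration under stated assumptions: it uses the standard two-sided geometric form of the discrete Laplace distribution with decay factor exp(−ε/Δ), takes the mean `mu` as a caller-supplied parameter, and clamps negative draws to zero. The exact density and the formula for the mean used in Definition 2 are not reproduced here, and the function names are hypothetical.

```python
import math
import random

def sample_discrete_laplace(eps, sensitivity, mu=0, rng=random):
    """Draw an integer from a discrete Laplace distribution centred at mu.

    Assumption: Pr[n] is proportional to exp(-eps * |n - mu| / sensitivity).
    The difference of two i.i.d. geometric variables has exactly this law.
    """
    alpha = math.exp(-eps / sensitivity)          # decay factor, 0 < alpha < 1

    def geometric():
        u = 1.0 - rng.random()                    # uniform in (0, 1]
        return int(math.floor(math.log(u) / math.log(alpha)))

    return mu + geometric() - geometric()

def fake_edge_count(eps, d_max, d_min, mu, rng=random):
    """Number of false edges for one user, with sensitivity d_max - d_min."""
    n = sample_discrete_laplace(eps, d_max - d_min, mu, rng)
    return max(0, n)                              # a count cannot be negative (assumed clamp)
```

With a large sensitivity the decay factor approaches 1, so draws far from the mean become likely; this is the effect that motivates reducing the sensitivity.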
Although this scheme is effective, it leads to a large sensitivity Δ (which can in theory reach N, the number of nodes in the graph). As a result, the sampled Laplace noise $n_i$ can be very large, meaning that each user may need to add nearly N false edges, which severely harms the sparsity of the graph and the performance of the subsequent feature decomposition. FIG. 3 shows the probability density function of the discrete Laplace distribution for different values of Δ. The figure shows that a larger sensitivity Δ makes the density function flatter: the larger Δ is, the greater the probability that user $U_i$ selects a noise value of large magnitude $|n_i|$. Conversely, a small sensitivity Δ (e.g., 50 in FIG. 3) makes the probability density function more concentrated, which means that user $U_i$ will most probably select a noise value of small magnitude $|n_i|$. Therefore, if all users $U_i$ sample noise from a Laplace distribution with a large sensitivity Δ, each user will add too many false edges to its local graph data, severely affecting the sparsity of the collected graph data.
To obtain better sparsity, this embodiment uses privacy protection based on the idea of "bucketing": the nodes in the graph are divided into different "buckets" so that node degree information is protected within each bucket. Specifically, the nodes in the graph are divided into several buckets; equivalently, the range between the minimum and maximum node degree in the graph is divided into several degree intervals. Each bucket contains an approximately equal number of users, and the degrees of all nodes in a bucket lie within one sub-interval $[d_p, d_q]$. All users in the same bucket can therefore use a smaller sensitivity $\Delta = d_q - d_p$. To assign nodes to buckets securely, the method provided by this embodiment further includes the steps of:
S200: the target graph node encrypts its degree based on function secret sharing to obtain first encryption degree information and second encryption degree information, sends the first encryption degree information to a first computing terminal, and sends the second encryption degree information to a second computing terminal;
S300: the first computing terminal and the second computing terminal generate first encryption degree distribution information and second encryption degree distribution information of the global graph data according to the first encryption degree information and the second encryption degree information sent by a plurality of target graph nodes.
The first encryption degree distribution information and the second encryption degree distribution information are the two shares of an encrypted bucket map. The first computing terminal and the second computing terminal respectively send the first encryption degree distribution information and the second encryption degree distribution information to each node, and the node decrypts them to obtain the bucket map, thereby determining the bucket to which it belongs and setting the sensitivity of its sampling noise. In this embodiment, the bucket map is a bit string in which an element 1 marks the boundary of a bucket. For example, given $d_{max} = 10$, the bucket map inter = 0001000001 shows that the users are divided into two buckets (intervals): users whose degree ∈ [1, 4] and users whose degree ∈ [5, 10]. How the bucket map is computed in the ciphertext domain is described in detail below.
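Once a user has decrypted the bucket map, locating its own bucket is a simple scan. The following sketch assumes the bit string is indexed by degree starting at 1, as in the example with $d_{max} = 10$; the function names are hypothetical.

```python
def bucket_of(degree, inter):
    """Return the degree interval (d_p, d_q) of the bucket containing `degree`.

    inter is the plaintext bucket map as a string of '0'/'1' characters, where
    inter[d - 1] == '1' marks degree d as the upper boundary of a bucket.
    """
    lower = 1
    for d, bit in enumerate(inter, start=1):
        if bit == "1":
            if degree <= d:
                return lower, d
            lower = d + 1          # the next bucket starts right after this boundary
    raise ValueError("degree lies beyond the last bucket boundary")

def bucket_sensitivity(degree, inter):
    """Sensitivity d_q - d_p that a user with this degree would use for its noise."""
    d_p, d_q = bucket_of(degree, inter)
    return d_q - d_p
```

For the example map 0001000001, a user of degree 3 falls in [1, 4] and a user of degree 7 falls in [5, 10].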
In order to let the first computing terminal $CS_1$ and the second computing terminal $CS_2$ securely partition the range of all possible node degrees in the global graph into intervals, and thereby assign users to buckets, the first computing terminal and the second computing terminal first estimate an encrypted degree histogram of the nodes, i.e., the number of nodes corresponding to each possible degree value, without obtaining the plaintext degree of any node. Specifically, given a public degree $d_i$ and the degree $d_j$ of a specific user $U_j$, where $d_j$ is in ciphertext form, $CS_{\{1,2\}}$ need to detect whether $d_j = d_i$. To achieve this, the present embodiment mainly uses an FSS-based distributed point function (DPF). A DPF $f_{\alpha,\beta}(x)$ outputs β if x = α, and outputs 0 otherwise.
Similar to the general framework of FSS, an FSS-based two-party DPF scheme consists of the following two algorithms:
1. $(k_1, k_2) \leftarrow \mathrm{Gen}(1^\lambda, \alpha, \beta)$: given a security parameter λ and the values α, β, output two DPF keys $k_1, k_2$, one for each of the cloud servers $CS_1$ and $CS_2$.
2. $\langle f_{\alpha,\beta}(x)\rangle_i \leftarrow \mathrm{Eval}(k_i, x)$: given a DPF key $k_i$ and an evaluation point x, output a secret share $\langle f_{\alpha,\beta}(x)\rangle_i$ of the evaluation result.
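To make the Gen/Eval interface concrete, here is a deliberately naive Python stand-in for a DPF. It is not the compact construction used by practical FSS schemes: each key is simply an additive share of the whole truth table of $f_{\alpha,\beta}$, so key size grows linearly with the domain. The ring size `2^32`, the domain [1, domain_size], and the function names are assumptions for illustration.

```python
import secrets

MOD = 1 << 32  # assumed output ring (hypothetical choice)

def dpf_gen(alpha, beta, domain_size):
    """Toy Gen: additively share the truth table of f_{alpha,beta} over [1, domain_size]."""
    table = [beta if x == alpha else 0 for x in range(1, domain_size + 1)]
    k2 = [secrets.randbelow(MOD) for _ in table]        # uniformly random key
    k1 = [(t - r) % MOD for t, r in zip(table, k2)]     # table minus the random key
    return k1, k2

def dpf_eval(key, x):
    """Toy Eval: the share of f_{alpha,beta}(x) is the table entry at position x."""
    return key[x - 1]
```

Each key alone is a vector of uniformly random ring elements, so it reveals neither α nor β, while the two evaluation results always add up to $f_{\alpha,\beta}(x)$.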
The pseudo code of the secure degree histogram estimation algorithm in the present embodiment is shown in fig. 4.
Because directly asking all nodes to send their encrypted degrees $d_i$ would impose high system overhead on the computing terminals, this scheme uses a sampling strategy. That is, the computing terminals randomly sample S users (denoted $\{SU_j\}_{j \in [1,S]}$) from the entire user population and have only the sampled users send their encrypted degree information. Before the target graph node encrypts its degree based on function secret sharing, the method comprises the following steps:
the first computing terminal and/or the second computing terminal randomly selects part of nodes in all nodes of the global graph to send encryption requests;
and after the target graph node receives the encryption request, the target graph node encrypts the degree of the target graph node based on function secret sharing.
The target graph node encrypts the degree of the target graph node based on function secret sharing to obtain first encryption degree information and second encryption degree information, and the method comprises the following steps:
the target graph node obtains the first encryption degree information and the second encryption degree information output by a first preset algorithm in function secret sharing, wherein the input of the first preset algorithm comprises the degree of the target graph node.
The first and second computing terminals generate first and second encryption degree distribution information of global graph data according to the first and second encryption degree information sent by the plurality of target graph nodes, and the method includes:
the first computing terminal inputs the first encryption degree information of the target graph node and a target degree into a second preset algorithm in function secret sharing to obtain first encryption degree comparison information between the degree of the target graph node and the target degree, and the second computing terminal inputs the second encryption degree information of the target graph node and the target degree into the second preset algorithm to obtain second encryption degree comparison information between the degree of the target graph node and the target degree;
wherein the sum of the first encryption degree comparison information and the second encryption degree comparison information between the degree of the target graph node and the target degree is 1 when the degree of the target graph node is equal to the target degree, and is 0 otherwise;
the first computing terminal obtains first encryption histogram information, the second computing terminal obtains second encryption histogram information, the first encryption histogram information includes first encryption graph node quantity information corresponding to each target degree, each first encryption graph node quantity information is the sum of all the first encryption degree comparison information corresponding to one target degree, the second encryption histogram information includes second encryption graph node quantity corresponding to each target degree, and each second encryption graph node quantity information is the sum of all the second encryption degree comparison information corresponding to one target degree;
the first computing terminal acquires the first encryption degree comparison information between the degrees of the target graph nodes and each target degree as first encryption degree histogram information, and the second computing terminal acquires the second encryption degree comparison information between the degrees of the target graph nodes and each target degree as second encryption degree histogram information;
and the first computing terminal and the second computing terminal determine the first encryption degree distribution information and the second encryption degree distribution information according to the first encryption degree histogram information and the second encryption degree histogram information.
In particular, each sampled user $SU_j$ generates DPF keys based on its degree $d_j$ (line 1 of Algorithm 3 in FIG. 4), with the DPF parameters α and β set to $d_j$ and 1, respectively. Each user $SU_j$ then sends the key $k_{j,1}$ (i.e., the first encryption degree information) to the first computing terminal $CS_1$ and the key $k_{j,2}$ (i.e., the second encryption degree information) to the second computing terminal $CS_2$. After all sampled users have sent their DPF keys, each cloud server $CS_t$ ($t \in \{1,2\}$) evaluates $\mathrm{Eval}(k_{j,t}, i)$ for every possible degree $i \in [1, d_{max}]$ and every key in $\{k_{j,t}\}_{j \in [1,S]}$. Finally, by summing these evaluation results (line 6 of Algorithm 3), $CS_t$ obtains an encrypted share of the exact number of sampled users whose degree $d_j$ equals i, for each $i \in [1, d_{max}]$. Correctness is shown as follows:
$$\sum_{j=1}^{S} \mathrm{Eval}(k_{j,1}, i) + \sum_{j=1}^{S} \mathrm{Eval}(k_{j,2}, i) = \sum_{j=1}^{S} f_{d_j,1}(i) = \left|\{\, j \in [1,S] : d_j = i \,\}\right|$$
Through the above steps, the first computing terminal and the second computing terminal respectively hold their shares of the encrypted degree histogram estimate.
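The histogram estimation can be simulated end to end with a toy DPF whose keys are additive shares of the full truth table. This is a simplification for illustration, not the compact DPF a real deployment would use; the ring size `2^32` and all names are assumptions.

```python
import secrets

MOD = 1 << 32  # assumed ring size

def dpf_gen(alpha, beta, d_max):
    """Toy DPF keys: additive shares of the truth table of f_{alpha,beta} on [1, d_max]."""
    table = [beta if x == alpha else 0 for x in range(1, d_max + 1)]
    k2 = [secrets.randbelow(MOD) for _ in table]
    k1 = [(t - r) % MOD for t, r in zip(table, k2)]
    return k1, k2

def dpf_eval(key, x):
    """Share of f_{alpha,beta}(x) held by the owner of `key`."""
    return key[x - 1]

def histogram_share(keys, d_max):
    """Run by one server: for each degree i, sum Eval(k_j, i) over all sampled users j."""
    return [sum(dpf_eval(k, i) for k in keys) % MOD for i in range(1, d_max + 1)]

def simulate(degrees, d_max):
    """Each sampled user sets alpha = d_j, beta = 1 and sends one key to each server."""
    key_pairs = [dpf_gen(d, 1, d_max) for d in degrees]
    h1 = histogram_share([k1 for k1, _ in key_pairs], d_max)   # computed by CS_1
    h2 = histogram_share([k2 for _, k2 in key_pairs], d_max)   # computed by CS_2
    return h1, h2
```

Adding the two output vectors entry by entry recovers the plaintext histogram, although in the protocol the servers keep the shares and never open them.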
The first computing terminal and the second computing terminal then generate a bucket map in the ciphertext domain. The pseudo code of the algorithm that generates the bucket map is shown in FIG. 5. Algorithm 4 in FIG. 5 outputs the encrypted bucket map $[\![inter]\!]^B$ (a bit string encrypted with Boolean secret sharing), in which an element 1 marks the boundary of a bucket. For example, given $d_{max} = 10$, inter = 0001000001 shows that the users are divided into two buckets: users whose degree ∈ [1, 4] and users whose degree ∈ [5, 10]. After computing the encrypted bucket map, the cloud servers $CS_1$ and $CS_2$ can send $[\![inter]\!]^B$ to each user, who determines which bucket it belongs to according to its own degree. How Algorithm 4 is implemented is described in detail below.
Algorithm 4 (line 1) first lets $CS_{\{1,2\}}$ compute the plaintext bucket size sizeB (i.e., how many users each bucket should contain) and then initialize an encrypted accumulator $[\![accu]\!]^A$. $CS_{\{1,2\}}$ then add the entries of the degree histogram estimated in the previous stage to the accumulator one by one (line 4 of Algorithm 4). After each entry is added, $CS_{\{1,2\}}$ evaluate in the ciphertext domain whether $[\![accu]\!]^A \ge sizeB$ and append the comparison result (encrypted with Boolean secret sharing) to the bucket map $[\![inter]\!]^B$ (line 5 of Algorithm 4). Specifically, if accu ≥ sizeB, then inter[i] = 1, which marks a bucket boundary; otherwise inter[i] = 0. Then, based on the encrypted comparison result, $CS_{\{1,2\}}$ decide in the ciphertext domain whether to reset the encrypted accumulator $[\![accu]\!]^A$ to 0. Specifically, if inter[i] = 1, a bucket boundary has occurred, so the accumulator $[\![accu]\!]^A$ is set to 0 in preparation for accumulating the next bucket. If inter[i] = 0, no bucket boundary has occurred and the accumulator is left unchanged so that accumulation continues. These steps are shown in line 6 of Algorithm 4, where "!" denotes the NOT operation, which can be performed by having one of $CS_{\{1,2\}}$ flip its secret share $\langle inter[i]\rangle_1$ or $\langle inter[i]\rangle_2$. Finally, $CS_{\{1,2\}}$ output the encrypted bucket map $[\![inter]\!]^B$.
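The control flow of the bucket map generation can be checked against a plaintext reference. The sketch below follows the accumulate, compare and reset steps described in the text, but runs on unencrypted values; in the protocol every addition, comparison and multiplication would be carried out on secret shares. The function name and the sample histogram are hypothetical.

```python
def bucket_map_plain(hist, size_b):
    """Plaintext reference for the bucket map.

    hist[i - 1] is the number of sampled users with degree i;
    size_b is the target number of users per bucket.
    """
    inter = []
    accu = 0
    for count in hist:
        accu += count                           # add the next histogram entry
        boundary = 1 if accu >= size_b else 0   # comparison accu >= sizeB
        inter.append(boundary)
        accu *= 1 - boundary                    # accu * (!inter[i]): reset on a boundary
    return inter
```

With the hypothetical histogram [1, 1, 1, 2, 1, 1, 1, 1, 0, 1] and size_b = 5, the result is the map 0001000001 used as the running example. Writing the reset as a multiplication by the negated bit mirrors how it can be expressed on shares without branching on a secret value.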
In Algorithm 4, the addition can be performed with the protocol natively supported by additive secret sharing, but the ciphertext-domain comparison $[\![accu]\!]^A \ge sizeB$ is not natively supported. This embodiment therefore provides two methods for performing $[\![accu]\!]^A \ge sizeB$ in the ciphertext domain. The first method is based on function secret sharing (FSS). It is better suited to high-latency network scenarios because it requires the minimum number of interaction rounds between the servers (at the cost of more local computation). The second method is based on additive secret sharing (ASS); it requires less local computation but more online communication and more communication rounds between the two servers, and is therefore better suited to low-latency network scenarios.
In a first method, the first computing terminal obtains a first encryption comparison result according to the first accumulator, and generates a new one-bit value in the first encryption degree distribution information according to the first encryption comparison result, and the second computing terminal obtains a second encryption comparison result according to the second accumulator, and generates a new one-bit value in the second encryption degree distribution information according to the second encryption comparison result, including:
the third computing terminal generates a first key, a second key, a first random number and a second random number according to a third preset algorithm shared by function secrets and the number of the target nodes, sends the first key and the first random number to the first computing terminal, and sends the second key and the second random number to the second computing terminal,
the first computing terminal sends the sum of the first random number and the first accumulator to the second computing terminal, and the second computing terminal sends the sum of the second random number and the second accumulator to the first computing terminal, so that the first computing terminal and the second computing terminal each obtain a masked input value, the masked input value being the sum of the first random number, the second random number, the first accumulator and the second accumulator;
the first computing terminal inputs the masked input value and the first key into a fourth preset algorithm of function secret sharing to obtain a first encryption bit, and the second computing terminal inputs the masked input value and the second key into the fourth preset algorithm of function secret sharing to obtain a second encryption bit, wherein when the sum of the first accumulator and the second accumulator is smaller than the number of target nodes, the XOR of the first encryption bit and the second encryption bit is 1, and when the sum of the first accumulator and the second accumulator is not smaller than the number of target nodes, the XOR of the first encryption bit and the second encryption bit is 0;
the first computing terminal takes the first encryption bit as a new numerical value in the first encryption degree distribution information, the second computing terminal turns over the second encryption bit to be used as a new numerical value in the second encryption degree distribution information, or the first computing terminal turns over the first encryption bit to be used as a new numerical value in the first encryption degree distribution information, and the second computing terminal takes the second encryption bit as a new numerical value in the second encryption degree distribution information.
Specifically, the first method mainly uses a distributed comparison function (hereinafter DCF) from function secret sharing to implement the comparison in the ciphertext domain. A DCF $g_{\alpha,\beta}(x)$ outputs β if the input x < α, and outputs 0 otherwise. Similar to the general framework of FSS, an FSS-based two-party DCF scheme consists of the following two algorithms:
1. $(k_1, k_2, r_1, r_2) \leftarrow \mathrm{Gen}(1^\lambda, \alpha, \beta)$: given a security parameter λ and the values α, β, output two DCF keys $k_1, k_2$ and two random numbers $r_1, r_2$ (where $r_1 + r_2 = r^{in}$), one key and one random number for each of the two parties.
2. $\langle g_{\alpha,\beta}(x)\rangle_i \leftarrow \mathrm{Eval}(k_i, x + r^{in})$: given a DCF key $k_i$ and a masked input $x + r^{in}$, output a secret share $\langle g_{\alpha,\beta}(x)\rangle_i$ of the evaluation result.
The secure evaluation of a DCF requires only one round of online communication: the computing terminals $CS_1$ and $CS_2$ send $\langle x\rangle_i + r_i$, $i \in \{1,2\}$, to each other to reveal the masked input $x + r^{in}$ without compromising the privacy of the encrypted input x. The following describes how the operation $[\![accu]\!]^A \ge sizeB$ is performed based on the DCF.
To perform $[\![accu]\!]^A \ge sizeB$, the relevant parameters are set to α = sizeB and β = 1, with output domain $\mathbb{Z}_2$. The generated DCF keys are sent to the cloud servers $CS_1$ and $CS_2$, respectively. Note that in a practical deployment this offline key preparation may be done by a third-party server. After obtaining their DCF keys, the servers $CS_t$ ($t \in \{1,2\}$) first exchange $\langle accu\rangle_t + r_t$ to reveal $accu + r^{in}$. Each server then locally evaluates $\mathrm{Eval}(k_t, accu + r^{in})$, which outputs $\langle 1\rangle_t$ if accu < sizeB and $\langle 0\rangle_t$ otherwise. Since Algorithm 4 requires $CS_{\{1,2\}}$ to output $[\![1]\!]^B$ if accu ≥ sizeB, one of $CS_{\{1,2\}}$ locally flips its evaluation result to take the NOT of the result.
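The message flow of the DCF-based comparison can be sketched with a toy DCF over a small domain. As a simplification for illustration, each key is an XOR share of the full truth table of $g_{\alpha,1}$, indexed by the masked input; practical DCF keys are far more compact. The 8-bit domain and all names are assumptions.

```python
import secrets

N_BITS = 8            # assumed small input domain so the truth table stays tiny
N = 1 << N_BITS

def dcf_gen(alpha):
    """Toy Gen for g_{alpha,1}: XOR-shared truth table indexed by x + r_in (mod N)."""
    r1, r2 = secrets.randbelow(N), secrets.randbelow(N)
    r_in = (r1 + r2) % N
    table = [0] * N
    for x in range(N):
        table[(x + r_in) % N] = 1 if x < alpha else 0
    k2 = [secrets.randbits(1) for _ in range(N)]
    k1 = [t ^ s for t, s in zip(table, k2)]
    return k1, k2, r1, r2

def dcf_eval(key, masked_x):
    """Toy Eval: Boolean share of g_{alpha,1}(x), looked up at the masked input."""
    return key[masked_x]

def secure_ge(accu_shares, size_b):
    """Boolean shares of (accu >= size_b), given arithmetic shares of accu modulo N."""
    k1, k2, r1, r2 = dcf_gen(size_b)      # offline step, e.g. done by a third party
    m1 = (accu_shares[0] + r1) % N        # message sent by CS_1
    m2 = (accu_shares[1] + r2) % N        # message sent by CS_2
    masked = (m1 + m2) % N                # accu + r_in, known to both servers
    b1 = dcf_eval(k1, masked)
    b2 = dcf_eval(k2, masked) ^ 1         # one party flips its share: NOT of (accu < size_b)
    return b1, b2
```

Only the masked value accu + r_in is ever revealed, and since r_in is uniformly random, the masked value is independent of accu.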
In a second method, the first computing terminal obtains a first encryption comparison result according to the first accumulator, and generates a new one-bit value in the first encryption degree distribution information according to the first encryption comparison result, and the second computing terminal obtains a second encryption comparison result according to the second accumulator, and generates a new one-bit value in the second encryption degree distribution information according to the second encryption comparison result, including:
the first computing terminal acquires a first random number, the second computing terminal acquires a second random number, and the sum of the first random number and the second random number is the number of the target nodes;
the first computing terminal obtains first bit data, the second computing terminal obtains second bit data, the first bit data is bit data corresponding to the difference between the first accumulator and the first random number, and the second bit data is bit data corresponding to the difference between the second accumulator and the second random number;
the first computing terminal and the second computing terminal input the bit data they respectively hold into a parallel prefix adder circuit and perform XOR gate and AND gate calculations to obtain the most significant bit of the first bit data and the most significant bit of the second bit data, respectively, wherein when the sum of the first accumulator and the second accumulator is less than the number of target nodes, the XOR of the most significant bit of the first bit data and the most significant bit of the second bit data is 1, and when the sum of the first accumulator and the second accumulator is not less than the number of target nodes, the XOR of the most significant bit of the first bit data and the most significant bit of the second bit data is 0;
the first computing terminal takes the most significant bit of the first bit data as a new bit value in the first encryption degree distribution information, the second computing terminal turns over the most significant bit of the second bit data to take the most significant bit of the second bit data as a new bit value in the second encryption degree distribution information, or the first computing terminal turns over the most significant bit of the first bit data to take the most significant bit of the first bit data as a new bit value in the first encryption degree distribution information, and the second computing terminal takes the most significant bit of the second bit data as a new bit value in the second encryption degree distribution information.
The second approach is based on "bit decomposition" in the secret sharing domain. Specifically, the most significant bit (msb) of the two's complement representation of x indicates the sign of x (i.e., msb = 0 if x ≥ 0, otherwise msb = 1). Given two numbers A and B in two's complement representation, which may be the two secret shares of one number held by the computing terminals $CS_1$ and $CS_2$ respectively, the most significant bit of A + B can be securely computed by a custom parallel prefix addition circuit. A custom 8-bit parallel prefix addition circuit is shown in Fig. 6.
Given the ciphertext [Figure BDA0003579650210000181] held by the computing terminals $CS_1$ and $CS_2$ respectively, $CS_1$ and $CS_2$ may first locally decompose $\langle x \rangle_1$ and $\langle x \rangle_2$ into bit data: $\langle x \rangle_i = x_i[1], \ldots, x_i[k]$, $i \in \{1,2\}$. The computing terminals $CS_1$ and $CS_2$ then input the bits they hold into the customized parallel prefix addition circuit to securely execute the XOR gate ([Figure BDA0003579650210000182]) and AND gate ([Figure BDA0003579650210000183]) computations. As described above, XOR ([Figure BDA0003579650210000184]) and AND ([Figure BDA0003579650210000185]) are natively supported in Boolean secret sharing. $CS_1$ and $CS_2$ can therefore securely compute the most significant bit of a ciphertext to obtain the magnitude relation between the private input [Figure BDA0003579650210000186] and 0. Based on the parallel prefix addition circuit, $CS_1$ and $CS_2$ can compute [Figure BDA0003579650210000187], that is, output [Figure BDA0003579650210000188] if $accu \geq sizeB$, and [Figure BDA0003579650210000189] otherwise. Since Algorithm 4 requires $CS_{1,2}$ to output [Figure BDA00035796502100001810] if $accu \geq sizeB$, one of $CS_{1,2}$ needs to flip its evaluation result locally to obtain the "NOT" of the evaluation result.
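The sign-bit idea behind this second method can be checked with a small plaintext simulation. The sketch below assumes a 16-bit ring and, for brevity, uses a ripple-carry adder rather than the parallel prefix circuit of Fig. 6; both compute the same most significant bit using only XOR and AND gates. In the actual protocol each gate would be evaluated on Boolean secret shares, which this sketch does not do. All names (`K`, `share`, `bits`, `msb_of_sum`, `accu_ge_size_b`) are hypothetical.

```python
import secrets

K = 16  # bit width of the two's complement ring Z_{2^K}; an assumption


def share(value):
    """Split value into two additive shares modulo 2^K."""
    s1 = secrets.randbelow(1 << K)
    s2 = (value - s1) % (1 << K)
    return s1, s2


def bits(x):
    """Little-endian bit decomposition of a K-bit value."""
    return [(x >> j) & 1 for j in range(K)]


def msb_of_sum(a_bits, b_bits):
    """Most significant bit of (a + b) mod 2^K using only XOR and AND.

    Ripple-carry form: carry_out = (a AND b) XOR (carry_in AND (a XOR b)).
    """
    carry = 0
    for j in range(K - 1):
        propagate = a_bits[j] ^ b_bits[j]
        generate = a_bits[j] & b_bits[j]
        carry = generate ^ (carry & propagate)
    return a_bits[K - 1] ^ b_bits[K - 1] ^ carry


def accu_ge_size_b(accu, size_b):
    """Return 1 if accu >= size_b, computed from shares of accu - size_b."""
    x1, x2 = share((accu - size_b) % (1 << K))
    msb = msb_of_sum(bits(x1), bits(x2))  # 1 exactly when accu - size_b < 0
    return msb ^ 1                        # the local flip described in the text
```

The result is correct as long as the difference `accu - size_b` fits in the signed range of the ring, here between -2^15 and 2^15 - 1.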
As can be seen from the foregoing description, each bit in the bucket map is either 0 or 1: when it is 1 the accumulator needs to be cleared, and when it is 0 the accumulator is kept. To clear or keep the accumulator in the ciphertext domain, that is, to update the first accumulator and the second accumulator without the first computing terminal or the second computing terminal being able to infer whether the sum of the two accumulators was cleared, this embodiment proceeds as follows. After obtaining the first encryption degree distribution information and the second encryption degree distribution information, one of the two computing terminals flips the latest bit of its locally held encryption degree distribution information; for example, the first computing terminal flips the latest bit value in the first encryption degree distribution information to obtain a flipped bit. If the latest bit in the bucket map is 1, the XOR of the flipped bit and the latest bit in the second encryption degree distribution information (the first value) is 0; if the latest bit in the bucket map is 0, this XOR is 1. The first computing terminal and the second computing terminal then compute, in the ciphertext domain, the product of the first value and the second value, where the second value is the sum of the first accumulator and the second accumulator. The sum of the two accumulators is thereby cleared or kept according to the latest bit of the bucket map while remaining in ciphertext.
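The clear-or-keep update described above has a simple plaintext analogue, sketched below. It runs in the clear and only mirrors the arithmetic: the accumulator is multiplied by the negation of the newly produced bucket-map bit instead of being reset through a data-dependent branch. The function name `build_bucket_map` and the example histogram are hypothetical.

```python
def build_bucket_map(histogram, size_b):
    """Plaintext analogue of the bucket-map generation.

    histogram[d] is the number of sampled nodes whose degree is the d-th
    target degree (in increasing order); size_b is the target number of
    nodes per interval.
    """
    bucket_map = []
    accu = 0
    for count in histogram:
        accu += count
        bit = 1 if accu >= size_b else 0   # comparison result for this degree
        bucket_map.append(bit)
        # Branch-free reset: accu * NOT(bit) is 0 when bit == 1, accu otherwise.
        accu = accu * (bit ^ 1)
    return bucket_map


# A 1 marks the upper boundary of an interval.
example = build_bucket_map([3, 1, 0, 2, 4, 1], size_b=4)
```

In the ciphertext domain the comparison and the final multiplication are the two steps that require interaction between the computing terminals; the addition is local.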
Referring to fig. 1 again, the method for decomposing the data characteristics of the distributed graph with privacy protection provided in the embodiment further includes the following steps:
S400, the target graph node determines the target interval to which its degree belongs according to the received first encryption degree distribution information and second encryption degree distribution information, determines a target sampling sensitivity according to boundary information of the target interval, samples noise from a Laplace distribution according to the target sampling sensitivity, and adds false triples to the initial set according to the noise to generate a target set, wherein the weight value of each false triple is 0;
S500, the target graph node encrypts the target set based on additive secret sharing to obtain a first encryption set and a second encryption set, sends the first encryption set to the first computing terminal, and sends the second encryption set to the second computing terminal.
The sum of the first encryption degree distribution information and the second encryption degree distribution information, which are held locally by the first computing terminal and the second computing terminal respectively, is the bucket map. The first computing terminal and the second computing terminal send their locally held encryption degree distribution information to the nodes in the graph; each node decrypts it to obtain the plaintext bucket map and then encrypts the local graph data it holds based on the bucket map. Specifically, as shown in Fig. 7, after the encrypted bucket map is decrypted at the user end, each user $U_i$ encrypts its local graph data with Algorithm 5. The main point to note here is that the noise $n_i$ sampled from the Laplace distribution may be negative, which would mean that user $U_i$ needs to delete some edges. This would obviously seriously affect the accuracy of the subsequent feature decomposition. To solve this problem, in this embodiment each user $U_i$ truncates $n_i$ (i.e., line 4 of Algorithm 5). Then [Figure BDA0003579650210000191] is used to denote the set of real and false edges, and [Figure BDA0003579650210000192] denotes the local graph data of user $U_i$ after the dummy edges with weight 0 are added. Finally, user $U_i$ applies ASS to the weight $A[i,j]$ of each (real or false) edge to obtain the final local graph data ciphertext [Figure BDA0003579650210000193], and sends the ciphertext shares [Figure BDA0003579650210000194] and [Figure BDA0003579650210000195] to $CS_1$ and $CS_2$ respectively.
Compared with having each user directly encrypt every element of its local graph data $A[i,:]$, the scheme for encrypting local graph data provided by this embodiment can save 90% of ciphertext storage space, and can save 80% of online communication and 50% of computation time in the subsequent feature decomposition.
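The user-side encryption step can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from Algorithm 5 itself: the truncation is taken to be clamping negative noise to zero, the noise scale is `sensitivity / epsilon`, dummy neighbours are drawn uniformly from non-neighbours, edge weights are non-negative integers, node identifiers stay in the clear, and shares live in a 32-bit ring. All names are hypothetical.

```python
import random
import secrets

RING = 1 << 32  # modulus for additive secret sharing; an assumption


def sample_laplace(scale, rng=random):
    """Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def encrypt_local_graph(i, neighbors, num_nodes, sensitivity, epsilon, rng=random):
    """Return the two share sets for node i.

    neighbors maps a neighbour id j to the integer weight A[i, j].
    """
    noise = sample_laplace(sensitivity / epsilon, rng)
    n_dummy = max(0, round(noise))  # assumed truncation: never delete real edges
    candidates = [j for j in range(num_nodes) if j != i and j not in neighbors]
    n_dummy = min(n_dummy, len(candidates))
    dummy = rng.sample(candidates, n_dummy)
    # Real triples keep their weight; dummy triples carry weight 0.
    triples = [(i, j, w) for j, w in neighbors.items()]
    triples += [(i, j, 0) for j in dummy]
    share1, share2 = [], []
    for (u, v, w) in triples:
        s1 = secrets.randbelow(RING)
        s2 = (w - s1) % RING        # s1 + s2 = w (mod RING)
        share1.append((u, v, s1))
        share2.append((u, v, s2))
    return share1, share2
```

Because dummy edges have weight 0, adding the two share sets back together reproduces the real adjacency row exactly, so the padding hides the true degree without changing the matrix being decomposed.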
S600, the first computing terminal and the second computing terminal perform feature decomposition on the global graph data according to the first encryption set and the second encryption set corresponding to each node in the global graph.
The first computing terminal receives the first encryption set sent by each node in the global graph, the second computing terminal receives the second encryption set sent by each node in the global graph, and the two perform feature decomposition on the global graph data in the ciphertext domain. The graph data feature decomposition process in plaintext is explained first:
Given an $N \times N$ adjacency matrix, the complexity of performing a complete eigendecomposition is $O(N^3)$. This is unnecessary because most graph analysis applications only require the top-k eigenvalues and eigenvectors ($k$ is much smaller than $N$). Thus, in a practical application scenario, given a large-scale adjacency matrix $A$, to compute its top-k eigenvalues and eigenvectors, the first step is to reduce its dimension from $N \times N$ to $M \times M$ ($M$ is usually slightly larger than $k$), resulting in a new small matrix [Figure BDA0003579650210000201] for further processing. The most popular dimension reduction algorithms are the Arnoldi algorithm (Algorithm 1, pseudocode shown in Fig. 8) and the Lanczos algorithm (Algorithm 2, pseudocode shown in Fig. 9), which act on asymmetric and symmetric matrices, respectively. After dimension reduction, all eigenvalues and eigenvectors of the new small matrix [Figure BDA0003579650210000202] are computed, typically using the QR algorithm. Finally, the top-k eigenvalues of the small matrix [Figure BDA0003579650210000203] represent the top-k eigenvalues of the original matrix $A$, and the eigenvectors [Figure BDA0003579650210000205] corresponding to the top-k eigenvalues of [Figure BDA0003579650210000204] (each column vector of the matrix is an eigenvector of [Figure BDA0003579650210000206]) can be converted by the formula [Figure BDA0003579650210000207] into the eigenvectors $V$ corresponding to the top-k eigenvalues of the original matrix $A$, where the matrix $P$ is output by line 11 of Algorithm 1 and Algorithm 2. The plaintext QR algorithm is described below.
The QR algorithm proceeds iteratively. Formally, given a target matrix $L$, let $T_0 = L$. In the $k$-th iteration ($k \in [1, K]$), the result $T_{k-1}$ of the previous iteration is taken as input, one QR decomposition $T_{k-1} = Q_{k-1} R_{k-1}$ is computed, where $Q_{k-1}$ is an orthogonal matrix and $R_{k-1}$ is an upper triangular matrix, and $T_k = R_{k-1} Q_{k-1}$ is output. When the QR algorithm finishes, the diagonal elements of the output matrix $T_K$ of the last iteration are the eigenvalues of the target matrix $L$, and the columns of the matrix $S = Q_1 \cdots Q_K$ are the eigenvectors of $L$ (one per column). One QR decomposition may be done using Givens rotations. Formally, given an $M \times M$ upper Hessenberg matrix $T_{k-1}$, orthogonal Givens rotation matrices $G_i$, $i \in [1, M-1]$, may be created as [Figure BDA0003579650210000208], where [Figure BDA0003579650210000209] and [Figure BDA00035796502100002010], with $H^{(1)} = T_{k-1}$. At the end of this QR decomposition, [Figure BDA00035796502100002011]; in addition, $Q_{k-1} = G_1 \cdots G_{M-1}$. Fig. 11 shows the process of performing one QR decomposition on a 4 × 4 upper Hessenberg matrix using a series of Givens rotation matrices.
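The plaintext QR iteration can be sketched in a few lines of pure Python. The rotation convention used here, the 2 × 2 block [[c, s], [-s, c]] with c and s computed from the diagonal and subdiagonal entries, is an assumption, since the patent's formula for $G_i$ is only available as a figure; under this convention the orthogonal factor is accumulated as the product of the transposed rotations. The function names and the test matrix are hypothetical.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def givens_qr(h):
    """QR-decompose an upper Hessenberg matrix with M-1 Givens rotations."""
    m = len(h)
    r = [row[:] for row in h]
    q = [[float(i == j) for j in range(m)] for i in range(m)]
    for i in range(m - 1):
        a, b = r[i][i], r[i + 1][i]
        norm = math.hypot(a, b)
        if norm == 0.0:
            continue
        c, s = a / norm, b / norm
        # The rotation only touches rows i and i+1 of R ...
        for j in range(m):
            top, bot = r[i][j], r[i + 1][j]
            r[i][j] = c * top + s * bot
            r[i + 1][j] = -s * top + c * bot
        # ... and columns i and i+1 of the accumulated Q.
        for k in range(m):
            left, right = q[k][i], q[k][i + 1]
            q[k][i] = c * left + s * right
            q[k][i + 1] = -s * left + c * right
    return q, r


def qr_algorithm(l, iterations=200):
    """Unshifted QR iteration: T_k = R_{k-1} Q_{k-1}, S = Q_1 ... Q_K."""
    m = len(l)
    t = [row[:] for row in l]
    s = [[float(i == j) for j in range(m)] for i in range(m)]
    for _ in range(iterations):
        q, r = givens_qr(t)
        t = matmul(r, q)
        s = matmul(s, q)
    return [t[i][i] for i in range(m)], s


# Symmetric tridiagonal example with eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2).
eigenvalues, eigenvectors = qr_algorithm(
    [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]])
```

The unshifted iteration is chosen here for clarity; it converges for this example because the eigenvalues have distinct magnitudes.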
The first computing terminal and the second computing terminal perform feature decomposition on the global graph data according to the first encryption set and the second encryption set corresponding to each node in the global graph, and the feature decomposition includes:
the first computing terminal obtains a first encryption adjacent matrix according to the first encryption set corresponding to each node in the global graph, and the second computing terminal obtains a second encryption adjacent matrix according to the second encryption set corresponding to each node in the global graph;
the first computing terminal and the second computing terminal perform dimension reduction on the sum of the first encryption adjacent matrix and the second encryption adjacent matrix based on additive secret sharing to obtain a dimension reduction matrix;
the first computing terminal and the second computing terminal execute a QR algorithm on the dimension reduction matrix based on additive secret sharing to obtain an encrypted eigenvalue and an encrypted eigenvector of the global graph data;
for square root operation in the dimension reduction process, the first computing terminal and the second computing terminal iteratively calculate through a second computing formula based on additive secret sharing to obtain the reciprocal of the square root;
the second calculation formula is:
Figure BDA0003579650210000211
wherein $y'_n$ denotes the approximation of the reciprocal of the square root obtained in the $n$-th iteration, and $x'$ denotes the number whose square root is to be taken.
The procedure by which the first computing terminal and the second computing terminal perform the dimension reduction in the ciphertext domain is described below, taking the Arnoldi algorithm as an example. The operations in lines 1-7 of Fig. 8 consist only of additions and multiplications, which are natively supported in the additive secret sharing domain. However, performing the operations of lines 8 and 9 in the ciphertext domain is challenging because they require a square root operation (the $L_2$ norm requires a square root) and a division operation, respectively.
In this embodiment, approximate computation is used to decompose the square root operation and the division operation into a series of addition and multiplication operations supported in the ciphertext domain. In particular, to compute the square root [Figure BDA0003579650210000212] in the ciphertext domain, the reciprocal of the square root [Figure BDA0003579650210000213] is first approximated, namely [Figure BDA0003579650210000214], where $y'_n$ denotes the approximation of the reciprocal of the square root in the $n$-th iteration and $x'$ denotes the number whose square root is to be taken; the iteration converges to [Figure BDA0003579650210000215]. Clearly, both subtraction and multiplication are natively supported in the secret sharing domain. In addition, to obtain faster convergence, the initial value $y'_0 = 3e^{0.5 - x'} + 0.003$ may be used. [Figure BDA0003579650210000216] can then be computed to obtain [Figure BDA0003579650210000217]. For division [Figure BDA0003579650210000218] in the ciphertext domain, the main challenge is to compute the reciprocal [Figure BDA0003579650210000219]. However, the reciprocal in the division operation of Algorithm 1 (line 9) is [Figure BDA00035796502100002110], which is the reciprocal of the result (a square root) computed in line 8. Therefore, the computed reciprocal of the square root [Figure BDA0003579650210000221] can be used directly as [Figure BDA0003579650210000222], and simply multiplying it by $p_k$ completes line 9 in the ciphertext domain. At this point, Algorithm 1 can be performed entirely in the ciphertext domain. The Arnoldi algorithm in the ciphertext domain is shown in Fig. 10. For other dimension reduction algorithms, such as the Lanczos method, the operations that need to be performed securely are the same as in the secure Arnoldi method, so the description of the secure Lanczos method is omitted here.
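The idea of replacing the square root and the division by an iteration made of subtractions and multiplications can be illustrated in plaintext. The patent's own iteration formula is only available as a figure, so the sketch below assumes the standard Newton-Raphson update for the reciprocal square root, $y_{n+1} = \tfrac{1}{2} y_n (3 - x y_n^2)$, which may differ from the formula in the figure. The iteration count and the example input x = 4 are also assumptions; convergence of this update depends on the initial value being small enough relative to x.

```python
import math


def inv_sqrt(x, y0, iterations=30):
    """Approximate 1/sqrt(x) with an assumed Newton-Raphson update.

    Each step uses only multiplication, subtraction and scaling by the
    public constant 0.5, all of which are cheap on additive secret shares.
    """
    y = y0
    for _ in range(iterations):
        y = 0.5 * y * (3.0 - x * y * y)
    return y


x = 4.0
y0 = 3.0 * math.exp(0.5 - x) + 0.003   # initial value stated in the text
approx = inv_sqrt(x, y0)               # approximates 1/sqrt(4) = 0.5
root = x * approx                      # sqrt(x) = x * (1/sqrt(x))
```

Once the reciprocal square root of the squared norm is available, normalising a vector needs no separate division: each component is simply multiplied by `approx`.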
The QR algorithm consists mainly of matrix multiplications. For the multiplication of two $M \times M$ matrices, the direct method in the secret sharing domain is element-by-element multiplication, which requires $M^3$ multiplications and the communication of $2M^3$ elements between the two servers. In this embodiment, vectorized multiplication tuples are used to implement more efficient matrix multiplication, thereby saving communication and computation. That is, the multiplication tuple needed for multiplying two matrices is vectorized as $Z = XY$, where $X$ and $Y$ mask the two input matrices during the secure multiplication. Specifically, for two ciphertext matrices [Figure BDA0003579650210000223] and [Figure BDA0003579650210000224] of size $N \times N$, if one wants to compute their ciphertext matrix product [Figure BDA0003579650210000225], existing secret sharing protocols configure one multiplication tuple for each element-wise multiplication in the matrix product rather than for the matrices as a whole (i.e., each multiplication tuple is a triple of elements [Figure BDA0003579650210000226]). $3N^3$ elements therefore need to be prepared in advance (multiplying two $N \times N$ matrices requires $N^3$ multiplications, each of which requires one multiplication tuple), which is inefficient and unnecessary. With the multiplication tuple vectorization adopted in this embodiment, independent element-wise multiplication tuples are not generated; instead, a matrix multiplication tuple [Figure BDA0003579650210000227] is generated at random, after which ciphertext matrix multiplication can be performed directly. The two cloud servers $P_i$, $i \in \{0,1\}$, use their own secret shares to compute $\langle E \rangle_i = \langle A \rangle_i - \langle U \rangle_i$ and $\langle F \rangle_i = \langle B \rangle_i - \langle V \rangle_i$. Each party $P_i$ then sends $\langle E \rangle_i$ and $\langle F \rangle_i$ to the other party, so that both obtain the plaintext $E$ and $F$. Finally, the product ciphertext held by $P_i$, $i \in \{0,1\}$, is $\langle C \rangle_i = i \times E \times F + \langle U \rangle_i \times F + E \times \langle V \rangle_i + \langle Z \rangle_i$, where $\langle Z \rangle_i$ is the party's share of the product in the multiplication tuple. In this process the two parties only need to communicate $2N^2$ elements online, namely the two matrices $\langle E \rangle_i$ and $\langle F \rangle_i$ that each party sends to the other. At the same time, the multiplication tuple that needs to be prepared in advance becomes [Figure BDA0003579650210000228], i.e., $3N^2$ elements.
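The vectorized multiplication tuple can be exercised with a small single-process simulation. The sketch below assumes a 32-bit ring and plays both parties and the tuple dealer in one function; it follows the standard matrix Beaver-triple identity $AB = EF + EV + UF + UV$ with $E = A - U$ and $F = B - V$. The helper names (`rand_matrix`, `share`, `beaver_matmul`, and so on) are hypothetical.

```python
import secrets

MOD = 1 << 32  # ring modulus for additive sharing; an assumption


def rand_matrix(n):
    return [[secrets.randbelow(MOD) for _ in range(n)] for _ in range(n)]


def add(a, b):
    return [[(x + y) % MOD for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[(x - y) % MOD for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % MOD
             for j in range(n)] for i in range(n)]


def share(a):
    """Split a square matrix into two additive shares [share_0, share_1]."""
    s0 = rand_matrix(len(a))
    return [s0, sub(a, s0)]


def beaver_matmul(a_sh, b_sh):
    """Return shares of A x B given shares of A and B."""
    n = len(a_sh[0])
    # Offline: one matrix tuple (U, V, Z = U x V), i.e. 3 * n^2 elements.
    u, v = rand_matrix(n), rand_matrix(n)
    u_sh, v_sh, z_sh = share(u), share(v), share(mul(u, v))
    # Online: each party masks locally and sends 2 * n^2 elements.
    e_sh = [sub(a_sh[i], u_sh[i]) for i in (0, 1)]
    f_sh = [sub(b_sh[i], v_sh[i]) for i in (0, 1)]
    e, f = add(e_sh[0], e_sh[1]), add(f_sh[0], f_sh[1])
    c_sh = []
    for i in (0, 1):
        c = add(add(mul(u_sh[i], f), mul(e, v_sh[i])), z_sh[i])
        if i == 1:                      # the i * E * F term, added by one party
            c = add(c, mul(e, f))
        c_sh.append(c)
    return c_sh
```

Note the operand order: for matrices the cross terms are $\langle U \rangle_i \times F$ and $E \times \langle V \rangle_i$, since matrix multiplication does not commute.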
The pseudocode of the secure QR algorithm in this embodiment is shown in Fig. 12. Its input is [Figure BDA0003579650210000229] and its outputs are [Figure BDA00035796502100002210] and [Figure BDA00035796502100002211]. Recalling the plaintext feature decomposition described above, the top-k (the largest k) diagonal elements of [Figure BDA00035796502100002212] are the top-k eigenvalues of the original matrix [Figure BDA00035796502100002213], while [Figure BDA00035796502100002214] holds the eigenvectors of the small matrix [Figure BDA00035796502100002215], which can be converted by the formula [Figure BDA00035796502100002216] into the eigenvectors of the original matrix [Figure BDA00035796502100002217], where [Figure BDA00035796502100002218] is the output matrix of the secure dimension reduction algorithm.
Further, the inventors have also found that in each Givens rotation [Figure BDA00035796502100002219] or $H^{(i)} G_i$, only row $i$ and row $i+1$ of $H^{(i)}$ are updated. Thus, to save overhead, the Givens rotation matrix $G_i$ may be simplified from formula (1) to [Figure BDA00035796502100002220]. Fig. 13 shows one QR decomposition completed with the optimized Givens rotation matrices for the 4 × 4 case. It can be seen that a significant number of multiplications are saved compared to Fig. 11. Similarly, when computing [Figure BDA0003579650210000231] (i.e., Algorithm 7, line 15), $G_i$ may also be simplified to $g_i$. After simplification, the Givens rotation matrix is multiplied in a sliding manner: as shown in Fig. 13, $g_1$ is first multiplied by the four elements of the upper-left 2 × 2 block of the $H$ matrix, $H[1:2, 1:2]$, and then by the next blocks, i.e., $H[1:2, 2:3]$ and $H[1:2, 3:4]$. The Givens rotation matrix is then updated to obtain $g_2$, which proceeds to the next rows, i.e., $H[2:3, 1:2]$, $H[2:3, 2:3]$, $H[2:3, 3:4]$, and so on (see the shaded portion in Fig. 13).
It should be noted that the 4 × 4H matrix is an example, and in practical applications, H may be of any dimension, and may be updated by multiplying by a 2 × 2 Givens matrix in the manner described above.
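That the simplified 2 × 2 rotation gives the same result as the full $M \times M$ rotation matrix can be checked directly. The sketch below assumes the block convention [[c, s], [-s, c]] (the patent's formula (1) is only available as a figure), uses 0-based row indices, and applies the block across all columns of the two affected rows rather than in the sliding 2 × 2 order of Fig. 13. All names and the example matrix are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def full_givens(m, i, c, s):
    """M x M identity with the 2 x 2 rotation placed at rows/columns i, i+1."""
    g = [[float(r == col) for col in range(m)] for r in range(m)]
    g[i][i], g[i][i + 1] = c, s
    g[i + 1][i], g[i + 1][i + 1] = -s, c
    return g


def apply_reduced(h, i, c, s):
    """Apply only the block g_i = [[c, s], [-s, c]] to rows i and i+1 of H."""
    out = [row[:] for row in h]
    for j in range(len(h[0])):
        out[i][j] = c * h[i][j] + s * h[i + 1][j]
        out[i + 1][j] = -s * h[i][j] + c * h[i + 1][j]
    return out


h = [[4.0, 1.0, 2.0, 3.0],
     [3.0, 5.0, 1.0, 2.0],
     [0.0, 2.0, 6.0, 1.0],
     [0.0, 0.0, 1.0, 7.0]]
norm = math.hypot(h[0][0], h[1][0])
c, s = h[0][0] / norm, h[1][0] / norm
full = matmul(full_givens(4, 0, c, s), h)   # M^3 scalar multiplications
reduced = apply_reduced(h, 0, c, s)         # 4 * M scalar multiplications
```

For an $M \times M$ matrix the full product costs $M^3$ scalar multiplications per rotation while the reduced form costs $4M$, which is where the saving visible between Fig. 11 and Fig. 13 comes from.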
After the above simplification, it can be seen that the simplified Givens rotation matrix [Figure BDA0003579650210000232] needs to be multiplied multiple times with two rows of elements in the large matrix $H$, i.e., it is reused across multiple multiplications. Coupled multiplication tuples can therefore be used to save communication between the two computing terminals. For example, assume that a ciphertext matrix [Figure BDA0003579650210000233] needs to be multiplied by the ciphertext matrices [Figure BDA0003579650210000234]; only one random matrix [Figure BDA0003579650210000235] needs to be used to mask [Figure BDA0003579650210000236], instead of $k$ different random matrices. Therefore, when multiple multiplications use the same [Figure BDA0003579650210000237] as a multiplier, only one random matrix is needed to mask it in the present invention. Finally, the transpose [Figure BDA0003579650210000239] of [Figure BDA0003579650210000238] can be obtained by having the computing terminals $CS_1$ and $CS_2$ locally transpose their own secret shares $\langle g_i \rangle_1$ or $\langle g_i \rangle_2$, which further saves communication overhead.
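The saving from coupled multiplication tuples can be seen in a scalar simulation, where a single shared value stands in for the 2 × 2 matrix $g_i$ and several other shared values stand in for the blocks of $H$ it is multiplied with. This is a sketch under those simplifying assumptions, run in one process over an assumed 32-bit ring; the names `share` and `coupled_products` are hypothetical.

```python
import secrets

MOD = 1 << 32  # ring modulus; an assumption


def share(x):
    """Split x into two additive shares modulo MOD."""
    s0 = secrets.randbelow(MOD)
    return [s0, (x - s0) % MOD]


def coupled_products(g_sh, b_shs):
    """Multiply one shared value g with every shared value in b_shs.

    The mask u for g is generated and opened once and then reused, so only
    len(b_shs) + 1 masked values are opened instead of 2 * len(b_shs).
    """
    u = secrets.randbelow(MOD)
    u_sh = share(u)
    e = (g_sh[0] - u_sh[0] + g_sh[1] - u_sh[1]) % MOD   # g - u, opened once
    opened = 1
    results = []
    for b_sh in b_shs:
        v = secrets.randbelow(MOD)
        v_sh, z_sh = share(v), share(u * v % MOD)        # coupled tuple (u, v, u*v)
        f = (b_sh[0] - v_sh[0] + b_sh[1] - v_sh[1]) % MOD
        opened += 1
        c_sh = [(u_sh[i] * f + e * v_sh[i] + z_sh[i] + (e * f if i == 1 else 0)) % MOD
                for i in (0, 1)]
        results.append(c_sh)
    return results, opened


products, opened = coupled_products(share(7), [share(3), share(5), share(11)])
```

Reusing the mask is safe here because the opened value `g - u` is the same in every multiplication, so opening it once reveals nothing more than opening it once.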
With the above optimizations, the performance of the secure QR algorithm provided in this embodiment is greatly improved. In particular, the basic secure QR algorithm requires the computing terminals $CS_1$ and $CS_2$ to communicate $6K(M-1)M^2$ elements online (ignoring the communication needed to approximate the square root, which is not optimized; $K$ and $M$ correspond to $K$ and $M$ in Algorithm 7), whereas the optimized secure QR algorithm only needs to communicate $K(M-1)(6M+4)$ elements online. According to experiments, the optimized secure QR algorithm saves up to 97% of online communication and 9.3% of computation time compared with the basic secure QR algorithm.
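The two communication counts above can be compared numerically. The sketch below simply evaluates the stated formulas; the function names and the sample values of K and M are hypothetical and are not the parameters of the reported experiments.

```python
def basic_cost(k, m):
    """Online elements for the basic secure QR algorithm: 6K(M-1)M^2."""
    return 6 * k * (m - 1) * m ** 2


def optimized_cost(k, m):
    """Online elements for the optimized secure QR algorithm: K(M-1)(6M+4)."""
    return k * (m - 1) * (6 * m + 4)


def saving(k, m):
    """Fraction of online communication saved; requires m >= 2."""
    return 1.0 - optimized_cost(k, m) / basic_cost(k, m)


small = saving(10, 4)    # 1 - 84/288, roughly 0.71
larger = saving(10, 34)  # roughly 0.97
```

The ratio simplifies to $1 - (6M+4)/(6M^2)$, so the saving is independent of the iteration count K and grows with the reduced dimension M.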
In summary, this embodiment provides a privacy-preserving distributed graph data feature decomposition method. A graph node holding local graph data encrypts its own degree information and sends it to a first computing terminal and a second computing terminal. The two computing terminals cooperate in the ciphertext domain to generate first encryption degree distribution information and second encryption degree distribution information, so that the graph node can determine the target interval to which its degree belongs and select a suitable sampling sensitivity for sampling noise. False edges with weight 0 are added to the real graph adjacency matrix, the matrix is represented sparsely in the form of triples, and encrypted feature decomposition is performed on the adjacency matrix with the false edges added. In this way, the sparsity of the graph data is preserved and the validity of the feature decomposition is ensured while node privacy is protected, realizing privacy-preserving distributed graph data feature decomposition.
It should be understood that, although the steps in the flowcharts shown in the figures of this specification are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not restricted to the exact order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Example two
Based on the above embodiment, the present invention also provides a distributed graph data feature decomposition system for privacy protection, where the system includes a target graph node, a first computing terminal, and a second computing terminal; the target graph node, the first computing terminal and the second computing terminal are used for cooperatively executing relevant steps in the privacy-protected distributed graph data feature decomposition method in the first embodiment.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A privacy preserving distributed graph data feature decomposition method, the method comprising:
generating an initial set by a target graph node in a global graph according to local graph data, wherein the initial set comprises a plurality of groups of triples, and each group of triples comprises a node mark of the target graph node, a node mark of an adjacent graph node of the target graph node and the weight of a connecting edge of the target graph node and the adjacent graph node;
the target graph node encrypts the degree of the target graph node based on function secret sharing to obtain first encryption degree information and second encryption degree information, the first encryption degree information is sent to a first computing terminal, and the second encryption degree information is sent to a second computing terminal;
the first computing terminal and the second computing terminal generate first encryption degree distribution information and second encryption degree distribution information of global graph data according to the first encryption degree information and the second encryption degree information sent by the target graph nodes;
the target graph node determines a target interval to which the degree of the target graph node belongs according to the received first encryption degree distribution information and the second encryption degree distribution information, determines a target sampling sensitivity according to boundary information of the target interval, samples noise from a Laplace distribution according to the target sampling sensitivity, and adds false triples to the initial set according to the noise to generate a target set, wherein the weight value of each false triple is 0;
the target graph node encrypts the target set based on additive secret sharing to obtain a first encryption set and a second encryption set, sends the first encryption set to a first computing terminal, and sends the second encryption set to a second computing terminal;
and the first computing terminal and the second computing terminal perform feature decomposition on the global graph data according to the first encryption set and the second encryption set corresponding to each node in the global graph.
2. The privacy-preserving distributed graph data feature decomposition method according to claim 1, wherein before the target graph node encrypts the degree of the target graph node based on function secret sharing, the method comprises:
the first computing terminal and/or the second computing terminal randomly selects part of nodes in all nodes of the global graph to send encryption requests;
and after the target graph node receives the encryption request, the target graph node encrypts the degree of the target graph node based on function secret sharing.
3. The privacy-preserving distributed graph data feature decomposition method according to claim 1, wherein the encrypting, by the target graph node, the degree of the target graph node based on function secret sharing to obtain first encryption degree information and second encryption degree information comprises:
the target graph node obtains the first encryption degree information and the second encryption degree information output by a first preset algorithm in function secret sharing, wherein the input of the first preset algorithm comprises the degree of the target graph node.
4. The privacy-preserving distributed graph data feature decomposition method according to claim 3, wherein the first and second computing terminals generate first and second encryption degree distribution information of global graph data according to the first and second encryption degree information sent by the plurality of target graph nodes, and the method comprises:
the first computing terminal inputs the first encryption degree information of the target graph node and a target scale into a second preset algorithm in function secret sharing to obtain first encryption degree comparison information between the degree of the target graph node and the target scale, and the second computing terminal inputs the second encryption degree information of the target graph node and the target scale into the second preset algorithm to obtain second encryption degree comparison information between the degree of the target graph node and the target scale;
wherein the sum of the first encryption degree comparison information and the second encryption degree comparison information between the degree of the target graph node and the target scale is 1 when the degree of the target graph node is equal to the target degree, and is 0 otherwise;
the first computing terminal obtains first encryption histogram information, the second computing terminal obtains second encryption histogram information, the first encryption histogram information includes first encryption graph node quantity information corresponding to each target degree, each first encryption graph node quantity information is the sum of all the first encryption degree comparison information corresponding to one target degree, the second encryption histogram information includes second encryption graph node quantity corresponding to each target degree, and each second encryption graph node quantity information is the sum of all the second encryption degree comparison information corresponding to one target degree;
acquiring first encryption degree information between the degrees of the target graph nodes and each target scale as first encryption degree histogram information, and acquiring second encryption degree information between the degrees of the target graph nodes and each target scale as second encryption degree histogram information by the second computing terminal;
and the first computing terminal and the second computing terminal determine the first encryption degree distribution information and the second encryption degree distribution information according to the first encryption degree histogram information and the second encryption degree histogram information.
5. The privacy-preserving distributed graph data characteristic decomposition method according to claim 4, wherein each bit value in the first encryption degree distribution information and the second encryption degree distribution information is 0 or 1; the first computing terminal and the second computing terminal determine the first encryption degree distribution information and the second encryption degree distribution information according to the first encryption degree histogram information and the second encryption degree histogram information, and the method comprises the following steps:
the first computing terminal and the second computing terminal determine the number of target nodes in each interval according to the number of the target graph nodes sending the encryption degree information and the preset interval number;
the first computing terminal sequentially adds each piece of first encryption graph node quantity information in the first encryption degree histogram information into a first accumulator in order of the magnitude of the corresponding target scale, and the second computing terminal sequentially adds each piece of second encryption graph node quantity information in the second encryption degree histogram information into a second accumulator in order of the magnitude of the corresponding target scale;
each time a piece of first encryption graph node quantity information and a piece of second encryption graph node quantity information are respectively added into the first accumulator and the second accumulator, the first computing terminal obtains a first encryption comparison result according to the first accumulator and generates a new one-bit value in the first encryption degree distribution information according to the first encryption comparison result, and the second computing terminal obtains a second encryption comparison result according to the second accumulator and generates a new one-bit value in the second encryption degree distribution information according to the second encryption comparison result, wherein the XOR of the first encryption comparison result and the second encryption comparison result is 1 when the sum of the first accumulator and the second accumulator is not less than the number of target nodes, and is 0 when the sum of the first accumulator and the second accumulator is less than the number of target nodes;
the first computing terminal inverts the latest one-bit value in the first encryption degree distribution information to obtain an inverted bit, the first computing terminal obtains a first secret share based on additive secret sharing, and the second computing terminal obtains a second secret share based on additive secret sharing, wherein the sum of the first secret share and the second secret share is the product of a first value and a second value, the first value is the XOR of the inverted bit and the latest one-bit value in the second encryption degree distribution information, and the second value is the sum of the first accumulator and the second accumulator;
the first computing terminal updates the value of the first accumulator to the first secret share and adds the next piece of first encryption graph node quantity information to the first accumulator, and the second computing terminal updates the value of the second accumulator to the second secret share and adds the next piece of second encryption graph node quantity information to the second accumulator.
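The accumulator loop of claim 5 can be illustrated with a minimal sketch. The ring size `MOD`, the helper names `share`, `share_bit` and `bin_boundaries`, and the sample counts are all assumptions made for illustration. The secure comparison and the secure multiplication are replaced here by an ideal functionality that reconstructs the value in the clear, so this shows only the data flow, not a secure protocol.

```python
import secrets

MOD = 2 ** 32  # ring for additive shares (assumed; not specified in the claims)

def share(x):
    """Split x into two additive shares modulo MOD."""
    s1 = secrets.randbelow(MOD)
    return s1, (x - s1) % MOD

def share_bit(b):
    """Split a bit into two XOR shares."""
    b1 = secrets.randbelow(2)
    return b1, b1 ^ b

def bin_boundaries(hist1, hist2, target):
    """hist1/hist2: per-degree count shares held by terminal 1 and terminal 2,
    ordered by degree. Returns the two XOR-shared bit strings."""
    acc1 = acc2 = 0
    bits1, bits2 = [], []
    for c1, c2 in zip(hist1, hist2):
        acc1 = (acc1 + c1) % MOD          # local addition of shares
        acc2 = (acc2 + c2) % MOD
        # Ideal functionality standing in for the secure comparison:
        total = (acc1 + acc2) % MOD
        b1, b2 = share_bit(1 if total >= target else 0)
        bits1.append(b1)
        bits2.append(b2)
        # Terminal 1 inverts its bit; the XOR with terminal 2's bit is 1
        # while the current interval is still open, 0 once it is full.
        keep = (b1 ^ 1) ^ b2
        # Ideal functionality standing in for the secure multiplication:
        acc1, acc2 = share(keep * total)  # accumulator resets when interval closes
    return bits1, bits2

# Hypothetical example: 16 nodes, 4 intervals, so 4 nodes per interval.
counts = [3, 1, 2, 2, 4, 0, 1, 3]
shares = [share(c) for c in counts]
bits1, bits2 = bin_boundaries([s[0] for s in shares], [s[1] for s in shares], 4)
boundaries = [x ^ y for x, y in zip(bits1, bits2)]
```

In this sketch a reconstructed bit of 1 marks the degree at which an interval reaches the per-interval node count, after which both accumulator shares restart from a fresh sharing of zero.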
6. The privacy-preserving distributed graph data feature decomposition method according to claim 5, wherein the first computing terminal obtaining a first encryption comparison result according to the first accumulator and generating a new one-bit value in the first encryption degree distribution information according to the first encryption comparison result, and the second computing terminal obtaining a second encryption comparison result according to the second accumulator and generating a new one-bit value in the second encryption degree distribution information according to the second encryption comparison result, comprises:
the third computing terminal generates a first key, a second key, a first random number and a second random number according to a third preset algorithm of function secret sharing and the number of target nodes, sends the first key and the first random number to the first computing terminal, and sends the second key and the second random number to the second computing terminal;
the first computing terminal sends the sum of the first random number and the first accumulator to the second computing terminal, and the second computing terminal sends the sum of the second random number and the second accumulator to the first computing terminal, such that the first computing terminal and the second computing terminal each obtain a scrambling input value, the scrambling input value being the sum of the first random number, the second random number, the first accumulator, and the second accumulator;
the first computing terminal inputs the scrambling input value and the first key to a fourth preset algorithm of function secret sharing to obtain a first encryption bit, and the second computing terminal inputs the scrambling input value and the second key to the fourth preset algorithm of function secret sharing to obtain a second encryption bit, wherein the XOR of the first encryption bit and the second encryption bit is 1 when the sum of the first accumulator and the second accumulator is less than the number of target nodes, and is 0 when the sum of the first accumulator and the second accumulator is not less than the number of target nodes;
the first computing terminal takes the first encryption bit as the new one-bit value in the first encryption degree distribution information and the second computing terminal flips the second encryption bit and takes it as the new one-bit value in the second encryption degree distribution information; or the first computing terminal flips the first encryption bit and takes it as the new one-bit value in the first encryption degree distribution information and the second computing terminal takes the second encryption bit as the new one-bit value in the second encryption degree distribution information.
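The message flow of claim 6 can be sketched with a toy stand-in for function secret sharing. The claims do not spell out the third and fourth preset algorithms, so this sketch assumes the simplest possible construction: the dealer XOR-shares the full truth table of the masked comparison function over a small domain `N`. Real function secret sharing schemes use succinct keys. The names `gen`, `evaluate` and `compare` are hypothetical.

```python
import secrets

N = 256  # toy input domain (assumed); accumulator sums and target must be below N

def gen(target):
    """Third terminal: produce (key, mask) pairs for f(x) = [x < target]."""
    r = secrets.randbelow(N)              # combined mask r = r1 + r2 mod N
    r1 = secrets.randbelow(N)
    r2 = (r - r1) % N
    # Truth table indexed by the masked input m = x + r mod N.
    table = [1 if (m - r) % N < target else 0 for m in range(N)]
    k1 = [secrets.randbelow(2) for _ in range(N)]   # uniformly random key
    k2 = [a ^ b for a, b in zip(k1, table)]         # k1 XOR k2 == table
    return (k1, r1), (k2, r2)

def evaluate(key, masked):
    """Each terminal looks up its key at the public masked input."""
    return key[masked]

def compare(acc1, acc2, target):
    """acc1/acc2 are additive shares mod N. Returns XOR shares of [sum >= target]."""
    (k1, r1), (k2, r2) = gen(target)
    m1 = (acc1 + r1) % N                  # sent by terminal 1 to terminal 2
    m2 = (acc2 + r2) % N                  # sent by terminal 2 to terminal 1
    masked = (m1 + m2) % N                # scrambling input value, known to both
    e1 = evaluate(k1, masked)
    e2 = evaluate(k2, masked)             # e1 ^ e2 == 1 iff sum < target
    return e1, e2 ^ 1                     # terminal 2 flips its bit

# Hypothetical check: accumulators sum to 7, target is 4.
a1 = secrets.randbelow(N)
a2 = (7 - a1) % N
b1, b2 = compare(a1, a2, 4)
result = b1 ^ b2
```

Only the masked sum is ever exchanged, and each key on its own is a uniformly random bit string, which is the property the claim relies on.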
7. The privacy-preserving distributed graph data feature decomposition method according to claim 5, wherein the first computing terminal obtains a first encryption comparison result according to the first accumulator and generates a new one-bit value in the first encryption degree distribution information according to the first encryption comparison result, and the second computing terminal obtains a second encryption comparison result according to the second accumulator and generates a new one-bit value in the second encryption degree distribution information according to the second encryption comparison result, including:
the first computing terminal acquires a first random number, the second computing terminal acquires a second random number, and the sum of the first random number and the second random number is the number of the target nodes;
the first computing terminal obtains first bit data, the second computing terminal obtains second bit data, the first bit data is bit data corresponding to the difference between the first accumulator and the first random number, and the second bit data is bit data corresponding to the difference between the second accumulator and the second random number;
the first computing terminal and the second computing terminal input the bit data they respectively hold into a parallel prefix adder circuit and perform XOR gate and AND gate calculations to obtain, respectively, the most significant bit of the first bit data and the most significant bit of the second bit data, wherein the XOR of the most significant bit of the first bit data and the most significant bit of the second bit data is 1 when the sum of the first accumulator and the second accumulator is less than the number of target nodes, and is 0 when the sum of the first accumulator and the second accumulator is not less than the number of target nodes;
the first computing terminal takes the most significant bit of the first bit data as the new one-bit value in the first encryption degree distribution information and the second computing terminal flips the most significant bit of the second bit data and takes it as the new one-bit value in the second encryption degree distribution information; or the first computing terminal flips the most significant bit of the first bit data and takes it as the new one-bit value in the first encryption degree distribution information and the second computing terminal takes the most significant bit of the second bit data as the new one-bit value in the second encryption degree distribution information.
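The comparison in claim 7 rests on a sign-bit argument: if the two differences are taken modulo 2^K, their sum equals the accumulator total minus the number of target nodes, and its most significant bit is 1 exactly when that difference is negative. The sketch below evaluates a Kogge-Stone style parallel prefix adder on plaintext bits to show that only XOR and AND gates are needed. The bit width `K` and the function names are assumptions; in the protocol the AND gates would be evaluated on secret-shared bits rather than in the clear.

```python
import secrets

K = 16  # bit width of the ring (assumed); values must stay below 2**(K-1)

def bits(v):
    """Little-endian bit decomposition of v into K bits."""
    return [(v >> i) & 1 for i in range(K)]

def msb_of_sum(a, b):
    """Most significant bit of (a + b) mod 2**K using only XOR and AND."""
    x, y = bits(a), bits(b)
    g = [xi & yi for xi, yi in zip(x, y)]   # generate signals
    p = [xi ^ yi for xi, yi in zip(x, y)]   # propagate signals
    G, P = g[:], p[:]
    d = 1
    while d < K:                            # log2(K) prefix rounds
        new_g, new_p = G[:], P[:]
        for i in range(d, K):
            # G and (P AND G_lower) are never both 1, so XOR acts as OR here.
            new_g[i] = G[i] ^ (P[i] & G[i - d])
            new_p[i] = P[i] & P[i - d]
        G, P = new_g, new_p
        d *= 2
    carry_into_msb = G[K - 2]               # carry out of bits 0..K-2
    return p[K - 1] ^ carry_into_msb

def less_than(acc1, acc2, t1, t2):
    """acc1+acc2 is the accumulator total and t1+t2 the target, all mod 2**K."""
    d1 = (acc1 - t1) % (1 << K)             # first bit data
    d2 = (acc2 - t2) % (1 << K)             # second bit data
    return msb_of_sum(d1, d2)               # 1 iff total < target

def split(v):
    """Additive sharing modulo 2**K."""
    s = secrets.randbelow(1 << K)
    return s, (v - s) % (1 << K)

a1, a2 = split(3)
t1, t2 = split(4)
is_less = less_than(a1, a2, t1, t2)
```

Using XOR in place of OR when combining generate signals keeps the circuit to the two gate types named in the claim.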
8. The privacy-preserving distributed graph data feature decomposition method according to claim 1, wherein the feature decomposition of the global graph data by the first computing terminal and the second computing terminal according to the first encryption set and the second encryption set corresponding to each node in the global graph comprises:
the first computing terminal obtains a first encryption adjacency matrix according to the first encryption set corresponding to each node in the global graph, and the second computing terminal obtains a second encryption adjacency matrix according to the second encryption set corresponding to each node in the global graph;
the first computing terminal and the second computing terminal perform dimensionality reduction on the sum of the first encryption adjacency matrix and the second encryption adjacency matrix based on additive secret sharing to obtain a dimensionality reduction matrix;
the first computing terminal and the second computing terminal execute a QR algorithm on the dimensionality reduction matrix based on additive secret sharing to obtain encrypted eigenvalues and encrypted eigenvectors of the global graph data;
for the square root operation in the dimensionality reduction process, the first computing terminal and the second computing terminal iteratively calculate, through a second calculation formula based on additive secret sharing, the reciprocal of the square root;
the second calculation formula is:
Figure FDA0003579650200000051
where y'_n denotes the reciprocal of the square root obtained in the n-th iteration, and x' denotes the number whose square root is to be taken.
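The formula image is not reproduced in this text. A common iteration that matches the description (an iterate y'_n converging to the reciprocal of the square root of x', using only additions and multiplications, which is what additive secret sharing supports directly) is the Newton iteration y'_{n+1} = y'_n (3 - x' y'_n^2) / 2. The sketch below assumes that form; it is not confirmed by the source, and the function name, starting value and iteration count are illustrative. It runs on plaintext floats rather than on shares.

```python
def inv_sqrt(x, y0, iters=20):
    """Approximate 1/sqrt(x) by the Newton iteration
    y_{n+1} = y_n * (3 - x * y_n**2) / 2   (assumed form of the formula).
    Converges when 0 < y0 < sqrt(3 / x)."""
    y = y0
    for _ in range(iters):
        y = 0.5 * y * (3.0 - x * y * y)   # additions and multiplications only
    return y

approx = inv_sqrt(4.0, 0.1)   # hypothetical input; the exact value is 0.5
```

Because each step avoids division and square roots, every operation maps onto share-wise addition or a secure multiplication, at the cost of needing a starting value inside the convergence range.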
9. The privacy-preserving distributed graph data feature decomposition method according to claim 8, wherein the first computing terminal and the second computing terminal executing a QR algorithm on the dimensionality reduction matrix based on additive secret sharing comprises:
in the i-th iteration of the QR algorithm, the first computing terminal obtains a first encryption matrix and the second computing terminal obtains a second encryption matrix, wherein the sum of the first encryption matrix and the second encryption matrix is a target matrix, and the target matrix is the matrix formed by the elements at positions (i, i), (i, i+1), (i+1, i) and (i+1, i+1) of the plaintext Givens rotation matrix used in the i-th iteration;
for matrix multiplication in the QR algorithm, the first computing terminal and the second computing terminal realize multiplication in additive secret sharing by taking the first encryption matrix and the second encryption matrix as two secret shares of a Givens rotation matrix in the QR algorithm based on a randomly generated multiplication tuple matrix.
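The multiplication step of claim 9 can be sketched with a Beaver-style matrix triple applied to the 2x2 block of a Givens rotation. Several things here are assumptions for illustration: shares are plain Python floats rather than fixed-point ring elements, the triple is generated inline instead of by a separate dealer, and the names `beaver_matmul`, `share_mat` and the sample block are hypothetical.

```python
import math
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def rand_mat():
    return [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]

def share_mat(X):
    """Additive sharing of a 2x2 matrix (floats stand in for ring elements)."""
    R = rand_mat()
    return R, sub(X, R)

def beaver_matmul(X1, X2, Y1, Y2):
    """Shares of X @ Y from shares of X and Y, using a triple C = A @ B."""
    A, B = rand_mat(), rand_mat()          # random multiplication tuple
    A1, A2 = share_mat(A)
    B1, B2 = share_mat(B)
    C1, C2 = share_mat(matmul(A, B))
    E = add(sub(X1, A1), sub(X2, A2))      # X - A, opened by both terminals
    F = add(sub(Y1, B1), sub(Y2, B2))      # Y - B, opened by both terminals
    Z1 = add(add(matmul(E, F), matmul(E, B1)), add(matmul(A1, F), C1))
    Z2 = add(matmul(E, B2), add(matmul(A2, F), C2))
    return Z1, Z2                          # Z1 + Z2 == X @ Y

# Givens block that zeroes the lower-left entry of M = [[3, 1], [4, 2]].
M = [[3.0, 1.0], [4.0, 2.0]]
r = math.hypot(M[0][0], M[1][0])
c, s = M[0][0] / r, M[1][0] / r
G = [[c, s], [-s, c]]                      # entries (i,i), (i,i+1), (i+1,i), (i+1,i+1)
G1, G2 = share_mat(G)                      # first and second encryption matrices
M1, M2 = share_mat(M)
Z1, Z2 = beaver_matmul(G1, G2, M1, M2)
rotated = add(Z1, Z2)
```

Sharing only the 2x2 block instead of the full rotation matrix keeps the secure multiplication small, since the rest of a Givens matrix is the identity.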
10. A privacy-preserving distributed graph data feature decomposition system, comprising a target graph node, a first computing terminal and a second computing terminal, wherein the target graph node, the first computing terminal, and the second computing terminal cooperate to perform the privacy-preserving distributed graph data feature decomposition method of any one of claims 1-9.
CN202210341719.6A 2022-04-02 2022-04-02 Privacy-protected distributed graph data feature decomposition method and system Active CN114692200B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210341719.6A CN114692200B (en) 2022-04-02 2022-04-02 Privacy-protected distributed graph data feature decomposition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210341719.6A CN114692200B (en) 2022-04-02 2022-04-02 Privacy-protected distributed graph data feature decomposition method and system

Publications (2)

Publication Number Publication Date
CN114692200A true CN114692200A (en) 2022-07-01
CN114692200B CN114692200B (en) 2024-06-14

Family

ID=82141100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210341719.6A Active CN114692200B (en) 2022-04-02 2022-04-02 Privacy-protected distributed graph data feature decomposition method and system

Country Status (1)

Country Link
CN (1) CN114692200B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104094573A (en) * 2011-12-27 2014-10-08 意大利电信股份公司 Dynamic pseudonymization method for user data profiling networks and user data profiling network implementing the method
CN108055118A (en) * 2017-12-11 2018-05-18 东北大学 A kind of diagram data intersection computational methods of secret protection
WO2018174873A1 (en) * 2017-03-22 2018-09-27 Visa International Service Association Privacy-preserving machine learning
CN109740376A (en) * 2018-12-21 2019-05-10 哈尔滨工业大学(深圳) Location privacy protection method, system, equipment and medium based on NN Query
WO2019115697A1 (en) * 2017-12-14 2019-06-20 Robert Bosch Gmbh Method for faster secure multiparty inner product with spdz
WO2019202586A1 (en) * 2018-04-17 2019-10-24 B. G. Negev Technologies & Applications Ltd., At Ben-Gurion One-round secure multiparty computation of arithmetic streams and evaluation of functions
US20190372760A1 (en) * 2018-06-04 2019-12-05 Robert Bosch Gmbh Method and System for Fault Tolerant and Secure Multiparty Computation with SPDZ
CN112765657A (en) * 2021-01-15 2021-05-07 西安电子科技大学 Privacy protection method, system, storage medium and application of distributed support vector machine
CN112819058A (en) * 2021-01-26 2021-05-18 武汉理工大学 Distributed random forest evaluation system and method with privacy protection attribute

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
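He Ming; Chang Mengmeng; Wu Xiaofei: "A collaborative filtering recommendation method based on differential privacy protection", Journal of Computer Research and Development, vol. 54, no. 7, 31 December 2017 (2017-12-31) *
Zhou Jun; Shen Huajie; Lin Zhongyun; Cao Zhenfu; Dong Xiaolei: "Research progress on privacy protection in edge computing", Journal of Computer Research and Development, no. 10, 9 October 2020 (2020-10-09) *
Zhang Yingguang; Su Sen; Chen Weifeng; Yang Fangchun: "Research on a privacy-preserving shortest distance computation method in cloud environments", Journal of Huazhong University of Science and Technology (Natural Science Edition), no. 2, 10 January 2014 (2014-01-10) *
Shen Huafeng; Feng Xinyang; Shao Chao: "A privacy protection method for edge weights in graph data in a cloud environment", Video Engineering, no. 10, 5 October 2018 (2018-10-05) *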

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117150545A (en) * 2023-08-11 2023-12-01 湖北大学 Data evaluation method based on optimized distributed computation
CN117150545B (en) * 2023-08-11 2024-07-30 湖北大学 Data evaluation method based on optimized distributed computation
CN118468326A (en) * 2024-07-15 2024-08-09 成都进托邦互联网信息服务有限公司 Webpage data encryption method
CN118468326B (en) * 2024-07-15 2024-09-13 成都进托邦互联网信息服务有限公司 Webpage data encryption method

Also Published As

Publication number Publication date
CN114692200B (en) 2024-06-14

Similar Documents

Publication Publication Date Title
US20150381349A1 (en) Privacy-preserving ridge regression using masks
CN114692200B (en) Privacy-protected distributed graph data feature decomposition method and system
CN114817958B (en) Model training method, device, equipment and medium based on federal learning
He et al. Privacy-preserving and low-latency federated learning in edge computing
CN111291411B (en) Safe video anomaly detection system and method based on convolutional neural network
CN114547643A (en) Linear regression longitudinal federated learning method based on homomorphic encryption
CN115510502B (en) PCA method and system for privacy protection
Xue et al. Secure and privacy-preserving decision tree classification with lower complexity
Wang et al. Privacy-preserving analytics on decentralized social graphs: The case of eigendecomposition
Battarbee et al. Cryptanalysis of semidirect product key exchange using matrices over non-commutative rings
Zhao et al. SGBoost: An efficient and privacy-preserving vertical federated tree boosting framework
Bansal Survey on homomorphic encryption
Xiong et al. Decentralized privacy-preserving truth discovery for crowd sensing
Zhao et al. VFLR: An efficient and privacy-preserving vertical federated framework for logistic regression
Qin et al. Cryptographic Primitives in Privacy-Preserving Machine Learning: A Survey
Zheng et al. SecDR: Enabling secure, efficient, and accurate data recovery for mobile crowdsensing
CN111277406B (en) Block chain-based safe two-direction quantity advantage comparison method
Kotukh et al. Method of Security Improvement for MST3 Cryptosystem Based on Automorphism Group of Ree Function Field
CN117131942A (en) Drawing meaning network reasoning method and system with privacy protection
Zhang et al. Efficient federated learning framework based on multi-key homomorphic encryption
Gupta et al. Secure computation from leaky correlated randomness
CN116094708A (en) Privacy protection method, terminal and storage medium of DBSCAN algorithm
Shen et al. Privacy-preserving multi-party deep learning based on homomorphic proxy re-encryption
Feng et al. Secure distributed outsourcing of large-scale linear systems
Zhao et al. PPCNN: An efficient privacy‐preserving CNN training and inference framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant