CN112348192B - Knowledge federation-based knowledge reasoning method, system, equipment and medium - Google Patents

Knowledge federation-based knowledge reasoning method, system, equipment and medium

Info

Publication number
CN112348192B
CN112348192B (application CN202010988090.5A)
Authority
CN
China
Prior art keywords
client
data
knowledge
network
reasoning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010988090.5A
Other languages
Chinese (zh)
Other versions
CN112348192A (en)
Inventor
孟丹
张宇
李宏宇
李晓林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongdun Holdings Co Ltd
Original Assignee
Tongdun Holdings Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongdun Holdings Co Ltd filed Critical Tongdun Holdings Co Ltd
Priority to CN202010988090.5A priority Critical patent/CN112348192B/en
Publication of CN112348192A publication Critical patent/CN112348192A/en
Application granted granted Critical
Publication of CN112348192B publication Critical patent/CN112348192B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Storage Device Security (AREA)

Abstract

The invention relates to a knowledge federation-based knowledge reasoning method, system, equipment and medium, belonging to the field of machine learning, wherein the method comprises the following steps: receiving a reasoning request sent by a client, wherein the reasoning request carries characteristic data corresponding to the entity data to be inferred; the client encodes the entity data to be inferred through a local model generated in advance to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to generate the characteristic data; and feeding back a ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network, wherein the ciphertext reasoning result is decrypted by the client to obtain a plaintext reasoning result. With this method, the original entity data of the client cannot be leaked, thereby guaranteeing the data privacy of the client and reducing the amount of data transmitted.

Description

Knowledge federation-based knowledge reasoning method, system, equipment and medium
Technical Field
The present invention relates to the field of machine learning, and in particular, to a knowledge reasoning method, system, electronic device, and storage medium based on knowledge federation.
Background
Hidden, unknown, but valuable information in big data can be mined through a knowledge network built on prior human knowledge. In realizing such data mining, the expression of the knowledge network is very important work. Based on the knowledge network expression, knowledge reasoning can be performed; knowledge reasoning refers to the operational process by which a computer or intelligent machine solves a problem. In other words, knowledge reasoning is the process of a machine thinking and solving problems by using the relations between pieces of knowledge; its aim is to explore the relation between a problem and a conclusion, so as to reach the conclusion from known knowledge. Knowledge reasoning is widely applicable in artificial intelligence fields such as risk analysis and event prediction.
However, in a knowledge-sharing scenario among multiple clients, knowledge reasoning often requires constructing a knowledge network by connecting each client as a knowledge node, so that knowledge can flow freely among different knowledge sources and more comprehensive, more valuable knowledge can be created and mined. This raises the following problems: on the one hand, the data in each client is at risk of leakage; on the other hand, network transmission is required between the clients and the server, and as the number of clients increases, the overhead of network transmission becomes a bottleneck for improving system performance.
Disclosure of Invention
In order to solve the above problems, in a first aspect, an embodiment of the present invention provides a knowledge reasoning method based on knowledge federation, applied to a server, the method comprising: receiving a reasoning request sent by a client, wherein the reasoning request carries characteristic data corresponding to the entity data to be inferred; the client encodes the entity data to be inferred through a local model generated in advance to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to obtain the characteristic data;
And feeding back a ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network, wherein the ciphertext reasoning result is used for the client to decrypt so as to obtain a plaintext reasoning result.
Optionally, compressing the encoded data to obtain compressed data includes: compressing the encoded data by a singular value decomposition method to obtain the compressed data.
Optionally, the local model and the knowledge network are obtained by:
Selecting a reference client from a plurality of clients connected with the server, wherein the reference client holds labels relating its own entity data to the entity data of other clients; the following steps are performed cyclically until a preset stopping condition is reached, and the hypothesized local model and hypothesized knowledge network obtained in the last cycle serve as the local model of each client and the knowledge network of the server:
Acquiring characteristic data of each client, wherein the characteristic data is obtained by each client encoding its entity data with a preset hypothesized local model to express knowledge, compressing the encoded data, and encrypting the compressed data;
pairing a plurality of the feature data from all clients, wherein one feature data in each pair of feature data is from the reference client;
Learning a network topological relation between entity data pairs corresponding to each pair of characteristic data to train a preset hypothesized knowledge network, and sending the network topological relation to a reference client;
Receiving ciphertext loss from the reference client, wherein the ciphertext loss is calculated and encrypted by the reference client, based on a preset loss function, from its labels and the network topological relation sent by the server;
And training the hypothesized local models of the plurality of clients and the hypothesized knowledge network in a combined way based on the ciphertext loss by adopting a gradient descent method to update the hypothesized knowledge network and the hypothesized local models.
Optionally, the assumed knowledge network is a neural network model, and the gradient descent method includes:
Using a least-squares loss, the update gradient of the k-th layer neural network of the hypothesized knowledge network is calculated iteratively according to the chain rule:

[∂ℓ/∂s_k] = [∂ℓ/∂s_{k+1}] · ∂s_{k+1}/∂s_k

wherein [s_k] is the input to the k-th layer neural network of the hypothesized knowledge network;

The update expression of the weight parameter w_k of the k-th layer neural network of the hypothesized knowledge network is:

w_k ← w_k − λ · [∂ℓ/∂s_{k+1}] · ∂s_{k+1}/∂w_k

The update gradient of the hypothesized knowledge network with respect to the input s_1 of the first-layer neural network is [∂ℓ/∂s_1],

wherein u_i and u_j respectively denote the feature data corresponding to the entity data in client i and in client j. The encryption gradients of the feature data u_i from client i and u_j from client j are denoted [L_i] and [L_j], obtained by splitting [∂ℓ/∂s_1] along the dimensions of u_i and u_j within the first-layer input s_1 = (u_i, u_j);

Client i and client j respectively decrypt and decompress the received encryption gradients [L_i] and [L_j] to obtain plaintext gradients L_i and L_j, and respectively update the weight parameters of their hypothesized local models:

W_i ← W_i − λ·L_i,  W_j ← W_j − λ·L_j

where λ is the learning rate.
Optionally, the neural network model includes any one of a multi-layer neural network model, a recurrent neural network model, and a convolutional neural network model.
Optionally, the convolutional neural network model is a triplet convolutional neural network model.
In a second aspect, an embodiment of the present invention provides a knowledge reasoning method based on knowledge federation, applied to a client, where the method includes the following steps:
Sending a reasoning request to a server, wherein the reasoning request carries characteristic data corresponding to the entity data to be inferred; the client encodes the entity data to be inferred through a local model generated in advance to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to obtain the characteristic data;
receiving a ciphertext reasoning result sent by the server, decrypting the ciphertext reasoning result to obtain a plaintext reasoning result, wherein the server feeds back the ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network.
In a third aspect, an embodiment of the present invention provides a knowledge reasoning system based on knowledge federation, including a server and a client, where the server is configured to receive a reasoning request sent by the client; feeding back a ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network;
The client is used for sending a reasoning request to the server, wherein the reasoning request carries characteristic data corresponding to the entity data to be inferred; the client encodes the entity data to be inferred through a local model generated in advance to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to generate the characteristic data; and receiving the ciphertext reasoning result sent by the server, and decrypting the ciphertext reasoning result to obtain a plaintext reasoning result.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including a processor and a storage medium storing a computer program, where the computer program when executed by the processor implements a knowledge reasoning method as described in any of the above.
In a fifth aspect, embodiments of the present invention provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a knowledge reasoning method as claimed in any of the preceding claims.
According to the knowledge federation-based knowledge reasoning method, the data transmitted from the client to the server and the data synchronized from the server to the client are all encrypted, and the original entity data of the client cannot be leaked, so that the data privacy security of the client is guaranteed. Because the original data of the client is encoded, compressed, and encrypted before being uploaded to the server, the circulation of data (i.e., the knowledge federation of the present application) can be realized while the original data never leaves the local device. In addition, the entity data of the client is compressed after being encoded, which reduces the amount of data transmitted and effectively addresses the problem of network transmission overhead becoming a bottleneck for system performance as the number of clients grows. Furthermore, the reduced data volume improves transmission efficiency, alleviates slow server responses, and saves both CPU computation cost and storage-medium cost.
Drawings
FIG. 1 is a flow diagram of a knowledge federation-based knowledge reasoning method in accordance with an embodiment of the present invention;
FIG. 2 (a) is a schematic diagram of data reconstruction of a client according to an embodiment, and FIG. 2 (b) is a schematic diagram of knowledge network construction of a server according to an embodiment;
FIG. 3 is a schematic diagram of the virtual relationship between a client and a server in the knowledge reasoning process based on knowledge federation in embodiment 1;
FIG. 4 is a schematic diagram of the knowledge federation-based knowledge reasoning system of embodiment 2;
Fig. 5 is a schematic structural diagram of the electronic device of embodiment 3.
Detailed Description
The invention will now be described in more detail with reference to the accompanying drawings. It should be noted that the description below is given by way of illustration only and not by way of limitation. Various embodiments may be combined with one another to form further embodiments not explicitly described below.
Referring to fig. 1, the knowledge reasoning method based on knowledge federation according to an embodiment of the present invention includes the following steps:
S1: receiving a reasoning request sent by a client, wherein the reasoning request carries characteristic data corresponding to the entity data to be inferred; the client encodes the entity data to be inferred through a local model generated in advance to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to obtain the characteristic data;
S2: and feeding back a ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network, wherein the ciphertext reasoning result is used for the client to decrypt so as to obtain a plaintext reasoning result.
In the knowledge reasoning process, the client and the server exchange only characteristic data, i.e., encrypted data, which ensures knowledge sharing while also protecting user data security. Specifically, the data transmitted from the client to the server and the data synchronized from the server to the client are all encrypted, and the original entity data of the client cannot be leaked, thereby ensuring the data privacy security of the client. Because the original data of the client is encoded, compressed, and encrypted before being uploaded to the server, the circulation of data can be realized while the original data never leaves the local device. In addition, the entity data of the client is compressed after being encoded, which reduces the amount of data transmitted and effectively addresses the problem of network transmission overhead becoming a bottleneck for system performance as the number of clients grows.
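The client-side encode, compress, and encrypt pipeline of steps S1 and S2 can be sketched as follows. This is a minimal illustration only: the linear encoder, the truncated-SVD compressor, and the byte-wise XOR "cipher" are invented stand-ins (the patent itself uses a trained local model and homomorphic encryption), and all names and shapes are assumptions.

```python
import numpy as np

def encode(entity: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Illustrative local model: a linear encoding of the raw entity data."""
    return W @ entity

def compress(encoded: np.ndarray, n_sv: int = 2):
    """Truncated SVD keeps only the n_sv largest singular values/vectors."""
    U, s, Vt = np.linalg.svd(encoded, full_matrices=False)
    return U[:, :n_sv], s[:n_sv], Vt[:n_sv, :]

def encrypt(parts, key: int = 0x5A):
    """Placeholder cipher (byte-wise XOR); the patent uses homomorphic encryption."""
    return [bytes(b ^ key for b in p.tobytes()) for p in parts]

# Client side: entity data -> encoded -> compressed -> encrypted feature data
rng = np.random.default_rng(0)
entity_matrix = rng.normal(size=(16, 8))   # a batch of raw entity vectors
W = rng.normal(size=(16, 16))              # hypothesized local model weights
feature_data = encrypt(compress(encode(entity_matrix, W)))
```

The server only ever receives `feature_data`; neither the raw `entity_matrix` nor the local weights `W` leave the client.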
Optionally, compressing the encoded data specifically includes compressing the encoded data by singular value decomposition. Singular value decomposition (SVD) is an important matrix decomposition in linear algebra that maps a data set to a lower-dimensional space, thereby achieving dimensionality reduction and compression. After singular value decomposition, the singular values of the data set are ordered by importance. In this embodiment, after aggregating the encoded data of each client, SVD is applied layer by layer: the gradient matrix of each layer is decomposed as U × S × V, where U is the matrix of left singular vectors, S is a diagonal matrix whose diagonal elements are the singular values, and V is the matrix of right singular vectors. With the singular values arranged in descending order, only the first n non-zero singular values (two or three, adjustable according to actual needs) and the corresponding singular vectors are retained, that is, only the first n columns of U (the first n left singular vectors), the first n rows of V (the first n right singular vectors), and the first n diagonal elements of S (the n largest singular values) are selected. Because the non-zero singular values are ordered by importance, keeping only the first few reduces the transmission amount without affecting the accuracy of subsequent model training.
A specific example illustrates the effect of SVD processing: suppose that, after the encoded data of each client is aggregated, the gradient matrix of a certain layer is a 500 × 800 matrix. Without SVD processing, normal transmission would require sending a matrix of 400,000 numbers; after SVD approximation keeping 2 singular values, only 2 left singular vectors of length 500, 2 right singular vectors of length 800, and the 2 singular values need to be transmitted, a total of (2 × 500 + 2 × 800 + 2) = 2602 numbers.
If the encoded data in the client were transmitted directly without compression, using encryption (e.g., homomorphic encryption), the transmission amount would be at least 64 times the original data (depending on the key space used for encryption). With SVD compression, however, the transmission amount is about 1/10 of the original data, so after homomorphic encryption it is about 6.4 times the original data, achieving the beneficial effect of reducing the transmission amount.
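The transmission counts in the example above can be checked numerically. The sketch below assumes the same 500 × 800 gradient matrix and keeps the 2 largest singular values; the random matrix is a stand-in for a real layer gradient.

```python
import numpy as np

rows, cols, n_sv = 500, 800, 2
rng = np.random.default_rng(42)
G = rng.normal(size=(rows, cols))            # stand-in for a layer's gradient matrix

U, s, Vt = np.linalg.svd(G, full_matrices=False)
U_k, s_k, Vt_k = U[:, :n_sv], s[:n_sv], Vt[:n_sv, :]

full_count = G.size                          # numbers sent without compression
svd_count = U_k.size + s_k.size + Vt_k.size  # numbers sent after truncated SVD

print(full_count, svd_count)                 # 400000 vs 2602
```

The counts match the text: 400,000 numbers uncompressed versus 2 × 500 + 2 × 800 + 2 = 2602 after truncation.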
It should be noted that, when transmission efficiency is not a concern, in the knowledge federation-based knowledge reasoning method according to another embodiment of the present invention, the client may leave the encoded data uncompressed.
The local model and the way in which the knowledge network is obtained will be described below.
In the present application, the client encodes entity data to express knowledge, so that knowledge in different forms of expression is unified under the same framework, achieving adaptation to different problem domains. Referring to fig. 2(a), to achieve knowledge adaptation and user privacy protection among multiple clients, each client encodes its entity data to express knowledge, realizing knowledge reconstruction within the client, then compresses the encoded data and encrypts the compressed data to generate feature data. Referring to fig. 2(b), the server learns the network topological relations between entity data from the large amount of acquired feature data to construct a knowledge network, which includes the relations between pieces of knowledge.
As an example, the local model and the knowledge network are obtained by:
Selecting a reference client from the plurality of clients connected with the server, wherein, in addition to its own entity data, the reference client holds labels, corresponding to the same entity data, that the other clients do not hold;
the following steps are performed cyclically until a preset stopping condition is reached, and the hypothesized local model and hypothesized knowledge network obtained in the last cycle serve as the local model of each client and the knowledge network of the server:
acquiring characteristic data of each client, wherein the characteristic data is obtained by respectively encoding local entity data by using a preset assumed local model to express knowledge, then compressing the encoded data and finally encrypting the compressed data;
pairing a plurality of the feature data from all clients, wherein one feature data in each pair of feature data is from the reference client;
learning a network topology relation between entity data pairs corresponding to each pair of characteristic data to train a preset hypothesized knowledge network, and sending ciphertext expression of the network topology relation to a reference client;
receiving ciphertext loss from the reference client, wherein the ciphertext loss is calculated and encrypted by the reference client, based on a preset loss function, from its labels and from the network topological relation obtained by decrypting, with its private key, the ciphertext expression sent by the server;
And training the hypothesized local models of the plurality of clients and the hypothesized knowledge network in a combined way based on the ciphertext loss by adopting a gradient descent method to update the hypothesized knowledge network and the hypothesized local models.
The preset stopping condition may be the number of cycles or other conditions set according to the needs, for example, the difference value of the loss functions in two adjacent cycles is smaller than a certain set value, etc.
The labels held by the reference client can be understood as the topological relations between the entity data of the reference client and the entity data in other clients; the number of labels held by the reference client should be large enough to ensure the accuracy of subsequent model training.
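As a concrete illustration, such labels can be pictured as relation triples linking an entity held by the reference client to an entity held by another client; the entity IDs and relation names below are invented for illustration only.

```python
# Hypothetical labels held by the reference client: each label states the
# topological relation between one of its own entities and an entity that
# lives in another client, expressed as a (head, relation, tail) triple.
labels = [
    ("user_001", "transacts_with", "merchant_17"),  # entity in client i -> client j
    ("user_001", "shares_device",  "user_042"),
    ("user_002", "transacts_with", "merchant_03"),
]

# During training, these labels supervise the relation predicted by the
# hypothesized knowledge network for the corresponding feature-data pair.
supervision = {(h, t): r for h, r, t in labels}
print(supervision[("user_001", "merchant_17")])   # transacts_with
```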
For example, a client trains a hypothesized local model on a training set comprising several samples, each sample comprising two entity data (entity data may comprise an ID and characteristic information) and the relationship between them. For instance, a sample in the m-th client contains a pair of entity data (h, t), which is encoded by the hypothesized local model f_m to generate encoded data f_m(h; W_m) and f_m(t; W_m), wherein W_m is the weight parameter to be learned by the hypothesized local model; the weight parameters may be randomly initialized, and the hypothesized local model is then optimized by back propagation under the supervision of the labels held by the reference client for its local entity data;
Referring to fig. 3, client i (designated as the reference client) encodes its own entity data h_i with its hypothesized local model to generate encoded data u_i = f_i(h_i; W_i), and client j encodes its own entity data t_j with its hypothesized local model to generate encoded data u_j = f_j(t_j; W_j), where i ≥ 1, j ≥ 1, and W_i and W_j are the weight parameters to be learned by the hypothesized local models of client i and client j respectively; client i and client j respectively compress and encrypt the encoded data u_i and u_j to generate feature data [u_i] and [u_j], and upload the feature data to the server;
In this way, the server obtains a number of feature data uploaded by the clients and pairs them, with one feature datum in each pair coming from the reference client; it learns the network topological relation between the entity data pair corresponding to each pair of feature data according to the neural network model, so as to train the preset hypothesized knowledge network, e.g., by learning the network topological relation between the entity data (h_i, t_j) to construct the hypothesized knowledge network r = g([u_i], [u_j]; W_g), wherein W_g is the weight parameter to be learned by the hypothesized knowledge network;
Using a gradient descent method, the hypothesized local models f_m(·; W_m) of, for example, P clients and the hypothesized knowledge network g are jointly trained to generate the knowledge network and the local models, wherein m denotes the m-th client and 1 ≤ m ≤ P.
As one example, the gradient descent method includes:
Using a least-squares loss, the update gradient of the k-th layer neural network of the hypothesized knowledge network is calculated iteratively according to the chain rule:

[∂ℓ/∂s_k] = [∂ℓ/∂s_{k+1}] · ∂s_{k+1}/∂s_k

wherein [s_k] is the input to the k-th layer neural network of the hypothesized knowledge network;

The update expression of the weight parameter of the k-th layer neural network of the hypothesized knowledge network is:

w_k ← w_k − λ · [∂ℓ/∂s_{k+1}] · ∂s_{k+1}/∂w_k

wherein w_k is the weight parameter of the k-th layer neural network of the hypothesized knowledge network;

The update gradient of the hypothesized knowledge network with respect to the input s_1 of the first-layer neural network is [∂ℓ/∂s_1],

wherein ∂s_{k+1}/∂s_k is a composite derivative, and an activation function may follow the convolutional layer.

The encryption gradients of the feature data [u_i] from client i and [u_j] from client j are denoted [L_i] and [L_j] respectively, obtained by splitting [∂ℓ/∂s_1] along the dimensions of u_i and u_j within the first-layer input s_1 = (u_i, u_j);

Client i and client j respectively decrypt and decompress the received encryption gradients [L_i] and [L_j] to obtain plaintext gradients L_i and L_j, and respectively update the parameters of their hypothesized local models:

W_i ← W_i − λ·L_i,  W_j ← W_j − λ·L_j

where λ is the learning rate.
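The dimension-wise split of the first-layer input gradient into the per-client pieces L_i and L_j can be illustrated in plaintext (in the actual scheme these pieces remain encrypted until each client decrypts its own); the dimensions and learning rate below are invented for illustration.

```python
import numpy as np

# First-layer input s_1 is the concatenation (u_i, u_j); its gradient therefore
# splits along the same dimensions into the pieces returned to each client.
dim_i, dim_j = 4, 6
grad_s1 = np.arange(dim_i + dim_j, dtype=float)   # stand-in for dL/ds_1

L_i, L_j = grad_s1[:dim_i], grad_s1[dim_i:]       # [L_i] for client i, [L_j] for client j

# Each client updates its hypothesized local model weights with its own piece:
lam = 0.1                                         # learning rate lambda
W_i = np.zeros(dim_i)
W_i -= lam * L_i
```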
It should be noted that client i, acting as the reference client, may distribute keys to the other clients, and all clients are connected to the server, so that the server can jointly train the hypothesized local models and the hypothesized knowledge network with the clients.
On the basis of the obtained local models and knowledge network, a query client can send a query request to the server, wherein the query request carries the feature data corresponding to the entity data to be queried as the query target; after receiving the query request, the server acquires the feature data corresponding to all entity data from the other clients, learns with the knowledge network the ciphertext relationship between the feature data of the query target and the feature data of the other clients, and feeds the ciphertext relationship back to the query client, which decrypts it to obtain the plaintext relationship.
It should be noted that the "generated or constructed local model and knowledge network" described above refers to a local model and knowledge network trained up to a certain stage; the more training data used in training them, the more complete the resulting local model and knowledge network become, so when new training sets are added, learning can continue from the already trained results.
Through the content, the original data of the client is encoded, compressed and encrypted and then uploaded to the server, so that the data circulation can be realized under the condition that the original data does not exist locally.
Optionally, the neural network model includes, but is not limited to, a multi-layer neural network model, a recurrent neural network model (RNN, recurrent Neural Network), a convolutional neural network model (CNN, convolutional Neural Networks).
Further, the convolutional neural network model may be a triplet convolutional neural network model.
Furthermore, the method also comprises distributing keys among the clients by a homomorphic encryption method; the server does not hold any key and therefore performs no encryption or decryption operations.
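The additive property that allows the server to operate on ciphertexts while holding no key can be demonstrated with a textbook Paillier scheme, one common choice of additively homomorphic encryption (the patent does not name a specific scheme). The tiny primes below are for illustration only and provide no real security.

```python
import math, random

def keygen(p=10007, q=10009):
    """Textbook Paillier key generation with toy-sized primes (insecure)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)              # valid because we fix g = n + 1
    return (n, n + 1), (lam, mu, n)   # (public key), (private key)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:        # r must be a unit mod n
        r = random.randrange(2, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def decrypt(priv, c):
    lam, mu, n = priv
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n
    return L * mu % n

pub, priv = keygen()
c1, c2 = encrypt(pub, 42), encrypt(pub, 58)
c_sum = c1 * c2 % (pub[0] ** 2)       # ciphertext product = encrypted sum
print(decrypt(priv, c_sum))           # 100
```

Multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a power scales the plaintext, which is exactly what a key-less server needs to aggregate encrypted losses and gradients.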
According to the above, the technical scheme of the application has the following advantages:
(1) Defining a knowledge network using a network topology form;
(2) Adaptation to different problem domains is achieved by encoding the original knowledge;
(3) Only encrypted data is transmitted in the training and reasoning process, so that the privacy of a user is protected;
(4) The entity data of the client is compressed after being encoded, so that the transmission quantity of the data is reduced, and the problem that the overhead of network transmission becomes the bottleneck of system performance improvement along with the continuous increase of the number of the clients can be effectively solved.
The construction of the knowledge network is described in detail by way of example.
Example 1
The scenario of this embodiment is cross-feature federated learning (i.e., vertical federated learning), applicable to the following case: the sample IDs largely overlap, but there is little overlap in the feature dimension, and each client holds the user's features in a different field. When only one client holds labels, a cross-domain knowledge network can be adopted to realize a secure knowledge network architecture and knowledge reasoning. For convenience of description, take one client i, one client j, and one server C as an example, where client i holds the labels.
An iterative process for building a secure knowledge network is as follows:
1. Initializing:
a. designating client i as the reference client, the reference client i holding the labels;
b. client i generates a homomorphic-encryption public and private key pair (pub_key, pri_key) and transmits the key pair (pub_key, pri_key) to client j;
c. defining a hypothesized local model and a hypothesized knowledge network in advance, and presetting the corresponding weight parameters for each client.
2. Each client encodes with its preset hypothesized local model, then sequentially compresses and encrypts the encoded entity data to generate the feature data of each entity, and sends the feature data to the server. For example, client i encodes its entity data h_i with the hypothesized local model to generate encoded data u_i = f_i(h_i; W_i), and client j encodes its entity data t_j with the hypothesized local model to generate encoded data u_j = f_j(t_j; W_j), where i ≥ 1, j ≥ 1, and W_i and W_j are the weight parameters to be learned by the hypothesized local models of client i and client j respectively; client i and client j respectively compress (SVD) and encrypt the encoded data u_i and u_j to generate feature data [u_i] and [u_j], and upload the feature data to server C.
3. The server C pairs the feature data uploaded by all clients, for example obtaining a pairing ([u_i], [v_j]), where the feature data [u_i] comes from client i, the reference client (one feature data item in every pairing must come from the reference client). For each pairing ([u_i], [v_j]), the server learns the network topological relation between the corresponding entity data pair (h_i, t_j) to build the hypothesized knowledge network, whose weight parameters W need to be learned, thereby obtaining the ciphertext expression r' of the network topological relation between all entity data in the hypothesized knowledge network;
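The pairing rule in step 3 (every pairing must contain exactly one feature item from the reference client) can be sketched as follows; the client names and ciphertext placeholders are hypothetical:

```python
from itertools import product

# Hypothetical ciphertext feature data keyed by client; client_i is the reference client
features = {
    "client_i": ["[u_1]", "[u_2]"],            # reference client (holds the labels)
    "client_j": ["[v_1]", "[v_2]", "[v_3]"],
}
reference = "client_i"

def make_pairings(features, reference):
    """Pair reference-client feature data with every other client's feature data."""
    ref_items = features[reference]
    pairs = []
    for client, items in features.items():
        if client == reference:
            continue
        pairs.extend(product(ref_items, items))  # cross-product: one ref item per pair
    return pairs

pairs = make_pairings(features, reference)
assert len(pairs) == 6                           # 2 reference items x 3 items from client_j
assert all(p[0] in features[reference] for p in pairs)
```

Each resulting pair is then fed to the hypothesized knowledge network to learn the topological relation between the underlying entity pair.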
4. The server C synchronizes the ciphertext expression r' to client i, and client i decrypts the ciphertext expression r' with the private key pri_key to obtain the plaintext expression r of the network topological relation between entity data in the hypothesized knowledge network. Client i computes the loss from the held labels y and the plaintext expression r (the loss function may be defined as the least-squares loss, loss = Σ(y − r)²), encrypts the loss value with the public key pub_key to obtain the ciphertext loss, and uploads the ciphertext loss to the server C;
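Step 4's loss computation by client i can be sketched with hypothetical plaintext relation scores and labels, using the least-squares loss loss = Σ(y − r)² stated above:

```python
# Hypothetical plaintext relation scores r decrypted by client i, and its held labels y
r = [0.9, 0.2, 0.7]
y = [1.0, 0.0, 1.0]

# Least-squares loss as in the embodiment: sum of squared residuals
loss = sum((yi - ri) ** 2 for yi, ri in zip(y, r))
# loss = 0.01 + 0.04 + 0.09 = 0.14
assert abs(loss - 0.14) < 1e-9

# Client i would now encrypt `loss` with pub_key before uploading it to server C,
# so the server back-propagates on a ciphertext it cannot read.
```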
5. The server C performs back propagation based on the ciphertext loss, computes the gradient ciphertext of the hypothesized knowledge network, and updates the weight parameters of the hypothesized knowledge network: W_k ← W_k − λ·[∂loss/∂W_k], where λ is the learning rate and W_k is the weight parameter of the k-th layer neural network of the hypothesized knowledge network. The gradient ciphertext [∂loss/∂W_k] is obtained by the following steps:
Using the least-squares loss, the update gradient of the k-th layer neural network of the hypothesized knowledge network is computed iteratively according to the chain rule from the ciphertext gradient [∂loss/∂s_k] with respect to the layer input, wherein [s_k] is the input to the k-th layer neural network of the hypothesized knowledge network. Propagating backward in this way yields the update gradient of the hypothesized knowledge network with respect to the input [s_1] of the first-layer neural network:
Denote the encryption gradients of the feature data [u_i] from client i and the feature data [v_j] from client j as [L_i] and [L_j] respectively, where [L_i] and [L_j] are obtained by splitting the gradient with respect to the input [s_1] of the first-layer neural network of the hypothesized knowledge network along the dimensions corresponding to [u_i] and [v_j]. The server C synchronizes the encryption gradients [L_i] and [L_j] to client i and client j respectively. During model training, one pass over the whole data set is one epoch; within one epoch the feature data are trained in batches (for example, 64 feature data items per batch), one iteration at a time, and [L_i], [L_j] refer to the encryption gradients corresponding to the current batch of feature data. Within one iteration, [L_i] and [L_j] are accepted only once, and the historical data are cleared before the next iteration.
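The dimension-wise splitting of the first-layer input gradient into per-client encryption gradients can be sketched as follows; the dimensions are assumed for illustration, and a plain array stands in for the ciphertext:

```python
import numpy as np

# Hypothetical: gradient of the loss w.r.t. the first-layer input [s_1], where s_1 is
# the concatenation of client i's and client j's feature data along the feature axis
dim_i, dim_j = 8, 8                      # assumed per-client feature dimensions
grad_s1 = np.arange(16.0)                # stand-in for the encrypted gradient vector

# Split along the feature dimension: the first dim_i entries belong to client i,
# the remaining dim_j entries belong to client j
L_i, L_j = np.split(grad_s1, [dim_i])
assert L_i.shape == (dim_i,) and L_j.shape == (dim_j,)

# Server C would then send [L_i] to client i and [L_j] to client j, each of which
# decrypts its slice and continues back propagation into its own local model.
```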
6. Client i and client j respectively decrypt the encryption gradients with the private key pri_key and decompress them (recovering the SVD-compressed data) to obtain the plaintext gradients L_i and L_j, compute the gradients of their hypothesized local models from the plaintext gradients, and update the weight parameters of their respective hypothesized local models: θ_i ← θ_i − λ·∂loss/∂θ_i and θ_j ← θ_j − λ·∂loss/∂θ_j.
Steps 2-6 are iterated to optimize the weight parameters θ_i and θ_j of the local models and W of the knowledge network, so as to achieve the purposes of training the local models and constructing the knowledge network.
The knowledge reasoning process based on knowledge federation includes the following steps:
Any client can initiate knowledge reasoning, and the result of the knowledge reasoning is returned only to the initiator. Taking the case where client i initiates the knowledge reasoning:
1. Client i generates a homomorphic-encryption public-private key pair (pub_key, pri_key) and sends the public-private key pair (pub_key, pri_key) to client j;
2. Client i and client j encode their respective entity data n and entity data m with their trained local models, compress the encoded data (SVD), and encrypt the compressed data with the public key pub_key to generate the feature data [u_n] and [v_m];
3. The server C receives the inference request sent by client i (the inference request carries the feature data [u_n] corresponding to the entity data n to be inferred), obtains the feature data [v_m] related to [u_n] according to the pre-constructed knowledge network, performs ciphertext inference to obtain the ciphertext expression r' of the network topological relation between the entity data n and m in the knowledge network, and synchronizes the ciphertext expression r' to the initiator (client i);
4. Client i decrypts the ciphertext expression r' with the private key pri_key to obtain the plaintext relation r between the target entity data n to be inferred and the target entity data m, and the knowledge reasoning is completed.
The above step 3 may alternatively be: client i first obtains the feature data [v_m] from client j; client i then sends an inference request to the server C for the purpose of obtaining the relation between the feature data [u_n] and [v_m], the inference request carrying the feature data [u_n] and [v_m] corresponding to the entity data n and m respectively; the server performs ciphertext inference according to the pre-constructed knowledge network to obtain the ciphertext expression r' of the network topological relation between the entity data n and m in the knowledge network (namely the relation between [u_n] and [v_m]), and synchronizes the ciphertext expression r' to the initiator (client i).
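The alternative inference flow can be sketched end to end. All function names are illustrative stubs and the cryptographic operations are replaced by trivial placeholders, so only the message flow between the parties is shown:

```python
# Hypothetical end-to-end message flow for the alternative step 3 (illustrative names)

def encode_compress_encrypt(entity, pub_key):
    """Stand-in for local-model encoding -> SVD compression -> encryption."""
    return ("ct", entity)

def server_infer(knowledge_network, feat_n, feat_m):
    """Ciphertext inference over the pre-built knowledge network."""
    return knowledge_network(feat_n, feat_m)

def decrypt(ciphertext, pri_key):
    """Stand-in for homomorphic decryption by the initiator."""
    return ciphertext

pub_key, pri_key = "pub", "pri"
feat_n = encode_compress_encrypt("n", pub_key)   # prepared by client i
feat_m = encode_compress_encrypt("m", pub_key)   # obtained by client i from client j

# A toy "knowledge network" that just reports which entity pair was scored
network = lambda a, b: ("relation", a[1], b[1])
r_cipher = server_infer(network, feat_n, feat_m) # done by server C on ciphertexts
r_plain = decrypt(r_cipher, pri_key)             # done only by the initiator, client i
assert r_plain == ("relation", "n", "m")
```

The point of the flow is that the server only ever touches ciphertext feature data, while the plaintext relation is recovered exclusively by the initiating client.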
It should be noted that if step 1 was already performed during system building, that is, the key generation step was already completed during the model management process, key distribution need not be performed again during reasoning.
Further, if a new client needs to be added after the system is built, model training needs to be performed again (equivalent to updating the system).
Example 2
As shown in fig. 4, which is a schematic structural diagram of the knowledge reasoning system based on knowledge federation, the knowledge reasoning system includes a server and clients, where the server may be one server or multiple connected servers; the number of clients is multiple, and each client is connected to the server.
The client and the server are respectively used for executing necessary steps in knowledge reasoning processes based on knowledge federation in various embodiments of the invention.
Example 3
As shown in fig. 5, a schematic structural diagram of an electronic device includes a processor 610, a memory 620, an input device 630, and an output device 640; the number of processors 610 in a computer device may be one or more; the processor 610, memory 620, input devices 630, and output devices 640 in the electronic device may be connected by a bus or other means.
The processor 610 executes the various functional applications of the electronic device and implements the knowledge-federation-based knowledge reasoning methods of various aspects of the invention by running the software programs, instructions, and modules stored in the memory 620.
Memory 620 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application program required for functionality, and the data storage area may store data created according to the use of the terminal, etc. In addition, memory 620 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 620 may further include memory remotely located relative to processor 610, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 630 may be used to receive entity data, coded data, compressed data, characteristic data, and the like. The output device 640 may include a display device such as a display screen.
Example 4
From the above description of the embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software plus necessary general-purpose hardware, or of course by hardware alone, although in many cases the former is the preferred embodiment. Based on such understanding, the technical solution of the present invention, or the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (FLASH), a hard disk, or an optical disk of a computer, and includes several instructions for causing an electronic device (which may be a mobile phone, a personal computer, a server, or a network device, etc.) to perform the knowledge-federation-based knowledge reasoning method of the various embodiments of the present invention.
It will be apparent to those skilled in the art from this disclosure that various other changes and modifications can be made which are within the scope of the invention as defined in the appended claims.

Claims (9)

1. A knowledge reasoning method based on knowledge federation is applied to a server and is characterized by comprising the following steps:
receiving a reasoning request sent by a client, wherein the reasoning request carries feature data corresponding to entity data to be inferred, the client encodes the entity data to be inferred with a pre-generated local model to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to obtain the feature data;
Feeding back a ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network, wherein the ciphertext reasoning result is used for the client to decrypt to obtain a plaintext reasoning result;
wherein the local model and the knowledge network are obtained by:
Selecting a reference client from a plurality of clients connected with the server, wherein the reference client holds labels for its own entity data relative to the entity data of the other clients; cyclically performing the following steps until a preset stopping condition is reached, and taking the hypothesized local model and the hypothesized knowledge network of the last cycle as the local model of each client and the knowledge network of the server;
Acquiring the feature data of each client, wherein the feature data is obtained by each client encoding each of its entity data with a preset hypothesized local model to express knowledge, compressing the encoded data, and encrypting the compressed data;
Pairing the plurality of feature data from all clients, wherein one feature data item in each pair of feature data comes from the reference client;
Learning the network topological relation between the entity data pairs corresponding to each pair of feature data to train a preset hypothesized knowledge network, and sending the network topological relation to the reference client;
Receiving ciphertext loss from the reference client, wherein the ciphertext loss is computed by the reference client, based on a preset loss function, from its labels and the network topological relation sent by the server;
Jointly training the hypothesized local models of the plurality of clients and the hypothesized knowledge network based on the ciphertext loss by a gradient descent method, to update the hypothesized knowledge network and the hypothesized local models.
2. The knowledge-federation-based knowledge reasoning method of claim 1, wherein the compressing the encoded data to obtain compressed data comprises: compressing the encoded data by a singular value decomposition method to obtain the compressed data.
3. The knowledge-federation-based knowledge reasoning method of claim 1, wherein the hypothesized knowledge network is a neural network model, and the gradient descent method comprises:
Using the least-squares loss method, iteratively calculating the update gradient of the k-th layer neural network of the hypothesized knowledge network according to the chain rule, wherein [s_k] is the input to the k-th layer neural network of the hypothesized knowledge network;
the update expression of the weight parameter W_k of the k-th layer neural network of the hypothesized knowledge network is W_k ← W_k − λ·[∂loss/∂W_k];
the update gradient of the hypothesized knowledge network with respect to the input of the first-layer neural network is obtained accordingly;
wherein [u_i] and [v_j] respectively represent the feature data corresponding to the entity data in client i and in client j, and the encryption gradients of the feature data [u_i] from client i and the feature data [v_j] from client j are denoted [L_i] and [L_j] respectively, [L_i] and [L_j] being obtained by splitting the gradient with respect to the input [s_1] of the first-layer neural network of the hypothesized knowledge network along the dimensions corresponding to [u_i] and [v_j];
the client i and the client j respectively decrypt and decompress the received encryption gradients [L_i] and [L_j] to obtain the plaintext gradients L_i and L_j, and respectively update the weight parameters of their hypothesized local models:
where λ is the learning rate.
4. The knowledge-federation-based knowledge reasoning method of claim 3, wherein the neural network model comprises any one of a multi-layer neural network model, a recurrent neural network model, and a convolutional neural network model.
5. The knowledge-federation-based knowledge reasoning method of claim 4, wherein the convolutional neural network model is a triplet convolutional neural network model.
6. A knowledge reasoning method based on knowledge federation is applied to a client and is characterized by comprising the following steps:
Sending a reasoning request to a server, wherein the reasoning request carries feature data corresponding to entity data to be inferred, the client encodes the entity data to be inferred with a pre-generated local model to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to obtain the feature data;
Receiving a ciphertext reasoning result sent by the server and decrypting the ciphertext reasoning result to obtain a plaintext reasoning result, wherein the server feeds back the ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network;
wherein the local model and the knowledge network are obtained by:
Selecting a reference client from a plurality of clients connected with the server, wherein the reference client holds labels for its own entity data relative to the entity data of the other clients; cyclically performing the following steps until a preset stopping condition is reached, and taking the hypothesized local model and the hypothesized knowledge network of the last cycle as the local model of each client and the knowledge network of the server;
Acquiring the feature data of each client, wherein the feature data is obtained by each client encoding each of its entity data with a preset hypothesized local model to express knowledge, compressing the encoded data, and encrypting the compressed data;
Pairing the plurality of feature data from all clients, wherein one feature data item in each pair of feature data comes from the reference client;
Learning the network topological relation between the entity data pairs corresponding to each pair of feature data to train a preset hypothesized knowledge network, and sending the network topological relation to the reference client;
Receiving ciphertext loss from the reference client, wherein the ciphertext loss is computed by the reference client, based on a preset loss function, from its labels and the network topological relation sent by the server;
Jointly training the hypothesized local models of the plurality of clients and the hypothesized knowledge network based on the ciphertext loss by a gradient descent method, to update the hypothesized knowledge network and the hypothesized local models.
7. A knowledge reasoning system based on knowledge federation is characterized by comprising a server side and a client side,
The server side is used for receiving a reasoning request sent by the client side, and feeding back a ciphertext reasoning result to the client according to the reasoning request and a pre-constructed knowledge network;
The client is used for sending a reasoning request to the server, wherein the reasoning request carries feature data corresponding to entity data to be inferred, the client encodes the entity data to be inferred with a pre-generated local model to obtain encoded data, compresses the encoded data to obtain compressed data, and encrypts the compressed data to generate the feature data; and for receiving the ciphertext reasoning result sent by the server and decrypting the ciphertext reasoning result to obtain a plaintext reasoning result;
wherein the local model and the knowledge network are obtained by:
Selecting a reference client from a plurality of clients connected with the server, wherein the reference client holds labels for its own entity data relative to the entity data of the other clients; cyclically performing the following steps until a preset stopping condition is reached, and taking the hypothesized local model and the hypothesized knowledge network of the last cycle as the local model of each client and the knowledge network of the server;
Acquiring the feature data of each client, wherein the feature data is obtained by each client encoding each of its entity data with a preset hypothesized local model to express knowledge, compressing the encoded data, and encrypting the compressed data;
Pairing the plurality of feature data from all clients, wherein one feature data item in each pair of feature data comes from the reference client;
Learning the network topological relation between the entity data pairs corresponding to each pair of feature data to train a preset hypothesized knowledge network, and sending the network topological relation to the reference client;
Receiving ciphertext loss from the reference client, wherein the ciphertext loss is computed by the reference client, based on a preset loss function, from its labels and the network topological relation sent by the server;
Jointly training the hypothesized local models of the plurality of clients and the hypothesized knowledge network based on the ciphertext loss by a gradient descent method, to update the hypothesized knowledge network and the hypothesized local models.
8. An electronic device comprising a processor and a storage medium storing a computer program, characterized in that the computer program, when executed by the processor, implements the knowledge reasoning method of any of claims 1-6.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the knowledge reasoning method as claimed in any of claims 1 to 6.
CN202010988090.5A 2020-09-18 2020-09-18 Knowledge federation-based knowledge reasoning method, system, equipment and medium Active CN112348192B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010988090.5A CN112348192B (en) 2020-09-18 2020-09-18 Knowledge federation-based knowledge reasoning method, system, equipment and medium

Publications (2)

Publication Number Publication Date
CN112348192A CN112348192A (en) 2021-02-09
CN112348192B true CN112348192B (en) 2024-07-12

Family

ID=74357330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010988090.5A Active CN112348192B (en) 2020-09-18 2020-09-18 Knowledge federation-based knowledge reasoning method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN112348192B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114124345A (en) * 2021-11-10 2022-03-01 新智我来网络科技有限公司 Data homomorphic encryption reasoning method, device, equipment, system and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106575527A (en) * 2014-04-02 2017-04-19 国际商业机器公司 Generating molecular encoding information for data storage
CN110719158A (en) * 2019-09-11 2020-01-21 南京航空航天大学 Edge calculation privacy protection system and method based on joint learning

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190394242A1 (en) * 2012-09-28 2019-12-26 Rex Wig System and method of a requirement, active compliance and resource management for cyber security application
US11100420B2 (en) * 2014-06-30 2021-08-24 Amazon Technologies, Inc. Input processing for machine learning
US11361211B2 (en) * 2018-06-20 2022-06-14 Accenture Global Solutions Limited Artificial intelligence (AI) based chatbot creation and communication system
CN110874484A (en) * 2019-10-16 2020-03-10 众安信息技术服务有限公司 Data processing method and system based on neural network and federal learning
CN110874638B (en) * 2020-01-19 2020-06-02 同盾控股有限公司 Behavior analysis-oriented meta-knowledge federation method, device, electronic equipment and system
CN111046857A (en) * 2020-03-13 2020-04-21 同盾控股有限公司 Face recognition method, device, equipment, medium and system based on knowledge federation
CN111402095A (en) * 2020-03-23 2020-07-10 温州医科大学 Method for detecting student behaviors and psychology based on homomorphic encrypted federated learning
CN111461874A (en) * 2020-04-13 2020-07-28 浙江大学 Credit risk control system and method based on federal mode
CN111553484B (en) * 2020-04-30 2023-09-08 同盾控股有限公司 Federal learning method, device and system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106575527A (en) * 2014-04-02 2017-04-19 国际商业机器公司 Generating molecular encoding information for data storage
CN110719158A (en) * 2019-09-11 2020-01-21 南京航空航天大学 Edge calculation privacy protection system and method based on joint learning



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant