WO2021029034A1 - Secure gradient descent computation method, secure deep learning method, secure gradient descent computation system, secure deep learning system, secure computation apparatus, and program - Google Patents
- Publication number
- WO2021029034A1 (PCT/JP2019/031941)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- secret
- calculates
- gradient
- value
- activation
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/08—Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
- H04L9/0816—Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
- H04L9/085—Secret sharing or secret splitting, e.g. threshold schemes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/06—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/04—Masking or blinding
- H04L2209/046—Masking or blinding of operations, operands or results of the operations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/46—Secure multiparty computation, e.g. millionaire problem
Definitions
- The present invention relates to a technique for computing gradient descent under secure computation.
- Gradient descent is a learning algorithm often used in machine learning such as deep learning and logistic regression.
- Examples of systems that apply it under secure computation include SecureML (Non-Patent Document 1) and SecureNN (Non-Patent Document 2).
- The most basic gradient descent method is relatively easy to implement, but it is known to have problems such as easily getting stuck in local solutions and converging slowly.
- Various optimization methods for gradient descent have therefore been proposed, and the method called Adam is known to converge quickly.
- In view of the above technical problems, an object of the present invention is to provide a technique capable of computing gradient descent under secure computation at high speed while maintaining accuracy.
- The secure gradient descent computation method of the first aspect of the present invention is executed by a secure gradient descent computation system including a plurality of secure computation apparatuses, and computes gradient descent while keeping at least a gradient G and a parameter W concealed.
- β1, β2, η, ε are predetermined hyperparameters,
- ○ is the elementwise product,
- t is the number of learning iterations,
- [G] is the concealed value of the gradient G,
- [W] is the concealed value of the parameter W,
- [M], [M^], [V], [V^], [G^] are the concealed values of matrices M, M^, V, V^, G^ having the same number of elements as the gradient G, and β^1,t, β^2,t, and g^ are given by the following equations.
- Adam is a function that computes a secure batch mapping which takes the concealed value [V^] of a matrix V^ of values v^ as input and outputs the concealed value [G^] of a matrix G^ of values g^.
- The parameter update unit of each secure computation apparatus computes [M]←β1[M]+(1-β1)[G]; the parameter update unit computes [V]←β2[V]+(1-β2)[G]○[G]; the parameter update unit computes [M^]←β^1,t[M]; the parameter update unit computes [V^]←β^2,t[V]; the parameter update unit computes [G^]←Adam([V^]); the parameter update unit computes [G^]←[G^]○[M^]; and the parameter update unit computes [W]←[W]-[G^].
- The secure deep learning method of the second aspect of the present invention is executed by a secure deep learning system including a plurality of secure computation apparatuses, and trains a deep neural network while keeping at least feature values X of training data, ground-truth data T of the training data, and a parameter W concealed.
- β1, β2, η, ε are predetermined hyperparameters,
- ・ is the matrix product,
- ○ is the elementwise product,
- t is the number of learning iterations,
- [G] is the concealed value of the gradient G,
- [W] is the concealed value of the parameter W,
- [X] is the concealed value of the feature values X of the training data,
- [T] is the concealed value of the ground-truth labels T of the training data,
- [M], [M^], [V], [V^], [G^], [U], [Y], [Z] are the concealed values of matrices M, M^, V, V^, G^, U, Y, Z having the same number of elements as the gradient G, and β^1,t, β^2,t, and g^ are given by the following equations.
- Adam is a function that computes a secure batch mapping which takes the concealed value [V^] of a matrix V^ of values v^ as input and outputs the concealed value [G^] of a matrix G^ of values g^, and rshift is the arithmetic right shift.
- Let m be the number of training data samples used in one learning iteration, and let H' be given by the following equation.
- n is the number of hidden layers of the deep neural network,
- Activation is the activation function of the hidden layers,
- Activation2 is the activation function of the output layer of the deep neural network, and
- Activation2' is the loss function corresponding to the activation function Activation2.
- Activation' is the derivative of the activation function Activation. The forward propagation unit of each secure computation apparatus computes [U1]←[W0]・[X]; the forward propagation unit computes [Y1]←Activation([U1]); the forward propagation unit computes [Ui+1]←[Wi]・[Yi] for each i from 1 to n-1; the forward propagation unit computes [Yi+1]←Activation([Ui+1]) for each i from 1 to n-1; the forward propagation unit computes [Un+1]←[Wn]・[Yn]; and the forward propagation unit computes [Yn+1]←Activation2([Un+1]).
- The back propagation unit of each secure computation apparatus computes [Zn+1]←Activation2'([Yn+1],[T]); the back propagation unit computes [Zn]←Activation'([Un])○([Zn+1]・[Wn]); and the back propagation unit computes [Zn-i]←Activation'([Un-i])○([Zn-i+1]・[Wn-i]) for each i from 1 to n-1.
- The gradient calculation unit of each secure computation apparatus computes [G0]←[Z1]・[X]; the gradient calculation unit computes [Gi]←[Zi+1]・[Yi] for each i from 1 to n-1; and the gradient calculation unit computes [Gn]←[Zn+1]・[Yn].
- The parameter update unit of each secure computation apparatus computes [G0]←rshift([G0],H'); the parameter update unit computes [Gi]←rshift([Gi],H') for each i from 1 to n-1; and the parameter update unit computes [Gn]←rshift([Gn],H').
- For each i from 0 to n, the parameter update unit learns the parameters [Wi] between layer i and layer i+1 using the gradient [Gi] between layer i and layer i+1, by the secure gradient descent computation method of the first aspect.
- According to the present invention, gradient descent can be computed under secure computation at high speed while maintaining accuracy.
- FIG. 1 is a diagram illustrating the functional configuration of the secure gradient descent computation system.
- FIG. 2 is a diagram illustrating the functional configuration of the secure computation apparatus.
- FIG. 3 is a diagram illustrating the processing procedure of the secure gradient descent computation method.
- FIG. 4 is a diagram illustrating the processing procedure of the secure gradient descent computation method.
- FIG. 5 is a diagram illustrating the functional configuration of the secure deep learning system.
- FIG. 6 is a diagram illustrating the functional configuration of the secure computation apparatus.
- FIG. 7 is a diagram illustrating the processing procedure of the secure deep learning method.
- FIG. 8 is a diagram illustrating the functional configuration of a computer.
- x^y_z means that y_z is a superscript of x.
- x_y_z means that y_z is a subscript of x.
- [a] represents the value a concealed by secret sharing or the like, and is called a "share".
- The secure batch mapping is a function that computes a lookup table, and is a technique whose domain and range can be defined arbitrarily. Since the secure batch mapping performs its processing in units of vectors, it has the property of being efficient when the same processing is applied to multiple inputs. The concrete processing of the secure batch mapping is shown below.
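- As an informal illustration, the following plaintext sketch (Python) mimics what a batch mapping computes: a table lookup applied to an entire vector at once. The table contents and the quantization are illustrative assumptions; an actual secure batch mapping operates on shares rather than plaintext values.

```python
import numpy as np

# Hypothetical lookup table: sorted domain points and their mapped outputs.
domain = np.array([0.0, 0.25, 0.5, 1.0])
values = np.array([1.0, 0.8, 0.5, 0.1])

def batch_map(x):
    """Map every element of x to the value of the nearest domain point below it."""
    idx = np.searchsorted(domain, x, side="right") - 1
    return values[np.clip(idx, 0, len(domain) - 1)]

print(batch_map(np.array([0.1, 0.6, 2.0])))  # -> [1.  0.5 0.1]
```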
- In the first embodiment, the gradient descent optimization method Adam is realized using the secure batch mapping, while keeping the gradients, the parameters, and the intermediate values of the computation concealed.
- β^1,t, β^2,t, and g^ are defined by the following equations.
- β^1,t and β^2,t are computed in advance for each t.
- The computation of g^ is realized with a secure batch mapping that takes v^ as input and outputs η/(√v^+ε).
- This secure batch mapping is written as Adam(v^).
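- The equations referred to above appear as images in the original publication. Reconstructed from the standard Adam bias correction and the mapping description in the text (a reconstruction, not a verbatim copy), they read:

$$
\hat{\beta}_{1,t} = \frac{1}{1-\beta_1^{\,t}}, \qquad
\hat{\beta}_{2,t} = \frac{1}{1-\beta_2^{\,t}}, \qquad
\hat{g} = \frac{\eta}{\sqrt{\hat{v}}+\epsilon}
$$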
- The constants β1, β2, η, and ε are plaintext. Since the computation of g^ includes a square root and a division, its processing cost in secure computation is large; by using the secure batch mapping, however, only a single mapping operation is required, which is efficient.
- The secure gradient descent computation system 100 includes, for example, N (≥ 2) secure computation apparatuses 1_1, ..., 1_N, as shown in FIG. 1.
- The secure computation apparatuses 1_1, ..., 1_N are each connected to a communication network 9.
- The communication network 9 is a circuit-switched or packet-switched communication network configured so that the connected apparatuses can communicate with one another; for example, the Internet, a LAN (Local Area Network), or a WAN (Wide Area Network) can be used. Note that the apparatuses do not necessarily have to be able to communicate online via the communication network 9.
- For example, the information input to the secure computation apparatuses 1_1, ..., 1_N may be stored on a portable recording medium such as a magnetic tape or a USB memory, and input to the secure computation apparatuses 1_1, ..., 1_N offline from the portable recording medium.
- The secure computation apparatus 1_i includes, for example, a parameter storage unit 10, an initialization unit 11, a gradient calculation unit 12, and a parameter update unit 13.
- The secure computation apparatus 1_i is, for example, a special apparatus configured by loading a special program into a publicly known or dedicated computer having a central processing unit (CPU), a main storage device (RAM: Random Access Memory), and the like.
- The secure computation apparatus 1_i executes each process under the control of the central processing unit, for example.
- The data input to the secure computation apparatus 1_i and the data obtained in each process are stored in the main storage device, for example, and the data stored in the main storage device are read out to the central processing unit as needed and used for other processing.
- At least a part of the processing units of the secure computation apparatus 1_i may be configured by hardware such as an integrated circuit.
- Each storage unit included in the secure computation apparatus 1_i can be configured by, for example, a main storage device such as a RAM (Random Access Memory), an auxiliary storage device composed of a hard disk, an optical disc, or a semiconductor memory element such as a flash memory, or middleware such as a relational database or a key-value store.
- The parameter storage unit 10 stores the predetermined hyperparameters β1, β2, η, and ε. These hyperparameters may be set to the values described in, for example, Reference 3. The parameter storage unit 10 also stores the hyperparameters β^1,t and β^2,t computed in advance. The parameter storage unit 10 further stores a secure batch mapping Adam whose domain and range are set in advance.
- In step S11, the initialization unit 11 of each secure computation apparatus 1_i initializes the concealed values [M] and [V] of the matrices M and V to 0.
- The matrices M and V are matrices of the same size as the gradient G.
- The initialization unit 11 outputs the concealed values [M] and [V] of the matrices M and V to the parameter update unit 13.
- In step S12, the gradient calculation unit 12 of each secure computation apparatus 1_i computes the concealed value [G] of the gradient G.
- The gradient G may be obtained by a method usually used in the processing of the target to which gradient descent is applied (for example, logistic regression or neural network training).
- The gradient calculation unit 12 outputs the concealed value [G] of the gradient G to the parameter update unit 13.
- In step S13-1, the parameter update unit 13 of each secure computation apparatus 1_i uses the hyperparameter β1 stored in the parameter storage unit 10 to compute [M]←β1[M]+(1-β1)[G], and updates the concealed value [M] of the matrix M.
- In step S13-2, the parameter update unit 13 of each secure computation apparatus 1_i uses the hyperparameter β2 stored in the parameter storage unit 10 to compute [V]←β2[V]+(1-β2)[G]○[G], and updates the concealed value [V] of the matrix V.
- In step S13-3, the parameter update unit 13 of each secure computation apparatus 1_i uses the hyperparameter β^1,t stored in the parameter storage unit 10 to compute [M^]←β^1,t[M], and generates the concealed value [M^] of the matrix M^.
- The matrix M^ is a matrix with the same number of elements as the matrix M (that is, the same number of elements as the gradient G).
- In step S13-4, the parameter update unit 13 of each secure computation apparatus 1_i uses the hyperparameter β^2,t stored in the parameter storage unit 10 to compute [V^]←β^2,t[V], and generates the concealed value [V^] of the matrix V^.
- The matrix V^ is a matrix with the same number of elements as the matrix V (that is, the same number of elements as the gradient G).
- In step S13-5, the parameter update unit 13 of each secure computation apparatus 1_i uses the secure batch mapping Adam to compute [G^]←Adam([V^]), and generates the concealed value [G^] of the matrix G^.
- The matrix G^ is a matrix with the same number of elements as the matrix V^ (that is, the same number of elements as the gradient G).
- In step S13-6, the parameter update unit 13 of each secure computation apparatus 1_i computes [G^]←[G^]○[M^], and updates the concealed value [G^] of the matrix G^.
- In step S13-7, the parameter update unit 13 of each secure computation apparatus 1_i computes [W]←[W]-[G^], and updates the concealed value [W] of the parameter W.
- Algorithm 1 shows the parameter update algorithm executed by the parameter update unit 13 of the present embodiment in steps S13-1 through S13-7.
- The V^ input to the secure batch mapping Adam is always positive.
- The secure batch mapping Adam is a monotonically decreasing function; its slope is very large where V^ is close to 0, and Adam(V^) gradually approaches 0 as V^ grows. Since secure computation uses fixed-point numbers for reasons of processing cost, it cannot handle the very small decimals that floating-point numbers can. In other words, very small V^ is never input, so the range of the output Adam(V^) need not be set very large. For example, when each hyperparameter is set as described in Reference 3 and the fractional precision of V^ is set to 20 bits, the maximum value of Adam(V^) is only about 1. Also, since the minimum value of Adam(V^) is determined by the required accuracy of Adam(V^), the size of the mapping table can be decided by fixing the precision of the input V^ and of the output Adam(V^).
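- For instance, under the setting just described, the bounds of the mapping table can be estimated as follows (a plaintext sketch; the output-accuracy value is an illustrative assumption):

```python
import math

eta, eps = 0.001, 1e-8        # hyperparameters as suggested in Reference 3
in_frac_bits = 20             # fractional precision of V^ (from the example above)

# The smallest positive V^ representable at this precision bounds the output:
v_min = 2.0 ** -in_frac_bits
adam_max = eta / (math.sqrt(v_min) + eps)
print(f"max Adam(V^) ~ {adam_max:.3f}")      # ~1.02, i.e. 'about 1'

# Once Adam(V^) falls below the required output accuracy, all larger inputs
# can share a single table entry, which bounds the table's domain:
out_accuracy = 2.0 ** -14                    # required output accuracy (assumption)
v_cutoff = (eta / out_accuracy) ** 2
print(f"inputs above ~{v_cutoff:.1f} map to (near) zero")
```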
- The parameter update unit 13 of this modification executes step S13-11 after step S13-1, executes step S13-12 after step S13-2, and executes step S13-13 after step S13-6.
- In step S13-11, the parameter update unit 13 of each secure computation apparatus 1_i arithmetically right-shifts the concealed value [M] of the matrix M by bβ bits. That is, it computes [M]←rshift([M],bβ) and updates the concealed value [M] of the matrix M.
- In step S13-12, the parameter update unit 13 of each secure computation apparatus 1_i arithmetically right-shifts the concealed value [V] of the matrix V by bβ bits. That is, it computes [V]←rshift([V],bβ) and updates the concealed value [V] of the matrix V.
- In step S13-13, the parameter update unit 13 of each secure computation apparatus 1_i arithmetically right-shifts the concealed value [G^] of the matrix G^ by bg^+bβ^_1 bits. That is, it computes [G^]←rshift([G^],bg^+bβ^_1) and updates the concealed value [G^] of the matrix G^.
- Algorithm 2 shows the parameter update algorithm executed by the parameter update unit 13 of this modification in steps S13-1 through S13-7 and S13-11 through S13-13.
- Algorithm 2: Adam secure computation algorithm using the secure batch mapping
- Input 1: Gradient [G]
- Input 2: Parameter [W]
- Input 3: [M], [V] initialized to 0
- Input 4: Hyperparameters β1, β2, β^1,t, β^2,t
- Input 5: Number of learning iterations t
- Output 1: Updated parameter [W]
- Output 2: Updated [M], [V]
- The precision settings are devised as follows.
- Precision here means the number of bits in the fractional part.
- When a variable w is set to precision bw bits, the value actually handled is w×2^b_w.
- Since the range differs for each variable, it is advisable to determine the precision according to each range.
- Since w tends to be a small value, and parameters are very important values in machine learning, it is better to increase the precision of the fractional part of w.
- Since the hyperparameters β1 and β2 are set to about 0.9 or 0.999 in Reference 3, there is little need to increase the precision of their fractional parts.
- The following measures are taken for the right shift.
- In secure computation, processing with fixed-point numbers instead of floating-point numbers is faster from the viewpoint of processing cost, but with fixed-point numbers the decimal point position changes with each multiplication, so adjustment by right shift is necessary.
- Since the right shift is a costly process in secure computation, it is better to reduce the number of right shifts as much as possible.
- Since the secure batch mapping has the property that its range and domain can be set arbitrarily, it can also adjust the number of digits in the same way a right shift does. Owing to these characteristics of secure computation and the secure batch mapping, processing as in this modification is more efficient.
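- A small plaintext sketch of why the right shift is needed: multiplying two fixed-point values doubles the number of fractional bits, so the product must be shifted back to the original scale (the bit width here is an illustrative assumption):

```python
FRAC = 10                       # fractional bits of the fixed-point format (assumption)

def to_fix(x):
    return round(x * (1 << FRAC))

def to_float(a):
    return a / (1 << FRAC)

a, b = to_fix(1.5), to_fix(0.25)
prod = a * b                    # the product now carries 2*FRAC fractional bits
prod >>= FRAC                   # arithmetic right shift restores FRAC fractional bits
print(to_float(prod))           # 0.375
```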
- In the second embodiment, deep learning is performed by the optimization method Adam realized using the secure batch mapping.
- Deep learning is performed while keeping the training data, the training labels, and the parameters concealed.
- Any activation function can be used for the hidden layers and the output layer, and the shape of the neural network model is arbitrary.
- With L as the layer number, the input layer is L = 0 and the output layer is L = n+1.
- The secure deep learning system 200 includes, for example, N (≥ 2) secure computation apparatuses 2_1, ..., 2_N, as shown in FIG. 5.
- The secure computation apparatuses 2_1, ..., 2_N are each connected to the communication network 9.
- The communication network 9 is a circuit-switched or packet-switched communication network configured so that the connected apparatuses can communicate with one another; for example, the Internet, a LAN (Local Area Network), or a WAN (Wide Area Network) can be used. Note that the apparatuses do not necessarily have to be able to communicate online via the communication network 9.
- For example, the information input to the secure computation apparatuses 2_1, ..., 2_N may be stored on a portable recording medium such as a magnetic tape or a USB memory, and input to the secure computation apparatuses 2_1, ..., 2_N offline from the portable recording medium.
- The secure computation apparatus 2_i includes a parameter storage unit 10, an initialization unit 11, a gradient calculation unit 12, and a parameter update unit 13, as in the first embodiment, and further includes a training data storage unit 20, a forward propagation calculation unit 21, and a back propagation calculation unit 22.
- The training data storage unit 20 stores the concealed value [X] of the feature values X of the training data and the concealed value [T] of the ground-truth labels T of the training data.
- The parameter initialization method is selected according to the activation function. For example, when the ReLU function is used as the activation function of the hidden layers, it is known that good learning results are easily obtained by using the initialization method described in Reference 4.
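- For reference, the initialization of Reference 4 (He initialization) draws each weight from a zero-mean Gaussian whose variance is 2 divided by the fan-in, which suits ReLU activations. A minimal plaintext sketch, with illustrative layer sizes:

```python
import numpy as np

def he_init(fan_out, fan_in, rng=np.random.default_rng(0)):
    """He initialization: weights ~ N(0, 2/fan_in), suited to ReLU layers."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

W0 = he_init(128, 784)  # e.g. the first layer of a 784-input network
```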
- Algorithm 3 shows the deep learning algorithm for Adam using the secure batch mapping executed by the secure deep learning system 200 of this embodiment.
- The processing other than the parameter initialization in step 1 of Algorithm 3 is repeated until convergence, for example for a preset number of learning iterations or until the amount of parameter change becomes sufficiently small.
- In the forward propagation computation, the input layer, the hidden layers, and the output layer are computed in this order, and in the back propagation computation, the output layer, the hidden layers, and the input layer are computed in this order; however, since (3) the gradient computation and (4) the parameter update can be processed in parallel for each layer, processing efficiency can be improved by processing them collectively.
- The activation functions of the output layer and the hidden layers may be set as follows.
- The activation function used in the output layer is selected according to the analysis to be performed:
- the identity function f(x) = x for numerical prediction (regression analysis); the sigmoid function 1/(1+exp(-x)) for binary classification such as disease diagnosis or spam detection; and the softmax function for multi-class classification such as image classification.
- For the hidden layers, the ReLU function is known to give good learning results even in deep networks and is frequently used in the field of deep learning.
- The batch size may be set as follows.
- The batch size m should be set to a power of 2, and the shift amount H' at that time is computed by equation (9).
- The batch size is the number of training data samples used in one learning iteration.
- After computing the concealed value [Yi+1] of the output of layer i+1 for each integer i from 1 to n-1, the forward propagation calculation unit 21 of this modification arithmetically right-shifts [Yi+1] by bw bits. That is, it computes [Yi+1]←rshift([Yi+1],bw).
- After computing the concealed value [Zn] of the error of layer n, the back propagation calculation unit 22 of this modification arithmetically right-shifts [Zn] by by bits. That is, it computes [Zn]←rshift([Zn],by).
- After computing the concealed value [Zn-i] of the error of layer n-i for each integer i from 1 to n-1, the back propagation calculation unit 22 arithmetically right-shifts [Zn-i] by bw bits. That is, it computes [Zn-i]←rshift([Zn-i],bw).
- The concealed value [G0] of the gradient between the input layer and the first hidden layer is arithmetically right-shifted by the shift amount bx+H'; the concealed values [G1], ..., [Gn-1] of the gradients between the hidden layers from layer 1 to layer n are arithmetically right-shifted by the shift amount bw+bx+H'; and the concealed value [Gn] of the gradient between the n-th hidden layer and the output layer is arithmetically right-shifted by the shift amount bx+by+H'.
- The concealed values [W] := ([W0], ..., [Wn]) of the parameters of each layer are updated according to the secure gradient descent computation method of the second modification of the first embodiment.
- Algorithm 4 shows the deep learning algorithm for Adam using the secure batch mapping executed by the secure deep learning system 200 of this modification.
- Deep learning can be performed by repeating the processing other than the parameter initialization in step 1 of Algorithm 4 until convergence, or for a preset number of learning iterations.
- For the precision settings, the same scheme as in the second modification of the first embodiment is adopted.
- <Point of the invention> The computations that secure computation handles poorly, such as the square root and division contained in the gradient descent optimization method Adam, are collectively regarded as a single function, so that the optimization method Adam is processed with a single secure batch mapping.
- This optimization method can be applied to any model, regardless of the form of the machine learning model, as long as the model is trained using gradient descent. For example, it can be used in various kinds of machine learning such as neural networks (deep learning), logistic regression, and linear regression.
- The program describing these processing contents can be recorded on a computer-readable recording medium.
- The computer-readable recording medium may be, for example, a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.
- The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or a CD-ROM on which the program is recorded.
- Alternatively, the program may be stored in the storage device of a server computer and distributed by transferring it from the server computer to other computers via a network.
- A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage device. When executing the processing, the computer reads the program stored in its own storage device and executes the processing according to the read program. As another execution form of the program, the computer may read the program directly from the portable recording medium and execute the processing according to it, or may execute the processing according to the received program sequentially each time a program is transferred to it from the server computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service that realizes the processing function only through execution instructions and result acquisition, without transferring the program from the server computer to the computer.
- The program in this embodiment includes information that is used for processing by a computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).
- Although the present apparatus is configured by executing a predetermined program on a computer, at least a part of the processing contents may be realized by hardware.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Complex Calculations (AREA)
- Storage Device Security (AREA)
Abstract
Description
The symbols "→" and "^" used in the text should properly be written directly above the immediately preceding character, but are written immediately after that character owing to the limitations of text notation. In mathematical formulas these symbols are written in their proper position, that is, directly above the character. For example, "a→" and "a^" are expressed in formulas as follows.
The secure batch mapping is a function that computes a lookup table, and is a technique whose domain and range can be defined arbitrarily. Since the secure batch mapping performs its processing in units of vectors, it has the property of being efficient when the same processing is applied to multiple inputs. The concrete processing of the secure batch mapping is shown below.
It takes a sequence of shares [a→] := ([a0], ..., [am-1]) and a public value t as input, and outputs [b→] := ([b0], ..., [bm-1]) in which each element of [a→] has been arithmetically right-shifted by t bits. Hereinafter, the right shift is written rshift. The arithmetic right shift pads the left side with the sign bit instead of 0, and rshift([A×2^n], n-m) = [A×2^m] is constructed from the logical right shift rlshift as in equations (1) to (3). For the details of the logical right shift rlshift, see Reference 2.
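One standard way to build the arithmetic right shift from the logical one, in the spirit of equations (1) to (3) (a plaintext sketch; the bit width and helper names are illustrative assumptions, not the patent's notation):

```python
K = 32                      # ring bit width (assumption)
MOD = 1 << K

def rlshift(a, t):
    """Logical right shift of a K-bit value: pads the left side with zeros."""
    return (a % MOD) >> t

def rshift(a, t):
    """Arithmetic right shift built from the logical one: adding 2^(K-1) makes
    every representative non-negative, so the logical shift acts arithmetically;
    the offset 2^(K-1-t) is then subtracted back out."""
    return rlshift((a + (1 << (K - 1))) % MOD, t) - (1 << (K - 1 - t))

print(rshift(-20 % MOD, 2))  # -5: sign-extending shift of a negative value
```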
In simple gradient descent, the parameter w is updated by computing w = w - ηg (where η is the learning rate) from the computed gradient g. In Adam, by contrast, the parameters are updated by applying the processing of equations (4) to (8) to the gradient. The processing up to the computation of the gradient g is the same for simple gradient descent and for Adam. Here, t is a variable indicating the learning iteration count, and g_t denotes the gradient at the t-th iteration. Also, m, v, m^, v^ are matrices of the same size as g, all initialized to 0. The notation ·^t (superscript t) denotes the t-th power.
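Equations (4) to (8) appear as images in the original publication; the update they describe matches the standard Adam update from the literature, which reads (a reconstruction; the numbering is assumed to follow the original):

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\,g_t && (4)\\
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\,g_t \circ g_t && (5)\\
\hat{m} &= m_t/(1-\beta_1^{\,t}) && (6)\\
\hat{v} &= v_t/(1-\beta_2^{\,t}) && (7)\\
w &= w - \eta\,\hat{m}/(\sqrt{\hat{v}}+\epsilon) && (8)
\end{aligned}
$$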
In the first embodiment, the optimization method Adam for gradient descent is realized using the secure batch mapping, while keeping the gradients, the parameters, and the intermediate values of the computation concealed.
Input 1: Gradient [G]
Input 2: Parameter [W]
Input 3: [M], [V] initialized to 0
Input 4: Hyperparameters β1, β2, β^1,t, β^2,t
Input 5: Number of learning iterations t
Output 1: Updated parameter [W]
Output 2: Updated [M], [V]
1: [M]←β1[M]+(1-β1)[G]
2: [V]←β2[V]+(1-β2)[G]○[G]
3: [M^]←β^1,t[M]
4: [V^]←β^2,t[V]
5: [G^]←Adam([V^])
6: [G^]←[G^]○[M^]
7: [W]←[W]-[G^]
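For intuition, the following plaintext NumPy mock of Algorithm 1 performs the same seven steps without secret sharing, with the batch-mapped function taken to be Adam(v^) = η/(√v^+ε) as described above. The bias-correction factors β^1,t = 1/(1-β1^t) and β^2,t = 1/(1-β2^t) are assumptions consistent with standard Adam, and table quantization is omitted:

```python
import numpy as np

beta1, beta2, eta, eps = 0.9, 0.999, 0.001, 1e-8  # hyperparameters per Reference 3

def adam_map(v_hat):
    """Plaintext stand-in for the secure batch mapping Adam([V^])."""
    return eta / (np.sqrt(v_hat) + eps)

def update(W, G, M, V, t):
    """One pass of Algorithm 1; [.] brackets dropped since values are plaintext."""
    M = beta1 * M + (1 - beta1) * G        # step 1
    V = beta2 * V + (1 - beta2) * G * G    # step 2
    M_hat = M / (1 - beta1 ** t)           # step 3 (beta^_{1,t} assumed = 1/(1-beta1^t))
    V_hat = V / (1 - beta2 ** t)           # step 4 (beta^_{2,t} assumed = 1/(1-beta2^t))
    G_hat = adam_map(V_hat)                # step 5: the single batch mapping
    G_hat = G_hat * M_hat                  # step 6: elementwise product
    return W - G_hat, M, V                 # step 7
```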
In Modification 1, when constructing the secure batch mapping Adam used in the first embodiment, the method of creating the table consisting of the domain and the range is devised.
In Modification 2, in addition to the first embodiment, the precision of each variable is set as in Table 1.
Input 1: Gradient [G]
Input 2: Parameter [W]
Input 3: [M], [V] initialized to 0
Input 4: Hyperparameters β1, β2, β^1,t, β^2,t
Input 5: Number of learning iterations t
Output 1: Updated parameter [W]
Output 2: Updated [M], [V]
1: [M]←β1[M]+(1-β1)[G] (precision: bw+bβ)
2: [M]←rshift([M],bβ) (precision: bw)
3: [V]←β2[V]+(1-β2)[G]○[G] (precision: 2bw+bβ)
4: [V]←rshift([V],bβ) (precision: 2bw)
5: [M^]←β^1,t[M] (precision: bw+bβ^_1)
6: [V^]←β^2,t[V] (precision: 2bw+bβ^_2)
7: [G^]←Adam([V^]) (precision: bg^)
8: [G^]←[G^]○[M^] (precision: bg^+bw+bβ^_1)
9: [G^]←rshift([G^],bg^+bβ^_1) (precision: bw)
10: [W]←[W]-[G^] (precision: bw)
In the second embodiment, deep learning is performed by the optimization method Adam realized using the secure batch mapping. In this example, deep learning is performed while keeping the training data, the training labels, and the parameters concealed. Any activation function may be used in the hidden layers and the output layer, and the shape of the neural network model is arbitrary. Here, a deep neural network with n hidden layers is trained; that is, with L as the layer number, the input layer is L = 0 and the output layer is L = n+1. According to the second embodiment, compared with the conventional technique using simple gradient descent, good learning results can be obtained even with fewer learning iterations.
Input 1: Feature values [X] of the training data
Input 2: Ground-truth labels [T] of the training data
Input 3: Parameters [Wl] between layer l and layer l+1
Output: Updated parameters [Wl]
1: Initialize all [W]
2: (1) Forward propagation computation
3: [U1]←[W0]・[X]
4: [Y1]←Activation([U1])
5: for i=1 to n-1 do
6: [Ui+1]←[Wi]・[Yi]
7: [Yi+1]←Activation([Ui+1])
8: end for
9: [Un+1]←[Wn]・[Yn]
10: [Yn+1]←Activation2([Un+1])
11: (2) Back propagation computation
12: [Zn+1]←Activation2'([Yn+1],[T])
13: [Zn]←Activation'([Un])○([Zn+1]・[Wn])
14: for i=1 to n-1 do
15: [Zn-i]←Activation'([Un-i])○([Zn-i+1]・[Wn-i])
16: end for
17: (3) Gradient computation
18: [G0]←[Z1]・[X]
19: for i=1 to n-1 do
20: [Gi]←[Zi+1]・[Yi]
21: end for
22: [Gn]←[Zn+1]・[Yn]
23: (4) Parameter update
24: [G0]←rshift([G0],H')
25: for i=1 to n-1 do
26: [Gi]←rshift([Gi],H')
27: end for
28: [Gn]←rshift([Gn],H')
29: for i=0 to n do
30: [Mi]←β1[Mi]+(1-β1)[Gi]
31: [Vi]←β2[Vi]+(1-β2)[Gi]○[Gi]
32: [M^i]←β^1,t[Mi]
33: [V^i]←β^2,t[Vi]
34: [G^i]←Adam([V^i])
35: [G^i]←[G^i]○[M^i]
36: [Wi]←[Wi]-[G^i]
37: end for
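The following plaintext NumPy sketch performs one Algorithm 3 training step (forward propagation, back propagation, gradient computation, and the Adam update), instantiated with ReLU hidden layers and a softmax output as in Algorithm 4. The column-wise data layout, the division by the batch size m (standing in for rshift by H'), and the reuse of the `update` helper from the Algorithm 1 sketch above are illustrative assumptions:

```python
import numpy as np

def relu(u):
    return np.maximum(u, 0.0)

def relu_grad(u):
    return (u > 0).astype(u.dtype)

def softmax(u):
    e = np.exp(u - u.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def train_step(W, X, T, M, V, t, m):
    """W, M, V are lists of per-layer matrices; X is (features x m), T is (classes x m)."""
    n = len(W) - 1                          # number of hidden layers
    # (1) Forward propagation
    U = [None] * (n + 2)
    Y = [None] * (n + 2)
    U[1] = W[0] @ X
    Y[1] = relu(U[1])
    for i in range(1, n):
        U[i + 1] = W[i] @ Y[i]
        Y[i + 1] = relu(U[i + 1])
    U[n + 1] = W[n] @ Y[n]
    Y[n + 1] = softmax(U[n + 1])
    # (2) Back propagation
    Z = [None] * (n + 2)
    Z[n + 1] = Y[n + 1] - T                 # softmax-with-cross-entropy error
    for i in range(n, 0, -1):
        Z[i] = relu_grad(U[i]) * (W[i].T @ Z[i + 1])
    # (3) Gradient computation, averaged over the batch (plays the role of rshift by H')
    G = [Z[1] @ X.T / m] + [Z[i + 1] @ Y[i].T / m for i in range(1, n + 1)]
    # (4) Parameter update per layer, using the Algorithm 1 `update` sketch
    for i in range(n + 1):
        W[i], M[i], V[i] = update(W[i], G[i], M[i], V[i], t)
    return W, M, V
```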
In the deep learning of the second embodiment, the precision of each value used in learning is set as in Table 2. Here, w denotes the parameters between the layers, x the training data, and t the ground-truth data (teacher data) corresponding to each training data sample. The output of the activation function of the hidden layers is processed so as to have the same precision as the ground-truth data. Also, g^ is the value obtained by the computation of the secure batch mapping Adam.
Input 1: Feature values [X] of the training data
Input 2: Ground-truth labels [T] of the training data
Input 3: Parameters [Wl] between layer l and layer l+1
Output: Updated parameters [Wl]
1: Initialize all [W] (precision: bw)
2: (1) Forward propagation computation
3: [U1]←[W0]・[X] (precision: bw+bx)
4: [Y1]←ReLU([U1]) (precision: bw+bx)
5: for i=1 to n-1 do
6: [Ui+1]←[Wi]・[Yi] (precision: 2bw+bx)
7: [Yi+1]←ReLU([Ui+1]) (precision: 2bw+bx)
8: [Yi+1]←rshift([Yi+1],bw) (precision: bw+bx)
9: end for
10: [Un+1]←[Wn]・[Yn] (precision: 2bw+bx)
11: [Yn+1]←softmax([Un+1]) (precision: by)
12: (2) Back propagation computation
13: [Zn+1]←[Yn+1]-[T] (precision: by)
14: [Zn]←ReLU'([Un])○([Zn+1]・[Wn]) (precision: bw+by)
15: [Zn]←rshift([Zn],by) (precision: bw)
16: for i=1 to n-1 do
17: [Zn-i]←ReLU'([Un-i])○([Zn-i+1]・[Wn-i]) (precision: 2bw)
18: [Zn-i]←rshift([Zn-i],bw) (precision: bw)
19: end for
20: (3) Gradient computation
21: [G0]←[Z1]・[X] (precision: bw+bx)
22: for i=1 to n-1 do
23: [Gi]←[Zi+1]・[Yi] (precision: 2bw+bx)
24: end for
25: [Gn]←[Zn+1]・[Yn] (precision: bw+bx+by)
26: (4) Parameter update
27: [G0]←rshift([G0],bx+H') (precision: bw)
28: for i=1 to n-1 do
29: [Gi]←rshift([Gi],bw+bx+H') (precision: bw)
30: end for
31: [Gn]←rshift([Gn],bx+by+H') (precision: bw)
32: for i=0 to n do
33: [Mi]←β1[Mi]+(1-β1)[Gi] (precision: bw+bβ)
34: [Mi]←rshift([Mi],bβ) (precision: bw)
35: [Vi]←β2[Vi]+(1-β2)[Gi]○[Gi] (precision: 2bw+bβ)
36: [Vi]←rshift([Vi],bβ) (precision: 2bw)
37: [M^i]←β^1,t[Mi] (precision: bw+bβ^_1)
38: [V^i]←β^2,t[Vi] (precision: 2bw+bβ^_2)
39: [G^i]←Adam([V^i]) (precision: bg^)
40: [G^i]←[G^i]○[M^i] (precision: bg^+bw+bβ^_1)
41: [G^i]←rshift([G^i],bg^+bβ^_1) (precision: bw)
42: [Wi]←[Wi]-[G^i] (precision: bw)
43: end for
In the present invention, the computations that secure computation handles poorly, such as the square root and division included in the gradient descent optimization method Adam, are collectively regarded as a single function, so that the processing of the optimization method Adam can be performed efficiently with a single secure batch mapping. This makes it possible to learn with fewer iterations than conventional techniques that perform machine learning on secure computation, keeping the overall processing time short. This optimization method does not depend on the form of the machine learning model and can be applied to any model trained by gradient descent. For example, it can be used in various kinds of machine learning such as neural networks (deep learning), logistic regression, and linear regression.
When the various processing functions of each apparatus described in the above embodiments are realized by a computer, the processing contents of the functions that each apparatus should have are described by a program. Then, by loading this program into the storage unit 1020 of the computer shown in FIG. 8 and operating the control unit 1010, the input unit 1030, the output unit 1040, and so on, the various processing functions of each of the above apparatuses are realized on the computer.
Claims (8)
- A secure gradient descent computation method, executed by a secure gradient descent computation system including a plurality of secure computation apparatuses, for computing gradient descent while keeping at least a gradient G and a parameter W concealed, wherein
β1, β2, η, ε are predetermined hyperparameters, ○ is the elementwise product, t is the number of learning iterations, [G] is the concealed value of the gradient G, [W] is the concealed value of the parameter W, [M], [M^], [V], [V^], [G^] are the concealed values of matrices M, M^, V, V^, G^ having the same number of elements as the gradient G, and β^1,t, β^2,t, g^ are given by the following equations,
Adam is a function that computes a secure batch mapping which takes the concealed value [V^] of a matrix V^ of values v^ as input and outputs the concealed value [G^] of a matrix G^ of values g^,
a parameter update unit of each secure computation apparatus computes [M]←β1[M]+(1-β1)[G],
the parameter update unit computes [V]←β2[V]+(1-β2)[G]○[G],
the parameter update unit computes [M^]←β^1,t[M],
the parameter update unit computes [V^]←β^2,t[V],
the parameter update unit computes [G^]←Adam([V^]),
the parameter update unit computes [G^]←[G^]○[M^], and
the parameter update unit computes [W]←[W]-[G^].
A secure gradient descent computation method. - The secure gradient descent computation method according to claim 1, wherein
rshift is the arithmetic right shift, bβ is the precision of β1 and β2, bβ^_1 is the precision of β^1,t, and bg^ is the precision of g^,
the parameter update unit computes [M]←rshift([M],bβ) after computing [M]←β1[M]+(1-β1)[G],
the parameter update unit computes [V]←rshift([V],bβ) after computing [V]←β2[V]+(1-β2)[G]○[G], and
the parameter update unit computes [G^]←rshift([G^],bg^+bβ^_1) after computing [G^]←[G^]○[M^].
A secure gradient descent computation method. - A secure deep learning method, executed by a secure deep learning system including a plurality of secure computation apparatuses, for training a deep neural network while keeping at least feature values X of training data, ground-truth data T of the training data, and a parameter W concealed, wherein
β1, β2, η, ε are predetermined hyperparameters, ・ is the matrix product, ○ is the elementwise product, t is the number of learning iterations, [G] is the concealed value of a gradient G, [W] is the concealed value of the parameter W, [X] is the concealed value of the feature values X of the training data, [T] is the concealed value of the ground-truth labels T of the training data, [M], [M^], [V], [V^], [G^], [U], [Y], [Z] are the concealed values of matrices M, M^, V, V^, G^, U, Y, Z having the same number of elements as the gradient G, and β^1,t, β^2,t, g^ are given by the following equations,
Adam is a function that computes a secure batch mapping which takes the concealed value [V^] of a matrix V^ of values v^ as input and outputs the concealed value [G^] of a matrix G^ of values g^, rshift is the arithmetic right shift, m is the number of training data samples used in one learning iteration, and H' is given by the following equation,
n is the number of hidden layers of the deep neural network, Activation is the activation function of the hidden layers, Activation2 is the activation function of the output layer of the deep neural network, Activation2' is the loss function corresponding to the activation function Activation2, and Activation' is the derivative of the activation function Activation,
a forward propagation unit of each secure computation apparatus computes [U1]←[W0]・[X],
the forward propagation unit computes [Y1]←Activation([U1]),
the forward propagation unit computes [Ui+1]←[Wi]・[Yi] for each i from 1 to n-1,
the forward propagation unit computes [Yi+1]←Activation([Ui+1]) for each i from 1 to n-1,
the forward propagation unit computes [Un+1]←[Wn]・[Yn],
the forward propagation unit computes [Yn+1]←Activation2([Un+1]),
a back propagation unit of each secure computation apparatus computes [Zn+1]←Activation2'([Yn+1],[T]),
the back propagation unit computes [Zn]←Activation'([Un])○([Zn+1]・[Wn]),
the back propagation unit computes [Zn-i]←Activation'([Un-i])○([Zn-i+1]・[Wn-i]) for each i from 1 to n-1,
a gradient calculation unit of each secure computation apparatus computes [G0]←[Z1]・[X],
the gradient calculation unit computes [Gi]←[Zi+1]・[Yi] for each i from 1 to n-1,
the gradient calculation unit computes [Gn]←[Zn+1]・[Yn],
a parameter update unit of each secure computation apparatus computes [G0]←rshift([G0],H'),
the parameter update unit computes [Gi]←rshift([Gi],H') for each i from 1 to n-1,
the parameter update unit computes [Gn]←rshift([Gn],H'), and
the parameter update unit learns, for each i from 0 to n, the parameters [Wi] between layer i and layer i+1 using the gradient [Gi] between layer i and layer i+1 by the secure gradient descent computation method according to claim 1.
A secure deep learning method. - The secure deep learning method according to claim 3, wherein
bw is the precision of w, by is the precision of the elements of Y, bβ is the precision of β1 and β2, bβ^_1 is the precision of β^1,t, and bg^ is the precision of g^,
the forward propagation calculation unit computes [Yi+1]←rshift([Yi+1],bw) after computing [Yi+1]←Activation([Ui+1]),
the back propagation calculation unit computes [Zn]←rshift([Zn],by) after computing [Zn]←Activation'([Un])○([Zn+1]・[Wn]),
the back propagation calculation unit computes [Zn-i]←rshift([Zn-i],bw) after computing [Zn-i]←Activation'([Un-i])○([Zn-i+1]・[Wn-i]), and
the parameter update unit learns, for each i from 0 to n, the parameters [Wi] between layer i and layer i+1 using the gradient [Gi] between layer i and layer i+1 by the secure gradient descent computation method according to claim 2.
A secure deep learning method. - A secure gradient descent computation system that includes a plurality of secure computation apparatuses and computes gradient descent while keeping at least a gradient G and a parameter W concealed, wherein
β1, β2, η, ε are predetermined hyperparameters, ○ is the elementwise product, t is the number of learning iterations, [G] is the concealed value of the gradient G, [W] is the concealed value of the parameter W, [M], [M^], [V], [V^], [G^] are the concealed values of matrices M, M^, V, V^, G^ having the same number of elements as the gradient G, and β^1,t, β^2,t, g^ are given by the following equations,
Adam is a function that computes a secure batch mapping which takes the concealed value [V^] of a matrix V^ of values v^ as input and outputs the concealed value [G^] of a matrix G^ of values g^, and
each secure computation apparatus includes
a parameter update unit that computes [M]←β1[M]+(1-β1)[G], [V]←β2[V]+(1-β2)[G]○[G], [M^]←β^1,t[M], [V^]←β^2,t[V], [G^]←Adam([V^]), [G^]←[G^]○[M^], and [W]←[W]-[G^].
A secure gradient descent computation system. - A secure deep learning system that includes a plurality of secure computation apparatuses and trains a deep neural network while keeping at least feature values X of training data, ground-truth data T of the training data, and a parameter W concealed, wherein
β1, β2, η, ε are predetermined hyperparameters, ・ is the matrix product, ○ is the elementwise product, t is the number of learning iterations, [G] is the concealed value of a gradient G, [W] is the concealed value of the parameter W, [X] is the concealed value of the feature values X of the training data, [T] is the concealed value of the ground-truth labels T of the training data, [M], [M^], [V], [V^], [G^], [U], [Y], [Z] are the concealed values of matrices M, M^, V, V^, G^, U, Y, Z having the same number of elements as the gradient G, and β^1,t, β^2,t, g^ are given by the following equations,
Adam is a function that computes a secure batch mapping which takes the concealed value [V^] of a matrix V^ of values v^ as input and outputs the concealed value [G^] of a matrix G^ of values g^, rshift is the arithmetic right shift, m is the number of training data samples used in one learning iteration, and H' is given by the following equation,
n is the number of hidden layers of the deep neural network, Activation is the activation function of the hidden layers, Activation2 is the activation function of the output layer of the deep neural network, Activation2' is the loss function corresponding to the activation function Activation2, and Activation' is the derivative of the activation function Activation, and
each secure computation apparatus includes:
a forward propagation unit that computes [U1]←[W0]・[X], [Y1]←Activation([U1]), [Ui+1]←[Wi]・[Yi] and [Yi+1]←Activation([Ui+1]) for each i from 1 to n-1, [Un+1]←[Wn]・[Yn], and [Yn+1]←Activation2([Un+1]);
a back propagation unit that computes [Zn+1]←Activation2'([Yn+1],[T]), [Zn]←Activation'([Un])○([Zn+1]・[Wn]), and [Zn-i]←Activation'([Un-i])○([Zn-i+1]・[Wn-i]) for each i from 1 to n-1;
a gradient calculation unit that computes [G0]←[Z1]・[X], [Gi]←[Zi+1]・[Yi] for each i from 1 to n-1, and [Gn]←[Zn+1]・[Yn]; and
a parameter update unit that computes [G0]←rshift([G0],H'), [Gi]←rshift([Gi],H') for each i from 1 to n-1, and [Gn]←rshift([Gn],H'), and that learns, for each i from 0 to n, the parameters [Wi] between layer i and layer i+1 using the gradient [Gi] between layer i and layer i+1 by the secure gradient descent computation system according to claim 5;
the secure deep learning system including the above. - A secure computation apparatus used in the secure gradient descent computation system according to claim 5 or the secure deep learning system according to claim 6.
- A program for causing a computer to function as the secure computation apparatus according to claim 7.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19941104.2A EP4016507A4 (en) | 2019-08-14 | 2019-08-14 | SECRET GRADIENT DESCENT CALCULATION METHOD, SECRET DEEP LEARNING METHOD, SECRET GRADIENT DESCENT CALCULATION SYSTEM, SECRET DEEP LEARNING SYSTEM, SECRET CALCULATION DEVICE AND PROGRAM |
CN201980099184.2A CN114207694B (zh) | 2019-08-14 | 2019-08-14 | 秘密梯度下降法计算方法及系统、秘密深度学习方法及系统、秘密计算装置、记录介质 |
AU2019461061A AU2019461061B2 (en) | 2019-08-14 | 2019-08-14 | Secure gradient descent computation method, secure deep learning method, secure gradient descent computation system, secure deep learning system, secure computation apparatus, and program |
JP2021539762A JP7279796B2 (ja) | 2019-08-14 | 2019-08-14 | 秘密勾配降下法計算方法、秘密深層学習方法、秘密勾配降下法計算システム、秘密深層学習システム、秘密計算装置、およびプログラム |
PCT/JP2019/031941 WO2021029034A1 (ja) | 2019-08-14 | 2019-08-14 | 秘密勾配降下法計算方法、秘密深層学習方法、秘密勾配降下法計算システム、秘密深層学習システム、秘密計算装置、およびプログラム |
US17/634,237 US20220329408A1 (en) | 2019-08-14 | 2019-08-14 | Secure gradient descent computation method, secure deep learning method, secure gradient descent computation system, secure deep learning system, secure computation apparatus, and program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2019/031941 WO2021029034A1 (ja) | 2019-08-14 | 2019-08-14 | 秘密勾配降下法計算方法、秘密深層学習方法、秘密勾配降下法計算システム、秘密深層学習システム、秘密計算装置、およびプログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021029034A1 true WO2021029034A1 (ja) | 2021-02-18 |
Family
ID=74570940
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2019/031941 WO2021029034A1 (ja) | 2019-08-14 | 2019-08-14 | 秘密勾配降下法計算方法、秘密深層学習方法、秘密勾配降下法計算システム、秘密深層学習システム、秘密計算装置、およびプログラム |
Country Status (6)
Country | Link |
---|---|
US (1) | US20220329408A1 (ja) |
EP (1) | EP4016507A4 (ja) |
JP (1) | JP7279796B2 (ja) |
CN (1) | CN114207694B (ja) |
AU (1) | AU2019461061B2 (ja) |
WO (1) | WO2021029034A1 (ja) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017183587A1 (ja) * | 2016-04-18 | 2017-10-26 | 日本電信電話株式会社 | 学習装置、学習方法および学習プログラム |
JP2018156619A (ja) * | 2017-03-16 | 2018-10-04 | 株式会社デンソー | 連続最適化問題の大域的探索装置及びプログラム |
JP2019008383A (ja) * | 2017-06-21 | 2019-01-17 | キヤノン株式会社 | 画像処理装置、撮像装置、画像処理方法、プログラム、および、記憶媒体 |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09106390A (ja) * | 1995-10-12 | 1997-04-22 | Sumitomo Metal Ind Ltd | ニューラルネットワーク |
JP4354609B2 (ja) * | 1999-07-16 | 2009-10-28 | パナソニック株式会社 | 有限体上の連立方程式求解装置及び逆元演算装置 |
CN101635021B (zh) * | 2003-09-26 | 2011-08-10 | 日本电信电话株式会社 | 标签装置、标签自动识别系统和标签隐私保护方法 |
KR20150122162A (ko) * | 2013-03-04 | 2015-10-30 | 톰슨 라이센싱 | 프라이버시 보호 카운팅을 위한 방법 및 시스템 |
JP6309432B2 (ja) * | 2014-11-12 | 2018-04-11 | 日本電信電話株式会社 | 秘密計算システム及び方法並びに管理サーバ及びプログラム |
CN109328377B (zh) * | 2016-07-06 | 2021-12-21 | 日本电信电话株式会社 | 秘密计算系统、秘密计算装置、秘密计算方法、以及程序 |
CN109416894B (zh) * | 2016-07-06 | 2021-12-31 | 日本电信电话株式会社 | 秘密计算系统、秘密计算装置、秘密计算方法及记录介质 |
KR101852116B1 (ko) * | 2016-11-15 | 2018-04-25 | 재단법인대구경북과학기술원 | 디노이징 장치 및 노이즈 제거 방법 |
EP3602422B1 (en) * | 2017-03-22 | 2022-03-16 | Visa International Service Association | Privacy-preserving machine learning |
CA3072638A1 (en) * | 2017-08-30 | 2019-03-07 | Inpher, Inc. | High-precision privacy-preserving real-valued function evaluation |
JP6730740B2 (ja) * | 2017-12-25 | 2020-07-29 | 株式会社アクセル | 処理装置、処理方法、処理プログラム、及び暗号処理システム |
JP6730741B2 (ja) * | 2017-12-26 | 2020-07-29 | 株式会社アクセル | 処理装置、処理方法、処理プログラム、及び暗号処理システム |
US11816575B2 (en) * | 2018-09-07 | 2023-11-14 | International Business Machines Corporation | Verifiable deep learning training service |
CN110084063B (zh) * | 2019-04-23 | 2022-07-15 | 中国科学技术大学 | 一种保护隐私数据的梯度下降计算方法 |
-
2019
- 2019-08-14 AU AU2019461061A patent/AU2019461061B2/en active Active
- 2019-08-14 US US17/634,237 patent/US20220329408A1/en active Pending
- 2019-08-14 CN CN201980099184.2A patent/CN114207694B/zh active Active
- 2019-08-14 WO PCT/JP2019/031941 patent/WO2021029034A1/ja unknown
- 2019-08-14 JP JP2021539762A patent/JP7279796B2/ja active Active
- 2019-08-14 EP EP19941104.2A patent/EP4016507A4/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017183587A1 (ja) * | 2016-04-18 | 2017-10-26 | 日本電信電話株式会社 | 学習装置、学習方法および学習プログラム |
JP2018156619A (ja) * | 2017-03-16 | 2018-10-04 | 株式会社デンソー | 連続最適化問題の大域的探索装置及びプログラム |
JP2019008383A (ja) * | 2017-06-21 | 2019-01-17 | キヤノン株式会社 | 画像処理装置、撮像装置、画像処理方法、プログラム、および、記憶媒体 |
Non-Patent Citations (5)
Title |
---|
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification", Proceedings of the IEEE International Conference on Computer Vision, 2015, pages 1026-1034, XP032866428, DOI: 10.1109/ICCV.2015.123
Koki Hamada, Dai Ikarashi, Koji Chida, "A Batch Mapping Algorithm for Secure Function Evaluation", IEICE Transactions A, vol. 96, no. 4, 2013, pages 157-165, XP008184067
Payman Mohassel, Yupeng Zhang, "SecureML: A System for Scalable Privacy-Preserving Machine Learning", IEEE Symposium on Security and Privacy, SP 2017, 2017, pages 19-38, XP055554322, DOI: 10.1109/SP.2017.12
Sameer Wagh, Divya Gupta, Nishanth Chandran, "SecureNN: 3-Party Secure Computation for Neural Network Training", Proceedings on Privacy Enhancing Technologies, vol. 1, 2019, page 24
See also references of EP4016507A4 |
Also Published As
Publication number | Publication date |
---|---|
EP4016507A4 (en) | 2023-05-10 |
AU2019461061A1 (en) | 2022-03-03 |
CN114207694B (zh) | 2024-03-08 |
JPWO2021029034A1 (ja) | 2021-02-18 |
US20220329408A1 (en) | 2022-10-13 |
EP4016507A1 (en) | 2022-06-22 |
JP7279796B2 (ja) | 2023-05-23 |
AU2019461061B2 (en) | 2023-03-30 |
CN114207694A (zh) | 2022-03-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11087223B2 (en) | Learning and inferring insights from encrypted data | |
Nandakumar et al. | Towards deep neural network training on encrypted data | |
US11354539B2 (en) | Encrypted data model verification | |
US11343068B2 (en) | Secure multi-party learning and inferring insights based on encrypted data | |
US20200366459A1 (en) | Searching Over Encrypted Model and Encrypted Data Using Secure Single-and Multi-Party Learning Based on Encrypted Data | |
Yu et al. | Regularized extreme learning machine for regression with missing data | |
Nguyen et al. | Quantum key distribution protocol based on modified generalization of Deutsch-Jozsa algorithm in d-level quantum system | |
CN112805768B (zh) | 秘密s型函数计算系统及其方法、秘密逻辑回归计算系统及其方法、秘密s型函数计算装置、秘密逻辑回归计算装置、程序 | |
Paszyński et al. | Deep learning driven self-adaptive hp finite element method | |
US11681939B2 (en) | Quantum data loader | |
Joshi | Support vector machines | |
Liu et al. | An experimental study on symbolic extreme learning machine | |
Lauriola et al. | Radius-margin ratio optimization for dot-product boolean kernel learning | |
WO2021029024A1 (ja) | 秘密ソフトマックス関数計算システム、秘密ソフトマックス関数計算装置、秘密ソフトマックス関数計算方法、秘密ニューラルネットワーク計算システム、秘密ニューラルネットワーク学習システム、プログラム | |
Zhang et al. | Improved circuit implementation of the HHL algorithm and its simulations on QISKIT | |
Luo et al. | Finding second-order stationary points in nonconvex-strongly-concave minimax optimization | |
WO2021029034A1 (ja) | 秘密勾配降下法計算方法、秘密深層学習方法、秘密勾配降下法計算システム、秘密深層学習システム、秘密計算装置、およびプログラム | |
WO2019225531A1 (ja) | 秘密一括近似システム、秘密計算装置、秘密一括近似方法、およびプログラム | |
Mandal et al. | Hybrid phase-based representation of quantum images | |
Zhao et al. | PPCNN: An efficient privacy‐preserving CNN training and inference framework | |
WO2022254599A1 (ja) | 秘密共役勾配法計算方法、秘密共役勾配法計算システム、秘密計算装置、およびプログラム | |
KR102684265B1 (ko) | 완전 동형 평가를 위한 컴퓨테이션 네트워크 변환 | |
US20240039691A1 (en) | Management of accurate scales in fully-homomorphic encryption schemes | |
Flynn et al. | A simultaneous perturbation weak derivative estimator for stochastic neural networks | |
Kabziński | Rank-Revealing Orthogonal Decomposition in Extreme Learning Machine Design |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19941104 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2021539762 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2019461061 Country of ref document: AU Date of ref document: 20190814 Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2019941104 Country of ref document: EP Effective date: 20220314 |