TW202109378A - System and method of distributed deep learning system

Info

Publication number: TW202109378A
Application number: TW108129897A
Authority: TW
Inventors: 王紹睿
Original assignee: 中華電信股份有限公司
Priority date: 2019-08-21
Filing date: 2019-08-21
Publication date: 2021-03-01
Also published as: TWI690861B

Abstract

The present invention provides a distributed deep learning system and method. The system includes a first node, a token vault, and a second node. The first node includes original data, a data label corresponding to the original data, and a private key, and is configured to: obtain gradient data of the original data; encrypt the gradient data into ciphertext data based on a homomorphic encryption technique; and use the private key to sign the data label to obtain a digital signature of the data label. The token vault receives the data label and the digital signature from the first node and, after verifying the digital signature, transmits a token corresponding to the data label to the first node. The second node receives the ciphertext data and the token from the first node and performs deep learning operations accordingly.

Description

Distributed deep learning system and method

The present invention relates to a deep learning system and method, and in particular to a distributed deep learning system and method.

Machine learning has achieved considerable success in fields such as speech recognition, image recognition, and natural language processing. These techniques also show strong application prospects in autonomous driving, digital healthcare systems, advertising, and the Internet of Things.

Because the data sets and models involved in training are very large, machine learning platforms are usually distributed platforms on which dozens or even hundreds of local end nodes run in parallel to train a model.

However, conventional distributed information processing systems (distributed deep learning systems in particular) often face an unresolved dilemma when processing large amounts of personal private data: how to maintain both data privacy and security and data availability.

In view of this, the present invention provides a distributed deep learning system and method that can be used to solve the above technical problem.

The present invention provides a distributed deep learning system that includes a first local end node, a cloud token vault, and a second local end node. The first local end node includes first original data, a first data label corresponding to the first original data, and a first private key, and is configured to: obtain first gradient data of the first original data; encrypt the first gradient data into first ciphertext data based on a homomorphic encryption technique; and use the first private key to perform a signature operation on the first data label to obtain a first digital signature of the first data label. The cloud token vault receives the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returns a first token corresponding to the first data label to the first local end node. The second local end node receives the first ciphertext data and the first token corresponding to the first data label from the first local end node, and performs a deep learning operation accordingly.

The present invention also provides a distributed deep learning method, including: obtaining, by a first local end node, first gradient data of first original data, wherein the first local end node includes the first original data, a first data label corresponding to the first original data, and a first private key; encrypting, by the first local end node, the first gradient data into first ciphertext data based on a homomorphic encryption technique; performing, by the first local end node, a signature operation on the first data label using the first private key to obtain a first digital signature of the first data label; receiving, by a cloud token vault, the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returning a first token corresponding to the first data label to the first local end node; and receiving, by a second local end node, the first ciphertext data and the first token corresponding to the first data label from the first local end node, and performing a deep learning operation accordingly.

Based on the above, the present invention proposes using homomorphic encryption and tokenization in a distributed deep learning system, so that data availability is maintained while data security is ensured, thereby improving the security of the distributed deep learning mechanism.

To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.

In brief, the present invention proposes using homomorphic encryption and tokenization in a distributed deep learning system. According to the mathematical theory of homomorphic encryption, ciphertext remains amenable to mathematical operations after encryption, so data availability can be maintained while data security is ensured. For example, after the owner of the original data homomorphically encrypts it and hands the ciphertext to others, those others can perform addition, subtraction, multiplication, and division directly on the seemingly random ciphertext without knowing the underlying plaintext. When the result of these blind operations on the ciphertext is decrypted, it matches the result the owner would have obtained by performing the same operations directly on the plaintext. In addition, the present invention uses tokenization to protect the privacy of the label of each piece of training data. Because a token remains usable regardless of the content of the label value it replaces, data availability can be maintained while security is ensured. Further details are given below.
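By way of illustration only, the property that operations on ciphertext carry over to the plaintext may be sketched as follows. The present disclosure does not name a particular homomorphic scheme; the choice of the Paillier cryptosystem, the two small fixed primes, and the function names `keygen`, `encrypt`, `decrypt`, `add`, and `scale` are all assumptions made for this sketch. Paillier supports only addition of two ciphertexts and multiplication of a ciphertext by a plaintext constant, so it covers only part of the arithmetic mentioned above, and the parameters shown are far too small to be secure.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two fixed Mersenne primes (NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid when the generator is n + 1
    return n, (lam, mu)   # public key n, private key (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n (generator n + 1)."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext integer from ciphertext c."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add(n, c1, c2):
    """Ciphertext of (m1 + m2) mod n, computed without decrypting."""
    return (c1 * c2) % (n * n)


def scale(n, c, k):
    """Ciphertext of (k * m) mod n for a plaintext constant k."""
    return pow(c, k, n * n)


if __name__ == "__main__":
    n, private_key = keygen()
    total = add(n, encrypt(n, 12), encrypt(n, 30))
    print(decrypt(n, private_key, total))                       # prints 42
    print(decrypt(n, private_key, scale(n, encrypt(n, 14), 3)))  # prints 42
```

The party calling `add` or `scale` needs only the public value `n`, which mirrors the blind operation on ciphertext described above.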

Please refer to FIG. 1, which is a schematic diagram of a distributed deep learning system according to an embodiment of the present invention. As shown in FIG. 1, the distributed deep learning system 10 includes local end nodes 100, 100a, and 100b, a cloud token vault 200, and a third-party system 300.

In the embodiments of the present invention, the local end nodes 100, 100a, and 100b are similar in nature and concept. The following description is therefore based on the local end node 100 for now, and a person having ordinary skill in the art should be able to derive the corresponding implementations of the local end nodes 100a and 100b from it.

As shown in FIG. 1, the local end node 100 is, for example, any one of the nodes in the distributed deep learning system 10. In an embodiment, the local end node 100 may initially contain original data OD1, a data label TD1 corresponding to the original data OD1, and a private key PVK1 of the local end node 100. The data label TD1 is, for example, the classification label corresponding to each item of original data OD1 in the training phase of the deep learning performed by the local end node 100.

In an embodiment, when a hospital wants to use patients' original data OD1 for deep learning, the corresponding data label TD1 is, for example, the private classification information associated with each patient record. For example, the original medical data of patient A is classified as AIDS, the data of patients B and C is classified as sexually transmitted disease, and the data of patients D and E is classified as prostate cancer. In this case, the nodes in the distributed deep learning system 10 (for example, the local end nodes 100, 100a, and 100b) can exchange this information with one another so that each can perform deep learning operations.

In the embodiment of FIG. 1, the local end node 100 may include a gradient data operation module 110, a signature operation software module 120, a deep learning software module 130, and a homomorphic encryption software module 140. In different embodiments, the gradient data operation module 110 may be a software module capable of performing stochastic gradient descent (SGD) operations. The deep learning software module 130 may be a software module capable of performing the mathematical operations of deep learning other than stochastic gradient descent. The signature operation software module 120 may be a software module capable of performing cryptographic signature operations. The homomorphic encryption software module 140 may be a software module capable of performing homomorphic encryption operations.

In the embodiments of the present invention, to allow the local end node 100 to exchange data securely with other nodes, the local end node 100 may perform the following operations. Specifically, the gradient data operation module 110 may obtain gradient data GD1 of the original data OD1. In an embodiment, the gradient data operation module 110 may convert the original data OD1 into the gradient data GD1 based on a stochastic gradient descent operation, but the present invention is not limited thereto. The homomorphic encryption software module 140 may then encrypt the gradient data GD1 into ciphertext data ED1 based on a homomorphic encryption technique.
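As a minimal sketch of how gradient data might be derived from original data and prepared for encryption, consider the following. The disclosure does not specify the model, the loss function, or how real-valued gradients are represented before encryption; the single-sample linear model with squared-error loss, the fixed-point scale `SCALE`, and the helper names `sgd_gradient`, `encode`, and `decode` are assumptions introduced here. Fixed-point encoding is shown because many homomorphic schemes operate on integers modulo some number rather than on floating-point values.

```python
SCALE = 10**6  # hypothetical fixed-point scale (six decimal places)


def sgd_gradient(weights, x, y):
    """Gradient of 0.5 * (w . x - y)**2 with respect to w for one sample."""
    error = sum(w_i * x_i for w_i, x_i in zip(weights, x)) - y
    return [error * x_i for x_i in x]


def encode(value, modulus):
    """Map a signed float to an integer residue modulo `modulus`."""
    return round(value * SCALE) % modulus


def decode(residue, modulus):
    """Inverse of `encode`; residues above modulus // 2 are negative values."""
    if residue > modulus // 2:
        residue -= modulus
    return residue / SCALE


if __name__ == "__main__":
    modulus = 2**61 - 1  # hypothetical plaintext modulus of the cipher
    gradient = sgd_gradient([0.5, -1.0], [2.0, 1.0], 1.0)
    encoded = [encode(g, modulus) for g in gradient]
    print(gradient)                                # prints [-2.0, -1.0]
    print([decode(e, modulus) for e in encoded])   # prints [-2.0, -1.0]
```

Because the encoding is linear, adding two encoded values modulo `modulus` and decoding the sum gives the sum of the original values, which is the property an additively homomorphic cipher would preserve on the corresponding ciphertexts.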

In addition, the signature operation software module 120 may use the private key PVK1 to perform a signature operation on the data label TD1 to obtain a digital signature DS1 of the data label TD1.
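The signature operation may be pictured with the following sketch. The disclosure does not state which signature algorithm is used; textbook RSA over a SHA-256 digest, the two fixed Mersenne primes, and the names `sign`, `verify`, `PRIVATE_EXPONENT`, and `PUBLIC_EXPONENT` are assumptions chosen only so that the example runs with the Python standard library. Textbook RSA without padding and with a modulus this small is not secure; a practical system would rely on a vetted cryptographic library.

```python
import hashlib

# Two fixed Mersenne primes keep the sketch self-contained (NOT secure).
P = 2**61 - 1
Q = 2**89 - 1
MODULUS = P * Q
PUBLIC_EXPONENT = 65537
# Private exponent: inverse of the public exponent modulo (P - 1) * (Q - 1).
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P - 1) * (Q - 1))


def _digest(label, modulus):
    """SHA-256 digest of the label, reduced into the range of the modulus."""
    raw = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % modulus


def sign(label, private_exponent, modulus):
    """Sign a data label with the private key (plays the role of PVK1)."""
    return pow(_digest(label, modulus), private_exponent, modulus)


def verify(label, signature, public_exponent, modulus):
    """Check a signature with the public key (plays the role of PBK1)."""
    return pow(signature, public_exponent, modulus) == _digest(label, modulus)


if __name__ == "__main__":
    ds1 = sign("label-A", PRIVATE_EXPONENT, MODULUS)
    print(verify("label-A", ds1, PUBLIC_EXPONENT, MODULUS))      # prints True
    print(verify("label-A", ds1 + 1, PUBLIC_EXPONENT, MODULUS))  # prints False
```

Only the holder of the private exponent can produce a value that passes `verify`, while any party holding the public exponent and modulus can check it.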

The local end node 100 may then send the data label TD1 and the digital signature DS1 to the cloud token vault 200.

In an embodiment, the cloud token vault 200 is, for example, a cloud-based vault that stores tokens and handles token-related functions. It may contain the stored tokens, a correspondence table of data labels and tokens, a token software module 210, and a signature verification software module 220. In different embodiments, each token is, for example, a random token, and each token may correspond to one data label. The token software module 210 may store the correspondence between data labels and tokens in the correspondence table.

The token software module 210 can check whether a token already exists for the data label TD1. If not, it generates a new random token and stores the pair of the data label TD1 and the token in the aforementioned correspondence table. The signature verification software module 220 is, for example, a software module capable of verifying whether the digital signature of a piece of data is correct.
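The lookup-or-create behavior of the token software module described above may be sketched as follows. The class name `TokenTable`, the method name `token_for`, the in-memory dictionary, and the 128-bit hexadecimal token format are assumptions made for illustration; the disclosure states only that a token is a random value and that each label and token pair is stored in a correspondence table.

```python
import secrets


class TokenTable:
    """In-memory correspondence table of data labels and random tokens."""

    def __init__(self):
        self._label_to_token = {}

    def token_for(self, label):
        """Return the token for `label`, creating a new random one if absent."""
        token = self._label_to_token.get(label)
        if token is None:
            token = secrets.token_hex(16)  # 128 random bits, hex encoded
            self._label_to_token[label] = token
        return token


if __name__ == "__main__":
    table = TokenTable()
    first = table.token_for("label-A")
    print(first == table.token_for("label-A"))  # prints True
    print(len(first))                           # prints 32
```

Because the token is drawn at random rather than derived from the label, a party that sees only the token learns nothing about the label value, while equal labels still map to equal tokens.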

Accordingly, when the cloud token vault 200 receives the data label TD1 and its digital signature DS1 from the local end node 100, it can verify the digital signature DS1. In an embodiment, the cloud token vault 200 may query a trusted and impartial third-party system 300 for the public key PBK1 corresponding to the local end node 100. In different embodiments, the third-party system 300 is, for example, a trusted and impartial third party that stores the public keys corresponding to the private keys of the local end nodes 100, 100a, and 100b.

After obtaining the public key PBK1 of the local end node 100, the cloud token vault 200 may perform a signature verification operation on the digital signature DS1 through the signature verification software module 220 to determine whether the digital signature DS1 is correct. If the digital signature DS1 is verified as correct, the token software module 210 converts the data label TD1 into a token T1, and the cloud token vault 200 returns the token T1 to the local end node 100. If, on the other hand, the digital signature DS1 is verified as incorrect, the cloud token vault 200 returns an error message to the local end node 100.
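The request handling described above (public key lookup, signature verification, then either a token or an error message) may be sketched as follows. The function name `handle_token_request`, the dictionary standing in for the third-party system, the dictionary used as the correspondence table, the shape of the reply, and the injected `verify` callable are all assumptions for this sketch; the disclosure does not define message formats or interfaces.

```python
import secrets


def handle_token_request(node_id, label, signature, public_keys, verify, table):
    """Verify a signed data label and reply with a token or an error message.

    public_keys: mapping of node id to public key (stands in for the
                 third-party system).
    verify:      callable (label, signature, public_key) returning a bool.
    table:       dict used as the correspondence table of labels and tokens.
    """
    public_key = public_keys.get(node_id)
    if public_key is None:
        return {"status": "error", "message": "unknown node"}
    if not verify(label, signature, public_key):
        return {"status": "error", "message": "invalid digital signature"}
    if label not in table:
        table[label] = secrets.token_hex(16)  # new random token for new label
    return {"status": "ok", "token": table[label]}


if __name__ == "__main__":
    # Placeholder verifier for demonstration only; it is NOT a real
    # signature check and merely compares against a fixed string.
    def placeholder_verify(label, signature, public_key):
        return signature == "valid-for-" + label

    keys = {"node-100": "PBK1"}
    table = {}
    reply = handle_token_request(
        "node-100", "label-A", "valid-for-label-A", keys, placeholder_verify, table
    )
    print(reply["status"])  # prints ok
    reply = handle_token_request(
        "node-100", "label-A", "forged", keys, placeholder_verify, table
    )
    print(reply["status"])  # prints error
```

Passing the verifier in as a parameter keeps the sketch independent of any particular signature algorithm; a deployment would supply the verification routine that matches the signing scheme used by the local end nodes.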

After the local end node 100 receives the token T1 from the cloud token vault 200, the local end node 100 may send the ciphertext data ED1 and the token T1 to other nodes in the distributed deep learning system 10, such as the local end nodes 100a and/or 100b, so that the local end nodes 100a and/or 100b can perform deep learning operations accordingly. In an embodiment, the local end node 100 may send the ciphertext data ED1 and the token T1 to the other nodes in the distributed deep learning system 10 either in a previously agreed order or at random. These nodes can then perform distributed deep learning operations by exploiting the fact that homomorphically encrypted ciphertext retains the mathematical operability of the original data.

In other embodiments, the local end nodes 100a and 100b may also perform operations similar to those of the local end node 100 in accordance with the teachings of the above embodiments. Taking the local end node 100a as an example, it may include second original data, a second data label corresponding to the second original data, and a second private key. Accordingly, the local end node 100a may be configured to: obtain second gradient data of the second original data; encrypt the second gradient data into ciphertext data ED2 based on the homomorphic encryption technique; use the second private key to perform a signature operation on the second data label to obtain a second digital signature of the second data label; send the second data label and the second digital signature to the cloud token vault 200; receive from the cloud token vault 200 a token T2 corresponding to the second data label, where the token T2 is generated by the cloud token vault 200 after verifying the second digital signature; and send the ciphertext data ED2 and the token T2 corresponding to the second data label to the local end node 100.

After the local end node 100 receives the ciphertext data ED2 and the token T2 from the local end node 100a, the local end node 100 may perform deep learning operations accordingly through the deep learning software module 130.

In addition, the local end node 100a may also send the ciphertext data ED2 and the token T2 corresponding to the second data label to the local end node 100b, so that the local end node 100b can perform deep learning operations accordingly.

As can be seen from the above, based on homomorphic encryption and tokenization, the distributed deep learning system of the present invention can protect private information during distributed deep learning. First, in an initialization phase, each distributed local end node may perform a learning operation on the original data it owns using stochastic gradient descent, and then homomorphically encrypt the resulting gradient data to obtain ciphertext data.

In addition, each distributed local end node may use tokenization: it sends the data label that corresponds, in the deep learning process, to the data it owns to the cloud token vault, together with a digital signature made over that data label with its own private key. The cloud token vault can then use the public key of the local end node to verify the digital signature. If verification succeeds, the data label is converted into a token, which is returned to the local end node. Each distributed local end node then sends the gradient data, now ciphertext after homomorphic encryption, and the data label, now a token after tokenization, to the other local end nodes. In this way, distributed deep learning operations can continue: the homomorphically encrypted ciphertext preserves the mathematical operability of the original gradient data while maintaining privacy, and tokenization protects the privacy of the label of each piece of training data.

Please refer to FIG. 2, which is a flowchart of a distributed deep learning method according to an embodiment of the present invention. The method of this embodiment may be executed by the distributed deep learning system 10 of FIG. 1, and the steps of FIG. 2 are described below with reference to the elements shown in FIG. 1.

First, in step S21, the local end node 100 may obtain the gradient data GD1 of the original data OD1. In step S22, the local end node 100 may encrypt the gradient data GD1 into the ciphertext data ED1 based on a homomorphic encryption technique. In step S23, the local end node 100 may use the private key PVK1 to perform a signature operation on the data label TD1 to obtain the digital signature DS1 of the data label TD1. In step S24, the cloud token vault 200 may receive the data label TD1 and the digital signature DS1 from the local end node 100 and, after verifying the digital signature DS1, return the token T1 corresponding to the data label TD1 to the local end node 100. In step S25, the local end node 100a may receive the ciphertext data ED1 and the token T1 corresponding to the data label TD1 from the local end node 100 and perform deep learning operations accordingly. Details of each step are given in the preceding embodiments and are not repeated here.

In summary, based on homomorphic encryption and tokenization, the distributed deep learning system and method of the present invention can protect private information, and can thereby resolve the dilemma that conventional distributed information processing systems (distributed deep learning systems in particular) often face when processing large amounts of personal private data: how to maintain both data privacy and security and data availability.

The system and method of the present invention use homomorphically encrypted ciphertext to preserve the mathematical operability of the gradient data in deep learning while maintaining privacy and security, and use tokenization to protect the privacy of the label of each piece of training data. Because a token also remains usable regardless of the content of the label value it replaces, data availability can be maintained while security is ensured.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Any person having ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.

10: distributed deep learning system

100, 100a, 100b: local end nodes

110: gradient data operation module

120: signature operation software module

130: deep learning software module

140: homomorphic encryption software module

200: cloud token vault

210: token software module

220: signature verification software module

300: third-party system

DS1: digital signature

ED1, ED2: ciphertext data

OD1: original data

PBK1: public key

PVK1: private key

S21~S25: steps

T1, T2: tokens

TD1: data label

FIG. 1 is a schematic diagram of a distributed deep learning system according to an embodiment of the present invention. FIG. 2 is a flowchart of a distributed deep learning method according to an embodiment of the present invention.

Claims

A distributed deep learning system, comprising: a first local end node comprising first original data, a first data label corresponding to the first original data, and a first private key, the first local end node being configured to: obtain first gradient data of the first original data; encrypt the first gradient data into first ciphertext data based on a homomorphic encryption technique; and use the first private key to perform a signature operation on the first data label to obtain a first digital signature of the first data label; a cloud token vault that receives the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returns a first token corresponding to the first data label to the first local end node; and a second local end node that receives the first ciphertext data and the first token corresponding to the first data label from the first local end node, and performs a deep learning operation accordingly.

The system of claim 1, wherein the first local end node converts the first original data into the first gradient data based on a stochastic gradient descent operation.

The system of claim 1, further comprising a third-party system that stores a first public key corresponding to the first private key of the first local end node, wherein the cloud token vault is configured to: query the third-party system for the first public key corresponding to the first local end node; perform a signature verification operation on the first digital signature based on the first public key; in response to the first digital signature being verified as correct, convert the first data label into the first token and return the first token to the first local end node; and in response to the first digital signature being verified as incorrect, return an error message to the first local end node.

The system of claim 1, wherein the second local end node comprises second original data, a second data label corresponding to the second original data, and a second private key, and the second local end node is configured to: obtain second gradient data of the second original data; encrypt the second gradient data into second ciphertext data based on the homomorphic encryption technique; use the second private key to perform the signature operation on the second data label to obtain a second digital signature of the second data label; send the second data label and the second digital signature to the cloud token vault; receive from the cloud token vault a second token corresponding to the second data label, wherein the second token is generated by the cloud token vault after verifying the second digital signature; and send the second ciphertext data and the second token corresponding to the second data label to the first local end node, so that the first local end node performs the deep learning operation accordingly.

The system of claim 4, further comprising a third local end node that receives the first ciphertext data and the first token corresponding to the first data label from the first local end node, receives the second ciphertext data and the second token corresponding to the second data label from the second local end node, and performs the deep learning operation accordingly.

The system of claim 1, further comprising a third local end node that receives the first ciphertext data and the first token corresponding to the first data label from the first local end node.

A distributed deep learning method, comprising: obtaining, by a first local end node, first gradient data of first original data, wherein the first local end node comprises the first original data, a first data label corresponding to the first original data, and a first private key; encrypting, by the first local end node, the first gradient data into first ciphertext data based on a homomorphic encryption technique; performing, by the first local end node, a signature operation on the first data label using the first private key to obtain a first digital signature of the first data label; receiving, by a cloud token vault, the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returning a first token corresponding to the first data label to the first local end node; and receiving, by a second local end node, the first ciphertext data and the first token corresponding to the first data label from the first local end node, and performing a deep learning operation accordingly.