TWI690861B - System and method of distributed deep learning system - Google Patents

Info

Publication number
TWI690861B
Authority
TW
Taiwan
Prior art keywords
data
local end
end node
label
digital signature
Prior art date
Application number
TW108129897A
Other languages
Chinese (zh)
Other versions
TW202109378A (en)
Inventor
王紹睿
Original Assignee
中華電信股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中華電信股份有限公司
Priority to TW108129897A
Application granted
Publication of TWI690861B
Publication of TW202109378A

Abstract

The present invention provides a system and a method of distributed deep learning. The system includes a first node, a token vault, and a second node. The first node includes the original data, the data label corresponding to the original data, and the private key, and the first node is configured to: obtain the gradient data of the original data; and encrypt the gradient data into a ciphertext data based on the homomorphic encryption technology; use the private key to sign the data label to obtain the digital signature of the data label. The token vault receives the data label and the digital signature of the first node, and after verifying the digital signature, transmits the token corresponding to the data label to the first node. The second node receives the ciphertext data and the token from the first node, and performs deep learning operations accordingly.

Description

Distributed deep learning system and method

The present invention relates to a deep learning system and method, and in particular to a distributed deep learning system and method.

Machine learning has achieved considerable success in fields such as speech recognition, image recognition, and natural language processing. These techniques also have promising applications in autonomous driving, digital healthcare systems, advertising, and the Internet of Things.

Because the data sets and models involved in training are very large, machine learning platforms are usually distributed platforms in which dozens or even hundreds of local end nodes run in parallel to train the model.

However, conventional distributed information processing systems (distributed deep learning systems in particular) often face an unresolved dilemma when processing large amounts of personal private data: how to maintain both data privacy and data availability.

In view of this, the present invention provides a distributed deep learning system and method that can be used to solve the above technical problem.

The present invention provides a distributed deep learning system including a first local end node, a token vault, and a second local end node. The first local end node includes first original data, a first data label corresponding to the first original data, and a first private key, and is configured to: obtain first gradient data of the first original data; encrypt the first gradient data into first ciphertext data based on a homomorphic encryption technique; and perform a signature operation on the first data label using the first private key to obtain a first digital signature of the first data label. The token vault receives the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returns a first token corresponding to the first data label to the first local end node. The second local end node receives the first ciphertext data and the first token corresponding to the first data label from the first local end node and performs a deep learning operation accordingly.

The present invention also provides a distributed deep learning method, including: obtaining, by a first local end node, first gradient data of first original data, wherein the first local end node includes the first original data, a first data label corresponding to the first original data, and a first private key; encrypting, by the first local end node, the first gradient data into first ciphertext data based on a homomorphic encryption technique; performing, by the first local end node, a signature operation on the first data label using the first private key to obtain a first digital signature of the first data label; receiving, by a token vault, the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returning a first token corresponding to the first data label to the first local end node; and receiving, by a second local end node, the first ciphertext data and the first token corresponding to the first data label from the first local end node, and performing a deep learning operation accordingly.

Based on the above, the present invention proposes using homomorphic encryption and tokenization in a distributed deep learning system so that data remains usable while its security is preserved, thereby improving the security of the distributed deep learning mechanism.

To make the above features and advantages of the present invention clearer, embodiments are described in detail below with reference to the accompanying drawings.

In brief, the present invention proposes using homomorphic encryption and tokenization in a distributed deep learning system. According to the mathematical theory of homomorphic encryption, ciphertext retains the ability to undergo mathematical operations, so data remains usable while its security is preserved. For example, after the owner of the original data homomorphically encrypts it and hands the ciphertext to another party, that party can perform addition, subtraction, multiplication, and division directly on the seemingly random ciphertext without knowing the underlying plaintext. Decrypting the result of these blind operations yields the same result the data owner would have obtained by operating directly on the plaintext. In addition, the present invention uses tokenization to protect the privacy of the label of each piece of training data. Because a token, like any label value, is usable regardless of its specific content, data remains usable while security is ensured. Further details follow.
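The homomorphic property described above can be stated compactly. The notation here is an illustrative assumption, since the patent does not name a specific scheme: Enc and Dec denote encryption and decryption under one key pair, and the symbols ⊕ and ⊗ denote whatever ciphertext operations the chosen scheme defines for plaintext addition and multiplication.

```latex
\mathrm{Dec}\bigl(\mathrm{Enc}(m_1) \oplus \mathrm{Enc}(m_2)\bigr) = m_1 + m_2,
\qquad
\mathrm{Dec}\bigl(\mathrm{Enc}(m_1) \otimes \mathrm{Enc}(m_2)\bigr) = m_1 \cdot m_2
```

A scheme that satisfies only one of the two identities is called partially homomorphic, and a scheme that satisfies both for arbitrary circuits is called fully homomorphic. Which class an implementation needs depends on the operations the receiving nodes must perform on the encrypted gradients.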

Please refer to FIG. 1, which is a schematic diagram of a distributed deep learning system according to an embodiment of the present invention. As shown in FIG. 1, the distributed deep learning system 10 includes local end nodes 100, 100a, and 100b, a token vault 200, and a third-party system 300.

In the embodiments of the present invention, the local end nodes 100, 100a, and 100b share similar characteristics and concepts. The following description is therefore based on the local end node 100, and a person of ordinary skill in the art should be able to derive the corresponding implementations of the local end nodes 100a and 100b from it.

As shown in FIG. 1, the local end node 100 is, for example, one of the nodes in the distributed deep learning system 10. In an embodiment, the local end node 100 may initially contain original data OD1, a data label TD1 corresponding to the original data OD1, and a private key PVK1 of the local end node 100. The data label TD1 is, for example, the classification label that corresponds to each item of original data OD1 during the training phase of deep learning on the local end node 100.

In an embodiment, when a hospital uses patients' original data OD1 for deep learning, the corresponding data label TD1 is, for example, the private classification information associated with each patient record. For example, patient A's original medical record is classified as AIDS, the records of patients B and C are classified as a sexually transmitted disease, and the records of patients D and E are classified as prostate cancer. In this case, the nodes in the distributed deep learning system 10 (for example, the local end nodes 100, 100a, and 100b) can exchange this information with one another so that each can perform its own deep learning operation.

In the embodiment of FIG. 1, the local end node 100 may include a gradient data operation module 110, a signature operation software module 120, a deep learning software module 130, and a homomorphic encryption software module 140. In various embodiments, the gradient data operation module 110 is a software module capable of performing stochastic gradient descent (SGD) operations. The deep learning software module 130 may be a software module that provides the mathematical operations of deep learning other than stochastic gradient descent. The signature operation software module 120 may be a software module that provides cryptographic signature operations. The homomorphic encryption software module 140 may be a software module capable of performing homomorphic encryption operations.

In the embodiments of the present invention, the local end node 100 may perform the following operations so that it can exchange data securely with other nodes. Specifically, the gradient data operation module 110 may obtain gradient data GD1 of the original data OD1. In an embodiment, the gradient data operation module 110 may convert the original data OD1 into the gradient data GD1 based on a stochastic gradient descent operation, although the invention is not limited to this. The homomorphic encryption software module 140 may then encrypt the gradient data GD1 into ciphertext data ED1 based on a homomorphic encryption technique.
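As a minimal sketch of how gradient data such as GD1 might be derived from local original data, the following computes the mini-batch gradient of a mean squared error loss for a linear model. The model, the loss, the function name `minibatch_gradient`, and the sample data are all assumptions for illustration; the patent does not specify the network or the loss.

```python
import random


def minibatch_gradient(weights, samples, batch_size, rng):
    """Gradient of mean squared error for a linear model on one random mini-batch."""
    batch = rng.sample(samples, batch_size)
    grad = [0.0] * len(weights)
    for features, target in batch:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        for i, x in enumerate(features):
            # d/dw_i of (prediction - target)^2, averaged over the batch
            grad[i] += 2.0 * error * x / batch_size
    return grad


# Hypothetical original data OD1: (features, target) pairs following y = 3*x0 + 1*x1.
od1 = [([1.0, 1.0], 4.0), ([2.0, 1.0], 7.0), ([3.0, 1.0], 10.0), ([4.0, 1.0], 13.0)]

# Hypothetical gradient data GD1, evaluated at the initial weights [0, 0].
gd1 = minibatch_gradient([0.0, 0.0], od1, batch_size=2, rng=random.Random(0))
```

Only `gd1`, never `od1`, would go on to the encryption step, which is the point of exchanging gradients rather than raw records.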

In addition, the signature operation software module 120 may perform a signature operation on the data label TD1 using the private key PVK1 to obtain a digital signature DS1 of the data label TD1.

The local end node 100 may then send the data label TD1 and the digital signature DS1 to the token vault 200.

In an embodiment, the token vault 200 is, for example, a cloud-based vault that stores tokens and handles token-related functions. It may contain the stored tokens, a mapping table between data labels and tokens, a token software module 210, and a signature verification software module 220. In various embodiments, each token is, for example, a random-number token, and each token may correspond to one data label. The token software module 210 may store the correspondence between data labels and tokens in the mapping table.

The token software module 210 may check whether a token already exists for the data label TD1. If not, it generates a new random token and stores the pair of data label TD1 and token in the mapping table described above. The signature verification software module 220 is, for example, a software module capable of verifying whether the digital signature of a piece of data is correct.
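The lookup-or-create behavior of the token software module 210 can be sketched as follows. The class name `TokenVault`, the in-memory dictionary, the 128-bit hexadecimal token format, and the label string are assumptions made for illustration; the patent only states that a token is a random number mapped to a data label.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory stand-in for the data-label-to-token mapping table."""

    def __init__(self):
        self._label_to_token = {}

    def tokenize(self, label: str) -> str:
        # Reuse the existing token if this label has been seen before.
        token = self._label_to_token.get(label)
        if token is None:
            # Otherwise draw a fresh random token and record the pair.
            token = secrets.token_hex(16)
            self._label_to_token[label] = token
        return token


vault = TokenVault()
t1 = vault.tokenize("label:A")  # hypothetical data label TD1
```

Because the same label always maps to the same token, receiving nodes can still group records by class without learning what the class means.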

Accordingly, when the token vault 200 receives the data label TD1 and its digital signature DS1 from the local end node 100, it can verify the digital signature DS1. In an embodiment, the token vault 200 may query the trusted and impartial third-party system 300 for the public key PBK1 corresponding to the local end node 100. In various embodiments, the third-party system 300 is, for example, a trusted and impartial third party that stores the public keys corresponding to the private keys of the local end nodes 100, 100a, and 100b.

After obtaining the public key PBK1 of the local end node 100, the token vault 200 may use the signature verification software module 220 to perform a signature verification operation on the digital signature DS1 to determine whether it is correct. If the digital signature DS1 is verified as correct, the token software module 210 converts the data label TD1 into a token T1, and the token vault 200 returns the token T1 to the local end node 100. If the digital signature DS1 is verified as incorrect, the token vault 200 may return an error message to the local end node 100.
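The sign, look up, verify, and respond sequence described above can be sketched with a toy textbook RSA signature. Everything here is an assumption for illustration: the patent does not name a signature algorithm, the small fixed primes are insecure, and the names `public_key_registry` (standing in for the third-party system 300), `handle_request`, and the node identifier are hypothetical. A real deployment would use a vetted signature library.

```python
import hashlib
import secrets

# Toy textbook-RSA parameters (NOT secure): two known Mersenne primes.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent, part of PBK1
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, plays the role of PVK1


def digest(label: str, modulus: int) -> int:
    """Hash the label and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(label.encode()).digest(), "big") % modulus


def sign(label: str, private_exponent: int, modulus: int) -> int:
    return pow(digest(label, modulus), private_exponent, modulus)


def verify(label: str, signature: int, public_key) -> bool:
    exponent, modulus = public_key
    return pow(signature, exponent, modulus) == digest(label, modulus)


public_key_registry = {"node-100": (E, N)}  # stand-in for third-party system 300
token_table = {}                            # label-to-token mapping in the vault


def handle_request(node_id: str, label: str, signature: int):
    """Vault side: fetch the public key, verify, then return a token or an error."""
    public_key = public_key_registry[node_id]
    if not verify(label, signature, public_key):
        return ("error", "invalid signature")
    if label not in token_table:
        token_table[label] = secrets.token_hex(16)
    return ("ok", token_table[label])


# Node side: sign the hypothetical label TD1, then ask the vault for its token.
ds1 = sign("label:A", D, N)
status, t1 = handle_request("node-100", "label:A", ds1)
```

Reusing `ds1` with a different label fails verification, so a node cannot obtain a token for a label it did not sign.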

After the local end node 100 receives the token T1 from the token vault 200, it may send the ciphertext data ED1 and the token T1 to other nodes in the distributed deep learning system 10, such as the local end nodes 100a and/or 100b, so that those nodes can perform deep learning operations accordingly. In an embodiment, the local end node 100 may send the ciphertext data ED1 and the token T1 to the other nodes in the distributed deep learning system 10 either in a previously agreed order or at random. These nodes can then perform distributed deep learning operations by relying on the property that homomorphically encrypted ciphertext preserves the mathematical operability of the original data.
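To illustrate how a receiving node could combine encrypted gradients without seeing them, the sketch below uses a toy Paillier cryptosystem. This is a swapped-in technique chosen for illustration, not the scheme of the patent: Paillier supports only addition of ciphertexts (and multiplication by a plaintext constant), the fixed small primes are insecure, and the fixed-point `SCALE`, the function names, and the gradient values are assumptions. The patent also does not state which party holds the decryption key; decryption appears here only to check the result.

```python
import secrets

# Toy parameters (NOT secure): two known Mersenne primes.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
N_SQUARED = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)      # modular inverse used during decryption
SCALE = 10**6             # fixed-point scale for real-valued gradients


def encrypt(value: float) -> int:
    """Paillier encryption with generator g = N + 1 and fresh randomness r."""
    m = round(value * SCALE) % N          # negative values wrap around modulo N
    r = secrets.randbelow(N - 1) + 1
    return (1 + m * N) * pow(r, N, N_SQUARED) % N_SQUARED


def decrypt(ciphertext: int) -> float:
    u = pow(ciphertext, PHI, N_SQUARED)
    m = (u - 1) // N * MU % N
    if m > N // 2:                        # map the upper half back to negatives
        m -= N
    return m / SCALE


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N_SQUARED


gd1 = [-5.0, 1.25]    # hypothetical gradient from local end node 100
gd2 = [2.5, -0.75]    # hypothetical gradient from local end node 100a
ed1 = [encrypt(g) for g in gd1]
ed2 = [encrypt(g) for g in gd2]

# A receiving node sums the gradients while they stay encrypted.
encrypted_sum = [add_encrypted(a, b) for a, b in zip(ed1, ed2)]
decrypted_sum = [decrypt(c) for c in encrypted_sum]
```

An additive scheme is enough for gradient averaging, but operations such as ciphertext-by-ciphertext multiplication would require a different class of scheme.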

In other embodiments, the local end nodes 100a and 100b may perform operations similar to those of the local end node 100, following the teaching of the above embodiments. Taking the local end node 100a as an example, it may include second original data, a second data label corresponding to the second original data, and a second private key. The local end node 100a may accordingly be configured to: obtain second gradient data of the second original data; encrypt the second gradient data into ciphertext data ED2 based on the homomorphic encryption technique; perform the signature operation on the second data label using the second private key to obtain a second digital signature of the second data label; send the second data label and the second digital signature to the token vault 200; receive from the token vault 200 a token T2 corresponding to the second data label, where the token T2 is generated by the token vault 200 after it verifies the second digital signature; and send the ciphertext data ED2 and the token T2 corresponding to the second data label to the local end node 100.

After the local end node 100 receives the ciphertext data ED2 and the token T2 from the local end node 100a, it may use the deep learning software module 130 to perform the deep learning operation accordingly.

In addition, the local end node 100a may also send the ciphertext data ED2 and the token T2 corresponding to the second data label to the local end node 100b so that the local end node 100b can perform the deep learning operation accordingly.

As described above, the distributed deep learning system of the present invention uses homomorphic encryption and tokenization to protect private information during distributed deep learning. First, in the initialization stage, each distributed local end node can perform learning operations on its own original data using stochastic gradient descent and then homomorphically encrypt the resulting gradient data to obtain ciphertext data.

In addition, each distributed local end node can use tokenization: it sends the data labels that correspond to its own data in the deep learning process to the token vault, together with a digital signature made over each data label using its own private key. The token vault can then verify the digital signature using the local end node's public key. If verification succeeds, the token vault converts the data label into a token and returns it to the local end node. Each local end node then sends the gradient data, converted to ciphertext by homomorphic encryption, and the data label, converted to a token by tokenization, to the other local end nodes. In this way, distributed deep learning can continue while homomorphically encrypted ciphertext preserves the mathematical operability of the original gradient data and protects privacy, and tokenization protects the privacy of the label of each piece of training data.

Please refer to FIG. 2, which is a flowchart of a distributed deep learning method according to an embodiment of the present invention. The method of this embodiment may be executed by the distributed deep learning system 10 of FIG. 1, and the steps of FIG. 2 are described below with reference to the components shown in FIG. 1.

First, in step S21, the local end node 100 may obtain the gradient data GD1 of the original data OD1. In step S22, the local end node 100 may encrypt the gradient data GD1 into the ciphertext data ED1 based on the homomorphic encryption technique. In step S23, the local end node 100 may perform the signature operation on the data label TD1 using the private key PVK1 to obtain the digital signature DS1 of the data label TD1. In step S24, the token vault 200 may receive the data label TD1 and the digital signature DS1 from the local end node 100 and, after verifying the digital signature DS1, return the token T1 corresponding to the data label TD1 to the local end node 100. In step S25, the local end node 100a may receive the ciphertext data ED1 and the token T1 corresponding to the data label TD1 from the local end node 100 and perform the deep learning operation accordingly. For details of these steps, refer to the description of the preceding embodiments; they are not repeated here.

In summary, the distributed deep learning system and method of the present invention use homomorphic encryption and tokenization to protect private information, and can thereby resolve the dilemma that conventional distributed information processing systems (distributed deep learning systems in particular) often face when processing large amounts of personal private data: how to maintain both data privacy and data availability.

The system and method of the present invention rely on the property that homomorphically encrypted ciphertext preserves the mathematical operability of gradient data in deep learning while also protecting privacy, and they use tokenization to protect the privacy of the label of each piece of training data. Because a token is itself just another label value whose content does not affect usability, data remains usable while security is ensured.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the invention. Any person of ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the appended claims.

10: distributed deep learning system
100, 100a, 100b: local end nodes
110: gradient data operation module
120: signature operation software module
130: deep learning software module
140: homomorphic encryption software module
200: token vault
210: token software module
220: signature verification software module
300: third-party system
DS1: digital signature
ED1, ED2: ciphertext data
OD1: original data
PBK1: public key
PVK1: private key
S21 to S25: steps
T1, T2: tokens
TD1: data label

FIG. 1 is a schematic diagram of a distributed deep learning system according to an embodiment of the present invention.
FIG. 2 is a flowchart of a distributed deep learning method according to an embodiment of the present invention.

Claims (7)

1. A distributed deep learning system, comprising:
a first local end node, which includes first original data, a first data label corresponding to the first original data, and a first private key, the first local end node being configured to:
obtain first gradient data of the first original data;
encrypt the first gradient data into first ciphertext data based on a homomorphic encryption technique; and
perform a signature operation on the first data label using the first private key to obtain a first digital signature of the first data label;
a token vault, which receives the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returns a first token corresponding to the first data label to the first local end node; and
a second local end node, which receives the first ciphertext data and the first token corresponding to the first data label from the first local end node and performs a deep learning operation accordingly.

2. The system of claim 1, wherein the first local end node converts the first original data into the first gradient data based on a stochastic gradient descent operation.

3. The system of claim 1, further comprising a third-party system that stores a first public key corresponding to the first private key of the first local end node, wherein the token vault is configured to:
query the third-party system for the first public key corresponding to the first local end node;
perform a signature verification operation on the first digital signature based on the first public key;
in response to the first digital signature being verified as correct, convert the first data label into the first token and return the first token to the first local end node; and
in response to the first digital signature being verified as incorrect, return an error message to the first local end node.

4. The system of claim 1, wherein the second local end node includes second original data, a second data label corresponding to the second original data, and a second private key, and the second local end node is configured to:
obtain second gradient data of the second original data;
encrypt the second gradient data into second ciphertext data based on the homomorphic encryption technique;
perform the signature operation on the second data label using the second private key to obtain a second digital signature of the second data label;
send the second data label and the second digital signature to the token vault;
receive from the token vault a second token corresponding to the second data label, wherein the second token is generated by the token vault after verifying the second digital signature; and
send the second ciphertext data and the second token corresponding to the second data label to the first local end node so that the first local end node performs the deep learning operation accordingly.

5. The system of claim 4, further comprising a third local end node, which receives the first ciphertext data and the first token corresponding to the first data label from the first local end node, receives the second ciphertext data and the second token corresponding to the second data label from the second local end node, and performs the deep learning operation accordingly.

6. The system of claim 1, further comprising a third local end node, which receives the first ciphertext data and the first token corresponding to the first data label from the first local end node.

7. A distributed deep learning method, comprising:
obtaining, by a first local end node, first gradient data of first original data, wherein the first local end node includes the first original data, a first data label corresponding to the first original data, and a first private key;
encrypting, by the first local end node, the first gradient data into first ciphertext data based on a homomorphic encryption technique;
performing, by the first local end node, a signature operation on the first data label using the first private key to obtain a first digital signature of the first data label;
receiving, by a token vault, the first data label and the first digital signature from the first local end node and, after verifying the first digital signature, returning a first token corresponding to the first data label to the first local end node; and
receiving, by a second local end node, the first ciphertext data and the first token corresponding to the first data label from the first local end node, and performing a deep learning operation accordingly.
TW108129897A 2019-08-21 2019-08-21 System and method of distributed deep learning system TWI690861B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW108129897A TWI690861B (en) 2019-08-21 2019-08-21 System and method of distributed deep learning system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW108129897A TWI690861B (en) 2019-08-21 2019-08-21 System and method of distributed deep learning system

Publications (2)

Publication Number Publication Date
TWI690861B true TWI690861B (en) 2020-04-11
TW202109378A TW202109378A (en) 2021-03-01

Family

ID=71134487

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108129897A TWI690861B (en) 2019-08-21 2019-08-21 System and method of distributed deep learning system

Country Status (1)

Country Link
TW (1) TWI690861B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI789115B (en) * 2021-11-12 2023-01-01 中華電信股份有限公司 Encryption system and encryption method for cloud services

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9818136B1 (en) * 2003-02-05 2017-11-14 Steven M. Hoffberg System and method for determining contingent relevance
TW201812646A (en) * 2016-07-18 2018-04-01 美商南坦奧美克公司 Distributed machine learning system, method of distributed machine learning, and method of generating proxy data
CN108712260A (en) * 2018-05-09 2018-10-26 曲阜师范大学 The multi-party deep learning of privacy is protected to calculate Proxy Method under cloud environment
TWM573022U (en) * 2018-06-15 2019-01-11 全球智能股份有限公司 Management system for artificial intelligence knowledge

Also Published As

Publication number Publication date
TW202109378A (en) 2021-03-01

Similar Documents

Publication Publication Date Title
CN107147652B (en) A kind of safety fusion authentication method of the polymorphic identity of user based on block chain
CN102170357B (en) Combined secret key dynamic security management system
US8892881B2 (en) Split key secure access system
JP6180177B2 (en) Encrypted data inquiry method and system capable of protecting privacy
WO2019072262A3 (en) Recovering encrypted transaction information in blockchain confidential transactions
JP5001157B2 (en) Authentication method based on polynomial
WO2017164159A1 (en) 1:n biometric authentication, encryption, signature system
CN102263638B (en) Authenticating device, authentication method and signature generation device
CN111259443A (en) PSI (private set intersection) technology-based method for protecting privacy in the federated learning prediction stage
CN107453862A (en) Private key generation storage and the scheme used
CN104283688B (en) A kind of USBKey security certification systems and safety certifying method
CN104158827B (en) Ciphertext data sharing method, device, inquiry server and upload data client
CN106533697B (en) Generating random number and extracting method and its application in authentication
CN113434878B (en) Modeling and application method, device, equipment and storage medium based on federated learning
CN107360002B (en) Application method of digital certificate
CN102664898A (en) Fingerprint identification-based encrypted transmission method, fingerprint identification-based encrypted transmission device and fingerprint identification-based encrypted transmission system
CN109347626B (en) Safety identity authentication method with anti-tracking characteristic
TW201409990A (en) Communication method utilizing fingerprint information for authentication
Gençoğlu Importance of cryptography in information security
CN105553980A (en) Safety fingerprint identification system and method based on cloud computing
US11728991B2 (en) Privacy-preserving leakage-deterring public-key encryption from attribute-based encryptions
TWI556618B (en) Network Group Authentication System and Method
CN101170411A (en) A light access authentication method
WO2014030706A1 (en) Encrypted database system, client device and server, method and program for adding encrypted data
TWI690861B (en) System and method of distributed deep learning system