CN116436699B - Encryption mode-based federal learning data security training method and system - Google Patents

Publication number
CN116436699B
Authority
CN
China
Prior art keywords
private key
encrypted
result
data
actual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310678195.4A
Other languages
Chinese (zh)
Other versions
CN116436699A (en
Inventor
李延凯
梁栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Primitive Technology Co ltd
Original Assignee
Beijing Primitive Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Primitive Technology Co ltd
Priority to CN202310678195.4A
Publication of CN116436699A
Application granted
Publication of CN116436699B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0442 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply asymmetric encryption, i.e. different keys for encryption and decryption
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Abstract

The invention relates to the field of data sharing, and in particular to a federated learning data security training method and system based on an encryption mode. The method comprises: generating a public-private key pair comprising a public key and a private key, and encrypting the private key of the pair; sending the public key to a plurality of data training nodes, wherein the model units formed from the local training data of the data training nodes together form a federated learning model; when any data training node generates an intermediate result, encrypting the intermediate result with the public key and transmitting the encrypted intermediate result to the other data training nodes; receiving the intermediate result encrypted by any node with the public key; and decrypting the intermediate result with the encrypted private key to obtain the intermediate result of that data training node. By applying public-private key encryption and decryption during construction of the federated learning model, the invention addresses the risk of data or asset theft that may result from the loss of a single private key.

Description

Encryption mode-based federal learning data security training method and system
Technical Field
The invention relates to the field of data sharing, and in particular to a federated learning data security training method and system based on an encryption mode.
Background
In the fields of data science and artificial intelligence, traditional machine learning methods require collecting and storing all users' data on a central server and then using this data to train a global model. In practical scenarios, however, this approach is often not applicable because of data privacy, bandwidth limitations and similar constraints. Federated learning is a machine learning approach that can address these problems. In federated learning, each participant uses its local data set to train a local model. The model parameters of the participants are then aggregated on a central server to update the global model. Federated learning is therefore better suited than traditional machine learning to protecting user privacy and to training on distributed data.
Traditional private key storage methods in federated learning mainly include the following: (1) the private key owner stores the private key personally; (2) the private key is entrusted to a trusted third party for storage; (3) the private key is split, stored and transmitted in segments using methods such as secret sharing; and (4) the private key is stored together with a mnemonic, so that when the private key is lost it can be retrieved through the mnemonic bound one-to-one to it; in essence, the private key in this method is still held by a trusted third party. These methods can only ensure that the private key can be retrieved through the mnemonic when it is lost or forgotten; they cannot address the risk of theft of data or assets caused by leakage of the private key.
Disclosure of Invention
Therefore, the invention provides a federated learning data security training method based on an encryption mode, which addresses the problem of data or assets being stolen due to private key leakage.
To achieve the above object, in one aspect, the present invention provides a federated learning data security training method based on an encryption mode, including:
generating a public-private key pair, wherein the public-private key pair comprises a public key and a private key, and encrypting the private key in the public-private key pair;
transmitting the public key to a plurality of data training nodes, wherein a model unit formed by local training data of the data training nodes can form a federal learning model;
when any data training node generates an intermediate result, encrypting the intermediate result by using the public key, and transmitting the encrypted intermediate result to other data training nodes;
receiving an intermediate result encrypted by any node by using the public key;
and decrypting the intermediate result by using the encrypted private key to obtain an intermediate result of any data training node.
Further, the method also comprises: after receiving the intermediate result, sending start instruction information, so that each data training node performs the next round of iterative training after receiving the start instruction information and continuously updates the training result, thereby updating the federated learning model.
Further, encrypting the private key of the public-private key pair includes:
the method comprises the steps of obtaining the number of characters of a private key, dividing the private key into at least two encrypted segments according to the distribution of keywords, obtaining the actual number of the encrypted segments, wherein each encrypted segment at least comprises one keyword, and obtaining the actual number of the keywords in each encrypted segment;
determining an encryption mode according to the number of characters, the actual number of encryption segments and the actual number of keywords in each encryption segment, wherein the encryption mode comprises a first encryption mode, a second encryption mode and a third encryption mode;
the first encryption mode is determined according to a first result of comparing the number of characters with a preset character standard number, a second result of comparing the actual number of the encrypted segments with the preset standard segment number and a third result of comparing the actual number of keywords with the standard keyword number;
the second encryption mode is determined according to a fourth result of comparing the number of characters with a preset standard number of characters, a fifth result of comparing the actual number of the encrypted segments with the preset standard number of segments and a sixth result of comparing the actual number of keywords with the standard keyword number;
and the third encryption mode is determined according to a seventh result of comparing the number of characters with the preset standard number of characters, an eighth result of comparing the actual number of the encrypted segments with the preset standard number of segments and a ninth result of comparing the actual number of the keywords with the standard keyword number.
Further, in the first encryption mode, the first result is that the number of characters is larger than the standard number of characters, the second result is that the actual number of encrypted segments is larger than the standard number of segments, and the third result is that the actual number of keywords is larger than the standard number of keywords;
in the second encryption mode, the fourth result is that the number of characters is equal to the standard number of characters, the fifth result is that the actual number of encrypted segments is equal to the standard number of segments, and the sixth result is that the actual number of keywords is equal to the standard number of keywords;
in the third encryption mode, the seventh result is that the number of characters is smaller than the standard number of characters, the eighth result is that the actual number of encrypted segments is smaller than the standard number of segments, and the ninth result is that the actual number of keywords is smaller than the standard number of keywords.
Further, obtaining the number of characters of the private key includes:
intercepting image data of the private key by utilizing an intercepting device arranged in a server;
determining the occupied area of the private key in the image data according to the distribution condition of the private key in the image data;
determining the number of characters corresponding to the occupied area of the private key according to the number of occupied characters in the unit area;
and taking the character number as the character number of the private key in the image data.
Further, obtaining the actual number of encrypted segments includes:
a plurality of keywords are preset;
intercepting character strings of each keyword respectively, and traversing the encrypted segment according to the actual length of the character strings;
determining the distribution density of the keywords in each encrypted segment, and leaving undivided any encrypted segment that comprises only one keyword;
dividing an encrypted segment comprising more than two keywords into two parts according to the actual number of keywords/2;
and dividing all lines of the private key in this way to determine the actual encrypted segments of the private key.
Further, obtaining the actual number of keywords in each of the encrypted segments includes:
acquiring the actual character length of any keyword;
comparing any encrypted segment with the actual character length, and if the character length of the encrypted segment is smaller than the actual character length, judging that the encrypted segment does not contain the keyword;
if the character length of the encrypted segment is greater than or equal to the actual character length, further judging the number of keywords contained in the encrypted segment;
and if the similarity between the characters converted from the keyword and a character string in the encrypted segment is greater than or equal to 95%, determining that the keyword is contained in the encrypted segment, determining how many positions in the character string of the encrypted segment have a similarity greater than or equal to 95%, and taking that number as the actual number of keywords in the encrypted segment.
Further, the calculation formula of the standard number of characters is: [formula not reproduced in this text], where li represents the number of actual characters corresponding to the private key generated in the history period, t1 is the start time of the history period, t2 is the end time of the history period, and n is the actual number of private keys generated in the history period;
the calculation formula of the standard segment number is: [formula not reproduced in this text], where f(x) represents an arbitrary encrypted segment, hi represents a termination position within an arbitrary encrypted line, and m represents the number of encrypted lines in the private key;
the calculation formula of the standard keyword number is: [formula not reproduced in this text], where ki denotes the actual number of keywords within each encrypted line and m denotes the number of encrypted lines in the private key;
wherein T00 represents the standard number of characters, D00 represents the standard segment number, and K00 represents the standard keyword number.
Further, transmitting the encrypted intermediate result to other data training nodes includes:
presetting a rated transmission speed;
before transmitting the intermediate result, detecting the actual transmission distance between the current data training node and each of the other data nodes;
and adjusting the order in which the intermediate result is transmitted to the other data nodes according to the transmission distances, so that the intermediate result reaches the other data nodes at the same time.
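The distance-based ordering described above can be illustrated with a minimal Python sketch. The source only states that the transmission order is adjusted according to distance; the per-node send offsets computed here, the function name `send_schedule` and the assumption that travel time equals distance divided by the rated transmission speed are illustrative assumptions, not part of the patent text.

```python
from typing import Dict, List, Tuple


def send_schedule(distances: Dict[str, float], rated_speed: float) -> List[Tuple[str, float]]:
    """Return (node, send_offset) pairs ordered by send time.

    Assumption (not from the source): travel time = distance / rated_speed,
    so the farthest node is sent to first (offset 0) and nearer nodes are
    delayed until all copies would arrive at the same moment.
    """
    if rated_speed <= 0:
        raise ValueError("rated_speed must be positive")
    if not distances:
        return []
    # Travel time to the farthest node defines the common arrival time.
    longest = max(distances.values()) / rated_speed
    schedule = [(node, longest - d / rated_speed) for node, d in distances.items()]
    # Earliest send offset first, i.e. farthest node first.
    return sorted(schedule, key=lambda item: item[1])
```

With hypothetical distances of 100, 300 and 200 units and a rated speed of 100 units per second, the farthest node is served first and the nearest last.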
In another aspect, the present invention further provides a system for implementing the encryption mode-based federated learning data security training method described above, the system including:
the generation module is used for generating a public-private key pair, wherein the public-private key pair comprises a public key and a private key, and encrypting the private key in the public-private key pair;
the sending module is used for sending the public key to a plurality of data training nodes, and a model unit formed by local training data of the data training nodes can form a federal learning model;
the encryption module is used for encrypting the intermediate result by using the public key when any data training node generates the intermediate result, and transmitting the encrypted intermediate result to other data training nodes;
the receiving module is used for receiving an intermediate result encrypted by any node by using the public key;
and the decryption module is used for decrypting the intermediate result by using the encrypted private key so as to acquire the intermediate result of any data training node.
Compared with the prior art, the invention has the beneficial effect that a new "password + private key" method is introduced into the public-private key encryption and decryption process used in constructing the federated learning model. This addresses the risk of data or asset theft that may result from the loss of a single private key, so that data and assets are more effectively protected against theft.
In particular, the private key is encrypted using different encryption modes, and the encryption mode is determined according to the actual parameters of the private key. In the embodiment of the invention, the encryption mode is selected by determining the number of characters of the private key, the actual number of encrypted segments in the private key and the actual number of keywords, so that different private keys receive different types of encryption. This makes the encryption process more efficient, increases the uniqueness of each private key's encryption, greatly reduces the probability that the encryption is cracked, protects the private key and improves data security.
In particular, a private key with more characters, more encrypted segments and more keywords is encrypted using the first encryption mode, which has a higher security level, while a private key with fewer characters, fewer encrypted segments and fewer keywords is encrypted using the third encryption mode, which has a lower security level. The encryption modes are thus allocated so that the complexity of the encryption mode is positively correlated with the complexity of the private key. Private keys of different types are encrypted differently, which ensures the security of the private key while keeping decryption convenient and efficient.
In particular, whether a keyword may exist in an encrypted segment is first determined by comparing the character length of the encrypted segment with the character length of the keyword. If the keyword may exist, the encrypted segment is determined to contain the keyword when the similarity between the characters converted from the keyword and a character string in the encrypted segment meets a given condition. After all keywords have been traversed, the keywords contained in the encrypted segment are totalled to give the actual number, so that the keywords in the encrypted segment are determined efficiently and accurately.
Drawings
Fig. 1 is a schematic flow chart of a federated learning data security training method based on an encryption mode according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of another federated learning data security training method based on an encryption mode according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of a third federated learning data security training method based on an encryption mode according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a federated learning data security system based on an encryption mode according to an embodiment of the present invention.
Detailed Description
In order that the objects and advantages of the invention will become more apparent, the invention will be further described with reference to the following examples; it should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, and do not indicate or imply that the apparatus or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Referring to fig. 1, the federal learning data security training method based on encryption mode provided by the embodiment of the invention includes:
step S100: generating a public-private key pair, wherein the public-private key pair comprises a public key and a private key, and encrypting the private key in the public-private key pair;
step S200: transmitting the public key to a plurality of data training nodes, wherein a model unit formed by local training data of the data training nodes can form a federal learning model;
step S300: when any data training node generates an intermediate result, encrypting the intermediate result by using the public key, and transmitting the encrypted intermediate result to other data training nodes;
step S400: receiving an intermediate result encrypted by any node by using the public key;
step S500: and decrypting the intermediate result by using the encrypted private key to obtain an intermediate result of any data training node.
Specifically, in the embodiment of the invention, a new "password + private key" method is introduced into the public-private key encryption and decryption process used in constructing the federated learning model. This addresses the risk of data or asset theft that may result from the loss of a single private key, so that data and assets are more effectively protected against theft.
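The flow of steps S100 to S500 can be sketched in Python as follows. This is a toy illustration only: it uses textbook RSA with tiny fixed primes and protects the private exponent with a password-derived keystream, neither of which is taken from the source or suitable for real use. The function names, the salt and the password handling are all assumptions made for the example.

```python
import hashlib

# Toy textbook-RSA parameters (assumption; far too small for real security).
P, Q, E = 61, 53, 17
N = P * Q                              # public modulus
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def _keystream(password: str, salt: bytes, length: int) -> bytes:
    # Derive `length` bytes from the password with PBKDF2-HMAC-SHA256.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000, dklen=length)


def wrap_private_key(d: int, password: str, salt: bytes) -> bytes:
    """S100: encrypt the private key, here by XOR with a password-derived stream."""
    raw = d.to_bytes(4, "big")
    stream = _keystream(password, salt, len(raw))
    return bytes(a ^ b for a, b in zip(raw, stream))


def unwrap_private_key(blob: bytes, password: str, salt: bytes) -> int:
    stream = _keystream(password, salt, len(blob))
    return int.from_bytes(bytes(a ^ b for a, b in zip(blob, stream)), "big")


def encrypt_intermediate(value: int, e: int = E, n: int = N) -> int:
    """S300: a training node encrypts an intermediate result with the public key."""
    return pow(value, e, n)


def decrypt_intermediate(cipher: int, wrapped: bytes, password: str,
                         salt: bytes, n: int = N) -> int:
    """S500: recover the private key with the password, then decrypt."""
    d = unwrap_private_key(wrapped, password, salt)
    return pow(cipher, d, n)
```

In this sketch the stored private key is useless without the password, which is one plausible reading of the "password + private key" idea; the patent's own private key encryption modes are described in the following steps.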
Specifically, as shown in fig. 2, the method further includes step S600: after receiving the intermediate result, sending start instruction information, so that each data training node performs the next round of iterative training after receiving the start instruction information and continuously updates the training result, thereby updating the federated learning model.
Specifically, after each data training node transmits its intermediate result, the server receives and decrypts the intermediate result and then sends the start instruction information, so that each data training node continues with the next round of iterative training once it has received the start instruction information. This ensures that the gradient functions and loss functions of the model units corresponding to the data training nodes continue to converge from the training results of the previous round, so that the federated learning model is updated without interruption and the precision of the federated learning model on the server is improved.
Specifically, as shown in fig. 3, in step S100, encrypting the private key in the public-private key pair includes:
step S101: the method comprises the steps of obtaining the number of characters of a private key, dividing the private key into at least two encrypted segments according to the distribution of keywords, obtaining the actual number of the encrypted segments, wherein each encrypted segment at least comprises one keyword, and obtaining the actual number of the keywords in each encrypted segment;
step S102: determining an encryption mode according to the number of characters, the actual number of encryption segments and the actual number of keywords in each encryption segment, wherein the encryption mode comprises a first encryption mode, a second encryption mode and a third encryption mode;
step S103: the first encryption mode is determined according to a first result of comparing the number of characters with a preset character standard number, a second result of comparing the actual number of the encrypted segments with the preset standard segment number and a third result of comparing the actual number of keywords with the standard keyword number;
step S104: the second encryption mode is determined according to a fourth result of comparing the number of characters with a preset standard number of characters, a fifth result of comparing the actual number of the encrypted segments with the preset standard number of segments and a sixth result of comparing the actual number of keywords with the standard keyword number;
step S105: and the third encryption mode is determined according to a seventh result of comparing the number of characters with the preset standard number of characters, an eighth result of comparing the actual number of the encrypted segments with the preset standard number of segments and a ninth result of comparing the actual number of the keywords with the standard keyword number.
Specifically, in the embodiment of the invention, the private key is encrypted using different encryption modes, and the encryption mode is determined according to the actual parameters of the private key. The encryption mode is selected by determining the number of characters of the private key, the actual number of encrypted segments in the private key and the actual number of keywords, so that different private keys receive different types of encryption. This makes the encryption process more efficient, increases the uniqueness of each private key's encryption, greatly reduces the probability that the encryption is cracked, protects the private key and improves data security.
Specifically, in the first encryption mode, the first result is that the number of characters is greater than the standard number of characters, the second result is that the actual number of encrypted segments is greater than the standard number of segments, and the third result is that the actual number of keywords is greater than the standard number of keywords;
in the second encryption mode, the fourth result is that the number of characters is equal to the standard number of characters, the fifth result is that the actual number of encrypted segments is equal to the standard number of segments, and the sixth result is that the actual number of keywords is equal to the standard number of keywords;
in the third encryption mode, the seventh result is that the number of characters is smaller than the standard number of characters, the eighth result is that the actual number of encrypted segments is smaller than the standard number of segments, and the ninth result is that the actual number of keywords is smaller than the standard number of keywords.
Specifically, the embodiment of the invention encrypts a private key with more characters, more encrypted segments and more keywords using the first encryption mode, which has a higher security level, and encrypts a private key with fewer characters, fewer encrypted segments and fewer keywords using the third encryption mode, which has a lower security level. The encryption modes are thus allocated so that the complexity of the encryption mode is positively correlated with the complexity of the private key. Private keys of different types are encrypted differently, which ensures the security of the private key while keeping decryption convenient and efficient.
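The selection rule for the three encryption modes can be written as a short Python sketch. The function name and the string labels are hypothetical, and a single keyword count is used for simplicity even though the source speaks of the keyword count in each encrypted segment. The source does not say what happens when the three comparisons disagree (for example, more characters but fewer segments), so the sketch returns `None` in that case rather than guessing.

```python
from typing import Optional


def select_encryption_mode(char_count: int, segment_count: int, keyword_count: int,
                           std_chars: int, std_segments: int,
                           std_keywords: int) -> Optional[str]:
    """Pick a mode from three comparisons against the standard values."""
    pairs = [
        (char_count, std_chars),
        (segment_count, std_segments),
        (keyword_count, std_keywords),
    ]
    if all(actual > standard for actual, standard in pairs):
        return "first"    # all three above standard: higher-security mode
    if all(actual == standard for actual, standard in pairs):
        return "second"   # all three equal to standard
    if all(actual < standard for actual, standard in pairs):
        return "third"    # all three below standard: lower-security mode
    return None           # mixed outcome: not specified in the source
```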
Specifically, acquiring the number of characters of the private key includes:
intercepting image data of the private key by utilizing an intercepting device arranged in a server;
determining the occupied area of the private key in the image data according to the distribution condition of the private key in the image data;
determining the number of characters corresponding to the occupied area of the private key according to the number of occupied characters in the unit area;
and taking the character number as the character number of the private key in the image data.
Specifically, the embodiment of the invention acquires image data of the private key through the intercepting device, determines how the private key is distributed in the image data and hence the area it occupies, and determines the number of characters of the private key from the occupied area. This makes the character count more accurate and faster to determine, and facilitates the subsequent determination of the encryption mode of the private key.
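As a rough illustration of the area-based character count, the sketch below treats the image as a binary mask, takes the bounding box of the marked pixels as the occupied area, and multiplies by a characters-per-unit-area density. The mask representation, the bounding-box definition of "occupied area" and both function names are assumptions; the source does not specify how the area is measured.

```python
from typing import List


def occupied_area(mask: List[List[int]]) -> int:
    """Area of the bounding box around all non-zero cells (assumed definition)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return 0
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return height * width


def estimate_char_count(area: float, chars_per_unit_area: float) -> int:
    """Number of characters = occupied area x characters per unit area."""
    if area < 0 or chars_per_unit_area < 0:
        raise ValueError("area and density must be non-negative")
    return round(area * chars_per_unit_area)
```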
Specifically, obtaining the actual number of encrypted segments includes:
a plurality of keywords are preset;
intercepting character strings of each keyword respectively, and traversing the encrypted segment according to the actual length of the character strings;
determining the distribution density of the keywords in each encrypted segment, and leaving undivided any encrypted segment that comprises only one keyword;
dividing an encrypted segment comprising more than two keywords into two parts according to the actual number of keywords/2;
and dividing all lines of the private key in this way to determine the actual encrypted segments of the private key.
Specifically, the embodiment of the invention determines the number of encrypted segments in the private key so that decisions based on that number are more accurate and the encryption mode can be chosen more efficiently. When the actual number of encrypted segments of the private key is determined, whether an encrypted segment is divided is decided according to the character strings of the keywords, which makes determining the actual number of encrypted segments more accurate and efficient.
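One possible reading of the segmentation steps is sketched below in Python. Several points are assumptions rather than statements from the source: keywords are located by exact substring match, keyword occurrences do not overlap, a line with two or more keyword occurrences is cut once so that the first part holds half of them (rounded down), and lines with one or no keyword are left whole. The function names are hypothetical.

```python
from typing import List


def split_line(line: str, keywords: List[str]) -> List[str]:
    """Split one private-key line into one or two encrypted segments."""
    hits = []
    for kw in keywords:
        start = line.find(kw) if kw else -1
        while start != -1:
            hits.append((start, start + len(kw)))
            start = line.find(kw, start + len(kw))
    hits.sort()
    if len(hits) < 2:
        return [line]                      # one keyword or none: do not divide
    # Cut right after the last keyword of the first half (count // 2).
    cut = hits[len(hits) // 2 - 1][1]
    return [line[:cut], line[cut:]]


def count_segments(private_key_text: str, keywords: List[str]) -> int:
    """Total number of segments after processing every line of the key."""
    return sum(len(split_line(line, keywords))
               for line in private_key_text.splitlines())
```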
Specifically, obtaining the actual number of keywords in each of the encrypted segments includes:
acquiring the actual character length of any keyword;
comparing the character length of any encrypted segment with the actual character length, and if the character length of the encrypted segment is smaller than the actual character length, judging that the encrypted segment does not contain the keyword;
if the character length of the encrypted segment is greater than or equal to the actual character length, further judging the number of keywords contained in the encrypted segment;
and if the similarity between the characters into which the keyword is converted and a character string in the encrypted segment is greater than or equal to 95%, determining that the keyword is contained in the encrypted segment, determining how many positions in the character string of the encrypted segment have a similarity greater than or equal to 95%, and taking that number as the actual number of keywords in the encrypted segment.
Specifically, the embodiment of the invention first compares the character length of the encrypted segment with the character length of the keyword to decide whether the keyword can be present in the encrypted segment at all. If it can, the encrypted segment is judged to contain the keyword once the similarity between the characters into which the keyword is converted and a character string in the encrypted segment meets the stated condition. After all keywords have been traversed, the keywords found in the encrypted segment are totalled to give the actual number. This determines the keywords in the encrypted segment effectively and improves both efficiency and accuracy.
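The length check and the 95% similarity test can be sketched as follows. The text does not define the similarity measure or the keyword conversion, so this sketch makes two assumptions: similarity is taken to be `difflib.SequenceMatcher.ratio()` over a sliding window of the keyword's length, and the conversion is treated as the identity. The function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def count_keyword(segment: str, keyword: str, threshold: float = 0.95) -> int:
    """Count positions in the segment whose similarity to the keyword meets the threshold."""
    length = len(keyword)
    if length == 0 or len(segment) < length:
        return 0  # segment shorter than the keyword: it cannot contain it
    count = 0
    i = 0
    while i + length <= len(segment):
        window = segment[i:i + length]
        if SequenceMatcher(None, keyword, window).ratio() >= threshold:
            count += 1
            i += length  # skip past the match so occurrences do not overlap
        else:
            i += 1
    return count


def count_keywords(segment: str, keywords: List[str]) -> int:
    """Total the matches of every preset keyword in one encrypted segment."""
    return sum(count_keyword(segment, keyword) for keyword in keywords)


total = count_keywords("abKEYxxKEYzzTAG", ["KEY", "TAG"])
```

With short keywords a 95% threshold under this measure is effectively an exact match; a looser measure or longer keywords would be needed for the threshold to admit near matches.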
Specifically, the calculation formula of the standard number of characters is: where li represents the actual number of characters of each private key generated in the history period, t1 is the start time of the history period, t2 is the end time of the history period, and n is the actual number of private keys generated in the history period;
the calculation formula of the standard segment number is as follows: where f (x) represents an arbitrary encrypted segment, hi represents a termination position within an arbitrary encrypted line, and m represents the number of encrypted lines in the private key;
the calculation formula of the standard keyword quantity is as follows: where ki denotes the actual number of keywords within each encrypted line and m denotes the number of encrypted lines in the private key;
wherein T00 represents a standard number of characters, D00 represents a standard number of segments, and K00 represents a standard keyword amount.
Specifically, by defining how the standard number of characters, the standard number of segments and the standard keyword amount are determined, the embodiment of the invention computes the reference parameters of the private key from the historical period. When the standard number of characters is determined, the actual numbers of characters of the private keys generated in the historical period are summed and divided by the actual number of private keys generated, so that the standard number of characters is predicted from the historical mean, which improves the accuracy of the determination.
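The formula for the standard number of characters does not appear in this text. Based only on the description above (the actual character counts of the private keys generated in the historical period are summed and divided by the number of keys generated), one plausible reading is a plain historical mean. The summation index and its bounds are assumptions, and the exact role of t1 and t2 in the original formula cannot be recovered from this text.

```latex
T_{00} = \frac{1}{n} \sum_{i=1}^{n} l_i
```

Here the sum is assumed to run over the n private keys generated between t1 and t2, with l_i the actual number of characters of the i-th key.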
Specifically, transmitting the encrypted intermediate result to other data training nodes includes:
presetting a rated transmission speed;
before transmitting the intermediate result, detecting the actual transmission distance between the current data training node and each of the other data nodes;
and adjusting the order in which the intermediate result is transmitted to the other data nodes according to the transmission distance, so that the intermediate result reaches the other data nodes at the same time.
Specifically, the embodiment of the invention constrains the transmission of the intermediate result so that it reaches the other data nodes at the same time. This synchronizes the transmission of the intermediate result, effectively avoids information delay, and improves the efficiency of information processing.
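The ordering step can be sketched as follows. This is an illustrative model, not the patented procedure: it assumes travel time equals distance divided by the rated transmission speed, and it delays each send by the difference between the longest travel time and that node's own travel time, so the farthest node is served first. The function name `schedule_sends` and the node labels are hypothetical.

```python
from typing import Dict, List, Tuple


def schedule_sends(distances: Dict[str, float], rated_speed: float) -> List[Tuple[str, float]]:
    """Return (node, send_delay) pairs ordered by send time.

    Assumption: travel time = distance / rated_speed, so delaying each send by
    (longest travel time - own travel time) makes all arrivals coincide.
    """
    travel = {node: distance / rated_speed for node, distance in distances.items()}
    longest = max(travel.values())
    delays = {node: longest - t for node, t in travel.items()}
    return sorted(delays.items(), key=lambda item: item[1])


plan = schedule_sends({"node_a": 100.0, "node_b": 300.0, "node_c": 200.0}, rated_speed=100.0)
```

In this example node_b is farthest and is sent to immediately, node_c one time unit later, and node_a two time units later, so all three receive the intermediate result three time units after the first send.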
In practical application, the federal learning data security training method based on the encryption mode in the embodiment of the invention comprises the following steps:
a third party generates a public-private key pair, uniquely binds the private key of the pair to a password (the password may be a 6-digit, purely numeric code, similar to a bank card transaction password), and distributes the public key to the participants that provide data;
the data providers encrypt the intermediate calculation results with the public key and exchange them to complete the calculation of the gradient and the loss;
the data providers upload their respective encrypted results to the third party;
the third party returns the decrypted result (decrypted using the "password + private key"), and the data providers start the next round of iterative training.
Every place where the private key was previously used on its own is modified to use the "password + private key", and the password is stored as follows:
the password and the private key are not stored in the same place in the system, so that the password is not lost together with the private key, which avoids the risk of data or asset theft that the loss of the private key alone might otherwise cause.
The password may be burned into hardware such as a dongle, which must be inserted each time a federal learning operation is performed.
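The "password + private key" binding can be sketched as follows. The text does not specify how the password and the private key are combined, so this is one possible illustration under stated assumptions: the private key bytes are masked with a pad derived from the password by PBKDF2, and an HMAC tag detects a wrong password. The function names are hypothetical, the scheme is a toy for illustration, and a production system would use a vetted key-wrapping construction instead.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # illustrative work factor (assumption)


def _derive(password: str, salt: bytes, key_length: int) -> Tuple[bytes, bytes]:
    """Derive a masking pad and a MAC key from the password."""
    stream = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=key_length + 32
    )
    return stream[:key_length], stream[key_length:]


def wrap_private_key(private_key: bytes, password: str) -> Tuple[bytes, bytes, bytes]:
    """Bind the private key to the password; returns (salt, wrapped_key, tag)."""
    salt = os.urandom(16)
    pad, mac_key = _derive(password, salt, len(private_key))
    wrapped = bytes(a ^ b for a, b in zip(private_key, pad))
    tag = hmac.new(mac_key, wrapped, hashlib.sha256).digest()
    return salt, wrapped, tag


def unwrap_private_key(salt: bytes, wrapped: bytes, tag: bytes, password: str) -> bytes:
    """Recover the private key; raises ValueError if the password is wrong."""
    pad, mac_key = _derive(password, salt, len(wrapped))
    expected = hmac.new(mac_key, wrapped, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong password or corrupted key material")
    return bytes(a ^ b for a, b in zip(wrapped, pad))


salt, wrapped, tag = wrap_private_key(b"demo-private-key-bytes", "123456")
recovered = unwrap_private_key(salt, wrapped, tag, "123456")
```

Storing the wrapped key in one place and the password in another (for example on a dongle) matches the separation described above: neither item alone is enough to decrypt an intermediate result.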
The embodiment of the invention also provides a federal learning data security training system based on an encryption mode, as shown in fig. 4, the system comprises:
the generating module 10 is configured to generate a public-private key pair, where the public-private key pair includes a public key and a private key, and encrypt the private key in the public-private key pair;
a transmitting module 20, configured to transmit the public key to a plurality of data training nodes, where model units formed by local training data of the plurality of data training nodes can form a federal learning model;
the encryption module 30 is configured to encrypt the intermediate result with the public key when any of the data training nodes generates the intermediate result, and transmit the encrypted intermediate result to other data training nodes;
a receiving module 40, configured to receive an intermediate result encrypted by any node using the public key;
the decryption module 50 is configured to decrypt the intermediate result using the encrypted private key to obtain an intermediate result of any data training node.
Specifically, the federal learning data security training system based on the encryption mode provided by the embodiment of the invention is used for executing the federal learning data security training method based on the encryption mode, so that the same technical effects can be achieved, and the details are not repeated here.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing description is only of the preferred embodiments of the invention and is not intended to limit the invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. The federal learning data security training method based on the encryption mode is characterized by comprising the following steps of:
generating a public-private key pair, wherein the public-private key pair comprises a public key and a private key, and encrypting the private key in the public-private key pair;
transmitting the public key to a plurality of data training nodes, wherein a model unit formed by local training data of the data training nodes can form a federal learning model;
when any data training node generates an intermediate result, encrypting the intermediate result by using the public key, and transmitting the encrypted intermediate result to other data training nodes;
receiving an intermediate result encrypted by any node by using the public key;
decrypting the intermediate result by using the encrypted private key to obtain an intermediate result of any data training node;
encrypting the private key of the public-private key pair includes:
obtaining the number of characters of the private key, dividing the private key into at least two encrypted segments according to the distribution of keywords, obtaining the actual number of the encrypted segments, wherein each encrypted segment comprises at least one keyword, and obtaining the actual number of keywords in each encrypted segment;
determining an encryption mode according to the number of characters, the actual number of encryption segments and the actual number of keywords in each encryption segment, wherein the encryption mode comprises a first encryption mode, a second encryption mode and a third encryption mode;
the first encryption mode is determined according to a first result of comparing the number of characters with a preset standard number of characters, a second result of comparing the actual number of the encrypted segments with the preset standard number of segments and a third result of comparing the actual number of keywords with the standard keyword number;
the second encryption mode is determined according to a fourth result of comparing the number of characters with a preset standard number of characters, a fifth result of comparing the actual number of the encrypted segments with the preset standard number of segments and a sixth result of comparing the actual number of keywords with the standard keyword number;
the third encryption mode is determined according to a seventh result of comparing the number of characters with a preset standard number of characters, an eighth result of comparing the actual number of the encrypted segments with the preset standard number of segments and a ninth result of comparing the actual number of keywords with the standard keyword number;
in the first encryption mode, the first result is that the number of characters is larger than the standard number of characters, the second result is that the actual number of encryption segments is larger than the standard number of segments, and the third result is that the actual number of keywords is larger than the standard number of keywords;
in the second encryption mode, the fourth result is that the number of characters is equal to the standard number of characters, the fifth result is that the actual number of encrypted segments is equal to the standard number of segments, and the sixth result is that the actual number of keywords is equal to the standard number of keywords;
in the third encryption mode, the seventh result is that the number of characters is smaller than the standard number of characters, the eighth result is that the actual number of encrypted segments is smaller than the standard number of segments, and the ninth result is that the actual number of keywords is smaller than the standard number of keywords.
2. The federal learning data security training method based on encryption scheme according to claim 1, further comprising: after receiving the intermediate result, sending start instruction information, so that each data training node carries out the next round of iterative training after receiving the start instruction information and continuously updates the training result, thereby updating the federal learning model.
3. The federal learning data security training method based on encryption scheme according to claim 2, wherein obtaining the number of characters of the private key comprises:
intercepting image data of the private key by utilizing an intercepting device arranged in a server;
determining the occupied area of the private key in the image data according to the distribution condition of the private key in the image data;
determining the number of characters corresponding to the occupied area of the private key according to the number of occupied characters in the unit area;
and taking the character number as the character number of the private key in the image data.
4. The federal learning data security training method based on encryption scheme according to claim 3, wherein obtaining the actual number of encrypted segments comprises:
presetting a plurality of keywords;
extracting the character string of each keyword, and traversing the encrypted segment according to the actual length of the character string;
determining the distribution density of the keywords in the encrypted segment, and leaving undivided any encrypted segment that contains only one keyword;
dividing an encrypted segment that contains more than two keywords into two parts according to half the actual number of keywords (actual number of keywords / 2);
and dividing all lines of the private key in this manner to determine the actual encrypted segments of the private key.
5. The federal learning data security training method based on encryption scheme according to claim 4, wherein obtaining the actual number of keywords in each of the encrypted segments comprises:
acquiring the actual character length of any keyword;
comparing the character length of any encrypted segment with the actual character length, and if the character length of the encrypted segment is smaller than the actual character length, judging that the encrypted segment does not contain the keyword;
if the character length of the encrypted segment is greater than or equal to the actual character length, further judging the number of keywords contained in the encrypted segment;
and if the similarity between the characters into which the keyword is converted and a character string in the encrypted segment is greater than or equal to 95%, determining that the keyword is contained in the encrypted segment, determining how many positions in the character string of the encrypted segment have a similarity greater than or equal to 95%, and taking that number as the actual number of keywords in the encrypted segment.
6. The federal learning data security training method based on encryption scheme according to claim 5, wherein the calculation formula of the standard number of characters is: where li represents the actual number of characters of each private key generated in the history period, t1 is the start time of the history period, t2 is the end time of the history period, and n is the actual number of private keys generated in the history period;
the calculation formula of the standard segment number is as follows: where f (x) represents an arbitrary encrypted segment, hi represents a termination position within an arbitrary encrypted line, and m represents the number of encrypted lines in the private key;
the calculation formula of the standard keyword quantity is as follows: where ki denotes the actual number of keywords within each encrypted line and m denotes the number of encrypted lines in the private key;
wherein T00 represents a standard number of characters, D00 represents a standard number of segments, and K00 represents a standard keyword amount.
7. The federal learning data security training method based on encryption scheme according to claim 6, wherein transmitting the encrypted intermediate result to other data training nodes comprises:
presetting a rated transmission speed;
before transmitting the intermediate result, detecting the actual transmission distance between the current data training node and each of the other data nodes;
and adjusting the order in which the intermediate result is transmitted to the other data nodes according to the transmission distance, so that the intermediate result reaches the other data nodes at the same time.
8. A federal learning data security training system based on encryption scheme, for executing the federal learning data security training method based on encryption scheme according to any one of claims 1-7, comprising:
the generation module is used for generating a public-private key pair, wherein the public-private key pair comprises a public key and a private key, and encrypting the private key in the public-private key pair;
the sending module is used for sending the public key to a plurality of data training nodes, and a model unit formed by local training data of the data training nodes can form a federal learning model;
the encryption module is used for encrypting the intermediate result by using the public key when any data training node generates the intermediate result, and transmitting the encrypted intermediate result to other data training nodes;
the receiving module is used for receiving an intermediate result encrypted by any node by using the public key;
and the decryption module is used for decrypting the intermediate result by using the encrypted private key so as to acquire the intermediate result of any data training node.
CN202310678195.4A 2023-06-09 2023-06-09 Encryption mode-based federal learning data security training method and system Active CN116436699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310678195.4A CN116436699B (en) 2023-06-09 2023-06-09 Encryption mode-based federal learning data security training method and system

Publications (2)

Publication Number Publication Date
CN116436699A CN116436699A (en) 2023-07-14
CN116436699B true CN116436699B (en) 2023-08-22

Family

ID=87081760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310678195.4A Active CN116436699B (en) 2023-06-09 2023-06-09 Encryption mode-based federal learning data security training method and system

Country Status (1)

Country Link
CN (1) CN116436699B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110995737A (en) * 2019-12-13 2020-04-10 支付宝(杭州)信息技术有限公司 Gradient fusion method and device for federal learning and electronic equipment
CN111083140A (en) * 2019-12-13 2020-04-28 北京网聘咨询有限公司 Data sharing method under hybrid cloud environment
CN113395159A (en) * 2021-01-08 2021-09-14 腾讯科技(深圳)有限公司 Data processing method based on trusted execution environment and related device
CN113612598A (en) * 2021-08-02 2021-11-05 北京邮电大学 Internet of vehicles data sharing system and method based on secret sharing and federal learning
CN114443754A (en) * 2020-11-03 2022-05-06 中国电信股份有限公司 Block chain-based federated learning processing method, device, system and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220374763A1 (en) * 2021-05-18 2022-11-24 International Business Machines Corporation Federated learning with partitioned and dynamically-shuffled model updates

Also Published As

Publication number Publication date
CN116436699A (en) 2023-07-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Federated Learning Data Security Training Method and System Based on Encryption

Granted publication date: 20230822

Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee

Pledgor: Beijing primitive Technology Co.,Ltd.

Registration number: Y2024990000094