Disclosure of Invention
To solve the above problems, the invention provides a method and system for secure sharing of Internet of Vehicles data based on blockchain and dynamic reputation. The method and system avoid the risk of external adversaries impersonating legitimate nodes, remove the single point of failure of a centralized server, improve the validity and accuracy of reputation evaluation, and enable secure data sharing in Internet of Vehicles scenarios while protecting local privacy and the security of the trained model.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
in a first aspect, the invention provides a method for safely sharing internet of vehicles data based on blockchain and dynamic reputation, comprising the following steps:
issuing a model training task, and selecting RSU nodes and vehicle nodes which participate in the model training task according to the reputation value;
downloading, by the selected RSU node, a global model from the blockchain, and issuing the global model to the vehicle nodes within its coverage range that have joined the model training task;
the vehicle node carries out iterative training of the local model according to the global model and the local data, encrypts the generated local update parameters by adopting a secret sharing algorithm to obtain secret shares, signs the secret shares and uploads the secret shares to the RSU node;
calculating, by the RSU node, the Euclidean distance between secret shares, signing the Euclidean distance and uploading it to the blockchain;
reconstructing, by the blockchain, the Euclidean distance through Lagrange interpolation and selecting legitimate vehicle nodes; aggregating, in the RSU nodes, the secret shares uploaded by the legitimate vehicle nodes; and reconstructing, by the blockchain, the aggregation result through Lagrange interpolation to obtain the aggregated and updated global model;
and downloading, by the RSU node, the aggregated and updated global model from the blockchain and issuing it to the corresponding vehicle nodes so that the vehicle nodes perform the next round of model training until the model converges, wherein in each round of model training the reputation values obtained by reputation evaluation of the RSU nodes and the vehicle nodes are stored on the blockchain.
As an optional implementation, the method further comprises verifying the legitimacy of the vehicle nodes and the RSU nodes, specifically:
after receiving the secret shares uploaded by a vehicle node, the RSU node verifies whether the identity identifier and the timestamp of the vehicle node are valid;
after receiving the Euclidean distance and the secret shares uploaded by an RSU node, the blockchain verifies whether the identity identifier and the timestamp of the RSU node are valid.
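The identity and timestamp check can be pictured with the minimal sketch below. It is only an illustration under stated assumptions: the text does not specify the signature scheme, so an HMAC over a pre-registered key stands in for the private-key signature, and the names `registry`, `sign`, `verify_upload` and the 30-second freshness window `MAX_AGE_SECONDS` are hypothetical.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 30  # hypothetical freshness window, not taken from the text


def sign(key: bytes, node_id: str, timestamp: float, payload: bytes) -> bytes:
    """Stand-in for a digital signature: HMAC-SHA256 over id, timestamp and payload."""
    msg = node_id.encode() + b"|" + repr(timestamp).encode() + b"|" + payload
    return hmac.new(key, msg, hashlib.sha256).digest()


def verify_upload(registry, node_id, timestamp, payload, tag, now=None):
    """Accept an upload only if the sender is registered, the timestamp is fresh
    and the tag matches the payload."""
    now = time.time() if now is None else now
    key = registry.get(node_id)
    if key is None:
        return False  # identity was never registered
    if not (0 <= now - timestamp <= MAX_AGE_SECONDS):
        return False  # stale or future-dated timestamp (possible replay)
    expected = sign(key, node_id, timestamp, payload)
    return hmac.compare_digest(expected, tag)


registry = {"veh-7": b"demo-key"}  # filled during node registration (assumed)
tag = sign(b"demo-key", "veh-7", 1000.0, b"share-bytes")
accepted = verify_upload(registry, "veh-7", 1000.0, b"share-bytes", tag, now=1010.0)
```

A real deployment would replace the HMAC with an asymmetric signature verified against the public key stored at registration; the control flow of the check would stay the same.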
As an alternative embodiment, the reputation evaluation of the RSU node covers three types of events: the RSU node verifies the size of the local update parameters uploaded by the vehicle node, the RSU node verifies the update of the global model parameters, and the RSU node verifies the secret shares of other RSU nodes; events that are verified as valid and reliable are regarded as positive interaction events.
As an alternative implementation, a weight η with 0 < η ≤ 1 is set for the number of positive interaction events, the weight of the number of negative interaction events is set to 1, and the reputation value of the RSU node is as follows:
wherein $b_{i \to j}$, $d_{i \to j}$ and $u_{i \to j}$ respectively represent the trust degree, the distrust degree and the uncertainty degree of task publisher i toward RSU node j.
As an alternative embodiment, the trust degree, the distrust degree and the uncertainty degree of task publisher i toward RSU node j are respectively:
wherein r and s are the number of positive interaction events and the number of negative interaction events respectively; c is a constant.
As an alternative embodiment, the reputation value of the vehicle node is:
wherein $u_{i \to j}$ is the uncertainty degree of task publisher i toward vehicle node j, and s and f are the numbers of positive interaction events and negative interaction events, respectively.
As an alternative implementation, when reputation evaluation is performed on a vehicle node, Bayesian theory is used to predict the probability that the vehicle node exhibits a positive interaction event, so as to eliminate the uncertain interaction events of the vehicle node.
In a second aspect, the present invention provides a system for secure sharing of Internet of Vehicles data based on blockchain and dynamic reputation, comprising: a task publisher, a blockchain, RSU nodes, and vehicle nodes;
the task publisher is used for publishing model training tasks, and RSU nodes and vehicle nodes which participate in the model training tasks are selected according to the reputation values;
the selected RSU node downloads the global model from the blockchain and issues it to the vehicle nodes within its coverage range that have joined the model training task;
the vehicle node carries out iterative training of the local model according to the global model and the local data, encrypts the generated local update parameters by adopting a secret sharing algorithm to obtain secret shares, signs the secret shares and uploads the secret shares to the RSU node;
the RSU node calculates the Euclidean distance between secret shares, signs the Euclidean distance and uploads it to the blockchain;
the blockchain reconstructs the Euclidean distance through Lagrange interpolation and selects legitimate vehicle nodes; the RSU nodes aggregate the secret shares uploaded by the legitimate vehicle nodes, and the blockchain reconstructs the aggregation result through Lagrange interpolation to obtain the aggregated and updated global model;
and the RSU node downloads the aggregated and updated global model from the blockchain and issues it to the corresponding vehicle nodes so that the vehicle nodes perform the next round of model training until the model converges, wherein in each round of model training the reputation values obtained by reputation evaluation of the RSU nodes and the vehicle nodes are stored on the blockchain.
In a third aspect, the invention provides an electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the method of the first aspect.
In a fourth aspect, the present invention provides a computer readable storage medium storing computer instructions which, when executed by a processor, perform the method of the first aspect.
Compared with the prior art, the invention has the beneficial effects that:
The invention uses secret sharing and the Multi-Krum algorithm to protect the model parameters of the vehicle nodes throughout the federated learning training process, effectively preventing inference attacks and collusion attacks by untrusted RSU nodes. Model parameters are screened before aggregation, and poisoned updates are removed by computing the Euclidean distance between secret shares. This reduces the communication and computation load on vehicle terminals, provides an effective defense against poisoning attacks launched by malicious vehicle nodes without revealing private data, and ultimately achieves secure and efficient Internet of Vehicles data sharing.
Compared with prior art that does not use a blockchain, the invention uses the blockchain in place of a central server to screen and aggregate model parameters, which avoids erroneous results returned by an untrusted central server and removes the single point of failure. When vehicle nodes and RSU nodes upload model parameters, their identities are verified, which prevents external adversaries from impersonating legitimate users to upload false data. The invention thus realizes decentralized and secure federated learning and ensures that the global model and the reputation values are stored securely on-chain.
Compared with prior art that uses a blockchain, in each training round the task publisher performs effective and accurate reputation evaluation of the RSU nodes and the vehicle nodes according to model quality, and the reputation values are stored securely on-chain. The reputation evaluation is based on a subjective logic model: weights are set according to data set size, reputation evaluation of RSU nodes is added, and Bayesian theory is used to eliminate the uncertain interaction events that may occur at vehicle nodes during training. This yields effective and accurate reputation evaluation of both RSU nodes and vehicle nodes and, ultimately, vehicular federated learning that is traceable, verifiable and privacy-preserving.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
The invention is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms also are intended to include the plural forms, and furthermore, it is to be understood that the terms "comprises" and "comprising" and any variations thereof are intended to cover non-exclusive inclusions, such as, for example, processes, methods, systems, products or devices that comprise a series of steps or units, are not necessarily limited to those steps or units that are expressly listed, but may include other steps or units that are not expressly listed or inherent to such processes, methods, products or devices.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Example 1
The embodiment provides a safe sharing method of internet of vehicles data based on blockchain and dynamic reputation, which comprises the following steps:
issuing a model training task, and selecting RSU nodes and vehicle nodes which participate in the model training task according to the reputation value;
downloading, by the selected RSU node, global model parameters from the blockchain, and issuing the global model parameters to the vehicle nodes within its coverage range that have joined the model training task;
the vehicle node carries out iterative training of the local model according to the global model parameters and the local data, encrypts the generated local update parameters by adopting a secret sharing algorithm to obtain secret shares, signs the secret shares and uploads the secret shares to the RSU node;
calculating, by the RSU node, the Euclidean distance between secret shares, signing the Euclidean distance and uploading it to the blockchain;
reconstructing, by the blockchain, the Euclidean distance through Lagrange interpolation and selecting legitimate vehicle nodes; aggregating, in the RSU nodes, the secret shares uploaded by the legitimate vehicle nodes; and reconstructing, by the blockchain, the aggregation result through Lagrange interpolation to obtain the aggregated and updated global model;
and downloading, by the RSU node, the aggregated and updated global model parameters from the blockchain and issuing them to the corresponding vehicle nodes so that the vehicle nodes perform the next round of model training until the model converges, wherein in each round of model training the reputation values obtained by reputation evaluation of the RSU nodes and the vehicle nodes are stored on the blockchain.
In this embodiment, the method is carried out jointly by four parties: the task publisher, the blockchain module, the roadside unit (RSU) module and the vehicle node module. Specifically:
(1) Task publisher: establishes a machine learning model according to its own requirements and publishes a federated learning training task through the blockchain. Vehicle nodes interested in the task can apply to participate, and as more vehicle nodes join the task and contribute to model training, the task publisher eventually obtains the desired machine learning model. The task publisher also evaluates the quality of the local models of the participating nodes and, based on the subjective logic model, generates a reputation value that measures the trustworthiness of each participating node in the federated learning process, so that other task publishers can select participating nodes with better reputation.
(2) Blockchain module: stores the reputation evaluation results of the task publisher for the RSU nodes and the vehicle nodes, registers the vehicle nodes and RSU nodes in the system, and aggregates the global model. Because a blockchain is traceable and tamper-resistant, when a participating node sends a low-quality local model, the relevant information in the data block can serve as persistent and transparent evidence. In addition, the task publisher ensures secure sharing of the reputation evaluation results through an access control policy predefined in the blockchain, and the access records of other task publishers to the reputation evaluation results are stored in the blockchain.
(3) Roadside unit (RSU) module: an RSU node is a wireless device fixed on either side of a road or placed at a specific roadside position. It is located near the vehicle nodes, has a certain computing and storage capability, and can collect, process and forward the data uploaded by the vehicle nodes within its coverage range. It is mainly responsible for distributing training tasks, verifying the size of the data shared by vehicle nodes, and assisting the blockchain in removing poisoned gradient updates and aggregating the global model, thereby extending the data communication range of the Internet of Vehicles and ensuring security during data sharing.
In the model downloading stage, the RSU node downloads a global model from the blockchain and distributes the global model to vehicle nodes participating in training in a covered area; in the model uploading stage, the RSU node is responsible for processing the secret shares uploaded by the vehicle node and uploading the secret shares to the blockchain.
(4) Vehicle node module: the mobile user side collects, stores and preprocesses data and participates in the federated learning training process, then uploads the secret shares of its local model parameters to the adjacent RSU node. The training of the model parameters is then repeated in a new round of iteration until the accuracy of the global model reaches the desired value. The uploaded content must declare the local data size and attach the corresponding training time, thereby indicating the size of the user's data contribution.
The process flow is described in detail below in conjunction with fig. 1-2.
1. Task publishing stage: the task publisher establishes a machine learning model according to its own requirements and uploads the initialized model to the blockchain.
2. System node registration phase: registration of all RSU nodes and vehicle nodes in the system is completed by the blockchain, and node information of successful registration is stored on the blockchain.
3. Participating node selection phase: RSU nodes and vehicle nodes that want to join the federated learning training task send an application to the task publisher. The application contains a signed digest of the node's identity identifier and data set information together with the latest timestamp, after which the node becomes a candidate node for model training and aggregation. The task publisher downloads the candidate nodes' reputation values over a period of time from the blockchain and selects nodes with higher reputation values to participate in the task.
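The selection rule is only described as choosing nodes with higher reputation values. One possible reading, sketched below, keeps candidates above a cutoff and takes the best-ranked ones; the function name `select_participants`, the `threshold` and `max_nodes` parameters, and the sample reputation values are all assumptions made for illustration.

```python
def select_participants(reputations, threshold, max_nodes):
    """Return up to max_nodes candidate ids whose reputation is at least
    threshold, ordered from highest to lowest reputation."""
    eligible = [(node, rep) for node, rep in reputations.items() if rep >= threshold]
    # Sort by descending reputation; break ties by node id for determinism.
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return [node for node, _ in eligible[:max_nodes]]


# Hypothetical reputation values downloaded from the blockchain.
candidates = {"rsu-1": 0.91, "rsu-2": 0.40, "veh-7": 0.75, "veh-9": 0.62}
chosen = select_participants(candidates, threshold=0.6, max_nodes=2)
```

In practice RSU nodes and vehicle nodes would probably be ranked in separate pools, since each plays a different role in the round.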
4. Model issuing stage: the RSU nodes that have successfully joined the training task download the global model parameters from the blockchain and then issue the global model parameters of the current round to the vehicle nodes within their coverage range that have successfully joined the training task.
5. Privacy protection training phase:
5.1, local training: the vehicle node updates the local model according to the downloaded global model, then enters into the iterative training of the model, trains the local model by using local data and generates local update;
5.2, generating secret shares: the vehicle node calls the Shamir (k, N) secret sharing algorithm to encrypt the local update parameters to obtain secret shares, signs the secret shares with its private key and uploads them to the RSU node.
Secret sharing is a cryptographic technique that distributes information among multiple participants so that the information is not corrupted, tampered with or lost. It divides a secret into several shares by a specific operation and distributes them to multiple participants; recovering the secret requires multiple participants to cooperate according to a protocol, and an individual share is of no use on its own. Secret sharing schemes mainly include those based on the Chinese remainder theorem, Shamir's scheme, Blakley's scheme and the like.
In this embodiment, the Shamir algorithm is adopted. It divides the secret S into N secret shares and distributes them to N participants. To recover the original secret S, at least k participants must cooperate; if fewer than k participants cooperate, S cannot be reconstructed. The algorithm is based on Lagrange interpolation and comprises two stages: secret distribution and secret reconstruction.
1) The secret distribution phase algorithm is as follows:
for a secret $s \in \mathbb{Z}_p$, the distributor randomly selects $t-1$ coefficients $a_1, a_2, \dots, a_{t-1}$ from $\mathbb{Z}_p$ and constructs the polynomial:
$$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{t-1} x^{t-1} \pmod{p}$$
wherein $s = f(0) = a_0$; the distributor calculates $y_i = f(x_i)$, $x_i \in [1, n]$, and sends $(x_i, y_i)$ to participant $P_i$, $i \in [1, n]$.
2) The secret reconstruction phase algorithm is as follows:
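at least t participants reconstruct the original secret s from their secret shares by the Lagrange interpolation formula:
$$s = f(0) = \sum_{i=1}^{t} y_i \lambda_i \pmod{p}$$
wherein $\lambda_i = \prod_{j=1,\, j \neq i}^{t} \frac{x_j}{x_j - x_i}$ is the Lagrange coefficient of participant $P_i$;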
if the number of parties providing shares is less than t, then no information about the secret will be revealed.
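The two stages can be illustrated with the short sketch below. It is a minimal illustration and not a hardened implementation: the prime `P`, the function names and the sample secret are assumptions, and the threshold t and share count n play the roles of k and N in the Shamir (k, N) notation used earlier.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime used as the modulus p (illustrative choice)


def split_secret(secret, t, n, p=P):
    """Secret distribution: build f of degree t-1 with f(0) = secret and
    return the shares (x_i, f(x_i)) for x_i = 1..n."""
    coeffs = [secret % p] + [secrets.randbelow(p) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for a in reversed(coeffs):  # Horner evaluation of f(x) mod p
            y = (y * x + a) % p
        shares.append((x, y))
    return shares


def reconstruct_secret(shares, p=P):
    """Secret reconstruction: Lagrange interpolation of f at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * xj) % p
                den = (den * (xj - xi)) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, p)) % p
    return secret


shares = split_secret(123456789, t=3, n=5)
recovered = reconstruct_secret(shares[:3])   # any 3 of the 5 shares suffice
```

Model parameters are real-valued, so a deployment would first need to encode them as integers modulo p (for example by fixed-point scaling), which the text does not detail.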
6. Poisoning attack detection stage: after the RSU node receives the secret shares uploaded by a vehicle node, it first verifies whether the identity identifier and the timestamp of the vehicle node are valid. After verification, because malicious vehicle nodes may upload poisoned updates, such updates need to be removed as far as possible before global aggregation is performed.
The present embodiment uses the Multi-Krum algorithm to remove poisoned updates generated by malicious users. Specifically, the RSU node calculates the Euclidean distance between the secret shares uploaded by the vehicle nodes, signs the Euclidean distance with its own private key and uploads it to the blockchain.
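One way a distance can be computed on shares and reconstructed later is sketched below. It relies on a known property of Shamir shares: subtracting and squaring shares locally yields points on a polynomial of degree 2(t-1) whose constant term is the squared difference of the secrets, so 2t-1 points are needed for interpolation. Whether the embodiment works exactly this way is not stated; the share layout (each holder keeps one share of every coordinate), the modulus `P`, the integer-encoded gradients and all function names are assumptions.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (an assumption)


def split(secret, t, n):
    """Shamir shares (x, f(x)) for x = 1..n, with deg f = t-1 and f(0) = secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t - 1)]
    return [(x, sum(a * pow(x, k, P) for k, a in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]


def interpolate_at_zero(points):
    """Lagrange interpolation at x = 0 over Z_P."""
    total = 0
    for xi, yi in points:
        lam = 1
        for xj, _ in points:
            if xj != xi:
                lam = lam * xj % P * pow((xj - xi) % P, -1, P) % P
        total = (total + yi * lam) % P
    return total


def local_squared_distance(ys_a, ys_b):
    """Computed by one share holder: sum over coordinates of (share_a - share_b)^2."""
    return sum((ya - yb) ** 2 for ya, yb in zip(ys_a, ys_b)) % P


t, n = 2, 3                    # squaring doubles the degree, so 2t-1 = 3 points are needed
g1, g2 = [3, 7], [6, 3]        # hypothetical integer-encoded gradients
shares1 = [split(v, t, n) for v in g1]   # shares1[coordinate][holder] = (x, y)
shares2 = [split(v, t, n) for v in g2]

points = []
for h in range(n):
    ys_a = [shares1[c][h][1] for c in range(len(g1))]
    ys_b = [shares2[c][h][1] for c in range(len(g2))]
    points.append((h + 1, local_squared_distance(ys_a, ys_b)))

squared_distance = interpolate_at_zero(points)   # (3-6)^2 + (7-3)^2
```

Note that the reconstructed value is the squared distance; no holder sees either gradient in the clear while computing its local term.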
The Multi-Krum algorithm is a Byzantine fault-tolerant machine learning algorithm based on Euclidean distance. It ensures that distributed machine learning can still converge in the presence of Byzantine faults, and it can remove poisoned updates generated by malicious users. The algorithm is as follows:
let the number of users in a region be m and let z be the number of Byzantine nodes; for the gradient uploaded by each vehicle node, the squared Euclidean distances to its m-z-2 nearest gradients are summed as the quality score of that gradient:
$$s(i) = \sum_{i \to j} \lVert g_i - g_j \rVert^2$$
wherein $i \to j$ denotes that gradient $g_j$ belongs to the $m-z-2$ gradients nearest to gradient $g_i$.
Finally, the m-z gradients with the lowest quality scores are selected as legitimate updates and aggregated.
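The selection step can be sketched on plaintext vectors as follows. This is an illustration only: in the scheme described here the distances are reconstructed from secret shares rather than computed on raw gradients, and the use of squared distances follows the usual Krum score, which is an assumption about the formula in the source. The function name and sample data are hypothetical.

```python
def multi_krum_select(gradients, z):
    """Return (indices of the m - z gradients with the lowest scores, all scores).

    The score of gradient i is the sum of squared Euclidean distances to its
    m - z - 2 nearest neighbours, where m is the number of gradients and z is
    the assumed number of Byzantine nodes.
    """
    m = len(gradients)
    neighbours = m - z - 2
    if neighbours < 1:
        raise ValueError("need m - z - 2 >= 1")

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    scores = []
    for i in range(m):
        dists = sorted(sq_dist(gradients[i], gradients[j]) for j in range(m) if j != i)
        scores.append(sum(dists[:neighbours]))

    ranked = sorted(range(m), key=lambda i: (scores[i], i))
    return sorted(ranked[: m - z]), scores


# Four similar updates and one outlier standing in for a poisoned update.
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [10.0, -10.0]]
selected, scores = multi_krum_select(updates, z=1)
```

With these sample values the outlier at index 4 receives by far the largest score and is the one update left out of the aggregation.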
7. Model aggregation stage:
after the blockchain receives the Euclidean distances and the secret shares uploaded by the RSU node, it first verifies whether the identity identifier and the timestamp of the RSU node are valid;
after verification, the blockchain reconstructs the Euclidean distance between the gradient of the i-th user and the gradient of the j-th user through Lagrange interpolation from the parameters uploaded by the RSU nodes, and then selects legitimate vehicle nodes according to the Multi-Krum algorithm;
the RSU node downloads the list of legitimate vehicle nodes from the blockchain, locally sums the secret shares uploaded by the legitimate vehicle nodes and uploads the sum to the blockchain;
the blockchain reconstructs the aggregation result through Lagrange interpolation;
the RSU node downloads the aggregated and updated global model parameters from the blockchain and issues them to the vehicle nodes within its coverage range; the vehicle nodes use them for the next round of training, and the above steps are repeated until the model converges or the desired accuracy is achieved.
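Summing shares locally and reconstructing only the sum works because Shamir shares are additively homomorphic: adding the shares held at the same evaluation point gives a share of the sum of the secrets. The sketch below illustrates this under assumptions that are not spelled out in the text: each RSU holds one share of every legitimate vehicle's update, updates are already encoded as integers, and the modulus `P` and function names are illustrative.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (an assumption)


def split(secret, t, n):
    """Shamir shares (x, f(x)) for x = 1..n, with deg f = t-1 and f(0) = secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t - 1)]
    return [(x, sum(a * pow(x, k, P) for k, a in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]


def reconstruct(shares):
    """Lagrange interpolation at x = 0 over Z_P."""
    total = 0
    for xi, yi in shares:
        lam = 1
        for xj, _ in shares:
            if xj != xi:
                lam = lam * xj % P * pow((xj - xi) % P, -1, P) % P
        total = (total + yi * lam) % P
    return total


t, n = 2, 3
updates = [10, 20, 30]                       # one integer-encoded parameter per vehicle
per_vehicle = [split(u, t, n) for u in updates]

# Each RSU (evaluation point x) adds up the shares it holds, without seeing any update.
rsu_sums = [(x, sum(shares[x - 1][1] for shares in per_vehicle) % P)
            for x in range(1, n + 1)]

# The blockchain reconstructs only the aggregate from t of the summed shares.
aggregate = reconstruct(rsu_sums[:t])
```

Dividing the aggregate by the number of legitimate vehicles (after decoding) would give an averaged update; how the embodiment weights or averages the sum is not specified here.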
8. Reputation evaluation stage:
A reputation evaluation model based on subjective logic uses the concepts of evidence space and concept space to describe and measure the trust relationship of one party toward another. The evidence space is composed of historical interaction events, which are divided into positive and negative events. The concept space is composed of a series of probabilistic reputation evaluation opinions; the subjective reputation opinion of one party i toward another party j is expressed by the triple $\{b_{i \to j}, d_{i \to j}, u_{i \to j}\}$, which satisfies:
$$b_{i \to j} + d_{i \to j} + u_{i \to j} = 1, \quad b_{i \to j} = \frac{r}{r+s+c}, \quad d_{i \to j} = \frac{s}{r+s+c}, \quad u_{i \to j} = \frac{c}{r+s+c}$$
wherein $b_{i \to j}$, $d_{i \to j}$ and $u_{i \to j}$ respectively represent the trust degree, the distrust degree and the uncertainty degree of i toward j; r and s are the number of positive interaction events and the number of negative interaction events, respectively; c is a constant whose value is related to the influence of the number of uncertain interaction events on the reputation.
An interaction event is one pass in which a vehicle node downloads the global model parameters, trains iteratively on its local data and uploads its parameters once. If the upload passes the poisoning attack detection performed with the help of the RSU node, it is a positive interaction event; otherwise it is a negative interaction event. When a vehicle node does not upload any parameters, an uncertain interaction event occurs. The historical interaction events are the set of the total numbers of interaction events with the participating node within a time period Δt: t = {s, f}, where s and f are the numbers of positive and negative interaction events, respectively.
8.1, reputation evaluation of RSU, comprising three types of events:
1) The RSU node verifies the size of the uploaded data set of the participating vehicle nodes; 2) The RSU node verifies the updating of the global model parameters; 3) The RSU verifies the secret shares of the other RSUs.
If the RSU node verifies that these events are valid and reliable, it will be considered a positive interaction event. The concept space is mainly composed of reputation opinions of task publishers on participating RSU nodes.
Different RSU nodes may contribute differently to model training: the number of vehicles within their coverage range and the size of the data sets they collect differ, so the time they spend assisting the blockchain in removing poisoned models and aggregating the global model also differs. The reputation evaluation scheme therefore sets a weight η, with 0 < η ≤ 1, for the number of positive interaction events r to reflect the contribution of the RSU node; meanwhile, in order to reduce the occurrence of negative interaction events, the weight of s is set to 1. The original algorithm is rewritten as:
the reputation value of task publisher i for RSU node j is expressed as:
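As a hedged illustration of how such a weighted opinion could be computed, the sketch below assumes the common subjective-logic form in which each component is an event count divided by r + s + c, with r replaced by ηr. The final combination of trust and uncertainty into a single reputation value (`b + a * u`, with a hypothetical coefficient `a`), the default `c = 2.0` and the function names are assumptions, not formulas taken from this text.

```python
def rsu_opinion(r, s, eta, c=2.0):
    """Weighted opinion triple (b, d, u) for an RSU node.

    r, s : numbers of positive and negative interaction events
    eta  : weight of positive events, 0 < eta <= 1 (negative events weigh 1)
    c    : constant controlling the share of uncertainty (assumed default)
    """
    if not 0 < eta <= 1:
        raise ValueError("eta must satisfy 0 < eta <= 1")
    total = eta * r + s + c
    b = eta * r / total   # trust degree
    d = s / total         # distrust degree
    u = c / total         # uncertainty degree
    return b, d, u


def rsu_reputation(r, s, eta, c=2.0, a=0.5):
    """Assumed combination: trust plus a fraction a of the uncertainty."""
    b, _, u = rsu_opinion(r, s, eta, c)
    return b + a * u


b, d, u = rsu_opinion(r=8, s=1, eta=0.5)
reputation = rsu_reputation(r=8, s=1, eta=0.5)
```

Because η scales only the positive count, a smaller η makes each negative event weigh relatively more, which matches the stated aim of discouraging negative interaction events.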
8.2, reputation evaluation of the vehicle node:
based on the subjective logic model and the concept of historical interaction events, the following can be derived:
wherein the remaining parameter is the probability of successful parameter transmission, which represents the communication quality.
Considering that uncertain interaction events at vehicle node j may affect the accuracy of its reputation evaluation, the present embodiment uses the Bayesian formula to predict the probability that the vehicle node exhibits a positive interaction event, namely:
the historical interaction events t = {s, f} of vehicle node j are taken as the precondition E, and the behavior of exhibiting a positive interaction event is taken as event H.
Assuming that the probability of event H occurring given that event E has occurred follows a Beta distribution, the correlation coefficient that describes the influence of the uncertain interaction behavior of vehicle node j on its reputation is given by the mathematical expectation of the Beta distribution, wherein the correlation coefficient represents the probability that vehicle node j exhibits a positive interaction event when an uncertain interaction occurs.
Combining the three formulas above, the direct reputation value of task publisher i for vehicle node j in one federated learning task is:
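A rough sketch of how these pieces could fit together is given below. Every formula in it is an assumption chosen for illustration: the Beta expectation uses a uniform prior, giving (s + 1) / (s + f + 2); the uncertainty is taken as one minus the transmission success probability `q`; the trust degree scales the positive-event ratio by (1 - u); and the reputation is trust plus the predicted share of the uncertainty. The names `beta_expectation`, `vehicle_reputation` and `q` are hypothetical.

```python
def beta_expectation(s, f):
    """Mean of Beta(s + 1, f + 1): the expected probability of a positive
    interaction event after s positive and f negative events, assuming a
    uniform prior."""
    return (s + 1) / (s + f + 2)


def vehicle_reputation(s, f, q):
    """Assumed direct reputation of a vehicle node.

    s, f : numbers of positive and negative interaction events
    q    : probability of successful parameter transmission (communication quality)
    """
    u = 1.0 - q                                     # uncertainty degree (assumed form)
    b = (1.0 - u) * s / (s + f) if s + f else 0.0   # trust degree (assumed form)
    alpha = beta_expectation(s, f)                  # predicted positive share of uncertainty
    return b + alpha * u


alpha = beta_expectation(8, 2)
reputation = vehicle_reputation(s=8, f=2, q=0.9)
```

Under these assumptions a node with no history gets `beta_expectation(0, 0) == 0.5`, so its uncertain interactions are treated as equally likely to be positive or negative.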
This embodiment provides a method for secure sharing of Internet of Vehicles data based on blockchain and dynamic reputation. Training tasks are published through the blockchain, and the blockchain replaces the parameter server of traditional federated learning to store the model parameters and the reputation values of the participating nodes in a decentralized manner. The legitimacy of the vehicle nodes and the RSU nodes is verified. Secret sharing of model parameters is supported, poisoned model parameters are removed in combination with the Multi-Krum algorithm, and an RSU layer is added to screen and aggregate the model parameters uploaded by the vehicle nodes. In the reputation evaluation process, the subjective logic model is improved: weights are set for different data sets, Bayesian theory is used to eliminate the uncertain interaction events that occur at vehicle nodes during training, the RSU nodes and the vehicle nodes are evaluated accurately and effectively according to the historical interaction behavior of the participating nodes, and the global model parameters and the reputation values of the participating nodes are finally stored securely on-chain. As a result, the risk of an external adversary impersonating a legitimate node is avoided, the single point of failure of the centralized server is removed, the validity and accuracy of reputation evaluation are improved, and secure data sharing in Internet of Vehicles scenarios is achieved while protecting local privacy and the security of the trained model.
Example 2
This embodiment provides a system for secure sharing of Internet of Vehicles data based on blockchain and dynamic reputation, comprising: a task publisher, a blockchain, RSU nodes, and vehicle nodes;
the task publisher is used for publishing model training tasks, and RSU nodes and vehicle nodes which participate in the model training tasks are selected according to the reputation values;
the selected RSU node downloads the global model from the blockchain and issues it to the vehicle nodes within its coverage range that have joined the model training task;
the vehicle node carries out iterative training of the local model according to the global model and the local data, encrypts the generated local update parameters by adopting a secret sharing algorithm to obtain secret shares, signs the secret shares and uploads the secret shares to the RSU node;
the RSU node calculates the Euclidean distance between secret shares, signs the Euclidean distance and uploads it to the blockchain;
the blockchain reconstructs the Euclidean distance through Lagrange interpolation and selects legitimate vehicle nodes; the RSU nodes aggregate the secret shares uploaded by the legitimate vehicle nodes, and the blockchain reconstructs the aggregation result through Lagrange interpolation to obtain the aggregated and updated global model;
and the RSU node downloads the aggregated and updated global model from the blockchain and issues it to the corresponding vehicle nodes so that the vehicle nodes perform the next round of model training until the model converges, wherein in each round of model training the reputation values obtained by reputation evaluation of the RSU nodes and the vehicle nodes are stored on the blockchain.
It should be noted that the above modules correspond to the steps described in embodiment 1; the examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to the content disclosed in embodiment 1. It should also be noted that the modules described above may be implemented as part of a system in a computer system, for example as a set of computer-executable instructions.
In further embodiments, there is also provided:
an electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the method described in embodiment 1. For brevity, the description is omitted here.
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory may include read only memory and random access memory and provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store information of the device type.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the method described in embodiment 1.
The method in embodiment 1 may be embodied directly as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor. The software modules may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory or registers. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method. To avoid repetition, a detailed description is not provided herein.
Those of ordinary skill in the art will appreciate that the elements of the various examples described in connection with the present embodiments, i.e., the algorithm steps, can be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.