CN117313902A - Signal game-based vehicle formation asynchronous federated learning method
- Publication number: CN117313902A
- Application number: CN202311619800.7A
- Authority: CN (China)
- Legal status: Granted (the status listed is an assumption, not a legal conclusion)
Classifications
- G06N 20/00 — Machine learning
- G06F 21/55, G06F 21/56 — Detecting local intrusion or implementing counter-measures; computer malware detection or handling
- G06F 21/602 — Providing cryptographic facilities or services
- G06F 21/6218, G06F 21/6245 — Protecting access to data via a platform; protecting personal data
- Y02T 10/40 — Engine management systems (climate change mitigation technologies related to transportation)
Abstract
The invention discloses a signal-game-based asynchronous federated learning method for vehicle formations, comprising the following steps: grouping vehicles according to position information to obtain vehicle formations; the server sending a learning task to Internet-of-Vehicles devices and receiving participation signals sent by vehicle formations; the server identifying the participation signals based on a first strategy of the signal game to judge whether a vehicle formation is in a positive state; if the server judges that the formation is positive, sending the global model to it; and the server receiving the local model the formation trained on its local data set and aggregating the global model asynchronously based on that local model. The invention achieves accurate selection of active participants under incomplete information, reduces communication overhead through asynchronous aggregation, improves training efficiency, and, by combining the two, improves model training quality.
Description
Technical Field
The present invention relates to the field of electric digital data processing, and more particularly to a signal-game-based asynchronous federated learning method for vehicle formations.
Background
With the development of technology, Intelligent Connected Vehicles (ICVs) have attracted considerable attention. In an ICV, a large amount of sensor data must be processed in cooperation with a central server to provide a comfortable driving experience. Since this sensor data contains sensitive information such as speed and acceleration, which is easily edited, forged, or stolen, transmitting it to a central server raises security and privacy problems. Federated learning addresses these problems by having each ICV train a local model on its private data and upload only the model parameters. It also provides a communication-efficient way of solving machine-learning problems over distributed data. Because of these advantages, federated learning has been widely used in ICV applications such as cooperative vehicle localization and traffic-flow prediction.
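The workflow this paragraph describes — each client trains locally on private data and uploads only model parameters for aggregation — can be illustrated with a minimal synchronous federated-averaging round. This is a FedAvg-style sketch on a toy least-squares task; the function names, data, and hyperparameters are illustrative and not taken from the patent:

```python
import numpy as np

def local_update(w_global, X, y, lr=0.1, epochs=5):
    """One client's local training: gradient descent on its private least-squares data."""
    w = w_global.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5*mean((Xw - y)^2)
        w -= lr * grad
    return w

def fedavg_round(w_global, clients):
    """Aggregate client models weighted by dataset size; raw data never leaves a client."""
    updates = [(local_update(w_global, X, y), len(y)) for X, y in clients]
    total = sum(n for _, n in updates)
    return sum(n * w for w, n in updates) / total

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ w_true))  # noiseless targets shared across clients

w = np.zeros(2)
for _ in range(30):
    w = fedavg_round(w, clients)
```

Only the two-element parameter vectors cross the network each round; the patent replaces this synchronous round with per-formation asynchronous aggregation, discussed below.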
However, the expensive training overhead and the non-independent and identically distributed (Non-IID) data of individual ICVs remain challenges for large-scale application of federated learning in the Internet of Vehicles. In particular, because the sensor data involved in training follow different distributions, Non-IID data bias the global model aggregation and degrade model accuracy. In terms of communication overhead, traditional synchronous federated learning must wait for all nodes to upload their models before aggregating, which greatly increases time overhead. Furthermore, each ICV requires multiple rounds of interaction with the central server, which inevitably creates significant communication overhead in a large-scale vehicle scenario and reduces training efficiency.
Furthermore, in the real world there are malicious participants that can disrupt the overall model training, so it is necessary to exclude malicious nodes effectively before training. Existing participant-selection strategies are mostly based on reputation models, judging the nature of nodes by quantifying subjective and objective reputations, and assume by default that the selector can observe complete information about each vehicle. In reality, however, the central server often has only incomplete information. Designing an effective participant-selection strategy that excludes unreliable members under incomplete information is therefore another challenge.
Disclosure of Invention
Based on the above needs of the prior art, the technical problem to be solved by the invention is to provide a signal-game-based asynchronous federated learning method for vehicle formations that accurately selects active participants under incomplete information while reducing communication overhead and improving training efficiency through asynchronous aggregation; combined, the two improve model training quality.
In order to solve the problems, the invention is realized by adopting the following technical scheme:
Provided is a signal-game-based asynchronous federated learning method for vehicle formations, comprising: grouping vehicles according to position information to obtain vehicle formations; the server sending a learning task to Internet-of-Vehicles devices and receiving participation signals sent by the vehicle formations; the server identifying the participation signals based on a first strategy of the signal game to determine whether a vehicle formation is in a positive state, including: constructing a game model with the server and the vehicle formation as players; solving the game model to obtain a first strategy in perfect Bayesian equilibrium, the first strategy comprising the game equilibria on four solution spaces; and determining the state of the vehicle formation according to the first strategy, including: determining the high-reward boundary value and the low-reward boundary value, where m_h denotes the received first reward-proposal signal, m_l the received second reward-proposal signal (the first being higher than the second), D the cost of reverting to the previous globally aggregated model after accepting a malicious vehicle formation, E_s the impact of the server accepting a trained local model from a malicious vehicle formation, and E_p the impact of the server accepting a trained local model from a positive vehicle formation; determining the prior belief μ and, according to the high- and low-reward boundary values, the solution space to which it belongs, where μ comprises μ_h, the server's prior belief that the formation is of the positive type upon receiving m_h, and μ_l, the corresponding prior belief upon receiving m_l; and comparing, in the corresponding solution space, m_h − m_l with F; m_h and m_l with E_s − D; m_l with C(θ_p) − F, with C(θ_p), and with E_p; m_h with C(θ_p) + F and with E_p; and γ with μ_h and with μ_l, to determine the equilibrium solution of the vehicle formation, where F denotes the masquerading cost of a malicious vehicle formation, γ the probability that the formation is of the positive type, and C(θ_p) the training overhead of a positive vehicle formation; if the server judges the formation to be positive, sending the global model to it; and the server receiving the local model the formation trained on its local data set and aggregating the global model asynchronously based on that local model.
Optionally, constructing the game model with the server and the vehicle formation as players includes building a signal-game model between the server and the vehicle formation, defined by its player set, type sets, signal set, strategy set, and reward set: N denotes the set of players; the type sets of the two parties comprise the server's type set, containing the single type θ_h, and the vehicle formation's type set, containing the positive type θ_p and the malicious type θ_s; Pr{θ = θ_p} = γ is the natural probability that a formation is of the positive type, and Pr{θ = θ_s} = 1 − γ the natural probability that it is malicious; M = {m_h, m_l} is the signal set; Π = {π_a, π_r} is the strategy set, π_a denoting that the server accepts the formation's request and π_r that it rejects it; and U = {U_r, U_v} is the reward set, U_r denoting the server's payoff and U_v the vehicle formation's payoff.
Optionally, the reward set includes: when A = θ_p, B = m_h, C = π_a: U_r = E_p − m_h, U_v = m_h − C(θ_p); when A = θ_p, B = m_l, C = π_a: U_r = E_p − m_l, U_v = m_l − C(θ_p); when A = θ_p, B = m_h or m_l, C = π_r: U_r = 0, U_v = 0; when A = θ_s, B = m_h, C = π_a: U_r = E_s − m_h − D, U_v = m_h − C(θ_p) − F; when A = θ_s, B = m_h, C = π_r: U_r = 0, U_v = −F; when A = θ_s, B = m_l, C = π_a: U_r = E_s − m_l − D, U_v = m_l − C(θ_s); when A = θ_s, B = m_l, C = π_r: U_r = 0, U_v = 0; where C(θ_s) denotes the training overhead of a malicious vehicle formation, A is the type of the vehicle formation, B is the reward proposal the server receives from the formation, and C is the server's decision.
Optionally, determining the solution space to which the prior belief belongs according to the high-reward boundary value and the low-reward boundary value includes assigning the belief pair (μ_h, μ_l) to one of four solution spaces according to how μ_h and μ_l compare with the boundary values, where D_1 denotes the first solution space, D_2 the second, D_3 the third, and D_4 the fourth.
Optionally, the first strategy comprises the game equilibria on the four solution spaces: if the prior belief lies in the first solution space, the server plays the corresponding server-side equilibrium; if a first limiting condition (a set of inequalities over the payoff parameters) is satisfied, one equilibrium solution of the vehicle formation is determined, and if a second limiting condition is satisfied, another is determined. Likewise, if the prior belief lies in the second solution space, the server plays its corresponding equilibrium and the third or fourth limiting condition determines the formation's equilibrium solution; in the third solution space the fifth limiting condition determines it; and in the fourth solution space the sixth or seventh limiting condition determines it.
Optionally, the prior belief μ is revised based on the game equilibrium, including: in the first solution space, when the first limiting condition is satisfied, the server updates the belief to λ = {λ_h, λ_l} with λ_h = μ_h and λ_l given by the corresponding equilibrium expression; when the second limiting condition is satisfied, it updates to λ_h = 1, λ_l = 0. In the second solution space, when the third limiting condition is satisfied, λ_h = μ_h with λ_l given by the corresponding equilibrium expression; when the fourth is satisfied, λ_h = 1, λ_l = 0. In the third solution space, when the fifth limiting condition is satisfied, λ_h = μ_h with λ_l given by the corresponding equilibrium expression. In the fourth solution space, when the sixth limiting condition is satisfied, λ_l = μ_l with λ_h given by the corresponding equilibrium expression; when the seventh is satisfied, λ_h = 1, λ_l = 0.
Optionally, the local model w_p^r is obtained by minimizing a training objective F(w) that combines the local loss f, evaluated on D_p, with a strongly convex term weighted by the hyperparameter ρ and anchored at w^{r−1}, the previous-round global model received by the vehicle formation; here D_p denotes the union of the vehicle data sets within the formation.
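If the strongly convex term takes a FedProx-style proximal form — an assumption, since the patent's exact formula did not survive extraction — the local objective is F(w) = f(w) + (ρ/2)·‖w − w^{r−1}‖², and minimizing it pulls the local model toward the last global model. A sketch with an illustrative least-squares loss:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
w_true = np.array([1.0, 3.0])
y = X @ w_true
w_prev = np.zeros(2)   # last-round global model w^{r-1}
rho = 0.5              # proximal hyperparameter

def local_objective(w):
    f = 0.5 * np.mean((X @ w - y) ** 2)           # local loss f(w) on D_p
    prox = 0.5 * rho * np.sum((w - w_prev) ** 2)  # strongly convex proximal term
    return f + prox

# gradient descent on the proximal objective
w = w_prev.copy()
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(y) + rho * (w - w_prev)
    w -= 0.05 * grad
```

The proximal term makes the objective strongly convex even when f alone is not, and keeps the minimizer between w_prev and the pure loss minimizer, limiting client drift.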
Optionally, the server receives the local model w_p^r that the vehicle formation trained on its local data set and aggregates the global model asynchronously based on it: in aggregation round r, the current global model w^r is formed as a weighted combination of the previous-round global model w^{r−1} and w_p^r, with the weight governed by s(w^{r−1}, w_p^r), a similarity function between the two models, and a hyperparameter z.
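A sketch of such a similarity-weighted asynchronous update under assumed forms — cosine similarity for s(·,·) and a mixing weight z·s clipped at zero; the patent's exact update rule is not recoverable here:

```python
import numpy as np

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def async_aggregate(w_global, w_local, z=0.5):
    """Mix an incoming local model into the global model immediately,
    with no synchronization barrier. The mixing weight shrinks when the
    local model disagrees with the current global model."""
    s = cosine_sim(w_global, w_local)
    alpha = z * max(s, 0.0)  # assumed form: hyperparameter z scaled by similarity
    return (1 - alpha) * w_global + alpha * w_local

w = np.array([1.0, 0.0])        # current global model w^{r-1}
w_aligned = np.array([1.0, 0.1])
w_hostile = np.array([-1.0, 0.0])
```

Under this choice, an upload pointing opposite the global model (similarity ≤ 0) is ignored entirely, while a well-aligned upload moves the global model toward it — one way similarity weighting can dampen a malicious or stale contribution.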
Optionally, the vehicle formation trains the global model with its local data set to obtain the local model by using stochastic gradient descent to find a locally optimal solution, with a mini-batch gradient descent scheme controlling the number of training samples per update.
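The mini-batch control of the training-sample count can be sketched as follows; the batch size, learning rate, and least-squares loss are illustrative:

```python
import numpy as np

def minibatch_sgd(w, X, y, lr=0.05, batch=8, epochs=20, seed=0):
    """Stochastic gradient descent: each update uses a random mini-batch,
    so `batch` controls the number of training samples per step."""
    rng = np.random.default_rng(seed)
    n = len(y)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch):
            b = idx[start:start + batch]
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b)  # gradient on the batch only
            w = w - lr * grad
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(64, 2))
w_star = np.array([-1.5, 0.5])
w_fit = minibatch_sgd(np.zeros(2), X, X @ w_star)
```

Smaller batches cut per-step computation at the cost of noisier gradients, which is why the claim controls the sample count rather than using full-batch descent.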
Optionally, if the server determines that the vehicle formation is not in a positive state, the server disregards the vehicle formation and waits for participation signals of other vehicle formations.
Compared with the prior art, the invention provides a signal-game-based asynchronous federated learning method for vehicle formations. An asynchronous federated learning framework is built around vehicle formations: each formation acts as a federated learning client that provides diversified and sufficient data through secure data sharing within the formation while pooling computing resources to reduce training overhead. Asynchronous optimization is then performed: the server aggregates the global model immediately after each formation uploads its trained model. The invention further establishes a first strategy based on the signal game to judge participants; in this strategy the server is the receiver of the signal and the vehicle formation is the sender, so participants can be selected under incomplete information. Through perfect Bayesian equilibrium the server obtains the optimal signals of vehicle formations under different conditions while users' private information is protected, and by identifying the signals it can accurately judge whether a formation is positive or malicious. Together, asynchronous federated learning and the signal-game-based first strategy improve model training quality: each time the server receives a participation signal from a vehicle formation, it judges the formation's nature according to the selection strategy, preventing malicious attacks from corrupting the federated learning result.
Drawings
To illustrate the embodiments of the present description or the prior-art solutions more clearly, the drawings required by the embodiments are briefly introduced below. The drawings described below are only some of the embodiments of the present description; a person of ordinary skill in the art can derive other drawings from them.
FIG. 1 is a flow chart of the signal-game-based vehicle formation asynchronous federated learning method provided in an embodiment of the present invention;
FIG. 2 is a flow chart of determining whether a vehicle formation is in a positive state in the method according to an embodiment of the present invention;
FIG. 3 is a flow chart of determining the vehicle formation state according to the first strategy in the method according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
For the purpose of facilitating an understanding of the embodiments of the present invention, reference will now be made to the following description of specific embodiments, taken in conjunction with the accompanying drawings, which are not intended to limit the scope of the invention.
Example 1
In this embodiment, the local model refers to a model trained on each participant, and is trained using data local to that participant, including characteristics and patterns of the participant's local data. The global model is a global model obtained by aggregating local models of participants during federal learning. To ensure privacy protection and data security, federal learning typically employs security aggregation algorithms, such as cryptographic aggregation, differential privacy, etc., to aggregate the local models of participants in an encrypted or privacy-preserving manner. By aggregation, the local information of the individual models of the participants is integrated into a global model, which contains the collective knowledge and characteristics of all participants in federal learning for subsequent reasoning or prediction.
This embodiment provides a signal-game-based vehicle formation asynchronous federated learning method, as shown in FIG. 1, comprising the following steps:
s1: and grouping the vehicles according to the position information to obtain a vehicle formation.
Assume there are several vehicles in the system, whose index set is denoted V = {1, …, n}, with n the number of vehicles. According to geographic distance they form a number of vehicle platoons P = {1, …, p}, and the number of vehicles in each formation is denoted M_0 = {m_1, …, m_i, …, m_p}; the index set of the vehicles in any one formation p is denoted V_p. Each vehicle in a formation has its own data set, and these data sets generally differ from one another — for example, license-plate information captured by each vehicle's high-definition camera. The data set belonging to vehicle v is denoted D_v, and for two different vehicles v and u, usually D_v ≠ D_u. Vehicle data within a formation are securely shared via Dedicated Short-Range Communications (DSRC), so the formation's data set D_p is the union of the vehicle data sets within the formation: D_p = ⋃_{v∈V_p} D_v.
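The position-based grouping of S1 can be sketched as a simple gap-threshold clustering along the road. The 50 m threshold and the one-dimensional positions are illustrative; the patent does not fix a particular grouping rule:

```python
def group_by_position(positions, max_gap=50.0):
    """Greedily assign vehicles (sorted by road position) to the current
    platoon while the gap to the previous vehicle stays within max_gap."""
    order = sorted(range(len(positions)), key=lambda v: positions[v])
    platoons, current = [], [order[0]]
    for prev, v in zip(order, order[1:]):
        if positions[v] - positions[prev] <= max_gap:
            current.append(v)
        else:
            platoons.append(current)  # gap too large: close the platoon
            current = [v]
    platoons.append(current)
    return platoons

# vehicles 0..5 at road positions (metres)
pos = [0.0, 20.0, 45.0, 300.0, 320.0, 900.0]
print(group_by_position(pos))  # → [[0, 1, 2], [3, 4], [5]]
```

Each returned sublist is one platoon V_p; the formation's data set D_p would then be the union of the member vehicles' data sets.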
S2: the server sends a learning task to the Internet of vehicles equipment and receives participation signals sent by vehicle formation.
The server issues machine learning tasks to the internet of vehicles platform, waiting for the vehicle formation to send participation requests. The vehicle formation is used as a participant to send signals to the server, and the server receives the signals sent by the participant.
S3: the server identifies the engagement signals based on a first policy of signal gaming to determine whether the vehicle formation is in a positive state.
As shown in FIG. 2, this step includes:
s30: and constructing a game model by taking the server and the vehicle formation as persons in the office.
A signal-game model is built between the server and the vehicle formation, defined by its player set, type sets, signal set, strategy set, and reward set: N denotes the set of players; the type sets of the two parties comprise the server's type set, containing the single type θ_h, and the vehicle formation's type set, containing the positive type θ_p and the malicious type θ_s; Pr{θ = θ_p} = γ is the natural probability that a formation is of the positive type, and Pr{θ = θ_s} = 1 − γ the natural probability that it is malicious; M = {m_h, m_l} is the signal set; Π = {π_a, π_r} is the strategy set, π_a meaning the server accepts the formation's request and π_r that it rejects it; and U = {U_r, U_v} is the reward set, U_r being the server's payoff and U_v the vehicle formation's payoff.
The details are as follows:
the office set of people is denoted as n= { SER, PL }. The present embodiment defines the server SER and the vehicle formation PL as two persons in the game process, wherein the server is the receiver of the signal and the vehicle formation is the sender of the signal, the server decides its decision by receiving the signal of the vehicle formation.
The type set comprises the server's type set, {θ_h}, and the vehicle formation's type set, {θ_p, θ_s}, describing all possible types of both parties: the server has only one type, while a vehicle formation is either positive or malicious. The natural probabilities of the two formation types are Pr{θ = θ_p} = γ and Pr{θ = θ_s} = 1 − γ.
The signal set is denoted M = {m_h, m_l}. A signal is the information a vehicle formation sends to the server carrying its personal information and its expected reward; the signal set consists of two offers, m_h denoting the first (higher) reward-proposal signal and m_l the second (lower) one.
The strategy set is denoted Π = {π_a, π_r}. A strategy is the decision the server makes, according to its belief, upon receiving a vehicle formation's signal: it may accept or reject the request sent by the formation.
The reward set is denoted U = {U_r, U_v}. The symbol C denotes training overhead, a quantity that depends on the formation's type, namely C(θ_p) for a positive formation and C(θ_s) for a malicious one. Assuming a malicious formation has smaller training overhead than a positive one, C(θ_p) > C(θ_s). A malicious formation must do some masquerading to deceive the server when sending its signal; this embodiment defines the cost of masquerading as F. If a formation participates in training, it inevitably affects the whole model: the impact of accepting a trained local model from a malicious formation is quantified as E_s, and from a positive formation as E_p. If a malicious formation's local model is accepted into the global aggregation, the accuracy of the global model may drop, and it is costly for the server to revert to the previous globally aggregated model; this cost is defined as D.
The payoffs of the server and the vehicle formation under the various decision possibilities, where A is the formation's type, B the reward proposal the server receives, and C the server's decision, are shown in Table 1:

| A (formation type) | B (signal) | C (decision) | U_r (server) | U_v (formation) |
| --- | --- | --- | --- | --- |
| θ_p | m_h | π_a | E_p − m_h | m_h − C(θ_p) |
| θ_p | m_l | π_a | E_p − m_l | m_l − C(θ_p) |
| θ_p | m_h or m_l | π_r | 0 | 0 |
| θ_s | m_h | π_a | E_s − m_h − D | m_h − C(θ_p) − F |
| θ_s | m_h | π_r | 0 | −F |
| θ_s | m_l | π_a | E_s − m_l − D | m_l − C(θ_s) |
| θ_s | m_l | π_r | 0 | 0 |
For example, the second row of the table shows that when the vehicle formation is of the positive type θ_p, the server receives the first reward proposal signal m_h sent by the vehicle formation, and the server's decision is to accept the request sent by the vehicle formation (π_a), then the server's payoff is E_p - m_h and the vehicle formation's payoff is m_h - C(θ_p). The meaning of the other rows follows in the same way.
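As a concrete illustration of the table, the payoffs can be sketched as a small lookup. The function below is our own hypothetical reconstruction (all names are illustrative) that returns (U_r, U_v) for a formation type A, a reward proposal B, and a server decision C, following the table row by row.

```python
def payoffs(A, B, C, E_p, E_s, D, F, cost_p, cost_s, m_h, m_l):
    """Return (U_r, U_v) per the reward table.

    A: 'positive' or 'malicious' (formation type)
    B: proposed reward, m_h or m_l
    C: 'accept' (pi_a) or 'reject' (pi_r)
    cost_p, cost_s: training overheads C(theta_p), C(theta_s)
    """
    if C == "reject":
        # A rejected malicious high-signal sender still pays the masquerade cost F.
        U_v = -F if (A == "malicious" and B == m_h) else 0
        return 0, U_v
    if A == "positive":
        return E_p - B, B - cost_p
    # Malicious formation accepted: server also pays the rollback cost D.
    U_r = E_s - B - D
    # Masquerading with the high signal costs F (the table charges C(theta_p) + F).
    U_v = (B - cost_p - F) if B == m_h else (B - cost_s)
    return U_r, U_v
```

For instance, with E_p = 10 and m_h = 6, accepting a positive formation that proposed m_h gives the server E_p - m_h = 4, as in the second row of the table.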
S31: the game model is solved to obtain a first strategy of perfect Bayesian equilibrium.
A boundary value is defined for the high reward and a boundary value for the low reward. The server's prior belief is written μ = {μ_h, μ_l}, where the prior belief μ includes μ_h, the prior belief with which the server considers the vehicle formation to be of the positive type upon receiving m_h, and μ_l, the prior belief with which the server considers the vehicle formation to be of the positive type upon receiving m_l. This embodiment partitions all possible prior beliefs into four solution spaces, denoted D_1, D_2, D_3 and D_4.
In the third stage, the perfect Bayesian equilibria of the sub-games are derived for the vehicle formation under the different prior-belief solution spaces, and the separating and mixed equilibria over the four solution spaces are finally obtained from a comprehensive analysis of the equilibrium results. After solving, the game equilibria over the four solution spaces can be expressed as follows:
when the server is a prioriWhen in use; balancing of servers>。
Under the first limitation condition,/>,/>,/>The next step is to obtain a mixed equalization +.>The method comprises the steps of carrying out a first treatment on the surface of the Wherein the equilibrium solution of the vehicle formation ∈ ->The method comprises the steps of carrying out a first treatment on the surface of the At this time, the belief of the server is updated asλ={λ h ,λ l }, whereinλ h =γ,/>。
Under the second limitation condition,/>,/>Next, a separation balance is obtainedThe method comprises the steps of carrying out a first treatment on the surface of the Wherein the equilibrium solution of the vehicle formation ∈ ->The method comprises the steps of carrying out a first treatment on the surface of the At this time, the belief of the server is updated asλ={λ h ,λ l }, whereinλ h =1,λ l =0。
When the server is a prioriWhen in use; balancing of servers>。
In the third limitation condition,/>,/>,/>The next step is to obtain a mixed equalization +.>The method comprises the steps of carrying out a first treatment on the surface of the Wherein the equilibrium solution of the vehicle formation ∈ ->The method comprises the steps of carrying out a first treatment on the surface of the Meanwhile, the belief of the server is updated at the momentλ={λ h ,λ l }, whereinλ h =μ h ,/>。
In the fourth limitation condition,/>,/>The next result is a separation equalization->The method comprises the steps of carrying out a first treatment on the surface of the Wherein the equilibrium solution of the vehicle formation ∈ ->The method comprises the steps of carrying out a first treatment on the surface of the At this time, the belief of the server is updated asλ={λ h ,λ l }, whereinλ h =1,λ l =0。
When the server is a prioriWhen in use; balancing of servers>。
In the fifth limitation condition,/>,/>,/>The next step is to obtain a mixed equalization +.>The method comprises the steps of carrying out a first treatment on the surface of the Wherein the equilibrium solution of the vehicle formation ∈ ->The method comprises the steps of carrying out a first treatment on the surface of the At this time, the belief of the server is updated asλ={λ h ,λ l }, whereinλ h =μ h ,/>。
When the server is a prioriWhen in use; balancing of servers>。
In the sixth limiting condition,/>,/>,/>The next step is to obtain a mixed equalization +.>The method comprises the steps of carrying out a first treatment on the surface of the Wherein the equilibrium solution of the vehicle formation ∈ ->The method comprises the steps of carrying out a first treatment on the surface of the At this time, the belief of the server is updated asλ={λ h ,λ l }, whereinλ l =μ l ,/>。
In the seventh limitation,/>The next result is a separation equalization->The method comprises the steps of carrying out a first treatment on the surface of the Wherein the equilibrium solution of the vehicle formation ∈ ->The method comprises the steps of carrying out a first treatment on the surface of the At this time, the belief of the server is updated asλ={λ h ,λ l }, whereinλ h =1,λ l =0。
S32: the state of the vehicle formation is determined according to the first strategy.
As shown in Fig. 3, this step includes:
S320: determine the boundary value of the high reward and the boundary value of the low reward.

S321: determine the prior belief μ and, according to the two boundary values, determine the solution space to which the prior belief belongs.
In this embodiment, the prior belief μ can be obtained before the game starts by analyzing relevant information and experience, such as the server's knowledge of vehicle demand and market information. During the game, the participants update according to the observed information, i.e. the prior beliefs are updated, which affects the participants' strategy choices and decisions in the game.
S322: determine the equilibrium solution of the vehicle formation by comparing, in the corresponding solution space, m_h - m_l with F, m_h with E_s - D, m_l with E_s - D, m_l with C(θ_p) - F, m_l with C(θ_p), m_h with C(θ_p) + F, m_l with E_p, m_h with E_p, γ with μ_h, and γ with μ_l.
If the prior belief lies in the first solution space, the server's equilibrium is obtained; when the first constraint is met, the corresponding equilibrium solution of the vehicle formation is determined, and when the second constraint is met, the corresponding equilibrium solution of the vehicle formation is determined.

If the prior belief lies in the second solution space, the server's equilibrium is obtained; when the third constraint is met, the corresponding equilibrium solution of the vehicle formation is determined, and when the fourth constraint is met, the corresponding equilibrium solution of the vehicle formation is determined.

If the prior belief lies in the third solution space, the server's equilibrium is obtained; when the fifth constraint is met, the corresponding equilibrium solution of the vehicle formation is determined.

If the prior belief lies in the fourth solution space, the server's equilibrium is obtained; when the sixth constraint is met, the corresponding equilibrium solution of the vehicle formation is determined, and when the seventh constraint is met, the corresponding equilibrium solution of the vehicle formation is determined.
Further, the prior belief is corrected based on the game equilibrium, as follows:
If the prior belief lies in the first solution space: when the first constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = μ_h; when the second constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = 1, λ_l = 0.

If the prior belief lies in the second solution space: when the third constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = μ_h; when the fourth constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = 1, λ_l = 0.

If the prior belief lies in the third solution space: when the fifth constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = μ_h.

If the prior belief lies in the fourth solution space: when the sixth constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_l = μ_l; when the seventh constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = 1, λ_l = 0.
The equilibrium solution is the solution that is optimal for both the server and the vehicle formation under the different constraints. This embodiment assumes that a vehicle formation will not make a deceptive decision if the revenue obtained by deceiving the server may be less than the revenue obtained without deception. Solving for equilibrium yields the decisions most favorable to both parties under the various conditions and payoffs. For example, when the sixth constraint is satisfied, the server's belief is updated so that λ_l = μ_l, meaning that upon receiving the signal m_l the server believes with probability μ_l that the vehicle formation is of the positive type.
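The belief updates above are instances of Bayes' rule in a signaling game. The closed-form expressions are rendered as formula images in the source, so the sketch below shows only the standard textbook update (an assumed form, not the patent's exact expression); it does reproduce the separating-equilibrium case λ_h = 1, λ_l = 0 stated in the text.

```python
def posterior_positive(gamma, p_signal_pos, p_signal_mal):
    """Bayes posterior that the sender is the positive type, given a signal.

    gamma: prior probability that the formation is positive
    p_signal_pos / p_signal_mal: probability that each type sends this signal
    """
    num = gamma * p_signal_pos
    den = num + (1.0 - gamma) * p_signal_mal
    return num / den if den > 0 else gamma  # off-path signal: keep the prior

# Separating equilibrium: positive always sends m_h, malicious always sends m_l.
lam_h = posterior_positive(0.7, 1.0, 0.0)  # belief after observing m_h
lam_l = posterior_positive(0.7, 0.0, 1.0)  # belief after observing m_l
```

In the separating case this yields λ_h = 1 and λ_l = 0, matching the updates under the second, fourth and seventh constraints.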
S4: if the server judges that the vehicle formation is in a positive state, the global model is sent to the vehicle formation.
The server judges whether the state of the vehicle formation is positive or malicious through the perfect Bayesian equilibria over the four solution spaces. If the vehicle formation is positive, the server sends the global model of the current round to the vehicle formation; if malicious, the server ignores the vehicle formation and continues to wait for participation requests from other vehicle formations.
S5: the server receives a local model of the vehicle formation trained using the local data set, and aggregates the global model in an asynchronous manner based on the local model.
Stochastic Gradient Descent (SGD) can be used for training in federal learning, with mini-batches used to control the number of training samples so that training neither overfits nor diverges. Let the loss function be given, let f denote the learning rate, and let the target global model be w; the optimization objective of vehicle formation p can be expressed as:
where D_p represents the combined data set of the vehicles within the formation.
First, the vehicle formation trains using its local data set. Assume the received global model is w_{r-1}, a strongly convex function and a local loss function are used, and ρ is a hyper-parameter; the training can be expressed as:
where w_{r-1} represents the server's global model as received by the vehicle formation, the strongly convex function and the local loss function are as above, and ρ is a hyper-parameter.
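A common concrete reading of this local objective (an assumption on our part, in the style of proximal federated training) is the local loss plus a penalty (ρ/2)·||w - w_{r-1}||² that keeps the local model near the received global model, minimized by mini-batch SGD. A minimal sketch on a least-squares loss:

```python
import numpy as np

def local_train(w_global, X, y, rho=0.1, lr=0.01, epochs=10, batch=32, seed=0):
    """Mini-batch SGD on a squared-error local loss plus the proximal term
    (rho/2) * ||w - w_global||^2 pulling the local model toward the received
    global model w_{r-1} (illustrative least-squares setting)."""
    rng = np.random.default_rng(seed)
    w = w_global.copy()
    n = len(X)
    for _ in range(epochs):
        order = rng.permutation(n)          # reshuffle each epoch
        for s in range(0, n, batch):
            b = order[s:s + batch]
            residual = X[b] @ w - y[b]      # local-loss part of the gradient
            grad = X[b].T @ residual / len(b) + rho * (w - w_global)
            w -= lr * grad
    return w
```

The returned w plays the role of the local model that the formation uploads to the server.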
Then, the vehicle formation completes training to obtain a local model and uploads the local model to the server.
In the federal learning process, the server aggregates the global model in an asynchronous manner: as soon as the local model sent by any vehicle formation is received, the global model is aggregated immediately. Assume the aggregation round is r, the global model on the server at this time is w_r, the global model of the server's previous round is w_{r-1}, the received local model is the one sent by the vehicle formation, and the aggregation weight is α; the global aggregation formula is:
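The exact aggregation formula is given as an image in the source; the standard asynchronous update it describes is presumably a convex combination of the previous global model and the newly arrived local model with weight α. A minimal sketch under that assumption:

```python
def aggregate(w_prev, w_local, alpha=0.5):
    """Asynchronous aggregation (assumed form): mix the previous global
    model w_{r-1} with a newly arrived local model using weight alpha,
    without waiting for uploads from other formations."""
    return [(1 - alpha) * g + alpha * l for g, l in zip(w_prev, w_local)]
```

The server applies this as soon as any single formation's upload arrives, which is what makes the scheme asynchronous.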
the training data quantity is set asWhereinkTotal number of training data tags representing the machine learning model trained with federal learning, and +.>Representative modelwIs labeled among training data of (a)kIs a data count of (a). For two modelsw a Andw b Defining model similarity function SIM #w a ,w b ) The method comprises the following steps:
where n_{w_a}^k denotes the count of data with label k in the training data of model w_a, and n_{w_b}^k denotes the count of data with label k in the training data of model w_b.
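The defining expression for SIM is a formula image in the source; a natural similarity over the per-label counts n_w^k, and the assumption used in this sketch, is the cosine similarity of the two label-count vectors:

```python
import math

def sim(counts_a, counts_b):
    """Cosine similarity between the label-count vectors of two models'
    training data, an assumed form of SIM(w_a, w_b).

    counts_a, counts_b: dicts mapping label k -> data count n_w^k.
    """
    labels = set(counts_a) | set(counts_b)
    dot = sum(counts_a.get(k, 0) * counts_b.get(k, 0) for k in labels)
    na = math.sqrt(sum(v * v for v in counts_a.values()))
    nb = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Identical label distributions score 1 and disjoint ones score 0, so the score can temper the weight given to a local model trained on very different data.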
Let z be a hyper-parameter; the complete global aggregation formula of the server is:
where SIM(w_{r-1}, ·) denotes the similarity function between model w_{r-1} and the received local model.
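One plausible reading of the complete formula (an assumption on our part; the source gives it only as an image) is that the mixing weight α is scaled by the similarity raised to the hyper-parameter z, so that dissimilar local models move the global model less:

```python
def aggregate_with_sim(w_prev, w_local, alpha, similarity, z=1.0):
    """Similarity-adjusted asynchronous aggregation (assumed form):
    scale the mixing weight by SIM(w_{r-1}, w_local) ** z before mixing."""
    a = alpha * (similarity ** z)
    return [(1 - a) * g + a * l for g, l in zip(w_prev, w_local)]
```

With similarity 1 this reduces to the plain asynchronous update; with similarity 0 the global model is left unchanged.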
The updated global model is taken as the new global model, and S1 to S5 are repeated until the global model converges or reaches the target accuracy.
Compared with the prior art, this embodiment provides a vehicle formation asynchronous federal learning method based on signal games. By building an asynchronous federal learning framework based on vehicle formations, each vehicle formation becomes a federal learning client that provides diverse and sufficient data through secure data sharing within the formation while pooling computing resources to reduce training overhead. Asynchronous optimization is then performed: the server aggregates the global model immediately after each formation uploads its trained model. Further, the embodiment establishes a first strategy based on the signal game to judge participants. In this strategy the server is the receiver of the signal and the vehicle formation is the sender; participants can be selected under incomplete information, the server can obtain the optimal signals of vehicle formations under different conditions through perfect Bayesian equilibrium while protecting user privacy, and the server can then accurately judge whether a vehicle formation is positive or malicious through signal identification. Asynchronous federal learning combined with the signal-game-based first strategy improves model training quality: each time the server receives a participation signal sent by a vehicle formation, it judges the nature of the formation according to the selection strategy, preventing malicious attacks during the federal learning process from destroying the learning result.
Example 2
A computer-readable storage medium having stored thereon a signal-game-based vehicle formation asynchronous federal learning program which, when executed by a processor, implements the steps of the signal-game-based vehicle formation asynchronous federal learning method of Embodiment 1.
The foregoing description of the embodiments illustrates the general principles of the invention and is not intended to limit the invention to the particular embodiments or to limit its scope; any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (10)
1. A signal game based asynchronous federal learning method for vehicle formation, comprising:
grouping vehicles according to the position information to obtain a vehicle formation;
the method comprises the steps that a server sends a learning task to the Internet of vehicles equipment and receives participation signals sent by vehicle formation;
the server identifies the participation signals based on a first strategy of signal gaming to determine whether the vehicle formation is in a positive state, including: constructing a game model with the server and the vehicle formation as players; solving the game model to obtain a first strategy of perfect Bayesian equilibrium, the first strategy comprising game equilibria over four solution spaces; and determining the state of the vehicle formation according to the first strategy, including: determining the boundary value of the high reward and the boundary value of the low reward, wherein m_h represents the received first reward proposal signal, m_l represents the received second reward proposal signal, the first reward proposal signal being higher than the second reward proposal signal, D represents the cost of recovering to the previous global aggregation model after the local model of a malicious vehicle formation has been received, E_s represents the impact of receiving the trained local model of a malicious vehicle formation, and E_p represents the impact of the server receiving the trained local model of a positive vehicle formation; determining the prior belief μ and, according to the boundary value of the high reward and the boundary value of the low reward, the solution space to which the prior belief belongs, the prior belief μ including μ_h, the prior belief with which the server considers the vehicle formation to be of the positive type upon receiving m_h, and μ_l, the prior belief with which the server considers the vehicle formation to be of the positive type upon receiving m_l; and determining the equilibrium solution of the vehicle formation by comparing, in the corresponding solution space, m_h - m_l with F, m_h with E_s - D, m_l with E_s - D, m_l with C(θ_p) - F, m_l with C(θ_p), m_h with C(θ_p) + F, m_l with E_p, m_h with E_p, γ with μ_h, and γ with μ_l, wherein F represents the masquerading cost of a malicious vehicle formation, γ represents the probability that the vehicle formation belongs to the positive type, and C(θ_p) represents the training overhead of a positive vehicle formation;
if the server judges that the vehicle formation is in a positive state, the global model is sent to the vehicle formation;
the server receives a local model of the vehicle formation trained using the local data set, and aggregates the global model in an asynchronous manner based on the local model.
2. The signal-game-based vehicle formation asynchronous federal learning method according to claim 1, wherein constructing a game model with the server and the vehicle formation as players comprises:
building a signal game model between the server and the vehicle formation; wherein N represents the set of players; the type set of both game parties comprises the type set of the server and the type set of the vehicle formation, θ_h representing the type of the server, θ_p representing the positive type of vehicle formation, and θ_s representing the malicious type of vehicle formation; Pr{θ=θ_p} = γ represents the natural probability that the formation belongs to the positive type, Pr{θ=θ_s} represents the natural probability that the formation belongs to the malicious type, and 1 - γ represents the probability that the formation belongs to the malicious type; M = {m_h, m_l}, where M represents the signal set; Π = {π_a, π_r}, where Π represents the strategy set, π_a indicating that the server accepts the request of the vehicle formation and π_r indicating that the server rejects the request of the vehicle formation; U represents the reward set, U = {U_r, U_v}, U_r representing the server's payoff and U_v representing the vehicle formation's payoff.
3. The signal-game-based vehicle formation asynchronous federal learning method according to claim 2, wherein the reward set comprises: when A = θ_p, B = m_h, C = π_a: U_r = E_p - m_h, U_v = m_h - C(θ_p); when A = θ_p, B = m_l, C = π_a: U_r = E_p - m_l, U_v = m_l - C(θ_p); when A = θ_p, B = m_h || m_l, C = π_r: U_r = 0, U_v = 0; when A = θ_s, B = m_h, C = π_a: U_r = E_s - m_h - D, U_v = m_h - C(θ_p) - F; when A = θ_s, B = m_h, C = π_r: U_r = 0, U_v = -F; when A = θ_s, B = m_l, C = π_a: U_r = E_s - m_l - D, U_v = m_l - C(θ_s); when A = θ_s, B = m_l, C = π_r: U_r = 0, U_v = 0; wherein C(θ_s) represents the training overhead of a malicious vehicle formation, A is the type of the vehicle formation, B is the reward proposal received by the server from the vehicle formation, and C is the server's decision.
4. The signal-game-based vehicle formation asynchronous federal learning method according to claim 3, wherein the solution space to which the prior belief belongs is determined according to the boundary value of the high reward and the boundary value of the low reward, wherein D_1 represents the first solution space, D_2 represents the second solution space, D_3 represents the third solution space, and D_4 represents the fourth solution space.
5. The signal-game-based vehicle formation asynchronous federal learning method according to claim 4, wherein the first strategy comprises game equilibria based on the four solution spaces, including:

if the prior belief lies in the first solution space, the server's equilibrium is obtained;
judging whether the first constraint is satisfied, and if so, determining the equilibrium solution of the vehicle formation;
judging whether the second constraint is satisfied, and if so, determining the equilibrium solution of the vehicle formation;

if the prior belief lies in the second solution space, the server's equilibrium is obtained;
judging whether the third constraint is satisfied, and if so, determining the equilibrium solution of the vehicle formation;
judging whether the fourth constraint is satisfied, and if so, determining the equilibrium solution of the vehicle formation;

if the prior belief lies in the third solution space, the server's equilibrium is obtained;
judging whether the fifth constraint is satisfied, and if so, determining the equilibrium solution of the vehicle formation;

if the prior belief lies in the fourth solution space, the server's equilibrium is obtained;
judging whether the sixth constraint is satisfied, and if so, determining the equilibrium solution of the vehicle formation;
judging whether the seventh constraint is satisfied, and if so, determining the equilibrium solution of the vehicle formation.
6. The signal-game-based vehicle formation asynchronous federal learning method of claim 5, wherein correcting the prior belief μ based on the game equilibrium comprises:

if the prior belief lies in the first solution space: when the first constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = μ_h; when the second constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = 1, λ_l = 0;

if the prior belief lies in the second solution space: when the third constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = μ_h; when the fourth constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = 1, λ_l = 0;

if the prior belief lies in the third solution space: when the fifth constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = μ_h;

if the prior belief lies in the fourth solution space: when the sixth constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_l = μ_l; when the seventh constraint is satisfied, the server updates the prior belief to λ = {λ_h, λ_l}, where λ_h = 1, λ_l = 0.
7. The signal-game-based vehicle formation asynchronous federal learning method of claim 4, wherein the local model is obtained from the training formula, wherein w_{r-1} represents the global model of the previous round received from the server by the vehicle formation, a strongly convex function and a local loss function are used, ρ is a hyper-parameter, and D_p represents the combined data set of the vehicles within the formation.
8. The signal-game-based vehicle formation asynchronous federal learning method of claim 7, wherein the server receives the local model whose training the vehicle formation completed using the local data set and aggregates the global model based on the local model in an asynchronous manner, wherein r represents the aggregation round, w_r represents the global model of the current round on the server, w_{r-1} represents the global model of the previous round on the server, SIM(w_{r-1}, ·) represents the similarity function between model w_{r-1} and the received local model, and z represents a hyper-parameter.
9. The signal-game-based vehicle formation asynchronous federal learning method of claim 8, wherein the vehicle formation trains the global model with the local data set to obtain the local model, comprising: the vehicle formation finds a locally optimal solution using stochastic gradient descent, and controls the number of training samples based on a mini-batch gradient descent method.
10. The method of claim 1, wherein if the server determines that the vehicle formation is not in a positive state, the server ignores the vehicle formation and waits for participation signals of other vehicle formations.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311619800.7A CN117313902B (en) | 2023-11-30 | 2023-11-30 | Signal game-based vehicle formation asynchronous federal learning method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117313902A true CN117313902A (en) | 2023-12-29 |
CN117313902B CN117313902B (en) | 2024-02-06 |
Family
ID=89281621
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110214178A1 (en) * | 2009-08-31 | 2011-09-01 | Telcordia Technologies, Inc. | System and Method for Detecting and Evicting Malicious Vehicles in a Vehicle Communications Network |
CN107181793A (en) * | 2017-04-27 | 2017-09-19 | 长安大学 | The transportation service information forwarding mechanism discussed based on dynamic game |
US20180018605A1 (en) * | 2016-07-15 | 2018-01-18 | Cummins Inc. | Method and apparatus for platooning of vehicles |
CN111798002A (en) * | 2020-05-31 | 2020-10-20 | 北京科技大学 | Local model proportion controllable federated learning global model aggregation method |
CN112742028A (en) * | 2021-01-22 | 2021-05-04 | 中国人民解放军国防科技大学 | Formation decision method, system, medium and equipment for fighting game |
CN115099417A (en) * | 2022-06-28 | 2022-09-23 | 贵州大学 | Multi-factor federal learning incentive mechanism based on Starkeberg game |
Non-Patent Citations (2)
Title |
---|
Liu Runkun, Yu Haiyang: "Control strategy for connected autonomous vehicle platoons under intermittent communication failure", Journal of Transportation Systems Engineering and Information Technology *
Dong Wenyuan; Zhu Yan; Wang Yonghong; Zhang Guanghua: "Malicious vehicle node detection mechanism based on repeated games in the Internet of Vehicles environment", Application Research of Computers, no. 05 *