CN115909746A - Traffic flow prediction method, system and medium based on federated learning


Publication number: CN115909746A
Authority: CN (China)
Prior art keywords: model, client, local, server, parameters
Legal status: Pending
Application number: CN202310006261.3A
Priority/filing date: 2023-01-04
Publication date: 2023-04-04
Other languages: Chinese (zh)
Inventors: 鲁鸣鸣 (Lu Mingming), 何文勇 (He Wenyong)
Assignee (original and current): Central South University

Classifications

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a traffic flow prediction method, system, and medium based on federated learning. The method comprises the following steps: set the hyper-parameters of the server model and initialize it to obtain a global model; the server distributes the global model to the clients, yielding the local models; each client updates its local model using its local traffic flow data set; the correlation between each local model update and the global model update is calculated, and the clients are screened according to this correlation; the screened clients send their local model parameters to the server; the server aggregates the received local model parameters to complete the global model update; these steps are repeated until the models converge; finally, each client predicts traffic flow using its converged local model. The invention avoids uploading ineffective parameters and reduces the communication overhead of the federated learning training process.

Description

Traffic flow prediction method, system and medium based on federated learning
Technical Field
The invention belongs to the technical field of intelligent transportation, and particularly relates to a traffic flow prediction method, system, and medium based on federated learning.
Background
With accelerating urbanization, population density keeps rising, and as the number of private cars in cities continues to grow, residents' demand for public transportation services has increased rapidly. On the one hand, the large volume of automobile exhaust rapidly degrades the urban environment; on the other hand, dense vehicle travel worsens traffic congestion. These problems not only consume substantial manpower and financial resources but also seriously affect people's travel experience. How to relieve traffic pressure and improve urban travel efficiency is therefore an urgent research problem. In recent years, Intelligent Transportation Systems (ITS) [1] have been able to prevent some traffic congestion and accidents in time by providing rational traffic management decisions, so their research and development is receiving growing attention from researchers. Traffic flow prediction is an important research area within intelligent transportation systems [2]: it can effectively characterize the real-time traffic conditions of roads and capture how road traffic flow changes over time, and is therefore widely used across transportation applications.
At present, deep learning performs well on a range of traffic prediction problems. Existing traffic flow prediction methods mainly focus on training and prediction in the cloud with machine learning models over traffic data collected by sensors; this neither fully utilizes the computing power of edge sensor devices nor avoids the risk of data leakage during transmission. In addition, as the volume of traffic data grows explosively while GPU/CPU computing power grows relatively slowly, cloud servers fall far short of the computing requirements of real scenarios [3]. In recent years, as the computing and storage capabilities of terminal devices have greatly improved, researchers have begun to push part of the services and computation down to terminal devices, proposing a completely new solution that combines Federated Learning (FL) [4]. This scheme converts the centralized training mode into a distributed mode of cooperative training across terminal devices, effectively alleviating the above problems. However, current federated-learning-based traffic flow prediction work either uses models that are too simple and lack representation capacity [5], leading to poor prediction performance, or uses overly complex algorithm designs [6], leading to huge communication overhead between the edge and the computing end.
Disclosure of Invention
In order to solve the problems of insufficient performance and huge communication overhead in existing federated-learning-based traffic flow prediction methods, the invention provides a communication-lighter traffic flow prediction method, system, and medium based on federated learning, which avoid uploading ineffective parameters and reduce the communication overhead of the federated learning training process.
In order to achieve the technical purpose, the invention adopts the following technical scheme:
A traffic flow prediction method based on federated learning comprises the following steps:
step 1, setting the hyper-parameters of a server model and initializing the basic model structure to obtain global model parameters $w_G^r$; the parameter subscript $G$ denotes the global model, the superscript $r$ denotes the iteration round, and $r$ is initialized as $r = 0$;
Step 2, the server sends the global model parameters
Figure 196957DEST_PATH_IMAGE001
Distributing the parameters to each client to obtain the local model parameters of each client>
Figure 542488DEST_PATH_IMAGE004
(ii) a In the parameter subscript>
Figure 161688DEST_PATH_IMAGE005
Respectively denote>
Figure 959879DEST_PATH_IMAGE006
A client-side local model;
step 3, training the local models respectively with the local traffic flow data sets of the clients, completing the $(r+1)$-th round of local model updating to obtain the local model parameters $w_k^{r+1}$;
Step 4, calculating the correlation between local model update of each client and global model update of the server, and screening the clients to be uploaded with local model parameters to the server in the current turn according to the correlation;
step 5, sending, by each screened client, its local model parameters to the server;
step 6, aggregating, by the server, the received local model parameters, i.e., completing the $(r+1)$-th round of global model updating to obtain the global model parameters $w_G^{r+1}$;
Step 7, updating
Figure 640260DEST_PATH_IMAGE010
Repeating the steps 2 to 6 until the global model and each local model are converged;
step 8, predicting, by each client, its local traffic flow using its local model.
Further, the hyper-parameters set in step 1 include the learning rate $\eta$, the number of communication rounds $R$, and the number of local training epochs $E$.
Further, the basic structure of the model adopts an Encoder-Decoder architecture: the Encoder module uses a gated recurrent neural network to capture the contextual information in the input traffic flow time series, i.e., it converts the latent temporal dynamics of the input series into an intermediate hidden vector $h$; the Decoder module uses a gated recurrent neural network and a fully-connected network to produce the prediction.
Further, step 4 comprises:
(1) Calculate the $(r+1)$-th round update of each client's local model:
$$u_k^{r+1} = w_k^{r+1} - w_k^r$$
where $u_k^{r+1}$ denotes the update of the $k$-th client's local model at round $r+1$, and $w_k^{r+1}$ and $w_k^r$ denote the parameters of the $k$-th client's local model at rounds $r+1$ and $r$, respectively.
(2) Calculate the $(r+1)$-th round update of the server's global model, approximating it by the $r$-th round update:
$$u_G^{r+1} \approx u_G^r = w_G^r - w_G^{r-1}$$
where $u_G^r$ denotes the server's global model update at round $r$.
(3) Calculate the correlation between each client's local model update and the global model update, measured by the fraction of parameters in $u_k^{r+1}$ and $u_G^r$ that have the same sign:
$$e_k = \frac{1}{M} \sum_{i=1}^{M} \mathbb{1}\big[\operatorname{sgn}(u_{k,i}^{r+1}) = \operatorname{sgn}(u_{G,i}^{r})\big]$$
where $e_k$ denotes the correlation between the local and global model updates of client $k$, $i$ indexes the model parameters, $M$ denotes the total number of model parameters, $u_{k,i}^{r+1}$ denotes the update value of the $i$-th parameter of local model $k$ in $u_k^{r+1}$, $u_{G,i}^{r}$ denotes the update value of the $i$-th parameter of the global model $G$ in $u_G^{r}$, $\operatorname{sgn}$ denotes the sign function, and $\mathbb{1}[\cdot]$ denotes the indicator function.
(4) Screen all clients: if the correlation between a client's local model update and the global model update is greater than a set correlation threshold, that client's local model qualifies for uploading to the server.
Further, the server aggregates the received local model parameters using the FedAvg algorithm.
A traffic flow prediction system based on federated learning comprises one server and $K$ clients; the server and each client comprise a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processors of the server and the clients to cooperate to implement the federated-learning-based traffic flow prediction method of any of the above technical solutions.
A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the federated-learning-based traffic flow prediction method of any of the above technical solutions.
Advantageous effects
Compared with the prior art, the invention has the beneficial effects that:
the method of the invention filters the client model parameters which are effective for updating the global model according to the correlation by calculating the correlation between the local model update of each client and the global model update of the server, avoids the uploading of ineffective parameters, and greatly reduces the communication overhead of the whole system under the condition of ensuring equivalent precision. In addition, compared with the traditional centralized method, the method not only makes full use of the computing power of the terminal equipment, but also ensures the data privacy of each equipment.
Drawings
FIG. 1 is a flow chart of an embodiment of the method of the present invention;
FIG. 2 is a schematic structural diagram of a global model in the method of the present invention;
FIG. 3 is a schematic diagram of the overall framework of the method of the present invention;
FIG. 4 is a graph comparing the results of the method of the present invention and other algorithms in terms of prediction accuracy;
FIG. 5 is a graph comparing the results of the method of the present invention and other algorithms in terms of communication overhead.
Detailed Description
In order to make the technical solutions in the embodiments of the present invention better understood and make the above objects, features and advantages of the present invention more obvious and understandable, the technical solutions of the present invention are further described in detail below with reference to the accompanying drawings:
the flow chart of the specific embodiment of the method of the invention is shown in fig. 1, and the process is as follows:
Step 1, setting the hyper-parameters of the server model, such as the learning rate $\eta$, the number of communication rounds $R$, and the number of local training epochs $E$, and initializing the basic model structure to obtain the global model parameters $w_G^r$; the parameter subscript $G$ denotes the global model, the superscript $r$ denotes the iteration round, and initializing $r = 0$ gives the initialized global model $w_G^0$.
In order to better capture the dynamically changing temporal patterns in traffic flow data, the method adopts an Encoder-Decoder architecture as the basic structure of the global model, as shown in FIG. 2. The Encoder module uses a gated recurrent unit network (GRU) to capture contextual information along the time series, converting the latent temporal dynamics of the input series into an intermediate hidden vector $h$. The Decoder module uses a GRU and a fully-connected network (FNN) to produce the prediction. Initializing the global model $w_G^0$ ($r = 0$, where $r$ is the iteration counter) prepares for the subsequent model distribution.
Step 2, the server sends the global model parameters
Figure 460153DEST_PATH_IMAGE001
Distributing the parameters to each client to obtain the local model parameters of each client>
Figure 778001DEST_PATH_IMAGE004
(ii) a In parameter subscript>
Figure 48622DEST_PATH_IMAGE005
Respectively denote->
Figure 830633DEST_PATH_IMAGE006
And (4) local models of the clients.
Flow (1) in FIG. 3 represents the process in which the server distributes parameters to the respective clients.
Step 3, training local models respectively by using local traffic flow data sets of all clients to finish the first local model
Figure 671550DEST_PATH_IMAGE007
Updating the wheel to obtain the parameters of each local model as->
Figure 476695DEST_PATH_IMAGE008
Flow (2) in FIG. 3 represents the client-side local model training process.
First, the method of the invention uses $X = (x_1, x_2, \ldots, x_n)$ to denote an input traffic flow time series, where $n$ denotes the length of the series, $x_t \in \mathbb{R}^d$ denotes the value of the series at time step $t$, and $d$ denotes the feature dimension of the traffic flow data at each time point. After reading the whole sequence $X$, the Encoder outputs an intermediate hidden state vector $h$ representing the latent dynamic temporal pattern of the series. In client $k$, the calculation proceeds as:
$$h_t = \operatorname{GRU}(x_t, h_{t-1}), \quad t = 1, \ldots, n, \qquad h = h_n \qquad (1)$$
where $h_0$ denotes the initial hidden state vector.
Next, the method of this embodiment uses $\hat{Y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_m)$ to denote the time series predicted by the model, where $m$ denotes the length of the predicted sequence. A characteristic of this module is that the input of each time step is the output of the previous time step; note that the input at the first time step is the intermediate hidden vector $h$ output by the Encoder together with the last value $x_n$ of the input sequence. In client $k$, the calculation proceeds as:
$$s_t = \operatorname{GRU}(\hat{y}_{t-1}, s_{t-1}), \quad \hat{y}_t = \operatorname{FNN}(s_t), \quad t = 1, \ldots, m, \quad s_0 = h, \ \hat{y}_0 = x_n \qquad (2)$$
Finally, the method of this embodiment updates the parameters by gradient descent:
$$w_k \leftarrow w_k - \eta \nabla_{w_k} \mathcal{L}(w_k) \qquad (3)$$
where $\eta$ denotes the learning rate and $\nabla$ denotes the gradient operator.
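The local training of step 3 can be sketched as below; the MSE loss, the plain SGD optimizer, and the data loader interface are assumptions chosen to match the gradient descent of equation (3), not details fixed by the patent.

```python
import torch

def local_train(model, loader, epochs: int, lr: float):
    """Run E local epochs of gradient descent on a client's traffic data (sketch)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)  # plain gradient descent, eq. (3)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for x, y in loader:        # x: (batch, n, d) inputs, y: (batch, m, d) targets
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return {k: v.detach().clone() for k, v in model.state_dict().items()}
```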
Step 4, calculating the correlation between each client's local model update and the server's global model update, and screening, according to the correlation, the clients that will upload their local model parameters to the server in the current round.
Flow (3) in FIG. 3 represents the calculation of the correlation between the clients' local model updates and the global model update; the method of this embodiment divides this calculation into the following four key processes:
(1) Calculate the $(r+1)$-th round update of each client's local model:
$$u_k^{r+1} = w_k^{r+1} - w_k^r \qquad (4)$$
where $u_k^{r+1}$ denotes the update of the $k$-th client's local model at round $r+1$, and $w_k^{r+1}$ and $w_k^r$ denote the parameters of the $k$-th client's local model at rounds $r+1$ and $r$, respectively.
(2) Calculate the $(r+1)$-th round update of the server's global model, approximating it by the $r$-th round update:
$$u_G^{r+1} \approx u_G^r = w_G^r - w_G^{r-1} \qquad (5)$$
where $u_G^r$ denotes the server's global model update at round $r$.
(3) Calculate the correlation between each client's local model update and the global model update, measured by the fraction of parameters in $u_k^{r+1}$ and $u_G^r$ that have the same sign:
$$e_k = \frac{1}{M} \sum_{i=1}^{M} \mathbb{1}\big[\operatorname{sgn}(u_{k,i}^{r+1}) = \operatorname{sgn}(u_{G,i}^{r})\big] \qquad (6)$$
$$\mathbb{1}[a = b] = \begin{cases} 1, & a = b \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$
where $e_k$ denotes the correlation between the local and global model updates of client $k$, $i$ indexes the model parameters, $M$ denotes the total number of model parameters, $u_{k,i}^{r+1}$ denotes the update value of the $i$-th parameter of local model $k$ in $u_k^{r+1}$, $u_{G,i}^{r}$ denotes the update value of the $i$-th parameter of the global model $G$ in $u_G^{r}$, $\operatorname{sgn}$ denotes the sign function, and $\mathbb{1}[\cdot]$ denotes the indicator function.
(4) Screen all clients: if the correlation between a client's local model update and the global model update is greater than a set correlation threshold, that client's local model qualifies for uploading to the server.
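The relevance check and client screening of these four processes can be sketched as follows, in the spirit of the CMFL criterion cited in the non-patent references; flattening the parameter dictionaries into one vector and the example threshold of 0.5 are illustrative assumptions.

```python
import torch

def update_relevance(local_update, global_update):
    """Fraction of parameters whose update signs agree, eqs. (6)-(7) (sketch).
    Both arguments are dicts of update tensors keyed like a state_dict."""
    u_k = torch.cat([v.flatten() for v in local_update.values()])
    u_g = torch.cat([v.flatten() for v in global_update.values()])
    return (torch.sign(u_k) == torch.sign(u_g)).float().mean().item()

def screen_clients(local_updates, global_update, threshold=0.5):
    """Keep only the clients whose relevance exceeds the set threshold, process (4)."""
    return [k for k, upd in local_updates.items()
            if update_relevance(upd, global_update) > threshold]
```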
Step 5, each screened client sends its local model parameters to the server.
Flow (4) in FIG. 3 represents the process of each client sending its local model parameters to the server.
Step 6, the server uses a FedAVG algorithm to converge the received local model parameters, namely the global model is the first
Figure 814791DEST_PATH_IMAGE007
Updating the wheel to obtain a global model parameter of>
Figure 793111DEST_PATH_IMAGE019
Flow (5) in FIG. 3 represents the process of the server aggregating the parameters.
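A minimal sketch of the FedAvg aggregation follows; equal weighting of the screened clients is assumed here, whereas FedAvg in general weights each client by its local sample count.

```python
import torch

def fedavg(client_params):
    """Average the parameter dicts uploaded by the screened clients (sketch)."""
    keys = client_params[0].keys()
    return {k: torch.stack([p[k].float() for p in client_params]).mean(dim=0)
            for k in keys}
```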
Step 7, updating
Figure 982784DEST_PATH_IMAGE010
And repeating the steps 2 to 6 until the global model and each local model converge.
Step 8, each client predicts its local traffic flow using its local model.
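Putting steps 2 to 8 together, the overall loop can be sketched as below using the helper sketches above (local_train, update_relevance, fedavg); client sampling, convergence testing, and the first-round handling (where no previous global update exists, so every client uploads) are simplifying assumptions.

```python
def train_federated(model, client_loaders, rounds, epochs, lr, threshold=0.5):
    """Sketch of the full loop: distribute, train locally, screen, upload, aggregate."""
    clone = lambda sd: {k: v.detach().clone() for k, v in sd.items()}
    w_g = clone(model.state_dict())
    prev_w_g = clone(w_g)                             # no earlier round: u_G starts at zero
    for r in range(rounds):
        u_g = {k: w_g[k] - prev_w_g[k] for k in w_g}  # global update u_G^r, eq. (5)
        uploads = []
        for loader in client_loaders:                 # steps 2-3: distribute and train
            model.load_state_dict(w_g)
            w_k = local_train(model, loader, epochs, lr)
            u_k = {k: w_k[k] - w_g[k] for k in w_g}   # local update u_k^{r+1}, eq. (4)
            if r == 0 or update_relevance(u_k, u_g) > threshold:
                uploads.append(w_k)                   # steps 4-5: screen and upload
        if uploads:
            prev_w_g, w_g = w_g, fedavg(uploads)      # steps 6-7: aggregate, next round
    model.load_state_dict(w_g)                        # step 8: predict with the final model
    return model
```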
In the traffic flow prediction experiments of this embodiment, the main evaluation metrics are the mean absolute error (MAE), the root mean square error (RMSE), and the mean absolute percentage error (MAPE). The data sets used are the METR-LA and PEMS-BAY traffic flow data sets. METR-LA contains highway traffic speed readings from 207 sensors in Los Angeles County over 4 months, with each sensor sampled every five minutes, for a total of 34,249 data samples; PEMS-BAY contains readings from 325 sensors monitoring Bay Area expressway traffic speed over 6 months, for a total of 52,093 data samples.
FIG. 4(a) compares the prediction performance of the method of this embodiment (CM-FedSeq2Seq) with other methods on the METR-LA data set. First, compared with several classical centralized methods, CM-FedSeq2Seq obtains better results on multiple evaluation metrics; in particular, it reduces the MAE and MAPE errors by 38.4% and 44.0%, respectively, compared with ARIMA. Furthermore, compared with existing federated learning methods, CM-FedSeq2Seq also clearly outperforms the two most recent methods, FedGRU and CNFGNN. In particular, compared with FedGRU, CM-FedSeq2Seq reduces the three error metrics by 24.6%, 19.2%, and 25.1%, respectively; compared with CNFGNN, it reduces the RMSE by 4.5% (only the RMSE is compared here, since CNFGNN reports only an RMSE error metric).
FIG. 4(b) compares the prediction performance of the method of this embodiment (CM-FedSeq2Seq) with other methods on the PEMS-BAY data set. First, compared with the centralized baselines, CM-FedSeq2Seq obtains better results on every evaluation metric; in particular, it reduces the MAE, RMSE, and MAPE errors by 27.4%, 23.4%, and 29.8%, respectively, compared with FC-LSTM (the more accurate of the compared centralized models). Meanwhile, compared with existing federated learning methods, CM-FedSeq2Seq is also superior to the two most recent methods, FedGRU and CNFGNN. In particular, compared with FedGRU, it reduces the three error metrics by 22.9%, 21.6%, and 27.5%, respectively; compared with CNFGNN, it reduces the RMSE by 0.52%.
To highlight the method's advantage in communication overhead, the communication overhead of each method during the training phase is compared. As shown in FIG. 5, while its RMSE is better than that of the existing method CNFGNN, the CM-FedSeq2Seq of this embodiment also keeps its communication overhead far below that of CNFGNN.
The experiments show that the method achieves higher prediction accuracy with lower communication overhead, making it a more effective federated learning traffic flow prediction method.
The above embodiments are preferred embodiments of the present application, and those skilled in the art can make various changes or modifications without departing from the general concept of the present application, and such changes or modifications should fall within the scope of the claims of the present application.
References:
[1] Lin Y, Wang P, Ma M. Intelligent transportation system (ITS): Concept, challenge and opportunity[C]//2017 IEEE 3rd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS). IEEE, 2017: 167-172.
[2] Luo Q, Zhou Y. Spatial-temporal structures of deep learning models for traffic flow forecasting: A survey[C]//2021 4th International Conference on Intelligent Autonomous Systems (ICoIAS). IEEE, 2021: 187-193.
[3] Guo Y, Zhao R, Lai S, et al. Distributed machine learning for multiuser mobile edge computing systems[J]. IEEE Journal of Selected Topics in Signal Processing, 2022.
[4] Khan L U, Saad W, Han Z, et al. Federated learning for internet of things: Recent advances, taxonomy, and open challenges[J]. IEEE Communications Surveys & Tutorials, 2021.
[5] Liu Y, James J Q, Kang J, et al. Privacy-preserving traffic flow prediction: A federated learning approach[J]. IEEE Internet of Things Journal, 2020, 7(8): 7751-7763.
[6] Meng C, Rambhatla S, Liu Y. Cross-node federated graph neural network for spatio-temporal data modeling[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021: 1202-1211.

Claims (7)

1. A traffic flow prediction method based on federated learning, characterized by comprising the following steps:
step 1, setting the hyper-parameters of a server model and initializing the basic model structure to obtain global model parameters $w_G^r$, wherein the parameter subscript $G$ denotes the global model, the superscript $r$ denotes the iteration round, and $r$ is initialized as $r = 0$;
Step 2, the server sends the global model parameters
Figure 203057DEST_PATH_IMAGE001
Distributing the parameters to each client to obtain the local model parameters of each client>
Figure 952838DEST_PATH_IMAGE004
(ii) a In the parameter subscript>
Figure 391910DEST_PATH_IMAGE005
Respectively denote->
Figure 668170DEST_PATH_IMAGE006
A client-side local model;
step 3, training the local models respectively with the local traffic flow data sets of the clients, completing the $(r+1)$-th round of local model updating to obtain the local model parameters $w_k^{r+1}$;
Step 4, calculating the correlation between local model update of each client and global model update of the server, and screening the clients to be uploaded with local model parameters to the server in the current turn according to the correlation;
step 5, sending, by each screened client, its local model parameters to the server;
step 6, aggregating, by the server, the received local model parameters, i.e., completing the $(r+1)$-th round of global model updating to obtain the global model parameters $w_G^{r+1}$;
Step 7, updating
Figure 588853DEST_PATH_IMAGE010
Repeating the steps 2 to 6 until the global model and each local model are converged;
step 8, predicting, by each client, its local traffic flow using its local model.
2. The federated-learning-based traffic flow prediction method according to claim 1, wherein the hyper-parameters set in step 1 include a learning rate $\eta$, a number of communication rounds $R$, and a number of local training epochs $E$.
3. The federated-learning-based traffic flow prediction method according to claim 1, wherein the basic structure of the model adopts an Encoder-Decoder architecture: the Encoder module uses a gated recurrent neural network to capture the contextual information in the input traffic flow time series, i.e., converts the latent temporal dynamics of the input series into an intermediate hidden vector $h$; and the Decoder module uses a gated recurrent neural network and a fully-connected network to produce the prediction.
4. The federated-learning-based traffic flow prediction method according to claim 1, wherein step 4 comprises:
(1) calculating the $(r+1)$-th round update of each client's local model:
$$u_k^{r+1} = w_k^{r+1} - w_k^r$$
wherein $u_k^{r+1}$ denotes the update of the $k$-th client's local model at round $r+1$, and $w_k^{r+1}$ and $w_k^r$ denote the parameters of the $k$-th client's local model at rounds $r+1$ and $r$, respectively;
(2) calculating the $(r+1)$-th round update of the server's global model, approximating it by the $r$-th round update:
$$u_G^{r+1} \approx u_G^r = w_G^r - w_G^{r-1}$$
wherein $u_G^r$ denotes the server's global model update at round $r$;
(3) calculating the correlation between each client's local model update and the global model update, measured by the fraction of parameters in $u_k^{r+1}$ and $u_G^r$ that have the same sign:
$$e_k = \frac{1}{M} \sum_{i=1}^{M} \mathbb{1}\big[\operatorname{sgn}(u_{k,i}^{r+1}) = \operatorname{sgn}(u_{G,i}^{r})\big]$$
wherein $e_k$ denotes the correlation between the local and global model updates of client $k$, $i$ indexes the model parameters, $M$ denotes the total number of model parameters, $u_{k,i}^{r+1}$ denotes the update value of the $i$-th parameter of local model $k$ in $u_k^{r+1}$, $u_{G,i}^{r}$ denotes the update value of the $i$-th parameter of the global model $G$ in $u_G^{r}$, $\operatorname{sgn}$ denotes the sign function, and $\mathbb{1}[\cdot]$ denotes the indicator function; and
(4) screening all clients: if the correlation between a client's local model update and the global model update is greater than a set correlation threshold, the client's local model qualifies for uploading to the server.
5. The federated-learning-based traffic flow prediction method according to claim 1, wherein the server aggregates the received local model parameters using the FedAvg algorithm.
6. A traffic flow prediction system based on federated learning, characterized by comprising one server and $K$ clients, the server and each client comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processors of the server and the clients to jointly implement the method of any one of claims 1 to 5.
7. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method according to any one of claims 1 to 5.

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181971A (en) * 2020-10-27 2021-01-05 华侨大学 Edge-based federated learning model cleaning and equipment clustering method, system, equipment and readable storage medium
CN114266406A (en) * 2021-12-24 2022-04-01 北京航空航天大学 Method for predicting traffic flow state of large-scale road network based on federal learning
CN114372046A (en) * 2021-05-13 2022-04-19 青岛亿联信息科技股份有限公司 Parking flow prediction model training method based on federal learning
CN114676849A (en) * 2022-03-24 2022-06-28 支付宝(杭州)信息技术有限公司 Method and system for updating model parameters based on federal learning
CN114676838A (en) * 2022-04-12 2022-06-28 支付宝(杭州)信息技术有限公司 Method and device for jointly updating model
CN114866545A (en) * 2022-04-19 2022-08-05 郑州大学 Semi-asynchronous layered federal learning method and system based on air calculation
CN114925857A (en) * 2022-06-20 2022-08-19 东北大学秦皇岛分校 Federal learning algorithm for traffic state estimation
CN115311860A (en) * 2022-08-09 2022-11-08 中国科学院计算技术研究所 Online federal learning method of traffic flow prediction model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LU MINGMING et al.: "Urban Traffic Flow Forecast Based on FastGCRNN", Journal of Advanced Transportation *
LUPING WANG et al.: "CMFL: Mitigating Communication Overhead for Federated Learning", 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS) *
WU ZHUANG et al.: "UAV dynamic cooperative optimized coverage algorithm based on trajectory prediction in the Internet of Vehicles", Application Research of Computers (in Chinese) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination