CN117852701A

CN117852701A - Traffic flow prediction method and system based on characteristic attention mechanism

Info

Publication number: CN117852701A
Application number: CN202410022611.XA
Authority: CN
Inventors: 张志荣; 沈娴娴; 仲安婕; 姜耀; 唐中一; 张楚; 彭甜
Original assignee: Huaiyin Institute of Technology
Current assignee: Huaiyin Institute of Technology
Priority date: 2024-01-05
Filing date: 2024-01-05
Publication date: 2024-04-09
Anticipated expiration: 2044-01-05
Also published as: CN117852701B

Abstract

The invention discloses a traffic flow prediction method and system based on a characteristic attention mechanism. The required traffic flow data are acquired and preprocessed with a mean value processing method; an entropy weight method eliminates the influence of numerical differences among the different prediction indexes in the traffic flow data; and the processed data are input into an SFformer model optimized by the Chernobyl Disaster Optimizer (CDO) for training, yielding predicted values of the traffic flow data. Compared with the prior art, the SFformer model provided by the invention comprises a coding layer, a decoding layer and a fully connected layer, wherein the coding layer consists of a sparse attention mechanism and a characteristic attention mechanism, the decoding layer adopts a generative decoder, and the fully connected layer uses a BP neural network to extract data features, a PCA algorithm to reduce the dimension of the feature data, and finally a Softmax function to obtain the traffic flow prediction result. The SFformer model provided by the invention can predict intersection traffic flow data more rapidly and more accurately.

Description

Traffic flow prediction method and system based on characteristic attention mechanism

Technical Field

The invention relates to the technical field of model prediction, in particular to a traffic flow prediction method and system based on a characteristic attention mechanism.

Background

Traffic flow prediction is the prediction of traffic flow conditions in roads and traffic networks by collecting, analyzing, and utilizing traffic data. It is very important for traffic managers, city planners and drivers.

Traffic flow prediction can help traffic managers to plan traffic resources better, improve traffic efficiency and reduce congestion when optimizing traffic signals. Traffic flow prediction can help urban planners to know future travel demands and traffic bottlenecks, so that roads, traffic facilities, residential areas and the like are reasonably planned, and sustainable development of cities is ensured. For drivers, traffic flow prediction can help the drivers to avoid a congested road section, and proper travel time is selected, so that travel efficiency and comfort are improved.

Traffic management, city planning and driving assistance systems are the main application areas for traffic flow prediction. Traffic managers can optimize intersection signal timing, provide real-time traffic information, guide traffic flow and the like by predicting traffic flow, so as to improve traffic conditions. Urban planners can utilize traffic flow prediction results to plan roads, public transportation lines, parking facilities and the like to adapt to future traffic demands. Meanwhile, the driving assistance system can also utilize traffic flow prediction to help the driver to make better travel decisions, provide real-time traffic information, guide the driver to select an optimal route and avoid a congestion road section.

Traffic flow prediction, however, also faces challenges, and is affected by many uncertainty factors, such as weather conditions, special events (e.g., accidents, construction, etc.), holidays, etc. Changes in these factors may result in a certain discrepancy between the actual traffic flow and the predicted outcome.

In traffic flow prediction, deep learning algorithms are an effective approach. Through multi-layer neural network modeling they can efficiently capture the features of the input data and build an accurate prediction model. The deep learning algorithms used for traffic flow prediction mainly include convolutional neural networks, recurrent neural networks and long short-term memory networks. These algorithms can start from the temporal and spatial characteristics of traffic flow, combine traffic flow prediction with factors such as time, weather and population, and adapt to large-scale real-time and non-real-time data prediction. However, deep learning algorithms require a long time and a large amount of computing resources during training. The Transformer, an attention-based network prediction model, appeared in 2017 and works well on sequence data; but because the attention mechanisms in both the coding layer and the decoding layer of the Transformer model are self-attention mechanisms, its awareness of local feature data is weak during training. The invention improves the Transformer model and provides the SFformer model to improve the prediction accuracy of traffic flow data in road traffic flow prediction.

Disclosure of Invention

The invention aims to: address the need, pointed out in the background art, to accurately grasp real-time highway traffic flow data, by providing a traffic flow prediction method and a traffic flow prediction system based on a characteristic attention mechanism that improve traffic flow prediction precision.

The technical scheme is as follows: the invention provides a traffic flow prediction method based on a characteristic attention mechanism, which comprises the following steps:

step 1: acquiring the required traffic flow data information, and preprocessing the acquired traffic flow data information by using a mean value processing method;

step 2: processing traffic flow data by adopting an entropy weight method, and constructing a corresponding feature matrix;

step 3: inputting the traffic flow data processed by the entropy weight method into an SFformer model for training to obtain predicted values of the traffic flow data, solving the optimal values of the Q, K, V parameters in the SFformer model with the Chernobyl Disaster Optimizer (CDO), and evaluating the trained SFformer model with the mean square error as the loss function;

step 4: finally, measuring the prediction performance of the SFformer model with two statistical indexes, including the mean absolute error.

Further, the traffic flow data information collected in step 1 comprises the historical traffic flow, air index and weather quality of an intersection; data are collected every 2 min, giving 30 samples per hour over 12 hours per day; in total 30,000 records are collected, of which 20,000 are used for model training and 10,000 are used as the test set for model verification.

Further, missing values in the vehicle flow data set, the air index data set and the weather quality data set are filled by a mean value processing method: the indexes of the columns with missing values in the data set are determined first; the column average is then calculated and filled into the missing value slots in the corresponding column.

Further, in the step 2, the entropy weight method is adopted to process the traffic flow data, and the specific method is as follows:

the noise reduction process of the entropy weight method on the traffic flow data is as follows:

step 1: forming a decision matrix J through the prediction data set and the prediction index set;

wherein the data set participating in the evaluation is $M = (M_1, M_2, M_3, \dots, M_m)$ and the prediction index set is $D = (D_1, D_2, D_3, \dots, D_n)$; the value of evaluated data $M_i$ under index $D_j$ is defined as $x_{ij}$, where $i = 1, 2, 3, \dots, m$ and $j = 1, 2, 3, \dots, n$;

step 2: because the prediction indexes include both positive indexes and negative indexes, the influence of different units and scales on the decision matrix needs to be eliminated and the decision variables with negative index values need to be removed, so the decision matrix J needs to be standardized;

the positive indexes in the decision matrix J are standardized as follows:

the negative indexes in the decision matrix J are standardized as follows:

step 3: calculating the feature ratio of the ith evaluation object under the jth index:

step 4: calculating the entropy value of the j-th index:

step 5: calculating the difference coefficient of the entropy value of the j-th index: $d_j = 1 - e_j$;

step 6: calculating the entropy weight corresponding to each index:

step 7: multiplying the entropy weight of each index by the corresponding index value to obtain the sample matrix $x_f$: $x_f = w_j \cdot v_{ij}$;

Combining traffic flow, air index and weather quality data obtained after the entropy weight method is processed to construct a feature matrix C, wherein A is a traffic flow sequence, B is an air index sequence and C is a weather index sequence; the processed vehicle flow data set, air quality data set and weather index data set are respectively divided into a training set and a testing set.

Further, training the processed traffic flow data by using the SFformer model in the step 3 to obtain an estimated value of the traffic flow data, and the specific method is as follows:

the SFformer model comprises three parts: a coding layer, a decoding layer and a fully connected layer; position information is embedded when the data are input into the SFformer model, and the input position information consists of a scalar projection, a local time stamp and a global time stamp; the coding layer needs to encode the collected traffic flow data through the encoder $X_{en}$ before inputting them into the model, and the input expression of the coding layer at time t is as follows:

wherein $L_{en}$ is the sequence length of the input data; m is the length of the sequence; $N_*$ is the training set length of the traffic flow data;

after the traffic flow data are input into the SFformer model, the coding layer computes and outputs traffic flow data containing feature values through the sparse attention mechanism and the characteristic attention mechanism, applies a self-attention distillation operation that convolves the data carrying the dominant features, and finally outputs data containing self-attention features; the sparse attention mechanism is obtained by applying KL divergence to the attention mechanism of the SFformer model, and a distillation layer is added to the self-attention layer of the coding layer;

the decoding layer obtains a traffic flow prediction result through calculation of a generating decoder and a joint attention function, and an input calculation formula of the characteristic data in the decoding layer is as follows:

wherein the terms of the formula represent, respectively, the input and the multiple attention calculations; Concat is the function that concatenates the attention calculation results; the sequence is passed through a mask-based sparse self-attention layer;

the fully connected layer uses a deep learning algorithm to learn from the data and extract traffic flow data features, obtaining strongly representative traffic flow data information; a PCA algorithm then reduces the dimension of the feature data, and the traffic flow prediction result is finally obtained through a Softmax function.

Further, a CDO algorithm is adopted to obtain optimal values of Q, K, V parameters in the SFformer model through an attack mode of alpha, beta and gamma radiation particles, a training result of the SFformer model is optimized, and the self-adaptive learning rate of the SFformer model is improved.

The invention also discloses a traffic flow prediction system based on the characteristic attention mechanism, which comprises the following modules:

the traffic flow data acquisition module is used for acquiring the required traffic flow data information;

the traffic flow data preprocessing module is used for constructing a corresponding feature matrix by utilizing the collected historical traffic flow data information and reducing noise of the traffic flow data through an entropy weight method;

the traffic flow data prediction module is used for taking the traffic flow data information as the model sample data set, dividing it into a training set and a test set, and feeding them into the SFformer model optimized by the Chernobyl Disaster Optimizer (CDO) to predict the traffic flow at future moments.

The beneficial effects are that:

1. The invention takes into account that the monitoring and collection period of the sample data is long and that the measuring equipment, the measuring method and human factors may introduce errors into the data. The data are preprocessed with a mean value processing method, an entropy weight method eliminates the influence of numerical differences among the different prediction indexes in the traffic flow data and calculates the weight of each index, and the result is input into the SFformer model to train on the sample data.

2. The SFformer model provided by the invention comprises three parts: a coding layer, a decoding layer and a fully connected layer. When the data are input into the SFformer model, their position information is embedded; the input information consists of a scalar projection, a local time stamp and a global time stamp. The coding layer of the SFformer model consists of a sparse attention mechanism and a feature attention mechanism, which improves the model's ability to process traffic flow data. The decoding layer adopts a generative decoder to predict the traffic flow over several future time periods and output the corresponding predicted values. The fully connected layer uses a BP neural network to extract data features, obtaining strongly representative traffic flow data information; this screens the data while retaining most features of the traffic flow data. A PCA algorithm then reduces the dimension of the feature data, which improves the training precision of the SFformer model, and the traffic flow prediction result is finally obtained through the Softmax function.

3. The optimal values of the Q, K, V parameters of the SFformer model are solved by the Chernobyl Disaster Optimizer (CDO), which improves the adaptive learning rate of the SFformer model and makes its optimized parameters more accurate.

Drawings

FIG. 1 is a block diagram of a system flow provided by an embodiment of the present invention;

FIG. 2 is a diagram of a model structure of an SFformer prediction model according to the present invention;

FIG. 3 is a flow chart of full-link layer prediction in the SFformer prediction model of the present invention.

Detailed Description

The invention will be further described with reference to the accompanying drawings and the embodiments. The following embodiments illustrate the technical solution of the invention in detail and are not intended to limit the invention in any way. Optimizations and improvements made by a person of ordinary skill in the art on the basis of the invention, without other inventive changes, all fall within the scope of protection of the invention.

The invention discloses a traffic flow prediction method and a system based on a characteristic attention mechanism, wherein the traffic flow prediction method comprises the following specific implementation processes:

S1: sensors are installed at each urban traffic intersection to acquire data on the city's historical traffic flow, air index and weather quality; data are collected every 2 min, giving 30,000 records in total. Of the collected data, 20,000 records are used for model training and 10,000 records are used as the test set for model verification.

S2: in order to ensure the accuracy and reliability of the data, the collected data needs to be preprocessed. The embodiment adopts a mean value processing method to fill missing values in a vehicle flow data set, an air index data set and a weather quality data set, and the implementation method comprises the following steps: firstly, determining indexes of columns with missing values in a data set; the column average is then calculated and filled into the missing value slots in the corresponding column.
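The two-step mean filling described in S2 can be sketched as follows. This is a minimal illustration, assuming the data set is held as a NumPy array in which missing values are marked as NaN; the function name `fill_missing_with_column_mean` is hypothetical and not part of the invention.

```python
import numpy as np


def fill_missing_with_column_mean(data: np.ndarray) -> np.ndarray:
    """Replace NaN entries with the mean of the observed values in the same column.

    Assumption: missing values are encoded as NaN.
    """
    filled = data.astype(float)  # astype returns a copy, so the input is untouched
    # First, determine the indexes of the columns that contain missing values.
    missing_cols = np.where(np.isnan(filled).any(axis=0))[0]
    # Then compute each such column's average and fill its missing slots.
    for col in missing_cols:
        column = filled[:, col]  # a view into `filled`
        column[np.isnan(column)] = np.nanmean(column)
    return filled
```

The same function could be applied separately to the vehicle flow, air index and weather quality data sets.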

S3: because the monitoring and collection period of the sample data is long, the collected data may be affected by the measuring equipment, the measuring method and human factors, and the traffic flow, air index and weather quality data differ in value and scale, so the prediction data indexes are poorly utilized. The method therefore uses an entropy weight method to eliminate the influence of numerical differences among the different prediction indexes in the traffic flow data and to calculate the weight of each index, and then trains on the sample data with the SFformer (Sparse Feature attention Informer) model.

The entropy weight method is a weighting method for determining the weight of each index according to the index information entropy of the prediction data. The entropy weight method is utilized to solve the problem of confusion of the data of each prediction index, and errors of prediction results caused by human factors can be avoided. The noise reduction process of the entropy weight method on the traffic flow data is as follows:

Step 1: forming a decision matrix J from the prediction data set and the prediction index set;

wherein the data set participating in the evaluation is $M = (M_1, M_2, M_3, \dots, M_m)$; the prediction index set is $D = (D_1, D_2, D_3, \dots, D_n)$; the value of evaluated data $M_i$ under index $D_j$ is defined as $x_{ij}$, where $i = 1, 2, 3, \dots, m$ and $j = 1, 2, 3, \dots, n$.

Step 2: because the prediction indexes include both positive indexes and negative indexes, the influence of different scales on the decision matrix needs to be eliminated and the decision variables with negative index values need to be removed, so the decision matrix J needs to be standardized.

Step 3: calculating the feature ratio of the i-th evaluation object under the j-th index:

Step 4: calculating the entropy value of the j-th index:

Step 5: calculating the difference coefficient of the entropy value of the j-th index: $d_j = 1 - e_j$.

Step 6: calculating the entropy weight corresponding to each index:

Step 7: multiplying the entropy weight of each index by the corresponding index value to obtain the sample matrix $x_f$: $x_f = w_j \cdot v_{ij}$.
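The entropy weight steps above can be illustrated with a short sketch. The formulas for standardization, feature ratio, entropy value and entropy weight are not reproduced in the text, so the code below assumes the commonly used textbook forms: min-max standardization, $p_{ij} = v_{ij} / \sum_i v_{ij}$, $e_j = -\frac{1}{\ln m}\sum_i p_{ij}\ln p_{ij}$ and $w_j = d_j / \sum_j d_j$. The function name `entropy_weight` and the `positive` flag are hypothetical.

```python
import numpy as np


def entropy_weight(x, positive):
    """Return (w, x_f): entropy weights and the weighted standardized matrix.

    x        : (m, n) decision matrix J, m evaluated samples by n indexes.
    positive : length-n booleans, True for positive indexes, False for negative.
    Assumed forms (not taken from the source): min-max standardization and the
    standard entropy weight formulas.
    """
    x = np.asarray(x, dtype=float)
    positive = np.asarray(positive, dtype=bool)
    m = x.shape[0]

    # Step 2: standardize positive and negative indexes to [0, 1].
    col_min, col_max = x.min(axis=0), x.max(axis=0)
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    v = np.where(positive, (x - col_min) / span, (col_max - x) / span)

    # Step 3: feature ratio of sample i under index j.
    col_sum = v.sum(axis=0)
    p = v / np.where(col_sum > 0, col_sum, 1.0)

    # Step 4: entropy value of index j, with 0 * ln(0) treated as 0.
    safe_p = np.where(p > 0, p, 1.0)
    e = -(p * np.log(safe_p)).sum(axis=0) / np.log(m)

    # Step 5: difference coefficient d_j = 1 - e_j.
    d = 1.0 - e

    # Step 6: entropy weight of each index.
    w = d / d.sum()

    # Step 7: weighted sample matrix x_f = w_j * v_ij.
    return w, v * w
```

An index whose values vary more across samples receives a lower entropy value and therefore a larger weight.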

S4: combining traffic flow, air index and weather quality data obtained after the entropy weight method is processed to construct a feature matrix C, wherein A is a traffic flow sequence, B is an air index sequence and C is a weather index sequence; the processed vehicle flow data set, air quality data set and weather index data set are respectively divided into a training set and a testing set.

S5: as the sequence grows, the training length of the traffic flow prediction data gradually increases; for a long time-series task such as traffic flow, the prediction horizon is very long, and the prediction accuracy drops under the influence of the time series. This embodiment trains on the traffic flow data with the SFformer prediction model to predict the traffic flow while reducing the influence of the time series on the model. The SFformer prediction model consists of a sparse self-attention mechanism and a characteristic attention mechanism, which lowers time complexity and memory use; a self-attention distillation mechanism processes the input sequence efficiently; and a generative decoder outputs multiple prediction results at once, which speeds up sequence prediction.

S6: the traffic flow data are sampled every 2 minutes, giving 30 samples per hour over 12 hours per day. To improve the prediction accuracy of the SFformer prediction model, the invention sets the input length of the coding layer encoder to 48 and the input sequence length of the decoder to 36. Within the decoder input, the start token length is set to 24 and the length of the values to be predicted is set to 12 (filled with 0 as placeholders). Prediction proceeds recursively: the traffic flow data for one period are predicted first, the predicted values are then used as model input, and the traffic flow for the next period is predicted.
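The input lengths in S6 and the recursive prediction loop can be sketched as follows. The decoder input of length 36 is formed from the last 24 known steps (the start token) followed by 12 zero placeholders. The callable `predict_fn` is a hypothetical stand-in for a trained SFformer model, and the assumption that the start token is taken from the end of the encoder window follows common practice for generative decoders rather than an explicit statement in the text.

```python
import numpy as np

ENC_LEN = 48    # encoder input length
TOKEN_LEN = 24  # start token length in the decoder input
PRED_LEN = 12   # number of steps predicted per round (zero placeholders)


def build_decoder_input(window):
    """Concatenate the last TOKEN_LEN known steps with PRED_LEN zero placeholders."""
    window = np.asarray(window, dtype=float)
    start_token = window[-TOKEN_LEN:]
    placeholder = np.zeros((PRED_LEN,) + window.shape[1:])
    return np.concatenate([start_token, placeholder], axis=0)


def recursive_forecast(history, predict_fn, n_rounds):
    """Predict PRED_LEN steps per round, feeding each prediction back as input.

    predict_fn(encoder_input, decoder_input) is a hypothetical model call that
    returns an array whose first dimension is PRED_LEN.
    """
    window = np.asarray(history, dtype=float)[-ENC_LEN:]
    outputs = []
    for _ in range(n_rounds):
        pred = predict_fn(window, build_decoder_input(window))
        outputs.append(pred)
        # Slide the window: drop the oldest steps and append the new predictions.
        window = np.concatenate([window, pred], axis=0)[-ENC_LEN:]
    return np.concatenate(outputs, axis=0)
```

At a 2-minute sampling interval, one round of 12 predicted steps covers 24 minutes.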

S7: because the self-attention mechanism of the SFformer prediction model contains neither recursion nor convolution, it can only calculate the weight relation between input data and output data and cannot reflect the order of the data. The SFformer prediction model therefore embeds the position information of the data at input time; to extract more temporal features, the input position information consists of a scalar projection, a local time stamp and a global time stamp. The position information expression of the data is:

wherein pos is the timing information of the data; $L_X$ is the sequence length of the input data; $d_{model}$ is the dimension of the current feature data.
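The position expression itself is not reproduced in the text. As an illustration only, the sketch below assumes the sinusoidal form used in Informer-style models, $PE_{(pos, 2j)} = \sin\left(pos / (2L_X)^{2j/d_{model}}\right)$ and $PE_{(pos, 2j+1)} = \cos\left(pos / (2L_X)^{2j/d_{model}}\right)$, which uses the same symbols pos, $L_X$ and $d_{model}$ defined above. The function name and the optional `base` argument are hypothetical.

```python
import numpy as np


def positional_encoding(seq_len, d_model, base=None):
    """Sinusoidal position encoding of shape (seq_len, d_model).

    Assumption: base = 2 * L_X as in Informer-style models; pass base=10000.0
    for the classic Transformer variant. d_model must be even.
    """
    if base is None:
        base = 2.0 * seq_len
    pos = np.arange(seq_len)[:, None]          # timing index of each step
    two_j = np.arange(0, d_model, 2)[None, :]  # even channel indexes 2j
    angle = pos / np.power(base, two_j / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)  # even channels
    pe[:, 1::2] = np.cos(angle)  # odd channels
    return pe
```

With an encoder length of 48 and a model dimension of 512, `positional_encoding(48, 512)` yields one 512-dimensional vector per input step.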

The scalar projection, local time stamp and global time stamp are combined by weighting, convolved again and mapped to 512 dimensions, and finally input into the model. The encoder is implemented with the nn.Embedding module in PyTorch, which improves the prediction precision of the model. The calculation formula of the encoder is as follows:

wherein the left-hand side is the encoding result of the input layer; the convolution term extracts the input features of the input values; $\alpha$ is a factor that balances the input features and the temporal encoding; PE is the position encoding; SE is the global time encoding.

In the SFformer prediction model, the time of the input data is normally globally encoded by month, day and day of week. However, traffic congestion occurs on holidays, and its influence on the data volume is not obvious at that time scale, so this scheme encodes the global time stamp by year, month and day, which improves the temporal correlation among traffic flows in different seasons.

S8: after unified conversion, the data are input to the encoder of the model. When the SFformer prediction model predicts the traffic flow data at time t+1, the coding layer needs to encode the collected traffic flow data through the encoder $X_{en}$ before inputting them into the model, and the input expression of the coding layer at time t is as follows:

wherein $L_{en}$ is the sequence length of the input data; m is the length of the sequence; $N_*$ is the training set length of the traffic flow data.

S9: because the collected data contain many kinds of feature information, the coding layer uses a two-layer attention mechanism to take in the large volume of long-sequence data: the first layer of the coding layer is a sparse attention mechanism, and the second layer is a characteristic attention mechanism. The first-layer sparse attention mechanism of the SFformer prediction model's coding layer is obtained by applying KL divergence to the attention mechanism of the SFformer model. The native self-attention mechanism builds the model from the relations among input samples using scaled dot products, and the calculation formula is as follows:

wherein Q, K, V are the Query, Key and Value elements of the sparse attention mechanism; $QK^T$ is converted into weights through Softmax, and a weighted sum over the possible outputs is calculated to obtain the final result; d is the dimension. The smoothed probability form defining the m-th attention coefficient is:

wherein $q_x$, $k_x$, $v_x$ are the x-th rows of Q, K, V respectively, and both the time complexity and the space complexity are $O(L_Q L_K)$; $p(k_y \mid q_x)$ is the probability distribution of the self-attention mechanism for a query; when the dot-product calculation yields a distribution close to uniform, the relevance between elements in the Q sequence is low; the KL divergence is used to evaluate the sparsity of the y-th query so as to reduce the correlation between the distributions p and q, and the calculation formula is as follows:

wherein $q_y$ is the Log-Sum-Exp value over K.

The attention mechanism after KL divergence treatment is a sparse attention mechanism of an SFformer model, and the calculation formula is as follows:

wherein the sparse matrix has the same size as Q and contains the sparsity evaluation coefficient $q_y$.
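Since SFformer is described as an Informer variant, the sparse attention above can be illustrated with a sketch in the style of Informer's ProbSparse attention. It assumes the scaled dot-product form $\mathrm{Softmax}(QK^T/\sqrt{d})V$ and a sparsity measure equal to the Log-Sum-Exp of a query's scores minus their arithmetic mean; these forms, the function names and the parameter `u` (number of queries kept) are assumptions, and the random key sampling used by Informer for efficiency is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def full_attention(Q, K, V):
    """Scaled dot-product attention: Softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d)) @ V


def sparsity_measure(Q, K):
    """Log-Sum-Exp of each query's scores over K minus their mean (assumed form)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    s_max = scores.max(axis=1, keepdims=True)
    lse = s_max[:, 0] + np.log(np.exp(scores - s_max).sum(axis=1))
    return lse - scores.mean(axis=1)


def prob_sparse_attention(Q, K, V, u):
    """Apply full attention only to the u queries with the largest measure.

    Remaining queries output the mean of V, mimicking near-uniform attention
    (an assumption borrowed from Informer's encoder).
    """
    measure = sparsity_measure(Q, K)
    top = np.argsort(measure)[-u:]
    out = np.repeat(V.mean(axis=0, keepdims=True), Q.shape[0], axis=0)
    out[top] = full_attention(Q[top], K, V)
    return out
```

When `u` equals the number of queries, the result coincides with full attention; smaller values of `u` restrict the dot-product computation to the most informative queries.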

The second layer of the coding layer adopts a characteristic attention mechanism which can evaluate the association relation among traffic flow, air index and weather and strengthen key factors causing traffic flow change influence. The method can adaptively extract the influence degree of each air index and weather characteristics on traffic flow, thereby improving the prediction accuracy.

The feature attention mechanism needs to calculate the weight of the current sampling time t. Given the output $h'_t$ of the SFformer model processing unit and the unit vector q, the weight of the current sampling time t is calculated as:

wherein $v_e$, $w_e$, $u_e$ denote the feature weight matrices of traffic flow, air index and weather in the attention mechanism, respectively, and a bias term b is added to adjust the attention function. When calculating the feature weights, the softmax function is used to normalize them so that the attention weights of the features sum to 1.

The normalization formula is:

wherein the relevant parameters correspond to the feature data and the feature values correspond to each feature datum; multiplying the two gives:

Traffic flow prediction is carried out according to the incidence matrix, and the predicted value in the hidden state is calculated with the following update formula:
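The normalization and weighting step of the feature attention mechanism can be sketched as follows. Because the scoring formula is not reproduced in the text, the raw scores are taken as given inputs here; only the two operations stated in the description are shown: softmax normalization so that the feature weights sum to 1, and multiplication of each weight by its feature value. The function name `feature_attention` is hypothetical.

```python
import numpy as np


def feature_attention(scores, features):
    """Normalize per-feature scores with softmax and reweight the features.

    scores, features : arrays of shape (..., n_features), for example the three
    features traffic flow, air index and weather at one sampling time.
    Assumption: the scores come from a scoring function that is not shown here.
    """
    scores = np.asarray(scores, dtype=float)
    features = np.asarray(features, dtype=float)
    shifted = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp_scores = np.exp(shifted)
    weights = exp_scores / exp_scores.sum(axis=-1, keepdims=True)
    return weights, weights * features
```

Features with higher scores receive larger weights, which strengthens the factors that most influence traffic flow changes.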

S10: to save memory and reduce computation time, the SFformer model adds a distillation layer to the self-attention layer of the coding layer. The distillation layer consists of a convolution layer and a pooling layer, and the sequence length is halved each time the sequence passes through the distillation layer after a sparse attention operation, which reduces memory use and increases computation speed. The self-attention distillation mechanism is expressed as:

wherein $[\cdot]$ is the attention block of the two-layer sparse self-attention operation; MaxPool denotes max pooling; ELU(·) is the activation function; Conv1d(·) denotes a one-dimensional convolution.

To enhance the robustness of the self-attention distillation mechanism operation, the output dimensions of the encoded layers are ensured to be the same by successively reducing the number of layers of the sparse attention mechanism.
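The distillation operation, a one-dimensional convolution followed by ELU activation and max pooling, can be sketched in NumPy as follows. The kernel width of 3 and the pooling window of 3 with stride 2 and padding 1 are assumptions borrowed from Informer-style implementations; the text only states that the sequence length is halved. All function names are hypothetical.

```python
import numpy as np


def conv1d_same(x, kernel):
    """1-D convolution over time with zero padding, keeping the length.

    x: (L, C_in); kernel: (k, C_in, C_out) with odd k.
    """
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack(
        [np.einsum("kc,kco->o", xp[t:t + k], kernel) for t in range(x.shape[0])]
    )


def elu(x, alpha=1.0):
    """ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))


def max_pool1d(x, size=3, stride=2, pad=1):
    """Max pooling over time; with stride 2 an even length is halved."""
    xp = np.pad(x, ((pad, pad), (0, 0)), constant_values=-np.inf)
    n_out = (x.shape[0] + 2 * pad - size) // stride + 1
    return np.stack(
        [xp[i * stride:i * stride + size].max(axis=0) for i in range(n_out)]
    )


def distill(x, kernel):
    """One distillation step: MaxPool(ELU(Conv1d(x)))."""
    return max_pool1d(elu(conv1d_same(x, kernel)))
```

Applied to an encoder sequence of length 48, one step gives length 24 and a second step gives length 12.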

S11: after the traffic flow data is input into an SFformer model, the traffic flow data containing characteristic values is calculated and output through sparsity self-attention and a characteristic attention mechanism in a coding layer, the data with main characteristics is convolved and calculated through self-attention distillation operation, and finally the data containing the self-attention characteristics is output.

S12: and inputting the characteristic data output by the coding layer into a decoding layer, and in the generating decoder, predicting the traffic flow of a plurality of time periods in the future and outputting a plurality of traffic flow prediction results. The SFformer model can accelerate the prediction speed of time sequence data by encoding and decoding traffic flow data, and can avoid the rapid reduction of speed in the long-term reasoning prediction process.

The coding layer outputs data containing self-attention characteristics, and the traffic flow prediction result is obtained through calculation of a joint attention function in the decoding layer. The input calculation formula of the characteristic data at the decoding layer is as follows:

wherein the terms of the formula represent, respectively, the input and the multiple attention calculations; Concat is the function that concatenates the attention calculation results; passing the sequence through a mask-based sparse self-attention layer avoids predicting each position one at a time, thereby avoiding autoregression. The invention uses the generative structure in the decoder to construct the traffic flow sequence prediction under all feature variables at once, thereby reducing the prediction decoding time.

S13: all the characteristic data output by the decoder are combined as input to the full connection layer. The SFformer model full-connection layer provided by the invention adopts a deep learning algorithm to learn data, extracts data characteristics to obtain traffic flow data information with strong representativeness, adopts a PCA algorithm to reduce the dimension of the characteristic data, and finally obtains a traffic flow prediction result through a Softmax function.

The invention uses a deep learning algorithm to extract traffic flow data features: a BP neural network learns from the data, with a sigmoid activation function used for feature extraction. The sigmoid activation function is calculated as follows:

wherein x is all the feature data output by the decoding layer, namely the input of the fully connected layer; f(x) is the traffic flow data after the BP neural network has extracted the feature data.

The BP neural network has one input layer, one hidden layer and one output layer. The input layer has 5 nodes, the hidden layer has 11 nodes (determined by trial and error), and the output layer has 1 node; the maximum number of iterations is 1000, the training precision is 0.001, and the learning rate is 0.01. The transfer function between the input layer and the hidden layer is tansig, the transfer function between the hidden layer and the output layer is purelin, and the training function is trainlm.
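The network structure described above can be sketched as a forward pass in NumPy. The sketch assumes the standard logistic form $f(x) = 1/(1 + e^{-x})$ for the sigmoid function and the usual definitions of the MATLAB-style transfer functions (tansig as the hyperbolic tangent sigmoid, purelin as the identity). The weights are random placeholders, and Levenberg-Marquardt training (trainlm) is not implemented; the class name `BPNetwork` is hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid f(x) = 1 / (1 + exp(-x)) (assumed form)."""
    return 1.0 / (1.0 + np.exp(-x))


def tansig(x):
    """Hyperbolic tangent sigmoid; mathematically equal to tanh(x)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0


def purelin(x):
    """Linear transfer function (identity)."""
    return x


class BPNetwork:
    """Forward pass of a 5-11-1 network; weights are untrained placeholders."""

    def __init__(self, n_in=5, n_hidden=11, n_out=1, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        hidden = tansig(np.asarray(x, dtype=float) @ self.w1 + self.b1)
        return purelin(hidden @ self.w2 + self.b2)
```

The training hyperparameters stated above (at most 1000 iterations, target precision 0.001, learning rate 0.01) would be supplied to whichever training routine is used.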

The invention uses the PCA algorithm to reduce the dimension of the feature data. The PCA algorithm maps high-dimensional data to a low-dimensional space through a linear transformation: it performs an eigendecomposition of the covariance matrix of the traffic flow data to extract the principal data features, thereby achieving dimension reduction. The specific implementation is as follows:

The traffic flow data is defined as a 512-dimensional sample data matrix X, with 3 feature values per sample.

Step 1: Center the data. Calculate the mean of each feature of the data matrix X and subtract it from the corresponding entry of every sample vector of X to obtain the new centered data matrix Y.

$$y_{ij} = x_{ij} - \bar{x}_j$$

where i is the sample index, j is the feature index, and $\bar{x}_j$ is the mean of the j-th feature.

Step 2: Calculate the covariance matrix of the data. The covariance matrix C can be obtained as the product of the centered data matrix Y and its transpose $Y^T$.

Step 3: Perform an eigendecomposition of the covariance matrix. The eigenvectors and the corresponding eigenvalues are obtained by eigendecomposition of the covariance matrix C.

Step 4: Select the principal components. Sort the eigenvalues from largest to smallest and take the first k eigenvectors to form the projection matrix W.

Step 5: Perform the dimension reduction. The dimension-reduced data matrix Z is calculated by multiplying the original data matrix X by the projection matrix W:

$$Z = XW$$
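
The five PCA steps above can be sketched in NumPy as follows. This is a minimal illustration, assuming samples are stored as rows (so the covariance is formed as the transpose of Y times Y, without a normalization constant) and using random stand-in data; the function name `pca_project` is hypothetical.

```python
import numpy as np

def pca_project(X, k):
    """Follow Steps 1-5: center, covariance, eigendecomposition, select, project."""
    mean = X.mean(axis=0)                 # mean of each feature (column)
    Y = X - mean                          # Step 1: centered data matrix
    C = Y.T @ Y                           # Step 2: covariance matrix (unnormalized)
    eigvals, eigvecs = np.linalg.eigh(C)  # Step 3: C is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]     # Step 4: sort eigenvalues, largest first
    W = eigvecs[:, order[:k]]             # projection matrix from the first k eigenvectors
    Z = X @ W                             # Step 5: Z = XW, as written in the text
    return Z, W, eigvals[order]

# Stand-in data (assumption): 512 samples with 3 feature values each.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(512, 3))
Z_demo, W_demo, eig_demo = pca_project(X_demo, k=2)
```

Note that the text projects the original matrix X. Many PCA implementations project the centered matrix Y instead, which differs from Z only by a constant offset per component.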

The traffic flow data matrix processed by PCA is then passed through a Softmax function to obtain the final traffic flow prediction result.
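
For reference, a minimal sketch of a Softmax function in NumPy is given below. It only shows the normalization itself; how the patent maps the Softmax output to a traffic flow value is not stated in the text, so that step is not modeled here. The function name and the max-subtraction trick are choices made for this example.

```python
import numpy as np

def softmax(z, axis=-1):
    """Map scores to positive values that sum to 1 along the given axis."""
    z = np.asarray(z, dtype=float)
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp().
    shifted = z - z.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)
```

Applied to a matrix, `softmax(Z, axis=1)` would normalize each row of the PCA output independently.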

S14: The SFformer model is trained on the traffic flow training set and its accuracy is tested on the test set. The loss between the predicted and actual values is calculated, with the mean square error as the loss function. The Chernobyl Disaster Optimizer (CDO) algorithm is used to optimize the parameters of the SFformer model and improve its adaptive learning rate, so that the model's predictions are more accurate.

The CDO algorithm imitates the process by which the three kinds of radiation particles released in the Chernobyl disaster, α, β and γ, attack humans. The gradient descent processes of the α, β and γ particles are calculated and converted into a mathematical model, which is used to optimize the important parameters of the SFformer model. The gradient descent of the α particles in the CDO algorithm is calculated as follows:

$$v_\alpha = 0.25 \cdot (X_\alpha(t) - \rho_\alpha \cdot \Delta_\alpha)$$

$$x_h = r^2 \cdot \pi$$

$$S_\alpha = \log(\mathrm{rand}(1:16000))$$

$$\Delta_\alpha = |A_\alpha \cdot X_\alpha(t) - X_T(t)|$$

$$A_\alpha = r^2 \cdot \pi$$

where $X_\alpha(t)$ is the current position of the α particles; $\rho_\alpha$ represents the propagation of the α particles; $\Delta_\alpha$ is the positional distance between the α particles and the human; $x_h$ is the area in which the human walks; $S_\alpha$ is the moving speed of the α particles, $S_\alpha \in [1, 16000]$; r is a random number in [0, 1]; $A_\alpha$ is the propagation area of the α particles.

The gradient descent of the β particles is calculated as follows:

$$v_\beta = 0.5 \cdot (X_\beta(t) - \rho_\beta \cdot \Delta_\beta)$$

$$x_h = r^2 \cdot \pi$$

$$S_\beta = \log(\mathrm{rand}(1:270000))$$

$$\Delta_\beta = |A_\beta \cdot X_\beta(t) - X_T(t)|$$

$$A_\beta = r^2 \cdot \pi$$

where $X_\beta(t)$ is the current position of the β particles; $\rho_\beta$ represents the propagation of the β particles; $\Delta_\beta$ is the positional distance between the β particles and the human; $x_h$ is the area in which the human walks; $S_\beta$ is the moving speed of the β particles, $S_\beta \in [1, 270000]$; r is a random number in [0, 1]; $A_\beta$ is the propagation area of the β particles.

The gradient descent of the γ particles is calculated as follows:

$$v_\gamma = X_\gamma(t) - \rho_\gamma \cdot \Delta_\gamma$$

$$x_h = r^2 \cdot \pi$$

$$S_\gamma = \log(\mathrm{rand}(1:300000))$$

$$\Delta_\gamma = |A_\gamma \cdot X_\gamma(t) - X_T(t)|$$

$$A_\gamma = r^2 \cdot \pi$$

where $X_\gamma(t)$ is the current position of the γ particles; $\rho_\gamma$ represents the propagation of the γ particles; $\Delta_\gamma$ is the positional distance between the γ particles and the human; $x_h$ is the area in which the human walks; $S_\gamma$ is the moving speed of the γ particles, $S_\gamma \in [1, 300000]$; r is a random number in [0, 1]; $A_\gamma$ is the propagation area of the γ particles.

The average velocity of all particles can then be calculated from the motion equations:

The CDO algorithm obtains the optimal values of the Q, K and V parameters in the SFformer model through the attack mode of the α, β and γ radiation particles, and thereby further optimizes the training result of the SFformer model.
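
The per-particle formulas above can be sketched as follows. This is a rough illustration under several assumptions, not the patent's implementation: the text does not say how the propagation term ρ is computed, so it is passed in as a plain input; rand(1:N) is read as a random integer between 1 and N; and the "average velocity of all particles" is read as the arithmetic mean of the three velocities. All function names are hypothetical.

```python
import numpy as np

def draw_speed(rng, upper):
    """S = log(rand(1:upper)), reading rand(1:upper) as a random integer in [1, upper]."""
    return float(np.log(rng.integers(1, upper + 1)))

def particle_velocity(x, x_t, rho, scale, rng):
    """v = scale * (X(t) - rho * Delta), with Delta = |A * X(t) - X_T(t)| and A = r^2 * pi."""
    r = rng.random()                      # random number in [0, 1)
    area = r ** 2 * np.pi                 # propagation area A
    delta = np.abs(area * x - x_t)        # positional distance Delta
    return scale * (x - rho * delta)

def mean_velocity(x_alpha, x_beta, x_gamma, x_t, rho, rng):
    """Arithmetic mean of the alpha, beta and gamma velocities (assumed reading)."""
    v_a = particle_velocity(x_alpha, x_t, rho["alpha"], 0.25, rng)  # coefficient 0.25
    v_b = particle_velocity(x_beta, x_t, rho["beta"], 0.5, rng)     # coefficient 0.5
    v_g = particle_velocity(x_gamma, x_t, rho["gamma"], 1.0, rng)   # no coefficient
    return (v_a + v_b + v_g) / 3.0
```

In an optimizer loop, the positions would be candidate parameter vectors (for example, flattened Q, K and V weights) and `x_t` the reference position used in the distance term. That wiring is not specified in the text and is left out of the sketch.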

S15: The SFformer model is trained on the traffic flow training set and its accuracy is tested on the test set. The loss between the predicted and actual values is calculated; in this embodiment, the mean square error is used as the loss function to evaluate the difference between the predicted results of the SFformer model and the real results. The mean square error (MSE) is a commonly used evaluation index that measures the difference between the model's predictions and the real results. It is the mean of the squared errors between the predicted and actual values, calculated as follows:

$$MSE = \frac{1}{s} \sum_{i=1}^{s} (y_i - \hat{y}_i)^2$$

where s is the number of traffic flow samples; $y_i$ is the actual traffic flow value of the i-th sample; $\hat{y}_i$ is the traffic flow training value of the i-th sample. The smaller the mean square error, the closer the training result of the model is to the actual value and the better the training performance of the model. During training, the goal of optimizing the model is typically to minimize the mean square error so that the training results of the model are as close as possible to the real results.
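
A minimal sketch of the mean square error as described above, assuming the actual and predicted values are given as equal-length sequences; the function name is chosen for this example.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean((y_true - y_pred) ** 2))
```

Because the errors are squared, a few large deviations dominate the value, which makes this loss sensitive to outliers in the traffic flow data.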

S16: The invention uses the mean absolute error (MAE) as the statistical index for measuring the prediction performance of the SFformer prediction model. The mean absolute error is calculated as follows:

$$MAE = \frac{1}{m} \sum_{i=1}^{m} |y_i - f(x_i)|$$

where m is the number of traffic flow samples; $y_i$ is the actual traffic flow value of the i-th sample; $f(x_i)$ is the traffic flow training value of the i-th sample. The smaller the mean absolute error, the closer the prediction result of the model is to the actual value and the better the prediction performance of the model.
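
A minimal sketch of the mean absolute error described above, under the same assumption of equal-length sequences of actual and predicted values; the function name is chosen for this example.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean(np.abs(y_true - y_pred)))
```

Unlike the mean square error, this index is expressed in the same unit as the traffic flow itself, so it reads directly as an average deviation.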

The invention provides a traffic flow prediction system based on a characteristic attention mechanism, which comprises the following modules:

A traffic flow data acquisition module, in which sensors are installed at the traffic intersections of the city to acquire data on the historical traffic flow, air index and weather quality of the city; data is collected every 2 minutes, giving 30,000 records in total.

A traffic flow data preprocessing module, which constructs the corresponding feature matrix from the collected historical traffic flow data set and reduces noise in the traffic flow data through the entropy weight method.

A traffic flow data prediction module, which takes the historical traffic flow data set as the model sample data set, divides it into a training set and a test set, and feeds them into the SFformer model to predict the traffic flow at future moments.
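
The sampling arithmetic and the division into training and test sets can be sketched as follows. The 20,000 / 10,000 split sizes and the 12 hours of collection per day are taken from claim 2. The chronological split order, the function name and the zero-filled stand-in array are assumptions, since the text does not state how the records are divided.

```python
import numpy as np

# One record every 2 minutes, 12 hours of collection per day (figures from claim 2).
samples_per_hour = 60 // 2
samples_per_day = samples_per_hour * 12

def chronological_split(data, n_train=20_000, n_test=10_000):
    """Use the earliest records for training and the following ones for testing."""
    if len(data) < n_train + n_test:
        raise ValueError("not enough records for the requested split")
    return data[:n_train], data[n_train:n_train + n_test]

# Stand-in array (assumption): 30,000 records with 3 features
# (traffic flow, air index, weather quality).
records = np.zeros((30_000, 3))
train_set, test_set = chronological_split(records)
```

Keeping the time order intact avoids evaluating the model on records that precede its training data, which is one common choice for time series.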

The foregoing embodiments are merely illustrative of the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the present invention and to implement the same, not to limit the scope of the present invention. All equivalent changes or modifications made according to the spirit of the present invention should be included in the scope of the present invention.

Claims

1. A traffic flow prediction method based on a characteristic attention mechanism, which is characterized by comprising the following steps:

2. The traffic flow prediction method based on the characteristic attention mechanism according to claim 1, wherein the traffic flow data information collected in step 1 is data on the historical traffic flow, air index and weather quality of an intersection; data is collected every 2 min, giving 30 samples per hour over 12 hours per day; 30,000 records are collected in total, of which 20,000 are used as the training set for model training and 10,000 as the test set for model verification.

3. The traffic flow prediction method based on the characteristic attention mechanism according to claim 2, wherein missing values in the traffic flow data set, the air index data set and the weather quality data set are filled using a mean processing method: the indexes of the columns containing missing values are determined first; the mean of each such column is then calculated and filled into the missing-value positions of that column.

4. The traffic flow prediction method based on the feature attention mechanism according to claim 1, wherein the processing of the traffic flow data in the step 2 by adopting the entropy weight method comprises the following specific steps:

step 4: calculating the entropy value of the j-th index:

Step 6: calculating entropy values corresponding to the indexes:

5. The traffic flow prediction method based on the feature attention mechanism according to claim 1, wherein the training of the processed traffic flow data by using the SFformer model in the step 3 obtains an estimated value of the traffic flow data, and the specific method is as follows:

The SFformer model comprises three parts: a coding layer, a decoding layer and a fully connected layer. Position information of the data is embedded when the data is input into the SFformer model; to extract more temporal features, the input position information consists of a scalar projection, a local timestamp and a global timestamp. The coding layer needs to encode the collected traffic flow data through the encoder $X_{en}$ before it is input to the model, and the input at time t of the coding layer is expressed as:

6. The traffic flow prediction method based on the characteristic attention mechanism according to claim 5, wherein the optimal values of Q, K, V parameters in the SFformer model are obtained by adopting a CDO algorithm through an attack mode of alpha, beta and gamma radiation particles, the training result of the SFformer model is optimized, and the self-adaptive learning rate of the SFformer model is improved.

7. A system based on the traffic flow prediction method based on the characteristic attention mechanism as claimed in any one of claims 1 to 6, characterized by comprising the following modules: