CN112508220A - Traffic flow prediction method and device - Google Patents

Traffic flow prediction method and device

Info

Publication number
CN112508220A
Authority
CN
China
Prior art keywords
traffic flow
kernel function
flow prediction
populations
combined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011001420.3A
Other languages
Chinese (zh)
Inventor
李雷孝
林浩
李�杰
王洪彬
马志强
万剑雄
王慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Technology filed Critical Inner Mongolia University of Technology
Priority to CN202011001420.3A priority Critical patent/CN112508220A/en
Publication of CN112508220A publication Critical patent/CN112508220A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/126Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Strategic Management (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Tourism & Hospitality (AREA)
  • Biomedical Technology (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a traffic flow prediction method and a traffic flow prediction device. The traffic flow prediction method comprises the following steps: obtaining historical traffic flow data, and determining the historical traffic flow data as input data; inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training. The invention solves the technical problem in the prior art of low accuracy and low efficiency when traffic flow is predicted based on machine learning and deep learning.

Description

Traffic flow prediction method and device
Technical Field
The invention relates to the field of computer technology application, in particular to a traffic flow prediction method and a traffic flow prediction device.
Background
The precursor of the Intelligent Transportation System (ITS) is the Intelligent Vehicle-Highway System (IVHS). ITS aims to apply advanced science and technology (information technology, sensor technology, automatic control theory, operations research, artificial intelligence and the like) effectively to the traffic field, and covers transport modes such as highways, railways, civil aviation and water transportation. An accurate and reliable traffic flow prediction result can directly serve an intelligent transportation system, provide travelers with real-time and effective travel information, and provide a decision basis for relieving traffic congestion. Sensor systems installed on roads can generally provide the traffic volume, speed and lane occupancy of the traffic network, which gives data support for traffic flow prediction. How to improve the accuracy of traffic flow prediction has always been a research hotspot in the ITS field. Traffic flow prediction is a typical time series prediction problem, and the wide application of machine learning and deep learning algorithms provides an important approach to complex time series prediction. At present, many machine learning and deep learning algorithms have been introduced into traffic flow prediction, but their accuracy and efficiency still leave room for improvement.
No effective solution has yet been proposed for the problems of low accuracy and low efficiency in predicting traffic flow based on machine learning and deep learning in the prior art.
Disclosure of Invention
The embodiment of the invention provides a traffic flow prediction method and a traffic flow prediction device, which at least solve the technical problems of low accuracy and low efficiency in the process of predicting traffic flow based on machine learning and deep learning in the prior art.
According to an aspect of an embodiment of the present invention, there is provided a traffic flow prediction method including: obtaining historical traffic flow data, and determining the historical traffic flow data as input data; inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
Figure RE-GDA0002915303300000021
wherein
Figure RE-GDA0002915303300000022
is the traffic flow to be predicted, i denotes the day, and j denotes the time period; N is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of period j in each of the n weeks preceding day i; m is commonly referred to as the prediction step size, and m and n are both positive integers.
Further, optionally, constructing the combined kernel function by the kernel function includes: acquiring a kernel function; combining according to the kernel functions to obtain combined kernel functions; wherein the combined kernel function comprises:
Figure RE-GDA0002915303300000023
Figure RE-GDA0002915303300000024
wherein σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to lie between 0 and 1.
Further, optionally, optimizing the parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm includes: randomly initializing at least two populations; respectively performing genetic algorithm operation and particle swarm algorithm operation according to at least two populations, and calculating individual fitness of the at least two populations in each iteration; and obtaining parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the iteration to enter the next iteration.
Optionally, the method further includes: training the combined kernel function with the optimized parameters through parallelized training to obtain a trained traffic flow prediction model.
Further, optionally, training the combined kernel function after the parameters are optimized through parallelization training, and obtaining the trained traffic flow prediction model includes: randomly initializing a population; dividing the initialized population to obtain at least two sub-populations; calculating individual fitness of at least two sub-populations; carrying out population updating according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and verifying the updated combined kernel function to obtain a traffic flow prediction model.
According to an aspect of an embodiment of the present invention, there is provided a traffic flow prediction apparatus including: the acquisition module is used for acquiring historical traffic flow data and determining the historical traffic flow data as input data; the prediction module is used for inputting the input data into the traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
Figure RE-GDA0002915303300000031
wherein
Figure RE-GDA0002915303300000032
is the traffic flow to be predicted, i denotes the day, and j denotes the time period; N is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of period j in each of the n weeks preceding day i; m is commonly referred to as the prediction step size, and m and n are both positive integers.
Optionally, the apparatus comprises: a function building module for constructing the combined kernel function through the kernel function; wherein the function building module comprises: an acquisition unit configured to acquire a kernel function; and a function building unit configured to combine the kernel functions to obtain the combined kernel function; wherein the combined kernel function comprises:
Figure RE-GDA0002915303300000033
Figure RE-GDA0002915303300000034
wherein σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to lie between 0 and 1.
Further, optionally, the apparatus comprises: an optimization module for optimizing parameters of the combined kernel function using a genetic algorithm and a particle swarm algorithm, wherein the optimization module comprises: a first initialization unit for initializing at least two populations at random; the first calculation unit is used for performing genetic algorithm operation and particle swarm algorithm operation respectively according to at least two populations and calculating the individual fitness of the at least two populations in each iteration; and the comparison unit is used for obtaining the parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the current iteration to enter the next iteration.
Optionally, the apparatus further comprises: a model training module for training the combined kernel function with the optimized parameters through parallelized training to obtain a trained traffic flow prediction model.
Further, optionally, the model training module includes: the second initialization unit is used for initializing the population randomly; the dividing unit is used for dividing the initialized population to obtain at least two sub-populations; the second calculating unit is used for calculating the individual fitness of at least two sub-populations; the updating unit is used for updating the population according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and the verification unit is used for verifying the updated combined kernel function to obtain a traffic flow prediction model.
In the embodiment of the invention, historical traffic flow data is obtained and determined as input data; the input data is input into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training. This achieves the technical effects of improving accuracy and computational efficiency, and solves the technical problem in the prior art of low accuracy and low efficiency when traffic flow is predicted based on machine learning and deep learning.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart diagram of a traffic flow prediction method according to an embodiment of the invention;
FIG. 2 is a diagram illustrating a structure of a parameter optimization algorithm in a traffic flow prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of time consumed by portions of a parameter optimization algorithm in a traffic flow prediction method according to an embodiment of the invention;
FIG. 4 is a schematic diagram of the SPGAPSO-CKRVM algorithm flow in the traffic flow prediction method according to the embodiment of the invention;
FIG. 5(a) is a schematic diagram of the data of monitoring point No. 1027 from August 6 to August 19 in the traffic flow prediction method according to the embodiment of the present invention;
FIG. 5(b) is a schematic diagram of three days of data of monitoring point No. 1027 in the traffic flow prediction method according to the embodiment of the present invention;
FIG. 6(a) is a schematic diagram of training time comparison at different nodes in a traffic flow prediction method according to an embodiment of the present invention;
FIG. 6(b) is a schematic diagram of an acceleration ratio comparison at different nodes in a traffic flow prediction method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a traffic flow prediction apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
According to an aspect of the embodiment of the present invention, a traffic flow prediction method is provided, fig. 1 is a schematic flow diagram of a traffic flow prediction method according to an embodiment of the present invention, and as shown in fig. 1, a traffic flow prediction method provided in the embodiment of the present application includes:
step S102, obtaining historical traffic flow data, and determining the historical traffic flow data as input data;
step S104, inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
Figure RE-GDA0002915303300000051
wherein
Figure RE-GDA0002915303300000052
is the traffic flow to be predicted, i denotes the day, and j denotes the time period; N is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of period j in each of the n weeks preceding day i; m is commonly referred to as the prediction step size, and m and n are both positive integers.
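The sample layout described above can be illustrated with a short sketch. It assumes the history is held in a two-dimensional array `flow` with one row per day and one column per time period; the function name `build_samples` and that array layout are hypothetical and are not taken from the patent.

```python
import numpy as np

def build_samples(flow, m, n):
    """Build (input, target) pairs from flow[i, j] = traffic flow on day i, period j.

    Each input vector concatenates
      - the m periods preceding period j on day i:  x[i, j-m] ... x[i, j-1]
      - period j in each of the n preceding weeks:  x[i-7n, j] ... x[i-7, j]
    and the target is x[i, j].
    """
    days, periods = flow.shape
    inputs, targets = [], []
    for i in range(7 * n, days):        # day i needs n earlier weeks of history
        for j in range(m, periods):     # period j needs m earlier periods
            recent = flow[i, j - m:j]
            weekly = flow[i - 7 * n:i:7, j]
            inputs.append(np.concatenate([recent, weekly]))
            targets.append(flow[i, j])
    return np.array(inputs), np.array(targets)
```

Each input vector has m + n components, so the number of usable samples shrinks as m and n grow.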
Further, optionally, constructing the combined kernel function by the kernel function includes: acquiring a kernel function; combining according to the kernel functions to obtain combined kernel functions; wherein the combined kernel function comprises:
Figure RE-GDA0002915303300000053
Figure RE-GDA0002915303300000061
wherein σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to lie between 0 and 1.
Specifically, the traffic flow prediction method provided in the embodiment of the present application combines common kernel functions to construct a combined kernel function, as shown in formulas (1) and (2).
Figure RE-GDA0002915303300000062
Figure RE-GDA0002915303300000063
Since the kernel function of the Relevance Vector Machine (RVM) does not need to satisfy Mercer's theorem, the validity of the combined kernel function does not need to be verified. The combined kernel function gives the RVM both the local learning capacity of the Gaussian and Laplace kernel functions and the stronger generalization capacity of the polynomial kernel function. Here σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to lie between 0 and 1.
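Formulas (1) and (2) are only available as images here, so the following is a sketch under stated assumptions rather than the patent's exact expressions. It assumes the textbook forms of the Gaussian, Laplace and polynomial kernels and assumes that λ weights the local kernel while (1 - λ) weights the polynomial kernel. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    # Assumed textbook form: exp(-||x - y||^2 / (2 * sigma^2))
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def laplace_kernel(x, y, sigma):
    # Assumed textbook form: exp(-||x - y|| / sigma)
    return np.exp(-np.linalg.norm(x - y) / sigma)

def polynomial_kernel(x, y, d):
    # Assumed textbook form: (x . y + 1)^d
    return (np.dot(x, y) + 1.0) ** d

def combined_kernel(x, y, sigma, d, lam, local=gaussian_kernel):
    """Weighted sum of a local kernel and the polynomial kernel (assumed form)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie between 0 and 1")
    return lam * local(x, y, sigma) + (1.0 - lam) * polynomial_kernel(x, y, d)
```

Passing `local=laplace_kernel` gives the second combination. With λ = 1 the sketch reduces to the local kernel alone, and with λ = 0 to the polynomial kernel alone.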
Further, optionally, optimizing the parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm includes: randomly initializing at least two populations; respectively performing genetic algorithm operation and particle swarm algorithm operation according to at least two populations, and calculating individual fitness of the at least two populations in each iteration; and obtaining parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the iteration to enter the next iteration.
Specifically, unlike the general RVM parameter optimization problem, λ in the combined kernel function shown in formulas (1) and (2) is also a hyper-parameter to be optimized. The mathematical model of the combined-kernel RVM parameter optimization problem can therefore be expressed as:
$P = \{\sigma_{best}, \lambda_{best}, d_{best}\}$ (3)
The traffic flow prediction method provided by the embodiment of the application constructs a parameter optimization algorithm based on the genetic algorithm (GA for short) and the particle swarm optimization algorithm (PSO for short). Two populations are initialized randomly, and the GA operation and the PSO operation are performed on them respectively. In each iteration, the better of the two results is determined by calculating individual fitness and is carried into the next iteration as the result of the current iteration. The update of σ is shown in formula (4).
Figure RE-GDA0002915303300000064
The updates of λ and d are similar to that of σ. GA and PSO share the property of iterative optimization, and a hybrid of the two can fully exploit the wide search range of GA and the fast convergence of PSO during iteration. Parameter optimization yields $\{\sigma_{best}, \lambda_{best}, d_{best}\}$, which is input as the operating parameters of the RVM in order to compute the accuracy (Accuracy) of the RVM prediction model. Defining Accuracy as the objective function, the RVM parameter optimization problem can be described as:
Figure RE-GDA0002915303300000071
wherein
Figure RE-GDA0002915303300000072
Figure RE-GDA0002915303300000073
Figure RE-GDA0002915303300000074
The range of σ is usually ±8. $d_{min}$ and $d_{max}$ bound the range of d and are generally 0 and inf, respectively. Finally, the result is checked against the termination condition; when the condition is satisfied, the iteration of the parameter optimization algorithm terminates. The termination condition is shown in equation (6).
$\min\{fitness_{ga}, fitness_{pso}\} \le fitness_{min}$ or $T \le T_{max}$ (6)
where $fitness_{min}$ is the minimum fitness, i.e., the smallest acceptable error; T is the number of iterations; and $T_{max}$ is the maximum number of iterations.
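The per-iteration comparison and the stopping rule can be sketched as a small driver loop. This is an illustration only. The callables `ga_step` and `pso_step` are hypothetical placeholders that each advance one population by one iteration and return a `(fitness, parameters)` pair, lower fitness is assumed to be better, and the iteration limit is read as "stop once $T_{max}$ iterations have been run", which is an interpretation of condition (6).

```python
def hybrid_search(ga_step, pso_step, fitness_min, t_max):
    """Run GA and PSO side by side and keep the better result of each iteration.

    ga_step(prev) and pso_step(prev) receive the previous iteration's result
    (None on the first call) and return a (fitness, params) tuple.
    """
    result = None
    iterations = 0
    while iterations < t_max:                 # assumed reading of the T_max limit
        iterations += 1
        ga_fit, ga_params = ga_step(result)
        pso_fit, pso_params = pso_step(result)
        # The better (smaller-error) of the two becomes this iteration's result.
        if ga_fit <= pso_fit:
            result = (ga_fit, ga_params)
        else:
            result = (pso_fit, pso_params)
        if result[0] <= fitness_min:          # acceptable error reached
            break
    return result, iterations
```

The loop stops early as soon as either population reaches the acceptable error, so the iteration limit only matters when neither does.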
GA operation: binary coding is the most commonly used encoding for the RVM parameter optimization problem. The GA needs to evaluate each set of parameters through a fitness function in order to guide the iterative process in a favorable direction. The MSE of the RVM is used herein as the fitness function. After the fitness is calculated, individuals are selected by roulette-wheel selection, and a new population of candidate solutions is generated through operations such as crossover and mutation.
PSO operation: PSO likewise uses the MSE of the RVM as the fitness function to evaluate the quality of each particle. The individual extremum and the global extremum are found from the calculation results, and the velocity and position of each particle are then updated according to formulas (7) and (8).
$v = v_i + c_1 r_1 (p_{best} - x_i) + c_2 r_2 (G_{best} - x_i)$ (7)
$x = x_i - v_i$ (8)
In equations (7) and (8), v is the updated particle velocity; x is the updated particle position; $v_i$ is the current velocity of the particle; $x_i$ is the current position of the particle; $c_1$ and $c_2$ are learning factors; $r_1$ and $r_2$ are random numbers in (0, 1); $p_{best}$ is the individual extremum; and $G_{best}$ is the global extremum.
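A single particle update can be sketched as follows. The velocity line follows formula (7). For the position, the sketch uses the conventional PSO update (current position plus the updated velocity), which is an assumption and differs from formula (8) as printed. The default learning factors of 2.0 are placeholders, since the values in Table 1 are not reproduced in this text.

```python
import random

def pso_update(x_i, v_i, p_best, g_best, c1=2.0, c2=2.0, rng=random):
    """Update one particle; positions and velocities are lists of floats.

    c1 and c2 are illustrative defaults, not values taken from the patent.
    """
    r1, r2 = rng.random(), rng.random()   # one random factor per term
    v = [
        vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
        for xi, vi, pb, gb in zip(x_i, v_i, p_best, g_best)
    ]
    # Conventional position update (assumption): move by the updated velocity.
    x = [xi + vn for xi, vn in zip(x_i, v)]
    return x, v
```

When a particle already sits at both its individual and the global extremum, the two attraction terms vanish and the particle simply continues with its current velocity.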
The GA and PSO parts of the parameter optimization algorithm in the traffic flow prediction method provided in the embodiment of the present application can each be divided into three parts: population initialization, population updating, and fitness calculation. FIG. 2 is a schematic diagram of the structure of the parameter optimization algorithm in the traffic flow prediction method according to the embodiment of the present invention. As shown in FIG. 2, the population initialization part randomly generates an initial population and calculates its fitness. The population updating part comprises selection, crossover, mutation, velocity updating, position updating and population updating. The fitness calculation part traverses all individuals in the population and calculates the fitness of each. The parameter optimization algorithm was run 20 times with a maximum of 10 iterations and a population size of 10, and the average time spent by each part was calculated and recorded.
FIG. 3 is a schematic diagram of the time consumed by each part of the parameter optimization algorithm in the traffic flow prediction method according to the embodiment of the present invention. As shown in FIG. 3, the fitness calculation part is the most time-consuming, accounting for more than 90% of the total time, whereas the population updating part, despite its more complex computation logic, accounts for only about 0.1% of the total time. To address the excessive time consumed by fitness calculation, the traffic flow prediction method provided by the embodiment of the application uses the Spark platform to parallelize the algorithm and reduce the running time of fitness calculation.
Optionally, the traffic flow prediction method provided in the embodiment of the present application further includes: training the combined kernel function with the optimized parameters through parallelized training to obtain a trained traffic flow prediction model.
Further, optionally, training the combined kernel function after the parameters are optimized through parallelization training, and obtaining the trained traffic flow prediction model includes: randomly initializing a population; dividing the initialized population to obtain at least two sub-populations; calculating individual fitness of at least two sub-populations; carrying out population updating according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and verifying the updated combined kernel function to obtain a traffic flow prediction model.
Specifically, Spark, as used in the traffic flow prediction method provided in the embodiment of the present application, is a fast, general-purpose computing engine designed for large-scale data processing. Because intermediate results can be kept in memory, Spark is well suited to iterative algorithms such as those used in data mining and machine learning. The design of the SPGAPSO-CKRVM algorithm relies mainly on the Resilient Distributed Dataset (RDD for short) specific to Spark. The overall flow of the SPGAPSO-CKRVM algorithm is shown in FIG. 4, which is a schematic diagram of the SPGAPSO-CKRVM algorithm flow in the traffic flow prediction method according to the embodiment of the present invention.
As shown in fig. 4, the SPGAPSO-CKRVM algorithm flow is specifically as follows:
STEP 1 initialization
The Spark operating parameters are initialized. The key parameters are the default parallelism, the executor cores and num-executors, which correspond respectively to the default number of RDD partitions, the number of cores occupied by each executor and the total number of executors. The experimental data are read from an external file and scaled to (0,1) by extremum-based (min-max) normalization according to equation (9).
Figure RE-GDA0002915303300000091
where X is the original data, and $X_{max}$ and $X_{min}$ are respectively the maximum and minimum values of the original data set. A set of {σ, λ, d} individuals is randomly generated as the initial population, and an RDD is created from the initial population.
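Equation (9) is only available as an image here. The sketch below assumes it has the usual min-max form, (X - X_min) / (X_max - X_min), which matches the symbols defined above. The function name is hypothetical.

```python
import numpy as np

def min_max_normalize(x):
    """Scale data to the unit interval using the data set's own extrema."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("constant data cannot be min-max scaled")
    return (x - x_min) / (x_max - x_min)
```

Under this form the smallest and largest observations map exactly to 0 and 1. The extrema of the training data would have to be stored in order to map predictions back to vehicle counts.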
STEP 2 calculates individual fitness
The initial population is divided into n sub-populations, and the individual fitness of each sub-population is calculated in parallel using the map function. Because RDDs are evaluated lazily, the collect function is used to trigger the computation defined in the map function and to convert the RDD-type sub-populations into lists. One part of the sub-populations then undergoes the SPGA operation, and the other part undergoes the SPPSO operation.
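The split, map and collect pattern in STEP 2 can be sketched without a cluster. The example below deliberately swaps Spark for Python's standard-library thread pool so that it runs anywhere; it is a stand-in, not the patent's Spark code. In PySpark the analogous chain would be roughly `sc.parallelize(population, n).map(fitness_fn).collect()`. The helper names and the toy fitness function in the usage note are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_population(population, n):
    """Split a population into n sub-populations of near-equal size."""
    return [population[k::n] for k in range(n)]

def parallel_fitness(population, fitness_fn, n):
    """Score each sub-population concurrently and gather the results as lists."""
    sub_populations = split_population(population, n)

    def score(sub):
        return [(individual, fitness_fn(individual)) for individual in sub]

    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map schedules the work; list() gathers it, much like collect().
        return list(pool.map(score, sub_populations))
```

For example, `parallel_fitness(population, lambda ind: ind[0] ** 2, n=2)` returns two lists of `(individual, fitness)` pairs. In the real algorithm the fitness function would train an RVM with the individual's {σ, λ, d} and return its MSE, which is the expensive step worth distributing.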
STEP 3 population update
In the SPGA operation, first, a selection operation is performed based on the calculation result of the individual fitness and the formula (10).
Figure RE-GDA0002915303300000092
Equation (10) means that the fitter an individual is, the greater its probability of being selected. Crossover and mutation operations are performed on the selected individuals to generate the next-generation population. In the SPPSO operation, the velocity and position of each particle are updated according to equations (7) and (8) to generate the next-generation population. The individual fitness is then recalculated, the individual fitness of the two next-generation populations is compared, and the optimal value is kept.
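Equation (10) is only available as an image here, so the following roulette-wheel sketch rests on an assumption. Because the fitness is an MSE (smaller is better), each individual is weighted by the reciprocal of its error so that fitter individuals receive larger selection probabilities. The patent's actual formula may differ, and the function names are hypothetical.

```python
import random

def selection_probabilities(mse_values):
    """Turn positive MSE values into selection probabilities (assumed 1/MSE weighting)."""
    weights = [1.0 / e for e in mse_values]
    total = sum(weights)
    return [w / total for w in weights]

def roulette_select(population, mse_values, k, rng=random):
    """Draw k individuals with replacement, favouring small errors."""
    probs = selection_probabilities(mse_values)
    return rng.choices(population, weights=probs, k=k)
```

Selection is with replacement, so a very fit individual may be chosen several times before crossover and mutation are applied.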
STEP 4 verification accuracy
The result obtained in STEP 3 is decoded into {σ, λ, d} as the parameter input of the RVM. If the prediction accuracy meets expectations, the result is output as the final solution $\{\sigma_{best}, \lambda_{best}, d_{best}\}$. If not, the procedure returns to STEP 3.
The traffic flow prediction method provided by the embodiment of the present application uses, as experimental data, traffic flow data collected by the geomagnetic induction coils at monitoring points No. 1027, No. 1036 and No. 1042 on Whitemud Drive in Canada. The data were provided by the intelligent traffic research center of the University of Alberta, Canada. The data set contains traffic flow data from August 6, 2015 to August 28, 2015, collected at 20 s intervals. The data of August 28 form the test set, and the remaining data form the training set. The monitoring point information corresponding to the experimental data is as follows:
Data set 1: data from monitoring point No. 1027, eastbound, Whitemud Drive
Data set 2: data from monitoring point No. 1036, westbound, Whitemud Drive
Data set 3: data from monitoring point No. 1042, westbound ramp, Whitemud Drive
Data lost due to failures of the acquisition equipment are repaired using the historical trend method and the exponential smoothing method. Part of the repaired experimental data is shown in fig. 5(a) and 5(b).
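The patent does not state which variant of exponential smoothing is used or how it is combined with the historical trend method. As a rough illustration only, the sketch below fills each missing reading (marked `None`) with the current level of simple exponential smoothing. The function name and the smoothing factor `alpha` are assumptions.

```python
from typing import List, Optional, Sequence


def repair_with_exponential_smoothing(series: Sequence[Optional[float]],
                                      alpha: float = 0.5) -> List[float]:
    """Replace missing readings (None) with the one-step-ahead smoothed forecast."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    repaired: List[float] = []
    level: Optional[float] = None  # smoothed level of the series so far
    for value in series:
        if value is None:
            if level is None:
                raise ValueError("the series must start with an observed value")
            value = level  # forecast for the missing point
        # Standard update: level = alpha * observation + (1 - alpha) * previous level.
        level = value if level is None else alpha * value + (1.0 - alpha) * level
        repaired.append(value)
    return repaired


repaired_flow = repair_with_exponential_smoothing([10.0, 20.0, None, 30.0], alpha=0.5)
```

In this example the level after the first two readings is 15.0, so the gap is filled with 15.0 and the later observation is left untouched.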
As can be seen from fig. 5(a), the experimental data are continuous and periodic, with a pronounced tidal pattern. As can be seen from fig. 5(b), 7:00-9:00 and 16:00-19:00 are the peak periods of the traffic flow.
To verify the performance of the SPGAPSO-CKRVM algorithm, a Spark cluster with 8 nodes was built using virtual machines. The detailed cluster configuration is: CentOS-6.10-x64, Spark-2.1.1-bin-hadoop2.7, hadoop-2.7.2.tar, pyspark-2.3.2, py4j-0.10.8.1. The parameter settings of the SPGAPSO-CKRVM algorithm are shown in table 1.
TABLE 1 SPGAPSO-CKRVM Algorithm parameter set
Figure RE-GDA0002915303300000101
Traffic congestion, travel delays and road accidents mostly occur during peak hours, so predicting the traffic flow during peak hours is more meaningful. In the field of traffic flow prediction, a kernel function that predicts more accurately during peak hours should therefore be considered to perform better. In addition to the Accuracy (1-MAPE), the traffic flow prediction method provided by the embodiment of the application defines a Peak Hour Accuracy (PHA), that is, the accuracy of the traffic flow prediction during 7:00-9:00 and 16:00-19:00, to evaluate the performance of the kernel functions.
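The two measures can be illustrated with a short sketch. Accuracy is taken as 1 - MAPE over all samples, and PHA as the same quantity restricted to samples whose time of day falls in 7:00-9:00 or 16:00-19:00. The function names, the representation of time as a fractional hour, and the half-open interval boundaries are assumptions made here.

```python
from typing import Sequence


def accuracy(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Accuracy = 1 - MAPE. Observed values must be non-zero."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 1.0 - sum(errors) / len(errors)


def is_peak_hour(hour: float) -> bool:
    """True for 7:00-9:00 and 16:00-19:00 (end points excluded by assumption)."""
    return 7.0 <= hour < 9.0 or 16.0 <= hour < 19.0


def peak_hour_accuracy(hours: Sequence[float], actual: Sequence[float],
                       predicted: Sequence[float]) -> float:
    """PHA: accuracy computed only on samples observed during peak hours."""
    peak = [(a, p) for h, a, p in zip(hours, actual, predicted) if is_peak_hour(h)]
    if not peak:
        raise ValueError("no peak-hour samples")
    return accuracy([a for a, _ in peak], [p for _, p in peak])


hours = [6.0, 8.0, 12.0, 17.0]
observed = [100.0, 200.0, 100.0, 400.0]
forecast = [100.0, 180.0, 50.0, 440.0]
overall = accuracy(observed, forecast)
pha = peak_hour_accuracy(hours, observed, forecast)
```

In this made-up example the large off-peak error at 12:00 lowers the overall accuracy to 0.825, while the PHA is 0.9, which shows why the two measures can rank kernel functions differently.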
TABLE 2 Kernel function Performance test results
Figure RE-GDA0002915303300000102
Figure RE-GDA0002915303300000111
As can be seen from table 2, all kernel functions are more accurate on data set 1 and data set 2 than on data set 3. This is because the ramp traffic flow corresponding to data set 3 is more random and therefore harder to predict. The combined kernel functions numbered 5 and 6 generally outperform the single kernel functions numbered 1-4. Except on data set 1, the combined kernel function numbered 6 is always more accurate than the combined kernel function numbered 5, and on all three data sets its PHA is always higher. Therefore, the following combined kernel function is used as the kernel function of the RVM in the subsequent experiments:
Figure RE-GDA0002915303300000112
The SPGAPSO-CKRVM proposed in the embodiment of the present application is compared with existing machine learning and deep learning algorithms, using RMSE, MAE and MAPE as evaluation indexes. The results of the experiment are shown in table 3.
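For reference, the three indexes are conventionally defined as follows. The symbols are chosen here for illustration, since the patent text does not spell these formulas out: $y_t$ is the observed traffic flow, $\hat{y}_t$ the predicted traffic flow, and $N$ the number of test samples.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(y_t-\hat{y}_t\right)^2},
\qquad
\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|y_t-\hat{y}_t\right|,
\qquad
\mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N}\left|\frac{y_t-\hat{y}_t}{y_t}\right|
```

RMSE and MAE are expressed in the units of the traffic flow, whereas MAPE is relative and is the quantity subtracted from 1 to obtain the Accuracy used elsewhere in the description.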
TABLE 3 Algorithm accuracy test results
Figure RE-GDA0002915303300000113
Here, CNN-GRU is a CNN combined with a GRU; CNN-Bi-LSTM is a CNN combined with a bidirectional LSTM; SPGAPSO-SVM optimizes an SVM using the same parameter optimization algorithm as SPGAPSO-CKRVM; and GA-CKRVM is a combined-kernel RVM optimized using GA alone. As can be seen from table 3, the conventional PSO-SVR performed the worst. The RNN-based deep learning algorithms, such as LSTM, CNN-GRU and CNN-Bi-LSTM, are clearly superior to the SVM-based algorithms. GA-CKRVM performed slightly better than the above algorithms. SPGAPSO-CKRVM performed better than all the other compared algorithms on all three data sets.
The algorithm scalability experiment tests whether the running speed of the algorithm can be improved by adding nodes, and the parallelization effect is measured by calculating the acceleration ratio. In the experiment, the SPGAPSO-CKRVM algorithm is run 10 times each on a single node, 2 nodes, 4 nodes and 8 nodes, the results are recorded, and the acceleration ratio is calculated. The acceleration ratio is calculated as follows:
$$S_n = \frac{T_1}{T_n}$$
where $T_1$ is the serial run time of the algorithm and $T_n$ is the time the algorithm takes to run in parallel on n nodes. Ideally, the acceleration ratio should be equal to the number of nodes. The results of the experiment are shown in fig. 6(a) and 6(b).
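The acceleration ratio defined above is the serial run time divided by the parallel run time. The sketch below computes it together with the parallel efficiency (the ratio divided by the node count), a derived quantity added here for illustration. The timings are made up purely to exercise the code and are not the patent's measurements.

```python
def acceleration_ratio(t_serial: float, t_parallel: float) -> float:
    """Speedup on n nodes: serial run time divided by parallel run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def parallel_efficiency(t_serial: float, t_parallel: float, nodes: int) -> float:
    """Fraction of the ideal speedup achieved; 1.0 means ratio == number of nodes."""
    return acceleration_ratio(t_serial, t_parallel) / nodes


# Hypothetical run times in seconds, keyed by node count (illustrative only).
hypothetical_times = {1: 120.0, 2: 66.0, 4: 37.5, 8: 24.0}
ratios = {n: acceleration_ratio(hypothetical_times[1], t)
          for n, t in hypothetical_times.items()}
```

An efficiency below 1.0 reflects fixed costs such as cluster start-up and task scheduling, which is why the ratio approaches the ideal value only as the workload grows.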
The population size determines the computational load of SPGAPSO-CKRVM. As can be seen from fig. 6(a), the running time increases linearly as the computational load increases. When the computational load is small, the training times of the algorithm on a single node, 2 nodes, 4 nodes and 8 nodes differ only slightly. As the computational load increases, the running time on 8 nodes becomes far lower than that on 4 nodes, 2 nodes and a single node. This is because the more nodes there are, the smaller the share of the computation each node is responsible for.
As can be seen from fig. 6(b), when the computational load is small, the acceleration effect is not significant. This is because basic operations such as starting the cluster, dividing tasks and allocating resources are time-consuming, so the cluster does not achieve the ideal effect. As the computational load increases, the advantages of parallel computing become more and more obvious, and the acceleration ratio rises and gradually approaches the ideal value. The experimental results show that the SPGAPSO-CKRVM proposed in the embodiment of the present application has good scalability.
The traffic flow prediction method provided by the embodiment of the application uses a combined-kernel RVM and several heuristic algorithms to construct a traffic flow prediction model, and uses Spark parallelization to parallelize the parameter optimization algorithm of the RVM, thereby yielding the SPGAPSO-CKRVM. Experiments on real data from the Whitemud Drive highway in Canada show that the proposed model is superior to the other methods in prediction accuracy and effectively shortens the parameter optimization time.
In the embodiment of the invention, historical traffic flow data is obtained and determined as input data; inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training, so that the technical effects of improving the accuracy and improving the calculation efficiency are achieved, and the technical problems of low accuracy and low efficiency due to machine learning and deep learning in the traffic flow prediction process in the prior art are solved.
Example 2
According to an aspect of the embodiment of the present invention, there is provided a traffic flow prediction apparatus, and fig. 7 is a schematic diagram of the traffic flow prediction apparatus according to the embodiment of the present invention, and as shown in fig. 7, the traffic flow prediction apparatus according to the embodiment of the present application includes: the acquisition module 72 is used for acquiring historical traffic flow data and determining the historical traffic flow data as input data; the prediction module 74 is used for inputting the input data into the traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
Figure RE-GDA0002915303300000131
wherein
Figure RE-GDA0002915303300000132
is the traffic flow to be predicted; i denotes the day and j denotes the time period; n is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of period j in the n weeks preceding day i; m is commonly referred to as the prediction step size; and m and n are both positive integers.
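The input vector described above can be assembled as in the sketch below. It assumes the flow is stored as a table indexed by day and time period; the name `build_sample` and the storage layout are hypothetical. The recent part holds periods j-m to j-1 of day i, and the weekly part holds period j of days i-7n, i-7n+7, up to i-7.

```python
from typing import List, Sequence, Tuple


def build_sample(flow: Sequence[Sequence[float]], i: int, j: int,
                 m: int, n: int) -> Tuple[List[float], float]:
    """Return (input vector, target) for day i and time period j."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    if j - m < 0 or i - 7 * n < 0:
        raise ValueError("not enough history for the requested m and n")
    # Flow of the m periods immediately before period j on the same day.
    recent = [flow[i][j - k] for k in range(m, 0, -1)]
    # Flow of the same period j, one value per week for the previous n weeks.
    weekly = [flow[i - 7 * w][j] for w in range(n, 0, -1)]
    return recent + weekly, flow[i][j]


# Synthetic table in which flow[day][period] = 100 * day + period.
table = [[100.0 * day + period for period in range(10)] for day in range(16)]
features, target = build_sample(table, i=15, j=5, m=3, n=2)
```

The synthetic table makes the indexing easy to check by eye: each value encodes the day and period it came from.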
Optionally, the traffic flow prediction apparatus provided in the embodiment of the present application includes: a function building module for constructing the combined kernel function from kernel functions; wherein the function building module comprises: an acquisition unit configured to acquire the kernel functions; and a function building unit for combining the kernel functions to obtain the combined kernel function; wherein the combined kernel function comprises:
Figure RE-GDA0002915303300000133
Figure RE-GDA0002915303300000134
wherein σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to the range from 0 to 1.
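The two combined kernel formulas are only available as images, so the sketch below is an assumption rather than the patented formula. It shows one common way to build a combined kernel from the three parameters named above: a λ-weighted sum of a Gaussian kernel of width σ and a polynomial kernel of degree d. The choice of base kernels and all function names are hypothetical.

```python
import math
from typing import Sequence


def gaussian_kernel(x: Sequence[float], z: Sequence[float], sigma: float) -> float:
    """Local kernel: exp(-||x - z||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def polynomial_kernel(x: Sequence[float], z: Sequence[float], d: int) -> float:
    """Global kernel: (x . z + 1)^d."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** d


def combined_kernel(x: Sequence[float], z: Sequence[float],
                    sigma: float, lam: float, d: int) -> float:
    """Assumed form: lam * Gaussian + (1 - lam) * polynomial, with 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie between 0 and 1")
    return lam * gaussian_kernel(x, z, sigma) + (1.0 - lam) * polynomial_kernel(x, z, d)


value = combined_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0, lam=0.5, d=2)
```

Keeping λ between 0 and 1 makes the result a convex combination of two valid kernels, which is itself a valid kernel, so the constraint is more than a convention.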
Further, optionally, the traffic flow prediction apparatus provided in the embodiment of the present application includes: an optimization module for optimizing parameters of the combined kernel function using a genetic algorithm and a particle swarm algorithm, wherein the optimization module comprises: a first initialization unit for initializing at least two populations at random; the first calculation unit is used for performing genetic algorithm operation and particle swarm algorithm operation respectively according to at least two populations and calculating the individual fitness of the at least two populations in each iteration; and the comparison unit is used for obtaining the parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the current iteration to enter the next iteration.
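The interplay of the two populations can be sketched as follows. This is a serial toy version under several assumptions that are not taken from the patent: the fitness is a stand-in function rather than RVM prediction accuracy, the genetic operators are roulette selection, arithmetic crossover and Gaussian mutation, the particle update is the standard inertia-weight form, and the best individual of each iteration is fed back to both populations. All names and constants are hypothetical.

```python
import random
from typing import List, Tuple

Vector = List[float]


def fitness(ind: Vector) -> float:
    """Toy fitness in (0, 1], maximal at (0.5, ..., 0.5). Stand-in for RVM accuracy."""
    return 1.0 / (1.0 + sum((x - 0.5) ** 2 for x in ind))


def ga_step(pop: List[Vector], rng: random.Random, scale: float = 0.1) -> List[Vector]:
    """One generation: roulette selection, arithmetic crossover, Gaussian mutation."""
    weights = [fitness(ind) for ind in pop]
    children = []
    for _ in pop:
        p1, p2 = rng.choices(pop, weights=weights, k=2)
        t = rng.random()
        children.append([t * a + (1 - t) * b + rng.gauss(0.0, scale)
                         for a, b in zip(p1, p2)])
    return children


def pso_step(pop: List[Vector], vel: List[Vector], pbest: List[Vector], gbest: Vector,
             rng: random.Random, w: float = 0.7, c1: float = 1.5, c2: float = 1.5) -> None:
    """Update velocities, positions and personal bests in place."""
    for k, x in enumerate(pop):
        vel[k] = [w * v + c1 * rng.random() * (p - xi) + c2 * rng.random() * (g - xi)
                  for xi, v, p, g in zip(x, vel[k], pbest[k], gbest)]
        pop[k] = [xi + v for xi, v in zip(x, vel[k])]
        if fitness(pop[k]) > fitness(pbest[k]):
            pbest[k] = pop[k]


def hybrid_optimize(dim: int = 3, size: int = 10, iterations: int = 30,
                    seed: int = 0) -> Tuple[Vector, List[float]]:
    rng = random.Random(seed)
    ga_pop = [[rng.random() for _ in range(dim)] for _ in range(size)]
    pso_pop = [[rng.random() for _ in range(dim)] for _ in range(size)]
    vel = [[0.0] * dim for _ in range(size)]
    pbest = [list(x) for x in pso_pop]
    best = max(ga_pop + pso_pop, key=fitness)
    history = [fitness(best)]
    for _ in range(iterations):
        ga_pop = ga_step(ga_pop, rng)
        pso_step(pso_pop, vel, pbest, best, rng)   # the shared best guides the swarm
        candidate = max(ga_pop + pbest, key=fitness)
        if fitness(candidate) > fitness(best):     # keep the better of the two results
            best = candidate
        ga_pop[0] = list(best)                     # the best re-enters the GA population
        history.append(fitness(best))
    return best, history


best_solution, best_history = hybrid_optimize()
```

Because the stored best is replaced only by a strictly better candidate, the recorded best fitness can never decrease from one iteration to the next.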
Optionally, the traffic flow prediction apparatus provided in the embodiment of the present application further includes: and the model training module is used for training the combined kernel function after the parameters are optimized through parallel training to obtain a trained traffic flow prediction model.
Further, optionally, the model training module includes: the second initialization unit is used for initializing the population randomly; the dividing unit is used for dividing the initialized population to obtain at least two sub-populations; the second calculating unit is used for calculating the individual fitness of at least two sub-populations; the updating unit is used for updating the population according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and the verification unit is used for verifying the updated combined kernel function to obtain a traffic flow prediction model.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A traffic flow prediction method, comprising:
obtaining historical traffic flow data, and determining the historical traffic flow data as input data;
inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result;
the traffic flow prediction model is a model obtained by constructing a combined kernel function through a kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
2. The method of claim 1, wherein the input data comprises:
Figure RE-FDA0002915303290000011
wherein
Figure RE-FDA0002915303290000012
is the traffic flow to be predicted; i denotes the day and j denotes the time period; n is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of period j in the n weeks preceding day i; m is commonly referred to as the prediction step size; and m and n are both positive integers.
3. The method of claim 2, wherein constructing the combined kernel function by the kernel function comprises:
acquiring a kernel function;
combining according to the kernel functions to obtain the combined kernel functions;
wherein the combined kernel function comprises:
Figure RE-FDA0002915303290000013
Figure RE-FDA0002915303290000014
wherein σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to the range from 0 to 1.
4. The method of claim 3, wherein optimizing the parameters of the combined kernel function using genetic and particle swarm algorithms comprises:
randomly initializing at least two populations;
respectively performing genetic algorithm operation and particle swarm algorithm operation according to the at least two populations, and calculating individual fitness of the at least two populations in each iteration;
and comparing the optimal values in the individual fitness of the at least two populations, and taking the optimal values as the result of the current iteration to enter the next iteration to obtain the optimized parameters of the combined kernel function.
5. The method of claim 4, further comprising:
and training the combined kernel function after the parameters are optimized through parallel training to obtain a trained traffic flow prediction model.
6. The method of claim 5, wherein the training the combined kernel function after optimizing the parameters through parallelization training to obtain the trained traffic flow prediction model comprises:
randomly initializing a population;
dividing the initialized population to obtain at least two sub-populations;
calculating the individual fitness of the at least two sub-populations;
performing population updating according to the individual fitness of the at least two sub-populations to obtain the updated combined kernel function;
and verifying the updated combined kernel function to obtain the traffic flow prediction model.
7. A traffic flow prediction device characterized by comprising:
the acquisition module is used for acquiring historical traffic flow data and determining the historical traffic flow data as input data;
the prediction module is used for inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result;
the traffic flow prediction model is a model obtained by constructing a combined kernel function through a kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
8. The apparatus of claim 7, wherein the input data comprises:
Figure RE-FDA0002915303290000031
wherein
Figure RE-FDA0002915303290000032
is the traffic flow to be predicted; i denotes the day and j denotes the time period; n is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of period j in the n weeks preceding day i; m is commonly referred to as the prediction step size; and m and n are both positive integers.
9. The apparatus of claim 8, wherein the apparatus comprises:
a function building module for constructing a combined kernel function from kernel functions;
wherein the function building module comprises:
an acquisition unit configured to acquire the kernel functions;
a function building unit configured to combine the kernel functions to obtain the combined kernel function;
wherein the combined kernel function comprises:
Figure RE-FDA0002915303290000033
Figure RE-FDA0002915303290000034
wherein σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to the range from 0 to 1.
10. The apparatus of claim 9, wherein the apparatus comprises:
an optimization module for optimizing parameters of the combined kernel function using a genetic algorithm and a particle swarm algorithm, wherein the optimization module comprises:
a first initialization unit for initializing at least two populations at random;
the first calculation unit is used for respectively carrying out genetic algorithm operation and particle swarm algorithm operation according to the at least two populations, and calculating the individual fitness of the at least two populations in each iteration;
and the comparison unit is used for comparing the optimal values in the individual fitness of the at least two populations and taking the optimal values as the result of the current iteration to enter the next iteration to obtain the optimized parameters of the combined kernel function.
CN202011001420.3A 2020-09-22 2020-09-22 Traffic flow prediction method and device Pending CN112508220A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011001420.3A CN112508220A (en) 2020-09-22 2020-09-22 Traffic flow prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011001420.3A CN112508220A (en) 2020-09-22 2020-09-22 Traffic flow prediction method and device

Publications (1)

Publication Number Publication Date
CN112508220A true CN112508220A (en) 2021-03-16

Family

ID=74953949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011001420.3A Pending CN112508220A (en) 2020-09-22 2020-09-22 Traffic flow prediction method and device

Country Status (1)

Country Link
CN (1) CN112508220A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114004993A (en) * 2021-10-25 2022-02-01 厦门大学 IA-SVM running condition identification method and device based on LSTM speed prediction optimization

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127330A (en) * 2016-06-18 2016-11-16 上海大学 Fluctuating wind speed Forecasting Methodology based on least square method supporting vector machine
CN106781465A (en) * 2016-12-06 2017-05-31 广州市科恩电脑有限公司 A kind of road traffic Forecasting Methodology
CN107025468A (en) * 2017-05-18 2017-08-08 重庆大学 Highway congestion recognition methods based on PCA GA SVM algorithms
CN109102884A (en) * 2018-07-19 2018-12-28 南京邮电大学 Parkinson disease diagnostic method based on mixed kernel function supporting vector machine model
CN110674598A (en) * 2019-08-26 2020-01-10 江苏师范大学 Injection molding process optimization method based on support vector machine and particle swarm optimization
CN110766237A (en) * 2019-10-31 2020-02-07 内蒙古工业大学 Bus passenger flow prediction method and system based on SPGAPSO-SVM algorithm

Similar Documents

Publication Publication Date Title
CN108197739B (en) Urban rail transit passenger flow prediction method
Zhang et al. Traffic flow prediction model based on deep belief network and genetic algorithm
CN109272157A (en) A kind of freeway traffic flow parameter prediction method and system based on gate neural network
CN102034350B (en) Short-time prediction method and system of traffic flow data
CN109920248B (en) Bus arrival time prediction method based on GRU neural network
CN108091135A (en) Parking position multistep forecasting method based on Optimization of Wavelet neutral net
CN110059875B (en) Public bicycle demand prediction method based on distributed whale optimization algorithm
Kim et al. Idle vehicle relocation strategy through deep learning for shared autonomous electric vehicle system optimization
CN112530157B (en) Road traffic congestion propagation prediction method based on knowledge graph and Conv1D-LSTM-D
CN112907970B (en) Variable lane steering control method based on vehicle queuing length change rate
CN113780665B (en) Private car stay position prediction method and system based on enhanced recurrent neural network
Hosseini et al. Short-term traffic flow forecasting by mutual information and artificial neural networks
Song et al. Traffic signal control under mixed traffic with connected and automated vehicles: a transfer-based deep reinforcement learning approach
CN115063184A (en) Electric vehicle charging demand modeling method, system, medium, equipment and terminal
Ma et al. Fuzzy hybrid framework with dynamic weights for short‐term traffic flow prediction by mining spatio‐temporal correlations
CN114565187A (en) Traffic network data prediction method based on graph space-time self-coding network
CN117669993B (en) Progressive charging facility planning method, progressive charging facility planning device, terminal and storage medium
CN117994986B (en) Traffic flow prediction optimization method based on intelligent optimization algorithm
CN110766237A (en) Bus passenger flow prediction method and system based on SPGAPSO-SVM algorithm
CN112036598A (en) Charging pile use information prediction method based on multi-information coupling
CN112508220A (en) Traffic flow prediction method and device
Chen et al. A Spark-based Ant Lion algorithm for parameters optimization of random forest in credit classification
Lin et al. Traffic Flow Prediction Using SPGAPSO-CKRVM Model.
CN114298133A (en) Short-term wind speed hybrid prediction method and device
CN114463978A (en) Data monitoring method based on rail transit information processing terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210316