CN112508220A - Traffic flow prediction method and device - Google Patents
- Publication number
- CN112508220A (application CN202011001420.3A)
- Authority
- CN
- China
- Prior art keywords
- traffic flow
- kernel function
- flow prediction
- populations
- combined
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Abstract
The invention discloses a traffic flow prediction method and a traffic flow prediction device. The traffic flow prediction method comprises the following steps: obtaining historical traffic flow data and determining the historical traffic flow data as input data; and inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result. The traffic flow prediction model is obtained by constructing a combined kernel function from kernel functions, optimizing the parameters of the combined kernel function with a genetic algorithm and a particle swarm algorithm, and performing parallelized training. The invention addresses the technical problem in the prior art that traffic flow prediction based on machine learning and deep learning has low accuracy and low efficiency.
Description
Technical Field
The invention relates to the field of computer technology application, in particular to a traffic flow prediction method and a traffic flow prediction device.
Background
The precursor of Intelligent Traffic Systems (ITS) is the Intelligent Vehicle-Highway System (IVHS). ITS aims to apply advanced technologies (information technology, sensor technology, automatic control theory, operations research, artificial intelligence and the like) effectively to the traffic field, and covers transport modes such as highways, railways, civil aviation and water transportation. Accurate and reliable traffic flow prediction results can directly serve an intelligent traffic system, provide travelers with real-time and effective travel information, and provide a basis for decisions that relieve traffic congestion. Sensor systems installed on roads can generally provide the flow, speed and lane occupancy of the traffic network as data support for traffic flow prediction. How to improve the accuracy of traffic flow prediction has long been a research focus in the ITS field. Traffic flow prediction is a typical time series prediction problem, and the wide application of machine learning and deep learning algorithms provides an important approach to complex time series prediction. Many machine learning and deep learning algorithms have been introduced into traffic flow prediction, but their accuracy and efficiency still leave room for improvement.
No effective solution has yet been proposed for the low accuracy and low efficiency of traffic flow prediction based on machine learning and deep learning in the prior art.
Disclosure of Invention
The embodiment of the invention provides a traffic flow prediction method and a traffic flow prediction device, which at least solve the technical problems of low accuracy and low efficiency in the process of predicting traffic flow based on machine learning and deep learning in the prior art.
According to an aspect of an embodiment of the present invention, there is provided a traffic flow prediction method including: obtaining historical traffic flow data, and determining the historical traffic flow data as input data; inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
where the quantity to be predicted is the traffic flow on day $i$ in time period $j$; $i$ indexes the day and $j$ indexes the time period; $N$ is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ is the traffic flow in the $m$ time periods preceding period $j$ on day $i$; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ is the traffic flow in period $j$ on the same weekday in each of the $n$ weeks preceding day $i$; $m$ is commonly referred to as the prediction step size, and $m$ and $n$ are both positive integers.
Further, optionally, constructing the combined kernel function by the kernel function includes: acquiring a kernel function; combining according to the kernel functions to obtain combined kernel functions; wherein the combined kernel function comprises:
where $\sigma$ is the kernel function width; $d$ determines the distribution of the data in the high-dimensional space; and $\lambda$ is a weight coefficient satisfying the constraint $0 \le \lambda \le 1$.
Further, optionally, optimizing the parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm includes: randomly initializing at least two populations; respectively performing genetic algorithm operation and particle swarm algorithm operation according to at least two populations, and calculating individual fitness of the at least two populations in each iteration; and obtaining parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the iteration to enter the next iteration.
Optionally, the method further includes: and training the combined kernel function after the parameters are optimized through parallel training to obtain a trained traffic flow prediction model.
Further, optionally, training the combined kernel function after the parameters are optimized through parallelization training, and obtaining the trained traffic flow prediction model includes: randomly initializing a population; dividing the initialized population to obtain at least two sub-populations; calculating individual fitness of at least two sub-populations; carrying out population updating according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and verifying the updated combined kernel function to obtain a traffic flow prediction model.
According to an aspect of an embodiment of the present invention, there is provided a traffic flow prediction apparatus including: the acquisition module is used for acquiring historical traffic flow data and determining the historical traffic flow data as input data; the prediction module is used for inputting the input data into the traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
where the quantity to be predicted is the traffic flow on day $i$ in time period $j$; $i$ indexes the day and $j$ indexes the time period; $N$ is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ is the traffic flow in the $m$ time periods preceding period $j$ on day $i$; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ is the traffic flow in period $j$ on the same weekday in each of the $n$ weeks preceding day $i$; $m$ is commonly referred to as the prediction step size, and $m$ and $n$ are both positive integers.
Optionally, the apparatus comprises: the building function module is used for building a combined kernel function through the kernel function; wherein, the function building module comprises: an acquisition unit configured to acquire a kernel function; the function building unit is used for combining according to the kernel functions to obtain combined kernel functions; wherein the combined kernel function comprises:
where $\sigma$ is the kernel function width; $d$ determines the distribution of the data in the high-dimensional space; and $\lambda$ is a weight coefficient satisfying the constraint $0 \le \lambda \le 1$.
Further, optionally, the apparatus comprises: an optimization module for optimizing parameters of the combined kernel function using a genetic algorithm and a particle swarm algorithm, wherein the optimization module comprises: a first initialization unit for initializing at least two populations at random; the first calculation unit is used for performing genetic algorithm operation and particle swarm algorithm operation respectively according to at least two populations and calculating the individual fitness of the at least two populations in each iteration; and the comparison unit is used for obtaining the parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the current iteration to enter the next iteration.
Optionally, the apparatus further comprises: and the model training module is used for training the combined kernel function after the parameters are optimized through parallel training to obtain a trained traffic flow prediction model.
Further, optionally, the model training module includes: the second initialization unit is used for initializing the population randomly; the dividing unit is used for dividing the initialized population to obtain at least two sub-populations; the second calculating unit is used for calculating the individual fitness of at least two sub-populations; the updating unit is used for updating the population according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and the verification unit is used for verifying the updated combined kernel function to obtain a traffic flow prediction model.
In the embodiment of the invention, historical traffic flow data is obtained and determined as input data, and the input data is input into a traffic flow prediction model for prediction to obtain a traffic flow prediction result. The traffic flow prediction model is obtained by constructing a combined kernel function from kernel functions, optimizing the parameters of the combined kernel function with a genetic algorithm and a particle swarm algorithm, and performing parallelized training. This improves both prediction accuracy and calculation efficiency, and thereby addresses the technical problem in the prior art that traffic flow prediction based on machine learning and deep learning has low accuracy and low efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart diagram of a traffic flow prediction method according to an embodiment of the invention;
FIG. 2 is a diagram illustrating a structure of a parameter optimization algorithm in a traffic flow prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of time consumed by portions of a parameter optimization algorithm in a traffic flow prediction method according to an embodiment of the invention;
FIG. 4 is a schematic diagram of the SPGAPSO-CKRVM algorithm flow in the traffic flow prediction method according to the embodiment of the invention;
FIG. 5(a) is a schematic diagram of the data of monitoring point No. 1027 from August 6 to August 19 in the traffic flow prediction method according to the embodiment of the present invention;
FIG. 5(b) is a schematic diagram of three days of data of monitoring point No. 1027 in the traffic flow prediction method according to the embodiment of the present invention;
FIG. 6(a) is a schematic diagram of training time comparison at different nodes in a traffic flow prediction method according to an embodiment of the present invention;
FIG. 6(b) is a schematic diagram of an acceleration ratio comparison at different nodes in a traffic flow prediction method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a traffic flow prediction apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
According to an aspect of the embodiment of the present invention, a traffic flow prediction method is provided, fig. 1 is a schematic flow diagram of a traffic flow prediction method according to an embodiment of the present invention, and as shown in fig. 1, a traffic flow prediction method provided in the embodiment of the present application includes:
step S102, obtaining historical traffic flow data, and determining the historical traffic flow data as input data;
step S104, inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
where the quantity to be predicted is the traffic flow on day $i$ in time period $j$; $i$ indexes the day and $j$ indexes the time period; $N$ is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ is the traffic flow in the $m$ time periods preceding period $j$ on day $i$; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ is the traffic flow in period $j$ on the same weekday in each of the $n$ weeks preceding day $i$; $m$ is commonly referred to as the prediction step size, and $m$ and $n$ are both positive integers.
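The two groups of predictors described above can be assembled as in the following minimal sketch. It assumes the history is stored as a day-by-period table `flow[i][j]`; the function name `build_sample` and the toy data are hypothetical and serve only to illustrate the indexing.

```python
def build_sample(flow, i, j, m, n):
    """Return (features, target) for day i and time period j.

    flow[i][j] is assumed to hold the traffic flow on day i in period j.
    """
    if j - m < 0 or i - 7 * n < 0:
        raise ValueError("not enough history for this (i, j)")
    # Flow in the m periods before period j on day i: x_{i,j-m}, ..., x_{i,j-1}
    recent = [flow[i][t] for t in range(j - m, j)]
    # Flow in period j on the same weekday of the previous n weeks:
    # x_{i-7n,j}, x_{i-7n+7,j}, ..., x_{i-7,j}
    weekly = [flow[day][j] for day in range(i - 7 * n, i, 7)]
    return recent + weekly, flow[i][j]


# Toy table (assumption): 15 days x 6 periods, value = 100 * day + period.
flow = [[100.0 * day + p for p in range(6)] for day in range(15)]
features, target = build_sample(flow, i=14, j=4, m=3, n=2)
print(features, target)
```

With `m = 3` and `n = 2` each sample has five predictors: three from earlier periods of the same day and two from the same period one and two weeks earlier.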
Further, optionally, constructing the combined kernel function by the kernel function includes: acquiring a kernel function; combining according to the kernel functions to obtain combined kernel functions; wherein the combined kernel function comprises:
where $\sigma$ is the kernel function width; $d$ determines the distribution of the data in the high-dimensional space; and $\lambda$ is a weight coefficient satisfying the constraint $0 \le \lambda \le 1$.
Specifically, the traffic flow prediction method provided in the embodiment of the present application combines common kernel functions to construct the combined kernel functions shown in formulas (1) and (2).
Because the kernel function of the Relevance Vector Machine (RVM) does not need to satisfy Mercer's theorem, the validity of the combined kernel function does not need to be verified. The combined kernel function gives the RVM both the local learning capability of the Gaussian and Laplace kernel functions and the stronger generalization capability of the polynomial kernel function.
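Formulas (1) and (2) are not reproduced in this text, so the following is only a sketch of one plausible form: a convex combination, weighted by λ, of a local kernel (Gaussian or Laplace) and a polynomial kernel. The exact kernel expressions, the `+ 1` offset in the polynomial kernel, and all function names are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, z, sigma):
    # Assumed form: exp(-||x - z||^2 / (2 * sigma^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def laplace_kernel(x, z, sigma):
    # Assumed form: exp(-||x - z|| / sigma)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)))
    return math.exp(-dist / sigma)


def polynomial_kernel(x, z, d):
    # Assumed form: (x . z + 1)^d
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** d


def combined_kernel(x, z, sigma, d, lam, local=gaussian_kernel):
    """Weighted sum of a local kernel and the polynomial kernel, 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * local(x, z, sigma) + (1.0 - lam) * polynomial_kernel(x, z, d)


k_gauss = combined_kernel([1.0, 0.0], [1.0, 0.0], sigma=1.0, d=2, lam=0.5)
k_laplace = combined_kernel([1.0, 0.0], [0.0, 1.0], sigma=1.0, d=2, lam=0.5,
                            local=laplace_kernel)
print(k_gauss, k_laplace)
```

Setting `lam` to 1 recovers the purely local kernel and setting it to 0 recovers the polynomial kernel, which is why λ has to be tuned together with σ and d.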
Further, optionally, optimizing the parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm includes: randomly initializing at least two populations; respectively performing genetic algorithm operation and particle swarm algorithm operation according to at least two populations, and calculating individual fitness of the at least two populations in each iteration; and obtaining parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the iteration to enter the next iteration.
Specifically, unlike the general RVM parameter optimization problem, the weight $\lambda$ in the combined kernel functions of formulas (1) and (2) is also a hyperparameter to be optimized. The mathematical model of the combined-kernel RVM parameter optimization problem can therefore be expressed as:
$$P = \{\sigma_{best}, \lambda_{best}, d_{best}\} \tag{3}$$
The traffic flow prediction method provided by the embodiment of the application constructs its parameter optimization algorithm from a genetic algorithm (GA) and a particle swarm optimization algorithm (PSO). Two populations are initialized randomly, and the GA operation and the PSO operation are performed on them respectively. In each iteration, the individual fitness of both populations is calculated and the better of the two results is taken as the result of the current iteration and enters the next iteration. The update of $\sigma$ is shown in formula (4).
The updates of $\lambda$ and $d$ are similar to that of $\sigma$. GA and PSO share the property of iterative optimization, and their hybrid algorithm makes full use of the large search range of GA and the fast convergence of PSO during iteration. Parameter optimization yields $\{\sigma_{best}, \lambda_{best}, d_{best}\}$, which is input as the operating parameters of the RVM, and the accuracy (Accuracy) of the RVM prediction model is then computed. With Accuracy defined as the objective function, the RVM parameter optimization problem can be described by formula (5),
where the variation range of $\sigma$ is usually $\pm 8$, and $d_{\min}$ and $d_{\max}$ bound the variation range of $d$ and are generally $0$ and $\infty$. Finally, the result is checked against the termination condition. When the termination condition is satisfied, the iteration of the parameter optimization algorithm terminates. The termination condition is shown in formula (6).
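$$\min\{\mathrm{fitness}_{ga}, \mathrm{fitness}_{pso}\} \le \mathrm{fitness}_{\min} \quad \text{or} \quad T \ge T_{\max} \tag{6}$$
where $\mathrm{fitness}_{\min}$ is the minimum fitness, i.e., the smallest acceptable error; $T$ is the number of iterations; and $T_{\max}$ is the maximum number of iterations.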
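The interplay between the two searches and the termination test can be sketched as follows. This is a simplified illustration rather than the patented procedure: `ga_step` and `pso_step` are hypothetical placeholders for one full GA generation and one full PSO update, the fitness is a toy error on a single parameter, and the loop stops once the better fitness is small enough or the iteration count reaches its maximum.

```python
import random


def hybrid_search(fitness, ga_step, pso_step, init_ga, init_pso, fitness_min, t_max):
    """Keep the better of the GA and PSO results in each iteration."""
    ga_best, pso_best = init_ga, init_pso
    t = 0
    while True:
        # Compare the two results; the better one is the result of this iteration.
        best = min(ga_best, pso_best, key=fitness)
        # Stop when the error is acceptable or the iteration budget is used up.
        if fitness(best) <= fitness_min or t >= t_max:
            return best, t
        # Both searches continue from the better result.
        ga_best, pso_best = ga_step(best), pso_step(best)
        t += 1


rng = random.Random(0)


def fitness(sigma):
    # Toy error (assumption): squared distance from an arbitrary optimum at 2.0.
    return (sigma - 2.0) ** 2


def ga_step(sigma):
    # Placeholder for a GA generation: wide random perturbations.
    return min((sigma + rng.uniform(-1.0, 1.0) for _ in range(5)), key=fitness)


def pso_step(sigma):
    # Placeholder for a PSO update: narrow random perturbations.
    return min((sigma + rng.uniform(-0.1, 0.1) for _ in range(5)), key=fitness)


best, t = hybrid_search(fitness, ga_step, pso_step, init_ga=8.0, init_pso=-3.0,
                        fitness_min=1e-3, t_max=50)
print(best, t)
```

The wide step plays the role of the GA's large search range and the narrow step the role of the PSO's fast local convergence.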
In the GA operation, binary coding is the most commonly used coding method for the RVM parameter optimization problem. The GA evaluates each set of parameters with a fitness function to guide the iteration in a favorable direction; the mean squared error (MSE) of the RVM is used here as the fitness function. After the fitness is calculated, individuals are selected by roulette-wheel selection, and a new population of candidate solutions is generated through operations such as crossover and mutation.
In the PSO operation, the MSE of the RVM is likewise used as the fitness function to evaluate the quality of each particle. The individual extremum and the global extremum are found from the calculation results, and the velocity and position of each particle are then updated according to formulas (7) and (8).
$$v = v_i + c_1 r_1 (p_{best} - x_i) + c_2 r_2 (G_{best} - x_i) \tag{7}$$
$$x = x_i + v \tag{8}$$
In formulas (7) and (8), $v$ is the updated particle velocity; $x$ is the updated particle position; $v_i$ is the current velocity of the particle; $x_i$ is the current position of the particle; $c_1$ and $c_2$ are learning factors; $r_1$ and $r_2$ are random numbers in $(0, 1)$; $p_{best}$ is the individual extremum; and $G_{best}$ is the global extremum.
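A one-dimensional sketch of the particle update is given below. It uses the conventional PSO form in which the new position is the current position plus the updated velocity; the default learning factors `c1 = c2 = 2.0` and the function name `pso_update` are assumptions, not values from the source.

```python
import random


def pso_update(x_i, v_i, p_best, g_best, c1=2.0, c2=2.0, rng=random):
    """Return the updated (position, velocity) of one particle."""
    # rng.random() draws from [0, 1), standing in for r1 and r2.
    r1, r2 = rng.random(), rng.random()
    # Velocity update: pull toward the individual and the global extremum.
    v = v_i + c1 * r1 * (p_best - x_i) + c2 * r2 * (g_best - x_i)
    # Position update: move by the updated velocity.
    x = x_i + v
    return x, v


x_new, v_new = pso_update(x_i=1.0, v_i=0.5, p_best=3.0, g_best=4.0,
                          rng=random.Random(0))
print(x_new, v_new)
```

In the parameter optimization problem each particle would carry a position and velocity for each of σ, λ and d, and the same update would be applied per component.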
The GA and the PSO in the parameter optimization algorithm of the traffic flow prediction method provided in the embodiment of the present application can each be divided into three parts: population initialization, population updating, and fitness calculation. FIG. 2 is a schematic diagram of the structure of the parameter optimization algorithm in the traffic flow prediction method according to the embodiment of the present invention. As shown in FIG. 2, the population initialization part randomly generates the initial population and calculates its fitness. The population updating part comprises selection, crossover, mutation, velocity update, position update, and population update. The fitness calculation part traverses all individuals in the population and calculates their fitness. The parameter optimization algorithm was run 20 times with a maximum of 10 iterations and a population size of 10, and the average time spent in each part was calculated and recorded.
FIG. 3 is a schematic diagram of the time consumed by each part of the parameter optimization algorithm in the traffic flow prediction method according to the embodiment of the present invention. As shown in FIG. 3, fitness calculation consumes the most time, more than 90% of the total. The population updating part, although its computation logic is more complex, accounts for only about 0.1% of the total. Because fitness calculation takes so long, the traffic flow prediction method provided by the embodiment of the application uses the Spark platform to parallelize the algorithm and reduce the running time of fitness calculation.
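The share of run time spent on fitness calculation bounds how much parallelization can help. As a rough illustration using Amdahl's law (an idealization that is not part of the original description and ignores scheduling and communication overhead), let $p$ be the fraction of time spent on fitness calculation and $k$ the number of parallel workers; both symbols are introduced here only for this example:

$$S(k) = \frac{1}{(1 - p) + p / k}$$

Taking $p = 0.9$, the lower bound reported for fitness calculation, and $k = 4$ gives $S(4) = 1 / (0.1 + 0.225) \approx 3.08$, and the speedup can approach at most $1 / (1 - p) = 10$ as $k$ grows. A larger $p$ raises this ceiling.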
Optionally, the traffic flow prediction method provided in the embodiment of the present application further includes: and training the combined kernel function after the parameters are optimized through parallel training to obtain a trained traffic flow prediction model.
Further, optionally, training the combined kernel function after the parameters are optimized through parallelization training, and obtaining the trained traffic flow prediction model includes: randomly initializing a population; dividing the initialized population to obtain at least two sub-populations; calculating individual fitness of at least two sub-populations; carrying out population updating according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and verifying the updated combined kernel function to obtain a traffic flow prediction model.
Specifically, Spark, as used in the traffic flow prediction method provided in the embodiment of the present application, is a fast, general-purpose computing engine designed for large-scale data processing. Because intermediate results can be kept in memory, Spark is well suited to algorithms that require iteration, such as those in data mining and machine learning. The SPGAPSO-CKRVM algorithm design relies mainly on Spark's Resilient Distributed Dataset (RDD). The overall flow of the SPGAPSO-CKRVM algorithm is shown in FIG. 4, which is a schematic diagram of the SPGAPSO-CKRVM algorithm flow in the traffic flow prediction method according to the embodiment of the present invention.
As shown in fig. 4, the SPGAPSO-CKRVM algorithm flow is specifically as follows:
STEP 1 initialization
Initialize the Spark operating parameters. The key parameters are spark.default.parallelism, executor-cores and num-executors, which correspond respectively to the default number of RDD partitions, the number of cores occupied by each executor, and the total number of executors. The experimental data are read from an external file and scaled to the interval (0,1) by extremum-based normalization according to formula (9).
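$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{9}$$
where $X$ is the original data, $X'$ is the normalized data, and $X_{\max}$ and $X_{\min}$ are the maximum and minimum values of the original data set. A set of $\{\sigma, \lambda, d\}$ values is randomly generated as the initial population, and an RDD is created from the initial population.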
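Extremum-based normalization is commonly implemented as min-max scaling, which the following sketch assumes. The function name `min_max_normalize` and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in values]


# Toy flow counts (assumption), not data from the experiments.
scaled = min_max_normalize([10.0, 20.0, 30.0])
print(scaled)
```

The minimum and maximum should be taken from the training data and reused for the test data, so that both are scaled consistently.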
STEP 2 calculates individual fitness
Divide the initial population into n sub-populations and use the map function to calculate the individual fitness of each sub-population in parallel. Because RDD transformations are evaluated lazily, the collect function is used to trigger the computation defined in the map function and to convert the RDD-type sub-populations into a list. One part of the sub-populations then undergoes the SPGA operation and the other part the SPPSO operation.
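The split, map and collect pattern of this step can be illustrated without a Spark cluster. The sketch below swaps in Python's standard `ThreadPoolExecutor` as a stand-in for the RDD `map` and `collect` calls, so it shows only the structure of the computation and not Spark's API or its performance; `toy_fitness` is a hypothetical placeholder for training an RVM and returning its MSE.

```python
from concurrent.futures import ThreadPoolExecutor


def split(population, n):
    """Divide the population into n sub-populations of near-equal size."""
    return [population[k::n] for k in range(n)]


def evaluate_subpopulation(sub, fitness):
    """Compute the fitness of every individual in one sub-population."""
    return [(individual, fitness(individual)) for individual in sub]


def parallel_fitness(population, fitness, n):
    subs = split(population, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map plays the role of RDD.map; list() gathers the results,
        # which is the role collect() plays for a lazily evaluated RDD.
        results = list(pool.map(lambda s: evaluate_subpopulation(s, fitness), subs))
    return [pair for sub in results for pair in sub]


def toy_fitness(params):
    # Placeholder error for an individual (sigma, lam, d); not an RVM.
    sigma, lam, d = params
    return (sigma - 1.0) ** 2 + (lam - 0.5) ** 2 + (d - 2.0) ** 2


population = [(0.5, 0.1, 1.0), (1.0, 0.5, 2.0), (2.0, 0.9, 3.0),
              (4.0, 0.3, 2.0), (1.5, 0.7, 1.0), (0.8, 0.4, 2.0)]
scored = parallel_fitness(population, toy_fitness, n=3)
print(min(scored, key=lambda pair: pair[1]))
```

Each individual's fitness depends only on its own parameters, so the evaluations are independent and can be distributed across workers without coordination.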
STEP 3 population update
In the SPGA operation, first, a selection operation is performed based on the calculation result of the individual fitness and the formula (10).
Equation (10) means that the fitter an individual is, the greater its probability of being selected. Crossover and mutation operations are then performed on the selected individuals to generate the next-generation population. In the SPPSO operation, the velocity and position of each particle are updated according to equations (7) and (8) to generate the next-generation population. The individual fitness is then recalculated, the individual fitness of the two next-generation populations is compared, and the optimal value is kept.
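The two update rules in STEP 3 can be illustrated as follows. Equations (7), (8) and (10) are not reproduced in this passage, so the sketch assumes the textbook forms: fitness-proportional (roulette-wheel) selection, and the standard particle swarm update with an inertia weight w and learning factors c1 and c2. All names and default values are illustrative assumptions.

```python
import random

def selection_probabilities(fitnesses):
    """Fitness-proportional probabilities (assumes positive fitness values)."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def roulette_select(population, fitnesses, k, rng):
    """Draw k individuals with replacement; fitter ones are drawn more often."""
    return rng.choices(population, weights=fitnesses, k=k)

def pso_step(position, velocity, personal_best, global_best, rng,
             w=0.7, c1=1.5, c2=1.5):
    """One textbook PSO update of a particle's velocity and position."""
    new_velocity = [
        w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        for x, v, pb, gb in zip(position, velocity, personal_best, global_best)
    ]
    new_position = [x + nv for x, nv in zip(position, new_velocity)]
    return new_position, new_velocity

rng = random.Random(0)
probs = selection_probabilities([1.0, 3.0])
parents = roulette_select(["a", "b"], [1.0, 3.0], k=5, rng=rng)
pos, vel = pso_step([1.0, 2.0], [0.0, 0.0], [1.0, 2.0], [1.0, 2.0], rng)
```

A particle that already sits on both its personal best and the global best with zero velocity does not move, which is a quick sanity check on the update rule.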
STEP 4 Verify accuracy
The result obtained in STEP 3 is decoded into {σ, λ, d} and used as the parameter input of the RVM. If the prediction accuracy meets expectations, the result is output as the final solution $\{\sigma_{best}, \lambda_{best}, d_{best}\}$. Otherwise, the flow returns to STEP 3.
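The control flow of STEP 3 and STEP 4 taken together can be sketched as a loop that updates two populations by different rules, keeps the best individual, and stops once a target is met. Everything below is a toy stand-in under stated assumptions: individuals are single numbers rather than {σ, λ, d} triples, the fitness is an arbitrary function with its optimum at 3.0 rather than an RVM prediction accuracy, and the update rules are simplified.

```python
import random

def hybrid_search(fitness, ga_pop, pso_pop, ga_update, pso_update,
                  target, max_iter=50):
    """Alternate two update rules, keeping the best individual seen so far."""
    best = max(ga_pop + pso_pop, key=fitness)
    for _ in range(max_iter):
        if fitness(best) >= target:          # STEP 4: result is good enough
            break
        ga_pop = ga_update(ga_pop)           # STEP 3: GA-style branch
        pso_pop = pso_update(pso_pop, best)  # STEP 3: PSO-style branch
        candidate = max(ga_pop + pso_pop, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate                 # keep the optimal value
    return best

rng = random.Random(1)

def toy_fitness(x):
    """Hypothetical fitness with its maximum (0.0) at x = 3.0."""
    return -abs(x - 3.0)

def mutate(pop):
    """GA-style stand-in: perturb every individual with Gaussian noise."""
    return [x + rng.gauss(0.0, 0.5) for x in pop]

def attract(pop, best):
    """PSO-style stand-in: move every individual halfway toward the best."""
    return [x + 0.5 * (best - x) for x in pop]

start = [rng.uniform(0.0, 6.0) for _ in range(8)]
result = hybrid_search(toy_fitness, start[:4], start[4:], mutate, attract,
                       target=-0.05)
```

Because the best individual is only ever replaced by a strictly better one, the returned fitness can never fall below the best fitness in the starting populations.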
In the experiments, the traffic flow prediction method provided by the embodiment of the present application uses traffic flow data collected from the geomagnetic induction coils at monitoring points No. 1027, No. 1036 and No. 1042 on Whitemud Drive in Canada as experimental data. The data were provided by the intelligent traffic research center at the University of Alberta, Canada. The data set contains traffic flow data from August 6, 2015 to August 28, 2015, collected at an interval of 20 s. The data of August 28 form the test set, and the remaining data form the training set. The monitoring point information corresponding to the experimental data is as follows:
Data set 1: eastbound data from monitoring point No. 1027 on Whitemud Drive
Data set 2: westbound data from monitoring point No. 1036 on Whitemud Drive
Data set 3: data from the westbound ramp at monitoring point No. 1042 on Whitemud Drive
Data lost due to failures of the acquisition equipment are repaired using a historical trend method and an exponential smoothing method. Part of the repaired experimental data is shown in fig. 5(a) and 5(b).
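The text does not give the details of the repair procedure. As one possible reading of the exponential smoothing part, the sketch below replaces each missing reading with the current exponentially smoothed level of the preceding observations. The function name, the use of None for a missing reading, and the smoothing factor alpha are assumptions made for illustration.

```python
def fill_missing(series, alpha=0.5):
    """Replace None entries with the current exponentially smoothed level."""
    filled, level = [], None
    for value in series:
        if value is None:
            if level is None:
                raise ValueError("series must start with an observed value")
            value = level                     # impute from the smoothed history
        # Update the smoothed level with the (observed or imputed) value.
        level = value if level is None else alpha * value + (1 - alpha) * level
        filled.append(value)
    return filled

repaired = fill_missing([10.0, 20.0, None, 30.0], alpha=0.5)
```

With alpha = 0.5 the level after the first two readings is 15.0, so the gap is filled with 15.0 and the observed values are left unchanged.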
As can be seen from fig. 5(a), the experimental data are continuous and periodic, with a pronounced tidal pattern. As can be seen from fig. 5(b), 7:00-9:00 and 16:00-19:00 are the peak periods of traffic flow.
In the traffic flow prediction method provided by the embodiment of the application, the performance of the SPGAPSO-CKRVM algorithm is verified on an 8-node Spark cluster built from virtual machines. The detailed cluster configuration is: CentOS-6.10-x64, Spark-2.1.1-bin-hadoop2.7, hadoop-2.7.2.tar, pyspark-2.3.2, py4j-0.10.8.1. The parameter settings of the SPGAPSO-CKRVM algorithm are shown in table 1.
TABLE 1 SPGAPSO-CKRVM Algorithm parameter set
Traffic congestion, travel delays and road accidents mostly occur during peak hours, so predicting traffic flow during peak hours is more meaningful. Accordingly, in the field of traffic flow prediction, a kernel function that predicts more accurately during peak hours should be regarded as performing better. In addition to the accuracy (1-MAPE), the traffic flow prediction method provided by the embodiment of the application defines a Peak Hour Accuracy (PHA), that is, the accuracy of traffic flow prediction during 7:00-9:00 and 16:00-19:00, to evaluate the performance of the kernel function.
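The two accuracy measures can be computed as sketched below. The sketch assumes MAPE is expressed as a fraction and that PHA is simply 1-MAPE restricted to samples whose timestamps fall in the two peak windows; treating the end of each window as exclusive is an assumption, and the sample values are made up.

```python
from datetime import datetime, time

PEAK_WINDOWS = [(time(7, 0), time(9, 0)), (time(16, 0), time(19, 0))]

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (actual values must be nonzero)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def accuracy(actual, predicted):
    """Accuracy defined as 1 - MAPE."""
    return 1.0 - mape(actual, predicted)

def is_peak(ts):
    """True when the timestamp lies in a peak window (end exclusive, an assumption)."""
    return any(start <= ts.time() < end for start, end in PEAK_WINDOWS)

def peak_hour_accuracy(timestamps, actual, predicted):
    """Accuracy computed only over samples that fall in the peak windows."""
    pairs = [(a, p) for ts, a, p in zip(timestamps, actual, predicted) if is_peak(ts)]
    if not pairs:
        raise ValueError("no samples in peak hours")
    peak_actual, peak_predicted = zip(*pairs)
    return accuracy(peak_actual, peak_predicted)

timestamps = [datetime(2015, 8, 28, h, 0) for h in (6, 8, 12, 17)]
actual = [100.0, 200.0, 150.0, 250.0]        # made-up flows
predicted = [90.0, 180.0, 150.0, 275.0]      # made-up predictions
overall = accuracy(actual, predicted)
pha = peak_hour_accuracy(timestamps, actual, predicted)
```

In this made-up example the off-peak noon sample is predicted exactly, so the overall accuracy is higher than the PHA, which shows why the two measures can rank kernels differently.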
TABLE 2 Kernel function Performance test results
As can be seen from Table 2, all kernel functions are more accurate on data set 1 and data set 2 than on data set 3. This is because the traffic flow on the ramp corresponding to data set 3 is more random and therefore harder to predict. The combined kernel functions numbered 5 and 6 generally outperform the single kernel functions numbered 1-4. Except on data set 1, the combined kernel function numbered 6 is always more accurate than the combined kernel function numbered 5. On all three data sets, the PHA of the combined kernel function numbered 6 is higher than that of the combined kernel function numbered 5. Therefore, the combined kernel function numbered 6 is used as the kernel function of the RVM in the subsequent experiments.
The SPGAPSO-CKRVM proposed in the traffic flow prediction method provided by the embodiment of the application is compared with existing machine learning and deep learning algorithms, using RMSE, MAE and MAPE as evaluation indexes. The results of the experiment are shown in table 3.
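For reference, the three evaluation indexes can be computed as in the following sketch, which uses their standard definitions. Reporting MAPE as a percentage is a presentation choice made here, and the sample values are made up.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape_percent(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0]       # made-up flows
predicted = [110.0, 180.0]    # made-up predictions
scores = (rmse(actual, predicted), mae(actual, predicted), mape_percent(actual, predicted))
```

RMSE penalizes the larger error more heavily than MAE does, while MAPE weighs each error relative to the size of the actual flow.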
TABLE 3 Algorithm accuracy test results
Here, CNN-GRU is a CNN combined with a GRU; CNN-Bi-LSTM is a CNN combined with a bidirectional LSTM; SPGAPSO-SVM optimizes an SVM using the same parameter optimization algorithm as SPGAPSO-CKRVM; and GA-CKRVM is a combined-kernel RVM optimized using GA alone. As can be seen from Table 3, the conventional PSO-SVR performs the worst. The RNN-based deep learning algorithms, such as LSTM, CNN-GRU and CNN-Bi-LSTM, are clearly superior to the SVM-based algorithms. GA-CKRVM performs slightly better than the above algorithms. SPGAPSO-CKRVM outperforms all the other compared algorithms on all three data sets.
The algorithm scalability experiment tests whether adding nodes increases the running speed of the algorithm, and the parallelization effect is measured by the acceleration ratio (speedup). The experiment runs the SPGAPSO-CKRVM algorithm 10 times on a single node, 2 nodes, 4 nodes and 8 nodes, records the results, and calculates the acceleration ratio as follows:
$$S_n = \frac{T_1}{T_n}$$
where $S_n$ is the acceleration ratio on n nodes, $T_1$ is the serial running time of the algorithm, and $T_n$ is the running time of the algorithm in parallel on n nodes. Ideally, the acceleration ratio equals the number of nodes. The results of the experiment are shown in fig. 6(a) and 6(b).
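The acceleration ratio (speedup) is taken here as the serial running time divided by the parallel running time, which is the usual definition and matches the stated ideal of one unit of speedup per node. The sketch below also reports the fraction of the ideal value achieved; that second quantity and the timing values are illustrative additions, not measurements from the experiment.

```python
def speedup(serial_time, parallel_time):
    """Acceleration ratio: serial running time over parallel running time."""
    return serial_time / parallel_time

def fraction_of_ideal(serial_time, parallel_time, nodes):
    """Achieved speedup divided by the ideal speedup (the number of nodes)."""
    return speedup(serial_time, parallel_time) / nodes

# Made-up running times in seconds, keyed by node count.
timings = {1: 100.0, 2: 60.0, 4: 40.0, 8: 25.0}
ratios = {n: speedup(timings[1], t) for n, t in timings.items()}
share = fraction_of_ideal(timings[1], timings[8], 8)
```

Averaging the running times over repeated runs before taking the ratio reduces the influence of cluster start-up noise.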
The population size determines the computational load of SPGAPSO-CKRVM. As can be seen from fig. 6(a), the running time increases linearly as the computational load increases. When the computational load is small, the training time differs little among a single node, 2 nodes, 4 nodes and 8 nodes. As the computational load increases, the running time on 8 nodes becomes far lower than that on 4 nodes, 2 nodes and a single node. This is because the more nodes there are, the less computation each node is responsible for.
As can be seen from fig. 6(b), when the computational load is small, the acceleration effect is not significant. This is because basic operations such as starting the cluster, dividing tasks and allocating resources take time, so the cluster does not achieve the ideal effect. As the computational load increases, the advantages of parallel computing become increasingly evident, and the acceleration ratio rises and gradually approaches the ideal value. The experimental results show that the SPGAPSO-CKRVM proposed in the traffic flow prediction method provided by the embodiment of the application has good scalability.
The traffic flow prediction method provided by the embodiment of the application constructs a traffic flow prediction model using a combined-kernel RVM and multiple heuristic algorithms, and uses Spark parallelization technology to parallelize the parameter optimization algorithm of the RVM, thereby proposing SPGAPSO-CKRVM. Experiments on real data from the Whitemud Drive highway in Canada show that the proposed model is superior to other methods in prediction accuracy and effectively shortens the parameter optimization time.
In the embodiment of the invention, historical traffic flow data are obtained and determined as input data; the input data are input into a traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function from kernel functions, optimizing the parameters of the combined kernel function using a genetic algorithm and a particle swarm algorithm, and performing parallelized training. This achieves the technical effects of improving accuracy and computational efficiency, and solves the technical problems of low accuracy and low efficiency of machine learning and deep learning in traffic flow prediction in the prior art.
Example 2
According to an aspect of the embodiment of the present invention, there is provided a traffic flow prediction apparatus, and fig. 7 is a schematic diagram of the traffic flow prediction apparatus according to the embodiment of the present invention, and as shown in fig. 7, the traffic flow prediction apparatus according to the embodiment of the present application includes: the acquisition module 72 is used for acquiring historical traffic flow data and determining the historical traffic flow data as input data; the prediction module 74 is used for inputting the input data into the traffic flow prediction model for prediction to obtain a traffic flow prediction result; the traffic flow prediction model is a model obtained by constructing a combined kernel function through the kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
Optionally, the input data includes:
where the target value is the traffic flow to be predicted, i denotes the day, and j denotes the time period; n is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding time period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of time period j over the n weeks preceding day i; m is commonly referred to as the prediction step size, and m and n are both positive integers.
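The structure of one input sample can be made concrete with a short sketch. It assumes the flows are stored as a day-by-period table indexed as flows[i][j], which is a storage choice made here for illustration; the function name and the synthetic values are likewise hypothetical.

```python
def build_sample(flows, i, j, m, n):
    """Return (features, target) for day i and time period j.

    features = the m periods before j on day i, followed by period j on the
    same weekday in each of the previous n weeks (oldest first).
    """
    if j - m < 0 or i - 7 * n < 0:
        raise IndexError("not enough history for this sample")
    recent = flows[i][j - m:j]                               # x[i][j-m] .. x[i][j-1]
    weekly = [flows[i - 7 * k][j] for k in range(n, 0, -1)]  # x[i-7n][j] .. x[i-7][j]
    return recent + weekly, flows[i][j]

# Synthetic table: 15 days by 6 periods, value = day * 100 + period.
flows = [[day * 100 + period for period in range(6)] for day in range(15)]
features, target = build_sample(flows, i=14, j=4, m=3, n=2)
```

The first m features capture the short-term trend within the day, while the last n features capture the weekly periodicity at the same time of day.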
Optionally, the traffic flow prediction apparatus provided in the embodiment of the present application includes: a function building module, configured to construct a combined kernel function from kernel functions; wherein the function building module comprises: an acquisition unit configured to acquire kernel functions; and a function building unit configured to combine the kernel functions to obtain the combined kernel function; wherein the combined kernel function comprises:
where σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to lie between 0 and 1.
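The formula of the combined kernel function is not reproduced in this passage. Purely as an illustration of how the three parameters could interact, the sketch below assumes a convex combination of a Gaussian (RBF) kernel of width σ and a polynomial kernel of degree d, weighted by λ. The choice of these two component kernels and their exact forms are assumptions and may differ from the kernels actually used.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian kernel with width sigma."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def poly_kernel(x, y, d):
    """Polynomial kernel of degree d with unit offset."""
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** d

def combined_kernel(x, y, sigma, lam, d):
    """Assumed combination: lam * RBF + (1 - lam) * polynomial."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie between 0 and 1")
    return lam * rbf_kernel(x, y, sigma) + (1.0 - lam) * poly_kernel(x, y, d)

x = [1.0, 0.0]
only_rbf = combined_kernel(x, x, sigma=1.0, lam=1.0, d=2)
only_poly = combined_kernel(x, x, sigma=1.0, lam=0.0, d=2)
mixed = combined_kernel(x, x, sigma=1.0, lam=0.5, d=2)
```

Under this assumption, λ moves the model between a local kernel (RBF) and a global one (polynomial), which is why {σ, λ, d} are optimized jointly.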
Further, optionally, the traffic flow prediction apparatus provided in the embodiment of the present application includes: an optimization module for optimizing parameters of the combined kernel function using a genetic algorithm and a particle swarm algorithm, wherein the optimization module comprises: a first initialization unit for initializing at least two populations at random; the first calculation unit is used for performing genetic algorithm operation and particle swarm algorithm operation respectively according to at least two populations and calculating the individual fitness of the at least two populations in each iteration; and the comparison unit is used for obtaining the parameters of the optimized combined kernel function by comparing the optimal values in the individual fitness of at least two populations and taking the optimal values as the result of the current iteration to enter the next iteration.
Optionally, the traffic flow prediction apparatus provided in the embodiment of the present application further includes: and the model training module is used for training the combined kernel function after the parameters are optimized through parallel training to obtain a trained traffic flow prediction model.
Further, optionally, the model training module includes: the second initialization unit is used for initializing the population randomly; the dividing unit is used for dividing the initialized population to obtain at least two sub-populations; the second calculating unit is used for calculating the individual fitness of at least two sub-populations; the updating unit is used for updating the population according to the individual fitness of at least two sub-populations to obtain an updated combined kernel function; and the verification unit is used for verifying the updated combined kernel function to obtain a traffic flow prediction model.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A traffic flow prediction method, comprising:
obtaining historical traffic flow data, and determining the historical traffic flow data as input data;
inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result;
the traffic flow prediction model is a model obtained by constructing a combined kernel function through a kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
2. The method of claim 1, wherein the input data comprises:
where the target value is the traffic flow to be predicted, i denotes the day, and j denotes the time period; n is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding time period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of time period j over the n weeks preceding day i; m is commonly referred to as the prediction step size, and m and n are both positive integers.
3. The method of claim 2, wherein constructing the combined kernel function by the kernel function comprises:
acquiring a kernel function;
combining according to the kernel functions to obtain the combined kernel functions;
wherein the combined kernel function comprises:
where σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to lie between 0 and 1.
4. The method of claim 3, wherein optimizing the parameters of the combined kernel function using genetic and particle swarm algorithms comprises:
randomly initializing at least two populations;
respectively performing genetic algorithm operation and particle swarm algorithm operation according to the at least two populations, and calculating individual fitness of the at least two populations in each iteration;
and comparing the optimal values in the individual fitness of the at least two populations, and taking the optimal values as the result of the current iteration to enter the next iteration to obtain the optimized parameters of the combined kernel function.
5. The method of claim 4, further comprising:
and training the combined kernel function after the parameters are optimized through parallel training to obtain a trained traffic flow prediction model.
6. The method of claim 5, wherein the training the combined kernel function after optimizing the parameters through parallelization training to obtain the trained traffic prediction model comprises:
randomly initializing a population;
dividing the initialized population to obtain at least two sub-populations;
calculating the individual fitness of the at least two sub-populations;
performing population updating according to the individual fitness of the at least two sub-populations to obtain the updated combined kernel function;
and verifying the updated combined kernel function to obtain the traffic flow prediction model.
7. A traffic flow prediction device characterized by comprising:
the acquisition module is used for acquiring historical traffic flow data and determining the historical traffic flow data as input data;
the prediction module is used for inputting the input data into a traffic flow prediction model for prediction to obtain a traffic flow prediction result;
the traffic flow prediction model is a model obtained by constructing a combined kernel function through a kernel function, optimizing parameters of the combined kernel function by using a genetic algorithm and a particle swarm algorithm and performing parallelization training.
8. The apparatus of claim 7, wherein the input data comprises:
where the target value is the traffic flow to be predicted, i denotes the day, and j denotes the time period; n is the total number of samples; $(x_{i,j-m}, x_{i,j-m+1}, \ldots, x_{i,j-1})^T$ represents the traffic flow of the m time periods preceding time period j on day i; $(x_{i-7n,j}, x_{i-7n+7,j}, \ldots, x_{i-7,j})^T$ represents the traffic flow of time period j over the n weeks preceding day i; m is commonly referred to as the prediction step size, and m and n are both positive integers.
9. The apparatus of claim 8, wherein the apparatus comprises:
the building function module is used for building a combined kernel function through the kernel function;
wherein the construction function module comprises:
an acquisition unit configured to acquire a kernel function;
the building function unit is used for combining according to the kernel functions to obtain the combined kernel functions;
wherein the combined kernel function comprises:
where σ is the kernel function width; d determines the distribution of the data in the high-dimensional space; and λ is a weight coefficient constrained to lie between 0 and 1.
10. The apparatus of claim 9, wherein the apparatus comprises:
an optimization module for optimizing parameters of the combined kernel function using a genetic algorithm and a particle swarm algorithm, wherein the optimization module comprises:
a first initialization unit for initializing at least two populations at random;
the first calculation unit is used for respectively carrying out genetic algorithm operation and particle swarm algorithm operation according to the at least two populations, and calculating the individual fitness of the at least two populations in each iteration;
and the comparison unit is used for comparing the optimal values in the individual fitness of the at least two populations and taking the optimal values as the result of the current iteration to enter the next iteration to obtain the optimized parameters of the combined kernel function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011001420.3A CN112508220A (en) | 2020-09-22 | 2020-09-22 | Traffic flow prediction method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112508220A true CN112508220A (en) | 2021-03-16 |
Family
ID=74953949
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011001420.3A Pending CN112508220A (en) | 2020-09-22 | 2020-09-22 | Traffic flow prediction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112508220A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114004993A (en) * | 2021-10-25 | 2022-02-01 | 厦门大学 | IA-SVM running condition identification method and device based on LSTM speed prediction optimization |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106127330A (en) * | 2016-06-18 | 2016-11-16 | 上海大学 | Fluctuating wind speed Forecasting Methodology based on least square method supporting vector machine |
CN106781465A (en) * | 2016-12-06 | 2017-05-31 | 广州市科恩电脑有限公司 | A kind of road traffic Forecasting Methodology |
CN107025468A (en) * | 2017-05-18 | 2017-08-08 | 重庆大学 | Highway congestion recognition methods based on PCA GA SVM algorithms |
CN109102884A (en) * | 2018-07-19 | 2018-12-28 | 南京邮电大学 | Parkinson disease diagnostic method based on mixed kernel function supporting vector machine model |
CN110674598A (en) * | 2019-08-26 | 2020-01-10 | 江苏师范大学 | Injection molding process optimization method based on support vector machine and particle swarm optimization |
CN110766237A (en) * | 2019-10-31 | 2020-02-07 | 内蒙古工业大学 | Bus passenger flow prediction method and system based on SPGAPSO-SVM algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | |
Application publication date: 20210316 |