CN115311842A

CN115311842A - Traffic flow prediction model training and traffic flow prediction method, device and electronic equipment

Info

Publication number: CN115311842A
Application number: CN202110496420.3A
Authority: CN
Inventors: 刘运胜; 何光旭; 曹勇; 王昌建
Original assignee: Hangzhou Hikvision Digital Technology Co Ltd
Current assignee: Hangzhou Hikvision Digital Technology Co Ltd
Priority date: 2021-05-07
Filing date: 2021-05-07
Publication date: 2022-11-08

Abstract

The embodiments of the application provide a traffic flow prediction model training method, a traffic flow prediction method, devices and electronic equipment. Historical vehicle passing data and the historical signal periods of an intersection are obtained. The time interval between two consecutive passes of the same vehicle is calculated from the historical vehicle passing times of the vehicles at the intersection in the historical vehicle passing data, and a deduplication time threshold corresponding to the vehicle is determined from the distribution of a plurality of such time intervals in a preset time interval. Deduplication is then performed on the historical vehicle passing data to be deduplicated, namely the records for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to that vehicle, to obtain updated historical vehicle passing data. The traffic flow in each historical signal period is counted from the updated historical vehicle passing data, and a traffic flow prediction model is trained on the traffic flow in each historical signal period. This scheme improves the accuracy of the traffic flow prediction model and of traffic flow prediction.

Description

Traffic flow prediction model training and traffic flow prediction method and device and electronic equipment

Technical Field

The application relates to the technical field of intelligent traffic, in particular to a traffic flow prediction model training and traffic flow prediction method, a device and electronic equipment.

Background

In recent years, as the national economy has continued to grow, the number of motor vehicles in cities has increased explosively, and urban traffic congestion has become increasingly serious. Traffic flow prediction is a main means of relieving urban traffic pressure: by predicting the traffic flow at each intersection in advance, the duration of the signal lights can be changed in real time, which improves the traffic efficiency of urban intersections.

At present, traffic flow prediction methods mainly predict on the basis of the traffic flow in fixed periods. For example, the signal period of the traffic light at an intersection is fixed at 50 seconds, the traffic flow at the intersection is counted every 50 seconds, and the counted historical traffic flow is analyzed and used for training to obtain a traffic flow prediction model. The traffic flow counted in the current period is then input into the traffic flow prediction model to obtain a prediction of the traffic flow in the next period.

However, because real-time road conditions change constantly, the duration of the intersection signal light should be adjusted dynamically, and this adjustment in turn continually affects the traffic flow. The above method, which predicts traffic flow from the traffic flow in fixed periods, cannot accommodate such dynamic changes, so the traffic flow prediction model has low accuracy and the traffic flow prediction results are inaccurate.

Disclosure of Invention

The embodiments of the application aim to provide a traffic flow prediction model training method, a traffic flow prediction method, devices and electronic equipment, so as to improve the accuracy of the traffic flow prediction model and, in turn, the accuracy of traffic flow prediction. The specific technical scheme is as follows:

In a first aspect, an embodiment of the present application provides a traffic flow prediction model training method, where the method includes:

acquiring historical vehicle passing data and historical signal periods of the intersection, wherein the historical vehicle passing data comprises historical vehicle passing time of each vehicle at the intersection, and the time lengths of the historical signal periods are different;

calculating the time interval between two consecutive passes of the same vehicle according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and determining the deduplication time threshold corresponding to the vehicle according to the distribution of a plurality of time intervals in a preset time interval;

performing deduplication on the historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle to obtain updated historical vehicle passing data, wherein the historical vehicle passing data to be deduplicated is the historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to the vehicle;

counting the traffic flow in each historical signal period according to the updated historical vehicle passing data;

and training a traffic flow prediction model based on the traffic flow in each historical signal period.

Optionally, the historical vehicle passing data further includes license plate data of each vehicle at the intersection;

the step of calculating the time interval between two consecutive passes of the same vehicle according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and determining the deduplication time threshold corresponding to the vehicle according to the distribution of a plurality of time intervals in a preset time interval, includes:

calculating first time intervals between two consecutive passes of vehicles having the same license plate data according to the license plate data and the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and calculating a first percentage distribution of a plurality of first time intervals in a preset time interval;

and determining, according to the first percentage distribution of the plurality of first time intervals in the preset time interval, the first time interval corresponding to the first percentage at which the first percentage distribution reaches a preset change condition, and using that first time interval as the deduplication time threshold corresponding to the vehicles having the same license plate data.

Optionally, the method further includes:

calculating second time intervals between two consecutive passes of unlicensed vehicles according to the historical vehicle passing time of each unlicensed vehicle at the intersection in the historical vehicle passing data, and calculating a second percentage distribution of a plurality of second time intervals in the preset time interval;

and determining, according to the second percentage distribution of the plurality of second time intervals in the preset time interval, the second time interval corresponding to the second percentage in the second percentage distribution that is equal to the first percentage at which the first percentage distribution reaches the preset change condition, and using that second time interval as the deduplication time threshold corresponding to the unlicensed vehicles.

Optionally, the step of performing deduplication processing on historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle to obtain updated historical vehicle passing data includes:

determining the historical vehicle passing data to be deduplicated according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data and the deduplication time threshold corresponding to each vehicle, wherein the historical vehicle passing data to be deduplicated are two pieces of historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to the vehicle;

and deleting, from the historical vehicle passing data to be deduplicated, the piece with the earlier historical vehicle passing time and retaining the piece with the later historical vehicle passing time, to obtain updated historical vehicle passing data.
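The keep-the-later-record rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `deduplicate`, the `(vehicle_id, pass_time)` tuple layout, and the `default_threshold` parameter are hypothetical, and passing times are assumed to be plain seconds.

```python
from collections import defaultdict


def deduplicate(records, thresholds, default_threshold=0.0):
    """Drop the earlier record of any consecutive pair closer than the threshold.

    records: list of (vehicle_id, pass_time_in_seconds) tuples (assumed layout).
    thresholds: dict mapping vehicle_id to its deduplication time threshold.
    """
    times_by_vehicle = defaultdict(list)
    for vehicle_id, pass_time in records:
        times_by_vehicle[vehicle_id].append(pass_time)

    kept = []
    for vehicle_id, times in times_by_vehicle.items():
        times.sort()
        limit = thresholds.get(vehicle_id, default_threshold)
        for i, current in enumerate(times):
            is_last = i == len(times) - 1
            # The earlier record of a too-close pair is deleted; the later one stays.
            if not is_last and times[i + 1] - current < limit:
                continue
            kept.append((vehicle_id, current))

    kept.sort(key=lambda record: record[1])
    return kept
```

Under this sketch, a chain of closely spaced captures of one vehicle collapses to its last record, because each record is compared only with the one that immediately follows it.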

Optionally, the step of training the traffic flow prediction model based on the traffic flow in each historical signal period includes:

extracting feature data in each historical signal period according to the historical vehicle passing data, each historical signal period, the traffic flow in each historical signal period and a preset feature data extraction strategy;

dividing the feature data in each historical signal period into a training set and a verification set;

training preset base models of multiple types by using the training set to obtain, for each base model, the feature importance of each kind of feature data;

for each type of base model, screening out, from the kinds of feature data of that base model, the feature data whose feature importance is greater than a preset threshold to obtain a primary selection feature set corresponding to that base model, wherein the primary selection feature set records the category information of the feature data whose feature importance is greater than the preset threshold;

taking the union of the primary selection feature sets corresponding to all types of base models to obtain a target feature set;

based on the target feature set, screening out a target training set from the training set and a target verification set from the verification set;

training a preset model to be trained by using the target training set, and performing hyper-parameter optimization based on the target verification set to obtain optimal hyper-parameters;

and combining the target training set and the target verification set, and retraining the preset model to be trained by using the optimal hyper-parameters to obtain the traffic flow prediction model.
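The feature screening and retraining steps above can be sketched in plain Python. The sketch assumes the per-model feature importances have already been computed and are supplied as dictionaries. All names (`select_target_features`, `project`, `tune_and_retrain`, and the `fit` and `error` callables) are hypothetical, and the exhaustive search over a candidate list merely stands in for whatever hyper-parameter optimization is actually used.

```python
def select_target_features(importances_by_model, threshold):
    """Union of the per-model primary selection feature sets."""
    target = set()
    for importances in importances_by_model.values():
        # Primary selection feature set of one base model.
        primary = {name for name, score in importances.items() if score > threshold}
        target |= primary
    return sorted(target)


def project(rows, target_features):
    """Keep only the target features in each sample (a dict per sample)."""
    return [{name: row[name] for name in target_features} for row in rows]


def tune_and_retrain(fit, error, train, valid, candidates):
    """Pick the hyper-parameter with the lowest verification error, then retrain.

    fit(data, hp) returns a model; error(model, data) returns a number.
    """
    best = min(candidates, key=lambda hp: error(fit(train, hp), valid))
    # Retrain on the combined target training and target verification sets.
    return fit(train + valid, best), best
```

Taking the union rather than the intersection keeps any feature that at least one base model considers important.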

In a second aspect, an embodiment of the present application provides a traffic flow prediction method, where the method includes:

acquiring vehicle passing data of a current signal period of an intersection;

counting the traffic flow of the current signal period of the intersection according to the vehicle passing data of the current signal period of the intersection;

and inputting the traffic flow of the current signal period of the intersection into a pre-trained traffic flow prediction model to obtain a traffic flow prediction value of the intersection, wherein the traffic flow prediction model is trained by the method provided by the first aspect of the embodiments of the application.
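As a rough illustration of the prediction flow above, the sketch below counts the passes that fall in the current signal period and hands the count to a trained model. It assumes passing times in seconds and a model exposed as a plain callable. The name `predict_next_flow` and its parameters are hypothetical.

```python
def predict_next_flow(pass_times, period_start, period_length, model):
    """Count passes in [period_start, period_start + period_length) and predict.

    model: any callable mapping the current-period flow to a predicted flow
    (an assumption; the trained model's real interface is not specified here).
    """
    period_end = period_start + period_length
    flow = sum(1 for t in pass_times if period_start <= t < period_end)
    return model(flow)
```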

In a third aspect, an embodiment of the present application provides a traffic flow prediction model training device, where the device includes:

a first acquisition module, configured to acquire historical vehicle passing data and the historical signal periods of an intersection, wherein the historical vehicle passing data includes the historical vehicle passing time of each vehicle at the intersection, and the durations of the historical signal periods are different;

a calculation module, configured to calculate the time interval between two consecutive passes of the same vehicle according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and to determine the deduplication time threshold corresponding to the vehicle according to the distribution of a plurality of time intervals in a preset time interval;

a deduplication processing module, configured to perform deduplication on the historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle to obtain updated historical vehicle passing data, wherein the historical vehicle passing data to be deduplicated is the historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to the vehicle;

a first statistical module, configured to count the traffic flow in each historical signal period according to the updated historical vehicle passing data;

and a training module, configured to train the traffic flow prediction model based on the traffic flow in each historical signal period.

Optionally, the calculation module is specifically configured to: calculate first time intervals between two consecutive passes of vehicles having the same license plate data according to the license plate data and the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and calculate a first percentage distribution of a plurality of first time intervals in a preset time interval; and determine, according to the first percentage distribution of the plurality of first time intervals in the preset time interval, the first time interval corresponding to the first percentage at which the first percentage distribution reaches a preset change condition, and use that first time interval as the deduplication time threshold corresponding to the vehicles having the same license plate data.

Optionally, the calculation module is further configured to: calculate second time intervals between two consecutive passes of unlicensed vehicles according to the historical vehicle passing time of each unlicensed vehicle at the intersection in the historical vehicle passing data, and calculate a second percentage distribution of a plurality of second time intervals in the preset time interval; and determine, according to the second percentage distribution of the plurality of second time intervals in the preset time interval, the second time interval corresponding to the second percentage in the second percentage distribution that is equal to the first percentage at which the first percentage distribution reaches the preset change condition, and use that second time interval as the deduplication time threshold corresponding to the unlicensed vehicles.

Optionally, the deduplication processing module is specifically configured to: determine the historical vehicle passing data to be deduplicated according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data and the deduplication time threshold corresponding to each vehicle, wherein the historical vehicle passing data to be deduplicated are two pieces of historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to the vehicle; and delete, from the historical vehicle passing data to be deduplicated, the piece with the earlier historical vehicle passing time and retain the piece with the later historical vehicle passing time, to obtain updated historical vehicle passing data.

Optionally, the training module is specifically configured to: extract feature data in each historical signal period according to the historical vehicle passing data, each historical signal period, the traffic flow in each historical signal period and a preset feature data extraction strategy; divide the feature data in each historical signal period into a training set and a verification set; train preset base models of multiple types by using the training set to obtain, for each base model, the feature importance of each kind of feature data; for each type of base model, screen out, from the kinds of feature data of that base model, the feature data whose feature importance is greater than a preset threshold to obtain a primary selection feature set corresponding to that base model, wherein the primary selection feature set records the category information of the feature data whose feature importance is greater than the preset threshold; take the union of the primary selection feature sets corresponding to all types of base models to obtain a target feature set; based on the target feature set, screen out a target training set from the training set and a target verification set from the verification set; train a preset model to be trained by using the target training set, and perform hyper-parameter optimization based on the target verification set to obtain optimal hyper-parameters; and combine the target training set and the target verification set, and retrain the preset model to be trained by using the optimal hyper-parameters to obtain the traffic flow prediction model.

In a fourth aspect, an embodiment of the present application provides a traffic flow prediction apparatus, including:

the second acquisition module is used for acquiring vehicle passing data of the current signal period of the intersection;

the second statistical module is used for counting the traffic flow of the intersection in the current signal period according to the traffic passing data of the intersection in the current signal period;

the prediction module is used for inputting the traffic flow of the intersection in the current signal period into a traffic flow prediction model obtained through pre-training to obtain a traffic flow prediction value of the intersection, wherein the traffic flow prediction model is obtained through training by the method provided by the first aspect of the embodiment of the application.

In a fifth aspect, an embodiment of the present application provides an electronic device, including a processor and a memory; a memory for storing a computer program; a processor, configured to implement the method provided by the first aspect or the method provided by the second aspect of the embodiments of the present application when executing the computer program stored in the memory.

In a sixth aspect, the present application provides a machine-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the method provided by the first aspect or the method provided by the second aspect of the present application is implemented.

In a seventh aspect, embodiments of the present application further provide a computer program product including instructions, which when executed on a computer, cause the computer to perform the method provided in the first aspect or the method provided in the second aspect of the embodiments of the present application.

In the traffic flow prediction model training and traffic flow prediction methods, devices and electronic equipment provided by the embodiments of the application, historical vehicle passing data and the historical signal periods of an intersection are obtained. The time interval between two consecutive passes of the same vehicle is calculated from the historical vehicle passing times of the vehicles at the intersection in the historical vehicle passing data, and the deduplication time threshold corresponding to the vehicle is determined from the distribution of a plurality of such time intervals in a preset time interval. Deduplication is performed on the historical vehicle passing data to be deduplicated, namely the records for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to that vehicle, to obtain updated historical vehicle passing data. The traffic flow in each historical signal period is counted from the updated historical vehicle passing data, and a traffic flow prediction model is trained on the traffic flow in each historical signal period.

The deduplication time threshold corresponding to each vehicle is calculated first, the historical vehicle passing data is then deduplicated by using these thresholds, and the traffic flow prediction model is trained on the traffic flow in each historical signal period counted from the deduplicated historical vehicle passing data. The vehicle passing data is deduplicated with deduplication time thresholds calculated from the distribution of the historical vehicle passing data, and the traffic flow prediction model, being trained on traffic flow counted per historical signal period, takes the dynamic changes of the signal period fully into account. The accuracy of the traffic flow prediction model is therefore improved, and using this model to predict traffic flow improves the accuracy of traffic flow prediction.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description obviously show only some embodiments of the present application, and those skilled in the art can obtain other embodiments from these drawings without creative effort.

Fig. 1 is a schematic flow chart of a traffic flow prediction model training method according to an embodiment of the present application;

Fig. 2 is a graph of time interval versus percentage according to an embodiment of the present application;

Fig. 3 is a schematic flow chart of feature selection according to an embodiment of the present application;

Fig. 4 is a schematic flow chart of a traffic flow prediction method according to an embodiment of the present application;

Fig. 5 is a schematic flow chart of a traffic flow prediction method according to another embodiment of the present application;

Fig. 6 is a schematic structural diagram of a traffic flow prediction model training device according to an embodiment of the present application;

Fig. 7 is a schematic structural diagram of a traffic flow prediction device according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application are within the scope of protection of the present application.

In order to improve the precision of a traffic flow prediction model and further improve the accuracy of traffic flow prediction, the embodiment of the application provides a method and a device for training the traffic flow prediction model and predicting the traffic flow, and electronic equipment.

The traffic flow prediction model training method provided by the embodiments of the present application is described first. The execution subject of the method may be an electronic device with a model training function (for example, a training machine). The method may be implemented by at least one of software, a hardware circuit and a logic circuit provided in the execution subject.

As shown in fig. 1, a method for training a traffic flow prediction model according to an embodiment of the present application may include the following steps.

S101, historical vehicle passing data and historical signal periods of the intersection are obtained, wherein the historical vehicle passing data comprise historical vehicle passing time of vehicles at the intersection, and the time length of each historical signal period is different.

S102, calculating the time interval between two consecutive passes of the same vehicle according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and determining the deduplication time threshold corresponding to the vehicle according to the distribution of a plurality of time intervals in a preset time interval.

S103, performing deduplication on the historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle to obtain updated historical vehicle passing data, wherein the historical vehicle passing data to be deduplicated is the historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to the vehicle.

S104, counting the traffic flow in each historical signal period according to the updated historical vehicle passing data.

S105, training a traffic flow prediction model based on the traffic flow in each historical signal period.

In the scheme of the embodiments of the application, the deduplication time threshold corresponding to each vehicle is calculated first, the historical vehicle passing data is then deduplicated by using these thresholds, and the traffic flow prediction model is trained on the traffic flow in each historical signal period counted from the deduplicated historical vehicle passing data. The vehicle passing data is deduplicated with deduplication time thresholds calculated from the distribution of the historical vehicle passing data, and the traffic flow prediction model, being trained on traffic flow counted per historical signal period, takes the dynamic changes of the signal period fully into account. The accuracy of the traffic flow prediction model is therefore improved, and using this model to predict traffic flow improves the accuracy of traffic flow prediction.
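Counting the traffic flow per historical signal period (step S104) differs from fixed-period counting only in that each period carries its own start time and duration. A minimal sketch, assuming passing times in seconds and periods given as `(start_time, duration)` pairs; the function name `count_flow_per_period` is hypothetical:

```python
from bisect import bisect_left


def count_flow_per_period(pass_times, periods):
    """Count passes falling in each half-open period [start, start + duration).

    pass_times: passing times of the deduplicated records, in seconds.
    periods: list of (start_time, duration) pairs with varying durations.
    """
    times = sorted(pass_times)
    flows = []
    for start, duration in periods:
        first = bisect_left(times, start)
        past_end = bisect_left(times, start + duration)
        flows.append(past_end - first)
    return flows
```

Sorting once and using binary search keeps the cost low when there are many periods; a simple linear scan per period would give the same counts.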

The traffic flow prediction model is an end-to-end model: the traffic flow of the current signal period is input, and the predicted traffic flow of the next signal period is output directly. The model is trained on the traffic flow in a large number of historical signal periods. Specifically, during training, the traffic flow in a plurality of historical signal periods is used as the sample input, and the traffic flow in the signal period following those historical signal periods is used as the output.
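The pairing of input periods with the following period can be illustrated with a sliding window over the per-period flows. The window length and the name `build_samples` are assumptions made for illustration; the source does not state how many historical periods form one sample.

```python
def build_samples(flows, window):
    """Pair each run of `window` consecutive period flows with the next flow.

    flows: traffic flow per historical signal period, in chronological order.
    Returns a list of (input_flows, target_flow) tuples.
    """
    samples = []
    for i in range(len(flows) - window):
        samples.append((flows[i:i + window], flows[i + window]))
    return samples
```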

The intersection refers to a crossroads, a T-junction or the like in an urban traffic scene. An intersection has a plurality of directions, each direction is defined as a checkpoint, and each checkpoint consists of a plurality of lanes, which may be one-way or two-way lanes. The fields of the historical vehicle passing data may include the intersection number, checkpoint direction, lane number, license plate data, historical vehicle passing time and the like. The historical vehicle passing data may be collected by acquisition equipment such as cameras and infrared sensors, stored in a database, and extracted from the database when the traffic flow prediction model is trained. The signal period is the time taken for the traffic light at the intersection to complete one cycle of its signals; it is generally adjustable and can be adjusted dynamically according to real-time road conditions. Since the durations of the historical signal periods differ, the fields of a historical signal period may include its start time and its duration.
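The record fields listed above might be modelled as follows. The class and field names are hypothetical, times are assumed to be seconds, and representing an unlicensed vehicle by `license_plate=None` is a choice made for this sketch only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PassRecord:
    """One historical vehicle passing record (field names are illustrative)."""
    intersection_id: str
    direction: str
    lane_number: int
    license_plate: Optional[str]  # None for an unlicensed vehicle (assumption)
    pass_time: float  # seconds


@dataclass(frozen=True)
class SignalPeriod:
    """One historical signal period, stored as start time plus duration."""
    start_time: float
    duration: float

    @property
    def end_time(self) -> float:
        return self.start_time + self.duration
```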

After the historical vehicle passing data and each historical signal period of the intersection are obtained, the historical vehicle passing data needs to be preprocessed.

The deduplication in the embodiments of the application is neither simple removal of identical samples nor deduplication based on a manually set vehicle time interval threshold. Instead, the deduplication time threshold is determined statistically, and deduplication is then performed based on that threshold. Specifically, the time interval between two consecutive passes of the same vehicle is calculated according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, the deduplication time threshold corresponding to the vehicle is then determined according to the distribution of a plurality of time intervals in a preset time interval, and the historical vehicle passing data to be deduplicated is deduplicated based on the calculated deduplication time threshold.

Because the vehicles appearing at the intersection include both vehicles with license plates and vehicles without license plates, the deduplication time thresholds may be calculated separately for the two groups to ensure accuracy. Specifically, for vehicles with license plates, the historical vehicle passing data further includes the license plate data of each vehicle at the intersection, and accordingly S102 may specifically be: calculating first time intervals between two consecutive passes of vehicles having the same license plate data according to the license plate data and the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and calculating a first percentage distribution of a plurality of first time intervals in a preset time interval; and determining, according to the first percentage distribution of the plurality of first time intervals in the preset time interval, the first time interval corresponding to the first percentage at which the first percentage distribution reaches a preset change condition, and using that first time interval as the deduplication time threshold corresponding to the vehicles having the same license plate data.
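The first time intervals can be obtained by grouping records by license plate, sorting each group by passing time, and differencing neighbouring times. A minimal sketch, assuming `(license_plate, pass_time)` tuples with times in seconds; the name `consecutive_intervals` is hypothetical:

```python
from collections import defaultdict


def consecutive_intervals(records):
    """Return the gaps between consecutive passes of each license plate.

    records: iterable of (license_plate, pass_time_in_seconds) tuples.
    """
    times_by_plate = defaultdict(list)
    for plate, pass_time in records:
        times_by_plate[plate].append(pass_time)

    intervals = []
    for times in times_by_plate.values():
        times.sort()  # ascending order, so every gap is non-negative
        intervals.extend(later - earlier for earlier, later in zip(times, times[1:]))
    return intervals
```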

For vehicles with license plates, to facilitate calculation, the historical vehicle passing times belonging to the same license plate data can be sorted in ascending or descending order according to the license plate data and the historical vehicle passing time of each vehicle at the intersection. Based on the sorting result, the first time interval ΔT1 between two successive passes of a vehicle with the same license plate data is calculated, so that a plurality of ΔT1 values are obtained, and the first percentage distribution of the plurality of ΔT1 values within a preset time interval [T1, T2] is calculated. The preset time interval [T1, T2] is the time interval within which a vehicle is, with high probability, captured repeatedly; in general, T1 can be taken as 0 and T2 can initially be set to a larger value, after which the range is narrowed according to the actual situation. Calculating the first percentage distribution of the plurality of ΔT1 values within [T1, T2] means calculating the proportion of ΔT1 samples across [T1, T2]; fig. 2 shows the distribution of the plurality of ΔT1 values over the interval [0, 100]. According to this first percentage distribution, the first time interval corresponding to the first percentage at which the distribution reaches the preset change condition is determined and used as the deduplication time threshold for vehicles with the same license plate data. As can be seen from fig. 2, once ΔT1 exceeds 60 s the frequency of repeated capture has become markedly lower, so vehicle passing data with ΔT1 less than 60 s can be regarded as repeated captures with high probability and can be deduplicated; that is, the deduplication time threshold can be set to 60 s.

As can be seen from the above example, the preset change condition may be the turning point at which the decline of the curve in the graph of first time interval against first percentage slows down. Of course, any change condition indicating that the frequency of repeatedly capturing the passing data of the same vehicle has fallen below a certain threshold may also be used as the preset change condition in the embodiment of the present application.
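The threshold determination for licensed vehicles can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the bin width, and the `min_share` cut-off are assumptions, and the change condition used is the second variant mentioned above (the share of repeated captures falling below a threshold).

```python
from collections import defaultdict


def consecutive_intervals(records):
    """records: iterable of (plate, pass_time_in_seconds).
    Returns the gaps between successive passes of the same plate."""
    by_plate = defaultdict(list)
    for plate, t in records:
        by_plate[plate].append(t)
    gaps = []
    for times in by_plate.values():
        times.sort()  # ascending order of passing time per plate
        gaps.extend(later - earlier for earlier, later in zip(times, times[1:]))
    return gaps


def percentage_distribution(gaps, t1=0, t2=100, bin_width=10):
    """Share of the gaps lying in [t1, t2) that fall into each bin.
    Keys are bin start values; bin_width is an assumed parameter."""
    in_range = [g for g in gaps if t1 <= g < t2]
    counts = {start: 0 for start in range(t1, t2, bin_width)}
    for g in in_range:
        counts[t1 + int((g - t1) // bin_width) * bin_width] += 1
    total = len(in_range) or 1  # avoid division by zero on empty input
    return {start: n / total for start, n in counts.items()}


def dedup_threshold(distribution, min_share=0.05):
    """Assumed change condition: the first bin (in ascending order) whose
    share drops below min_share. Returns (threshold, share) or None."""
    for start in sorted(distribution):
        if distribution[start] < min_share:
            return start, distribution[start]
    return None
```

The returned share plays the role of the "first percentage" and the returned bin start the role of the deduplication time threshold. A turning-point detector on the curve could be substituted for `dedup_threshold` without changing the other two functions.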

For vehicles without license plates, the method may further include: calculating, according to the historical vehicle passing time of each unlicensed vehicle at the intersection in the historical vehicle passing data, second time intervals between two successive passes of unlicensed vehicles, and calculating a second percentage distribution of the plurality of second time intervals within the preset time interval; and determining, according to the second percentage distribution, the second time interval whose second percentage equals the first percentage at which the first percentage distribution reaches the preset change condition, as the deduplication time threshold for unlicensed vehicles.

Since the passing data of vehicles without license plates and the passing data of vehicles with license plates are generated in the same overall environment, the distribution of repeated data among unlicensed vehicles is basically consistent with that among licensed vehicles; that is, the proportion of repeated data is the same, and the percentage distribution of the time intervals within the preset time interval is basically consistent. Therefore, after the second time intervals between two successive passes of unlicensed vehicles and the second percentage distribution of the plurality of second time intervals within the preset time interval have been calculated, the second time interval whose second percentage equals the first percentage at which the first percentage distribution reaches the preset change condition can be determined from the second percentage distribution and used as the deduplication time threshold for unlicensed vehicles. For example, if the first percentage distribution of licensed vehicles reaches the preset change condition at a percentage of 10%, with a corresponding deduplication time threshold of 60 s, and the second time interval at which the percentage for unlicensed vehicles reaches 10% is 65 s, then the deduplication time threshold for unlicensed vehicles may be set to 65 s.
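The percentage matching for unlicensed vehicles can be illustrated with a short sketch. The function name and the sample distribution are hypothetical, and "equal to the first percentage" is approximated here as the first bin at which a descending curve reaches or falls below that percentage, which is an assumption rather than something the text specifies.

```python
def matching_threshold(second_distribution, first_percentage):
    """second_distribution: dict mapping a time interval (bin start, in
    seconds) to its share for unlicensed vehicles.
    Returns the first interval, in ascending order, whose share is at or
    below first_percentage, or None if the curve never gets that low."""
    for interval in sorted(second_distribution):
        if second_distribution[interval] <= first_percentage:
            return interval
    return None


# Hypothetical unlicensed-vehicle distribution in 5-second bins.
unlicensed = {50: 0.18, 55: 0.15, 60: 0.12, 65: 0.10, 70: 0.07}

# Licensed vehicles reached the change condition at 10% (threshold 60 s),
# so the same 10% is looked up on the unlicensed curve.
threshold = matching_threshold(unlicensed, 0.10)
```

With the sample values above the lookup lands on the 65 s bin, mirroring the numerical example in the text.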

In the embodiment of the application, the preprocessing of the historical vehicle passing data can further include data field screening and data fusion. The obtained historical vehicle passing data and historical signal periods are first screened: the fields selected from the vehicle passing data include the intersection number, checkpoint direction, lane number, license plate data and vehicle passing time, and the fields selected from the signal periods include the intersection signal period start time and the intersection signal period duration. Deduplication is then performed. To ensure the consistency and correlation of the data, the deduplicated historical vehicle passing data can be joined with the historical signal periods, so that each piece of historical vehicle passing data obtains the information of the historical signal period to which it belongs, and the intersection signal period end time is calculated from the historical signal period.

In an implementation manner of the embodiment of the present application, S103 may specifically be: determining the historical vehicle passing data to be deduplicated according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data and the deduplication time threshold corresponding to each vehicle, where the historical vehicle passing data to be deduplicated are the two pieces of historical vehicle passing data for which the time interval between two successive passes of the same vehicle is smaller than the deduplication time threshold corresponding to that vehicle; and deleting, from the historical vehicle passing data to be deduplicated, the piece with the earlier historical vehicle passing time and retaining the piece with the later historical vehicle passing time, to obtain updated historical vehicle passing data.

The specific deduplication of the vehicle passing data is to delete, among the historical vehicle passing data to be deduplicated, the record with the earlier historical vehicle passing time and to retain the record with the later historical vehicle passing time, so that the historical vehicle passing data of the same vehicle is always the most recent and real-time dynamics are guaranteed.
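The keep-the-later-record rule can be sketched as below. The record layout `(vehicle_id, pass_time)` and the `threshold_for` callback are assumptions made for illustration; the sketch compares each record with the next record of the same vehicle, which is one possible reading of "two successive passes".

```python
from collections import defaultdict


def deduplicate(records, threshold_for):
    """records: list of (vehicle_id, pass_time_in_seconds).
    threshold_for: callable mapping vehicle_id to its deduplication
    time threshold in seconds.
    A record is dropped when the next pass of the same vehicle follows
    within the threshold, so the later record of each pair is kept."""
    by_vehicle = defaultdict(list)
    for vehicle_id, t in records:
        by_vehicle[vehicle_id].append(t)

    kept = []
    for vehicle_id, times in by_vehicle.items():
        times.sort()
        threshold = threshold_for(vehicle_id)
        for i, t in enumerate(times):
            has_next = i + 1 < len(times)
            if has_next and times[i + 1] - t < threshold:
                continue  # earlier record of a duplicate pair: delete it
            kept.append((vehicle_id, t))
    # Return the surviving records in chronological order.
    return sorted(kept, key=lambda record: (record[1], record[0]))
```

For example, `threshold_for` could return 60 for licensed vehicles and 65 for unlicensed ones, in line with the thresholds discussed earlier.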

After the historical vehicle passing data is preprocessed, the traffic flow in each historical signal period needs to be counted, so that a traffic flow prediction model is trained based on the traffic flow in each historical signal period.

When the traffic flow statistics are compiled, if there is no vehicle passing data in a certain historical signal period, the missing traffic flow of that historical signal period can be filled with 0. The data fields at this point include the intersection number, checkpoint direction, lane number, license plate data, vehicle passing time, intersection signal period start time, intersection signal period end time, intersection signal period duration and traffic flow.
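Counting the flow per signal period, with periods of differing duration and zero filling for empty periods, might look like the following sketch. The function name, the tuple layout of a period and the half-open interval convention [start, end) are assumptions.

```python
from bisect import bisect_right


def count_flow(periods, pass_times):
    """periods: list of (start, duration) in seconds, sorted by start;
    durations may differ from period to period.
    pass_times: passing times (seconds) of the deduplicated records.
    Returns one dict per period; periods with no passes get flow 0."""
    starts = [start for start, _ in periods]
    flows = [0] * len(periods)  # zero filling happens here by default
    for t in pass_times:
        i = bisect_right(starts, t) - 1  # last period starting at or before t
        if i >= 0 and t < starts[i] + periods[i][1]:
            flows[i] += 1
    return [
        {"start": start, "end": start + duration, "duration": duration, "flow": flow}
        for (start, duration), flow in zip(periods, flows)
    ]
```

The period end time is derived as start plus duration, matching the data fusion step described earlier. In practice the same counting would be repeated per intersection number, checkpoint direction and lane number.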

The specific training process may be to train the traffic flow prediction model by gradient descent, taking the traffic flows of multiple historical signal periods as the sample input and the traffic flow of the signal period following those historical signal periods as the output. The specific training process is the same as or similar to a conventional machine learning training process and is not described herein again.

In order to improve the reliability of the characteristics and the stability of the traffic flow prediction model and reduce the characteristic bias of a single model or a single method, in the embodiment of the present application, before the training of the traffic flow prediction model, the characteristic selection needs to be performed based on multiple models.

Specifically, S105 may be: extracting feature data in each historical signal period according to the historical vehicle passing data, each historical signal period and the traffic flow in each historical signal period, following a preset feature data extraction strategy; dividing the feature data in each historical signal period into a training set and a verification set; training multiple preset types of base models with the training set to obtain, for each base model, the feature importance of each kind of feature data; for each type of base model, screening out from the feature data of that base model the feature data whose feature importance is greater than a preset threshold, to obtain an initially selected feature set corresponding to that base model, where the initially selected feature set records the category information of the feature data whose feature importance is greater than the preset threshold; taking the union of the initially selected feature sets corresponding to all types of base models to obtain a target feature set; based on the target feature set, screening a target training set out of the training set and a target verification set out of the verification set; training a preset model to be trained with the target training set, and carrying out hyperparameter optimization based on the target verification set to obtain optimal hyperparameters; and combining the target training set and the target verification set, and retraining the preset model to be trained with the optimal hyperparameters to obtain the traffic flow prediction model.

During feature extraction, features are extracted according to the historical vehicle passing data, each historical signal period and the traffic flow in each historical signal period, following a preset feature data extraction strategy. The main extracted features are historical flow features and time features. The historical flow features include the mean flow of the previous n periods for the lane and for the checkpoint, where n can be 1, 2, 4, 8 and so on; the time features include the minute, hour and day of the month of the start time of the historical signal period to which the data belongs, the nearest time distance to a commute peak, and so on, where the commute peaks can be set to, for example, 7:00 and 17:00. In addition, the data fields described above may also be used as basic features.
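The two feature groups can be sketched as follows. The function names and feature keys are hypothetical, the commute peaks of 7:00 and 17:00 are taken as an assumed setting, and the sketch averages over however many earlier periods exist when fewer than n are available.

```python
from datetime import datetime


def history_means(flows, index, windows=(1, 2, 4, 8)):
    """Mean flow of the previous n periods before position `index`
    in the chronological list `flows`, for each n in `windows`."""
    features = {}
    for n in windows:
        previous = flows[max(0, index - n):index]
        features[f"mean_prev_{n}"] = sum(previous) / len(previous) if previous else 0.0
    return features


def time_features(period_start, peaks=("07:00", "17:00")):
    """Minute, hour, day of month, and distance in seconds from the
    period start to the nearest commute peak (wrapping around midnight)."""
    seconds = period_start.hour * 3600 + period_start.minute * 60 + period_start.second
    distances = []
    for peak in peaks:
        hour, minute = map(int, peak.split(":"))
        d = abs(seconds - (hour * 3600 + minute * 60))
        distances.append(min(d, 86400 - d))
    return {
        "minute": period_start.minute,
        "hour": period_start.hour,
        "day_of_month": period_start.day,
        "seconds_to_peak": min(distances),
    }
```

The history features would be computed once per lane and once per checkpoint, each on its own flow series, and merged with the time features and the basic fields into one feature row per signal period.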

After the feature data in each historical signal period is extracted, it is divided into a training set and a verification set, where the verification set can take the feature data of the most recent historical signal periods. As shown in fig. 3, the training set is used to train multiple preset types of base models, and the feature importance of each kind of feature data is obtained for each of them. The base models may be selected from Extra-Trees (Extremely Randomized Trees), RandomForest, GBDT (Gradient Boosting Decision Tree), XGBoost (eXtreme Gradient Boosting), LightGBM (Light Gradient Boosting Machine), and the like. The feature importance is obtained once a model has been trained.

For each type of base model, the feature data whose feature importance is greater than a preset threshold is screened out from the feature data of that base model to obtain the initially selected feature set corresponding to that base model. In general, as shown in fig. 3, the feature data whose feature importance is greater than or equal to the median can be selected, and its category information forms the initially selected feature set. The union of the initially selected feature sets corresponding to all types of base models is then taken to obtain the target feature set, that is, all the features of higher importance for model training. Next, based on the target feature set, a target training set is screened out of the training set and a target verification set is screened out of the verification set, that is, only the feature types contained in the target feature set are retained in each. The preset model to be trained is then trained with the target training set, and hyperparameter optimization is carried out based on the target verification set to obtain the optimal hyperparameters. The hyperparameter optimization may use a Bayesian optimization method with MAE (Mean Absolute Error) as the evaluation index, taking the hyperparameters of the model at which the MAE is smallest; of course, other hyperparameter optimization methods may be used here, and the evaluation may be based on other indexes such as the root mean square error, which is not specifically limited herein. Finally, the target training set and the target verification set are combined, and the preset model to be trained is retrained with the optimal hyperparameters to obtain the traffic flow prediction model. The preset model to be trained can be any one of Extra-Trees, RandomForest, GBDT, XGBoost, LightGBM and other models.
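The multi-model feature selection reduces to a median cut per base model followed by a set union. The sketch below assumes the feature importances have already been read from the trained base models and are supplied as plain dictionaries; the function names and sample values are hypothetical.

```python
from statistics import median


def initially_selected(importances):
    """importances: dict mapping feature name to importance for one base
    model. Keeps the features at or above the median importance."""
    cut = median(importances.values())
    return {name for name, value in importances.items() if value >= cut}


def target_feature_set(importances_per_model):
    """Union of the initially selected feature sets of all base models."""
    target = set()
    for importances in importances_per_model.values():
        target |= initially_selected(importances)
    return target


def select_columns(rows, features, label="flow"):
    """Keep only the target features (plus the label column, assumed to
    be named `label`) in each row of a training or verification set."""
    keep = set(features) | {label}
    return [{k: v for k, v in row.items() if k in keep} for row in rows]


# Hypothetical importances from two base models.
per_model = {
    "random_forest": {"f1": 0.5, "f2": 0.3, "f3": 0.1, "f4": 0.1},
    "lightgbm": {"f1": 0.1, "f2": 0.2, "f3": 0.6, "f4": 0.1},
}
target = target_feature_set(per_model)
```

Taking the union rather than the intersection means a feature survives if any one base model ranks it highly, which is what reduces the bias of relying on a single model.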

The embodiment of the application also provides a traffic flow prediction method, and an execution main body of the traffic flow prediction method can be an electronic device (such as a traffic flow control server and a signal lamp controller) with a traffic flow prediction function. The manner of implementing the traffic flow prediction method provided by the embodiment of the present application may be at least one of software, a hardware circuit, and a logic circuit provided in the execution main body.

As shown in fig. 4, a traffic flow prediction method provided in the embodiment of the present application may include the following steps.

S401, vehicle passing data of the current signal period of the intersection are obtained.

S402, according to the vehicle passing data of the current signal period of the intersection, the vehicle flow of the current signal period of the intersection is counted.

S403, inputting the traffic flow of the intersection in the current signal period into a traffic flow prediction model obtained by pre-training to obtain a traffic flow prediction value of the intersection.

The traffic flow prediction model is obtained by training through the method of the embodiment shown in fig. 1.

By applying the scheme of the embodiment of the application, the traffic flow prediction model is obtained by training by using the method shown in the figure 1, and the accuracy of the traffic flow prediction model is high, so that the accuracy of the traffic flow prediction is improved when the traffic flow prediction model is used for predicting the traffic flow.

When traffic flow prediction is performed, the vehicle passing data of the current signal period of the intersection needs to be acquired first. The vehicle passing data can include the intersection number, checkpoint direction, lane number, license plate data and vehicle passing time, and the current signal period includes the intersection signal period start time and the intersection signal period duration. After the vehicle passing data of the current signal period of the intersection is obtained, the traffic flow of the current signal period is counted from it and input into the traffic flow prediction model obtained through pre-training, which directly outputs the predicted traffic flow of the next signal period. The training of the traffic flow prediction model is shown in fig. 1 and is not described herein again.

After the predicted traffic flow value is obtained, the signal period of the traffic light may be adjusted accordingly; for example, if the predicted traffic flow increases significantly, the signal period of the traffic light may be lengthened to improve the vehicle passing rate.

In an implementation manner of the embodiment of the present application, the training of the traffic flow prediction model and the traffic flow prediction are as shown in fig. 5, and specifically include the following steps.

First, the historical vehicle passing data and the intersection signal periods of each intersection are obtained and parsed.

Second, the historical vehicle passing data and the intersection signal periods are preprocessed, mainly through the following processing steps:

(1) Data field screening. The fields selected from the vehicle passing data include the intersection number, checkpoint direction, lane number, license plate data and vehicle passing time; the fields selected from the signal period include the intersection signal period start time and the intersection signal period duration.

(2) Data deduplication. Vehicle passing data in which the time interval between two successive occurrences of the same vehicle is smaller than the deduplication time threshold ΔT is deleted, and the record with the later timestamp is retained. The determination of the deduplication time threshold ΔT is described in the embodiment shown in fig. 1 and is not repeated here.

(3) Data fusion. The vehicle passing data and the signal periods are joined, so that each piece of historical vehicle passing data obtains the information of the signal period to which it belongs (the intersection signal period start time and the intersection signal period duration), and the intersection signal period end time is obtained from the signal period.

Third, the traffic flow of each lane in each historical signal period is counted, and missing traffic flow data in a signal period is filled with the value 0. The data fields at this point are the intersection number, checkpoint direction, lane number, signal period start time, signal period end time, intersection signal period duration and traffic flow.

Fourth, features are extracted from the vehicle passing data. The main features are as follows. Historical flow features: the mean flow of the previous n periods for the lane and for the checkpoint, where n is 1, 2, 4 and 8 respectively. Time features: the minute, hour and day of the month, and the nearest time distance (in seconds) to a commute peak, the commute peaks being 7:00 and 17:00. Basic features: the intersection number, checkpoint direction, lane number and signal period duration.

Fifth, the vehicle passing data is divided into a training set, a verification set and a test set, where the verification set and the test set take the data of the second-to-last week and the last week, respectively.
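A chronological split of this kind can be sketched as below, assuming the verification set is the second-to-last week and the test set the last week of the available data. The row layout, the `period_start` key and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def split_by_last_weeks(rows, time_key="period_start"):
    """Split rows chronologically: the final 7 days go to the test set,
    the 7 days before that to the verification set, the rest to training."""
    latest = max(row[time_key] for row in rows)
    test_from = latest - timedelta(days=7)
    verification_from = latest - timedelta(days=14)
    train, verification, test = [], [], []
    for row in rows:
        t = row[time_key]
        if t > test_from:
            test.append(row)
        elif t > verification_from:
            verification.append(row)
        else:
            train.append(row)
    return train, verification, test


# Example with 21 daily rows of made-up data: a 7 / 7 / 7 split.
rows = [{"period_start": datetime(2021, 5, day), "flow": day} for day in range(1, 22)]
train, verification, test = split_by_last_weeks(rows)
```

Splitting by time rather than at random keeps every verification and test sample later than all training samples, which avoids leaking future flow into the history features.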

Sixth, the model is trained with the training set and feature selection is performed, using a feature selection method based on multiple models to screen out the more important feature set; the specific feature selection method is described in the embodiment shown in fig. 1 and is not detailed here.

Seventh, a suitable model is selected for model training. Machine learning or deep learning models can be used, chosen according to their actual effect; LightGBM is selected as the final model in the embodiment of the application.

Eighth, the model is trained on the target training set obtained after feature selection, and hyperparameter optimization is carried out on the target verification set obtained after feature selection. With MAE as the evaluation index, a Bayesian optimization method is used to obtain the hyperparameters of the model at which the MAE is smallest.
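The selection criterion (smallest MAE on the verification set) can be illustrated independently of the search strategy. The sketch below deliberately substitutes a plain exhaustive search over a candidate list for Bayesian optimization, to stay dependency-free; the function names and the `fit_predict` callback are assumptions.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: the mean of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def select_hyperparameters(candidates, fit_predict, y_verification):
    """candidates: list of hyperparameter dicts to try.
    fit_predict: callable that trains on the target training set with the
    given hyperparameters and returns predictions for the target
    verification set.
    Returns (best_hyperparameters, best_mae)."""
    best, best_mae = None, float("inf")
    for params in candidates:
        score = mean_absolute_error(y_verification, fit_predict(params))
        if score < best_mae:
            best, best_mae = params, score
    return best, best_mae
```

A Bayesian optimizer would replace the `for` loop with a procedure that proposes each next candidate from the scores seen so far; the MAE objective it minimizes stays the same.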

Ninth, the target training set and the target verification set are combined, and the model is retrained with the optimal hyperparameters to obtain the final prediction model.

Tenth, the vehicle passing data in the test set is input into the prediction model to obtain the prediction result.

Corresponding to the above traffic flow prediction model training method, an embodiment of the present application provides a traffic flow prediction model training apparatus, as shown in fig. 6, the apparatus includes:

the first obtaining module 610 is configured to obtain historical vehicle passing data and historical signal periods of an intersection, where the historical vehicle passing data includes historical vehicle passing time of each vehicle at the intersection, and the time lengths of the historical signal periods are different;

the calculating module 620 is configured to calculate time intervals of two consecutive vehicle passing of the same vehicle according to the historical vehicle passing times of the vehicles at the intersection in the historical vehicle passing data, and determine a deduplication time threshold corresponding to the vehicle according to distribution conditions of the time intervals in a preset time interval;

the deduplication processing module 630 is configured to perform deduplication processing on historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle to obtain updated historical vehicle passing data, where the historical vehicle passing data to be deduplicated is historical vehicle passing data in which a time interval between two consecutive vehicle passing of the same vehicle is smaller than the deduplication time threshold corresponding to the vehicle;

the first statistical module 640 is configured to separately count traffic flow in each historical signal period according to the updated historical vehicle passing data;

a training module 650 for training the traffic prediction model based on the traffic in each historical signal period.

Optionally, the historical vehicle passing data further includes license plate data of each vehicle at the intersection, and the calculating module 620 may be specifically configured to: calculate, according to the license plate data and the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, first time intervals between two successive passes of vehicles with the same license plate data, and calculate a first percentage distribution of the plurality of first time intervals within a preset time interval; and determine, according to that first percentage distribution, the first time interval corresponding to the first percentage at which the first percentage distribution reaches a preset change condition, and use that first time interval as the deduplication time threshold for vehicles with the same license plate data.

Optionally, the calculating module 620 may be further configured to: calculate, according to the historical vehicle passing time of each unlicensed vehicle at the intersection in the historical vehicle passing data, second time intervals between two successive passes of unlicensed vehicles, and calculate a second percentage distribution of the plurality of second time intervals within the preset time interval; and determine, according to the second percentage distribution, the second time interval whose second percentage equals the first percentage at which the first percentage distribution reaches the preset change condition, as the deduplication time threshold for unlicensed vehicles.

Optionally, the deduplication processing module 630 may be specifically configured to: determine the historical vehicle passing data to be deduplicated according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data and the deduplication time threshold corresponding to each vehicle, where the historical vehicle passing data to be deduplicated are the two pieces of historical vehicle passing data for which the time interval between two successive passes of the same vehicle is smaller than the deduplication time threshold corresponding to that vehicle; and delete, from the historical vehicle passing data to be deduplicated, the piece with the earlier historical vehicle passing time and retain the piece with the later historical vehicle passing time, to obtain updated historical vehicle passing data.

Optionally, the training module 650 may be specifically configured to: extract feature data in each historical signal period according to the historical vehicle passing data, each historical signal period and the traffic flow in each historical signal period, following a preset feature data extraction strategy; divide the feature data in each historical signal period into a training set and a verification set; train multiple preset types of base models with the training set to obtain, for each base model, the feature importance of each kind of feature data; for each type of base model, screen out from the feature data of that base model the feature data whose feature importance is greater than a preset threshold, to obtain an initially selected feature set corresponding to that base model, where the initially selected feature set records the category information of the feature data whose feature importance is greater than the preset threshold; take the union of the initially selected feature sets corresponding to all types of base models to obtain a target feature set; based on the target feature set, screen a target training set out of the training set and a target verification set out of the verification set; train a preset model to be trained with the target training set, and carry out hyperparameter optimization based on the target verification set to obtain optimal hyperparameters; and combine the target training set and the target verification set, and retrain the preset model to be trained with the optimal hyperparameters to obtain the traffic flow prediction model.

By applying the scheme of the embodiment of the application, the deduplication time threshold corresponding to each vehicle is calculated, the historical vehicle passing data is deduplicated using these thresholds, and the traffic flow prediction model is then trained on the traffic flow in each historical signal period counted from the deduplicated historical vehicle passing data. Because the vehicle passing data is deduplicated with thresholds calculated from the distribution of the historical vehicle passing data, and because a traffic flow prediction model trained on traffic flow counted per historical signal period fully takes the dynamic changes of the signal period into account, the accuracy of the traffic flow prediction model is improved, and using this model for traffic flow prediction improves the accuracy of the prediction.

Corresponding to the traffic flow prediction method, an embodiment of the present application provides a traffic flow prediction device, as shown in fig. 7, the device includes:

the second obtaining module 710 is configured to obtain vehicle passing data of the intersection in the current signal period;

the second statistical module 720 is used for counting the traffic flow of the intersection in the current signal period according to the traffic passing data of the intersection in the current signal period;

the prediction module 730 is configured to input the traffic flow of the intersection in the current signal period into a traffic flow prediction model obtained through pre-training to obtain a predicted traffic flow value of the intersection, where the traffic flow prediction model is obtained through training by using the method shown in fig. 1.

By applying the scheme of the embodiment of the application, the traffic flow prediction model is obtained by training by using the method shown in fig. 1, and the accuracy of the traffic flow prediction model is high, so that the accuracy of the traffic flow prediction is improved when the traffic flow prediction model is used for predicting the traffic flow.

An embodiment of the present application further provides an electronic device which, as shown in fig. 8, includes a processor 801 and a memory 802. The memory 802 is used for storing a computer program; the processor 801 is configured to implement the traffic flow prediction model training method or the traffic flow prediction method provided in the embodiment of the present application when executing the computer program stored in the memory 802.

The Memory may include a RAM (Random Access Memory) or an NVM (Non-volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In addition, a machine-readable storage medium is provided, where a computer program is stored, and when the computer program is executed by a processor, the method for training a traffic flow prediction model or the method for predicting a traffic flow provided in the embodiment of the present application is implemented.

In another embodiment of the present application, there is also provided a computer program product including instructions, which when run on a computer, causes the computer to execute the traffic flow prediction model training method or the traffic flow prediction method provided in the present application.

In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, DSL (Digital Subscriber Line)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD (Digital Versatile Disc)), or a semiconductor medium (e.g., an SSD (Solid State Disk)), etc.

As for the embodiments of the traffic flow prediction model training device, the traffic flow prediction device, the electronic device, the machine-readable storage medium and the computer program product, since the contents of the related methods are substantially similar to the foregoing method embodiments, the description is relatively simple, and reference may be made to the partial description of the method embodiments for relevant points.

It should be noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the traffic prediction model training device, the traffic prediction device, the electronic device, the machine-readable storage medium and the computer program product embodiment, since they are substantially similar to the method embodiment, the description is simple, and the relevant points can be referred to the partial description of the method embodiment.

The above description covers only preferred embodiments of the present application and is not intended to limit the scope of protection of the present application. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall fall within the scope of protection of the present application.

Claims

1. A method for training a traffic flow prediction model, the method comprising:

obtaining historical vehicle passing data and historical signal periods of an intersection, wherein the historical vehicle passing data comprises historical vehicle passing time of vehicles at the intersection, and the time lengths of the historical signal periods are different;

deduplicating the historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle, to obtain updated historical vehicle passing data, wherein the historical vehicle passing data to be deduplicated is the historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to that vehicle;

2. The method of claim 1, wherein the historical vehicle passing data further comprises license plate data for each vehicle at the intersection;

the step of calculating the time interval between two consecutive passes of the same vehicle according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, and determining the deduplication time threshold corresponding to the vehicle according to the distribution of a plurality of time intervals within a preset time interval, comprises:

calculating, according to the license plate data and the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data, first time intervals between two consecutive passes of vehicles having the same license plate data, and calculating a first percentage distribution of the plurality of first time intervals within the preset time interval;

and determining, according to the first percentage distribution of the plurality of first time intervals within the preset time interval, the first time interval corresponding to the first percentage at which the first percentage distribution reaches a preset change condition, and using that first time interval as the deduplication time threshold corresponding to vehicles having the same license plate data.
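
As an illustration only, the threshold selection in claim 2 might be sketched as follows. The claim does not specify the "preset change condition"; this sketch assumes it means the first point at which the cumulative percentage of intervals stops growing by more than `min_gain` percentage points per step. The function names, the 60-second preset interval, the 1-second step, and `min_gain` are all hypothetical choices, not values taken from the application.

```python
from collections import defaultdict


def consecutive_intervals(records):
    """Gaps between consecutive passes of the same plate.

    records: iterable of (plate, pass_time_in_seconds) tuples.
    """
    by_plate = defaultdict(list)
    for plate, t in records:
        by_plate[plate].append(t)
    gaps = []
    for times in by_plate.values():
        times.sort()
        gaps.extend(b - a for a, b in zip(times, times[1:]))
    return gaps


def cumulative_percentages(gaps, max_interval, step):
    """Percentage of in-range gaps that are <= each edge (step, 2*step, ...)."""
    in_range = [g for g in gaps if g <= max_interval]
    if not in_range:
        return []
    edges = [step * k for k in range(1, int(max_interval // step) + 1)]
    return [(e, 100.0 * sum(g <= e for g in in_range) / len(in_range))
            for e in edges]


def dedup_threshold(gaps, max_interval=60, step=1, min_gain=1.0):
    """Return (interval, percentage) where the distribution flattens.

    ASSUMED change condition: the first edge with a non-zero cumulative
    percentage after which the percentage grows by less than min_gain.
    """
    dist = cumulative_percentages(gaps, max_interval, step)
    for (edge, pct), (_, next_pct) in zip(dist, dist[1:]):
        if pct > 0 and next_pct - pct < min_gain:
            return edge, pct
    return None


records = [("A", 0), ("A", 1), ("A", 3), ("B", 10), ("B", 12), ("B", 50)]
result = dedup_threshold(consecutive_intervals(records))
```

The returned interval would serve as the deduplication (de-weighting) time threshold for licensed vehicles, and the returned percentage is the "first percentage" that claim 3 reuses.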

3. The method of claim 2, further comprising:

calculating, according to the historical vehicle passing time of each unlicensed vehicle at the intersection in the historical vehicle passing data, second time intervals between two consecutive passes of unlicensed vehicles, and calculating a second percentage distribution of the plurality of second time intervals within the preset time interval;

and determining, according to the second percentage distribution of the plurality of second time intervals within the preset time interval, the second time interval corresponding to a second percentage in the second percentage distribution that is equal to the first percentage at which the first percentage distribution reaches the preset change condition, and using that second time interval as the deduplication time threshold corresponding to the unlicensed vehicles.
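
A minimal sketch of the lookup in claim 3, under stated assumptions: the second percentage distribution is treated as a cumulative distribution, and "equal to the first percentage" is approximated by the first interval at which the cumulative percentage reaches at least that value, since an exact match may not exist in discrete data. The function name and default parameters are hypothetical.

```python
def threshold_at_percentage(gaps, target_pct, max_interval=60, step=1):
    """Smallest interval whose cumulative percentage reaches target_pct.

    gaps: intervals between consecutive passes of unlicensed vehicles.
    target_pct: the first percentage obtained for licensed vehicles.
    """
    in_range = [g for g in gaps if g <= max_interval]
    if not in_range:
        return None
    edge = step
    while edge <= max_interval:
        pct = 100.0 * sum(g <= edge for g in in_range) / len(in_range)
        if pct >= target_pct:
            return edge
        edge += step
    return None


unlicensed_threshold = threshold_at_percentage([1, 3, 3, 5], target_pct=75.0)
```

The design point is that unlicensed vehicles cannot be matched by plate, so the percentage found for licensed vehicles is carried over and only the interval is re-derived from the unlicensed data.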

4. The method according to claim 1, wherein the step of performing the deduplication processing on the historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle to obtain the updated historical vehicle passing data comprises:

determining the historical vehicle passing data to be deduplicated according to the historical vehicle passing time of each vehicle at the intersection in the historical vehicle passing data and the deduplication time threshold corresponding to each vehicle, wherein the historical vehicle passing data to be deduplicated is two pieces of historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to that vehicle;

and deleting, from the historical vehicle passing data to be deduplicated, the piece with the earlier historical vehicle passing time, and retaining the piece with the later historical vehicle passing time, to obtain the updated historical vehicle passing data.
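
The deduplication (de-weighting) rule in claim 4 can be sketched as below. This is an illustrative sketch, not the applicant's implementation: the record layout `(vehicle_id, pass_time)`, the function name, and the `default_threshold` fallback are assumptions introduced here.

```python
def deduplicate(records, thresholds, default_threshold):
    """Drop the earlier record of any too-close consecutive pair.

    records: list of (vehicle_id, pass_time_in_seconds) tuples.
    thresholds: dict mapping vehicle_id to its deduplication threshold.
    default_threshold: hypothetical fallback for vehicles not in thresholds.
    """
    ordered = sorted(records, key=lambda r: (r[0], r[1]))
    kept = []
    for vehicle_id, t in ordered:
        limit = thresholds.get(vehicle_id, default_threshold)
        if kept and kept[-1][0] == vehicle_id and t - kept[-1][1] < limit:
            # Same vehicle, gap below threshold: keep the later record only.
            kept[-1] = (vehicle_id, t)
        else:
            kept.append((vehicle_id, t))
    # Restore chronological order for downstream per-period counting.
    return sorted(kept, key=lambda r: r[1])


records = [("A", 0), ("A", 3), ("A", 100), ("B", 1), ("B", 2)]
updated = deduplicate(records, {"A": 5, "B": 1}, default_threshold=5)
```

Here vehicle "A" loses its record at time 0 because the pass at time 3 follows within its 5-second threshold, while both records of vehicle "B" survive because their 1-second gap is not smaller than its 1-second threshold.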

5. The method of claim 1, wherein the step of training a traffic flow prediction model based on the traffic flow in each historical signal period comprises:

extracting feature data in each historical signal period according to the historical vehicle passing data, each historical signal period, the traffic flow in each historical signal period, and a preset feature data extraction strategy;

training multiple preset types of base models by using the training set to respectively obtain the feature importance of each type of feature data;

for each type of base model, screening, from the types of feature data, the feature data whose feature importance is greater than a preset threshold, to obtain a preliminary feature set corresponding to that base model, wherein the preliminary feature set records the category information of the feature data whose feature importance is greater than the preset threshold;

taking the union of the preliminary feature sets corresponding to all types of base models to obtain a target feature set;

based on the target feature set, screening a target training set from the training set, and screening a target verification set from the verification set;

training a preset model to be trained by using the target training set, and carrying out hyperparameter optimization based on the target verification set to obtain an optimal hyperparameter;

and merging the target training set and the target verification set, and retraining the preset model to be trained by using the optimal hyper-parameter to obtain a traffic flow prediction model.
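
The feature screening steps of claim 5 (per-model importance threshold, then union across base models) might look like the following sketch. It assumes the feature importances have already been produced by training each base model; the model names, feature names, and threshold value are hypothetical and appear only for illustration.

```python
def select_target_features(importances_by_model, threshold):
    """Union of per-model feature sets whose importance exceeds threshold.

    importances_by_model: dict of model name -> {feature name: importance}.
    """
    target = set()
    for importances in importances_by_model.values():
        # Preliminary feature set for this base model.
        preliminary = {f for f, score in importances.items()
                       if score > threshold}
        target |= preliminary
    return sorted(target)


def project(rows, target_features):
    """Keep only the target features in each sample (a dict per row)."""
    return [{f: row[f] for f in target_features} for row in rows]


# Hypothetical importances from two base models.
importances = {
    "model_a": {"hour": 0.5, "prev_flow": 0.3, "weekday": 0.01},
    "model_b": {"hour": 0.4, "cycle_len": 0.2, "weekday": 0.02},
}
target_features = select_target_features(importances, threshold=0.05)
train_rows = [{"hour": 8, "prev_flow": 12, "weekday": 1, "cycle_len": 90}]
target_train = project(train_rows, target_features)
```

Taking the union rather than the intersection keeps any feature that at least one base model found useful; the same `project` call would be applied to the verification set before hyperparameter optimization.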

6. A traffic flow prediction method, characterized in that the method comprises:

acquiring vehicle passing data of a current signal period of an intersection;

inputting the traffic flow of the intersection in the current signal period into a traffic flow prediction model obtained through pre-training to obtain a traffic flow prediction value of the intersection, wherein the traffic flow prediction model is obtained through training according to the method of any one of claims 1 to 5.

7. A traffic flow prediction model training apparatus, characterized in that the apparatus comprises:

a first acquisition module, configured to acquire historical vehicle passing data and historical signal periods of an intersection, wherein the historical vehicle passing data comprises the historical vehicle passing time of each vehicle at the intersection;

a deduplication processing module, configured to deduplicate the historical vehicle passing data to be deduplicated by using the deduplication time threshold corresponding to each vehicle, to obtain updated historical vehicle passing data, wherein the historical vehicle passing data to be deduplicated is the historical vehicle passing data for which the time interval between two consecutive passes of the same vehicle is smaller than the deduplication time threshold corresponding to that vehicle;

and the training module is used for training a traffic flow prediction model based on the traffic flow in each historical signal period.

8. A traffic flow prediction apparatus, characterized in that the apparatus comprises:

the prediction module is used for inputting the traffic flow of the intersection in the current signal period into a traffic flow prediction model obtained through pre-training to obtain a predicted traffic flow value of the intersection, wherein the traffic flow prediction model is obtained through training according to the method of any one of claims 1 to 5.

9. An electronic device comprising a processor and a memory;

the memory is used for storing a computer program;

the processor is configured to implement the method of any one of claims 1 to 5 or the method of claim 6 when executing the computer program stored in the memory.

10. A machine readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method of any one of claims 1 to 5 or the method of claim 6.