CN110929926A - Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest

Info

Publication number
CN110929926A
CN110929926A CN201911126632.1A CN201911126632A CN110929926A CN 110929926 A CN110929926 A CN 110929926A CN 201911126632 A CN201911126632 A CN 201911126632A CN 110929926 A CN110929926 A CN 110929926A
Authority
CN
China
Prior art keywords
short
prediction
passenger flow
random forest
term
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911126632.1A
Other languages
Chinese (zh)
Inventor
殷茗
李佳成
周翔
张煊宇
芦菲娅
李欣
李怿臻
马子琛
马怀宇
朱奎宇
吴瑜
仵芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-11-18
Filing date
2019-11-18
Publication date
2020-03-27
Application filed by Northwestern Polytechnical University
Priority to CN201911126632.1A
Publication of CN110929926A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Abstract

The invention discloses a method for predicting short-term explosive passenger flow based on a long short-term memory (LSTM) network and a random forest, which addresses the poor practicability of existing passenger flow prediction methods. The technical scheme combines two single models, an LSTM network and a random forest. The LSTM network is first fitted to the daily passenger flow time series; the random forest is then fitted to the residuals left by the LSTM network's predictions; finally, the predictions of the two trained models are summed to give the prediction of the combined model. The combined model draws on the strengths of both single models: compared with either one alone, it improves prediction accuracy and stability and predicts passenger flow peaks better, which makes it particularly suitable for short-term explosive passenger flow and gives it good practicability.

Description

Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest
Technical Field
The invention relates to a passenger flow prediction method, in particular to a method for predicting short-term explosive passenger flow based on a long short-term memory (LSTM) network and a random forest.
Background
The document "Forecasting U.S. tourist arrivals using optimal singular spectrum analysis, Tourism Management, 2015, 46: 322-335" discloses a method for forecasting tourism demand using singular spectrum analysis (SSA) and reports significant advantages in predicting the number of tourists travelling to the United States. Model construction comprises two stages, decomposition and reconstruction: the decomposition stage consists of embedding and singular value decomposition, and the reconstruction stage consists of grouping and diagonal averaging, which together bring the model to its optimal prediction state. However, the method is a single-model prediction whose application range and accuracy leave room for improvement. It does not account for the strong influence of festivals and holidays on domestic tourism, and it is not suited to predicting short-term explosive passenger flow.
Disclosure of Invention
To overcome the poor practicability of existing passenger flow prediction methods, the invention provides a method for predicting short-term explosive passenger flow based on a long short-term memory (LSTM) network and a random forest. The method combines two single models, an LSTM network and a random forest. The LSTM network is first fitted to the daily passenger flow time series; the random forest is then fitted to the residuals left by the LSTM network's predictions; finally, the predictions of the two trained models are summed to give the prediction of the combined model. The combined model draws on the strengths of both single models: compared with either one alone, it improves prediction accuracy and stability and predicts passenger flow peaks better, which makes it particularly suitable for short-term explosive passenger flow and gives it good practicability.
The technical scheme adopted by the invention to solve the technical problem is a method for predicting short-term explosive passenger flow based on a long short-term memory network and a random forest, characterized by comprising the following steps:
Step one: select the predictor variables.
Taking correlation, repeatability and feasibility into account, the passenger flow prediction model uses as predictor variables the daily passenger flow of the scenic spot, the daily weather conditions (temperature, wind direction, wind speed and humidity), the network search index and holiday data.
Step two: train the prediction model.
First, the raw data are preprocessed into a form suitable for the model. Given the selected predictor variables, preprocessing computes three indices, the temperature-humidity index, the wind effect index and the dressing index, denoted XTHI, XWCI and XICL respectively and calculated by formulas (1) to (3).
XTHI=(1.8t+32)-0.5(1-f)(1.8t-26) (1)
[Formula (2), XWCI: equation image not reproduced in the source text]
[Formula (3), XICL: equation image not reproduced in the source text]
Here t is the temperature in degrees Celsius (°C), f is the relative humidity (%), v is the wind speed (m/s), s is the sunshine duration (h/d), H is 75% of the human metabolic rate (W/m²), A is the human body's absorption of solar radiation, taken as 0.06, R is the solar radiation received per unit area perpendicular to the sunlight, taken as (1385 ± 7) W/m², and α is the solar altitude angle.
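As a worked illustration of formula (1), the temperature-humidity index can be computed directly from temperature and relative humidity. The following is a minimal sketch: the function name `thi` is hypothetical, and it assumes f is supplied as a fraction between 0 and 1, whereas the text only gives the unit as %.

```python
def thi(t_celsius: float, rel_humidity: float) -> float:
    """Temperature-humidity index following formula (1).

    Assumption: rel_humidity is a fraction in [0, 1]; the source text
    only states the unit as "%".
    """
    return (1.8 * t_celsius + 32) - 0.5 * (1 - rel_humidity) * (1.8 * t_celsius - 26)


if __name__ == "__main__":
    # A mild, moderately humid day (illustrative values only).
    print(thi(25.0, 0.6))
```

At full saturation (f = 1) the second term vanishes and the index reduces to the temperature in degrees Fahrenheit, which is a quick sanity check on any implementation.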
Second, the long short-term memory network is trained.
The dates (year, month, day) of the passenger flow data are converted into an ordered sequence. Each day forms one record consisting of the serial number converted from the date, that day's comfort index, that day's holiday index, the previous day's search index and the previous day's passenger flow; the target is that day's passenger flow. After abnormal data are removed, the batch size is set to the size of the whole data set and the number of iterations to 1000. The records are fed into the long short-term memory network to obtain its prediction.
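The record layout described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the names `DailyObservation` and `build_records` are hypothetical, the serial number is taken as days elapsed since the first observation, and days without a valid previous day (for example after an abnormal day has been removed) are skipped, which is a design choice the text does not specify.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DailyObservation:
    day: date
    comfort_index: float
    holiday_index: int
    search_index: float
    passenger_flow: float


def build_records(observations):
    """Return (features, target) pairs, one per day that has a 'yesterday'."""
    ordered = sorted(observations, key=lambda o: o.day)
    if not ordered:
        return []
    start = ordered[0].day
    records = []
    for prev, cur in zip(ordered, ordered[1:]):
        if (cur.day - prev.day).days != 1:
            continue  # no observation for the previous day, so no lag features
        features = [
            (cur.day - start).days,  # serial number converted from the date
            cur.comfort_index,       # current-day comfort index
            cur.holiday_index,       # current-day holiday index
            prev.search_index,       # previous-day search index
            prev.passenger_flow,     # previous-day passenger flow
        ]
        records.append((features, cur.passenger_flow))
    return records
```

Using only previous-day values for the search index and passenger flow keeps every feature available at prediction time, since the current day's flow is the quantity being predicted.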
Third, the random forest is trained.
The maximum number of sub-models (trees) in the random forest is set to 1000, the criterion for deciding whether a node continues to split is the mean squared error, all features are considered at each split, and the maximum depth of the trees is not limited. To speed up training while staying within the capacity of the machine, the number of parallel jobs is set to 16. The records are fed into the random forest, which is trained on the residuals to obtain its prediction.
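These settings line up closely with the keyword arguments of scikit-learn's `RandomForestRegressor`. The source does not name a library, so the mapping below is an assumption; `RF_PARAMS` and `make_random_forest` are hypothetical names.

```python
# Hypothetical mapping of the stated settings onto scikit-learn keyword arguments.
RF_PARAMS = {
    "n_estimators": 1000,          # number of sub-models (trees)
    "criterion": "squared_error",  # mean squared error ("mse" in releases before 1.0)
    "max_features": None,          # consider all features at every split
    "max_depth": None,             # tree depth not limited
    "n_jobs": 16,                  # number of parallel jobs
}


def make_random_forest():
    """Build the regressor; scikit-learn is imported lazily so that the
    parameter dictionary can be inspected without the library installed."""
    from sklearn.ensemble import RandomForestRegressor

    return RandomForestRegressor(**RF_PARAMS)
```

The forest would then be fitted with the day records as inputs and the LSTM residuals, rather than the raw passenger flow, as targets.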
Finally, the models are combined.
The final prediction of the combined model is the sum of the predictions of the long short-term memory network model and the random forest model.
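The combination step can be sketched as residual stacking: fit a base model, fit a second model to what the base model got wrong, then add the two predictions. In the sketch below the two small classes are stand-ins chosen only so the example runs without external libraries; they are not the LSTM or the random forest. The residual is assumed to be the actual value minus the base prediction, and all names and data are hypothetical.

```python
from statistics import mean


class PersistenceModel:
    """Stand-in for the LSTM: predicts yesterday's flow (the last feature)."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [row[-1] for row in X]


class HolidayMeanModel:
    """Stand-in for the random forest: mean target per holiday flag (feature 0)."""

    def fit(self, X, y):
        groups = {}
        for row, target in zip(X, y):
            groups.setdefault(row[0], []).append(target)
        self.means_ = {flag: mean(values) for flag, values in groups.items()}
        self.default_ = mean(y)
        return self

    def predict(self, X):
        return [self.means_.get(row[0], self.default_) for row in X]


def fit_combined(base, residual_model, X, y):
    """Fit the base model on y, then the residual model on (y - base prediction)."""
    base.fit(X, y)
    residuals = [actual - pred for actual, pred in zip(y, base.predict(X))]
    residual_model.fit(X, residuals)
    return base, residual_model


def predict_combined(base, residual_model, X):
    """Combined prediction = base prediction + predicted residual."""
    return [b + r for b, r in zip(base.predict(X), residual_model.predict(X))]


# Toy records: [holiday flag, yesterday's flow]; holidays add a burst of visitors.
X = [[0, 100.0], [1, 100.0], [0, 300.0], [1, 120.0]]
y = [110.0, 300.0, 310.0, 320.0]
base, resid = fit_combined(PersistenceModel(), HolidayMeanModel(), X, y)
combined = predict_combined(base, resid, X)
```

In this toy data the base model misses the holiday bursts by about 200 visitors, and the residual model recovers exactly that gap, which mirrors the role the text assigns to the random forest for peak days.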
The beneficial effects of the invention are as follows. The method combines two single models, a long short-term memory network and a random forest. The long short-term memory network is first fitted to the daily passenger flow time series; the random forest is then fitted to the residuals left by the long short-term memory network's predictions; finally, the predictions of the two trained models are summed to give the prediction of the combined model. The combined model draws on the strengths of both single models: compared with either one alone, it improves prediction accuracy and stability and predicts passenger flow peaks better, which makes it particularly suitable for short-term explosive passenger flow and gives it good practicability.
The invention is described in detail below with reference to an embodiment.
Detailed Description
The invention is further described using passenger flow prediction for the Four Girls Mountain (Siguniangshan) scenic area as an example. Four Girls Mountain is a typical mountain scenic area with a degree of nationwide recognition. Most importantly, the scenic area began its informatization early, so sufficient data are available and daily passenger flow data are easy to obtain.
The method of the invention for predicting short-term explosive passenger flow based on a long short-term memory network and a random forest comprises the following specific steps:
Step 1: select the predictor variables.
Taking correlation, repeatability and feasibility into account, the passenger flow prediction model uses as predictor variables the daily passenger flow of the Four Girls Mountain scenic area, the daily weather conditions (temperature, wind direction, wind speed and humidity), the network search index and holiday data.
Step 2: train the prediction model.
First, the raw data are preprocessed into a form suitable for the model. For the predictor variables selected in this example, preprocessing computes three indices, the temperature-humidity index, the wind effect index and the dressing index, denoted XTHI, XWCI and XICL respectively and calculated by formulas (1) to (3).
XTHI=(1.8t+32)-0.5(1-f)(1.8t-26) (1)
[Formula (2), XWCI: equation image not reproduced in the source text]
[Formula (3), XICL: equation image not reproduced in the source text]
Here t is the temperature in degrees Celsius (°C), f is the relative humidity (%), v is the wind speed (m/s), s is the sunshine duration (h/d), H is 75% of the human metabolic rate (W/m²), A is the human body's absorption of solar radiation (taken as 0.06 after weighing practical conditions), R is the solar radiation received per unit area of land perpendicular to the sunlight, taken as (1385 ± 7) W/m², and α is the solar altitude angle.
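The source text does not reproduce formulas (2) and (3). For orientation only, the sketch below uses the forms of the wind effect index and the dressing (clothing) index that are common in the tourism climate comfort literature and that take the same variables as defined above. Whether the patent uses exactly these coefficients is an assumption, as are the function names and the default H = 87 W/m² (75% of a light-activity metabolic rate of 116 W/m²).

```python
import math


def wind_effect_index(t, v, s):
    """Literature form (assumed, not taken from the patent images):
    -(10*sqrt(v) + 10.45 - v) * (33 - t) + 8.55 * s
    t: temperature (deg C), v: wind speed (m/s), s: sunshine duration (h/d).
    """
    return -(10 * math.sqrt(v) + 10.45 - v) * (33 - t) + 8.55 * s


def dressing_index(t, v, alpha_deg, H=87.0, A=0.06, R=1385.0):
    """Literature form (assumed, not taken from the patent images):
    (33 - t) / (0.155 * H) - (H + A * R * cos(alpha)) / ((0.62 + 19.0 * sqrt(v)) * H)
    alpha_deg: solar altitude angle in degrees.
    """
    cos_alpha = math.cos(math.radians(alpha_deg))
    return (33 - t) / (0.155 * H) - (H + A * R * cos_alpha) / (
        (0.62 + 19.0 * math.sqrt(v)) * H
    )
```

Both functions share the term (33 - t), the gap between an assumed skin temperature of 33 °C and the air temperature, which is why the two indices move together as the day cools.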
Second, the long short-term memory network is trained.
The dates (year, month, day) of the passenger flow data are converted into an ordered sequence. Each day forms one record consisting of the serial number converted from the date, that day's comfort index, that day's holiday index, the previous day's search index and the previous day's passenger flow; the target is that day's passenger flow. After abnormal data are removed, the batch size is set to the size of the whole data set and the number of iterations to 1000. The records are fed into the long short-term memory network to obtain its prediction.
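Setting the batch size to the whole data set means that each iteration computes the loss gradient over all records and makes exactly one parameter update, so 1000 iterations amount to 1000 full passes over the data. The sketch below shows that schedule on a one-feature linear model, which is only a stand-in for the LSTM; the learning rate, the data and the function name are hypothetical.

```python
def full_batch_train(xs, ys, iterations=1000, lr=0.05):
    """Full-batch gradient descent on mean squared error for y = w*x + b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # One iteration = one gradient over the entire data set (a single batch).
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
w, b = full_batch_train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

With a small daily data set, full-batch updates are cheap and deterministic, at the cost of losing the regularising noise that mini-batches provide.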
Third, the random forest is trained.
The maximum number of sub-models (trees) in the random forest is set to 1000, the criterion for deciding whether a node continues to split is the mean squared error, all features are considered at each split, and the maximum depth of the trees is not limited. To speed up training while staying within the capacity of the machine, the number of parallel jobs is set to 16. The records are fed into the random forest, which is trained on the residuals to obtain its prediction.
Finally, the models are combined.
The final prediction of the combined model is the sum of the predictions of the long short-term memory network model and the random forest model.
The combined model is compared with the long short-term memory network model and the random forest model. All three are regression models, so two evaluation metrics, root mean square error (RMSE) and R-squared, are used to assess the prediction performance of the combined model, the long short-term memory network model and the random forest model; the results are shown in Tables 1 and 2.
Table 1. Comparison of the results of the three models
[Table image not reproduced in the source text]
Table 2. Explosive passenger flow prediction results of the three models
[Table image not reproduced in the source text]
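The two evaluation metrics named above, root mean square error and R-squared, can be computed as in the following minimal sketch. The function names are hypothetical, and the standard definitions are assumed, since the text does not spell them out.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt(mean((actual - predicted)^2))."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot
```

A lower RMSE and an R-squared closer to 1 both indicate a better fit; a model that always predicts the mean of the actual values scores an R-squared of 0.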
The experimental results show that the combined model, which joins the long short-term memory network model and the random forest model, outperforms either single model. It is the best of the three by both root mean square error and R-squared. It also has a clear advantage over the single models in predicting short-term explosive passenger flow. Because both constituent models perform well at nonlinear prediction, the combined model gains stronger nonlinear fitting capability, which gives it an advantage in passenger flow prediction.

Claims (1)

1. A method for predicting short-term explosive passenger flow based on a long short-term memory network and a random forest, characterized by comprising the following steps:
step one, selecting predictor variables;
taking correlation, repeatability and feasibility into account, selecting, as predictor variables of the passenger flow prediction model, the daily passenger flow of the scenic spot, the daily weather conditions (temperature, wind direction, wind speed and humidity), the network search index and holiday data;
step two, training the prediction model;
first, preprocessing the raw data into a form suitable for the model; given the selected predictor variables, the preprocessing computes three indices, the temperature-humidity index, the wind effect index and the dressing index, denoted XTHI, XWCI and XICL respectively and calculated by formulas (1) to (3);
XTHI=(1.8t+32)-0.5(1-f)(1.8t-26) (1)
[Formula (2), XWCI: equation image not reproduced in the source text]
[Formula (3), XICL: equation image not reproduced in the source text]
wherein t is the temperature in degrees Celsius (°C), f is the relative humidity (%), v is the wind speed (m/s), s is the sunshine duration (h/d), H is 75% of the human metabolic rate (W/m²), A is the human body's absorption of solar radiation, taken as 0.06, R is the solar radiation received per unit area perpendicular to the sunlight, taken as (1385 ± 7) W/m², and α is the solar altitude angle;
second, training the long short-term memory network;
converting the dates (year, month, day) of the passenger flow data into an ordered sequence; forming one record per day, the record comprising the serial number converted from the date, that day's comfort index, that day's holiday index, the previous day's search index and the previous day's passenger flow, the target being that day's passenger flow; after abnormal data are removed, setting the batch size to the size of the whole data set and the number of iterations to 1000; feeding the records into the long short-term memory network to obtain its prediction;
third, training the random forest;
setting the maximum number of sub-models (trees) in the random forest to 1000, using the mean squared error as the criterion for deciding whether a node continues to split, considering all features at each split, and not limiting the maximum depth of the trees; to speed up training while staying within the capacity of the machine, setting the number of parallel jobs to 16; feeding the records into the random forest and training it on the residuals to obtain its prediction;
finally, combining the models;
the final prediction of the combined model being the sum of the predictions of the long short-term memory network model and the random forest model.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911126632.1A CN110929926A (en) 2019-11-18 2019-11-18 Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911126632.1A CN110929926A (en) 2019-11-18 2019-11-18 Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest

Publications (1)

Publication Number Publication Date
CN110929926A true CN110929926A (en) 2020-03-27

Family

ID=69854061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911126632.1A Pending CN110929926A (en) 2019-11-18 2019-11-18 Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest

Country Status (1)

Country Link
CN (1) CN110929926A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106779196A (en) * 2016-12-05 2017-05-31 中国航天系统工程有限公司 A kind of tourist flow prediction and peak value regulation and control method based on tourism big data
CN107590569A (en) * 2017-09-25 2018-01-16 山东浪潮云服务信息科技有限公司 A kind of data predication method and device
US20180176243A1 (en) * 2016-12-16 2018-06-21 Patternex, Inc. Method and system for learning representations for log data in cybersecurity
CN109034469A (en) * 2018-07-20 2018-12-18 成都中科大旗软件有限公司 A kind of tourist flow prediction technique based on machine learning
CN110443314A (en) * 2019-08-08 2019-11-12 中国工商银行股份有限公司 Scenic spot passenger flow forecast method and device based on machine learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
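Li Dong (李东): "Correlation analysis of the inverse variation between tourism climate comfort and visitor numbers in the Turpan region" *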

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985706A (en) * 2020-08-15 2020-11-24 西北工业大学 Scenic spot daily passenger flow volume prediction method based on feature selection and LSTM
CN111985706B (en) * 2020-08-15 2023-08-25 西北工业大学 Scenic spot daily passenger flow prediction method based on feature selection and LSTM
CN112215424A (en) * 2020-10-16 2021-01-12 平安国际智慧城市科技股份有限公司 Medical index prediction method, device, electronic equipment and storage medium
CN112949939A (en) * 2021-03-30 2021-06-11 福州市电子信息集团有限公司 Taxi passenger carrying hotspot prediction method based on random forest model
CN112949939B (en) * 2021-03-30 2022-12-06 福州市电子信息集团有限公司 Taxi passenger carrying hotspot prediction method based on random forest model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200327