CN110929926A - Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest - Google Patents
- Publication number
- CN110929926A
- Authority
- CN
- China
- Prior art keywords
- short
- prediction
- passenger flow
- random forest
- term
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
The invention discloses a short-term explosive passenger flow prediction method based on a long short-term memory (LSTM) network and a random forest, which addresses the poor practicability of existing passenger flow prediction methods. The technical scheme combines two single models, an LSTM network and a random forest: the LSTM network is first fitted to the daily passenger flow time series, the random forest is then fitted to the residuals between the observed passenger flow and the LSTM predictions, and finally the predictions of the two trained models are summed to give the prediction of the combined model. The combined model draws on the advantages of both single models: compared with either one alone it is more accurate and more stable, it predicts passenger flow peaks better, and it is particularly suited to predicting short-term explosive passenger flow, so it has good practicability.
Description
Technical Field
The invention relates to a passenger flow prediction method, in particular to a short-term explosive passenger flow prediction method based on a long short-term memory (LSTM) network and a random forest.
Background
The document "Forecasting U.S. tourist arrivals using optimal singular spectrum analysis, Tourism Management, 2015, 46: 322-335" discloses a method for forecasting tourism demand using Singular Spectrum Analysis (SSA) and reports significant advantages in predicting the number of tourists travelling to the United States. Model construction comprises two stages, decomposition and reconstruction: the decomposition stage consists of embedding and singular value decomposition, and the reconstruction stage consists of grouping and diagonal averaging, after which the model reaches its optimal prediction state. However, this is a single-model method whose application range and accuracy leave room for improvement; it does not take into account that domestic tourism is strongly affected by factors such as festivals and holidays, and it is not suited to predicting short-term explosive passenger flow.
Disclosure of Invention
To overcome the poor practicability of existing passenger flow prediction methods, the invention provides a short-term explosive passenger flow prediction method based on a long short-term memory (LSTM) network and a random forest. The method combines two single models, an LSTM network and a random forest: the LSTM network is first fitted to the daily passenger flow time series, the random forest is then fitted to the residuals between the observed passenger flow and the LSTM predictions, and finally the predictions of the two trained models are summed to give the prediction of the combined model. The combined model draws on the advantages of both single models: compared with either one alone it is more accurate and more stable, it predicts passenger flow peaks better, and it is particularly suited to predicting short-term explosive passenger flow, so it has good practicability.
The technical scheme adopted by the invention to solve this technical problem is a short-term explosive passenger flow prediction method based on an LSTM network and a random forest, characterized by comprising the following steps:
Step one: select the predictor variables.
Taking correlation, repeatability and feasibility into account, the passenger flow prediction model uses the scenic spot's daily passenger flow, daily weather conditions (temperature, wind direction, wind speed and humidity), web search index and holiday data as predictor variables.
Step two: train the prediction model.
The raw data are first preprocessed into a form suitable for the model. Given the selected predictor variables, preprocessing computes three indices, the temperature-humidity index, the wind effect (wind chill) index and the dressing index (index of clothing), denoted X_THI, X_WCI and X_ICL respectively; their calculation is given by formulas (1) to (3).
X_THI = (1.8t + 32) - 0.5(1 - f)(1.8t - 26)    (1)
where t is the temperature in degrees Celsius (°C), f is the relative humidity (%), v is the wind speed (m/s), s is the sunshine duration (h/d), H is 75% of the human metabolic rate (W/m²), A is the absorption of solar radiation by the human body, taken as 0.06, R is the solar radiation received per unit area perpendicular to the sunlight, taken as (1385 ± 7) W/m², and α is the solar altitude angle.
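As a minimal sketch of formula (1), the temperature-humidity index can be computed as below. The function name and argument names are illustrative, and the sketch assumes that the relative humidity, given in percent, enters the formula as a fraction between 0 and 1. Only formula (1) is sketched, because formulas (2) and (3) are not reproduced in this text.

```python
def temperature_humidity_index(t_celsius: float, rel_humidity_pct: float) -> float:
    """Formula (1): X_THI = (1.8t + 32) - 0.5(1 - f)(1.8t - 26).

    t_celsius        -- air temperature in degrees Celsius
    rel_humidity_pct -- relative humidity in percent (0 to 100)
    """
    # Assumption: f is used as a fraction, so 50 % becomes 0.5.
    f = rel_humidity_pct / 100.0
    return (1.8 * t_celsius + 32) - 0.5 * (1 - f) * (1.8 * t_celsius - 26)


if __name__ == "__main__":
    # Hypothetical day: 20 degrees Celsius at 50 % relative humidity.
    print(temperature_humidity_index(20.0, 50.0))
```

For t = 20 °C and f = 50%, the formula evaluates to 68 - 0.5 × 0.5 × 10 = 65.5.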
Next, the long short-term memory (LSTM) network is trained.
The year, month and day of each passenger flow observation are converted into an ordered sequence. Each day forms one record consisting of the serial number converted from the date, that day's comfort index, that day's holiday index, the previous day's search index and the previous day's passenger flow; the corresponding target is that day's passenger flow. After some abnormal data are removed, the batch size is set to the size of the whole data set and the number of iterations to 1000. The data are then fed into the LSTM network to obtain its prediction results.
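The record layout described above can be sketched as follows. This is an illustration only: the dictionary keys, the function name `build_records`, and the rule of skipping a day whose previous calendar day is missing (for example because it was removed as abnormal) are assumptions, not details taken from the patent.

```python
from datetime import date


def build_records(days):
    """Turn per-day observations into (features, target) pairs.

    `days` is a list of dicts sorted by date with the hypothetical keys
    'date', 'comfort', 'holiday', 'search' and 'flow'.
    """
    first_day = days[0]["date"]
    records = []
    for prev, cur in zip(days, days[1:]):
        # Assumption: "yesterday" must be the preceding calendar day.
        if (cur["date"] - prev["date"]).days != 1:
            continue
        features = [
            (cur["date"] - first_day).days,  # serial number converted from the date
            cur["comfort"],                  # current-day comfort index
            cur["holiday"],                  # current-day holiday index
            prev["search"],                  # yesterday's search index
            prev["flow"],                    # yesterday's passenger flow
        ]
        records.append((features, cur["flow"]))  # target: current-day passenger flow
    return records


if __name__ == "__main__":
    sample = [
        {"date": date(2019, 1, 1), "comfort": 60.0, "holiday": 1, "search": 10, "flow": 100},
        {"date": date(2019, 1, 2), "comfort": 61.0, "holiday": 0, "search": 20, "flow": 200},
    ]
    print(build_records(sample))
```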
Then the random forest is trained.
The maximum number of sub-models (trees) in the random forest is set to 1000, mean square error is the criterion for deciding whether a node continues to split, all features are considered at each split, and the maximum depth of the trees is not limited. To speed up training while respecting the capacity of the machine, the number of parallel jobs is set to 16. The records are fed into the random forest, which is trained on the residuals of the long short-term memory (LSTM) network's predictions to obtain the random forest's prediction results.
Finally, the models are combined.
The final prediction of the combined model is the sum of the predictions of the long short-term memory (LSTM) network model and the random forest model.
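The residual-and-sum scheme can be illustrated with a toy sketch. To stay dependency-free it does not use the patent's models: a fixed list of numbers stands in for the LSTM predictions, and a per-group mean of the residuals stands in for the random forest. All names and numbers below are hypothetical.

```python
from statistics import mean


def fit_residual_corrector(keys, residuals):
    """Stand-in for the second-stage model: mean residual per key."""
    groups = {}
    for key, res in zip(keys, residuals):
        groups.setdefault(key, []).append(res)
    table = {key: mean(values) for key, values in groups.items()}
    # Unseen keys get no correction.
    return lambda key: table.get(key, 0.0)


def combine(base_predictions, keys, corrector):
    """Combined prediction = first-stage prediction + predicted residual."""
    return [pred + corrector(key) for pred, key in zip(base_predictions, keys)]


if __name__ == "__main__":
    observed = [100.0, 120.0, 900.0, 110.0, 950.0]  # hypothetical daily passenger flow
    holiday = [0, 0, 1, 0, 1]                       # hypothetical holiday flag
    base = [105.0, 115.0, 400.0, 110.0, 450.0]      # hypothetical first-stage output

    # Residual = observed value minus first-stage prediction.
    residuals = [y - p for y, p in zip(observed, base)]
    corrector = fit_residual_corrector(holiday, residuals)
    print(combine(base, holiday, corrector))
```

In this toy data the first-stage output badly underestimates the two holiday peaks, and the residual stage recovers them, which is the behaviour the combined model is meant to provide for explosive passenger flow.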
The beneficial effects of the invention are as follows: the method combines two single models, a long short-term memory (LSTM) network and a random forest. The LSTM network is first fitted to the daily passenger flow time series, the random forest is then fitted to the residuals between the observed passenger flow and the LSTM predictions, and finally the predictions of the two trained models are summed to give the prediction of the combined model. The combined model draws on the advantages of both single models: compared with either one alone it is more accurate and more stable, it predicts passenger flow peaks better, and it is particularly suited to predicting short-term explosive passenger flow, so it has good practicability.
The invention is described in detail below with reference to the following embodiment.
Detailed Description
The invention is further described below using passenger flow prediction for the Mount Siguniang (Four Girls Mountain) scenic spot as an example. Mount Siguniang is a typical mountain scenic spot with a certain nationwide reputation. Most importantly, it adopted information systems early, so ample data are available and daily passenger flow data are easy to obtain.
The short-term explosive passenger flow prediction method based on a long short-term memory (LSTM) network and a random forest comprises the following specific steps:
Step 1: select the predictor variables.
Taking factors such as correlation, repeatability and feasibility into account, the passenger flow prediction model uses the daily passenger flow of the Mount Siguniang scenic spot, daily weather conditions (temperature, wind direction, wind speed and humidity), the web search index and holiday data as predictor variables.
Step 2: train the prediction model.
The raw data are first preprocessed into a form suitable for the model. Given the predictor variables selected in this example, preprocessing computes three indices, the temperature-humidity index, the wind effect (wind chill) index and the dressing index (index of clothing), denoted X_THI, X_WCI and X_ICL respectively; their calculation is given by formulas (1) to (3).
X_THI = (1.8t + 32) - 0.5(1 - f)(1.8t - 26)    (1)
where t is the temperature in degrees Celsius (°C), f is the relative humidity (%), v is the wind speed (m/s), s is the sunshine duration (h/d), H is 75% of the human metabolic rate (W/m²), A is the absorption of solar radiation by the human body (after weighing practical conditions, A is taken as 0.06), R is the solar radiation received per unit area of land perpendicular to the sunlight, taken as (1385 ± 7) W/m², and α is the solar altitude angle.
Next, the long short-term memory (LSTM) network is trained.
The year, month and day of each passenger flow observation are converted into an ordered sequence. Each day forms one record consisting of the serial number converted from the date, that day's comfort index, that day's holiday index, the previous day's search index and the previous day's passenger flow; the corresponding target is that day's passenger flow. After some abnormal data are removed, the batch size is set to the size of the whole data set and the number of iterations to 1000. The data are then fed into the LSTM network to obtain its prediction results.
Then the random forest is trained.
The maximum number of sub-models (trees) in the random forest is set to 1000, mean square error is the criterion for deciding whether a node continues to split, all features are considered at each split, and the maximum depth of the trees is not limited. To speed up training while respecting the capacity of the machine, the number of parallel jobs is set to 16. The records are fed into the random forest, which is trained on the residuals of the long short-term memory (LSTM) network's predictions to obtain the random forest's prediction results.
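One way to express these settings is with scikit-learn's `RandomForestRegressor`. The patent does not name a library, so the choice of scikit-learn and the mapping of each setting to a parameter are assumptions; `make_residual_forest` is a hypothetical helper name. The criterion name `"squared_error"` requires scikit-learn 1.0 or later (earlier releases called it `"mse"`).

```python
from sklearn.ensemble import RandomForestRegressor


def make_residual_forest() -> RandomForestRegressor:
    """Build a forest configured like the description (assumed mapping)."""
    return RandomForestRegressor(
        n_estimators=1000,          # maximum number of sub-models (trees)
        criterion="squared_error",  # mean square error as the split criterion
        max_features=None,          # all features considered at each split
        max_depth=None,             # tree depth not limited
        n_jobs=16,                  # number of parallel jobs
    )


if __name__ == "__main__":
    forest = make_residual_forest()
    # Intended use (not run here): forest.fit(X, residuals), where residuals are
    # the observed passenger flow minus the first-stage model's predictions.
    print(forest.get_params()["n_estimators"])
```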
Finally, the models are combined.
The final prediction of the combined model is the sum of the predictions of the long short-term memory (LSTM) network model and the random forest model.
The combined model is compared with the long short-term memory (LSTM) network model and the random forest model. All three are regression models, so two evaluation metrics, root mean square error (RMSE) and R-squared, are used to assess the prediction performance of the combined model, the LSTM network model and the random forest model; the specific results are shown in Tables 1 and 2.
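For reference, the two evaluation metrics can be computed as in the sketch below, which uses their standard definitions; the function names and sample numbers are illustrative and are not results from the patent.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y - y_hat)^2))."""
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0]   # hypothetical observations
    predicted = [1.0, 2.0, 4.0]  # hypothetical predictions
    print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R-squared closer to 1 both indicate a better fit, so the two metrics should rank the three models consistently when one model is clearly better.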
TABLE 1 comparison of the results of the three models
TABLE 2 three model explosive passenger flow prediction results
The experimental results show that the combined model, which joins the long short-term memory (LSTM) network model and the random forest model, outperforms either single model. It is the best of the three on both root mean square error and R-squared. In predicting short-term explosive passenger flow, the combined model also shows a clear advantage over the single models. By pairing two single models that each perform well at nonlinear prediction, the combined model gains stronger nonlinear fitting capability, which gives it an advantage in passenger flow prediction.
Claims (1)
1. A short-term explosive passenger flow prediction method based on a long short-term memory (LSTM) network and a random forest, characterized by comprising the following steps:
step one, selecting the predictor variables;
taking correlation, repeatability and feasibility into account, the passenger flow prediction model uses the daily passenger flow of the scenic spot, daily weather conditions (temperature, wind direction, wind speed and humidity), the web search index and holiday data as predictor variables;
step two, training the prediction model;
preprocessing the raw data into a form suitable for the model; given the selected predictor variables, the preprocessing computes three indices, the temperature-humidity index, the wind effect (wind chill) index and the dressing index (index of clothing), denoted X_THI, X_WCI and X_ICL respectively, calculated by formulas (1) to (3);
X_THI = (1.8t + 32) - 0.5(1 - f)(1.8t - 26)    (1)
where t is the temperature in degrees Celsius (°C), f is the relative humidity (%), v is the wind speed (m/s), s is the sunshine duration (h/d), H is 75% of the human metabolic rate (W/m²), A is the absorption of solar radiation by the human body, taken as 0.06, R is the solar radiation received per unit area perpendicular to the sunlight, taken as (1385 ± 7) W/m², and α is the solar altitude angle;
next, training the LSTM network;
converting the year, month and day of each passenger flow observation into an ordered sequence; recording the information of each day as one record consisting of the serial number converted from the date, that day's comfort index, that day's holiday index, the previous day's search index and the previous day's passenger flow, the corresponding target being that day's passenger flow; after some abnormal data are removed, setting the batch size to the size of the whole data set and the number of iterations to 1000; feeding the data into the LSTM network to obtain the prediction results of the LSTM network;
then, training the random forest;
setting the maximum number of sub-models (trees) in the random forest to 1000, using mean square error as the criterion for deciding whether a node continues to split, considering all features at each split, and not limiting the maximum depth of the trees; to speed up training while respecting the capacity of the machine, setting the number of parallel jobs to 16; feeding the records into the random forest and training it on the residuals of the LSTM predictions to obtain the prediction results of the random forest;
finally, combining the models;
the final prediction of the combined model is the sum of the predictions of the LSTM network model and the random forest model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911126632.1A CN110929926A (en) | 2019-11-18 | 2019-11-18 | Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911126632.1A CN110929926A (en) | 2019-11-18 | 2019-11-18 | Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110929926A (en) | 2020-03-27 |
Family
ID=69854061
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911126632.1A Pending CN110929926A (en) | 2019-11-18 | 2019-11-18 | Short-term explosion passenger flow prediction method based on long and short-term memory network and random forest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110929926A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111985706A (en) * | 2020-08-15 | 2020-11-24 | 西北工业大学 | Scenic spot daily passenger flow volume prediction method based on feature selection and LSTM |
CN112215424A (en) * | 2020-10-16 | 2021-01-12 | 平安国际智慧城市科技股份有限公司 | Medical index prediction method, device, electronic equipment and storage medium |
CN112949939A (en) * | 2021-03-30 | 2021-06-11 | 福州市电子信息集团有限公司 | Taxi passenger carrying hotspot prediction method based on random forest model |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106779196A (en) * | 2016-12-05 | 2017-05-31 | 中国航天系统工程有限公司 | A kind of tourist flow prediction and peak value regulation and control method based on tourism big data |
CN107590569A (en) * | 2017-09-25 | 2018-01-16 | 山东浪潮云服务信息科技有限公司 | A kind of data predication method and device |
US20180176243A1 (en) * | 2016-12-16 | 2018-06-21 | Patternex, Inc. | Method and system for learning representations for log data in cybersecurity |
CN109034469A (en) * | 2018-07-20 | 2018-12-18 | 成都中科大旗软件有限公司 | A kind of tourist flow prediction technique based on machine learning |
CN110443314A (en) * | 2019-08-08 | 2019-11-12 | 中国工商银行股份有限公司 | Scenic spot passenger flow forecast method and device based on machine learning |
-
2019
- 2019-11-18 CN CN201911126632.1A patent/CN110929926A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106779196A (en) * | 2016-12-05 | 2017-05-31 | 中国航天系统工程有限公司 | A kind of tourist flow prediction and peak value regulation and control method based on tourism big data |
US20180176243A1 (en) * | 2016-12-16 | 2018-06-21 | Patternex, Inc. | Method and system for learning representations for log data in cybersecurity |
CN107590569A (en) * | 2017-09-25 | 2018-01-16 | 山东浪潮云服务信息科技有限公司 | A kind of data predication method and device |
CN109034469A (en) * | 2018-07-20 | 2018-12-18 | 成都中科大旗软件有限公司 | A kind of tourist flow prediction technique based on machine learning |
CN110443314A (en) * | 2019-08-08 | 2019-11-12 | 中国工商银行股份有限公司 | Scenic spot passenger flow forecast method and device based on machine learning |
Non-Patent Citations (1)
Title |
---|
Li Dong: "Correlation analysis of the inverse variation between tourism climate comfort and tourist numbers in the Turpan region" *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111985706A (en) * | 2020-08-15 | 2020-11-24 | 西北工业大学 | Scenic spot daily passenger flow volume prediction method based on feature selection and LSTM |
CN111985706B (en) * | 2020-08-15 | 2023-08-25 | 西北工业大学 | Scenic spot daily passenger flow prediction method based on feature selection and LSTM |
CN112215424A (en) * | 2020-10-16 | 2021-01-12 | 平安国际智慧城市科技股份有限公司 | Medical index prediction method, device, electronic equipment and storage medium |
CN112949939A (en) * | 2021-03-30 | 2021-06-11 | 福州市电子信息集团有限公司 | Taxi passenger carrying hotspot prediction method based on random forest model |
CN112949939B (en) * | 2021-03-30 | 2022-12-06 | 福州市电子信息集团有限公司 | Taxi passenger carrying hotspot prediction method based on random forest model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20200327 |