CN114331234B - Rail transit passenger flow prediction method and system based on passenger travel information - Google Patents

Publication number
CN114331234B
Authority
CN
China
Prior art keywords
passenger
travel
station
time
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210254509.3A
Other languages
Chinese (zh)
Other versions
CN114331234A (en
Inventor
许心越
张安忠
蔡昌俊
刘军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Guangzhou Metro Group Co Ltd
Original Assignee
Beijing Jiaotong University
Guangzhou Metro Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University, Guangzhou Metro Group Co Ltd filed Critical Beijing Jiaotong University
Priority to CN202210254509.3A priority Critical patent/CN114331234B/en
Publication of CN114331234A publication Critical patent/CN114331234A/en
Application granted granted Critical
Publication of CN114331234B publication Critical patent/CN114331234B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of rail transit passenger flow prediction, and in particular to a rail transit passenger flow prediction method and system based on passenger travel information. The method comprises the following steps: acquiring passenger travel data, establishing a passenger travel information index system based on the travel data, and statistically calculating each index in the system; estimating the return passenger flow at a station in different time periods from a subset of the calculated indexes; and adding the station's return passenger flow as a covariate to a passenger flow prediction model to predict the inbound passenger flow. The invention mines passenger travel patterns from multi-source travel data and, on that basis, establishes a three-level passenger travel information index system and statistically calculates each of its indexes. It also provides a prediction method that identifies return passenger flow from passenger travel information and uses it to improve the accuracy of inbound passenger flow prediction at stations.

Description

Rail transit passenger flow prediction method and system based on passenger travel information
Technical Field
The invention relates to the technical field of rail transit passenger flow prediction, in particular to a rail transit passenger flow prediction method and a rail transit passenger flow prediction system based on passenger travel information.
Background
Accurate prediction of passenger demand at stations is crucial to the operation of urban subway systems. Conventional approaches mainly treat the passenger flow values at several past times as a time series and use it to predict the passenger flow at a future time. Such methods largely ignore the travel behavior patterns of individual passengers. For example, a passenger who exits at a subway station in the morning to go to work is likely to enter the same station in the evening to return home. Existing research indicates that a travel-behavior component should be added to the passenger flow time series. Under the concept of user travel information, representative, meaningful and easily understood user labels are abstracted through modeling, and the labels form an information set that describes the user's behavioral characteristics. Constructing passenger travel information from individual travel records therefore makes it possible to describe individual travel behavior patterns and to predict passenger flow accurately. Current work on rail transit passenger travel information still has the following shortcomings: the multi-source travel information of passengers is not mined in depth, so large amounts of data go unused; and the passenger travel information index systems established through data analysis are incomplete and require further development.
Disclosure of Invention
The invention aims to solve at least one technical problem in the background art, and provides a method and a system for predicting rail transit passenger flow based on passenger travel information.
In order to achieve the purpose, the invention provides a rail transit passenger flow prediction method based on passenger travel information, which comprises the following steps:
passenger travel data are acquired, a passenger travel information index system is established based on the passenger travel data, and all indexes in the passenger travel information index system are calculated in a statistical mode;
estimating the return passenger flow of different time periods in the station based on partial index data obtained by calculation in the passenger travel information index system;
and taking the return passenger flow volume of passengers in the station as a covariate to be added into the passenger flow prediction model to predict the station entering passenger flow.
Preferably, the trip data includes:
AFC card swiping record and APP code scanning record for acquiring the passenger entrance and exit time and entrance and exit station information;
APP registration data used for obtaining identity information and associated information of passengers;
APP value-added consumption data used for obtaining value-added service information of passengers;
POI data near the stop, associated with the stop, for describing geographic attributes of the stop.
Preferably, the index information in the passenger travel information index system includes: basic information, business information and derived information;
the basic information comprises identity information and associated information, the identity information comprises an APPID, gender and age of a passenger and whether the passenger is disabled, and the associated information comprises a third-party payment mode of the passenger and a city all-purpose card;
the service information comprises trip basic information, trip derived information and value added service information, wherein the trip basic information comprises the trip in and out time and the trip in and out station information of passengers, the trip derived information comprises average trip duration, total trip times, daily average trip times, trip time distribution, trip OD distribution, trip path distribution, first trip time, last trip time, holiday trip time distribution and holiday trip OD distribution, and the value added service information comprises value added service participation times, participation frequency, average transaction amount, payment mode distribution, merchant type distribution and last participation time;
the derived information comprises an activity attribute and a function attribute, wherein the activity attribute comprises travel activity, and the function attribute comprises a travel demand type of the passenger, a residence area site, a working area site and a value-added participation degree.
Preferably, the average trip duration is calculated as:
$$\bar{T}_i = \frac{1}{N_i}\sum_{n=1}^{N_i}\left(t^{d}_{i,n} - t^{o}_{i,n}\right)$$
the average daily number of trips is calculated as:
$$\bar{N}_i = \frac{N_i}{D}$$
the travel time distribution is the share of trips that fall within a given period $\tau$ of the day, calculated as:
$$P_i(\tau) = \frac{1}{N_i}\sum_{n=1}^{N_i} I\left(\text{trip } n \text{ falls within } \tau\right)$$
the travel OD distribution is the statistics of the passenger's three most frequent travel ODs;
the formula for calculating the first trip time $t^{\text{first}}_i$ is as follows:
Figure 880498DEST_PATH_IMAGE004
the formula for calculating the last trip time $t^{\text{last}}_i$ is as follows:
Figure 706372DEST_PATH_IMAGE005
the holiday travel time distribution is calculated in the same way over holiday trips only:
$$P^{h}_i(\tau) = \frac{1}{N^{h}_i}\sum_{n=1}^{N_i} I\left(\text{trip } n \text{ is on a holiday and falls within } \tau\right)$$
the holiday travel OD distribution is the statistics of the passenger's three most frequent travel ODs on holidays;
in the above formulas, $n$ indexes the trips of a passenger (the $n$-th trip), $d$ denotes the exit station, $o$ the entry station, $i$ the passenger and $t$ time; $t^{d}_{i,n}$ is the exit time of passenger $i$ on the $n$-th trip; $t^{o}_{i,n}$ is the entry time of passenger $i$ on the $n$-th trip; $\bar{T}_i$ is the average trip duration of passenger $i$; $N_i$ is the total number of historical trips of passenger $i$; $\bar{N}_i$ is the average daily number of trips of passenger $i$; $D$ is the total number of days in the statistical period; $I(\cdot)$ is a binary indicator function that equals 1 when its condition is satisfied and 0 otherwise; $t^{\text{first}}_i$ is the first trip time of passenger $i$; $t^{\text{last}}_i$ is the last trip time of passenger $i$; and $N^{h}_i$ is the total number of trips passenger $i$ makes on holidays within the statistical period.
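The duration, daily-average and time-distribution indexes above can be illustrated with a minimal Python sketch. The trip records, the function names and the choice of the entry time to decide whether a trip falls in a period are all assumptions made for illustration; the text does not specify them.

```python
from datetime import datetime, time

# Hypothetical trips of one passenger: (entry time, exit time).
trips = [
    (datetime(2022, 3, 1, 7, 40), datetime(2022, 3, 1, 8, 10)),
    (datetime(2022, 3, 1, 18, 5), datetime(2022, 3, 1, 18, 45)),
    (datetime(2022, 3, 2, 8, 30), datetime(2022, 3, 2, 9, 0)),
    (datetime(2022, 3, 3, 12, 0), datetime(2022, 3, 3, 12, 20)),
]

def average_trip_minutes(trips):
    """Mean of (exit time - entry time) over all trips, in minutes."""
    total_seconds = sum((t_out - t_in).total_seconds() for t_in, t_out in trips)
    return total_seconds / len(trips) / 60

def daily_average_trips(trips, days_in_period):
    """Total number of trips divided by the number of days in the period."""
    return len(trips) / days_in_period

def share_in_period(trips, start, end):
    """Share of trips whose entry time lies in [start, end].

    Assumption: the entry time decides which period a trip belongs to.
    """
    hits = sum(1 for t_in, _ in trips if start <= t_in.time() <= end)
    return hits / len(trips)

avg_minutes = average_trip_minutes(trips)
daily_avg = daily_average_trips(trips, days_in_period=4)
am_peak_share = share_in_period(trips, time(7, 0), time(9, 0))
print(avg_minutes, daily_avg, am_peak_share)
```

The indicator function of the text corresponds to the `1 for ... if ...` generator in `share_in_period`: each trip contributes 1 when the condition holds and nothing otherwise.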
Preferably, the formula for calculating the participation frequency $F_i$ is:
Figure 673430DEST_PATH_IMAGE018
wherein $F_i$ is the frequency with which passenger $i$ participates in value-added services and $M_i$ is the number of times passenger $i$ participates in value-added services;
the merchant type distribution is the statistics of the three merchant types the passenger uses most frequently;
the average transaction amount is calculated as:
$$\bar{A}_i = \frac{A_i}{M_i}$$
wherein $\bar{A}_i$ is the average amount passenger $i$ spends per value-added service transaction and $A_i$ is the total amount passenger $i$ spends on value-added services;
the payment mode distribution is the statistics of the three payment modes the passenger uses most frequently.
Preferably, the travel demand type is determined by clustering passengers on the total number of trips, the first trip time and the average trip duration computed from station-entry card-swiping data, and the clustering method comprises the following steps:
dividing the passengers into categories according to their travel characteristics using the K-means algorithm, with the total number of historical trips $N_i$, the first trip time $t^{\text{first}}_i$ and the average trip duration $\bar{T}_i$ of passenger $i$ in the passenger travel information index system selected as the clustering features for passengers at a station; the number of clusters $K$ is determined by the elbow method, whose key indicator is the within-cluster sum of squared errors (SSE), calculated as:
$$\mathrm{SSE} = \sum_{k=1}^{K}\sum_{x \in C_k}\lVert x - \mu_k\rVert^{2}$$
wherein $C_k$ denotes the $k$-th cluster and $\mu_k$ is the center point of $C_k$;
the formula for calculating the residential area station is:
Figure 492000DEST_PATH_IMAGE030
Figure 522273DEST_PATH_IMAGE031
the formula for calculating the working area station is:
Figure 821667DEST_PATH_IMAGE032
Figure 595588DEST_PATH_IMAGE033
wherein,
Figure 523093DEST_PATH_IMAGE034 represents the probability that station e is the residential area station of passenger i;
Figure 724267DEST_PATH_IMAGE035 represents the probability that station e is the working area station of passenger i;
Figure 370012DEST_PATH_IMAGE036 represents the total number of times the passenger enters and exits station e;
Figure 682045DEST_PATH_IMAGE037 represents the number of times passenger i enters station e before 12:00 on working days;
Figure 729635DEST_PATH_IMAGE038 represents the number of times passenger i enters station e before 16:00 on rest days;
Figure 104640DEST_PATH_IMAGE039 represents the number of times the passenger enters station e after 12:00 on working days;
Figure 237681DEST_PATH_IMAGE040 represents the number of times passenger i enters station e after 16:00 on rest days;
Figure 353405DEST_PATH_IMAGE041 represents the total number of times passenger i enters and exits station e on working days;
Figure 255502DEST_PATH_IMAGE042 represents the number of times passenger i exits station e before 12:00 on working days;
Figure 798478DEST_PATH_IMAGE043 represents the number of times the passenger exits station e after 12:00 on working days;
Figure 153236DEST_PATH_IMAGE044 represents the number of times the passenger exits station e after 16:00 on rest days;
Figure 072651DEST_PATH_IMAGE045 represents the number of times passenger i exits station e before 16:00 on rest days;
the value-added participation degree is classified as strong, medium or low: it is strong when the participation frequency is greater than 0.7, low when the participation frequency is less than 0.4, and medium when the participation frequency is between 0.4 and 0.7.
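As an illustration of the elbow method described above, the following self-contained Python sketch runs a plain K-means for several values of K and records the SSE for each. The passenger feature vectors, the fixed random seed and the helper names are hypothetical; in practice the three features would normally be rescaled before clustering, which the text does not discuss.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50, seed=0):
    """Plain K-means; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: each center becomes the mean of its cluster.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

def sse(centers, clusters):
    """Sum over clusters of squared distances to the cluster center."""
    return sum(sq_dist(p, c) for c, cl in zip(centers, clusters) for p in cl)

# Hypothetical features: (total trips, first trip hour, average trip minutes).
passengers = [
    (40, 7.5, 35), (42, 7.8, 33), (38, 8.0, 36),
    (5, 13.0, 20), (6, 14.5, 22), (4, 12.5, 18),
]

sse_by_k = {}
for k in range(1, 5):
    centers, clusters = kmeans(passengers, k)
    sse_by_k[k] = sse(centers, clusters)
print(sse_by_k)
```

The value of K is then read off where the SSE curve stops dropping sharply; for data with two clear groups like the sample above, the drop from K = 1 to K = 2 dominates.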
Preferably, estimating the return passenger flow at the station in different time periods based on a subset of the indexes calculated in the passenger travel information index system comprises:
counting Figure 829254DEST_PATH_IMAGE046 according to the entry and exit times, the residential area station, the working area station and the travel demand type in the passenger travel information index system, wherein s is a station, v is the day of the week (taking values 1 to 7), t is a time period, and Figure 534343DEST_PATH_IMAGE046 is the number of passengers who make a return trip from station s in time period t on day v of the week;
selecting historical outbound and return passenger flow data for station s and obtaining, by averaging, the conditional probability distribution Figure 976192DEST_PATH_IMAGE049 that, on day v of the week, passengers who arrive at station s in time period Figure 907556DEST_PATH_IMAGE047 depart from station s on the return trip in time period Figure 365082DEST_PATH_IMAGE048, calculated as:
Figure 126551DEST_PATH_IMAGE050
in the formula, Figure 190322DEST_PATH_IMAGE051 represents the total number of weeks, Figure 451539DEST_PATH_IMAGE047 represents time period a, Figure 917155DEST_PATH_IMAGE048 represents time period b, Figure 241345DEST_PATH_IMAGE052 represents the number of passengers alighting at station s in time period Figure 057991DEST_PATH_IMAGE047 on day v of the j-th week, Figure 716374DEST_PATH_IMAGE053 represents the entry time, and Figure 302076DEST_PATH_IMAGE054 represents the exit time;
estimating Figure 629338DEST_PATH_IMAGE056 from the probability distribution Figure 590975DEST_PATH_IMAGE055, calculated as:
Figure 969708DEST_PATH_IMAGE057
wherein Figure 675496DEST_PATH_IMAGE058 represents the number of passengers alighting at station s at time Figure 072980DEST_PATH_IMAGE059 on day v of the week, H represents the maximum interval between a passenger alighting at and boarding from station s, which is at most 24 hours, h represents the time-slot resolution, and t + 1 represents the time period following time period t.
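The two estimation steps above can be sketched in Python under explicit assumptions. The sketch assumes, without confirmation from the text, that the conditional probability is the week-averaged share of passengers alighting in period a who re-enter in period b, and that the return flow in a period is the probability-weighted sum of the alighting counts in the preceding periods up to the maximum interval. All data and names are hypothetical.

```python
from collections import defaultdict

# Hypothetical history for one station s and one day of the week v.
# Per week: alighting period a -> {return-entry period b -> passenger count}.
history = [
    {8: {17: 60, 18: 30}, 9: {18: 20}},
    {8: {17: 40, 18: 50}, 9: {18: 30, 19: 10}},
]
# Hypothetical alighting counts per period, one dict per week.
alightings = [
    {8: 100, 9: 40},
    {8: 100, 9: 50},
]

def conditional_return_prob(history, alightings):
    """Week-averaged share of period-a alighters who return in period b."""
    sums = defaultdict(float)
    for week_pairs, week_alight in zip(history, alightings):
        for a, returns in week_pairs.items():
            for b, count in returns.items():
                sums[(a, b)] += count / week_alight[a]
    return {key: total / len(history) for key, total in sums.items()}

def estimate_return_flow(prob, alight_today, b, max_gap=24):
    """Expected return entries in period b from today's earlier alightings."""
    return sum(
        n * prob.get((a, b), 0.0)
        for a, n in alight_today.items()
        if 0 < b - a <= max_gap
    )

prob = conditional_return_prob(history, alightings)
r_hat_18 = estimate_return_flow(prob, {8: 120, 9: 45}, b=18)
print(prob, r_hat_18)
```

Here `max_gap` plays the role of H, and periods are one hour long, so the time-slot resolution h is one hour in this sketch.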
Preferably, adding the return passenger flow of passengers at the station to the passenger flow prediction model as a covariate to predict the inbound passenger flow comprises:
adding the estimated return passenger flow Figure 864218DEST_PATH_IMAGE056 to a conventional seasonal autoregressive integrated moving average model to predict the inbound passenger flow;
the seasonal model is ARIMA(p, d, q)(P, D, Q)[Ω], where p, d and q are the non-seasonal autoregressive, differencing and moving-average orders; P, D and Q are the seasonal autoregressive, differencing and moving-average orders; and Ω is the number of periods per season;
for a time series $y_t$, the ARIMA(p, d, q)(P, D, Q)[Ω] model is:
$$\phi_p(B)\,\Phi_P(B^{\Omega})\,(1-B)^{d}\,(1-B^{\Omega})^{D}\,y_t = \theta_q(B)\,\Theta_Q(B^{\Omega})\,\varepsilon_t$$
wherein $B$ is the backshift operator defined by $B\,y_t = y_{t-1}$, and
$$\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}$$
$$\Phi_P(B^{\Omega}) = 1 - \Phi_1 B^{\Omega} - \dots - \Phi_P B^{P\Omega}$$
$$\theta_q(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}$$
$$\Theta_Q(B^{\Omega}) = 1 + \Theta_1 B^{\Omega} + \dots + \Theta_Q B^{Q\Omega}$$
wherein $\phi$, $\Phi$, $\theta$ and $\Theta$ are the coefficients to be estimated, $\varepsilon_t$ is a white-noise error term that follows a normal distribution with mean 0 and variance $\sigma^{2}$, and $y_t$ represents the value in the $t$-th time period;
when the return passenger flow $r_t$ is used as a covariate, the inbound passenger flow $y_t$ and the return passenger flow $r_t$ satisfy the following relationships:
$$y_t = \beta\, r_t + \eta_t$$
$$\eta_t \sim \mathrm{ARIMA}(p, d, q)(P, D, Q)[\Omega]$$
wherein $\beta$ is the regression coefficient; $r_t$ is obtained from Figure 487043DEST_PATH_IMAGE046 for the known day of the week v and station s, with time taking the values 1, ..., t; and $\eta_t$ follows an ARIMA(p, d, q)(P, D, Q)[Ω] model and represents the part of the total inbound passenger flow other than the return passenger flow. $\beta$ and $\eta_t$ are calculated from the station's historical $y_t$ and $r_t$; once $\eta_t$ is obtained, $\eta_{t+1}$ is predicted with the ARIMA(p, d, q)(P, D, Q)[Ω] model formula; $r_{t+1}$ is then obtained from Figure 618073DEST_PATH_IMAGE056, as its value at time t + 1 for the known day of the week v and station s; finally, $r_{t+1}$ and $\eta_{t+1}$ are substituted into $y_{t+1} = \beta\, r_{t+1} + \eta_{t+1}$ to predict the inbound passenger flow at time t + 1.
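The prediction procedure just described (fit the regression coefficient, extract the remaining series, forecast it one step ahead, then recombine it with the estimated return flow) can be sketched as follows. Two simplifications are assumptions of this sketch and not part of the method: the regression coefficient is fitted by ordinary least squares with the intercept absorbed into the remaining series, and a seasonal-naive forecast stands in for the seasonal ARIMA model so that the example needs no third-party library. The data are hypothetical.

```python
def fit_beta(y, r):
    """Ordinary least-squares slope of y on r (intercept left in the residual)."""
    r_mean = sum(r) / len(r)
    y_mean = sum(y) / len(y)
    cov = sum((ri - r_mean) * (yi - y_mean) for ri, yi in zip(r, y))
    var = sum((ri - r_mean) ** 2 for ri in r)
    return cov / var

def seasonal_naive(series, period):
    """Stand-in for the seasonal ARIMA forecast: the value one season ago."""
    return series[-period]

def forecast_inbound(y_hist, r_hist, r_next, period):
    """Predict y at t + 1 as beta * r at t + 1 plus the forecast remainder."""
    beta = fit_beta(y_hist, r_hist)
    eta = [yi - beta * ri for yi, ri in zip(y_hist, r_hist)]
    eta_next = seasonal_naive(eta, period)
    return beta * r_next + eta_next, beta, eta

# Hypothetical history: inbound flow y and return flow r per time period.
r_hist = [1, 2, 2, 1]
y_hist = [12, 24, 14, 22]

y_next, beta, eta = forecast_inbound(y_hist, r_hist, r_next=3, period=2)
print(beta, eta, y_next)
```

In a fuller implementation, the `seasonal_naive` step would be replaced by a fitted seasonal ARIMA model for the remaining series, and the coefficient and the time series model would usually be estimated jointly.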
In order to achieve the above object, the present invention provides a rail transit passenger flow prediction system based on passenger travel information, comprising:
the index acquisition module is used for acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the passenger travel information index system;
the return passenger flow calculation module is used for estimating the return passenger flow at different time intervals in the station based on part of index data calculated in the passenger travel information index system;
and the passenger flow prediction module is used for adding the return passenger flow of passengers in the station into the passenger flow prediction model as a covariate to predict the station entering passenger flow.
In order to achieve the above object, the present invention provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and running on the processor, wherein the computer program, when executed by the processor, implements the method for predicting rail transit passenger flow based on passenger travel information as described in any one of the above.
To achieve the above object, the present invention provides a computer-readable storage medium storing thereon a computer program, which when executed by a processor, implements a method for predicting rail transit passenger flow based on passenger travel information as described in any one of the above.
The invention has the beneficial effects that:
1. the rail transit passenger flow prediction method based on passenger travel information provided by the invention builds on smart-subway construction, effectively associates and fuses multi-source subway-related data, establishes passenger travel information, and explores its application to passenger flow prediction;
2. the method mines passenger travel patterns from the passengers' multi-source travel data, establishes a three-level passenger travel information index system on that basis, and statistically calculates each index;
3. the method identifies return passenger flow from passenger travel information and uses the return passenger flow to improve the accuracy of inbound passenger flow prediction at stations.
Drawings
Fig. 1 schematically shows a flow chart of a method for predicting rail transit passenger flow based on passenger travel information according to the present invention;
fig. 2 is a block diagram schematically showing the structure of a rail transit passenger flow prediction system based on passenger travel information according to the present invention.
Detailed Description
The content of the invention will now be discussed with reference to exemplary embodiments. It is to be understood that the embodiments discussed are merely intended to enable one of ordinary skill in the art to better understand and thus implement the teachings of the present invention, and do not imply any limitations on the scope of the invention.
As used herein, the term "include" and its variants are to be read as open-ended terms meaning "including, but not limited to". The term "based on" is to be read as "based, at least in part, on". The terms "one embodiment" and "an embodiment" are to be read as "at least one embodiment".
Fig. 1 schematically shows a flow chart of a method for predicting rail transit passenger flow based on passenger travel information according to the present invention. As shown in fig. 1, the method for predicting rail transit passenger flow based on passenger travel information according to the present invention comprises the following steps:
a. passenger travel data are acquired, a passenger travel information index system is established based on the passenger travel data, and all indexes in the passenger travel information index system are calculated in a statistical mode;
b. estimating the return passenger flow of different time periods in the station based on partial index data obtained by calculation in the passenger travel information index system;
c. and taking the return passenger flow volume of passengers in the station as a covariate to be added into the passenger flow prediction model to predict the station entering passenger flow.
According to an embodiment of the invention, in step a, when a passenger enters a station to ride, the passenger's travel data can be obtained from the passenger's card-swiping record and/or code-scanning record; using PyCharm, a Python program connects to the passenger travel information database to retrieve the data and perform the corresponding statistical analysis. In this embodiment, the travel data includes: AFC card-swiping records and APP code-scanning records, used to obtain passengers' entry and exit times and entry and exit stations; APP registration data, used to obtain passengers' identity information and associated information; APP value-added consumption data, used to obtain passengers' value-added service information; and POI data near each station, associated with the station and used to describe its geographic attributes. In this embodiment, travel data are counted from the date on which the passenger registered with the APP; the POI data near a station cover the land-use types within a 500-meter radius centered on the station and are acquired from the third-party map software Amap (Gaode Map).
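The database step mentioned in this embodiment might look like the following minimal sketch. The in-memory SQLite database, the table name `afc_trips` and its columns are all hypothetical stand-ins; the text does not name the database system or its schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the passenger travel database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE afc_trips (passenger_id TEXT, entry_station TEXT, "
    "entry_time TEXT, exit_station TEXT, exit_time TEXT)"
)
conn.executemany(
    "INSERT INTO afc_trips VALUES (?, ?, ?, ?, ?)",
    [
        ("p1", "A", "2022-03-01 07:40", "B", "2022-03-01 08:10"),
        ("p1", "B", "2022-03-01 18:05", "A", "2022-03-01 18:45"),
        ("p2", "C", "2022-03-01 09:15", "B", "2022-03-01 09:50"),
    ],
)

# Tally the total number of trips per passenger.
rows = conn.execute(
    "SELECT passenger_id, COUNT(*) FROM afc_trips "
    "GROUP BY passenger_id ORDER BY passenger_id"
).fetchall()
trip_counts = dict(rows)
conn.close()
print(trip_counts)
```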
In this embodiment, the passenger travel information index system is a conceptual representation of passenger information and comprises three first-level indexes: basic information, service information and derived information. The basic information comprises two second-level indexes: identity information and associated information. The identity information comprises the third-level indexes APPID, gender, age and whether the passenger is disabled; the associated information comprises the third-level indexes third-party payment mode and city all-purpose card.
The service information comprises three second-level indexes: travel basic information, travel derived information and value-added service information. The travel basic information comprises the third-level indexes passenger entry and exit time and passenger entry and exit station; the travel derived information comprises the third-level indexes average trip duration, total number of trips, average daily number of trips, travel time distribution, travel OD distribution, travel path distribution, first trip time, last trip time, holiday travel time distribution and holiday travel OD distribution; the value-added service information comprises the third-level indexes number of participations in value-added services, participation frequency, average transaction amount, payment mode distribution, merchant type distribution and last participation time.
The derived information comprises two second-level indexes: activity attribute and function attribute. The activity attribute comprises the third-level index travel activity; the function attribute comprises the third-level indexes passenger travel demand type, residential area station, working area station and value-added participation degree.
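The three-level hierarchy described above maps naturally onto a nested dictionary. The sketch below is only one possible representation; the constant and function names are hypothetical.

```python
# First level -> second level -> list of third-level indexes.
INDEX_SYSTEM = {
    "basic information": {
        "identity information": ["APPID", "gender", "age", "disabled"],
        "associated information": [
            "third-party payment mode", "city all-purpose card",
        ],
    },
    "service information": {
        "travel basic information": [
            "entry and exit time", "entry and exit station",
        ],
        "travel derived information": [
            "average trip duration", "total number of trips",
            "average daily number of trips", "travel time distribution",
            "travel OD distribution", "travel path distribution",
            "first trip time", "last trip time",
            "holiday travel time distribution",
            "holiday travel OD distribution",
        ],
        "value-added service information": [
            "number of participations", "participation frequency",
            "average transaction amount", "payment mode distribution",
            "merchant type distribution", "last participation time",
        ],
    },
    "derived information": {
        "activity attribute": ["travel activity"],
        "function attribute": [
            "travel demand type", "residential area station",
            "working area station", "value-added participation degree",
        ],
    },
}

def third_level_indexes(system):
    """Flatten the hierarchy into a list of third-level index names."""
    return [
        name
        for second_level in system.values()
        for names in second_level.values()
        for name in names
    ]

all_indexes = third_level_indexes(INDEX_SYSTEM)
print(len(all_indexes))
```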
Further, in this embodiment, statistically calculating each index in the passenger travel information index system includes tallying: all indexes in the passenger's identity information, associated information and travel basic information; the total number of trips, travel OD distribution, holiday travel OD distribution and travel path distribution in the travel derived information; and the number of participations, merchant type distribution, payment mode distribution and last participation time in the value-added service information. The invention places no special requirement on the tallying method, provided that the relevant index information can be acquired and stored in summarized form. These indexes are data generated directly by passenger actions such as registration and travel and can be obtained without computation, so only summary statistics are required.
Further, in this embodiment, statistically calculating each index in the passenger travel information index system also includes calculating the remaining third-level indexes, specifically:
calculating the average trip duration, which is the average time passenger $i$ spends per trip:
$$\bar{T}_i = \frac{1}{N_i}\sum_{n=1}^{N_i}\left(t^{d}_{i,n} - t^{o}_{i,n}\right)$$
calculating the average daily number of trips, which is the number of trips passenger $i$ makes per day:
$$\bar{N}_i = \frac{N_i}{D}$$
calculating the travel time distribution, which is the ratio of the number of trips passenger $i$ makes in the morning peak (7:00-9:00), the evening peak (17:00-19:00) and off-peak periods to the total number of trips; taking the morning peak as the period $\tau$, for example:
$$P_i(\tau) = \frac{1}{N_i}\sum_{n=1}^{N_i} I\left(\text{trip } n \text{ falls within } \tau\right)$$
tallying the travel OD distribution, which is the statistics of the passenger's three most frequent travel ODs;
the formula for calculating the first trip time $t^{\text{first}}_i$ is as follows:
Figure 329305DEST_PATH_IMAGE094
calculating the last trip time $t^{\text{last}}_i$, which is the time of the last trip of passenger $i$ within the statistical period and is used to judge how active the passenger is (that is, the travel activity in the derived information), with the formula:
Figure 684063DEST_PATH_IMAGE095
calculating the holiday travel time distribution, which is the ratio of the number of trips the passenger makes on holidays in the morning peak (7:00-9:00), the evening peak (17:00-19:00) and off-peak periods to the total number of holiday trips; taking the morning peak as the period $\tau$, for example:
$$P^{h}_i(\tau) = \frac{1}{N^{h}_i}\sum_{n=1}^{N_i} I\left(\text{trip } n \text{ is on a holiday and falls within } \tau\right)$$
tallying the holiday travel OD distribution, which is the statistics of the passenger's three most frequent travel ODs on holidays;
in the above formulas, $n$ indexes the trips of a passenger (the $n$-th trip), $d$ denotes the exit station, $o$ the entry station, $i$ the passenger and $t$ time; $t^{d}_{i,n}$ is the exit time of passenger $i$ on the $n$-th trip; $t^{o}_{i,n}$ is the entry time of passenger $i$ on the $n$-th trip; $\bar{T}_i$ is the average trip duration of passenger $i$; $N_i$ is the total number of historical trips of passenger $i$; $\bar{N}_i$ is the average daily number of trips of passenger $i$; $D$ is the total number of days in the statistical period; $I(\cdot)$ is a binary indicator function that equals 1 when its condition is satisfied and 0 otherwise; $t^{\text{first}}_i$ is the first trip time of passenger $i$; $t^{\text{last}}_i$ is the last trip time of passenger $i$; and $N^{h}_i$ is the total number of trips passenger $i$ makes on holidays within the statistical period.
Calculating the participation frequency, i.e., how frequently passenger i participates in value-added services; the calculation formula is as follows:
Figure 138113DEST_PATH_IMAGE018
wherein,
Figure 442056DEST_PATH_IMAGE019
represents the frequency with which passenger i participates in value-added services, and
Figure 45075DEST_PATH_IMAGE020
represents the number of times passenger i participates in value-added services;
counting the merchant type distribution, i.e., the three merchant types that the passenger uses most frequently;
the formula for calculating the average transaction amount is:
Figure 222634DEST_PATH_IMAGE021
wherein,
Figure 885697DEST_PATH_IMAGE022
represents the average amount passenger i spends per participation in value-added services, and
Figure 411356DEST_PATH_IMAGE023
represents the total amount passenger i spends on value-added services;
counting the payment mode distribution, i.e., the three payment methods that the passenger uses most frequently.
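The value-added statistics described above can be sketched in Python as follows. This is only an illustrative sketch: the record layout (the keys `amount`, `merchant_type`, and `payment_method`) and the function name are assumptions and do not come from the patent. The average transaction amount follows the definition in the text (total amount divided by number of participations), and the two distributions keep the three most frequent values.

```python
from collections import Counter


def value_added_summary(transactions):
    """Summarise one passenger's value-added service records.

    `transactions` is a hypothetical list of dicts with the keys
    'amount', 'merchant_type' and 'payment_method'.
    """
    count = len(transactions)
    total = sum(t["amount"] for t in transactions)
    merchants = Counter(t["merchant_type"] for t in transactions)
    payments = Counter(t["payment_method"] for t in transactions)
    return {
        "participation_count": count,
        # Average transaction amount = total amount / number of participations.
        "average_amount": total / count if count else 0.0,
        # Keep only the three most frequent merchant types and payment methods.
        "top_merchant_types": [m for m, _ in merchants.most_common(3)],
        "top_payment_methods": [p for p, _ in payments.most_common(3)],
    }
```

Calling `value_added_summary` on a passenger's records returns a plain dictionary that could be stored as the corresponding third-level indexes.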
The travel demand type is determined by clustering the total number of trips, the first travel time, and the average trip duration computed from passengers' entry card-swipe data. The clustering method is as follows:
the K-means algorithm is used to divide passengers into different categories according to their travel characteristics, selecting from the passenger travel information index system the total number of historical trips of passenger i
Figure 677121DEST_PATH_IMAGE024
the first travel time
Figure 237416DEST_PATH_IMAGE099
and the average trip duration
Figure 805800DEST_PATH_IMAGE011
as the indexes for clustering the passengers of a station. The number of clusters K is determined by the elbow method, whose key index is the sum of squared errors (SSE) within clusters; the calculation formula is as follows:
Figure 84335DEST_PATH_IMAGE026
wherein,
Figure 170103DEST_PATH_IMAGE027
represents the k-th cluster, and
Figure 587833DEST_PATH_IMAGE028
is the center point of
Figure 592698DEST_PATH_IMAGE029
;
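The clustering step above can be illustrated with a minimal, standard-library-only sketch of K-means and the elbow curve. It is not the patent's implementation: the deterministic initialisation, the feature encoding (total trips, first travel time in hours, average duration in minutes), and all function names are assumptions made for the example. In practice the three features would normally be standardised before clustering because they have different units.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means on tuples of floats; returns (centers, labels)."""
    # Assumed deterministic initialisation: k points spread evenly over the input.
    step = len(points) / k
    centers = [points[int(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def sse(points, centers, labels):
    """Sum of squared distances from each point to its cluster center."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


def elbow_curve(points, k_values):
    """SSE for each candidate K; the 'elbow' is where the decrease flattens."""
    curve = {}
    for k in k_values:
        centers, labels = kmeans(points, k)
        curve[k] = sse(points, centers, labels)
    return curve
```

Plotting `elbow_curve(points, range(1, 10))` against K and choosing the value after which the SSE stops falling sharply corresponds to the elbow method described in the text.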
In this embodiment, the travel demand type is the result of clustering three indexes computed from passengers' entry card-swipe data, and the number of clusters is determined from the SSE formula. For each class of passengers, the historical number of trips and the first travel time are analyzed; for example, if the historical number of trips is large relative to the number of statistical days and the first travel time falls in the morning peak, the class can be regarded as commuting passengers. The interpretation depends on the specific clustering result.
Example: taking the passengers of a subway station as the research object, AFC data from three working days (June 6, 7, and 8, 2018) are selected as the basic data to analyze the travel behavior of the station's passengers on working days. After data screening, the number of passengers entering the station over the three working days is 197,328.
The passengers were classified into 5 categories in total by the K-means clustering method. Table 1 below is the cluster center points for the five classes.
Figure 92950DEST_PATH_IMAGE100
TABLE 1
Analysis of the clustering results:
The first class accounts for 21.2% of passengers. Its number of trips within three days is 1.75, the highest travel intensity of the five classes; its first travel time is 08:22:13 and its average travel time is 27.7 min, so the travel distance is short. The first travel time falls in the morning peak, so this class can be regarded as typical morning-peak commuting passengers.
The second class accounts for 10.2% of passengers. Its number of trips within three days is 1.34, a moderate travel intensity; its first travel time is 11:29:33 and its average travel time is 48.1 min, so the travel distance is long. Given its small share, this class can be regarded as passengers travelling out of town or over long distances. POI data show that there are many bus stations and railway stations near the station, in particular Beijing North Railway Station, which makes such trips convenient.
The third class accounts for 34.5% of passengers. Its number of trips within three days is 1.69, a relatively high travel intensity; its first travel time is 17:39:14 and its average travel time is 37.9 min, so the travel distance is moderate compared with the other classes. The first travel time falls in the evening peak, so this class can be regarded as typical evening-peak commuting passengers. It is also the largest of the five classes, which indicates that Xizhimen Station has many inbound passengers in the evening peak; POI data show many office areas near the station, which supports this interpretation.
The fourth class accounts for 17.2% of passengers. Its number of trips within three days is 1.22, the lowest travel intensity, which indicates that the travel loyalty of this class is not high; its first travel time is 20:39:40 and its average travel time is 37.1 min, so the travel distance is moderate compared with the other classes. Because the travel time is late, this class can be regarded as lifestyle passengers; POI data show many shopping and catering merchants near the station, so these trips can be regarded as passengers going home after consumption.
The fifth class accounts for 17.1% of passengers. Its number of trips within three days is 1.25, a low travel intensity; its first travel time is 14:05:07 and its average travel time is 29.2 min, so the travel distance is short. Like the fourth class, it shows no distinctive characteristics, and its share is very close to that of the fourth class, so it can also be regarded as lifestyle passengers.
Determining the residential-area station: with 12:00 noon as the demarcation point on working days and 16:00 as the demarcation point on holidays, the numbers of times the passenger enters and exits stations in the corresponding time intervals are counted, as shown in Table 2 below:
Figure 841463DEST_PATH_IMAGE102
TABLE 2
The station serving a passenger's residential area is generally the origin station of the passenger's first trip and the destination station of the passenger's last trip in a day, so the probability that station e is the residential-area station of passenger i is calculated as follows:
Figure 376349DEST_PATH_IMAGE030
Figure 552116DEST_PATH_IMAGE103
wherein,
Figure 539663DEST_PATH_IMAGE034
representing the probability that station e is the station of the residential area of passenger i;
determining the work-area station: the station serving a passenger's work area is typically the destination station of trips before 12:00 and the origin station of trips after 12:00 on a working day, so the probability that station e is the work-area station of passenger i is calculated as follows:
Figure 826288DEST_PATH_IMAGE104
Figure 230330DEST_PATH_IMAGE105
wherein,
Figure 576997DEST_PATH_IMAGE035
representing the probability that the station e is used as a station of a working area of the passenger i;
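The idea behind the two probabilities can be illustrated with a rough frequency-based sketch for working days. The patent's exact formulas are given as images and are not reproduced here, so the normalisation below is an assumption rather than the patented formula, as are the record layout and the function name. The only element taken from the text is the 12:00 demarcation: entering before noon or exiting after noon counts toward the residential area, and the opposite pattern counts toward the work area.

```python
from collections import defaultdict

NOON = 12  # working-day demarcation hour mentioned in the text


def station_probabilities(trips):
    """Estimate residential-area and work-area station probabilities.

    `trips` is a hypothetical list of working-day records for one passenger:
    (entry_station, entry_hour, exit_station, exit_hour).
    Returns (home_prob, work_prob), each mapping station -> probability.
    """
    home = defaultdict(int)
    work = defaultdict(int)
    for origin, entry_hour, destination, exit_hour in trips:
        # Entering before noon suggests leaving home; after noon, leaving work.
        if entry_hour < NOON:
            home[origin] += 1
        else:
            work[origin] += 1
        # Exiting before noon suggests arriving at work; after noon, at home.
        if exit_hour < NOON:
            work[destination] += 1
        else:
            home[destination] += 1

    def normalise(counts):
        total = sum(counts.values())
        return {s: c / total for s, c in counts.items()} if total else {}

    return normalise(home), normalise(work)
```

The station with the highest value in each returned mapping would be taken as the passenger's residential-area or work-area station under these assumptions.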
The value-added participation degree has three levels: strong, medium, and low. When the participation frequency is greater than 0.7, the passenger's value-added participation is strong; when it is less than 0.4, the participation is low; and when it is between 0.4 and 0.7, the participation is medium.
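The threshold rule above maps directly to a small function. The function name is hypothetical; the thresholds 0.7 and 0.4 are those stated in the text, and the boundary values 0.4 and 0.7 themselves fall into the medium level because the text uses strict comparisons for strong and low.

```python
def participation_level(frequency):
    """Map a participation frequency to the three levels used in the text."""
    if frequency > 0.7:
        return "strong"
    if frequency < 0.4:
        return "low"
    return "medium"  # 0.4 <= frequency <= 0.7
```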
In addition, in the present embodiment, a specific format of a part of the index labels in the passenger travel information index system is shown in table 3 below:
Figure 51841DEST_PATH_IMAGE106
TABLE 3
In the present embodiment, the index data in the passenger travel information index system are updated continuously as passengers travel. The update rules for passenger travel information are as follows: basic information is updated only when the passenger modifies his or her personal information; for service information, the travel basic information, the travel derived information, and the value-added service information are updated in real time with each trip and each use of a value-added service, and the service information in Redis is synchronized to the database every month; derived information is obtained by analyzing the basic information and the service information and is updated once a month; the passenger's last travel time is checked every half year, and if it is more than half a year before the update time, the passenger is judged to be an inactive user and the passenger's travel information is deleted from the database.
According to an embodiment of the present invention, in step a, after the passenger travel data are acquired, the method further includes preprocessing the travel data, specifically:
redundant data processing: duplicate records may occur when a passenger swipes a card multiple times or when equipment fails, and such duplicates need to be deleted;
erroneous data processing: abnormal data may arise from passenger behavior or equipment failure. Abnormal data are identified by three criteria: first, a passenger's entry time must be earlier than the exit time; second, a passenger's stay within the rail transit system must be less than 4 hours; third, the number of times a passenger enters the same station within one day is checked, and because station staff enter and exit many times a day, staff records are removed from the statistics.
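The preprocessing rules above can be sketched as a single cleaning function. The record layout and the staff threshold are assumptions: the text gives the 4-hour limit but does not state how many same-station entries per day identify a staff member, so `STAFF_ENTRY_THRESHOLD` is a hypothetical value chosen for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

MAX_STAY = timedelta(hours=4)   # limit stated in the text
STAFF_ENTRY_THRESHOLD = 6       # hypothetical: not specified in the text


def clean_records(records):
    """Apply the redundancy and error rules to AFC-style trip records.

    Each record is a hypothetical dict with the keys 'card_id',
    'entry_station', 'entry_time', 'exit_station' and 'exit_time'.
    """
    # 1. Redundant data: drop exact duplicates, keeping the first occurrence.
    seen = set()
    unique = []
    for r in records:
        key = (r["card_id"], r["entry_station"], r["entry_time"],
               r["exit_station"], r["exit_time"])
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # 2. Erroneous data: entry must precede exit and the stay must be under 4 h.
    valid = [r for r in unique
             if r["entry_time"] < r["exit_time"]
             and r["exit_time"] - r["entry_time"] < MAX_STAY]

    # 3. Staff: cards entering the same station too often in one day are removed.
    entries = Counter((r["card_id"], r["entry_station"], r["entry_time"].date())
                      for r in valid)
    staff = {card for (card, _, _), n in entries.items()
             if n > STAFF_ENTRY_THRESHOLD}
    return [r for r in valid if r["card_id"] not in staff]
```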
The passenger categories covered by the third-level index "travel demand type" in the passenger travel information index system include:
commuting passengers: because of work requirements, the travel times and travel frequencies of commuting passengers are relatively fixed;
touring passengers: the travel times and travel frequencies of these passengers fluctuate widely, their travel frequency within a short period is high, and their travel ODs are widely distributed;
leisure and entertainment passengers: the trips of these passengers are concentrated on weekends and in the off-peak periods of each day;
special passengers: for example, the elderly, the disabled, and pregnant women, who often need assistance during travel for personal reasons; this information must be provided by the passenger when registering the APP account;
other passengers: passengers who do not belong to the four types above, whose travel times and travel frequencies are irregular and whose travel purposes vary.
According to an embodiment of the present invention, in the step b, estimating the return passenger flow volume in different time periods in the station based on the passenger travel information index system and the calculated partial index data, includes:
according to the entry and exit times, the residential-area station, the work-area station, and the travel demand type in the passenger travel information index system, counting
Figure 876578DEST_PATH_IMAGE046
wherein s is a station, v is the day of the week, taking values 1 to 7 for Monday to Sunday, t is a time period, and
Figure 386056DEST_PATH_IMAGE046
is the number of passengers returning from station s in time period t on day v of the week;
selecting the historical exit and return passenger flow data of station s and, by averaging, obtaining the conditional probability distribution that, on day v of the week, a passenger who arrives at station s in time period
Figure 903625DEST_PATH_IMAGE047
returns from station s in time period
Figure 865765DEST_PATH_IMAGE048
with the distribution denoted
Figure 635138DEST_PATH_IMAGE049
The calculation formula is as follows:
Figure 733544DEST_PATH_IMAGE050
in the formula,
Figure 424944DEST_PATH_IMAGE051
represents the total number of weeks,
Figure 608801DEST_PATH_IMAGE047
represents time period a,
Figure 306498DEST_PATH_IMAGE048
represents time period b,
Figure 524990DEST_PATH_IMAGE052
represents the number of passengers alighting at station s on day v of the j-th week in time period
Figure 649941DEST_PATH_IMAGE047
and
Figure 321094DEST_PATH_IMAGE107
represents the entry time, while
Figure 556903DEST_PATH_IMAGE108
represents the exit time;
using the probability distribution
Figure 629901DEST_PATH_IMAGE055
to estimate
Figure 801119DEST_PATH_IMAGE056
The calculation formula is as follows:
Figure 24815DEST_PATH_IMAGE057
wherein,
Figure 798736DEST_PATH_IMAGE058
represents the number of passengers alighting at station s on day v at time
Figure 257399DEST_PATH_IMAGE059
H represents the maximum interval between a passenger's alighting at and re-entering station s, which is at most 24 hours, h represents the time-slot resolution, and t + 1 represents the time period following time period t.
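The two-step estimate described above (a historical conditional return probability, then an expected return flow for the next time period) can be sketched as follows. Because the patent's formulas are given as images, the averaging and summation below are one plausible reading rather than the patented formulas; the data layout, function names, and the `max_slots` window (standing in for the maximum interval divided by the slot resolution) are assumptions.

```python
def conditional_return_prob(history):
    """Average, over weeks, the share of slot-a alighting passengers who
    re-enter the same station in slot b (one station, one day of the week).

    `history` is a hypothetical list with one (exits, returns) pair per week:
    exits[a] is the number of passengers alighting in slot a, and
    returns[a][b] is how many of them re-entered in slot b.
    """
    weeks = len(history)
    n = len(history[0][0])
    prob = [[0.0] * n for _ in range(n)]
    for exits, returns in history:
        for a in range(n):
            if exits[a] == 0:
                continue  # no passengers alighted in this slot that week
            for b in range(n):
                prob[a][b] += returns[a][b] / exits[a] / weeks
    return prob


def estimate_return_flow(prob, exits_today, target_slot, max_slots):
    """Expected number of returning passengers entering in `target_slot`,
    using only alighting counts from earlier slots within the window."""
    first = max(0, target_slot - max_slots)
    return sum(exits_today[a] * prob[a][target_slot]
               for a in range(first, target_slot))
```

Under these assumptions, `estimate_return_flow(prob, exits_today, t + 1, max_slots)` gives the return-flow value that is later used as the covariate for time period t + 1.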
According to an embodiment of the present invention, in step c, adding the return passenger flow of the station to the passenger flow prediction model as a covariate to predict the station's inbound passenger flow includes:
adding the estimated
Figure 458573DEST_PATH_IMAGE056
to the commonly used seasonal autoregressive integrated moving average model (S-ARIMA model) to predict the station's inbound passenger flow;
the S-ARIMA model is written ARIMA(p, d, q)(P, D, Q)[Ω], where p, d, and q are the orders of the non-seasonal autoregressive, differencing, and moving average parts, respectively, and P, D, and Q are the orders of the seasonal autoregressive, differencing, and moving average parts;
for a time series
Figure 104318DEST_PATH_IMAGE109
the ARIMA(p, d, q)(P, D, Q)[Ω] model is as follows:
Figure 681930DEST_PATH_IMAGE110
wherein B is defined as
Figure 995100DEST_PATH_IMAGE062
Figure 370105DEST_PATH_IMAGE063
Figure 237567DEST_PATH_IMAGE111
Figure 353290DEST_PATH_IMAGE065
Figure 396333DEST_PATH_IMAGE112
wherein
Figure 939309DEST_PATH_IMAGE067
Figure 559647DEST_PATH_IMAGE068
Figure 213482DEST_PATH_IMAGE069
and
Figure 235665DEST_PATH_IMAGE070
are the coefficients to be estimated,
Figure 215122DEST_PATH_IMAGE071
is a white-noise error term following a normal distribution with mean 0 and variance
Figure 48387DEST_PATH_IMAGE072
and
Figure 505913DEST_PATH_IMAGE073
represents the
Figure 648181DEST_PATH_IMAGE074
time period;
when the return passenger flow
Figure 532961DEST_PATH_IMAGE081
serves as a covariate, the inbound passenger flow
Figure 862311DEST_PATH_IMAGE113
and the return passenger flow
Figure 389107DEST_PATH_IMAGE077
satisfy the following relationships:
Figure 854724DEST_PATH_IMAGE114
Figure 175984DEST_PATH_IMAGE115
wherein,
Figure 729980DEST_PATH_IMAGE080
is the regression coefficient;
Figure 794888DEST_PATH_IMAGE081
is obtained from
Figure 380590DEST_PATH_IMAGE046
by fixing the day of the week v and the station s and taking the time periods 1, ..., t;
Figure 607172DEST_PATH_IMAGE082
follows the ARIMA(p, d, q)(P, D, Q)[Ω] model and represents the part of the total inbound passenger flow other than the return passenger flow; from the station's historical
Figure 176694DEST_PATH_IMAGE116
and
Figure 779714DEST_PATH_IMAGE084
the values of
Figure 219922DEST_PATH_IMAGE080
and
Figure 882985DEST_PATH_IMAGE082
are calculated; once
Figure 145994DEST_PATH_IMAGE082
is obtained, the formula
Figure 818284DEST_PATH_IMAGE117
is used to predict
Figure 112999DEST_PATH_IMAGE086
and then, from
Figure 946963DEST_PATH_IMAGE056
the value
Figure 100864DEST_PATH_IMAGE087
is obtained, wherein
Figure 45686DEST_PATH_IMAGE087
is obtained from
Figure 460487DEST_PATH_IMAGE056
by fixing the day of the week v and the station s and taking the time period t + 1; because
Figure 465352DEST_PATH_IMAGE082
follows the ARIMA(p, d, q)(P, D, Q)[Ω] model, the model can predict, for time period
Figure 965604DEST_PATH_IMAGE090
the value
Figure 717046DEST_PATH_IMAGE118
while
Figure 845408DEST_PATH_IMAGE087
is obtained from
Figure 21175DEST_PATH_IMAGE056
as described above; substituting
Figure 8722DEST_PATH_IMAGE087
and
Figure 29768DEST_PATH_IMAGE088
into the formula yields the prediction for time
Figure 684740DEST_PATH_IMAGE090
i.e., the station's inbound passenger flow at that time.
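For orientation, the structure described above (a regression on the return passenger flow whose remainder follows a seasonal ARIMA process) can be written in the usual textbook form. The symbols below are hypothetical stand-ins, since the patent's own formulas are given as images: y_t is the inbound passenger flow, r_t the return passenger flow, η_t the remaining inbound flow, β the regression coefficient, and B the backshift operator with B y_t = y_{t-1}. Sign conventions for the moving-average polynomials vary between references.

```latex
% Regression with seasonal ARIMA errors (standard form, hypothetical notation)
y_t = \beta\, r_t + \eta_t, \qquad
\phi_p(B)\,\Phi_P(B^{\Omega})\,(1-B)^{d}\,(1-B^{\Omega})^{D}\,\eta_t
  = \theta_q(B)\,\Theta_Q(B^{\Omega})\,\varepsilon_t, \qquad
\varepsilon_t \sim \mathcal{N}(0, \sigma^2)

% Polynomials in the backshift operator
\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
\theta_q(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}

% One-step-ahead prediction: forecast the remainder, then add the covariate term
\hat{y}_{t+1} = \hat{\beta}\, \hat{r}_{t+1} + \hat{\eta}_{t+1}
```

The last line mirrors the substitution step in the text: the estimated return passenger flow for period t + 1 and the model's forecast of the remainder are combined to give the predicted inbound passenger flow.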
In the present embodiment, for example, the model parameters are selected as (2, 0, 1)(1, 1, 0)[72]. The experimental results are shown in Table 4 below, where model M0 does not include
Figure 765829DEST_PATH_IMAGE119
and model M1 adds
Figure 381618DEST_PATH_IMAGE120
as a covariate. With the covariate, the RMSE decreases by 9.87 on the training set and by 9.02 on the test set, and the SMAPE decreases by 0.64% on the training set and by 0.16% on the test set, so the prediction is more accurate after the new variable is added.
Figure 486582DEST_PATH_IMAGE121
TABLE 4
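The two error measures reported in Table 4 can be computed as in the following sketch. The patent does not state which SMAPE variant it uses; the version below, with the mean of the absolute actual and predicted values in the denominator and the result expressed in percent, is a common definition and is an assumption here.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: |a - p| / ((|a| + |p|) / 2), with a term of 0
    when both the actual and the predicted value are 0.
    """
    terms = []
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        terms.append(abs(a - p) / denom if denom else 0.0)
    return 100 * sum(terms) / len(terms)
```

Comparing `rmse` and `smape` for the forecasts of the two models on the same training and test periods corresponds to the comparison summarised in Table 4.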
As described above, the method of the invention, set in the context of smart metro construction, effectively associates, fuses, and brings in multi-source metro-related data, builds passenger travel information, and explores its application to passenger flow prediction. The invention mines passengers' travel patterns from their multi-source travel data and, on that basis, establishes a three-level passenger travel information index system and implements the statistical calculation of each index. It also provides a prediction method that identifies return passenger flow from passenger travel information and uses the return passenger flow to improve the accuracy of station inbound passenger flow prediction.
Further, in order to achieve the above object, the present invention further provides a system for predicting rail transit passenger flow based on passenger travel information, and a block diagram of the system structure is shown in fig. 2, and specifically includes:
the index acquisition module is used for acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the passenger travel information index system;
the return passenger flow calculation module is used for estimating the return passenger flow at different time intervals in the station based on part of index data calculated in the passenger travel information index system;
and the passenger flow prediction module is used for adding the return passenger flow of passengers in the station into the passenger flow prediction model as a covariate to predict the station entering passenger flow.
According to one embodiment of the invention, in the index acquisition module, when a passenger enters a station to ride, the passenger's travel data can be acquired from the passenger's card-swipe records and/or code-scanning records; using PyCharm, the Python language is used to connect to the passenger travel information database, acquire the data, and perform the corresponding statistical analysis. In this embodiment, the travel data include: AFC card-swipe records and APP code-scanning records, used to obtain the passenger's entry and exit times and entry and exit station information; APP registration data, used to obtain the passenger's identity information and associated information; APP value-added consumption data, used to obtain the passenger's value-added service information; and POI data near the station, which are associated with the station and used to describe its geographic attributes. In this embodiment, travel data are counted from the date on which the passenger registers the APP; POI data near a station cover the land-use types within a radius of 500 meters centered on the station and are acquired through the third-party map software Amap (Gaode Map).
In this embodiment, the passenger travel information index system is a concept representing passenger information and includes three first-level indexes: basic information, service information, and derived information. The basic information includes two second-level indexes: identity information and associated information. The identity information includes the third-level indexes APPID, sex, age, and whether the passenger is disabled; the associated information includes the third-level indexes of the passenger's third-party payment mode and city card.
The service information includes three second-level indexes: travel basic information, travel derived information, and value-added service information. The travel basic information includes the third-level indexes of the passenger's entry and exit times and entry and exit station information; the travel derived information includes the third-level indexes of average trip duration, total number of trips, daily average number of trips, travel time distribution, travel OD distribution, travel path distribution, first travel time, last travel time, holiday travel time distribution, and holiday travel OD distribution; the value-added service information includes the third-level indexes of number of value-added service participations, participation frequency, average transaction amount, payment mode distribution, merchant type distribution, and last participation time.
The derived information includes two second-level indexes: an activity attribute and a functional attribute. The activity attribute includes the third-level index of travel activity; the functional attribute includes the third-level indexes of passenger travel demand type, residential-area station, work-area station, and value-added participation degree.
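The 500-meter POI coverage rule can be illustrated with a great-circle distance filter. This is a sketch under assumptions: the POI record layout, the function names, and the use of the haversine formula are illustrative and are not taken from the patent, which only states the 500-meter radius.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pois_near_station(station, pois, radius_m=500.0):
    """Return the POIs within `radius_m` of a station.

    `station` is a (lat, lon) pair; each POI is a hypothetical dict with
    the keys 'name', 'type', 'lat' and 'lon'.
    """
    lat, lon = station
    return [p for p in pois
            if haversine_m(lat, lon, p["lat"], p["lon"]) <= radius_m]
```

Counting the `type` values of the returned POIs would give the land-use profile used to describe a station's geographic attributes.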
Further, in the present embodiment, statistically calculating each index in the passenger travel information index system includes: counting all indexes in the passenger's identity information, associated information, and travel basic information; counting the total number of trips, travel OD distribution, holiday travel OD distribution, and travel path distribution in the travel derived information; and counting the number of participations, merchant type distribution, payment mode distribution, and last participation time in the value-added service information. The invention does not prescribe a particular counting method, provided that the relevant index information can be acquired and stored in summarized form. These indexes are data generated directly by passenger actions such as registration and travel and can be obtained without calculation, so only summary statistics are required.
Further, in this embodiment, the statistically calculating each index in the passenger travel information index system further includes calculating three-level indexes other than the above indexes, and the specific calculation includes:
calculating the average trip duration, i.e., the average time passenger i spends per trip; the calculation formula is as follows:
Figure 730482DEST_PATH_IMAGE091
calculating the daily average number of trips, i.e., the number of trips passenger i makes per day; the calculation formula is as follows:
Figure 248051DEST_PATH_IMAGE122
wherein,
Figure 944611DEST_PATH_IMAGE123
represents the daily average number of trips of passenger i, and D represents the total number of days for the passenger within the statistical period;
calculating the travel time distribution, i.e., the shares of passenger i's trips made in the morning peak (7:00-9:00), the evening peak (17:00-19:00), and the off-peak period relative to the total number of trips; taking the morning peak as an example, the calculation formula is as follows:
Figure 573039DEST_PATH_IMAGE124
counting the travel OD distribution, i.e., the three origin-destination (OD) pairs that the passenger travels most frequently;
the formula for calculating the first travel time is as follows:
Figure 937024DEST_PATH_IMAGE125
calculating the last travel time, i.e., the time of passenger i's last trip within the statistical period, which is used to judge the passenger's activity (the travel activity in the derived information); the calculation formula is as follows:
Figure 891073DEST_PATH_IMAGE126
calculating the holiday travel time distribution, i.e., the shares of the passenger's holiday trips made in the morning peak (7:00-9:00), the evening peak (17:00-19:00), and the off-peak period relative to the total number of holiday trips; taking the morning peak as an example, the calculation formula is as follows:
Figure 74930DEST_PATH_IMAGE127
counting the holiday travel OD distribution, i.e., the three OD pairs that the passenger travels most frequently on holidays;
in the above formulas,
Figure 103453DEST_PATH_IMAGE007
represents the
Figure 321945DEST_PATH_IMAGE008
-th trip, where d denotes the exit station, o the entry station, i the passenger, and t the time;
Figure 181317DEST_PATH_IMAGE097
represents the exit time of passenger i at time t for the
Figure 993415DEST_PATH_IMAGE008
-th trip;
Figure 963645DEST_PATH_IMAGE010
represents the entry time of passenger i at time t for the
Figure 302222DEST_PATH_IMAGE008
-th trip;
Figure 332495DEST_PATH_IMAGE011
represents the average trip duration of passenger i;
Figure 490944DEST_PATH_IMAGE012
represents the total number of historical trips of passenger i;
Figure 267795DEST_PATH_IMAGE013
represents the daily average number of trips of passenger i, and D represents the total number of days for the passenger within the statistical period;
Figure 195300DEST_PATH_IMAGE128
is a binary indicator function, equal to 1 when the condition is satisfied and 0 otherwise;
Figure 662053DEST_PATH_IMAGE015
represents the first travel time of passenger i;
Figure 307798DEST_PATH_IMAGE016
represents the last travel time of passenger i; and
Figure 150989DEST_PATH_IMAGE017
represents the total number of holiday trips of the passenger within the statistical period.
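Several of the travel derived indexes defined above can be computed from a passenger's trip records as in the following sketch. The record layout and function name are assumptions, and so is the choice to classify a trip as a morning-peak trip by its entry time; the peak window 7:00-9:00 is taken from the text. The first and last travel times are omitted because their formulas are given only as images.

```python
from datetime import datetime, time

MORNING_PEAK = (time(7, 0), time(9, 0))  # window stated in the text


def travel_indicators(trips, total_days):
    """Compute basic travel derived indexes for one passenger.

    `trips` is a hypothetical list of (entry_time, exit_time) datetime pairs;
    `total_days` is D, the number of days in the statistical period.
    """
    n = len(trips)
    if n == 0:
        return {"total_trips": 0, "avg_duration_min": 0.0,
                "daily_avg_trips": 0.0, "morning_peak_share": 0.0}
    total_minutes = sum((exit_time - entry_time).total_seconds()
                        for entry_time, exit_time in trips) / 60
    # Indicator function: 1 when the entry time lies in the morning peak.
    in_peak = sum(1 for entry_time, _ in trips
                  if MORNING_PEAK[0] <= entry_time.time() < MORNING_PEAK[1])
    return {
        "total_trips": n,
        "avg_duration_min": total_minutes / n,   # average trip duration
        "daily_avg_trips": n / total_days,       # total trips / D
        "morning_peak_share": in_peak / n,       # peak trips / total trips
    }
```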
Calculating the participation frequency, i.e., how frequently passenger i participates in value-added services; the calculation formula is as follows:
Figure 933000DEST_PATH_IMAGE018
wherein,
Figure 305076DEST_PATH_IMAGE019
represents the frequency with which passenger i participates in value-added services, and
Figure 579062DEST_PATH_IMAGE020
represents the number of times passenger i participates in value-added services;
counting the merchant type distribution, i.e., the three merchant types that the passenger uses most frequently;
the formula for calculating the average transaction amount is:
Figure 960365DEST_PATH_IMAGE021
wherein,
Figure 865392DEST_PATH_IMAGE022
represents the average amount passenger i spends per participation in value-added services, and
Figure 408368DEST_PATH_IMAGE023
represents the total amount passenger i spends on value-added services;
counting the payment mode distribution, i.e., the three payment methods that the passenger uses most frequently.
The travel demand type is determined by clustering the total number of trips, the first travel time, and the average trip duration computed from passengers' entry card-swipe data. The clustering method is as follows:
the K-means algorithm is used to divide passengers into different categories according to their travel characteristics, selecting from the passenger travel information index system the total number of historical trips of passenger i
Figure 28706DEST_PATH_IMAGE024
the first travel time
Figure 948120DEST_PATH_IMAGE099
and the average trip duration
Figure 704724DEST_PATH_IMAGE011
as the indexes for clustering the passengers of a station. The number of clusters K is determined by the elbow method, whose key index is the sum of squared errors (SSE) within clusters; the calculation formula is as follows:
Figure 949760DEST_PATH_IMAGE026
wherein,
Figure 791814DEST_PATH_IMAGE027
represents the k-th cluster, and
Figure 390286DEST_PATH_IMAGE028
is the center point of
Figure 266975DEST_PATH_IMAGE029
;
In this embodiment, the travel demand type is the result of clustering three indexes computed from passengers' entry card-swipe data, and the number of clusters is determined from the SSE formula. For each class of passengers, the historical number of trips and the first travel time are analyzed; for example, if the historical number of trips is large relative to the number of statistical days and the first travel time falls in the morning peak, the class can be regarded as commuting passengers. The interpretation depends on the specific clustering result.
Example: taking the passengers of a subway station as the research object, AFC data from three working days (June 6, 7, and 8, 2018) are selected as the basic data to analyze the travel behavior of the station's passengers on working days. After data screening, the number of passengers entering the station over the three working days is 197,328.
The passengers were classified into 5 categories in total by the K-means clustering method.
As can be seen from table 1 above, the clustering results were analyzed:
The first class accounts for 21.2% of passengers. The number of trips within the three days is 1.75, the highest travel intensity among the five classes; the first travel time is 08:22:13 and the average travel time is 27.7 min, so the travel distance is short. This matches the morning peak, and these passengers can be regarded as typical morning-peak commuters.
The second class accounts for 10.2% of passengers. The number of trips within the three days is 1.34, a moderate travel intensity; the first travel time is 11:29:33 and the average travel time is 48.1 min, so the travel distance is long. The share is small, and these passengers can be regarded as passengers on business trips or long-distance journeys. Combined with POI data, there are many bus stations and railway stations near the station, in particular Beijing railway stations, which makes such travel convenient.
The third class accounts for 34.5% of passengers. The number of trips within the three days is 1.69, a relatively high travel intensity; the first travel time is 17:39:14 and the average travel time is 37.9 min, so the travel distance is moderate compared with the other classes. This matches the evening peak, and these passengers can be regarded as typical evening-peak commuters. This class has the highest share among the five classes, which is consistent with the large number of passengers entering Xizhimen Station in the evening peak; combined with POI data, there are many office areas near the station, so the explanation is reasonable.
The fourth class accounts for 17.2% of passengers. The number of trips within the three days is 1.22, the lowest travel intensity, which indicates that the travel loyalty of this class is not high; the first travel time is 20:39:40 and the average travel time is 37.1 min, so the travel distance is moderate compared with the other classes. The travel time is late, and these passengers can be regarded as life-oriented passengers; combined with POI data, there are many shopping and catering merchants near the station, so these trips can be regarded as trips home after consumption.
The fifth class accounts for 17.1% of passengers. The number of trips within the three days is 1.25, a low travel intensity; the first travel time is 14:05:07 and the average travel time is 29.2 min, so the travel distance is short. As with the fourth class, there are no obvious characteristics, and the share is very close to that of the fourth class; these passengers can also be regarded as life-oriented passengers.
To determine the residential-area station, 12:00 noon is taken as the dividing point on working days and 16:00 on holidays, and the number of times each passenger enters and exits the station in the corresponding periods is counted, as shown in Table 2.
The residential-area station of a passenger is generally the origin station of the passenger's first trip and the destination station of the passenger's last trip in a day, so the probability that station e is the residential-area station of passenger i is calculated as follows:
Figure 142965DEST_PATH_IMAGE030
Figure 472315DEST_PATH_IMAGE103
wherein,
Figure 733532DEST_PATH_IMAGE034
representing the probability that station e is the station of the residential area of passenger i;
to determine the work-area station: the work-area station of a passenger is generally the station that serves as the destination before 12:00 and as the origin after 12:00 on working days, so the probability that station e is the work-area station of passenger i is calculated as follows:
Figure 464728DEST_PATH_IMAGE129
Figure 785988DEST_PATH_IMAGE130
wherein,
Figure 602634DEST_PATH_IMAGE035
representing the probability that the station e is used as a station of a working area of the passenger i;
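The two probabilities can be sketched as ratios of counts. This is only one plausible reading of the description (entries before the dividing point and exits after it suggest a residential-area station; exits before 12:00 and entries after 12:00 on working days suggest a work-area station); the exact formulas are in the figures and are not reproduced here, and all field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StationCounts:
    """Entry/exit counts of one passenger at one station (hypothetical layout)."""
    wd_in_before: int = 0   # working-day entries before 12:00
    wd_in_after: int = 0    # working-day entries after 12:00
    wd_out_before: int = 0  # working-day exits before 12:00
    wd_out_after: int = 0   # working-day exits after 12:00
    rd_in_before: int = 0   # rest-day entries before 16:00
    rd_in_after: int = 0    # rest-day entries after 16:00
    rd_out_before: int = 0  # rest-day exits before 16:00
    rd_out_after: int = 0   # rest-day exits after 16:00


def residential_probability(c: StationCounts) -> float:
    """Share of all entries/exits that look like leaving home or coming home."""
    total = (c.wd_in_before + c.wd_in_after + c.wd_out_before + c.wd_out_after
             + c.rd_in_before + c.rd_in_after + c.rd_out_before + c.rd_out_after)
    if total == 0:
        return 0.0
    home_like = c.wd_in_before + c.rd_in_before + c.wd_out_after + c.rd_out_after
    return home_like / total


def work_probability(c: StationCounts) -> float:
    """Share of working-day entries/exits that look like arriving at or leaving work."""
    total = c.wd_in_before + c.wd_in_after + c.wd_out_before + c.wd_out_after
    if total == 0:
        return 0.0
    return (c.wd_out_before + c.wd_in_after) / total
```

Under this reading, the station with the highest probability for a passenger would be taken as that passenger's residential-area or work-area station.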
the value-added participation degree is set to be strong, medium and low, when the participation frequency is more than 0.7, the value-added participation degree of the passenger is strong, when the participation frequency is less than 0.4, the value-added participation degree of the passenger is low, and when the participation frequency is between 0.4 and 0.7, the value-added participation degree of the passenger is medium.
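The three-level rule above can be written as a small function. This is a minimal sketch; the function name is hypothetical, and treating the boundary values 0.4 and 0.7 as "medium" is an assumption, since the text only says "between 0.4 and 0.7".

```python
def participation_level(frequency: float) -> str:
    """Map a value-added participation frequency to strong / medium / low."""
    if frequency > 0.7:
        return "strong"
    if frequency < 0.4:
        return "low"
    # Assumption: the boundary values 0.4 and 0.7 themselves count as medium.
    return "medium"
```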
In the present embodiment, the specific format of the part of the index labels in the passenger travel information index system is as shown in table 3 above.
In the present embodiment, the index data in the passenger travel information index system should be continuously updated as passengers travel. The update rules are as follows: basic information is updated only when the passenger modifies his or her personal information; for service information, the travel basic information, travel derived information and value-added service information are updated in real time with each trip and each use of a value-added service, and the service information in Redis is synchronized to the database every month; derived information is obtained by analyzing the basic information and the service information and is updated once a month. Every half year the last travel time of each passenger is checked; if the gap between the passenger's last travel time and the update time exceeds half a year, the passenger is judged to be an inactive user and the passenger's travel information is deleted from the database.
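The half-yearly inactivity check can be sketched as follows. The function name is hypothetical, and representing "half a year" as 183 days is an assumption made for illustration; the patent does not specify a day count.

```python
from datetime import datetime, timedelta

# Assumption: "half a year" is approximated as 183 days.
HALF_YEAR = timedelta(days=183)


def is_inactive(last_trip: datetime, checked_at: datetime,
                threshold: timedelta = HALF_YEAR) -> bool:
    """True when the gap between the last trip and the check time exceeds the threshold."""
    return checked_at - last_trip > threshold
```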
According to an embodiment of the present invention, after the passenger travel data is acquired in the index acquisition module, the travel data is further preprocessed, specifically including:
redundant data processing: duplicate records may occur when a passenger swipes a card several times or when equipment fails, and the duplicates need to be deleted;
erroneous data processing: abnormal data may occur due to passenger behavior or equipment failure. Three criteria are used to identify abnormal data: first, a passenger's entry time must be earlier than the exit time; second, a passenger's stay in the rail transit system must be less than 4 hours; third, the number of times a passenger enters the same station within one day is checked, and because station staff enter and exit many times a day, staff records are removed from the statistics.
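The preprocessing rules above can be sketched in plain Python. The record layout (`Trip`), the function name and the staff threshold of 10 entries per station per day are hypothetical; the patent states the 4-hour limit but gives no numeric threshold for identifying staff.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Trip(NamedTuple):
    card_id: str
    entry_station: str
    entry_time: datetime
    exit_station: str
    exit_time: datetime


MAX_STAY = timedelta(hours=4)   # stated in the text
STAFF_ENTRY_THRESHOLD = 10      # hypothetical value, not given in the text


def preprocess(trips: List[Trip],
               staff_threshold: int = STAFF_ENTRY_THRESHOLD) -> List[Trip]:
    # 1. Redundant data: drop exact duplicates, keeping the first occurrence.
    unique = list(dict.fromkeys(trips))
    # 2. Erroneous data: entry must precede exit and the stay must be under 4 hours.
    valid = [t for t in unique
             if t.entry_time < t.exit_time
             and t.exit_time - t.entry_time < MAX_STAY]
    # 3. Staff records: cards entering the same station many times in one day.
    entries = Counter((t.card_id, t.entry_station, t.entry_time.date())
                      for t in valid)
    staff_cards = {card for (card, _, _), n in entries.items()
                   if n >= staff_threshold}
    return [t for t in valid if t.card_id not in staff_cards]
```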
The passenger categories included in the travel demand type, a third-level index of the passenger travel information index system, are:
commuting passengers: because of work requirements, the travel time and travel frequency of commuting passengers are relatively fixed;
touring passengers: the travel time and travel frequency of these passengers fluctuate strongly, the travel frequency within a short period is high, and the travel OD distribution is wide;
leisure and entertainment passengers: the travel times of these passengers are mostly distributed on weekends and in the off-peak periods of each day;
special passengers: for example the elderly, the disabled and pregnant women, who for personal reasons often need outside help while traveling; this information must be provided by the passenger when registering an APP account;
other passengers: passengers who differ from the four types above, whose travel time and travel frequency are not fixed and whose travel purposes are varied.
According to an embodiment of the present invention, in the return passenger flow calculation module, estimating the return passenger flow of different time periods in the station based on part of the index data calculated in the passenger travel information index system includes:
counting, according to the entry and exit times, the residential-area station, the work-area station and the travel demand type in the passenger travel information index system,
Figure 667542DEST_PATH_IMAGE046
where s is a station, v is the day of the week with values 1-7 representing Monday to Sunday, and t is a time period;
Figure 253244DEST_PATH_IMAGE046
is the number of passengers making a return trip from station s in time period t on day v of the week;
selecting historical outbound and return passenger flow data of station s, and obtaining, by averaging, the conditional probability distribution
Figure 655297DEST_PATH_IMAGE049
that a passenger who arrives at station s in time period
Figure 748335DEST_PATH_IMAGE047
on day v of the week departs from station s on the return trip in time period
Figure 317857DEST_PATH_IMAGE048
, calculated as follows:
Figure 502031DEST_PATH_IMAGE050
in the formula,
Figure 899514DEST_PATH_IMAGE051
represents the total number of weeks,
Figure 690752DEST_PATH_IMAGE047
represents time period a,
Figure 097463DEST_PATH_IMAGE048
represents time period b,
Figure 657757DEST_PATH_IMAGE052
represents the number of passengers alighting at station s in time period
Figure 491721DEST_PATH_IMAGE047
on day v of the j-th week,
Figure 507606DEST_PATH_IMAGE131
represents the arrival time, and
Figure 452428DEST_PATH_IMAGE132
represents the outbound time;
using the conditional probability distribution
Figure 132808DEST_PATH_IMAGE055
to estimate
Figure 137673DEST_PATH_IMAGE056
with the following calculation formula:
Figure 903504DEST_PATH_IMAGE057
wherein,
Figure 386438DEST_PATH_IMAGE058
represents the number of passengers alighting at station s at time
Figure 921325DEST_PATH_IMAGE059
on day v of the week, H represents the maximum interval between a passenger alighting and boarding at station s, which is 24 hours, h represents the time-slot resolution, and t + 1 represents the time period following time period t.
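The two steps (averaging a conditional return probability over historical weeks, then applying it to the alighting counts observed so far) can be sketched as below. This is one reading of the description under stated assumptions: time periods are integer slot indices, `max_lag_slots` stands for the maximum interval divided by the slot resolution, and all names and data layouts are hypothetical. The exact formulas are in the figures and are not reproduced here.

```python
from typing import Dict, List, Tuple

Slot = int
AlightCounts = Dict[Slot, int]                 # slot a -> passengers alighting at s
ReturnCounts = Dict[Tuple[Slot, Slot], int]    # (a, b) -> arrived in a, returned in b


def conditional_return_prob(alight_by_week: List[AlightCounts],
                            return_by_week: List[ReturnCounts]
                            ) -> Dict[Tuple[Slot, Slot], float]:
    """Average over weeks of (returns in slot b | arrivals in slot a)."""
    weeks = len(alight_by_week)
    sums: Dict[Tuple[Slot, Slot], float] = {}
    for alight, returned in zip(alight_by_week, return_by_week):
        for (a, b), n in returned.items():
            if alight.get(a, 0) > 0:
                sums[(a, b)] = sums.get((a, b), 0.0) + n / alight[a]
    return {ab: total / weeks for ab, total in sums.items()}


def estimate_return_flow(alight_today: AlightCounts,
                         prob: Dict[Tuple[Slot, Slot], float],
                         t_next: Slot, max_lag_slots: int) -> float:
    """Expected return passengers in slot t_next from earlier arrivals."""
    total = 0.0
    for a in range(max(0, t_next - max_lag_slots), t_next):
        total += alight_today.get(a, 0) * prob.get((a, t_next), 0.0)
    return total
```

For example, with 15-minute slots and a 24-hour maximum interval, `max_lag_slots` would be 96 under these assumptions.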
According to an embodiment of the present invention, in the passenger flow prediction module, adding the return passenger flow of passengers in the station as a covariate to the passenger flow prediction model and predicting the inbound passenger flow of the station comprises:
adding the estimated
Figure 97091DEST_PATH_IMAGE056
to a common seasonal autoregressive moving average model (S-ARIMA model) to predict the inbound passenger flow of the station;
the S-ARIMA model is ARIMA(p, d, q)(P, D, Q)[Ω], where p, d and q are the orders of the non-seasonal autoregressive, differencing and moving-average terms, respectively, and P, D and Q are the orders of the autoregressive, differencing and moving-average terms of the seasonal part;
for a time series
Figure 225584DEST_PATH_IMAGE133
, the ARIMA(p, d, q)(P, D, Q)[Ω] model is as follows:
Figure 515139DEST_PATH_IMAGE134
wherein B is defined as
Figure 904532DEST_PATH_IMAGE062
Figure 251199DEST_PATH_IMAGE135
Figure 991622DEST_PATH_IMAGE136
Figure 81938DEST_PATH_IMAGE065
Figure 325838DEST_PATH_IMAGE137
wherein
Figure 577827DEST_PATH_IMAGE067
Figure 539967DEST_PATH_IMAGE068
Figure 917464DEST_PATH_IMAGE069
and
Figure 281449DEST_PATH_IMAGE070
are the coefficients to be estimated,
Figure 704340DEST_PATH_IMAGE071
is a white-noise error term that obeys a normal distribution with mean 0 and variance
Figure 153776DEST_PATH_IMAGE072
, and
Figure 913790DEST_PATH_IMAGE073
represents the
Figure 866703DEST_PATH_IMAGE074
time period;
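For reference, the textbook form of a seasonal ARIMA(p, d, q)(P, D, Q)[Ω] model is given below. The symbols are assumptions chosen for this illustration (the patent's own notation is in the figures), and the sign convention of the moving-average polynomials varies between references.

```latex
\phi_p(B)\,\Phi_P(B^{\Omega})\,(1-B)^{d}\,(1-B^{\Omega})^{D}\,y_t
  = \theta_q(B)\,\Theta_Q(B^{\Omega})\,\varepsilon_t,
\qquad B\,y_t = y_{t-1},
\qquad \varepsilon_t \sim \mathcal{N}(0,\sigma^{2})

\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p},
\qquad
\Phi_P(B^{\Omega}) = 1 - \Phi_1 B^{\Omega} - \dots - \Phi_P B^{P\Omega}

\theta_q(B) = 1 + \theta_1 B + \dots + \theta_q B^{q},
\qquad
\Theta_Q(B^{\Omega}) = 1 + \Theta_1 B^{\Omega} + \dots + \Theta_Q B^{Q\Omega}
```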
when the return passenger flow
Figure 991654DEST_PATH_IMAGE075
is used as a covariate, the inbound passenger flow
Figure 665736DEST_PATH_IMAGE138
and the return passenger flow
Figure 901546DEST_PATH_IMAGE077
satisfy the following relationship:
Figure 974544DEST_PATH_IMAGE139
Figure 145762DEST_PATH_IMAGE140
wherein,
Figure 304211DEST_PATH_IMAGE080
is the regression coefficient,
Figure 343711DEST_PATH_IMAGE081
is obtained from
Figure 271216DEST_PATH_IMAGE046
by fixing the day of the week v and the station s and taking the time periods 1, ..., t, and
Figure 737969DEST_PATH_IMAGE082
obeys the ARIMA(p, d, q)(P, D, Q)[Ω] model and represents the part of the total inbound passenger flow other than the return passenger flow. From the station's historical
Figure 383714DEST_PATH_IMAGE141
and
Figure 698677DEST_PATH_IMAGE084
,
Figure 11846DEST_PATH_IMAGE080
and
Figure 383922DEST_PATH_IMAGE082
are calculated. After
Figure 516963DEST_PATH_IMAGE082
is obtained, the formula
Figure 632686DEST_PATH_IMAGE142
is used to predict
Figure 534783DEST_PATH_IMAGE086
. Then, from
Figure 218706DEST_PATH_IMAGE056
,
Figure 573463DEST_PATH_IMAGE087
is obtained, where
Figure 758457DEST_PATH_IMAGE087
is obtained from
Figure 252411DEST_PATH_IMAGE056
by fixing the day of the week v and the station s and taking time period t + 1. Because
Figure 966289DEST_PATH_IMAGE082
obeys the ARIMA(p, d, q)(P, D, Q)[Ω] model, that model can be used to predict, for time period
Figure 339502DEST_PATH_IMAGE090
, the value of
Figure 797028DEST_PATH_IMAGE118
, while
Figure 673717DEST_PATH_IMAGE087
is taken from the
Figure 824076DEST_PATH_IMAGE056
estimated above. Substituting
Figure 153426DEST_PATH_IMAGE087
and
Figure 414643DEST_PATH_IMAGE088
into the formula
Figure 871470DEST_PATH_IMAGE143
gives the predicted inbound passenger flow of the station at time
Figure 458309DEST_PATH_IMAGE090
.
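The prediction procedure described above (separate the part explained by the return flow, forecast the remainder, then recombine) can be sketched in a few lines. This is a deliberately simplified illustration, not the patent's estimator: the regression coefficient is fitted by least squares through the origin, and the S-ARIMA forecast of the remainder is replaced by a seasonal-naive stand-in (the value one season earlier). All names are hypothetical.

```python
from typing import Sequence


def fit_beta(inbound: Sequence[float], returns: Sequence[float]) -> float:
    """Least-squares slope through the origin for inbound_t = beta * return_t + eta_t."""
    denom = sum(r * r for r in returns)
    if denom == 0:
        raise ValueError("return flow is identically zero")
    return sum(y * r for y, r in zip(inbound, returns)) / denom


def forecast_next(inbound: Sequence[float], returns: Sequence[float],
                  next_return: float, season: int) -> float:
    """One-step forecast of inbound flow using the return flow as a covariate."""
    beta = fit_beta(inbound, returns)
    # Remainder series: inbound flow not explained by the return flow.
    eta = [y - beta * r for y, r in zip(inbound, returns)]
    # Stand-in for the S-ARIMA forecast of the remainder: seasonal naive,
    # i.e. reuse the remainder observed one season before the target period.
    eta_next = eta[-season]
    return beta * next_return + eta_next
```

In practice the remainder would be modeled with a seasonal ARIMA fit rather than the seasonal-naive placeholder used here.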
In the present embodiment, for example, the model parameters are selected as (2, 0, 1)(1, 1, 0)[72]. The results are shown in Table 4 above, where the M0 model does not include
Figure 274956DEST_PATH_IMAGE119
and the M1 model includes
Figure 74284DEST_PATH_IMAGE119
as a covariate. It can be seen that after the covariate is added, the RMSE of the training set is reduced by 9.87, the RMSE of the test set is reduced by 9.02, the SMAPE of the training set is reduced by 0.64%, and the SMAPE of the test set is reduced by 0.16%, so the prediction is more accurate.
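The two error measures reported above can be computed as in the following sketch. The patent does not spell out its SMAPE formula; the variant below (mean of 2|a - p| / (|a| + |p|), expressed in percent, with a term of 0 when both values are 0) is a common one and is an assumption here.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, in percent (assumed variant)."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        terms.append(0.0 if denom == 0 else 2.0 * abs(a - p) / denom)
    return 100.0 * sum(terms) / len(terms)
```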
Further, to achieve the above object, the present invention also provides an electronic device, including: the system comprises a processor, a memory and a computer program which is stored on the memory and can run on the processor, wherein when the computer program is executed by the processor, the method for predicting the rail transit passenger flow based on the passenger travel information is realized.
In order to achieve the above object, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the above method for predicting rail transit passenger flow based on passenger travel information.
According to the above scheme, the method provided by the invention is based on smart subway construction; it effectively associates, fuses and introduces multi-source data related to the subway, establishes passenger travel information, and explores the application of passenger travel information to passenger flow prediction. The invention mines passengers' travel patterns from their multi-source travel data and, on this basis, establishes a three-level index system of passenger travel information and realizes the statistical calculation of each index. It also provides a method for identifying return passenger flow from passenger travel information and a prediction method that uses the return passenger flow to effectively improve the accuracy of inbound passenger flow prediction.
Those of ordinary skill in the art will appreciate that the modules and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and devices may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, each functional module in the embodiments of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.
It should be understood that the order of execution of the steps in the summary of the invention and the embodiments of the present invention does not absolutely imply any order of execution, and the order of execution of the steps should be determined by their functions and inherent logic, and should not be construed as limiting the process of the embodiments of the present invention.

Claims (9)

1. The rail transit passenger flow prediction method based on passenger travel information is characterized by comprising the following steps of:
passenger travel data are acquired, a passenger travel information index system is established based on the passenger travel data, and each index information in the passenger travel information index system is calculated in a statistical mode;
estimating the return passenger flow of different time periods in the station based on partial index data obtained by calculation in the passenger travel information index system;
taking the return passenger flow of passengers in the station as a covariate, adding the covariate into the seasonal autoregressive moving average model, and predicting the station passenger flow;
the passenger trip data includes:
AFC card swiping record and APP code scanning record for acquiring the passenger entrance and exit time and entrance and exit station information;
APP registration data used for obtaining identity information and associated information of passengers;
APP value-added consumption data used for obtaining value-added service information of passengers;
POI data near the station, associated with the station and used for describing the geographic attributes of the station;
the index information in the passenger travel information index system comprises the following steps: basic information, business information and derived information;
the basic information comprises identity information and associated information, the identity information comprises an APPID, gender and age of a passenger and whether the passenger is disabled, and the associated information comprises a third-party payment mode of the passenger and a city all-purpose card;
the service information comprises trip basic information, trip derived information and value added service information, wherein the trip basic information comprises the trip in and out time and the trip in and out station information of passengers, the trip derived information comprises average trip duration, total trip times, daily average trip times, trip time distribution, trip OD distribution, trip path distribution, first trip time, last trip time, holiday trip time distribution and holiday trip OD distribution, and the value added service information comprises value added service participation times, participation frequency, average transaction amount, payment mode distribution, merchant type distribution and last participation time;
the derived information comprises an active attribute and a functional attribute, wherein the active attribute comprises travel liveness, and the functional attribute comprises a travel demand type of a passenger, a residence area site, a working area site and value-added participation;
the partial index data includes the in-and-out time, the residential zone site, the work zone site, and the travel demand type.
2. The passenger travel information-based rail transit passenger flow prediction method according to claim 1, wherein the formula for calculating the average travel time length is as follows:
Figure 709178DEST_PATH_IMAGE001
the formula for calculating the average daily trip times is as follows:
Figure 927669DEST_PATH_IMAGE002
the formula for calculating the travel time distribution is as follows:
Figure 787041DEST_PATH_IMAGE003
counting the travel OD distribution as the statistics of the passenger's top three ODs by travel frequency;
the formula for calculating the first trip time is as follows:
Figure 676105DEST_PATH_IMAGE004
the formula for calculating the last travel time is as follows:
Figure 646335DEST_PATH_IMAGE005
the formula for calculating the travel time distribution of the holidays is as follows:
Figure 719333DEST_PATH_IMAGE006
counting the holiday travel OD distribution as the statistics of the passenger's top three ODs by holiday travel frequency;
in the above formulas,
Figure 546343DEST_PATH_IMAGE007
represents the
Figure 704792DEST_PATH_IMAGE008
-th trip, d represents the outbound station, o represents the inbound station, i represents the passenger, t represents the time,
Figure 478713DEST_PATH_IMAGE009
represents the outbound time of the
Figure 875060DEST_PATH_IMAGE008
-th trip of passenger i at time t,
Figure 76234DEST_PATH_IMAGE010
represents the inbound time of the
Figure 459329DEST_PATH_IMAGE008
-th trip of passenger i at time t,
Figure 36941DEST_PATH_IMAGE011
represents the average travel duration of passenger i,
Figure 84531DEST_PATH_IMAGE012
represents the total number of historical trips of passenger i,
Figure 456607DEST_PATH_IMAGE013
represents the daily average number of trips of passenger i, D represents the total number of days within the statistical period,
Figure 589648DEST_PATH_IMAGE014
is a binary indicator function that takes the value 1 when the condition is satisfied and 0 otherwise,
Figure 705372DEST_PATH_IMAGE015
represents the first travel time of passenger i,
Figure 607469DEST_PATH_IMAGE016
represents the last travel time of passenger i, and
Figure 150445DEST_PATH_IMAGE017
represents the total number of trips made by the passenger on holidays within the statistical period.
3. The passenger travel information-based rail transit passenger flow prediction method according to claim 2, wherein the formula for calculating the participation frequency is:
Figure 304871DEST_PATH_IMAGE018
wherein,
Figure 489864DEST_PATH_IMAGE019
the frequency of participation in the value added service on behalf of the passenger i,
Figure 512047DEST_PATH_IMAGE020
number of times passenger i participates in value added service;
counting the merchant type distribution as the statistics of the top three merchant types by the passenger's participation frequency;
the formula for calculating the average transaction amount is:
Figure 429187DEST_PATH_IMAGE021
wherein,
Figure 271241DEST_PATH_IMAGE022
the average amount of money spent for passenger i to participate in the value added service,
Figure 994347DEST_PATH_IMAGE023
a total amount of money spent participating in the value added service for the passenger i;
and counting the payment mode distribution as the statistics of the top three payment modes used by the passenger.
4. The passenger travel information-based rail transit passenger flow prediction method according to claim 3, wherein the travel demand type is determined by a clustering result of a total travel frequency, a first travel time and an average travel time counted by passenger boarding and card swiping data, and the clustering method is as follows:
dividing the passengers into different categories according to the travel characteristics of the passengers by adopting a K-means algorithm, and selecting the total historical travel times of the passengers i in a passenger travel information index system
Figure 74298DEST_PATH_IMAGE024
First trip time
Figure 959078DEST_PATH_IMAGE025
And average trip duration
Figure 88095DEST_PATH_IMAGE011
as the indices for clustering passengers in the station, and determining the number of clusters K by the elbow method, wherein the key index of the elbow method is the within-cluster sum of squared errors SSE, calculated as follows:
Figure 552575DEST_PATH_IMAGE026
wherein,
Figure 283770DEST_PATH_IMAGE027
represents the k-th cluster, and
Figure 605030DEST_PATH_IMAGE028
is the center point of
Figure 421676DEST_PATH_IMAGE029
;
the formula for calculating the residential area site is:
Figure 486584DEST_PATH_IMAGE030
Figure 781607DEST_PATH_IMAGE031
the formula for calculating the work area station is as follows:
Figure 273768DEST_PATH_IMAGE032
Figure 764661DEST_PATH_IMAGE033
wherein,
Figure 823140DEST_PATH_IMAGE034
representing the probability that station e is the station of the residential area of passenger i;
Figure 263349DEST_PATH_IMAGE035
the probability that the representative station e is used as a station of the working area of the passenger i;
Figure 926411DEST_PATH_IMAGE036
represents the total number of times passenger i enters and exits station e;
Figure 655333DEST_PATH_IMAGE037
represents the number of times passenger i enters station e before 12:00 on working days;
Figure 62043DEST_PATH_IMAGE038
represents the number of times passenger i enters station e before 16:00 on rest days;
Figure 622338DEST_PATH_IMAGE039
represents the number of times passenger i enters station e after 12:00 on working days;
Figure 662494DEST_PATH_IMAGE040
represents the number of times passenger i enters station e after 16:00 on rest days;
Figure 941028DEST_PATH_IMAGE041
represents the total number of times passenger i enters and exits station e on working days;
Figure 89113DEST_PATH_IMAGE042
represents the number of times passenger i exits station e before 12:00 on working days;
Figure 503914DEST_PATH_IMAGE043
represents the number of times passenger i exits station e after 12:00 on working days;
Figure 774358DEST_PATH_IMAGE044
represents the number of times passenger i exits station e after 16:00 on rest days;
Figure 477872DEST_PATH_IMAGE045
represents the number of times passenger i exits station e before 16:00 on rest days;
the value-added participation degree is set to be strong, medium and low, when the participation frequency is more than 0.7, the value-added participation degree of the passenger is strong, when the participation frequency is less than 0.4, the value-added participation degree of the passenger is low, and when the participation frequency is between 0.4 and 0.7, the value-added participation degree of the passenger is medium.
5. The passenger travel information-based rail transit passenger flow prediction method according to claim 4, wherein estimating the return passenger flow volume at different time intervals in a station based on the passenger travel information index system and part of calculated index data comprises:
counting according to the in-and-out time, the residential area site, the working area site and the travel demand type in the passenger travel information index system
Figure 491964DEST_PATH_IMAGE046
wherein s is a station, v is the v-th day of a week with values 1-7, and t is a time period,
Figure 761271DEST_PATH_IMAGE046
the number of people who return from the s station within a time period t of the v day of a certain week;
selecting historical outbound and return passenger flow data of station s, and obtaining, by averaging, the conditional probability distribution
Figure 886244DEST_PATH_IMAGE049
that a passenger who arrives at station s in time period
Figure 140300DEST_PATH_IMAGE047
on day v of the week departs from station s on the return trip in time period
Figure 393427DEST_PATH_IMAGE048
, calculated as follows:
Figure 541216DEST_PATH_IMAGE050
in the formula
Figure 825567DEST_PATH_IMAGE051
Represents the total number of the weeks,
Figure 565990DEST_PATH_IMAGE047
which represents the time period a of the time,
Figure 859568DEST_PATH_IMAGE048
which represents the time period of b,
Figure 369047DEST_PATH_IMAGE052
indicates the number of passengers alighting at s-station during the period a of the jth week and the v day,
Figure 89878DEST_PATH_IMAGE053
which represents the time of arrival of the station,
Figure 786439DEST_PATH_IMAGE054
represents the time of outbound;
by said conditional probability distribution
Figure 680445DEST_PATH_IMAGE055
Estimating
Figure 247693DEST_PATH_IMAGE056
The calculation formula is as follows:
Figure 216391DEST_PATH_IMAGE057
wherein,
Figure 603510DEST_PATH_IMAGE058
represents the number of passengers alighting at station s at time
Figure 301207DEST_PATH_IMAGE059
on day v of a week, H represents the maximum interval between a passenger alighting and boarding at station s, which is 24 hours, h represents the time-slot resolution, and t + 1 represents the time period following time period t.
6. The passenger travel information-based rail transit passenger flow prediction method according to claim 5, wherein adding the return passenger flow of passengers in the station as a covariate to the passenger flow prediction model and predicting the inbound passenger flow of the station comprises:
adding the estimated
Figure 722961DEST_PATH_IMAGE056
to a common seasonal autoregressive moving average model to predict the inbound passenger flow of the station;
the seasonal autoregressive moving average model is ARIMA(p, d, q)(P, D, Q)[Ω], where p, d and q are the orders of the non-seasonal autoregressive, differencing and moving-average terms, respectively; P, D and Q are the orders of the autoregressive, differencing and moving-average terms of the seasonal part; and Ω is the number of periods per season;
for a time series
Figure 847912DEST_PATH_IMAGE060
,ARIMA(p,d,q)(P,D,Q)[Ω]The model is as follows:
Figure 519065DEST_PATH_IMAGE061
wherein B is defined as
Figure 692557DEST_PATH_IMAGE062
Figure 765556DEST_PATH_IMAGE063
Figure 61408DEST_PATH_IMAGE064
Figure 423119DEST_PATH_IMAGE065
Figure 465549DEST_PATH_IMAGE066
Wherein
Figure 596316DEST_PATH_IMAGE067
Figure 63069DEST_PATH_IMAGE068
Figure 912077DEST_PATH_IMAGE069
and
Figure 755268DEST_PATH_IMAGE070
as a function of the coefficients to be found,
Figure 537279DEST_PATH_IMAGE071
to follow the error term of white noise and obey a mean of 0 and a variance of
Figure 112617DEST_PATH_IMAGE072
The normal distribution of (c),
Figure 245658DEST_PATH_IMAGE073
represents
Figure 626961DEST_PATH_IMAGE074
A time period;
when returning the passenger flow
Figure 732320DEST_PATH_IMAGE075
Inbound traffic volume when acting as covariates
Figure 278226DEST_PATH_IMAGE076
And return passenger flow
Figure 164143DEST_PATH_IMAGE077
The following relationships exist:
Figure 817978DEST_PATH_IMAGE078
Figure 777844DEST_PATH_IMAGE079
wherein,
Figure 22880DEST_PATH_IMAGE080
in order to be the regression coefficient, the method,
Figure 68197DEST_PATH_IMAGE075
is composed of
Figure 525723DEST_PATH_IMAGE046
In the middle week v, when the site s is known, the time is obtained by taking 1. cndot. t;
Figure 667991DEST_PATH_IMAGE081
obeying ARIMA (P, D, Q) (P, D, Q) [ s ]]The model represents the passenger flow except the return passenger flow in the total inbound passenger flow; according to station history
Figure 756033DEST_PATH_IMAGE082
And
Figure 350962DEST_PATH_IMAGE083
calculate out
Figure 818371DEST_PATH_IMAGE080
And
Figure 549567DEST_PATH_IMAGE081
to obtain
Figure 870827DEST_PATH_IMAGE081
Post-pass formula
Figure 890735DEST_PATH_IMAGE084
The prediction is obtained
Figure 955643DEST_PATH_IMAGE085
Then according to
Figure 744608DEST_PATH_IMAGE056
To obtain
Figure 502348DEST_PATH_IMAGE086
Wherein
Figure 806291DEST_PATH_IMAGE086
is formed by
Figure 346993DEST_PATH_IMAGE056
Middle week v, stationWhen the point s is known, the time is obtained by taking t + 1; will be provided with
Figure 318361DEST_PATH_IMAGE086
And
Figure 910317DEST_PATH_IMAGE087
bringing in
Figure 967135DEST_PATH_IMAGE088
In the formula, the prediction is obtained
Figure 311528DEST_PATH_IMAGE089
The incoming passenger flow at that moment.
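The prediction steps of this claim (estimate the regression coefficient, model the remaining flow, forecast it one step ahead, then add back the covariate term) can be sketched as below. This is a simplified illustration, not the claimed model: the seasonal ARIMA step is replaced by its simplest special case, ARIMA(0,0,0)(0,1,0)[Ω], whose one-step forecast repeats the value from one season earlier, and the coefficient is fitted on seasonally differenced series to match that choice. All function and parameter names are hypothetical.

```python
def fit_beta(y, x, season):
    """Least-squares slope of seasonally differenced y on seasonally differenced x."""
    dy = [y[t] - y[t - season] for t in range(season, len(y))]
    dx = [x[t] - x[t - season] for t in range(season, len(x))]
    denom = sum(v * v for v in dx)
    if denom == 0:
        raise ValueError("covariate has no seasonal variation; beta is not identified")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def forecast_inbound(y, x, x_next, season):
    """One-step inbound forecast y_{t+1} = beta * x_{t+1} + eta_{t+1}.

    y: historical inbound flow, x: historical return flow (the covariate),
    x_next: estimated return flow for the next period, season: periods per season.
    """
    beta = fit_beta(y, x, season)
    # Remaining inbound flow once the return-flow term is removed.
    eta = [yi - beta * xi for yi, xi in zip(y, x)]
    # Stand-in for the seasonal ARIMA forecast: repeat the value one season ago.
    eta_next = eta[len(eta) - season]
    return beta * x_next + eta_next, beta
```

In practice the stand-in forecast would be replaced by a fitted seasonal ARIMA model from a statistics library, with the return flow passed as an exogenous regressor.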
7. A rail transit passenger flow prediction system based on passenger travel information, characterized by comprising:
an index acquisition module, configured to acquire passenger travel data, establish a passenger travel information index system based on the passenger travel data, and statistically calculate each index in the passenger travel information index system;
a return passenger flow calculation module, configured to estimate the return passenger flow of the station in different time periods based on part of the index data calculated in the passenger travel information index system;
a passenger flow prediction module, configured to add the return passenger flow of the station as a covariate to the seasonal autoregressive moving average model to predict the inbound passenger flow;
wherein the passenger travel data comprises:
AFC card-swiping records and APP code-scanning records, used to obtain passengers' inbound and outbound times and inbound and outbound station information;
APP registration data, used to obtain passengers' identity information and associated information;
APP value-added consumption data, used to obtain passengers' value-added service information;
POI data for the area around each station, associated with the station and used to describe the geographic attributes of the station;
the index information in the passenger travel information index system comprises: basic information, service information and derived information;
the basic information comprises identity information and associated information, the identity information comprising the passenger's APP ID, gender, age and disability status, and the associated information comprising the passenger's third-party payment method and city all-purpose card;
the service information comprises basic travel information, derived travel information and value-added service information, wherein the basic travel information comprises the passenger's inbound and outbound times and inbound and outbound station information; the derived travel information comprises average trip duration, total number of trips, daily average number of trips, trip time distribution, trip OD distribution, trip path distribution, first trip time, last trip time, holiday trip time distribution and holiday trip OD distribution; and the value-added service information comprises the number of value-added service participations, participation frequency, average transaction amount, payment method distribution, merchant type distribution and last participation time;
the derived information comprises an activity attribute and a functional attribute, wherein the activity attribute comprises travel activity level, and the functional attribute comprises the passenger's travel demand type, residential-area station, work-area station and value-added service participation;
said part of the index data comprises the inbound and outbound times, the residential-area station, the work-area station and the travel demand type.
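As an illustration of how a few of the derived travel indices listed in this claim might be computed from entry and exit records, the sketch below uses a hypothetical `Trip` record and a hypothetical `travel_derived_info` helper. The field names and the choice to average daily trips over days with at least one trip are assumptions, not taken from the patent.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Trip:
    """One completed journey reconstructed from AFC or APP records (hypothetical layout)."""
    entry_station: str
    entry_time: datetime
    exit_station: str
    exit_time: datetime


def travel_derived_info(trips):
    """Compute a subset of the derived travel indices for one passenger."""
    durations = [(t.exit_time - t.entry_time).total_seconds() / 60 for t in trips]
    active_days = {t.entry_time.date() for t in trips}
    od_counts = {}
    for t in trips:
        key = (t.entry_station, t.exit_station)
        od_counts[key] = od_counts.get(key, 0) + 1
    return {
        "total_trips": len(trips),
        "avg_duration_min": mean(durations),
        # Assumption: the daily average is taken over days with at least one trip.
        "daily_avg_trips": len(trips) / len(active_days),
        "od_distribution": od_counts,
        "first_trip_time": min(t.entry_time for t in trips),
        "last_trip_time": max(t.entry_time for t in trips),
    }
```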
8. An electronic device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the rail transit passenger flow prediction method based on passenger travel information according to any one of claims 1 to 6.
9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the rail transit passenger flow prediction method based on passenger travel information according to any one of claims 1 to 6.
CN202210254509.3A 2022-03-16 2022-03-16 Rail transit passenger flow prediction method and system based on passenger travel information Active CN114331234B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210254509.3A CN114331234B (en) 2022-03-16 2022-03-16 Rail transit passenger flow prediction method and system based on passenger travel information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210254509.3A CN114331234B (en) 2022-03-16 2022-03-16 Rail transit passenger flow prediction method and system based on passenger travel information

Publications (2)

Publication Number Publication Date
CN114331234A CN114331234A (en) 2022-04-12
CN114331234B CN114331234B (en) 2022-07-12

Family

ID=81033087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210254509.3A Active CN114331234B (en) 2022-03-16 2022-03-16 Rail transit passenger flow prediction method and system based on passenger travel information

Country Status (1)

Country Link
CN (1) CN114331234B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898574B (en) * 2022-04-26 2023-04-04 安徽省交通控股集团有限公司 Method and system for estimating traffic parameters
CN114912683B (en) * 2022-05-13 2024-05-10 中铁第六勘察设计院集团有限公司 System and method for predicting abnormal large passenger flow of smart city rail transit
CN115759472B (en) * 2022-12-07 2023-12-22 北京轨道交通路网管理有限公司 Passenger flow information prediction method and device and electronic equipment
CN116778739B (en) * 2023-06-20 2024-09-20 深圳市中车智联科技有限公司 Public transportation scheduling method and system based on demand response
CN117746640B (en) * 2024-02-20 2024-04-30 云南省公路科学技术研究院 Road traffic flow rolling prediction method, system, terminal and medium
CN117935416B (en) * 2024-03-21 2024-06-25 成都赛力斯科技有限公司 Pre-running area access statistical method, device and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701180A (en) * 2016-01-06 2016-06-22 北京航空航天大学 Commuting passenger feature extraction and determination method based on public transportation IC card data
CN105718946A (en) * 2016-01-20 2016-06-29 北京工业大学 Passenger going-out behavior analysis method based on subway card-swiping data
CN106845714A (en) * 2017-01-24 2017-06-13 东南大学 A kind of monthly passenger flow method of ARIMA model prediction urban track traffics based on seasonal index number
CN109961164A (en) * 2017-12-25 2019-07-02 比亚迪股份有限公司 Passenger flow forecast method and device
CN110782070A (en) * 2019-09-25 2020-02-11 北京市交通信息中心 Urban rail transit emergency passenger flow space-time distribution prediction method
WO2020091620A1 (en) * 2018-10-30 2020-05-07 Общество С Ограниченной Ответственностью "Глобус Медиа" Method for predicting passenger flow and device for the implementation thereof
CN111260140A (en) * 2020-01-19 2020-06-09 武汉中科通达高新技术股份有限公司 Method for predicting instantaneous return large passenger flow in subway station
CN111932867A (en) * 2020-06-18 2020-11-13 东南大学 Multisource data-based bus IC card passenger getting-off station derivation method
CN111985710A (en) * 2020-08-18 2020-11-24 深圳诺地思维数字科技有限公司 Bus passenger trip station prediction method, storage medium and server
WO2021174755A1 (en) * 2020-03-02 2021-09-10 北京全路通信信号研究设计院集团有限公司 Rail transit passenger flow demand prediction method and apparatus based on deep learning
CN113850417A (en) * 2021-08-27 2021-12-28 浙江浙大中控信息技术有限公司 Passenger flow organization decision-making method based on station passenger flow prediction
CN114037158A (en) * 2021-11-09 2022-02-11 北京京投亿雅捷交通科技有限公司 Passenger flow prediction method based on OD path and application method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A Classification method of rail transit stations based on POI data and tf-idf index;Shichen Zhong等;《21st Cota international conference of transportation professionals》;20211231;全文 *
Deep learning-based hybrid model for short-term subway passenger flow prediction using automatic fare collection data;Jia feifan等;《IET Intelligent Transport Systems》;20191231;第13卷(第11期);第1708-1716页 *
基于广义动态模糊神经网络的短时车站进站客流量预测;李春晓等;《都市快轨交通》;20150818;第28卷(第4期);第57-61页 *
客运专线旅客出行需求及客流时空分布研究;徐攀;《中国优秀硕士学位论文全文数据库 (工程科技Ⅱ辑)》;20121015(第10期);第C033-465页 *

Also Published As

Publication number Publication date
CN114331234A (en) 2022-04-12

Similar Documents

Publication Publication Date Title
CN114331234B (en) Rail transit passenger flow prediction method and system based on passenger travel information
Zhou et al. Bus arrival time calculation model based on smart card data
CN109035770B (en) Real-time analysis and prediction method for bus passenger capacity in big data environment
Munizaga et al. Validating travel behavior estimated from smartcard data
Merriman et al. Excess commuting in the Tokyo metropolitan area: measurement and policy simulations
Mohring et al. The values of waiting time, travel time, and a seat on a bus
Ortega-Tong Classification of London's public transport users using smart card data
Kemp Some evidence of transit demand elasticities
Lee et al. Assessing transit competitiveness in Seoul considering actual transit travel times based on smart card data
Fu et al. Impact of a new metro line: analysis of metro passenger flow and travel time based on smart card data
Xiong et al. Understanding operation patterns of urban online ride-hailing services: A case study of Xiamen
CN114358808A (en) Public transport OD estimation and distribution method based on multi-source data fusion
CN110889092A (en) Short-time large-scale activity peripheral track station passenger flow volume prediction method based on track transaction data
Li et al. Using smart card data trimmed by train schedule to analyze metro passenger route choice with synchronous clustering
Sun et al. Identifying public transit commuters based on both the smartcard data and survey data: a case study in xiamen, China
Yoo Transfer penalty estimation with transit trips from smartcard data in Seoul, Korea
Dumas Analyzing transit equity using automatically collected data
CN108681741A (en) Based on the subway of IC card and resident's survey data commuting crowd's information fusion method
Han et al. Analyzing the accessibility of subway stations for transport-vulnerable population segments in Seoul: Case of bus-to-subway transfer
CN112990518A (en) Real-time prediction method and device for destination station of individual subway passenger
Wang et al. Determining the level of service scale of public transport system considering the distribution of service quality
Mullen Estimating the demand for urban bus travel
Montero-Lamas et al. A new big data approach to understanding general traffic impacts on bus passenger delays
Li-Jun et al. Evaluation of the reliability of bus service based on gps and smart card data
Scholl et al. A rapid road to employment? The impacts of a bus rapid transit system in Lima

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220615

Address after: 100044 Beijing city Haidian District Shangyuan Village No. 3

Applicant after: Beijing Jiaotong University

Applicant after: GUANGZHOU METRO GROUP Co.,Ltd.

Address before: 100044 Beijing city Haidian District Shangyuan Village No. 3

Applicant before: Beijing Jiaotong University

GR01 Patent grant