CN114331234B - Rail transit passenger flow prediction method and system based on passenger travel information
- Publication number: CN114331234B
- Application number: CN202210254509.3A
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention relates to the technical field of rail transit passenger flow prediction, and in particular to a rail transit passenger flow prediction method and system based on passenger travel information. The method comprises the following steps: acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the index system; estimating the return passenger flow of a station in different time periods based on part of the calculated indexes; and adding the return passenger flow of the station as a covariate to a passenger flow prediction model to predict the inbound passenger flow. The invention mines passengers' travel patterns from their multi-source travel data and, on that basis, establishes a three-level passenger travel information index system and realizes the statistical calculation of each index. It also provides a prediction method that identifies return passenger flow from passenger travel information and uses the return passenger flow to improve the accuracy of inbound passenger flow prediction.
Description
Technical Field
The invention relates to the technical field of rail transit passenger flow prediction, and in particular to a rail transit passenger flow prediction method and system based on passenger travel information.
Background
Accurate prediction of passenger flow demand at stations is crucial to the operation of urban subway systems. Conventionally, the passenger flow values at several past times are treated as a time series to predict the passenger flow at a future time. However, this approach largely ignores the travel behavior patterns of individual passengers. For example, a passenger who exits at a subway station in the morning to go to work is likely to enter the same station in the evening to go home. Existing research shows that it is necessary to add a travel behavior component to the time series used for passenger flow prediction. Following the concept of user travel information, user labels that are easy to understand, representative and meaningful are abstracted through modeling, and an information set of the user is constructed from these labels to describe the user's behavior characteristics. Therefore, constructing passenger travel information from the travel data of individual passengers, so as to describe their travel behavior patterns, makes accurate passenger flow prediction possible. At present, rail transit passenger travel information still has the following defects: the multi-source travel data of passengers are not mined in depth, so a large amount of data is wasted; and the passenger travel information index system established through data analysis is incomplete and needs to be developed further.
Disclosure of Invention
The invention aims to solve at least one technical problem in the background art, and provides a method and a system for predicting rail transit passenger flow based on passenger travel information.
In order to achieve the purpose, the invention provides a rail transit passenger flow prediction method based on passenger travel information, which comprises the following steps:
acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the passenger travel information index system;
estimating the return passenger flow of the station in different time periods based on part of the index data calculated in the passenger travel information index system;
and adding the return passenger flow of the station as a covariate to the passenger flow prediction model to predict the inbound passenger flow.
Preferably, the travel data includes:
AFC card-swiping records and APP code-scanning records, used to obtain passengers' station entry and exit times and entry and exit station information;
APP registration data, used to obtain passengers' identity information and associated information;
APP value-added consumption data, used to obtain passengers' value-added service information;
POI data near each station, associated with the station, used to describe the geographic attributes of the station.
Preferably, the index information in the passenger travel information index system includes: basic information, business information and derived information;
the basic information comprises identity information and associated information, wherein the identity information comprises the passenger's APPID, gender, age and whether the passenger is disabled, and the associated information comprises the passenger's third-party payment mode and city all-purpose card;
the service information comprises travel basic information, travel derived information and value-added service information, wherein the travel basic information comprises the passenger's station entry and exit times and entry and exit station information; the travel derived information comprises average trip duration, total number of trips, average daily number of trips, travel time distribution, travel OD distribution, travel path distribution, first trip time, last trip time, holiday travel time distribution and holiday travel OD distribution; and the value-added service information comprises the number of participations in value-added services, participation frequency, average transaction amount, payment mode distribution, merchant type distribution and last participation time;
the derived information comprises an activity attribute and a function attribute, wherein the activity attribute comprises travel activity, and the function attribute comprises the passenger's travel demand type, residential area station, working area station and value-added participation degree.
Preferably, the formula for calculating the average trip duration is as follows:
$$\bar{T}_i = \frac{1}{N_i}\sum_{k=1}^{N_i}\left(t^{d}_{i,k} - t^{o}_{i,k}\right)$$
the formula for calculating the average daily number of trips is as follows:
$$\bar{n}_i = \frac{N_i}{D}$$
the formula for calculating the travel time distribution, taking the morning peak (7:00-9:00) as an example, is as follows:
$$P^{\mathrm{am}}_i = \frac{1}{N_i}\sum_{k=1}^{N_i} I\left(7{:}00 \le t^{o}_{i,k} \le 9{:}00\right)$$
the travel OD distribution is counted as the three OD pairs the passenger travels most frequently;
the formula for calculating the first trip time is as follows:
the formula for calculating the last trip time is as follows:
the formula for calculating the holiday travel time distribution, taking the morning peak as an example and summing over the holiday trips of passenger i, is as follows:
$$P^{\mathrm{am},h}_i = \frac{1}{N^{h}_i}\sum_{k \in \text{holiday trips}} I\left(7{:}00 \le t^{o}_{i,k} \le 9{:}00\right)$$
the holiday travel OD distribution is counted as the three OD pairs the passenger travels most frequently on holidays;
in the above formulas, $k$ denotes the $k$-th trip, $d$ the exit station, $o$ the entry station, $i$ the passenger and $t$ the time; $t^{d}_{i,k}$ represents the exit time of the $k$-th trip of passenger $i$; $t^{o}_{i,k}$ represents the entry time of the $k$-th trip of passenger $i$; $\bar{T}_i$ represents the average trip duration of passenger $i$; $N_i$ represents the total number of historical trips of passenger $i$; $\bar{n}_i$ represents the average daily number of trips of passenger $i$; $D$ represents the total number of days of the passenger within the statistical period; $I(\cdot)$ is a binary indicator function that takes the value 1 when the condition is satisfied and 0 otherwise; $t^{\mathrm{first}}_i$ represents the first trip time of passenger $i$; $t^{\mathrm{last}}_i$ represents the last trip time of passenger $i$; and $N^{h}_i$ represents the total number of trips the passenger makes on holidays within the statistical period.
Preferably, the formula for calculating the participation frequency is:
wherein $f_i$ represents the frequency with which passenger $i$ participates in value-added services, and $c_i$ represents the number of times passenger $i$ participates in value-added services;
the merchant type distribution is counted as the three merchant types in which the passenger participates most frequently;
the formula for calculating the average transaction amount is:
$$\bar{m}_i = \frac{M_i}{c_i}$$
wherein $\bar{m}_i$ is the average amount passenger $i$ spends per participation in value-added services, and $M_i$ is the total amount passenger $i$ spends on value-added services;
and the payment mode distribution is counted as the three payment modes the passenger uses most frequently.
Preferably, the travel demand type is determined by clustering the total number of trips, the first trip time and the average trip duration counted from passengers' station-entry card-swiping data, and the clustering method comprises the following steps:
dividing the passengers into categories according to their travel characteristics using the K-means algorithm, and selecting the total number of historical trips $N_i$, the first trip time $t^{\mathrm{first}}_i$ and the average trip duration $\bar{T}_i$ of passenger $i$ in the passenger travel information index system as the clustering indexes for passengers in a station; the number of clusters K is determined by the elbow method, whose key index is the within-cluster sum of squared errors (SSE), calculated as follows:
$$SSE = \sum_{k=1}^{K}\sum_{x \in C_k}\lVert x - \mu_k\rVert^{2}$$
wherein $C_k$ represents the $k$-th cluster, $x$ is a sample in $C_k$, and $\mu_k$ is the center point of $C_k$;
the formula for calculating the residential area station is:
the formula for calculating the working area station is:
wherein the quantities in the two formulas are, in order: the probability that station e is the residential area station of passenger i; the probability that station e is the working area station of passenger i; the total number of times passenger i enters and exits at station e; the number of times passenger i enters at station e before 12:00 on working days; the number of times passenger i enters at station e before 16:00 on rest days; the number of times passenger i enters at station e after 12:00 on working days; the number of times passenger i enters at station e after 16:00 on rest days; the total number of times passenger i enters and exits at station e on working days; the number of times passenger i exits at station e before 12:00 on working days; the number of times passenger i exits at station e after 12:00 on working days; the number of times passenger i exits at station e after 16:00 on rest days; and the number of times passenger i exits at station e before 16:00 on rest days;
the value-added participation degree is set to strong, medium or low: when the participation frequency is greater than 0.7, the passenger's value-added participation degree is strong; when the participation frequency is less than 0.4, it is low; and when the participation frequency is between 0.4 and 0.7, it is medium.
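The threshold rule above can be illustrated with a minimal Python sketch. The function name `participation_degree` is hypothetical, and treating the boundary values 0.4 and 0.7 as "medium" is an assumption drawn from the wording "between 0.4 and 0.7".

```python
def participation_degree(frequency: float) -> str:
    """Map a participation frequency to a value-added participation degree.

    Assumption: the boundary values 0.4 and 0.7 themselves count as "medium".
    """
    if frequency > 0.7:
        return "strong"
    if frequency < 0.4:
        return "low"
    return "medium"  # 0.4 <= frequency <= 0.7


if __name__ == "__main__":
    for f in (0.85, 0.55, 0.10):
        print(f, participation_degree(f))
```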
Preferably, estimating the return passenger flow of the station in different time periods based on part of the index data calculated in the passenger travel information index system comprises:
counting the return passenger flow $r_{s,v,t}$ according to the station entry and exit times, the residential area station, the working area station and the travel demand type in the passenger travel information index system, wherein $s$ is a station, $v$ is the day of the week and takes the values 1 to 7, $t$ is a time period, and $r_{s,v,t}$ is the number of passengers who return from station $s$ in time period $t$ on day $v$ of the week;
selecting historical exit and return passenger flow data at station $s$ and obtaining, by averaging, the conditional probability distribution that, on day $v$ of the week, a passenger who arrives at station $s$ in one time period departs from station $s$ on the return journey in another time period; the calculation formula is as follows:
in the formula, the quantities are, in order: the total number of weeks; time period a; time period b; the number of passengers alighting at station $s$ in the given time period on day $v$ of the $j$-th week; the station entry time; and the station exit time;
wherein the remaining quantities are: the number of passengers alighting at station $s$ at the given time on day $v$ of the week; $H$, the maximum interval between a passenger's exit and entry at station $s$, which is at most 24 hours; $h$, the time slot resolution; and $t+1$, the time period following time period $t$.
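Because the formulas themselves are not legible in this text, the following Python sketch only illustrates the idea described above and should not be read as the patented formula. It assumes that the conditional probability is the week-averaged share of passengers exiting in period a who re-enter in period b, and that the return flow in a period is the sum of earlier exits weighted by that probability. The data layout, the hourly time slots and all function names are hypothetical.

```python
def conditional_return_probability(weeks):
    """Average, over weeks, the share of passengers who exited at the station
    in period a and re-entered (return trip) in period b.

    weeks: list of dicts, one per historical week for the same weekday, each
           {"exits": {a: count}, "returns": {(a, b): count}}  (assumed layout).
    """
    sums = {}
    for week in weeks:
        for (a, b), n_ab in week["returns"].items():
            n_a = week["exits"].get(a, 0)
            if n_a > 0:
                sums[(a, b)] = sums.get((a, b), 0.0) + n_ab / n_a
    return {pair: total / len(weeks) for pair, total in sums.items()}


def estimate_return_flow(exits_today, prob, t, max_lag=24):
    """Expected number of return entries in period t, given today's exits in
    the preceding max_lag periods (assumed: one period = one hour)."""
    return sum(
        exits_today.get(a, 0) * prob.get((a, t), 0.0)
        for a in range(t - max_lag, t)
    )


if __name__ == "__main__":
    history = [
        {"exits": {8: 100}, "returns": {(8, 18): 40}},
        {"exits": {8: 200}, "returns": {(8, 18): 100}},
    ]
    p = conditional_return_probability(history)
    print(estimate_return_flow({8: 120}, p, t=18))
```

In this toy data the probability of returning at 18:00 after exiting at 08:00 is the mean of 0.4 and 0.5, so 120 morning exits yield an expected return flow of about 54 passengers.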
Preferably, adding the return passenger flow of the station to the passenger flow prediction model as a covariate to predict the inbound passenger flow comprises:
adding the estimated return passenger flow $r_{s,v,t}$ to a commonly used seasonal autoregressive integrated moving average (ARIMA) model to predict the inbound passenger flow;
the seasonal ARIMA model is written ARIMA(p, d, q)(P, D, Q)[Ω], where p, d and q are the orders of the autoregressive, differencing and moving-average parts, respectively; P, D and Q are the autoregressive, differencing and moving-average orders of the seasonal part; and Ω is the number of periods per season; the model is:
$$\phi_p(B)\,\Phi_P(B^{\Omega})\,(1-B)^{d}\,(1-B^{\Omega})^{D}\,y_t = \theta_q(B)\,\Theta_Q(B^{\Omega})\,\varepsilon_t$$
wherein $B$ is the backshift operator defined by $B y_t = y_{t-1}$, $\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}$, $\Phi_P(B^{\Omega}) = 1 - \Phi_1 B^{\Omega} - \dots - \Phi_P B^{P\Omega}$, $\theta_q(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}$ and $\Theta_Q(B^{\Omega}) = 1 + \Theta_1 B^{\Omega} + \dots + \Theta_Q B^{Q\Omega}$; $\phi$, $\Phi$, $\theta$ and $\Theta$ are the coefficients to be determined; $\varepsilon_t$ is a white-noise error term following a normal distribution with mean 0 and variance $\sigma^{2}$; and $t$ denotes the time period;
when the return passenger flow $r_{s,v,t}$ acts as a covariate, the inbound passenger flow $y_{s,v,t}$ and the return passenger flow $r_{s,v,t}$ satisfy the following relationship:
$$y_{s,v,t} = \beta\, r_{s,v,t} + \eta_{s,v,t}$$
wherein $\beta$ is the regression coefficient; $r_{s,v,t}$ is the return passenger flow with the day of the week $v$ and the station $s$ known and the time period taking the values 1 to $t$; $\eta_{s,v,t}$ follows the ARIMA(p, d, q)(P, D, Q)[Ω] model and represents the part of the total inbound passenger flow other than the return passenger flow; $\beta$ and $\eta_{s,v,t}$ are calculated from the station's historical $y_{s,v,t}$ and $r_{s,v,t}$; $\eta_{s,v,t+1}$ is then predicted through the ARIMA formula, and $r_{s,v,t+1}$ is obtained from the return passenger flow estimate, where $r_{s,v,t+1}$ is the return passenger flow with $v$ and $s$ known and the time period taking the value $t+1$; substituting $r_{s,v,t+1}$ and $\eta_{s,v,t+1}$ into the relationship above gives the predicted inbound passenger flow of the station in time period $t+1$.
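The two-stage idea in this paragraph (regress the inbound flow on the return flow, model the remainder as a seasonal series, then recombine) can be sketched in plain Python. This is a deliberately simplified stand-in, not the model of the invention: the regression coefficient is fitted by ordinary least squares, and the seasonal ARIMA step is replaced by a seasonal-naive forecast that reuses the residual from one season earlier. The function names and the toy data are hypothetical.

```python
def fit_beta(y, r):
    """Ordinary least squares slope of inbound flow y on return flow r."""
    n = len(y)
    mean_y = sum(y) / n
    mean_r = sum(r) / n
    cov = sum((ri - mean_r) * (yi - mean_y) for ri, yi in zip(r, y))
    var = sum((ri - mean_r) ** 2 for ri in r)
    return cov / var


def forecast_inbound(y, r, r_next, season):
    """One-step forecast of inbound flow as beta * r_next + eta_next.

    Assumption: eta_next is taken from the residual one season earlier
    (seasonal-naive), standing in for a fitted seasonal ARIMA model.
    """
    beta = fit_beta(y, r)
    eta = [yi - beta * ri for yi, ri in zip(y, r)]  # non-return component
    eta_next = eta[len(eta) - season]
    return beta * r_next + eta_next


if __name__ == "__main__":
    r_hist = [1, 2, 3, 4, 4, 3, 2, 1]              # return flow (toy values)
    eta_true = [10, 20, 30, 40, 10, 20, 30, 40]    # seasonal remainder, period 4
    y_hist = [2 * ri + ei for ri, ei in zip(r_hist, eta_true)]
    print(forecast_inbound(y_hist, r_hist, r_next=5, season=4))
```

A production implementation would more likely fit the regression and the seasonal ARIMA errors jointly with a time-series library; the sketch only shows how the covariate and the remainder are combined in the final prediction.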
In order to achieve the above object, the present invention provides a rail transit passenger flow prediction system based on passenger travel information, comprising:
the index acquisition module is used for acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the passenger travel information index system;
the return passenger flow calculation module is used for estimating the return passenger flow at different time intervals in the station based on part of index data calculated in the passenger travel information index system;
and the passenger flow prediction module is used for adding the return passenger flow of passengers in the station into the passenger flow prediction model as a covariate to predict the station entering passenger flow.
In order to achieve the above object, the present invention provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and running on the processor, wherein the computer program, when executed by the processor, implements the method for predicting rail transit passenger flow based on passenger travel information as described in any one of the above.
To achieve the above object, the present invention provides a computer-readable storage medium storing thereon a computer program, which when executed by a processor, implements a method for predicting rail transit passenger flow based on passenger travel information as described in any one of the above.
The invention has the beneficial effects that:
1. the rail transit passenger flow prediction method based on passenger travel information provided by the invention builds on smart subway construction, effectively associates, fuses and introduces multi-source subway-related data, establishes passenger travel information, and explores its application to passenger flow prediction;
2. the rail transit passenger flow prediction method based on passenger travel information mines passengers' travel patterns from their multi-source travel data, establishes a three-level passenger travel information index system on that basis, and realizes the statistical calculation of each index;
3. the rail transit passenger flow prediction method based on passenger travel information provides a prediction method that identifies return passenger flow from passenger travel information and uses the return passenger flow to improve the accuracy of inbound passenger flow prediction.
Drawings
Fig. 1 schematically shows a flow chart of a method for predicting rail transit passenger flow based on passenger travel information according to the present invention;
Fig. 2 schematically shows a structural block diagram of a rail transit passenger flow prediction system based on passenger travel information according to the present invention.
Detailed Description
The content of the invention will now be discussed with reference to exemplary embodiments. It is to be understood that the embodiments discussed are merely intended to enable one of ordinary skill in the art to better understand and thus implement the teachings of the present invention, and do not imply any limitations on the scope of the invention.
As used herein, the term "include" and its variants are to be read as open-ended terms meaning "including, but not limited to". The term "based on" is to be read as "based, at least in part, on". The terms "one embodiment" and "an embodiment" are to be read as "at least one embodiment".
Fig. 1 schematically shows a flow chart of a method for predicting rail transit passenger flow based on passenger travel information according to the present invention. As shown in fig. 1, the method for predicting rail transit passenger flow based on passenger travel information according to the present invention comprises the following steps:
a. acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the passenger travel information index system;
b. estimating the return passenger flow of the station in different time periods based on part of the index data calculated in the passenger travel information index system;
c. adding the return passenger flow of the station as a covariate to the passenger flow prediction model to predict the inbound passenger flow.
According to an embodiment of the invention, in step a, when a passenger enters a station to ride, the passenger's travel data can be obtained from the passenger's card-swiping records and/or code-scanning records; using the PyCharm software, a Python program connects to the database of passenger travel information to obtain the data and carry out the corresponding statistical analysis. In this embodiment, the travel data includes: AFC card-swiping records and APP code-scanning records, used to obtain passengers' station entry and exit times and entry and exit station information; APP registration data, used to obtain passengers' identity information and associated information; APP value-added consumption data, used to obtain passengers' value-added service information; and POI data near each station, associated with the station, used to describe the geographic attributes of the station. In this embodiment, the travel data are counted from the date on which the passenger registered in the APP; the POI data near a station cover the land use types within a circle of radius 500 meters centered on the station, and are acquired through the third-party map software Amap (Gaode Map).
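As an illustration of the 500-meter POI coverage described above, the following Python sketch selects the POIs that fall within a given radius of a station using the haversine great-circle distance. The record layout (`lat`, `lon`, `land_use`), the function names and the coordinates are hypothetical; the text does not specify how the distance is computed.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pois_near_station(station, pois, radius_m=500.0):
    """Return the POIs lying within radius_m of the station."""
    return [
        p for p in pois
        if haversine_m(station["lat"], station["lon"], p["lat"], p["lon"]) <= radius_m
    ]


if __name__ == "__main__":
    station = {"lat": 39.940, "lon": 116.350}  # illustrative coordinates only
    pois = [
        {"land_use": "office", "lat": 39.940, "lon": 116.351},
        {"land_use": "park", "lat": 39.950, "lon": 116.350},
    ]
    print([p["land_use"] for p in pois_near_station(station, pois)])
```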
In this embodiment, the passenger travel information index system is a concept representing passenger information, and includes three first-level indexes: basic information, service information and derived information. The basic information comprises two second-level indexes: identity information and associated information. The identity information comprises the third-level indexes APPID, gender, age and whether the passenger is disabled, and the associated information comprises the third-level indexes third-party payment mode and city all-purpose card.
The service information comprises three second-level indexes: travel basic information, travel derived information and value-added service information. The travel basic information comprises the third-level indexes station entry and exit times and entry and exit station information; the travel derived information comprises the third-level indexes average trip duration, total number of trips, average daily number of trips, travel time distribution, travel OD distribution, travel path distribution, first trip time, last trip time, holiday travel time distribution and holiday travel OD distribution; the value-added service information comprises the third-level indexes number of participations in value-added services, participation frequency, average transaction amount, payment mode distribution, merchant type distribution and last participation time.
The derived information comprises two second-level indexes: an activity attribute and a function attribute. The activity attribute comprises the third-level index travel activity; the function attribute comprises the third-level indexes travel demand type, residential area station, working area station and value-added participation degree.
Further, in the present embodiment, counting and calculating each index in the passenger travel information index system includes: counting all indexes in the passenger's identity information, associated information and travel basic information; counting the total number of trips, travel OD distribution, holiday travel OD distribution and travel path distribution in the travel derived information; and counting the number of participations, merchant type distribution, payment mode distribution and last participation time in the value-added service information. The invention places no special requirement on the counting method, provided that the relevant index information can be acquired and stored in summary form. These indexes are data generated directly by passenger actions such as registration and travel and can be obtained without calculation, so only summary statistics are required.
Further, in this embodiment, counting and calculating each index in the passenger travel information index system also includes calculating the third-level indexes other than those listed above. The specific calculations are as follows:
calculating the average trip duration, which is the average time passenger $i$ spends on each trip; the calculation formula is as follows:
$$\bar{T}_i = \frac{1}{N_i}\sum_{k=1}^{N_i}\left(t^{d}_{i,k} - t^{o}_{i,k}\right)$$
calculating the average daily number of trips, which is the number of trips passenger $i$ makes per day; the calculation formula is as follows:
$$\bar{n}_i = \frac{N_i}{D}$$
calculating the travel time distribution, which is the ratio of the number of trips passenger $i$ makes in the morning peak (7:00-9:00), the evening peak (17:00-19:00) and the off-peak period to the total number of trips; taking the morning peak as an example, the calculation formula is as follows:
$$P^{\mathrm{am}}_i = \frac{1}{N_i}\sum_{k=1}^{N_i} I\left(7{:}00 \le t^{o}_{i,k} \le 9{:}00\right)$$
the travel OD distribution is counted as the three OD pairs the passenger travels most frequently;
the formula for calculating the first trip time is as follows:
calculating the last trip time, which is the time of the last trip of passenger $i$ within the statistical period and is used to judge the passenger's activity (i.e. the travel activity in the derived information); the calculation formula is as follows:
calculating the holiday travel time distribution, which is the ratio of the number of trips the passenger makes on holidays in the morning peak (7:00-9:00), the evening peak (17:00-19:00) and the off-peak period to the total number of holiday trips; taking the morning peak as an example and summing over the holiday trips of passenger $i$, the calculation formula is as follows:
$$P^{\mathrm{am},h}_i = \frac{1}{N^{h}_i}\sum_{k \in \text{holiday trips}} I\left(7{:}00 \le t^{o}_{i,k} \le 9{:}00\right)$$
the holiday travel OD distribution is counted as the three OD pairs the passenger travels most frequently on holidays;
in the above formulas, $k$ denotes the $k$-th trip, $d$ the exit station, $o$ the entry station, $i$ the passenger and $t$ the time; $t^{d}_{i,k}$ represents the exit time of the $k$-th trip of passenger $i$; $t^{o}_{i,k}$ represents the entry time of the $k$-th trip of passenger $i$; $\bar{T}_i$ represents the average trip duration of passenger $i$; $N_i$ represents the total number of historical trips of passenger $i$; $\bar{n}_i$ represents the average daily number of trips of passenger $i$; $D$ represents the total number of days of the passenger within the statistical period; $I(\cdot)$ is a binary indicator function that takes the value 1 when the condition is satisfied and 0 otherwise; $t^{\mathrm{first}}_i$ represents the first trip time of passenger $i$; $t^{\mathrm{last}}_i$ represents the last trip time of passenger $i$; and $N^{h}_i$ represents the total number of trips the passenger makes on holidays within the statistical period.
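The travel derived indexes described above (average trip duration, average daily number of trips and morning-peak share) can be sketched in Python as follows. The sketch assumes that each trip is an (entry time, exit time) pair and that a trip is assigned to the morning peak by its entry time; the function name `trip_indices` and the sample records are hypothetical.

```python
from datetime import datetime, time


def trip_indices(trips, total_days):
    """Compute three travel derived indexes for one passenger.

    trips: list of (entry_datetime, exit_datetime) pairs (assumed layout).
    total_days: number of days in the statistical period.
    Returns (average duration in minutes, average daily trips, morning-peak share).
    """
    n = len(trips)
    total_seconds = sum((t_exit - t_entry).total_seconds() for t_entry, t_exit in trips)
    avg_duration_min = total_seconds / n / 60
    daily_avg = n / total_days
    # Indicator function: 1 if the entry time falls in 7:00-9:00, else 0.
    in_morning_peak = sum(
        time(7, 0) <= t_entry.time() <= time(9, 0) for t_entry, _ in trips
    )
    return avg_duration_min, daily_avg, in_morning_peak / n


if __name__ == "__main__":
    sample = [
        (datetime(2018, 6, 6, 8, 0), datetime(2018, 6, 6, 8, 30)),
        (datetime(2018, 6, 6, 18, 0), datetime(2018, 6, 6, 18, 40)),
        (datetime(2018, 6, 7, 8, 10), datetime(2018, 6, 7, 8, 30)),
    ]
    print(trip_indices(sample, total_days=3))
```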
Calculating the participation frequency, which is how frequently passenger $i$ participates in value-added services; the calculation formula is as follows:
wherein $f_i$ represents the frequency with which passenger $i$ participates in value-added services, and $c_i$ represents the number of times passenger $i$ participates in value-added services;
the merchant type distribution is counted as the three merchant types in which the passenger participates most frequently;
the formula for calculating the average transaction amount is:
$$\bar{m}_i = \frac{M_i}{c_i}$$
wherein $\bar{m}_i$ is the average amount passenger $i$ spends per participation in value-added services, and $M_i$ is the total amount passenger $i$ spends on value-added services;
the payment mode distribution is counted as the three payment modes the passenger uses most frequently.
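The value-added service indexes that are fully described above (average transaction amount and the top-three merchant type and payment mode distributions) can be sketched in Python as follows. The transaction record layout, the function name `value_added_summary` and the sample values are hypothetical.

```python
from collections import Counter


def value_added_summary(transactions):
    """Summarise one passenger's value-added service transactions.

    transactions: list of dicts with keys "amount", "merchant_type" and
    "payment_mode" (assumed layout).
    """
    count = len(transactions)
    total = sum(t["amount"] for t in transactions)
    merchants = Counter(t["merchant_type"] for t in transactions)
    payments = Counter(t["payment_mode"] for t in transactions)
    return {
        "participations": count,
        "average_amount": total / count if count else 0.0,
        # Three most frequent merchant types and payment modes.
        "top_merchant_types": [m for m, _ in merchants.most_common(3)],
        "top_payment_modes": [p for p, _ in payments.most_common(3)],
    }


if __name__ == "__main__":
    tx = (
        [{"amount": 10.0, "merchant_type": "coffee", "payment_mode": "wechat"}] * 4
        + [{"amount": 20.0, "merchant_type": "books", "payment_mode": "alipay"}] * 3
        + [{"amount": 30.0, "merchant_type": "food", "payment_mode": "wechat"}] * 2
        + [{"amount": 40.0, "merchant_type": "flowers", "payment_mode": "card"}]
    )
    print(value_added_summary(tx))
```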
The travel demand type is determined by clustering the total number of trips, the first trip time and the average trip duration counted from passengers' station-entry card-swiping data. The clustering method is as follows:
the K-means algorithm is used to divide the passengers into categories according to their travel characteristics; the total number of historical trips $N_i$, the first trip time $t^{\mathrm{first}}_i$ and the average trip duration $\bar{T}_i$ of passenger $i$ in the passenger travel information index system are selected as the clustering indexes for passengers in a station; the number of clusters K is determined by the elbow method, whose key index is the within-cluster sum of squared errors (SSE), calculated as follows:
$$SSE = \sum_{k=1}^{K}\sum_{x \in C_k}\lVert x - \mu_k\rVert^{2}$$
wherein $C_k$ represents the $k$-th cluster, $x$ is a sample in $C_k$, and $\mu_k$ is the center point of $C_k$;
in this embodiment, the travel demand type is the result of clustering three indexes counted from passengers' station-entry card-swiping data, with the number of clusters determined from the SSE formula. For each class of passengers, the historical number of trips and the first trip time are analyzed; for example, if the number of historical trips is large relative to the number of days counted and the first trip time falls in the morning peak, the class can be regarded as commuting passengers. The interpretation depends on the specific clustering result.
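The K-means clustering and elbow method described above can be sketched with the Python standard library alone. This is a minimal illustration rather than the embodiment's implementation: it uses plain Lloyd iterations with random initial centers, assumes each passenger is a tuple of already-scaled features (total trips, first trip time, average trip duration), and leaves the choice of K to visual inspection of the SSE curve. The function names and the toy points are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain K-means on a list of tuples; returns (centers, SSE)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster (keep it if empty).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[c]
            for c, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def elbow_curve(points, k_values):
    """SSE for each candidate K; the elbow of this curve suggests K."""
    return {k: kmeans(points, k)[1] for k in k_values}


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(elbow_curve(pts, [1, 2, 3]))
```

Because trip counts, clock times and durations are on different scales, standardizing the three features before clustering is a reasonable precaution, although the text does not state whether this was done.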
Example: taking the passengers of a subway station as the research object, AFC data from three working days (June 6, 7 and 8, 2018) are selected as the basic data to analyze the working-day travel behavior of passengers at the station. After data screening, the number of passengers entering the station over the three working days is 197,328.
The passengers were classified into 5 categories in total by the K-means clustering method. Table 1 below is the cluster center points for the five classes.
TABLE 1
Analysis of the clustering results:
the first class accounts for 21.2% of passengers. Its center has 1.75 trips over the three days, the highest travel intensity of the five classes, a first trip time of 08:22:13 and an average trip duration of 27.7 min. The trips are fairly short and fall in the morning peak, so this class can be regarded as typical morning-peak commuters.
The second class accounts for 10.2% of passengers. Its center has 1.34 trips over the three days, a moderate travel intensity, a first trip time of 11:29:33 and an average trip duration of 48.1 min. The trips are long and the share is small, so this class can be regarded as tourists or long-distance travelers. POI data show many bus stations and railway stations near the station, in particular Beijing North Railway Station, which makes such trips convenient.
The third class accounts for 34.5% of passengers. Its center has 1.69 trips over the three days, a relatively high travel intensity, a first trip time of 17:39:14 and an average trip duration of 37.9 min. The trip distance is moderate compared with the other classes and the trips fall in the evening peak, so this class can be regarded as typical evening-peak commuters. It is also the largest of the five classes, which indicates that many passengers enter Xizhimen Station in the evening peak. POI data show many office areas near the station, which supports this interpretation.
The fourth class accounts for 17.2% of passengers. Its center has 1.22 trips over the three days, the lowest travel intensity, which indicates low loyalty to rail travel, a first trip time of 20:39:40 and an average trip duration of 37.1 min. The trip distance is moderate compared with the other classes and the trips are late in the day, so this class can be regarded as daily-life passengers. POI data show many shopping and catering merchants near the station, so these trips can be regarded as passengers going home after consumption.
The fifth class accounts for 17.1% of passengers. Its center has 1.25 trips over the three days, a low travel intensity, a first trip time of 14:05:07 and an average trip duration of 29.2 min. The trips are short. Like the fourth class, it has no distinctive features, and its share is very close to that of the fourth class, so these passengers can also be regarded as daily-life passengers.
Judging the residential-area station: with 12:00 noon as the dividing point on working days and 16:00 as the dividing point on holidays, the number of times the passenger enters and exits each station in the corresponding time intervals is counted, as shown in Table 2 below:
TABLE 2
The station serving a passenger's residential area is generally the origin station of the passenger's first trip of the day and the destination station of the last trip of the day, so the probability that station e is the residential-area station of passenger i is calculated as follows:
wherein the calculated value is the probability that station e is the residential-area station of passenger i;
judging the work-area station: the station serving a passenger's work area is typically the station the passenger travels to before 12:00 and departs from after 12:00 on a working day, so the probability that station e is the work-area station of passenger i is calculated as follows:
wherein the calculated value is the probability that station e is the work-area station of passenger i;
the value-added participation degree takes one of three levels, strong, medium or low: it is strong when the participation frequency is greater than 0.7, low when the participation frequency is less than 0.4, and medium when the participation frequency is between 0.4 and 0.7.
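The two rules above can be sketched in code. The residential-area probability formula itself is not reproduced in this text, so `home_station_probabilities` below is only one plausible reading, assumed for illustration: each day contributes one vote for the origin of the first trip and one for the destination of the last trip, and votes are normalized to sum to 1. The function names and the input layout are hypothetical. `participation_level` applies the 0.7 and 0.4 thresholds exactly as stated.

```python
from collections import Counter


def home_station_probabilities(days):
    """Assumed scoring: share of 'first origin' and 'last destination' votes.

    days: list of days; each day is a time-ordered list of
    (origin_station, destination_station) pairs for one passenger.
    """
    votes = Counter()
    total = 0
    for trips in days:
        if not trips:
            continue
        votes[trips[0][0]] += 1    # origin of the first trip of the day
        votes[trips[-1][1]] += 1   # destination of the last trip of the day
        total += 2
    return {station: n / total for station, n in votes.items()}


def participation_level(frequency):
    """Map a participation frequency to strong / medium / low."""
    if frequency > 0.7:
        return "strong"
    if frequency < 0.4:
        return "low"
    return "medium"
```

A work-area estimate could be built the same way by voting for destinations reached before 12:00 and origins used after 12:00 on working days.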
In addition, in the present embodiment, a specific format of a part of the index labels in the passenger travel information index system is shown in table 3 below:
TABLE 3
In the present embodiment, the index data in the passenger travel information index system should be updated continuously as the passenger travels. The update rules are as follows: basic information is updated only when the passenger modifies his or her personal information; service information (travel basic information, travel derivative information and value-added service information) is updated in real time with each trip and each use of a value-added service, and the service information held in Redis is synchronized to the database every month; derived information is recomputed from the basic information and the service information once a month; and the last trip time of each passenger is checked every half year: if more than half a year has passed between the passenger's last trip and the update time, the passenger is judged to be an inactive user and the passenger's travel information is deleted from the database.
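The half-yearly inactivity check can be illustrated with a short sketch. The 182-day cutoff, the function names and the in-memory dictionary standing in for the database are all assumptions made for the example.

```python
from datetime import datetime, timedelta

HALF_YEAR = timedelta(days=182)  # assumed length of "half a year"


def is_inactive(last_trip, checked_at, limit=HALF_YEAR):
    """True when the last trip is more than `limit` before the check time."""
    return checked_at - last_trip > limit


def purge_inactive(last_trips, checked_at):
    """Keep only passengers who are still active.

    last_trips: dict mapping passenger id -> datetime of the last trip.
    """
    return {pid: t for pid, t in last_trips.items()
            if not is_inactive(t, checked_at)}
```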
According to an embodiment of the present invention, in step a, after the passenger travel data is acquired, the method further includes preprocessing the travel data, which specifically includes:
redundant data processing: repeated card swipes by a passenger or equipment faults may produce duplicate records, which need to be deleted;
erroneous data processing: passenger behavior and equipment faults may produce abnormal data. Abnormal data are identified by three criteria: first, a passenger's entry time must be earlier than the exit time; second, a passenger's stay in the rail transit system must be less than 4 hours; third, the number of times a passenger enters the same station within one day is checked, because station staff enter and exit many times a day and their records should be excluded from the statistics.
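A minimal sketch of these preprocessing rules follows. The record layout (dictionary keys) and the staff threshold of six entries per station per day are hypothetical; the text gives only the 4-hour limit and the entry-before-exit rule as concrete values.

```python
from collections import Counter
from datetime import timedelta

MAX_STAY = timedelta(hours=4)
STAFF_ENTRY_LIMIT = 6  # hypothetical cutoff; the text gives no number


def clean_trips(trips):
    """Apply the duplicate, validity and staff filters to trip records.

    trips: list of dicts with keys card_id, entry_station, exit_station,
    entry_time and exit_time (datetimes).
    """
    # 1. Redundant data: drop exact duplicates, keeping the first copy.
    seen, unique = set(), []
    for t in trips:
        key = (t["card_id"], t["entry_station"], t["exit_station"],
               t["entry_time"], t["exit_time"])
        if key not in seen:
            seen.add(key)
            unique.append(t)

    # 2. Erroneous data: entry before exit, stay shorter than 4 hours.
    valid = [t for t in unique
             if t["entry_time"] < t["exit_time"]
             and t["exit_time"] - t["entry_time"] < MAX_STAY]

    # 3. Staff: too many entries at one station on one day.
    entries = Counter((t["card_id"], t["entry_station"],
                       t["entry_time"].date()) for t in valid)
    staff = {card for (card, _, _), n in entries.items()
             if n > STAFF_ENTRY_LIMIT}
    return [t for t in valid if t["card_id"] not in staff]
```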
The travel demand type, a tertiary indicator in the passenger travel information index system, contains the following passenger categories:
commuting passengers: because of work requirements, their travel times and travel frequency are relatively fixed;
touring passengers: their travel times and travel frequency fluctuate strongly, their travel frequency over a short period is high, and their travel ODs are widely distributed;
leisure and entertainment passengers: their trips are concentrated on weekends and in the off-peak periods of each day;
special passengers: for example the elderly, the disabled and pregnant women, who often need assistance while traveling; this information must be provided by the passenger when registering the APP account;
other passengers: passengers who do not fall into the four types above; their travel times and travel frequency are not fixed and their travel purposes vary.
According to an embodiment of the present invention, in step b, estimating the return passenger flow in different time periods at the station, based on the passenger travel information index system and part of the calculated index data, includes:
counting $R_{s,v,t}$ from the entry and exit times, the residential-area station, the work-area station and the travel demand type in the passenger travel information index system, wherein $s$ is a station, $v$ is the day of the week, taking values 1 to 7 for Monday to Sunday, $t$ is a time period, and $R_{s,v,t}$ is the number of passengers returning from station $s$ within time period $t$ on weekday $v$;
selecting historical exit and return passenger flow data at station $s$ and obtaining, by averaging over weeks, the conditional probability distribution that a passenger who arrives at station $s$ in time period $a$ on weekday $v$ returns by departing from station $s$ in time period $b$. In the corresponding formula, the average is taken over the total number of weeks, $a$ and $b$ denote time periods, the count for the $j$-th week is the number of passengers alighting at station $s$ in the given time period on weekday $v$ of that week, and the two time variables denote the entry time and the exit time;
the return flow is then obtained from this distribution and the number of passengers alighting at station $s$ in each time period of weekday $v$, wherein $H$ is the maximum interval between a passenger's exit and return entry at station $s$, at most 24 hours, $h$ is the time-slot resolution, and $t+1$ denotes the time period following $t$.
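One way to read the two steps above is sketched below. The exact formulas are not reproduced in this text, so the sketch assumes that the conditional probability is the week-averaged ratio of returns to exits, and that the expected return flow in period b is the sum over earlier exit periods a, within the maximum gap H/h, of exits times that probability. All names and the input layout are hypothetical.

```python
def conditional_return_prob(history):
    """Week-averaged probability of returning in period b after exiting in a.

    history: one dict per historical week, for a fixed station and weekday.
    Each dict maps an exit period a to (exits, {b: returns_in_b}).
    Weeks with zero exits in period a are skipped for that a.
    """
    sums, weeks_seen = {}, {}
    for week in history:
        for a, (exits, returns) in week.items():
            if exits == 0:
                continue
            weeks_seen[a] = weeks_seen.get(a, 0) + 1
            for b, n in returns.items():
                sums[(a, b)] = sums.get((a, b), 0.0) + n / exits
    return {(a, b): total / weeks_seen[a] for (a, b), total in sums.items()}


def expected_return_flow(exits_today, prob, b, max_gap):
    """Expected number of returning entries in period b.

    exits_today: dict mapping exit period a -> passengers alighting in a.
    max_gap: largest allowed b - a in time slots (H divided by h).
    """
    return sum(n * prob.get((a, b), 0.0)
               for a, n in exits_today.items()
               if 0 < b - a <= max_gap)
```

With a 24-hour maximum interval and 20-minute slots, for instance, `max_gap` would be 72; the slot length here is an illustrative choice, not a value stated for this step.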
According to an embodiment of the present invention, in step c, adding the return passenger flow at the station to the passenger flow prediction model as a covariate and predicting the station's inbound passenger flow includes:
adding the estimated return passenger flow to a seasonal autoregressive integrated moving average model (S-ARIMA model) to predict the inbound passenger flow of the station;
the S-ARIMA model is written ARIMA(p, d, q)(P, D, Q)[ω], where p, d and q are the orders of the non-seasonal autoregressive, differencing and moving-average parts, P, D and Q are the corresponding orders of the seasonal part, and ω is the seasonal period:
$$\phi_p(B)\,\Phi_P(B^{\omega})\,(1-B)^{d}\,(1-B^{\omega})^{D}\,y_t = \theta_q(B)\,\Theta_Q(B^{\omega})\,\varepsilon_t$$
wherein $B$ is the backshift operator, defined by $B y_t = y_{t-1}$, and $\phi_p(B) = 1 - \sum_{i=1}^{p} \phi_i B^{i}$, $\Phi_P(B^{\omega}) = 1 - \sum_{i=1}^{P} \Phi_i B^{i\omega}$, $\theta_q(B) = 1 + \sum_{i=1}^{q} \theta_i B^{i}$, $\Theta_Q(B^{\omega}) = 1 + \sum_{i=1}^{Q} \Theta_i B^{i\omega}$, in which $\phi_i$, $\Phi_i$, $\theta_i$ and $\Theta_i$ are the coefficients to be estimated, $\varepsilon_t$ is a white-noise error term following a normal distribution with mean 0 and variance $\sigma^2$, and $t$ denotes the $t$-th time period;
when the return passenger flow $x_t$ acts as a covariate, the inbound passenger flow $y_t$ and the return passenger flow $x_t$ satisfy the following relationship:
$$y_t = \beta x_t + n_t$$
wherein $\beta$ is the regression coefficient; $x_t$ is the series formed by the return passenger flow on weekday $v$ at the known station $s$ over time periods 1 to $t$; and $n_t$ obeys the ARIMA(p, d, q)(P, D, Q)[ω] model and represents the part of the total inbound passenger flow other than the return passenger flow. From the station's historical $y_t$ and $x_t$, $\beta$ and $n_t$ are calculated. Once these are obtained, $n_{t+1}$ is predicted from the S-ARIMA formula, and $x_{t+1}$ is obtained from the estimated return passenger flow, wherein $x_{t+1}$ is the return passenger flow on weekday $v$ at the known station $s$ for time period $t+1$. Because $n_t$ obeys the ARIMA(p, d, q)(P, D, Q)[ω] model, the model can predict $n_{t+1}$ for period $t+1$, while $\beta$ is as estimated above. Substituting $x_{t+1}$ and $n_{t+1}$ into the relationship gives the predicted inbound passenger flow $\hat{y}_{t+1} = \beta x_{t+1} + n_{t+1}$ for period $t+1$.
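The forecasting steps can be sketched with a deliberately simplified stand-in. The code below estimates the regression coefficient by ordinary least squares without an intercept and forecasts the remaining component with a seasonal-naive rule (the residual one season earlier) in place of a fitted S-ARIMA model; both simplifications, and all names, are assumptions made so the example stays dependency-free.

```python
def fit_beta(y, x):
    """Least-squares slope of y on x with no intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def forecast_next(y, x, x_next, period):
    """One-step forecast of inbound flow using the return-flow covariate.

    y: historical inbound flow, x: historical return flow (same length),
    x_next: estimated return flow for the next period,
    period: seasonal period in time slots.
    """
    beta = fit_beta(y, x)
    # Flow not explained by the covariate (the ARIMA-distributed part).
    resid = [yi - beta * xi for yi, xi in zip(y, x)]
    # Seasonal-naive stand-in for the S-ARIMA forecast of the residual.
    n_next = resid[len(resid) - period]
    return beta * x_next + n_next
```

In practice a library model would replace the stand-in and estimate the coefficient and the ARIMA part jointly; for example, the `SARIMAX` class in statsmodels accepts the covariate through its `exog` argument together with `order` and `seasonal_order`.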
In the present embodiment, for example, the model parameters are set to (2, 0, 1)(1, 1, 0)[72]. The experimental results are shown in Table 4 below, where model M0 does not include the return passenger flow and model M1 adds it as a covariate. With the covariate, the RMSE falls by 9.87 on the training set and by 9.02 on the test set, and the SMAPE falls by 0.64% on the training set and by 0.16% on the test set, so the prediction is more accurate after the new variable is added.
TABLE 4
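For reference, the two error measures reported in Table 4 can be computed as in the sketch below. The text does not state which SMAPE variant was used; the version here, the mean of 2|a - p| / (|a| + |p|) expressed as a percentage, is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        # A pair of zeros counts as a perfect prediction.
        terms.append(0.0 if denom == 0 else 2 * abs(a - p) / denom)
    return 100 * sum(terms) / len(terms)
```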
According to the above scheme, the invention builds on smart-subway construction: it associates, fuses and brings in multi-source subway-related data, establishes passenger travel information, and explores its application to passenger flow prediction. The invention mines passengers' travel patterns from their multi-source travel data, establishes on that basis a three-level passenger travel information index system, and carries out the statistics and calculation of each indicator. It also provides a prediction method that identifies the return passenger flow from the passenger travel information and uses the return passenger flow to improve the accuracy of station inbound passenger flow prediction.
Further, in order to achieve the above object, the present invention also provides a rail transit passenger flow prediction system based on passenger travel information. A block diagram of the system structure is shown in Fig. 2, and the system specifically includes:
the index acquisition module is used for acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the passenger travel information index system;
the return passenger flow calculation module is used for estimating the return passenger flow at different time intervals in the station based on part of index data calculated in the passenger travel information index system;
and the passenger flow prediction module is used for adding the return passenger flow of passengers in the station into the passenger flow prediction model as a covariate to predict the station entering passenger flow.
According to one embodiment of the invention, in the index acquisition module, when a passenger enters a station to ride, the passenger's travel data can be acquired from the passenger's card-swipe record and/or code-scan record; using the Python language in the PyCharm software, the passenger travel information database is connected, the data are retrieved, and the corresponding statistical analysis is performed. In this embodiment, the travel data include: AFC card-swipe records and APP code-scan records, used to obtain passengers' entry and exit times and entry and exit station information; APP registration data, used to obtain passengers' identity information and associated information; APP value-added consumption data, used to obtain passengers' value-added service information; and POI data near each station, associated with the station and used to describe its geographic attributes. In this embodiment, travel data are counted from the date on which the passenger registered the APP; the POI data near a station cover the land-use types within a radius of 500 meters centered on the station and are obtained from the third-party map software Amap (Gaode Map).
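The 500-meter POI selection can be illustrated with a great-circle distance filter. This is a sketch under assumptions: the haversine formula with a mean Earth radius stands in for whatever distance computation the map service performs, and the tuple layout of the POI records is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pois_near(station, pois, radius_m=500.0):
    """Return the POIs within radius_m of the station.

    station: (lat, lon); pois: list of (name, land_use_type, lat, lon).
    """
    lat, lon = station
    return [p for p in pois
            if haversine_m(lat, lon, p[2], p[3]) <= radius_m]
```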
In this embodiment, the passenger travel information index system is a way of representing passenger information and has three primary indicators: basic information, service information and derived information. The basic information comprises two secondary indicators: identity information and associated information. The identity information comprises the tertiary indicators APP ID, sex, age and whether the passenger is disabled; the associated information comprises the tertiary indicators third-party payment mode and city card.
The service information comprises three secondary indicators: travel basic information, travel derivative information and value-added service information. The travel basic information comprises the tertiary indicators entry and exit times and entry and exit station information. The travel derivative information comprises the tertiary indicators average trip duration, total number of trips, daily average number of trips, travel time distribution, travel OD distribution, travel path distribution, first trip time, last trip time, holiday travel time distribution and holiday travel OD distribution. The value-added service information comprises the tertiary indicators number of participations, participation frequency, average transaction amount, payment-mode distribution, merchant-type distribution and last participation time.
The derived information comprises two secondary indicators: an activity attribute and a functional attribute. The activity attribute comprises the tertiary indicator travel activity; the functional attribute comprises the tertiary indicators travel demand type, residential-area station, work-area station and value-added participation degree.
Further, in the present embodiment, counting and calculating each indicator in the passenger travel information index system includes: counting all indicators in the passenger's identity information, associated information and travel basic information; counting the total number of trips, travel OD distribution, holiday travel OD distribution and travel path distribution in the travel derivative information; and counting the number of participations, merchant-type distribution, payment-mode distribution and last participation time in the value-added service information. The invention places no special requirement on the counting method, as long as the relevant indicator information can be acquired and stored in summarized form. These indicators are data generated directly by passenger actions such as registration and travel and can be obtained without calculation, so they only need to be summarized.
Further, in this embodiment, counting and calculating each indicator in the passenger travel information index system also includes calculating the remaining tertiary indicators, specifically:
calculating the average trip duration, which is the average time passenger $i$ spends on each trip:
$$\bar{T}_i = \frac{1}{N_i} \sum_{n=1}^{N_i} \left( t^{d}_{i,n} - t^{o}_{i,n} \right)$$
calculating the daily average number of trips, which is the number of trips passenger $i$ makes per day:
$$\bar{N}_i = \frac{N_i}{D}$$
wherein $\bar{N}_i$ is the daily average number of trips of passenger $i$ and $D$ is the total number of days in the statistical period;
calculating the travel time distribution, which is the ratio of the number of trips passenger $i$ makes in the morning peak (7:00-9:00), the evening peak (17:00-19:00) and the off-peak period to the total number of trips; taking the morning peak as an example:
$$P^{\mathrm{am}}_i = \frac{1}{N_i} \sum_{n=1}^{N_i} I\left( 7{:}00 \le t^{o}_{i,n} < 9{:}00 \right)$$
counting the travel OD distribution, that is, the passenger's three most frequent travel ODs;
calculating the first trip time:
$$t^{\mathrm{first}}_i = \min_{n} t^{o}_{i,n}$$
calculating the last trip time, which is the time of the last trip of passenger $i$ within the statistical period and is used to judge the passenger's activity (the travel activity in the derived information):
$$t^{\mathrm{last}}_i = \max_{n} t^{o}_{i,n}$$
calculating the holiday travel time distribution, which is the ratio of the number of holiday trips the passenger makes in the morning peak (7:00-9:00), the evening peak (17:00-19:00) and the off-peak period to the total number of holiday trips; taking the morning peak as an example, with the sum running over the passenger's holiday trips:
$$P^{\mathrm{am,hol}}_i = \frac{1}{N^{\mathrm{hol}}_i} \sum_{n \in \mathrm{holiday}} I\left( 7{:}00 \le t^{o}_{i,n} < 9{:}00 \right)$$
counting the holiday travel OD distribution, that is, the passenger's three most frequent travel ODs on holidays;
in the above formulas, $n$ denotes the $n$-th trip, $d$ the exit station, $o$ the entry station, $i$ the passenger and $t$ time; $t^{d}_{i,n}$ is the exit time of passenger $i$'s $n$-th trip and $t^{o}_{i,n}$ is its entry time; $\bar{T}_i$ is the average trip duration of passenger $i$; $N_i$ is the total number of historical trips of passenger $i$; $\bar{N}_i$ is the daily average number of trips of passenger $i$; $D$ is the total number of days in the statistical period; $I(\cdot)$ is a binary indicator function equal to 1 when its condition is satisfied and 0 otherwise; $t^{\mathrm{first}}_i$ is the first trip time of passenger $i$; $t^{\mathrm{last}}_i$ is the last trip time of passenger $i$; and $N^{\mathrm{hol}}_i$ is the total number of trips passenger $i$ makes on holidays within the statistical period.
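The travel-derivative indicators can be computed from a passenger's trip list as in the following sketch. The function name, the input layout and the choice to classify a trip by its entry time are assumptions; the first and last trip times are taken here as the earliest and latest entry timestamps.

```python
from datetime import time

AM_PEAK = (time(7, 0), time(9, 0))
PM_PEAK = (time(17, 0), time(19, 0))


def in_window(ts, window):
    """True when the clock time of ts falls in [start, end)."""
    start, end = window
    return start <= ts.time() < end


def travel_indicators(trips, total_days):
    """Compute per-passenger travel-derivative indicators.

    trips: non-empty list of (entry_time, exit_time) datetime pairs.
    total_days: number of days in the statistical period (D).
    """
    n = len(trips)
    minutes = [(out - ent).total_seconds() / 60 for ent, out in trips]
    am = sum(in_window(ent, AM_PEAK) for ent, _ in trips)
    pm = sum(in_window(ent, PM_PEAK) for ent, _ in trips)
    return {
        "avg_duration_min": sum(minutes) / n,
        "daily_avg_trips": n / total_days,
        "am_peak_share": am / n,
        "pm_peak_share": pm / n,
        "off_peak_share": (n - am - pm) / n,
        "first_trip": min(ent for ent, _ in trips),
        "last_trip": max(ent for ent, _ in trips),
    }
```

The holiday variants follow by passing only the holiday trips and the number of holidays.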
Calculating the participation frequency, which measures how often passenger $i$ participates in value-added services; in the formula, the result is the frequency with which passenger $i$ participates in value-added services, and $M_i$ is the number of times passenger $i$ has participated in value-added services;
counting the merchant-type distribution, that is, the three merchant types in which the passenger participates most often;
calculating the average transaction amount:
$$\bar{A}_i = \frac{A_i}{M_i}$$
wherein $\bar{A}_i$ is the average amount passenger $i$ spends per participation in value-added services, $A_i$ is the total amount passenger $i$ has spent on value-added services, and $M_i$ is the number of times passenger $i$ has participated;
counting the payment-mode distribution, that is, the three payment modes the passenger uses most often.
The travel demand type is determined by clustering three indicators computed from the passengers' station-entry card-swipe data: the total number of trips, the first trip time and the average trip duration. The clustering method is as follows:
passengers are divided into categories according to their travel characteristics using the K-means algorithm. From the passenger travel information index system, the total number of historical trips $N_i$, the first trip time $t^{\mathrm{first}}_i$ and the average trip duration $\bar{T}_i$ of passenger $i$ are selected as the clustering indicators for passengers at a station. The number of clusters $K$ is determined by the elbow method, whose key quantity is the within-cluster sum of squared errors (SSE), calculated as follows:
$$\mathrm{SSE} = \sum_{k=1}^{K} \sum_{p \in C_k} \lVert p - m_k \rVert^2$$
wherein $C_k$ denotes the $k$-th cluster, $p$ is a sample point in $C_k$, and $m_k$ is the center point of $C_k$;
in this embodiment, the travel demand type is the result of clustering three indicators computed from station-entry card-swipe data, and the number of clusters is chosen using the SSE formula. Each resulting class is interpreted from its number of historical trips and its first trip time. For example, if the number of historical trips is a large proportion of the days in the statistical period and the first trip time falls in the morning peak, the class can be regarded as commuting passengers. The interpretation depends on the specific clustering result.
Example: taking the passengers of one subway station as the research object, AFC data from three working days, June 6, 7 and 8, 2018, were selected as the basic data to analyze the working-day travel behavior of passengers at the station. After data screening, the number of people entering the station over the three working days is 197,328.
The passengers were divided into five classes by the K-means clustering method.
The clustering results in Table 1 above are analyzed as follows:
the first class accounts for 21.2% of passengers. Its center has 1.75 trips over the three days, the highest travel intensity of the five classes, a first trip time of 08:22:13 and an average trip duration of 27.7 min. The trips are fairly short and fall in the morning peak, so this class can be regarded as typical morning-peak commuters.
The second class accounts for 10.2% of passengers. Its center has 1.34 trips over the three days, a moderate travel intensity, a first trip time of 11:29:33 and an average trip duration of 48.1 min. The trips are long and the share is small, so this class can be regarded as tourists or long-distance travelers. POI data show many bus stations and railway stations near the station, in particular Beijing North Railway Station, which makes such trips convenient.
The third class accounts for 34.5% of passengers. Its center has 1.69 trips over the three days, a relatively high travel intensity, a first trip time of 17:39:14 and an average trip duration of 37.9 min. The trip distance is moderate compared with the other classes and the trips fall in the evening peak, so this class can be regarded as typical evening-peak commuters. It is also the largest of the five classes, which indicates that many passengers enter Xizhimen Station in the evening peak. POI data show many office areas near the station, which supports this interpretation.
The fourth class accounts for 17.2% of passengers. Its center has 1.22 trips over the three days, the lowest travel intensity, which indicates low loyalty to rail travel, a first trip time of 20:39:40 and an average trip duration of 37.1 min. The trip distance is moderate compared with the other classes and the trips are late in the day, so this class can be regarded as daily-life passengers. POI data show many shopping and catering merchants near the station, so these trips can be regarded as passengers going home after consumption.
The fifth class accounts for 17.1% of passengers. Its center has 1.25 trips over the three days, a low travel intensity, a first trip time of 14:05:07 and an average trip duration of 29.2 min. The trips are short. Like the fourth class, it has no distinctive features, and its share is very close to that of the fourth class, so these passengers can also be regarded as daily-life passengers.
Judging the residential-area station: with 12:00 noon as the dividing point on working days and 16:00 as the dividing point on holidays, the number of times the passenger enters and exits each station in the corresponding time intervals is counted, as shown in Table 2.
The station serving a passenger's residential area is generally the origin station of the passenger's first trip of the day and the destination station of the last trip of the day, so the probability that station e is the residential-area station of passenger i is calculated as follows:
wherein the calculated value is the probability that station e is the residential-area station of passenger i;
judging the work-area station: the station serving a passenger's work area is typically the station the passenger travels to before 12:00 and departs from after 12:00 on a working day, so the probability that station e is the work-area station of passenger i is calculated as follows:
wherein the calculated value is the probability that station e is the work-area station of passenger i;
the value-added participation degree takes one of three levels, strong, medium or low: it is strong when the participation frequency is greater than 0.7, low when the participation frequency is less than 0.4, and medium when the participation frequency is between 0.4 and 0.7.
In the present embodiment, the specific format of the part of the index labels in the passenger travel information index system is as shown in table 3 above.
In the present embodiment, the index data in the passenger travel information index system should be updated continuously as the passenger travels. The update rules are as follows: basic information is updated only when the passenger modifies his or her personal information; service information (travel basic information, travel derivative information and value-added service information) is updated in real time with each trip and each use of a value-added service, and the service information held in Redis is synchronized to the database every month; derived information is recomputed from the basic information and the service information once a month; and the last trip time of each passenger is checked every half year: if more than half a year has passed between the passenger's last trip and the update time, the passenger is judged to be an inactive user and the passenger's travel information is deleted from the database.
According to an embodiment of the present invention, in the index acquisition module, after the passenger travel data is acquired, the travel data is also preprocessed, which specifically includes:
redundant data processing: repeated card swipes by a passenger or equipment faults may produce duplicate records, which need to be deleted;
erroneous data processing: passenger behavior and equipment faults may produce abnormal data. Abnormal data are identified by three criteria: first, a passenger's entry time must be earlier than the exit time; second, a passenger's stay in the rail transit system must be less than 4 hours; third, the number of times a passenger enters the same station within one day is checked, because station staff enter and exit many times a day and their records should be excluded from the statistics.
The travel demand type, a tertiary indicator in the passenger travel information index system, contains the following passenger categories:
commuting passengers: because of work requirements, their travel times and travel frequency are relatively fixed;
touring passengers: their travel times and travel frequency fluctuate strongly, their travel frequency over a short period is high, and their travel ODs are widely distributed;
leisure and entertainment passengers: their trips are concentrated on weekends and in the off-peak periods of each day;
special passengers: for example the elderly, the disabled and pregnant women, who often need assistance while traveling; this information must be provided by the passenger when registering the APP account;
other passengers: passengers who do not fall into the four types above; their travel times and travel frequency are not fixed and their travel purposes vary.
According to an embodiment of the present invention, in the return passenger flow calculation module, estimating the return passenger flow in different time periods at a station based on part of the index data calculated in the passenger travel information index system includes:
counting, according to the inbound and outbound times, the residential area station, the working area station and the travel demand type in the passenger travel information index system, the return passenger flow, i.e. the number of passengers who return from station s within time period t on day v of the week, where s is a station, v is the day of the week (1 to 7, representing Monday to Sunday) and t is a time period;
selecting the historical outbound and return passenger flow data of station s, and obtaining by averaging, for day v of the week, the conditional probability distribution that passengers who arrive at station s in one time period depart from station s on their return trip in a later time period; the calculation formula is as follows:
in the formula, the symbols denote, in order: the total number of weeks; time period a; time period b; the number of passengers alighting at station s in time period a on day v of the j-th week; the inbound time; and the outbound time;
where the remaining symbol denotes the number of passengers alighting at station s at the given time on day v; H is the maximum interval between a passenger's alighting and boarding times at station s, taken as 24 hours; h is the time-slot resolution; and t+1 is the time period following time period t.
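Because the formulas themselves are not reproduced in this text, the following Python sketch shows only one plausible reading of the estimation, under stated assumptions: the conditional probability for a pair of time periods (a, b) is taken as the mean, over historical weeks, of the share of passengers alighting at station s in period a who re-enter station s in period b; the return flow for period t+1 is then the sum of today's alighting counts weighted by those probabilities, limited to the maximum interval. The data layout and all function names are hypothetical.

```python
from collections import defaultdict


def return_probability(weeks):
    """Mean over weeks of returns[(a, b)] / alight[a].

    weeks: one dict per historical week (same weekday v, same station s):
        {"alight": {a: count}, "returns": {(a, b): count}}
    A week with no alighting in period a contributes zero to the mean.
    """
    sums = defaultdict(float)
    for week in weeks:
        for (a, b), n in week["returns"].items():
            alight = week["alight"].get(a, 0)
            if alight > 0:
                sums[(a, b)] += n / alight
    return {key: total / len(weeks) for key, total in sums.items()}


def estimate_return_flow(alight_today, prob, t_next, max_slots):
    """Expected number of returning passengers entering in period t_next.

    alight_today: {a: passengers who alighted at the station in period a}
    max_slots: maximum interval in slots, i.e. H / h in the text's terms.
    """
    total = 0.0
    for a, n in alight_today.items():
        if 0 < t_next - a <= max_slots:
            total += n * prob.get((a, t_next), 0.0)
    return total
```

With 30-minute slots and a 24-hour maximum interval, `max_slots` would be 48; both values are illustrative.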
According to an embodiment of the present invention, in the passenger flow prediction module, adding the return passenger flow of passengers at the station as a covariate to the passenger flow prediction model and predicting the station's inbound passenger flow includes:
adding the estimated return passenger flow to the commonly used seasonal autoregressive moving average model (S-ARIMA model) to predict the inbound passenger flow;
the S-ARIMA model is ARIMA(p, d, q)(P, D, Q)[Ω], where p, d and q are the orders of autoregression, differencing and moving average of the non-seasonal part, P, D and Q are the corresponding orders of the seasonal part, and Ω is the number of periods per season;
where B is the backshift (lag) operator and the four polynomials in B are the autoregressive and moving average operators of the non-seasonal and seasonal parts, whose coefficients are to be estimated; the error term is white noise obeying a normal distribution with mean 0 and constant variance; and the time index denotes a time period;
when the return passenger flow is used as a covariate, the inbound passenger flow and the return passenger flow satisfy the following relationship:
where the regression coefficient multiplies the return passenger flow series, which is obtained from the estimated return passenger flow by fixing the day of the week v and the station s and letting the time run over 1, ..., t; the error series obeys the ARIMA(p, d, q)(P, D, Q)[Ω] model and represents the part of the total inbound passenger flow other than the return passenger flow. The regression coefficient and the error series are calculated from the station's historical inbound passenger flow and return passenger flow; the return passenger flow for time t+1 is then predicted with the return passenger flow formula, being obtained from the estimated return passenger flow by fixing v and s and taking the time as t+1. Because the error series obeys the ARIMA(p, d, q)(P, D, Q)[Ω] model, its value for time period t+1 can be predicted with that model, while the regression coefficient is the one calculated above. Substituting the predicted return passenger flow and the predicted error term into the relationship gives the predicted inbound passenger flow at time t+1.
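As a non-limiting illustration of the two-step procedure, the sketch below assumes the relationship has the usual regression-with-time-series-errors form, inbound = beta * return + eta, with no intercept; the formula itself is not reproduced in this text, so this form is an assumption. The coefficient beta is fitted by least squares through the origin, and a seasonal-naive forecast is used as a stand-in for the S-ARIMA forecast of the error term, purely to keep the sketch free of third-party dependencies. It is not the S-ARIMA model itself, and all names are hypothetical.

```python
def fit_beta(inbound, returns):
    """Least-squares slope through the origin: sum(x*y) / sum(x*x)."""
    sxx = sum(x * x for x in returns)
    if sxx == 0:
        raise ValueError("return flow series is all zeros")
    return sum(x * y for x, y in zip(returns, inbound)) / sxx


def seasonal_naive(series, season):
    """Stand-in forecaster: the next value equals the value one season ago."""
    if len(series) < season:
        raise ValueError("series shorter than one season")
    return series[-season]


def predict_inbound(inbound, returns, next_return, season):
    """Predict inbound flow for t+1 from history and the t+1 return flow."""
    if len(inbound) != len(returns):
        raise ValueError("series must have equal length")
    beta = fit_beta(inbound, returns)
    # Error series: inbound flow not explained by the return flow.
    residuals = [y - beta * x for x, y in zip(returns, inbound)]
    eta_next = seasonal_naive(residuals, season)
    return beta * next_return + eta_next
```

In a full implementation the error series would instead be forecast by a fitted seasonal ARIMA model, and the regression coefficient would typically be estimated jointly with the ARIMA parameters.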
In the present embodiment, for example, the model parameters are selected as (2, 0, 1)(1, 1, 0)[72]. The results are shown in Table 4 above, where the M0 model does not include the return passenger flow and the M1 model includes it as a covariate. After the covariate is added, the RMSE of the training set is reduced by 9.87, the RMSE of the test set by 9.02, the SMAPE of the training set by 0.64% and the SMAPE of the test set by 0.16%, so the prediction is more accurate.
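For reference, the two error measures named above can be computed as in the following sketch. RMSE is unambiguous; SMAPE has several variants in the literature, and the text does not say which one was used, so the symmetric form with denominator |actual| + |predicted| is an assumption here.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    terms = []
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        # A pair of zeros is counted as a perfect prediction.
        terms.append(0.0 if denom == 0 else 2 * abs(a - p) / denom)
    return 100 * sum(terms) / len(terms)
```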
Further, to achieve the above object, the present invention also provides an electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the above rail transit passenger flow prediction method based on passenger travel information.
In order to achieve the above object, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the above method for predicting rail transit passenger flow based on passenger travel information.
According to the above scheme, the method provided by the invention builds on smart subway construction: it effectively associates and fuses multi-source subway-related data, establishes passenger travel information, and explores the application of that information to passenger flow prediction. The invention mines passengers' travel patterns from their multi-source travel data and, on this basis, establishes a three-level passenger travel information index system and statistically calculates each index. It also provides a prediction method that identifies the return passenger flow from the passenger travel information and uses the return passenger flow to effectively improve the accuracy of station inbound passenger flow prediction.
Those of ordinary skill in the art will appreciate that the modules and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and devices may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, each functional module in the embodiments of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.
It should be understood that the order of execution of the steps in the summary of the invention and the embodiments of the present invention does not absolutely imply any order of execution, and the order of execution of the steps should be determined by their functions and inherent logic, and should not be construed as limiting the process of the embodiments of the present invention.
Claims (9)
1. A rail transit passenger flow prediction method based on passenger travel information, characterized by comprising the following steps:
acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and statistically calculating each index in the passenger travel information index system;
estimating the return passenger flow in different time periods at a station based on part of the index data calculated in the passenger travel information index system;
adding the return passenger flow of passengers at the station as a covariate to a seasonal autoregressive moving average model, and predicting the station's inbound passenger flow;
the passenger trip data includes:
AFC card swiping record and APP code scanning record for acquiring the passenger entrance and exit time and entrance and exit station information;
APP registration data used for obtaining identity information and associated information of passengers;
APP value-added consumption data used for obtaining value-added service information of passengers;
POI data near the station, associated with the station, for describing the geographic attributes of the station;
the index information in the passenger travel information index system comprises the following steps: basic information, business information and derived information;
the basic information comprises identity information and associated information, the identity information comprises an APPID, gender and age of a passenger and whether the passenger is disabled, and the associated information comprises a third-party payment mode of the passenger and a city all-purpose card;
the service information comprises trip basic information, trip derived information and value added service information, wherein the trip basic information comprises the trip in and out time and the trip in and out station information of passengers, the trip derived information comprises average trip duration, total trip times, daily average trip times, trip time distribution, trip OD distribution, trip path distribution, first trip time, last trip time, holiday trip time distribution and holiday trip OD distribution, and the value added service information comprises value added service participation times, participation frequency, average transaction amount, payment mode distribution, merchant type distribution and last participation time;
the derived information comprises an active attribute and a functional attribute, wherein the active attribute comprises travel liveness, and the functional attribute comprises a travel demand type of a passenger, a residence area site, a working area site and value-added participation;
the partial index data includes the in-and-out time, the residential zone site, the work zone site, and the travel demand type.
2. The passenger travel information-based rail transit passenger flow prediction method according to claim 1, wherein the formula for calculating the average travel time length is as follows:
the formula for calculating the average daily trip times is as follows:
the formula for calculating the travel time distribution is as follows:
the travel OD distribution is counted as the passenger's three most frequent travel ODs;
the formula for calculating the first trip time is as follows:
the formula for calculating the last travel time is as follows:
the formula for calculating the travel time distribution of the holidays is as follows:
the holiday travel OD distribution is counted as the passenger's three most frequent travel ODs on holidays;
in the above formulas, the symbols denote: the ordinal number of a trip; d, the outbound station; o, the inbound station; i, the passenger; t, the time; the outbound time of passenger i for the given trip at time t; the inbound time of passenger i for the given trip at time t; the average travel duration of passenger i; the total number of historical trips of passenger i; the average daily number of trips of passenger i; D, the total number of days within the statistical period; a binary indicator function that equals 1 when its condition is satisfied and 0 otherwise; the first travel time of passenger i; the last travel time of passenger i; and the total number of trips made by the passenger on holidays within the statistical period.
3. The passenger travel information-based rail transit passenger flow prediction method according to claim 2, wherein the formula for calculating the participation frequency is:
where the symbols denote, respectively, the frequency with which passenger i participates in value-added services and the number of times passenger i participates in value-added services;
the merchant type distribution is counted as the three merchant types in which the passenger participates most frequently;
the formula for calculating the average transaction amount is:
where the symbols denote, respectively, the average amount spent by passenger i per participation in value-added services and the total amount spent by passenger i on value-added services;
and the payment mode distribution is counted as the three payment modes the passenger uses most frequently.
4. The passenger travel information-based rail transit passenger flow prediction method according to claim 3, wherein the travel demand type is determined by a clustering result of a total travel frequency, a first travel time and an average travel time counted by passenger boarding and card swiping data, and the clustering method is as follows:
dividing the passengers into categories according to their travel characteristics using the K-means algorithm; selecting the total number of historical trips, the first travel time and the average travel duration of passenger i in the passenger travel information index system as the clustering indexes for passengers at a station; and determining the number of clusters K by the elbow method, whose key index is the within-cluster sum of squared errors (SSE), calculated as follows:
where the symbols denote, respectively, the k-th cluster and the center point of that cluster;
the formula for calculating the residential area site is:
the formula for calculating the work area station is as follows:
where the symbols denote, in order: the probability that station e is the residential area station of passenger i; the probability that station e is the working area station of passenger i; the total number of times passenger i enters and exits station e; the number of times passenger i enters station e before 12:00 on working days; the number of times passenger i enters station e before 16:00 on rest days; the number of times passenger i enters station e after 12:00 on working days; the number of times passenger i enters station e after 16:00 on rest days; the total number of times passenger i enters and exits station e on working days; the number of times passenger i exits station e before 12:00 on working days; the number of times passenger i exits station e after 12:00 on working days; the number of times passenger i exits station e after 16:00 on rest days; and the number of times passenger i exits station e before 16:00 on rest days;
the value-added participation degree is classified as strong, medium or low: when the participation frequency is greater than 0.7, the passenger's value-added participation degree is strong; when the participation frequency is less than 0.4, it is low; and when the participation frequency is between 0.4 and 0.7, it is medium.
5. The passenger travel information-based rail transit passenger flow prediction method according to claim 4, wherein estimating the return passenger flow in different time periods at a station based on part of the index data calculated in the passenger travel information index system comprises:
counting, according to the inbound and outbound times, the residential area station, the working area station and the travel demand type in the passenger travel information index system, the return passenger flow, i.e. the number of passengers who return from station s within time period t on day v of a given week, where s is a station, v is the day of the week (1 to 7) and t is a time period;
selecting the historical outbound and return passenger flow data of station s, and obtaining by averaging, for day v of the week, the conditional probability distribution that passengers who arrive at station s in one time period depart from station s on their return trip in a later time period; the calculation formula is as follows:
in the formula, the symbols denote, in order: the total number of weeks; time period a; time period b; the number of passengers alighting at station s in time period a on day v of the j-th week; the inbound time; and the outbound time;
where the remaining symbol denotes the number of passengers alighting at station s at the given time on day v of a given week; H is the maximum interval between a passenger's alighting and boarding times at station s, taken as 24 hours; h is the time-slot resolution; and t+1 is the time period following time period t.
6. The passenger travel information-based rail transit passenger flow prediction method according to claim 5, wherein adding the return passenger flow of passengers at the station as a covariate to the passenger flow prediction model and predicting the station's inbound passenger flow comprises:
adding the estimated return passenger flow to the commonly used seasonal autoregressive moving average model to predict the inbound passenger flow;
the seasonal autoregressive moving average model is ARIMA(p, d, q)(P, D, Q)[Ω], where p, d and q are the orders of autoregression, differencing and moving average of the non-seasonal part, P, D and Q are the corresponding orders of the seasonal part, and Ω is the number of periods per season;
where B is the backshift (lag) operator and the four polynomials in B are the autoregressive and moving average operators of the non-seasonal and seasonal parts, whose coefficients are to be estimated; the error term is white noise obeying a normal distribution with mean 0 and constant variance; and the time index denotes a time period;
when the return passenger flow is used as a covariate, the inbound passenger flow and the return passenger flow satisfy the following relationship:
where the regression coefficient multiplies the return passenger flow series, which is obtained from the estimated return passenger flow by fixing the day of the week v and the station s and letting the time run over 1, ..., t; the error series obeys the ARIMA(p, d, q)(P, D, Q)[Ω] model and represents the part of the total inbound passenger flow other than the return passenger flow; the regression coefficient and the error series are calculated from the station's historical inbound passenger flow and return passenger flow; the return passenger flow for time t+1 is then predicted with the return passenger flow formula, being obtained from the estimated return passenger flow by fixing v and s and taking the time as t+1; and substituting the predicted return passenger flow and the predicted error term into the relationship gives the predicted inbound passenger flow at time t+1.
7. A rail transit passenger flow prediction system based on passenger travel information, characterized by comprising:
the index acquisition module is used for acquiring passenger travel data, establishing a passenger travel information index system based on the passenger travel data, and counting and calculating each index in the passenger travel information index system;
the return passenger flow calculation module is used for estimating the return passenger flow at different time intervals in the station based on part of index data calculated in the passenger travel information index system;
the passenger flow prediction module is used for adding return passenger flow of passengers in the station as a covariate into the seasonal autoregressive moving average model to predict the station entering passenger flow;
the passenger trip data includes:
AFC card swiping record and APP code scanning record for acquiring the passenger entrance and exit time and entrance and exit station information;
APP registration data used for obtaining identity information and associated information of passengers;
APP value-added consumption data used for obtaining value-added service information of passengers;
POI data near the station, associated with the station, for describing the geographic attributes of the station;
the index information in the passenger travel information index system comprises the following steps: basic information, business information and derived information;
the basic information comprises identity information and associated information, the identity information comprises an APPID, gender and age of a passenger and whether the passenger is disabled, and the associated information comprises a third-party payment mode of the passenger and a city all-purpose card;
the service information comprises trip basic information, trip derived information and value added service information, wherein the trip basic information comprises the trip in and out time and the trip in and out station information of passengers, the trip derived information comprises average trip duration, total trip times, daily average trip times, trip time distribution, trip OD distribution, trip path distribution, first trip time, last trip time, holiday trip time distribution and holiday trip OD distribution, and the value added service information comprises value added service participation times, participation frequency, average transaction amount, payment mode distribution, merchant type distribution and last participation time;
the derived information comprises an active attribute and a functional attribute, wherein the active attribute comprises travel liveness, and the functional attribute comprises a travel demand type of a passenger, a residence area site, a working area site and value-added participation;
the portion of the index data includes the inbound and outbound times, the residential site, the work site, and the travel demand type.
8. An electronic device comprising a processor, a memory, and a computer program stored on the memory and operable on the processor, the computer program when executed by the processor implementing a method of rail transit passenger flow prediction based on passenger travel information according to any one of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the method for predicting rail transit passenger flow based on passenger travel information according to any one of claims 1 to 6.
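As a non-limiting illustration of the elbow criterion named in claim 4, the sketch below runs a plain K-means (Lloyd's algorithm) for each candidate K and reports the within-cluster sum of squared errors, from which the elbow can be read off. The function names, the first-K initialisation and the assumption that the three clustering indexes have already been scaled to comparable ranges are choices made for this sketch and are not taken from the source.

```python
def kmeans_sse(points, k, iterations=20):
    """Run Lloyd's algorithm and return the within-cluster SSE.

    points: list of equal-length numeric tuples (pre-scaled features).
    Initial centers are simply the first k points (deterministic).
    """
    def dist2(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    centers = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[nearest].append(p)
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    # SSE: squared distance of every point to its nearest center.
    return sum(min(dist2(p, c) for c in centers) for p in points)


def elbow_curve(points, max_k):
    """SSE for K = 1..max_k; the elbow is where the decrease flattens."""
    return {k: kmeans_sse(points, k) for k in range(1, max_k + 1)}
```

For four points forming two tight pairs, the SSE drops sharply from K = 1 to K = 2 and changes little afterwards, which is the pattern the elbow method looks for.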
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210254509.3A CN114331234B (en) | 2022-03-16 | 2022-03-16 | Rail transit passenger flow prediction method and system based on passenger travel information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114331234A CN114331234A (en) | 2022-04-12 |
CN114331234B true CN114331234B (en) | 2022-07-12 |
Family
ID=81033087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210254509.3A Active CN114331234B (en) | 2022-03-16 | 2022-03-16 | Rail transit passenger flow prediction method and system based on passenger travel information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114331234B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114898574B (en) * | 2022-04-26 | 2023-04-04 | 安徽省交通控股集团有限公司 | Method and system for estimating traffic parameters |
CN114912683B (en) * | 2022-05-13 | 2024-05-10 | 中铁第六勘察设计院集团有限公司 | System and method for predicting abnormal large passenger flow of smart city rail transit |
CN115759472B (en) * | 2022-12-07 | 2023-12-22 | 北京轨道交通路网管理有限公司 | Passenger flow information prediction method and device and electronic equipment |
CN116778739B (en) * | 2023-06-20 | 2024-09-20 | 深圳市中车智联科技有限公司 | Public transportation scheduling method and system based on demand response |
CN117746640B (en) * | 2024-02-20 | 2024-04-30 | 云南省公路科学技术研究院 | Road traffic flow rolling prediction method, system, terminal and medium |
CN117935416B (en) * | 2024-03-21 | 2024-06-25 | 成都赛力斯科技有限公司 | Pre-running area access statistical method, device and storage medium |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105701180A (en) * | 2016-01-06 | 2016-06-22 | 北京航空航天大学 | Commuting passenger feature extraction and determination method based on public transportation IC card data |
CN105718946A (en) * | 2016-01-20 | 2016-06-29 | 北京工业大学 | Passenger going-out behavior analysis method based on subway card-swiping data |
CN106845714A (en) * | 2017-01-24 | 2017-06-13 | 东南大学 | A kind of monthly passenger flow method of ARIMA model prediction urban track traffics based on seasonal index number |
CN109961164A (en) * | 2017-12-25 | 2019-07-02 | 比亚迪股份有限公司 | Passenger flow forecast method and device |
CN110782070A (en) * | 2019-09-25 | 2020-02-11 | 北京市交通信息中心 | Urban rail transit emergency passenger flow space-time distribution prediction method |
WO2020091620A1 (en) * | 2018-10-30 | 2020-05-07 | Общество С Ограниченной Ответственностью "Глобус Медиа" | Method for predicting passenger flow and device for the implementation thereof |
CN111260140A (en) * | 2020-01-19 | 2020-06-09 | 武汉中科通达高新技术股份有限公司 | Method for predicting instantaneous return large passenger flow in subway station |
CN111932867A (en) * | 2020-06-18 | 2020-11-13 | 东南大学 | Multisource data-based bus IC card passenger getting-off station derivation method |
CN111985710A (en) * | 2020-08-18 | 2020-11-24 | 深圳诺地思维数字科技有限公司 | Bus passenger trip station prediction method, storage medium and server |
WO2021174755A1 (en) * | 2020-03-02 | 2021-09-10 | 北京全路通信信号研究设计院集团有限公司 | Rail transit passenger flow demand prediction method and apparatus based on deep learning |
CN113850417A (en) * | 2021-08-27 | 2021-12-28 | 浙江浙大中控信息技术有限公司 | Passenger flow organization decision-making method based on station passenger flow prediction |
CN114037158A (en) * | 2021-11-09 | 2022-02-11 | 北京京投亿雅捷交通科技有限公司 | Passenger flow prediction method based on OD path and application method |
Non-Patent Citations (4)
Title |
---|
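A classification method of rail transit stations based on POI data and TF-IDF index; Shichen Zhong et al.; 《21st COTA International Conference of Transportation Professionals》; 20211231; full text * |
Deep learning-based hybrid model for short-term subway passenger flow prediction using automatic fare collection data; Jia Feifan et al.; 《IET Intelligent Transport Systems》; 20191231; Vol. 13 (No. 11); pp. 1708-1716 * |
Short-term station inbound passenger flow prediction based on a generalized dynamic fuzzy neural network; Li Chunxiao et al.; 《Urban Rapid Rail Transit》; 20150818; Vol. 28 (No. 4); pp. 57-61 * |
Research on passenger travel demand and the spatiotemporal distribution of passenger flow on passenger dedicated lines; Xu Pan; 《China Master's Theses Full-text Database (Engineering Science and Technology II)》; 20121015 (No. 10); pp. C033-465 * |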
Li-Jun et al. | Evaluation of the reliability of bus service based on gps and smart card data | |
Scholl et al. | A rapid road to employment? The impacts of a bus rapid transit system in Lima |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | ||
Effective date of registration: 20220615. Address after: No. 3 Shangyuan Village, Haidian District, Beijing 100044; Applicants after: Beijing Jiaotong University; GUANGZHOU METRO GROUP Co.,Ltd. Address before: No. 3 Shangyuan Village, Haidian District, Beijing 100044; Applicant before: Beijing Jiaotong University |
GR01 | Patent grant | ||