CN108133302B - Public bicycle potential demand prediction method based on big data - Google Patents

Public bicycle potential demand prediction method based on big data

Info

Publication number
CN108133302B
Authority
CN
China
Prior art keywords
public
demand
sample
big data
public bicycle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611091696.9A
Other languages
Chinese (zh)
Other versions
CN108133302A (en
Inventor
陈龙
周晋冬
张临辉
张颖
刘杰
朱兴一
韦联春
郑凌瀚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Architectural Design & Research Institute Co ltd
Original Assignee
Shanghai Pudong Architectural Design & Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Architectural Design & Research Institute Co ltd
Priority to CN201611091696.9A
Publication of CN108133302A
Application granted
Publication of CN108133302B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315Needs-based resource requirements planning or analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0645Rental transactions; Leasing transactions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Abstract

The invention relates to a big-data-based method for predicting the potential demand for public bicycles, comprising the following steps: obtaining resident travel attributes from mobile phone signaling big data; obtaining land attributes from internet big data; obtaining the current PA status of public bicycle rental points from public bicycle card-swiping big data, and calculating their demand heat; taking the rental points whose demand heat is below the demand threshold as the total sample, with the resident travel attributes and land attributes as the input parameters of the sample and the current PA status of the rental points as its output parameters, and training a BP neural network in combination with the AIC (Akaike information criterion) and a random forest rule to obtain a public bicycle potential demand evaluation model; and calculating, with this model, the predicted bicycle demand of the rental points whose demand heat is above the threshold. Compared with the prior art, the method yields accurate analysis results and provides strong guidance for the layout of public bicycle rental points.

Description

Public bicycle potential demand prediction method based on big data
Technical Field
The invention relates to the field of big data, in particular to a public bicycle potential demand prediction method based on big data.
Background
By virtue of its low price, convenience, health benefits, energy savings and environmental friendliness, the public bicycle rental system effectively supplements and extends conventional public transportation, and has become a widely promoted public welfare undertaking in many countries and regions. European countries first introduced public bicycle rental systems in the 1970s, and more than 50 countries or regions now provide public bicycle rental services to their citizens. Public bicycle rental entered China in the first half of 2008; Hangzhou, as a successful example of an urban public bicycle rental system suited to China's national conditions, quickly prompted other Chinese cities to follow. Cycling is gradually becoming a fashionable choice for short trips, exercise, leisure and sightseeing.
Current public bicycle deployment lacks scientific guidance based on supply and demand analysis. Rental points are usually laid out by public bicycle construction and operation enterprises around residential sites, public buildings, public transport stops and other service locations, under the guidance and suggestions of the relevant management units, and each point is arbitrarily assigned a standard number of docks and vehicles. In actual use, the deployed points are therefore often unevenly used, with some overloaded and others idle, and utilization is low, which wastes investment and can even cause operating losses.
Disclosure of Invention
The invention aims to provide a big-data-based public bicycle potential demand prediction method that addresses the above problems.
The purpose of the invention can be achieved by the following technical scheme:
A public bicycle potential demand prediction method based on big data, the method comprising:
1) analyzing mobile phone signaling big data to obtain resident travel attributes in the analysis area, wherein the resident travel attributes comprise population distribution, job-housing distribution (places of work and residence) and population travel PA distribution;
2) analyzing internet big data to obtain land attributes in the analysis area, wherein the land attributes comprise land types and public transport station distribution;
3) analyzing public bicycle card-swiping big data to obtain the current PA status of the public bicycle rental points, and calculating the demand heat of each rental point;
4) taking the public bicycle rental points whose demand heat is below the demand threshold as the total sample, taking the resident travel attributes obtained in step 1) and the land attributes obtained in step 2) as the input parameters of the sample, taking the current PA status of the rental points obtained in step 3) as the output parameters of the sample, and training a BP neural network in combination with the AIC (Akaike information criterion) and a random forest rule to obtain a public bicycle potential demand evaluation model;
5) calculating, according to the public bicycle potential demand evaluation model obtained in step 4), the predicted bicycle demand of the public bicycle rental points whose demand heat is above the threshold.
Step 4) is specifically as follows:
41) taking the public bicycle rental points whose demand heat is below the demand threshold as the total sample, taking the resident travel attributes obtained in step 1) and the land attributes obtained in step 2) as the input parameters of the sample, taking the current PA status of the rental points obtained in step 3) as the output parameters of the sample, and proportionally dividing the total sample into training samples and test samples;
42) drawing selected samples from the training samples according to the random forest rule, and enumerating the combinations of the input parameters of each selected sample to obtain all candidate input parameter sets for that selected sample;
43) performing BP neural network training on every candidate input parameter set of each selected sample, calculating the model error value for each input parameter set using the AIC (Akaike information criterion), and choosing the input parameter set with the minimum model error value as the optimal input parameter set of that selected sample;
44) taking the optimal input parameter set that occurs most frequently across all selected samples as the optimal input parameter set of the model, and performing BP neural network training on all training samples to obtain the public bicycle potential demand evaluation model.
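Step 44) amounts to a majority (mode) vote over the per-sample optimal input parameter sets. A minimal sketch of that vote follows; the parameter names and the three example sets are hypothetical and serve only to illustrate the counting.

```python
from collections import Counter

def vote_optimal_input_set(best_sets):
    """Return the input parameter set that occurs most often among the per-sample optima."""
    # frozenset makes each set hashable and independent of parameter order.
    counts = Counter(frozenset(s) for s in best_sets)
    winner, _ = counts.most_common(1)[0]
    return winner

# Hypothetical minimum-AIC sets found for three selected samples.
best_sets = [
    {"population", "distance_to_transit"},
    {"population", "distance_to_transit"},
    {"population", "job_housing_ratio"},
]
print(sorted(vote_optimal_input_set(best_sets)))
```

The source does not say how ties are resolved; here `Counter.most_common` simply returns the first set that reached the highest count.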
The model error value under each input parameter set is calculated using the AIC as follows:
$$\mathrm{AIC} = 2K + n\ln(\mathrm{RSS}/n)$$
where K is the number of input parameters in the current input parameter set, n is the number of selected samples, and RSS is the model residual.
Step 4) further comprises performing a k-fold cross test on the total sample to calculate the error value of the public bicycle potential demand evaluation model. The k-fold cross test is specifically as follows: the full sample set is divided into k parts; Part1 is taken as the test sample with Part2, Part3, …, Partk as training samples; then Part2 is taken as the test sample with Part1, Part3, …, Partk as training samples; and so on for k tests. The model error value obtained in each test is calculated, and the average of these values is taken as the error value of the public bicycle potential demand evaluation model.
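The fold rotation described above can be sketched as follows. This is only an illustration: `fit_and_score` is a hypothetical callback that trains a model on the training indices and returns its error on the test indices, and the samples are split in their given order because the source does not mention shuffling.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def cross_validated_error(n_samples, k, fit_and_score):
    """Average the error returned by fit_and_score(train, test) over all k folds."""
    errors = [fit_and_score(train, test) for train, test in k_fold_splits(n_samples, k)]
    return sum(errors) / len(errors)
```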
The step 1) is specifically as follows:
11) dividing the analysis area into traffic analysis cells according to the cell identification field in the mobile phone signaling big data, determining the area of each traffic analysis cell and the sample expansion parameter from mobile phone users to the general population, and calculating the population distribution in the analysis area;
12) determining the mobile phone user leaving-time threshold for each traffic analysis cell, performing a bivariate classification of all mobile phone users over the whole day by the time section and the duration of their stays in different traffic analysis cells, calculating the confidence of the classification from the stay duration, and estimating the job-housing distribution from the confidence;
13) establishing a primary key association between public bicycle rental points and traffic analysis cells, and calculating from the mobile phone signaling big data the departures and arrivals of residents in the traffic analysis cells associated with each rental point, to obtain the population travel PA distribution.
The step 2) is specifically as follows:
21) obtaining a map in an analysis area according to the internet big data;
22) extracting and integrating the information points (points of interest) of the map of the analysis area obtained in step 21), and classifying them by function;
23) obtaining the land types and the public transport station distribution in the analysis area from the classification result.
The demand heat of the public bicycle rental lot is specifically as follows:
Figure BDA0001168806420000031
where DH_i is the demand heat of the i-th public bicycle rental point, At_i and Pt_i are the average numbers of public bicycle arrivals and departures at the i-th rental point, Ot_i is the initial number of vehicles allocated for the time period, and Ot'_i is the number of those initially allocated vehicles that have been rented out by the end of the period.
The step 5) is specifically as follows:
51) taking the resident travel attributes and land attributes corresponding to the rental points whose demand heat is above the threshold as input parameters, feeding them into the public bicycle potential demand evaluation model obtained in step 4), and obtaining the arrival evaluation value Aot_i and the rental evaluation value Pot_i of each rental point for the corresponding time period;
52) calculating a correction coefficient α_i from the arrival evaluation value Aot_i and the rental evaluation value Pot_i obtained in step 51), together with the average public bicycle arrival volume At_i of the rental point in that time period;
53) calculating the predicted demand of the rental point according to the correction coefficient, the predicted demand comprising the public bicycle rental volume and the public bicycle arrival volume.
The correction coefficient is specifically as follows:
Figure BDA0001168806420000041
The public bicycle rental volume is max(α_i·Pot_i, Pt_i), and the public bicycle arrival volume is max(Aot_i, At_i).
Compared with the prior art, the invention has the following beneficial effects:
(1) By analyzing mobile phone signaling big data, internet big data and public bicycle card-swiping big data, the method obtains the relationship between public bicycle rental points and both resident travel distribution and urban transport and land-use distribution, and then trains a BP (back-propagation) neural network to obtain a public bicycle potential demand evaluation model.
(2) The public bicycle demand obtained by the method can help existing public bicycle operating areas analyze the balance of supply and demand and improve the service efficiency of public bicycles; it can also be used to estimate public bicycle rental demand and economic return in an area, supporting the planning of public bicycle rental points.
(3) When the BP neural network is trained, the random forest rule and the AIC are combined. The random forest rule ensures that the selected samples are random, so that the training result is accurate and unbiased; the AIC is used to calculate the model error, so that the finally determined input parameter set, and hence the performance of the resulting model, is optimal and the final evaluation of bicycle demand is accurate.
(4) After the public bicycle potential demand evaluation model is obtained by BP neural network training, a k-fold cross test is also performed to calculate the error value of the model. This makes it easy to compare against any newly proposed modeling method and to choose between models, so that a more accurate evaluation value can be obtained.
(5) After the potential demand for public bicycles is calculated with the model, a correction coefficient is calculated from the actual PA status of the public bicycle rental point and used to correct the calculated demand, further improving the accuracy of the result.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a flow chart of step 4) of the method of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
As shown in fig. 1, the present embodiment aims to provide a public bicycle potential demand forecasting method based on big data, which includes the following steps:
1) according to the big data of the mobile phone signaling, analyzing to obtain the travel attribute of residents in the analysis area:
Mobile phone signaling big data are used to analyze the current travel and activity demand of the population and to obtain its dynamic spatial distribution in different time periods; this population is the main body of users served by public bicycles. A TAZ (Traffic Analysis Zone) division scheme for the analysis area of the target city is determined from the cell identification field in the mobile phone signaling data, and each TAZ is identified as TAZi (i is a natural number from 1 to N, where N is the total number of TAZs in the analysis area). Two parameters are determined for each TAZi, AREAi (area) and FACTOR_ETi (population sample expansion parameter), from which the population distribution in the study area is calculated.
Mobile phone signaling big data are also used to analyze the current spatial distribution of the resident population and of places of residence in different areas. The leaving-time threshold of mobile phone users is determined for each TAZi, and the places of residence and work of all mobile phone users are classified from the time sections (whether within reasonable working hours) and durations (number of hours) of their stays in different traffic cells over the 24 hours of a day. The confidence of the classification is determined from the length of the stay, and the work and residence of each user in the different traffic cells is estimated accordingly. In this embodiment, a terminal is regarded as having left if it is absent from its original cell for 30 minutes, and the job-housing distribution finally determined is shown in Table 1:
TABLE 1 Determination of the home and work traffic cells of each user
[Table image: BDA0001168806420000051, BDA0001168806420000061]
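The home and work classification described before Table 1 can be sketched as below. The source gives neither the working-hours window nor the confidence formula, so both are assumptions here: stays that begin between 09:00 and 17:00 count as work, all others as home, and the confidence is the share of hours in the winning traffic cell.

```python
from collections import defaultdict

WORK_HOURS = range(9, 17)  # assumed working period, not taken from the source

def home_and_work_taz(dwells):
    """dwells: iterable of (taz_id, start_hour, duration_hours) for one user over one day.

    Returns {"home": (taz_id, confidence), "work": (taz_id, confidence)} for the
    labels that have at least one stay.
    """
    hours = {"work": defaultdict(float), "home": defaultdict(float)}
    for taz, start_hour, duration in dwells:
        label = "work" if start_hour in WORK_HOURS else "home"
        hours[label][taz] += duration
    result = {}
    for label, by_taz in hours.items():
        if by_taz:
            taz = max(by_taz, key=by_taz.get)
            # Assumed confidence: fraction of the labelled hours spent in the winning cell.
            result[label] = (taz, by_taz[taz] / sum(by_taz.values()))
    return result
```

Stays shorter than the 30-minute leaving threshold would be filtered out before this step; that filtering is omitted here.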
The trip production and attraction of residents in the traffic cell to which each bicycle rental point belongs are then calculated: using the mobile phone signaling big data, the spatial distribution of residents' departures and arrivals (PA spatial distribution) and the distribution of passenger flow corridors are calculated for traffic cells within a distance range suitable for cycling. A primary key association is established between public bicycle rental points and traffic cells, and the PA spatial distribution and the distribution of resident trip origins and destinations (OD distribution) are calculated between the traffic cells within 300 meters of each rental point and the other traffic cells. Passenger flow corridors are identified from the OD flows in the study area; in this example, OD pairs whose flow is above the 98% quantile are identified as passenger flow corridors.
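The corridor rule (OD pairs whose flow is above the 98% quantile) can be illustrated with the standard library. The OD flows are a hypothetical dictionary keyed by (origin, destination), and the quantile uses linear interpolation, which is one of several common definitions; the source does not say which one is used.

```python
import statistics

def passenger_corridors(od_flows, percentile=98):
    """Return the OD pairs whose flow is strictly above the given percentile of all flows."""
    flows = sorted(od_flows.values())
    # With n=100, statistics.quantiles returns the 1st to 99th percentile cut points.
    cut = statistics.quantiles(flows, n=100, method="inclusive")[percentile - 1]
    return {pair for pair, flow in od_flows.items() if flow > cut}

# Hypothetical flows: 100 OD pairs carrying 1, 2, ..., 100 trips.
od_flows = {(f"TAZ{i}", f"TAZ{i + 1}"): i for i in range(1, 101)}
print(sorted(passenger_corridors(od_flows)))
```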
In this embodiment, the area AREAi of each TAZi obtained in the above steps is determined. Areas that people can hardly reach, such as lakes, deserts and swamps, are removed from the TAZi, and the population per unit area in the TAZi, that is, the population distribution density, is calculated from the actual area that people can reach.
The sample expansion parameter FACTOR_ETi from the mobile phone user group to the general population is then determined; the proportion of the population in each TAZi that owns a mobile phone is obtained from a sample questionnaire survey. If this proportion is p_ai for TAZi, the sample expansion parameter is FACTOR_ETi = 1/p_ai. The population distribution in the study area is then calculated, and the resulting TAZ parameters are shown in Table 2:
TABLE 2 TAZ parameters
TAZ number   AREA (square kilometers)   FACTOR_ET
TAZ1         10                         0.5
TAZ2         8                          0.25
TAZ3         8                          0.25
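The sample expansion FACTOR_ETi = 1/p_ai and the resulting population density can be sketched as follows. The observed user count, ownership ratio and reachable area in the usage example are hypothetical values chosen for illustration and are not taken from Table 2.

```python
def expansion_factor(phone_ownership_ratio):
    """FACTOR_ET = 1 / p_a, where p_a is the share of residents who own a mobile phone."""
    if not 0 < phone_ownership_ratio <= 1:
        raise ValueError("ownership ratio must be in (0, 1]")
    return 1.0 / phone_ownership_ratio

def population_density(observed_users, phone_ownership_ratio, reachable_area_km2):
    """Expand observed phone users to the general population and divide by reachable area."""
    population = observed_users * expansion_factor(phone_ownership_ratio)
    return population / reachable_area_km2

# Hypothetical TAZ: 8000 observed users, 80% phone ownership, 10 km² of reachable area.
print(population_density(8000, 0.8, 10))
```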
2) Analyzing and obtaining land attributes in the analysis area according to the internet big data:
Internet big data are used to obtain the current state of urban construction and the spatial distribution of the main traffic nodes. Because public bicycles connect with rail transit and buses, the spatial distribution of the bus and rail transit networks in the study area must be determined. In this part, the surveyed public transport station and line network information is entered into a geographic information system to obtain the current rail transit and bus distribution.
The current spatial distribution of the different types of POI in each area is then analyzed; all of these points are possible end points of public bicycle trips. In this embodiment, the Baidu map POIs are counted and classified into seventeen major categories, each comprising several subcategories. Extracting the POI data of a given category gives a quick and intuitive picture of how that function is distributed in the city, as shown in Table 3:
TABLE 3 Baidu POI Categories List
Category                    Subcategories
Companies and enterprises   Park, company, …
Real estate                 Residential area, office building
Food                        Chinese restaurant, foreign restaurant, …
Automobile service          Automobile sales, automobile maintenance, …
Leisure and entertainment   KTV, song and dance hall, …
Sports and fitness          Gym, fitness center, …
Shopping                    Shop, market, …
Life services               Ticket office, logistics company, …
Medical care                General hospital, specialized hospital, …
Hotel                       Star-rated hotel, budget hotel, …
Tourist attractions         Park, zoo, …
Government agencies         Governments at all levels, administrative units, …
Culture and media           News publishing, exhibition hall, …
Education and training      Colleges and universities, middle schools, …
Transportation facilities   Airport, railway station, …
Finance                     Bank, ATM, …
Beauty                      Beauty treatment, hairdressing, …
3) Analyzing and obtaining the PA current situation of the public bicycle rental lot according to the big card swiping data of the public bicycles, and calculating the demand heat of the public bicycle rental lot:
The current state of public bicycle travel is established from the public bicycle card-swiping data. Each rental point is identified as ZLDi (i is a natural number from 1 to N, where N is the total number of rental points in the analysis area); the card-swiping data of each ZLDi are extracted, and the current operating status of the rental points and the characteristics of bicycle trips are calculated for different time periods. The trip characteristics include the departure and arrival volume (vehicles), the average trip distance (meters) and the average trip time (minutes). The demand heat index DHi (Demand Heat) of a public bicycle rental point is defined as:
Figure BDA0001168806420000081
where At_i and Pt_i are the average public bicycle arrival and departure volumes at the i-th rental point, Ot_i is the initial number of vehicles allocated for the period, and Ot'_i is the number of the initially allocated vehicles Ot_i that were rented during the period by its end (Ot'_i ≤ Ot_i). The public bicycle travel characteristics and the current operating status of the rental points finally obtained by these steps are shown in Tables 4 and 5:
TABLE 4 public bicycle travel characteristics
Rental point   Average trip distance (meters)   Average trip time (minutes)   Time period
ZLD1           3000                             24                            Early peak
ZLD1           1400                             10                            Late peak
ZLD1           1400                             15                            Other time periods
ZLD2           2800                             18                            Early peak
ZLD2           2800                             20                            Late peak
TABLE 5 Current operating status of public bicycle rental points
[Table image: BDA0001168806420000082]
4) Taking the public bike leasing points with the demand heat degree lower than the demand threshold value as a total sample, taking the resident trip attribute obtained in the step 1) and the land attribute obtained in the step 2) as input parameters of the sample, taking the PA current situation of the public bike leasing points obtained in the step 3) as output parameters of the sample, and carrying out BP neural network training by combining an AIC criterion and a random forest rule to obtain a public bike potential demand evaluation model:
At rental points with lower demand heat, the average numbers of bicycle arrivals and departures in each time period correctly reflect the relationship between the attributes of the surrounding traffic cells and the point's public bicycle demand. At rental points with higher demand heat, supply is limited by the initial number of allocated vehicles, so the potential demand for public bicycles may go unmet. Therefore, in this embodiment, rental points with a demand heat of 70% or less enter the total sample, which is used to find the relationship between public bicycle rental demand (the average arrival and departure volumes in a time period) and the attributes of the surrounding traffic cells.
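The sample selection rule can be sketched as a simple partition on the 70% threshold of this embodiment. The demand heat values in the usage example are hypothetical.

```python
DEMAND_HEAT_THRESHOLD = 0.70  # 70%, the value used in this embodiment

def split_by_demand_heat(demand_heat, threshold=DEMAND_HEAT_THRESHOLD):
    """Split rental points into the training sample (<= threshold) and points to predict (> threshold)."""
    sample = {point for point, dh in demand_heat.items() if dh <= threshold}
    to_predict = set(demand_heat) - sample
    return sample, to_predict

# Hypothetical demand heat per rental point.
demand_heat = {"ZLD1": 0.45, "ZLD2": 0.70, "ZLD3": 0.92}
print(split_by_demand_heat(demand_heat))
```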
Long-term historical data are accumulated, and the AIC (Akaike information criterion) is combined with a random forest rule and a BP (back-propagation) neural network. The population distribution, job-housing distribution and population travel PA (production-attraction) distribution from step 1 and the land types and public transport station distribution from step 2 are used as inputs; the public bicycle rental point field in the result tables of these two steps is used to establish a primary key association; and the current PA status of the public bicycle rental points from step 3 is used as output for learning and training. Because the model has multiple inputs and outputs, combining the AIC with the random forest rule keeps the number of input parameters entering neural network learning reasonable, keeps network training stable, and avoids overfitting. The learning capability of the BP neural network is then used to uncover the potential demand at the current public bicycle rental points.
Training is carried out separately for each time period; in this embodiment the time periods are the early peak, the late peak and other time periods. The inputs of the training model are the resident travel attributes and land attributes of the traffic cells within a certain range of the corresponding rental point ZLDk (the traffic cells for bicycle trips contained in a sample are numbered TAZi to TAZj). The outputs of the training model are the bicycle departure volume and arrival volume of the corresponding public bicycle rental point. Learning and training follow the flow shown in FIG. 2. In this embodiment, the parameters contained in the outputs and inputs of the samples for each period are shown in Tables 6 and 7:
TABLE 6 Output sample table (bicycle trips)
[Table image: BDA0001168806420000091, BDA0001168806420000101]
TABLE 7 Input sample table (resident trips)
[Table image: BDA0001168806420000102]
The sample set is divided into training samples and test samples in a proportion commonly used in machine learning; in this embodiment, 80% of the total sample is used for training and 20% for testing. A random forest method is then set up: using freshly generated random numbers each time, a given proportion of the training samples is drawn at random as the selected samples (80% of the training samples in this embodiment). The input parameters of each selected sample are combined in different ways to obtain different AVSi (Attribute Variable Sets), such as AVS1 (population, distance to public transport stations, job-housing ratio) and AVS2 (distance to public transport stations, job-housing ratio, travel attraction), and a three-layer BP neural network is designed to train each group of samples.
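The sampling and the enumeration of attribute variable sets can be sketched as follows. Two assumptions are made: the "combinations in different ways" are read as all non-empty subsets of the input attributes, and a fixed seed is used only to make the sketch repeatable.

```python
import itertools
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle and split the total sample into training and test samples."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def draw_selected(training, ratio=0.8, seed=0):
    """Randomly draw a share of the training samples as the selected samples."""
    rng = random.Random(seed)
    return rng.sample(training, round(len(training) * ratio))

def attribute_variable_sets(attributes):
    """Yield every non-empty combination of input attributes (the candidate AVS_i)."""
    for size in range(1, len(attributes) + 1):
        yield from itertools.combinations(attributes, size)
```

With a attributes there are 2^a - 1 candidate sets, so the exhaustive enumeration is only practical for a small number of input attributes.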
Designing a BP neural network:
The optimal number of hidden nodes L is calculated with reference to the following formula:
$$L = (m+n)^{1/2} + c$$
where m is the number of input nodes, n is the number of output nodes, and c is a constant between 1 and 10. In this embodiment, the number of hidden nodes L is 10.
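As a quick illustration of the empirical rule for the hidden layer size, the sketch below evaluates it and rounds to the nearest integer. The embodiment states only that L is 10; the values of m, n and c in the usage line are hypothetical ones that happen to produce that result.

```python
import math

def hidden_node_count(m, n, c):
    """Empirical hidden layer size L = sqrt(m + n) + c, rounded to the nearest integer."""
    if not 1 <= c <= 10:
        raise ValueError("c is expected to lie between 1 and 10")
    return round(math.sqrt(m + n) + c)

# Hypothetical network: 6 input nodes, 2 output nodes, c = 7.
print(hidden_node_count(6, 2, 7))
```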
Each unit in a network layer produces its output through an activation function. This embodiment uses the Sigmoid function:
$$f(x) = \frac{1}{1+e^{-x}}$$
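Assuming the standard logistic form of the Sigmoid function, f(x) = 1/(1 + e^(-x)), a numerically stable evaluation and its derivative f'(x) = f(x)(1 - f(x)), which back-propagation relies on, can be written as:

```python
import math

def sigmoid(x):
    """Logistic function, evaluated so that math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_derivative(x):
    """Derivative of the logistic function: f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)
```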
During training, the errors of the output layer and of the hidden and input layers are calculated according to the following formulas:
Output layer:
$$\delta_j^{(l)} = (d_{qj} - x_j^{(l)})\, f(s_j^{(l)})$$
Hidden layer and input layer:
Figure BDA0001168806420000111
Weight correction:
$$w_{ji}^{(l+1)}[k+1] = w_{ji}^{(l)}[k] + u\,\delta_j^{(l)} x_j^{(l-1)} + \eta\,(w_{ji}^{(l)}[k] - w_{ji}^{(l)}[k-1])$$
Threshold correction:
$$\theta_j^{(l+1)}[k+1] = \theta_j^{(l)}[k] + u\,\delta_j^{(l)} + \eta\,(\theta_j^{(l)}[k] - \theta_j^{(l)}[k-1])$$
Performance index of each training pass:
Figure BDA0001168806420000112
The whole training process is repeated several times to ensure the stability of the model. Training terminates when E falls below a given threshold, which is 0.01 in this embodiment.
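The weight and threshold corrections (a gradient step with rate u plus a momentum term with coefficient η) and the stopping rule E < 0.01 can be illustrated on a single sigmoid unit. This is a toy sketch rather than the three-layer network of the embodiment, and it makes two assumptions: the performance index is taken as half the squared output error, and the output error is scaled by the sigmoid derivative y(1 - y), as in the standard delta rule.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train_single_unit(x, d, u=0.5, eta=0.5, threshold=0.01, max_epochs=1000):
    """Train one sigmoid unit y = f(w*x + theta) on a single sample (x, d)."""
    w = w_prev = 0.0
    theta = theta_prev = 0.0
    error = float("inf")
    for epoch in range(max_epochs):
        y = sigmoid(w * x + theta)
        error = 0.5 * (d - y) ** 2       # assumed performance index E
        if error < threshold:
            return w, theta, error, epoch
        delta = (d - y) * y * (1.0 - y)  # output error scaled by f'(s) = y(1 - y)
        # Gradient step plus a momentum term based on the previous change.
        w_next = w + u * delta * x + eta * (w - w_prev)
        theta_next = theta + u * delta + eta * (theta - theta_prev)
        w_prev, w = w, w_next
        theta_prev, theta = theta, theta_next
    return w, theta, error, max_epochs

w, theta, error, epochs = train_single_unit(x=1.0, d=0.9)
print(error < 0.01, epochs)
```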
Different samples are trained to obtain multiple groups of training results and errors, and the optimal number of input variables is determined by the AIC (Akaike information criterion). The AIC evaluates the model error value for different numbers of parameters by the following formula:
$$\mathrm{AIC} = 2K + n\ln(\mathrm{RSS}/n)$$
where K is the number of parameters input into the neural network, n is the number of samples entering learning, and RSS is the model residual.
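The criterion AIC = 2K + n ln(RSS/n) can be computed directly and used to rank candidate input sets. In the sketch below the candidate names and their (K, n, RSS) values are hypothetical.

```python
import math

def aic(k, n, rss):
    """AIC = 2K + n * ln(RSS / n) for K parameters, n samples and residual RSS."""
    return 2 * k + n * math.log(rss / n)

def best_input_set(candidates):
    """candidates maps a set name to (K, n, RSS); return the name with the smallest AIC."""
    return min(candidates, key=lambda name: aic(*candidates[name]))

# Hypothetical results for two attribute variable sets trained on 64 selected samples.
candidates = {"AVS1": (3, 64, 12.0), "AVS2": (5, 64, 11.5)}
print(best_input_set(candidates))
```

In this example AVS2 has the smaller residual, but the penalty 2K for its two extra parameters outweighs the gain, so AVS1 is preferred.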
When the AIC value of a given AVSi is the smallest, its input parameters are taken as the optimal input parameter set of that selected sample. The optimal input variable set across the input samples is finally determined by mode voting. For model testing, a k-fold cross test is used: the full sample set is divided into k parts, Part1, Part2, …, Partk, with k = 5 in this embodiment. Part1 is first used as the test sample with Part2, …, Part5 as training samples; then Part2 is used as the test sample with Part1, Part3, …, Part5 as training samples; and so on for 5 tests. The model error is averaged over the 5 tests, and the results are shown in Table 8:
Table 8. Model evaluation errors
Test set No.    Demand error (rental volume)    Demand error (arrival volume)
1               10%                             15%
2               7%                              9%
3               11%                             13%
4               8%                              10%
5               8%                              9%
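The k-fold procedure and the averaging of the per-test errors can be sketched as follows. The interleaved way of forming the k parts is an assumption (the text does not say how the sample set is partitioned), and the helper names are hypothetical. The two error lists are the two error columns of Table 8, in percent.

```python
def k_fold_splits(samples, k=5):
    """Yield (train, test) pairs; each of the k parts is the test set once."""
    folds = [samples[i::k] for i in range(k)]  # assumed interleaved partition
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test

def mean(values):
    return sum(values) / len(values)

# Per-test errors from Table 8 (first and second error column, in percent).
first_column_errors = [10, 7, 11, 8, 8]
arrival_errors = [15, 9, 13, 10, 9]

avg_first = mean(first_column_errors)   # 8.8
avg_arrival = mean(arrival_errors)      # 11.2

splits = list(k_fold_splits(list(range(10)), k=5))
```

Averaged over the five tests, the tabulated errors come to 8.8% for the first column and 11.2% for the arrival volume.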
At this point, the model obtained by BP neural network training captures the influence of the traffic cells near each public bicycle rental point on public bicycle demand. The two sets of network weights obtained from training, one for the rental volume and one for the return volume of the rental points, are stored separately.
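The selection of the optimal input variable set, choosing the minimum-AIC parameter set for each selected sample and then taking the most frequent winner, can be sketched as below. The data layout (one dictionary of AIC values per selected sample, keyed by a tuple of parameter names) and the parameter names are hypothetical choices made for illustration.

```python
from collections import Counter

def best_set_by_aic(aic_by_set):
    """Return the input parameter set with the smallest AIC for one sample."""
    return min(aic_by_set, key=aic_by_set.get)

def vote_optimal_set(per_sample_aic):
    """Mode voting: the per-sample winner that occurs most often."""
    winners = [best_set_by_aic(a) for a in per_sample_aic]
    return Counter(winners).most_common(1)[0][0]

# Hypothetical AIC values for three selected samples and two candidate sets.
per_sample_aic = [
    {("population",): 5.0, ("population", "land_type"): 3.0},
    {("population",): 2.0, ("population", "land_type"): 4.0},
    {("population",): 6.0, ("population", "land_type"): 1.0},
]
chosen = vote_optimal_set(per_sample_aic)
```

In a tie, `Counter.most_common` returns the set that was encountered first, so a real implementation may want an explicit tie-breaking rule.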
5) Calculating the predicted bicycle demand of public bicycle rental points whose demand heat is higher than the threshold, according to the public bicycle potential demand evaluation model obtained in step 4):
The two sets of network weights obtained in step 4) are used as the potential demand evaluation model to calculate and analyze the potential public bicycle demand in each region. For each bicycle rental point ZLDI in the area whose demand heat is above Range, the relevant parameter sets of the surrounding traffic cells are input, through the primary key link, into the potential demand evaluation model obtained in step 4), giving the public bicycle arrival evaluation value Aot_i and the rental evaluation value Pot_i of the rental point for the corresponding time period. From the Aot_i and Pot_i corresponding to ZLDI and the current value At_i, a rental demand correction coefficient is established:
Figure BDA0001168806420000121
The predicted rental demand Pot_i is multiplied by the correction coefficient α_i to obtain the final public bicycle demand evaluation values: public bicycle rental volume at the rental point = max(α_i Pot_i, Pt_i); public bicycle arrival volume at the rental point = max(Aot_i, At_i).
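The final evaluation step reduces to two max operations, sketched below. The correction coefficient is passed in as a given value because its formula is not reproduced in the text here. The function name and the sample numbers are hypothetical.

```python
def final_demand(alpha_i, pot_i, pt_i, aot_i, at_i):
    """Return (rental volume, arrival volume) for one rental point.

    alpha_i : correction coefficient (taken as given)
    pot_i   : model-evaluated rental volume
    pt_i    : current rental volume
    aot_i   : model-evaluated arrival volume
    at_i    : current arrival volume
    """
    rental = max(alpha_i * pot_i, pt_i)
    arrival = max(aot_i, at_i)
    return rental, arrival

# Model estimate exceeds the current rental volume: the corrected estimate is used.
case_a = final_demand(alpha_i=1.5, pot_i=100, pt_i=110, aot_i=90, at_i=95)
# Current values exceed the estimates: the current values are kept.
case_b = final_demand(alpha_i=0.5, pot_i=100, pt_i=80, aot_i=70, at_i=95)
```

Taking the maximum means the evaluated demand is never lower than the currently observed volume at the rental point.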
Once the potential public bicycle rental demand of each rental point has been evaluated, the results help the planning department optimize public bicycle deployment according to the current number of bicycles deployed and the demand heat, so as to meet residents' public bicycle travel demand, balance demand heat across the region, and raise the region's bicycle travel mode share.

Claims (9)

1. A big data based public bike potential demand prediction method, characterized in that the method comprises:
1) analyzing and obtaining resident travel attributes in an analysis area according to the mobile phone signaling big data, wherein the resident travel attributes comprise population distribution, occupational area distribution and population travel PA distribution;
2) analyzing and obtaining land attributes in the analysis area according to the internet big data, wherein the land attributes comprise land types and public transport station distribution;
3) analyzing the PA current situation of the public bicycle rental lot according to the public bicycle card-swiping big data, and calculating the demand heat of the public bicycle rental lot;
4) taking the public bike leasing points with the demand heat degree lower than the demand threshold value as a total sample, taking the resident trip attribute obtained in the step 1) and the land attribute obtained in the step 2) as input parameters of the sample, taking the PA current situation of the public bike leasing points obtained in the step 3) as output parameters of the sample, and carrying out BP neural network training by combining the Akaike information criterion (AIC) and a random forest rule to obtain a potential demand evaluation model of the public bike;
the step 4) is specifically as follows:
41) taking the public bike leasing points with the demand heat degree lower than the demand threshold value as a total sample, taking the resident travel attribute obtained in the step 1) and the land attribute obtained in the step 2) as input parameters of the sample, taking the PA current situation of the public bike leasing points obtained in the step 3) as output parameters of the sample, and proportionally dividing the sample into a training sample and a testing sample;
42) selecting samples from the training samples for training according to a random forest rule, and permuting and combining the input parameters of each selected sample to obtain all input parameter sets under each selected sample;
43) carrying out BP neural network training on all input parameter sets under each selected sample, calculating model error values under different input parameter sets by using the Akaike information criterion (AIC), and selecting the input parameter set with the minimum model error value as the optimal input parameter set of the selected sample;
44) taking the optimal input parameter set with the highest occurrence frequency under all the selected samples as the optimal input parameter set of the model, and carrying out BP neural network training on all the training samples to obtain a potential demand evaluation model of the public bike;
5) calculating a bicycle demand predicted value of the public bicycle rental lot with the demand heat higher than a threshold value according to the public bicycle potential demand evaluation model obtained in the step 4).
2. The big data-based public bicycle potential demand prediction method according to claim 1, wherein the calculating of the model error values under different input parameter sets by using the AIC criterion is specifically:
$\mathrm{AIC} = 2K + n\ln(\mathrm{RSS}/n)$
wherein K is the number of input parameters in the current input parameter set, n is the number of selected samples, and RSS is the residual sum of squares of the model.
3. The big data-based public bike potential demand prediction method as claimed in claim 1, wherein the step 4) further comprises performing a k-fold cross test on the total sample and calculating an error value of the public bike potential demand evaluation model; the k-fold cross test specifically comprises: dividing the complete sample set into k parts; taking Part1 as the test sample and Part2, Part3, ..., Partk as training samples; then taking Part2 as the test sample and Part1, Part3, ..., Partk as training samples; and so on for k tests in total; calculating the model error value obtained in each test; and taking the average of these model error values as the error value of the public bicycle potential demand evaluation model.
4. The big data based public bicycle potential demand prediction method according to claim 1, wherein the step 1) is specifically as follows:
11) according to a cell identification field in the mobile phone signaling big data, traffic analysis cell division is carried out on an analysis area, the area of each traffic analysis cell and sample expansion parameters of a general population group are determined, and population distribution in the analysis area is obtained through calculation;
12) determining a mobile phone user leaving time threshold corresponding to each traffic analysis cell, performing bivariate classification on the residence time sections and residence time lengths of different traffic analysis cells of all mobile phone users throughout the day, calculating the confidence coefficient of the bivariate classification according to the residence time lengths, and estimating and obtaining the occupational area distribution according to the confidence coefficient;
13) establishing a primary key link between the public bicycle leasing point and the traffic analysis cell, and calculating the departure amount and the arrival amount of residents in the traffic analysis cell related to the public bicycle leasing point according to the mobile phone signaling big data to obtain the population travel PA distribution.
5. The big data based public bicycle potential demand prediction method according to claim 1, wherein the step 2) is specifically as follows:
21) obtaining a map in an analysis area according to the internet big data;
22) extracting and integrating the information points of the map in the analysis area obtained in the step 21), and classifying according to the functions of the information points;
23) obtaining the land types and the public transport station distribution in the analysis area according to the classification result.
6. The big-data-based public bike potential demand prediction method according to claim 1, wherein the demand heat of the public bike rental lot is specifically:
Figure FDA0003215135310000031
wherein DH_i is the demand heat of the i-th public bicycle rental spot, At_i and Pt_i are the average public bicycle arrival and departure volumes at the i-th public bicycle rental spot, Ot_i is the number of bicycles stocked at the start of the period, and Ot'_i is the number of rental bicycles at the end of the period.
7. The big data based public bicycle potential demand prediction method according to claim 1, wherein the step 5) is specifically as follows:
51) taking the resident trip attribute and the land attribute corresponding to the public bicycle rental point with the demand heat degree higher than the threshold as input parameters, bringing the input parameters into the potential demand evaluation model of the public bicycles obtained in the step 4), and obtaining an arrival quantity evaluation value Aot_i and a rental amount evaluation value Pot_i of the public bicycle rental point in the corresponding time period;
52) calculating a correction coefficient α_i from the arrival quantity evaluation value Aot_i and the rental amount evaluation value Pot_i of the public bicycle rental point in the corresponding time period obtained in the step 51), in conjunction with the average public bicycle arrival amount At_i of the public bicycle rental point in that time period;
53) calculating a demand predicted value of the public bicycle rental point according to the correction coefficient, wherein the demand predicted value comprises the public bicycle rental amount and the public bicycle arrival amount.
8. The big data based public bike potential demand prediction method according to claim 7, wherein the correction coefficients are specifically:
Figure FDA0003215135310000032
9. The big-data-based public bike potential demand prediction method according to claim 7, wherein the public bike rental amount is max(α_i Pot_i, Pt_i) and the public bike arrival amount is max(Aot_i, At_i).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611091696.9A CN108133302B (en) 2016-12-01 2016-12-01 Public bicycle potential demand prediction method based on big data

Publications (2)

Publication Number Publication Date
CN108133302A CN108133302A (en) 2018-06-08
CN108133302B true CN108133302B (en) 2021-12-14

Family

ID=62388261

Country Status (1)

Country Link
CN (1) CN108133302B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147919A (en) * 2018-11-21 2019-08-20 太原理工大学 A kind of public bicycles automatic scheduling method based on price competition mechanism
CN110119838A (en) * 2019-04-17 2019-08-13 成都信息工程大学 A kind of shared bicycle demand forecast system, method and device
CN110059986B (en) * 2019-05-08 2021-02-19 武汉大学 Dynamic releasing method and system for shared bicycle
CN110189029A (en) * 2019-05-31 2019-08-30 福州大学 A kind of bicycle cycling and parking demand appraisal procedure based on extensive mobile phone location data
CN110427595B (en) * 2019-07-24 2023-04-07 东南大学 Method for quantitatively analyzing influence of shared bicycle on renting amount of public bicycles with piles
CN112580842A (en) * 2019-09-29 2021-03-30 上海浦东建筑设计研究院有限公司 Shared bicycle supply decision early warning method and system
CN111489039B (en) * 2020-04-15 2023-05-19 悉地(苏州)勘察设计顾问有限公司 Method and system for predicting total quantity of shared bicycle
CN111984924A (en) * 2020-07-07 2020-11-24 东南大学 Method for evaluating influence of public bicycle leasing policy on regional bicycle safety
CN111737605A (en) * 2020-07-09 2020-10-02 南京瑞栖智能交通技术产业研究院有限公司 Travel purpose identification method and device based on mobile phone signaling data
CN112149902B (en) * 2020-09-23 2022-06-14 吉林大学 Subway short-time arrival passenger flow prediction method based on passenger flow characteristic analysis
CN112836996B (en) * 2021-03-10 2022-03-04 西南交通大学 Method for identifying potential ticket buying demand of passenger
CN113052686B (en) * 2021-04-30 2024-03-08 中国银行股份有限公司 Data processing method and device
CN113393104A (en) * 2021-06-03 2021-09-14 东南大学 Method for evaluating influence of rail transit running state on peripheral public bicycles
CN115810271B (en) * 2023-02-07 2023-04-28 安徽交欣科技股份有限公司 Method for judging passenger flow corridor position based on card swiping data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361398A (en) * 2014-08-04 2015-02-18 浙江工业大学 Method for predicting natural demands on public bicycle rental spots
CN104766146A (en) * 2015-04-24 2015-07-08 陆化普 Traffic demand forecasting method and system
CN105513351A (en) * 2015-12-17 2016-04-20 北京亚信蓝涛科技有限公司 Traffic travel characteristic data extraction method based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant