CN107316108A - A sliding-window multi-feature method for predicting citizens' bus route choice - Google Patents

A sliding-window multi-feature method for predicting citizens' bus route choice

Info

Publication number
CN107316108A
Authority
CN
China
Prior art keywords
passenger
feature
nearest
behavior
different
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710463795.3A
Other languages
Chinese (zh)
Inventor
李拥军
吴雁
杨鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201710463795.3A
Publication of CN107316108A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Engineering & Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Development Economics (AREA)
  • Mathematical Physics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a sliding-window multi-feature method for predicting citizens' bus route choice. The method constructs samples with a sliding window: sample feature attributes are built from passengers' historical behavior records within a fixed 139-day time window, and the window is slid several times to cover different time intervals and produce multiple training samples. Within each time window, feature attributes are designed from seven aspects, including the feature attribute class of passengers' historical travel behavior, the feature attribute class of line characteristics, the interaction feature attribute class of passengers on specific bus lines, and the feature attribute class of bus card types. Precision, recall and the F1 value are used as evaluation criteria, and final scores are ranked by F1. The invention trains a model on the constructed samples; the method is fast, reliable and effective.

Description

A sliding-window multi-feature method for predicting citizens' bus route choice
Technical field
The present invention relates to bus route selection, and in particular to sliding-window multi-feature prediction of citizens' bus route choice; it belongs to the technical field of data mining.
Background
With China's continued economic growth and rising urbanization rate, citizens' travel demand keeps increasing and traffic congestion is becoming increasingly severe. Analyzing and mining the historical travel behavior of fixed passengers to predict their future travel on fixed lines is of great guiding significance for providing passengers with symmetric information and a safe travel environment.
Summary of the invention
The object of the present invention is to provide a method for predicting citizens' bus route choice that constructs sliding-window multi-features from the historical behavior of fixed citizens, which is of great significance for providing passengers with symmetric information and a safe, comfortable travel environment.
The object of the invention is achieved through the following technical solution:
A sliding-window multi-feature method for predicting citizens' bus route choice, comprising the following steps:
1) Samples are constructed with a sliding window: sample feature attributes are built from passengers' historical behavior records within a fixed 139-day time window, and the window is slid several times to cover different time intervals and construct multiple training samples;
Within each time window, feature attributes are designed from seven aspects: the feature attribute class of passengers' historical travel behavior, the feature attribute class of line characteristics, the interaction feature attribute class of passengers on specific bus lines, the feature attribute class of bus card types, the interaction feature attribute class of the behavior patterns of different passenger types on specific bus lines, the feature attribute class of bus card issuing places, the feature attribute class of the relationship between card issuing place and fixed bus line, and ratio-type feature attributes reflecting change trends. Each class is further designed in detail in several categories: statistical, temporal, ratio-trend and encoding;
2) Model evaluation uses the classical precision, recall and F1 value as evaluation criteria, and the final score is ranked by F1. The formulas are as follows:
$$\mathrm{Precision} = \frac{|\cap(\mathrm{PredictionSet}, \mathrm{ReferenceSet})|}{|\mathrm{PredictionSet}|}$$
$$\mathrm{Recall} = \frac{|\cap(\mathrm{PredictionSet}, \mathrm{ReferenceSet})|}{|\mathrm{ReferenceSet}|}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
Each sample is jointly determined by card_id and line_name, where card_id is the unique ID of the bus card and line_name is the name of the bus line; the class label of the sample is determined by whether the cardholder travels on line line_name in the following fixed time period.
Preferably, the feature attribute class of passengers' historical travel behavior includes time-series features of passenger behavior on all bus lines, passenger temporal features, ratio-trend features of changes in passenger travel, and features of passenger category;
The time-series features of passenger behavior on all bus lines count each passenger's total number of rides on all bus lines in the last 12 hours and in the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days;
The passenger temporal features are the passenger's average interval between rides in days, the time of the passenger's most recent card transaction, the passenger's number of active hours, the number of weeks with more than 1 trip, the number of weeks with more than 2 trips, the average charge interval in days, and the average number of card swipes per week;
The ratio-trend features of changes in passenger travel take the trend in the passenger's historical behavior into account: the proportion of weeks in which the passenger's behavior count exceeds 2, the ratio of the passenger's swipe count in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks, the share of weekend behavior in the total behavior count, the share of workday behavior in the total swipe count, and so on. Features of this class characterize the passenger's riding pattern;
The features of passenger category: passenger category influences future travel. Commuters travel at regular times, whereas the travel of elderly passengers is more strongly affected by other factors; the 7 different bus card types are mapped to distinct features.
Preferably, the feature attribute class of line characteristics includes line time-series statistical features, trend features of historical line ridership, and bus line encoding features;
The line time-series statistical features reflect that the historical passenger flow of a line influences passengers' travel: for each line, the passenger flow in the last 12 hours and in the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days is counted; the total weekend and workday passenger flow within the given time window is counted; and the weekend daily average, workday daily average and historical maximum passenger flow are counted;
The trend features of historical line ridership reflect that changes in historical passenger flow influence passengers' travel: for each line, features are constructed from the ratio of the passenger flow in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks;
The bus line encoding features reflect that the area a line serves and the number of stops on each line influence the route a passenger chooses in the future; they mainly comprise a line type feature and a per-line stop count feature.
Preferably, the interaction feature attribute class of passengers on specific bus lines includes time-series statistical features for each line the passenger has ridden, temporal features for each line the passenger has ridden, and ratio-trend features of the passenger's riding behavior on previously ridden lines;
The time-series statistical features for each line the passenger has ridden characterize how active the passenger has been on each specific line: within the fixed time window, the passenger's card transactions on each previously ridden line are counted for the last 12 hours and the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days, together with the passenger's maximum ride count, weekend ride count and workday ride count;
The temporal features for each line the passenger has ridden are the interval since the passenger's most recent ride on the line, the intervals between the passenger's rides within the given time window, the number of active days and active hours with ride records, the minimum number of days before riding again, and the average number of days before riding again;
The ratio-trend features of the passenger's riding behavior on previously ridden lines are the ratio of the passenger's rides on the specific line in the last 1 week to those in the last 2 weeks, the passenger's active hours on the line as a share of the passenger's active hours across all lines, the share of weekend rides in total rides, and the share of workday rides in total rides.
Preferably, the feature attribute class of bus card types includes time-series statistical features by passenger type and trend features by passenger type;
The time-series statistical features by passenger type characterize the travel patterns of different passenger types: for each passenger type, weekend and workday behavior counts on all lines are computed for the last 12 hours and the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days;
The trend features by passenger type are the ratios of each group's travel volume in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks.
Preferably, the feature attribute class of bus card issuing places includes time-series statistical features by issuing place, travel trend features by issuing place, and encoding features of issuing place;
The time-series statistical features by issuing place are the total behavior counts of the passengers of each issuing place in the last 12 hours and the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days;
The travel trend features by issuing place are the ratios of the travel volume of each issuing place's passengers in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks, and the share of weekend trips in total trips;
The encoding features of issuing place map each bus card issuing place to a feature, because riding patterns and available lines differ between issuing places.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1) The invention builds training and test samples with a sliding window. It compresses the passenger's historical travel sequence piecewise and emphasizes recent behavior: a passenger's recent travel pattern correlates strongly with future travel, while the influence of past behavior gradually weakens. Recent features are therefore extracted at fine granularity and distant features at coarse granularity, supplemented by overlapping extraction. When constructing sample feature attributes and class labels, the time window size is fixed, and multiple samples are constructed by sliding the window.
2) The invention extracts features from fixed citizens' historical bus card data at different granularities and constructs feature attributes that differ from one another. Features are extracted at different granularities from passengers' historical behavior data, each line's past daily passenger flow, passengers' active days, travel routes, and the final prediction label;
3) The invention designs multiple types of feature attributes. Citizens' travel feature attributes are designed from seven aspects: the feature attribute class of passengers' historical travel behavior, the feature attribute class of line characteristics, the interaction feature attribute class of passengers on specific bus lines, the feature attribute class of bus card types, the interaction feature attribute class of the riding patterns of different passenger types on specific bus lines, the feature attribute class of bus card issuing places, the feature attribute class of the relationship between card issuing place and fixed bus line, and some other ratio-type feature attributes reflecting change trends.
Detailed description
For a better understanding of the present invention, it is described in detail below with reference to an embodiment, but the claimed scope of protection is not limited thereto.
The invention predicts the bus lines citizens will choose for future travel from the historical behavior features of a fixed group of citizens. The scheme is designed as follows.
To avoid inconsistent travel data distributions across the multiple constructed samples, the invention constructs samples with a sliding window. The experimental data are historical bus card transaction records of users on bus lines in Guangdong for the five months from August 1, 2014 to December 31, 2014. Sample feature attributes are constructed from passengers' historical behavior records within a fixed 139-day time window, and the class label of a sample is determined by whether the passenger travels on the fixed bus line in the following 7 days; the window is slid several times to cover different time intervals and construct multiple training samples.
Within each time window, feature attributes are designed from seven aspects: the feature attribute class of passengers' historical travel behavior, the feature attribute class of line characteristics, the interaction feature attribute class of passengers on specific bus lines, the feature attribute class of bus card types, the interaction feature attribute class of the behavior patterns of different passenger types on specific bus lines, the feature attribute class of bus card issuing places, the feature attribute class of the relationship between card issuing place and fixed bus line, and ratio-type feature attributes reflecting change trends. Each class is further designed in detail in several categories: statistical, temporal, ratio-trend and encoding.
1) Design of the feature set of passenger behavior attributes:
Time-series features of passenger behavior on all bus lines: recent ride statistics on all lines describe the passenger's riding pattern. The closer a transaction is to the present, the greater its influence on future travel; as a historical transaction recedes into the past, its influence shrinks, so the extraction interval grows. For each passenger, the total number of rides on all bus lines is counted for the last 12 hours and the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days.
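The multi-horizon counting described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `ride_count_features`, the representation of rides as `datetime` objects, and the half-open interval convention (window end excluded) are all assumptions.

```python
from datetime import datetime, timedelta

# Lookback horizons named in the text: 12 hours, then 1..139 days.
LOOKBACKS = [timedelta(hours=12)] + [
    timedelta(days=d) for d in (1, 3, 7, 14, 28, 56, 84, 112, 139)
]

def ride_count_features(ride_times, window_end):
    """Count rides inside each lookback horizon that ends at window_end.

    A ride at time t is counted for a horizon when
    window_end - horizon <= t < window_end (assumed convention).
    """
    counts = []
    for span in LOOKBACKS:
        start = window_end - span
        counts.append(sum(1 for t in ride_times if start <= t < window_end))
    return counts

# Hypothetical ride history of one passenger.
rides = [
    datetime(2014, 12, 17, 20),  # inside the last 12 hours
    datetime(2014, 12, 16, 8),   # inside the last 3 days
    datetime(2014, 12, 1, 8),    # inside the last 28 days
    datetime(2014, 8, 1, 8),     # first day of the 139-day window
]
features = ride_count_features(rides, datetime(2014, 12, 18, 0))
print(features)
```

Because the horizons are nested, the counts never decrease from left to right, which gives the fine-to-coarse granularity the text describes.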
Passenger temporal features: these describe how active the passenger is within the given time window: the passenger's average interval between rides in days, the time of the passenger's most recent card transaction, the passenger's number of active hours, the number of weeks with more than 1 trip, the number of weeks with more than 2 trips, the average charge interval in days, and the average number of card swipes per week.
Ratio-trend features of changes in passenger travel: these take the trend in the passenger's historical behavior into account: the proportion of weeks in which the passenger's behavior count exceeds 2, the ratio of the passenger's swipe count in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks, the share of weekend behavior in the total behavior count, the share of workday behavior in the total swipe count, and so on. Features of this class characterize the passenger's riding pattern.
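One way to read the "last 1, 2 and 4 weeks versus last 2, 4 and 8 weeks" ratios is as three paired shares: 1 week over 2 weeks, 2 over 4, and 4 over 8. The sketch below follows that reading, which is an assumption, as are the function name `trend_ratios` and the choice to return 0.0 when the longer window has no rides.

```python
def trend_ratios(weekly_counts):
    """Share of recent activity within a window twice as long.

    weekly_counts[0] is the most recent week; at least 8 weeks are expected.
    Returns the ratios for the (1, 2), (2, 4) and (4, 8) week pairs.
    """
    ratios = []
    for short, long in ((1, 2), (2, 4), (4, 8)):
        recent = sum(weekly_counts[:short])
        baseline = sum(weekly_counts[:long])
        # Assumed convention: no activity in the longer window gives 0.0.
        ratios.append(recent / baseline if baseline else 0.0)
    return ratios

# Hypothetical weekly swipe counts, most recent week first.
ratios = trend_ratios([4, 4, 2, 2, 1, 1, 1, 1])
print(ratios)
```

A ratio near 0.5 suggests stable activity, a value well above 0.5 suggests rising activity, and a value well below 0.5 suggests a decline.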
Features of passenger category: passenger category influences future travel. Commuters travel at regular times, whereas the travel of elderly passengers is more strongly affected by other factors; the 7 different bus card types are mapped to distinct features.
2) Design of the feature attribute class of line characteristics:
Line time-series statistical features: the historical passenger flow of a line influences passengers' travel. For each line, the passenger flow in the last 12 hours and in the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days is counted, together with the total weekend and workday passenger flow within the given time window, and the weekend daily average, workday daily average and historical maximum passenger flow.
Trend features of historical line ridership: changes in historical passenger flow influence passengers' travel. For each line, features are constructed from the ratio of the passenger flow in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks.
Bus line encoding features: the area a line serves and the number of stops on each line influence the route a passenger chooses in the future; these features mainly comprise a line type feature and a per-line stop count feature.
3) Design of the interaction feature attribute class of passengers on specific bus lines:
Time-series statistical features for each line the passenger has ridden: these characterize how active the passenger has been on each specific line. Within the fixed time window, the passenger's card transactions on each previously ridden line are counted for the last 12 hours and the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days, together with the passenger's maximum ride count, weekend ride count and workday ride count.
Temporal features for each line the passenger has ridden: the interval since the passenger's most recent ride on the line, the intervals between the passenger's rides within the given time window, the number of days with ride records (active days) and the number of active hours, the minimum number of days before riding again, the average number of days before riding again, and similar features.
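The per-line temporal features above can be sketched for a single (passenger, line) pair as follows. The function name `line_time_features`, the dictionary keys, and the use of calendar dates are hypothetical; the text does not specify how same-day rides are treated, so this sketch collapses them into one active day.

```python
from datetime import date

def line_time_features(ride_dates, window_end):
    """Temporal features for one passenger on one line.

    ride_dates must be non-empty, since the features are only defined
    for lines the passenger has actually ridden.
    """
    days = sorted(set(ride_dates))  # distinct active days
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    return {
        "days_since_last_ride": (window_end - days[-1]).days,
        "active_days": len(days),
        # Return intervals are undefined with a single active day.
        "min_return_days": min(gaps) if gaps else None,
        "mean_return_days": sum(gaps) / len(gaps) if gaps else None,
    }

# Hypothetical rides on one line; two rides fall on the same day.
feats = line_time_features(
    [date(2014, 12, 1), date(2014, 12, 1), date(2014, 12, 4), date(2014, 12, 10)],
    window_end=date(2014, 12, 17),
)
print(feats)
```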
Ratio-trend features of the passenger's riding behavior on previously ridden lines: the ratio of the passenger's rides on the specific line in the last 1 week to those in the last 2 weeks, the passenger's active hours on the line as a share of the passenger's active hours across all lines, the share of weekend rides in total rides, the share of workday rides in total rides, and similar features.
4) Design of the feature attribute class of bus card types:
Time-series statistical features by passenger type: different groups of passengers have different travel patterns. To characterize them, weekend and workday behavior counts on all lines are computed for each passenger type for the last 12 hours and the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days.
Trend features by passenger type: these reflect the behavioral trends of different groups; for example, the travel pattern of elderly passengers changes with the seasons, and that of students changes with the winter and summer vacations. For each group, the ratio of travel volume in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks is computed.
5) Design of the feature attribute class of bus card issuing places:
Time-series statistical features by issuing place: travel patterns differ between passengers from different issuing places. For the passengers of each issuing place, total behavior counts are computed for the last 12 hours and the last 1, 3, 7, 14, 28, 56, 84, 112 and 139 days (weekends and workdays counted separately).
Travel trend features by issuing place: the ratios of the travel volume of each issuing place's passengers in the last 1, 2 and 4 weeks to that in the last 2, 4 and 8 weeks, and the share of weekend trips in total trips.
Encoding features of issuing place: riding patterns and available lines differ between bus card issuing places. To capture this information in the samples, the 20 different issuing places are mapped to features.
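The text does not say how issuing places are mapped to features. One common choice, assumed here purely for illustration, is one-hot encoding, where each place gets its own 0/1 column. Only Guangzhou and Foshan are named in the source, so the category list below is deliberately incomplete, and `one_hot` is a hypothetical helper.

```python
def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of value.

    An unknown value yields an all-zero vector (assumed convention).
    """
    return [1 if value == category else 0 for category in categories]

# Only two of the 20 issuing places are named in the source; the rest are unknown.
ISSUING_PLACES = ["Guangzhou", "Foshan"]

encoded = one_hot("Foshan", ISSUING_PLACES)
print(encoded)
```

The same helper would apply to the 7 bus card types, again assuming one-hot encoding is an acceptable stand-in for the unspecified mapping.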
Model evaluation uses the classical precision, recall and F1 value as evaluation criteria, and the final score is ranked by F1. The formulas are as follows:
$$\mathrm{Precision} = \frac{|\cap(\mathrm{PredictionSet}, \mathrm{ReferenceSet})|}{|\mathrm{PredictionSet}|}$$
$$\mathrm{Recall} = \frac{|\cap(\mathrm{PredictionSet}, \mathrm{ReferenceSet})|}{|\mathrm{ReferenceSet}|}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
Each sample is jointly determined by card_id and line_name, where card_id is the unique ID of the bus card and line_name is the name of the bus line; the class label of the sample is determined by whether the cardholder travels on line line_name in the following fixed time period.
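The set-based evaluation can be sketched as below, assuming that the prediction set and the reference set are both sets of (card_id, line_name) pairs, which matches the sample definition above but is not stated explicitly in the source. The function name `evaluate` and the zero-division convention are assumptions.

```python
def evaluate(prediction_set, reference_set):
    """Precision, recall and F1 over sets of (card_id, line_name) pairs."""
    hits = len(prediction_set & reference_set)
    precision = hits / len(prediction_set) if prediction_set else 0.0
    recall = hits / len(reference_set) if reference_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical predicted and actual (card_id, line_name) pairs.
predicted = {("c1", "line_1"), ("c1", "line_2"), ("c2", "line_1"), ("c3", "line_3")}
actual = {("c1", "line_1"), ("c2", "line_1"), ("c2", "line_2")}

precision, recall, f1 = evaluate(predicted, actual)
print(precision, recall, f1)
```

Here two of the four predicted pairs are correct and two of the three actual pairs are recovered, so F1 works out to 4/7.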
Embodiment: 2015 Guangdong citizens' bus travel prediction
The embodiment data are five months of historical bus card user data for some bus lines, provided by a public transport company in Guangdong Province. The structure of the historical data is shown in Table 1.
Table 1. Data table fields
Field | Field name | Field value
Line_name | Line name | 7 lines
Terminal_id | Card swiping terminal ID | Encrypted field
Card_id | IC card ID | Encrypted field
Create_city | Card issuing place | 20 places such as Guangzhou and Foshan
Deal_time | Transaction time | 2014090108
Card_type | Bus card type | 7 types, such as regular card and senior card
Stop_cnt | Number of stops on the line | 24
Line_type | Line type | Within Guangzhou / Guangzhou-Foshan cross-region
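The sample Deal_time value 2014090108 suggests a year-month-day-hour layout, although the source does not state the format. Under that assumption, a transaction time can be parsed and split into the weekend and workday categories used by the features, as in this sketch; `parse_deal_time` and `is_weekend` are hypothetical helpers.

```python
from datetime import datetime

def parse_deal_time(raw):
    """Parse a Deal_time string, assuming the layout YYYYMMDDHH."""
    return datetime.strptime(raw, "%Y%m%d%H")

def is_weekend(moment):
    """True for Saturday and Sunday (weekday() returns 5 and 6)."""
    return moment.weekday() >= 5

deal_time = parse_deal_time("2014090108")
print(deal_time, is_weekend(deal_time))
```

Public holidays are not handled here; the source only distinguishes weekends from workdays.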
Based on the historical ride records of a fixed group of passengers on bus lines, the behavior patterns of this group in public transport are mined, passengers' travel habits and preferences are inferred, and the bus lines passengers will ride in a fixed future time period are predicted.
The modeling data are Lingnan Tong card user historical data for some bus lines in Guangdong from August 1, 2014 to December 31, 2014. The sliding window is fixed at 139 days for extracting the designed sample features, and the sample label (1/0: ride or no ride) is extracted according to whether the passenger travels on a given line in the subsequent 7 days. Following this idea, sample_1 is constructed from passenger feature attributes extracted from August 1 to December 17 and labels extracted from December 18 to December 24; sample_2 from features extracted from August 3 to December 19 and labels from December 20 to December 26; sample_3 from features extracted from August 6 to December 22 and labels from December 23 to December 29; and sample_4 from features extracted from August 8 to December 24 and labels from December 25 to December 31. In this way, multiple sufficiently different samples are constructed. The unlabeled sample sample_0 is constructed from passenger features extracted from August 15 to December 31 and is used to predict passengers' travel on fixed bus lines from January 1 to January 7, 2015. Each sample contains 291 feature attributes obtained with the feature construction method described above.
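The window dates listed above follow from a 139-day feature window immediately followed by a 7-day label window. The sketch below reproduces them from the start dates given in the text; the function name `make_window` and the tuple layout are assumptions.

```python
from datetime import date, timedelta

FEATURE_DAYS = 139  # length of the feature window, from the text
LABEL_DAYS = 7      # length of the label window, from the text

def make_window(feature_start):
    """Return (feature_start, feature_end, label_start, label_end), all inclusive."""
    feature_end = feature_start + timedelta(days=FEATURE_DAYS - 1)
    label_start = feature_end + timedelta(days=1)
    label_end = label_start + timedelta(days=LABEL_DAYS - 1)
    return feature_start, feature_end, label_start, label_end

# Start dates as listed in the embodiment; sample_0 is the unlabeled prediction sample.
starts = {
    "sample_1": date(2014, 8, 1),
    "sample_2": date(2014, 8, 3),
    "sample_3": date(2014, 8, 6),
    "sample_4": date(2014, 8, 8),
    "sample_0": date(2014, 8, 15),
}
windows = {name: make_window(start) for name, start in starts.items()}

for name, window in windows.items():
    print(name, *window)
```

For sample_0 the label window falls in January 2015, after the end of the data, which is why that sample has no labels and serves only for prediction.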
Because the feature attributes of the constructed samples are highly correlated and differ in granularity, and the random forest algorithm handles such feature attributes well, the invention trains the model on the samples with the random forest algorithm, which improves the model's predictive performance. The resulting model performance metrics are shown in Table 2.
Table 2. Model performance results

Claims (6)

1. A sliding-window multi-feature method for predicting citizens' bus route choice, characterized by comprising the following steps:
1) Samples are constructed with a sliding window: sample feature attributes are built from passengers' historical behavior records within a fixed 139-day time window, and the window is slid several times to cover different time intervals and construct multiple training samples;
Within each time window, feature attributes are designed from seven aspects: the feature attribute class of passengers' historical travel behavior, the feature attribute class of line characteristics, the interaction feature attribute class of passengers on specific bus lines, the feature attribute class of bus card types, the interaction feature attribute class of the behavior patterns of different passenger types on specific bus lines, the feature attribute class of bus card issuing places, the feature attribute class of the relationship between card issuing place and fixed bus line, and ratio-type feature attributes reflecting change trends. Each class is further designed in detail in several categories: statistical, temporal, ratio-trend and encoding;
2) Model evaluation uses the classical precision, recall and F1 value as evaluation criteria, and the final score is ranked by F1. The formulas are as follows:
$$\mathrm{Precision} = \frac{\left|\cap(\mathrm{PredictionSet}, \mathrm{ReferenceSet})\right|}{\left|\mathrm{PredictionSet}\right|}$$
$$\mathrm{Recall} = \frac{\left|\cap(\mathrm{PredictionSet}, \mathrm{ReferenceSet})\right|}{\left|\mathrm{ReferenceSet}\right|}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
Each sample is jointly identified by card_id and line_name, where card_id is the unique ID of the transit card and line_name is the name of the bus line; the sample's class label is determined by whether the cardholder travels on line line_name during the fixed future period.
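The evaluation step in claim 1 can be illustrated with a small sketch. It treats the prediction set and the reference set as sets of (card_id, line_name) pairs, which follows the sample definition above; the function name, the zero-division handling and the example pairs are assumptions made for illustration.

```python
def precision_recall_f1(prediction_set, reference_set):
    """Set-based precision, recall and F1 over (card_id, line_name) pairs."""
    hits = len(prediction_set & reference_set)
    # Returning 0.0 for empty sets is an assumption; the source does not cover this case.
    precision = hits / len(prediction_set) if prediction_set else 0.0
    recall = hits / len(reference_set) if reference_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical pairs: predicted riders versus riders actually observed in the label week.
predicted = {("c1", "line_A"), ("c1", "line_B"), ("c2", "line_A"), ("c3", "line_C")}
actual = {("c1", "line_A"), ("c2", "line_A"), ("c4", "line_B")}

p, r, f1 = precision_recall_f1(predicted, actual)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

With two correct pairs out of four predicted and three actual, precision is 1/2, recall is 2/3 and F1 is 4/7.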
2. The sliding-window multi-feature method for predicting residents' choice of bus line for travel according to claim 1, characterized in that the feature attribute class of passengers' historical travel behavior includes time-series features of passenger behavior on all bus lines, passenger temporal features, ratio/trend features of changes in passenger travel, and features of passenger category attributes;
Wherein, the time-series features of passenger behavior on all bus lines count, for each passenger, the total number of rides on all bus lines in the most recent 12 hours and in the most recent 1, 3, 7, 14, 28, 56, 84, 112 and 139 days;
The passenger temporal features refer to the passenger's average interval between rides in days, the time of the passenger's most recent card transaction, the user's number of active hours, the number of weeks with more than 1 trip, the number of weeks with more than 2 trips, the average charge interval in days, and the average number of card swipes per week;
The ratio/trend features of changes in passenger travel: considering the influence of trends in the passenger's historical behavior, these include the share of weeks in which the passenger's number of rides exceeds 2, the ratio of the passenger's card swipes in the most recent 1, 2 and 4 weeks to those in the most recent 2, 4 and 8 weeks, the share of weekend rides in total rides, the share of weekday rides in total card swipes, and so on; features of this kind characterize the passenger's riding patterns;
The features of passenger category attributes: the passenger's category influences future travel; commuters travel at regular times, whereas elderly passengers' travel is more strongly affected by other factors; the 7 different transit card types are mapped to different features.
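The multi-horizon ride counts in claim 2 can be sketched as follows. The sketch assumes that a passenger's rides are available as a list of timestamps and that each horizon is measured backwards from the end of the feature window; the function name and feature keys are hypothetical.

```python
from datetime import datetime, timedelta

# Look-back horizons in days, as listed in the claim (the 12-hour horizon is handled separately).
LOOKBACK_DAYS = (1, 3, 7, 14, 28, 56, 84, 112, 139)


def ride_count_features(ride_times, window_end):
    """Count one passenger's rides in each look-back horizon ending at window_end."""
    half_day_cutoff = window_end - timedelta(hours=12)
    features = {
        "rides_last_12h": sum(half_day_cutoff < t <= window_end for t in ride_times)
    }
    for days in LOOKBACK_DAYS:
        cutoff = window_end - timedelta(days=days)
        features[f"rides_last_{days}d"] = sum(cutoff < t <= window_end for t in ride_times)
    return features


# Hypothetical ride timestamps for one card within the window ending December 17, 2014.
rides = [
    datetime(2014, 8, 1, 8, 0),
    datetime(2014, 12, 1, 9, 0),
    datetime(2014, 12, 15, 9, 0),
    datetime(2014, 12, 17, 8, 0),
    datetime(2014, 12, 17, 18, 0),
]
window_end = datetime(2014, 12, 17, 23, 59, 59)
counts = ride_count_features(rides, window_end)
print(counts)
```

The same counting routine could be reused for the per-line, per-card-type and per-issuing-location statistics by changing which rides are passed in.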
3. The sliding-window multi-feature method for predicting residents' choice of bus line for travel according to claim 1, characterized in that the feature attribute class of different lines includes line time-series statistical features, trend features of the lines' historical ridership, and bus line encoding features;
Wherein, the line time-series statistical features reflect that the historical passenger flow of different lines influences passengers' travel: for each line, the passenger flow in the most recent 12 hours and in the most recent 1, 3, 7, 14, 28, 56, 84, 112 and 139 days is counted; the total weekend and weekday passenger flow within the given time window is counted; and the daily average and historical maximum passenger flow on weekends and weekdays are counted;
The trend features of the lines' historical ridership reflect that changes in historical passenger flow influence passengers' travel: for each line, features are constructed from the ratio of the passenger flow in the most recent 1, 2 and 4 weeks to that in the most recent 2, 4 and 8 weeks;
The bus line encoding features reflect that the location of each line and the number of stops on each line influence passengers' future choice of travel route; they mainly comprise a feature distinguishing the different lines and the number of stops on each line.
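The passenger-flow trend ratios in claim 3 can be sketched as below. The sketch assumes that one line's flow is stored as a mapping from calendar date to daily passenger count and that the ratios pair the horizons as 1 week over 2 weeks, 2 over 4, and 4 over 8; the function names, the zero-flow handling and the example flow values are illustrative.

```python
from datetime import date, timedelta


def flow_in_last_days(daily_flow, window_end, days):
    """Total passenger flow over the last `days` days, including window_end."""
    return sum(daily_flow.get(window_end - timedelta(days=i), 0) for i in range(days))


def trend_ratios(daily_flow, window_end):
    """Ratio of recent flow to flow over a horizon twice as long, per horizon pair."""
    ratios = {}
    for short_weeks, long_weeks in ((1, 2), (2, 4), (4, 8)):
        short_flow = flow_in_last_days(daily_flow, window_end, 7 * short_weeks)
        long_flow = flow_in_last_days(daily_flow, window_end, 7 * long_weeks)
        key = f"flow_{short_weeks}w_over_{long_weeks}w"
        # A line with no flow in the longer horizon gets 0.0 (an assumption).
        ratios[key] = short_flow / long_flow if long_flow else 0.0
    return ratios


# Hypothetical line: 200 passengers per day in the last week, 100 per day before that.
window_end = date(2014, 12, 17)
daily_flow = {
    window_end - timedelta(days=i): (200 if i < 7 else 100) for i in range(56)
}
ratios = trend_ratios(daily_flow, window_end)
print(ratios)
```

A ratio above 0.5 indicates that flow has grown in the more recent half of the horizon, which is the kind of trend signal the claim describes.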
4. The sliding-window multi-feature method for predicting residents' choice of bus line for travel according to claim 1, characterized in that the interaction feature attribute class of passengers on specific bus lines includes time-series statistical features of the passenger for each line with ride history, temporal features of the passenger for each line with ride history, and ratio/trend features of the passenger's riding behavior on lines ridden in the past;
Wherein, the time-series statistical features of the passenger for each line with ride history characterize how actively the passenger has historically ridden each specific line: within the fixed time window, the passenger's transit transactions on lines with ride history are counted for the most recent 12 hours and for the most recent 1, 3, 7, 14, 28, 56, 84, 112 and 139 days, and the passenger's maximum number of rides, number of weekend rides and number of weekday rides are counted;
The temporal features of the passenger for each line with ride history refer to the interval since the passenger's most recent ride on a line with ride history, the intervals between the passenger's rides within the given time window, the number of active days and active hours on which the passenger has ride records, the minimum number of days before a return ride, and the average number of days before a return ride;
The ratio/trend features of the passenger's riding behavior on lines ridden in the past refer to the share of the passenger's rides on the specific line in the most recent 1 week among those in the most recent 2 weeks, the share of the passenger's active hours on the line among the total active hours across all lines, the share of weekend rides in total rides, and the share of weekday rides in total rides.
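The temporal interaction features in claim 4 can be sketched per (card_id, line_name) pair. The sketch assumes that the "return ride" figures are the gaps in days between consecutive distinct ride dates on the same line, which is one reading of the claim rather than a definition given in the source; the record layout and feature names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def interaction_features(records, window_end):
    """Per (card_id, line_name): active days, recency and return-ride gaps in days."""
    ride_days = defaultdict(set)
    for card_id, line_name, ride_date in records:
        ride_days[(card_id, line_name)].add(ride_date)

    features = {}
    for key, days in ride_days.items():
        ordered = sorted(days)
        # Gaps between consecutive distinct ride dates on this line.
        gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
        features[key] = {
            "active_days": len(ordered),
            "days_since_last_ride": (window_end - ordered[-1]).days,
            "min_return_gap": min(gaps) if gaps else None,
            "avg_return_gap": sum(gaps) / len(gaps) if gaps else None,
        }
    return features


# Hypothetical ride records: (card_id, line_name, ride date).
records = [
    ("c1", "line_A", date(2014, 12, 1)),
    ("c1", "line_A", date(2014, 12, 3)),
    ("c1", "line_A", date(2014, 12, 3)),  # second ride on the same day
    ("c1", "line_A", date(2014, 12, 10)),
    ("c1", "line_B", date(2014, 12, 15)),
]
features = interaction_features(records, date(2014, 12, 17))
print(features[("c1", "line_A")])
```

Pairs with a single ride day have no gap to measure, so the gap features are left as `None` here; how such missing values are filled is not specified in the source.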
5. The sliding-window multi-feature method for predicting residents' choice of bus line for travel according to claim 1, characterized in that the feature attribute class of different transit card types includes time-series statistical features of different passenger types and trend features of different passenger types;
Wherein, the time-series statistical features of different passenger types serve to characterize the travel patterns of the different passenger types: for each passenger type, the number of rides on all lines in the most recent 12 hours and in the most recent 1, 3, 7, 14, 28, 56, 84, 112 and 139 days, as well as the numbers of weekend and weekday rides, are counted;
The trend features of different passenger types count, for each passenger group, the ratio of travel volume in the most recent 1, 2 and 4 weeks to that in the most recent 2, 4 and 8 weeks.
6. The sliding-window multi-feature method for predicting residents' choice of bus line for travel according to claim 1, characterized in that the feature attribute class of the transit card issuing location includes time-series statistical features of passengers from different locations, travel trend features of passengers from different issuing locations, and encoding features of passengers from different locations;
The time-series statistical features of passengers from different locations refer to the total number of rides by passengers from each location in the most recent 12 hours and in the most recent 1, 3, 7, 14, 28, 56, 84, 112 and 139 days;
The travel trend features of passengers from different issuing locations refer to the ratio of the travel volume of passengers from each location in the most recent 1, 2 and 4 weeks to that in the most recent 2, 4 and 8 weeks, and the share of weekend trips in total trips;
The encoding features of passengers from different locations reflect that riding patterns and available lines differ across transit card issuing locations, so the card issuing location is mapped to an encoded feature.
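The mapping of categorical attributes such as the card issuing location (or the card type in claim 2) to features can be sketched with a one-hot encoding. The source only says that the values are "mapped" to features, so one-hot encoding is an assumption, and the location names below are placeholders rather than real issuing locations.

```python
def build_encoder(observed_values):
    """Build a one-hot encoder from the categories observed in the training data."""
    index = {value: i for i, value in enumerate(sorted(set(observed_values)))}

    def encode(value):
        vector = [0] * len(index)
        # Unseen categories map to the all-zero vector (an assumption).
        if value in index:
            vector[index[value]] = 1
        return vector

    return index, encode


# Placeholder issuing locations observed for four cards.
issuing_places = ["place_B", "place_A", "place_B", "place_C"]
index, encode = build_encoder(issuing_places)

print(index)
print(encode("place_B"))
print(encode("place_unknown"))
```

An integer label per category would be an equally plausible reading of "mapped"; tree-based models such as random forests can work with either form.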
CN201710463795.3A 2017-06-19 2017-06-19 A kind of citizens' activities public bus network chooses sliding window multiple features Forecasting Methodology Pending CN107316108A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710463795.3A CN107316108A (en) 2017-06-19 2017-06-19 A kind of citizens' activities public bus network chooses sliding window multiple features Forecasting Methodology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710463795.3A CN107316108A (en) 2017-06-19 2017-06-19 A kind of citizens' activities public bus network chooses sliding window multiple features Forecasting Methodology

Publications (1)

Publication Number Publication Date
CN107316108A true CN107316108A (en) 2017-11-03

Family

ID=60183854

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710463795.3A Pending CN107316108A (en) 2017-06-19 2017-06-19 A kind of citizens' activities public bus network chooses sliding window multiple features Forecasting Methodology

Country Status (1)

Country Link
CN (1) CN107316108A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108615096A (en) * 2018-05-10 2018-10-02 平安科技(深圳)有限公司 Server, the processing method of Financial Time Series and storage medium
CN109741114A (en) * 2019-01-10 2019-05-10 博拉网络股份有限公司 A kind of user under big data financial scenario buys prediction technique
CN110019367A (en) * 2017-12-28 2019-07-16 北京京东尚科信息技术有限公司 A kind of method and apparatus of statistical data feature
CN111126681A (en) * 2019-12-12 2020-05-08 华侨大学 Bus route adjusting method based on historical passenger flow

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019367A (en) * 2017-12-28 2019-07-16 北京京东尚科信息技术有限公司 A kind of method and apparatus of statistical data feature
CN110019367B (en) * 2017-12-28 2022-04-12 北京京东尚科信息技术有限公司 Method and device for counting data characteristics
CN108615096A (en) * 2018-05-10 2018-10-02 平安科技(深圳)有限公司 Server, the processing method of Financial Time Series and storage medium
CN109741114A (en) * 2019-01-10 2019-05-10 博拉网络股份有限公司 A kind of user under big data financial scenario buys prediction technique
CN111126681A (en) * 2019-12-12 2020-05-08 华侨大学 Bus route adjusting method based on historical passenger flow
CN111126681B (en) * 2019-12-12 2022-06-07 华侨大学 Bus route adjusting method based on historical passenger flow

Similar Documents

Publication Publication Date Title
CN107316108A (en) A kind of citizens' activities public bus network chooses sliding window multiple features Forecasting Methodology
Rahman et al. Perceived service quality of paratransit in developing countries: A structural equation approach
Hnatkovska et al. Breaking the caste barrier: Intergenerational mobility in India
Rentziou et al. VMT, energy consumption, and GHG emissions forecasting for passenger transportation
Chay The impact of federal civil rights policy on black economic progress: Evidence from the equal employment opportunity act of 1972
Tonts et al. From state paternalism to neoliberalism in Australian rural policy: perspectives from the Western Australian wheatbelt
Egu et al. Investigating day-to-day variability of transit usage on a multimonth scale with smart card data. A case study in Lyon
CN106372775A (en) Assessment method and system of comprehensive value of power grid client
CN105389713A (en) Mobile data traffic package recommendation algorithm based on user historical data
Morency et al. Car sharing system: what transaction datasets reveal on users' behaviors
CN107368915B (en) Subway passenger travel time selection behavior analysis method
CN106934412A (en) A kind of user behavior sorting technique and system
CN104239968A (en) Short-term load predicting method based on quick fuzzy rough set
CN106919953A (en) A kind of abnormal trip Stock discrimination method based on track traffic data analysis
CN106777169A (en) A kind of user's trip hobby analysis method based on car networking data
CN110070255A (en) The method for introducing commuter's Passenger Traveling Choice modeling and analysis after sharing bicycle
Ritchie A statistical approach to statewide traffic counting
Guo et al. Exploring potential travel demand of customized bus using smartcard data
Ravšelj et al. R&D subsidies as drivers of corporate performance in Slovenia: The regional perspective
CN108681741A (en) Based on the subway of IC card and resident's survey data commuting crowd's information fusion method
CN112949926B (en) Income maximization ticket amount distribution method based on passenger demand re-identification
Cho et al. Presidential voting and the local variability of economic hardship
Blomquist Value of life, economics of
CN110020666B (en) Public transport advertisement putting method and system based on passenger behavior mode
CN112508425B (en) Urban travel user portrait system construction method for elastic public transport system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171103