CN111581318B - Shared bicycle riding purpose inference method and device and storage medium - Google Patents
- Publication number: CN111581318B
- Application number: CN202010382582.XA
- Authority
- CN
- China
- Prior art keywords
- riding
- poi
- shared bicycle
- act
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a shared bicycle riding purpose inference method, a device and a storage medium. Based on the travel mode of shared bicycles and the travel characteristics of shared bicycle users, a mapping relation model between target (trip-purpose) activity types and POI categories suited to shared bicycles is established. Time weights for activity purpose inference are set by combining the daily travel habits of urban residents with the operating hours of different places. For the candidate area of a given shared bicycle trip, the frequency density ratio of the number of POIs mapped to each target activity type, the grade of each POI type, and the time window weight are used to reflect the spatiotemporal characteristics of the candidate area's overall environment, and the existing gravity model is improved according to the principle of proximity. Purpose inference is then performed on shared bicycle riding data collected over a period of time using the improved gravity model and the Bayesian formula. The technical scheme of the invention fills the gap left by existing purpose inference models with respect to shared bicycle activity purposes and improves the accuracy and authenticity of the inferred types.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a shared bicycle riding purpose inference method, a device and a storage medium.
Background
Shared bicycles, a product of rapid social, economic and technological development, have become popular in many Chinese cities in recent years. Modeling the large volume of data generated when urban residents ride shared bicycles, in combination with multi-source big data such as POI (point of interest) data, makes it possible to infer residents' travel purposes and activities scientifically. Such inference is an important reference for analyzing the travel characteristics and demand patterns of urban shared bicycles and for exploring the travel behavior rules and travel structure of the residents who ride them.
Existing models for activity purpose inference fall into two main categories: purpose inference models based on machine learning or deep learning, and purpose inference models based on Bayes' rule. Models of the first kind mostly use a Markov model or a long short-term memory (LSTM) recurrent neural network, train it on a large number of spatial position sequences that are continuous over a long period, and infer the travel purpose from the trained model. Models of the second kind set prior and conditional probabilities from the characteristic patterns of spatial position data or from related factors that influence purpose inference, and then infer the purpose with the Bayesian formula. Comparing the two, the former achieves high prediction accuracy but places heavy demands and restrictions on the basic data; generally only long-duration GPS trajectory data, such as mobile phone or taxi trajectories, can serve as its input. The latter has lower data requirements, and researchers have obtained good results by applying it to data such as taxi pick-up and drop-off points and stay points in mobile phone GPS signals.
Moreover, current purpose inference work focuses on evaluating the accuracy of model predictions or on predicting the travel trajectory of a particular group of people over time. It neglects macroscopic analysis of what the predictions reveal about travel characteristics, travel mode choice, the travel behavior rules of urban residents, and even the urban travel structure. How to improve existing activity purpose prediction models in light of the characteristics of shared bicycle travel, a mode that has emerged in recent years, and how to infer the travel activity purposes of shared bicycle users therefore remain a gap in current research on travel purpose inference. Filling this gap is also an important way to reveal the travel rules of shared bicycle users and the influence of shared bicycles on the structure of urban traffic.
Disclosure of Invention
The embodiment of the invention provides a shared bicycle riding purpose inference method, a device and a storage medium, which fill the gap left by existing purpose inference models with respect to shared bicycle activity purposes and improve the accuracy and authenticity of the inferred types.
The invention provides a shared bicycle riding purpose inference method, which comprises the following steps:
obtaining shared bicycle riding data of a first user, extracting a first riding end point, and creating a buffer area corresponding to the first riding end point to serve as a first candidate area for the riding purpose of the first user;
associating all POIs in the first candidate region with the first user;
according to a pre-constructed mapping relation model, with set time weights, between target activity types and POI categories, counting the number of POIs mapped to each target activity type in the first candidate region and the time weight of each target activity type, and constructing a first gravity model from the counting result;
optimizing the first gravity model successively according to the frequency density ratio of each target activity type and the attribute feature weight of each POI category to obtain a second gravity model;
and inferring the final riding purpose of the first user in the first candidate area according to the second gravity model and the Bayesian formula to obtain a purpose inference result for the first user.
Further, creating the buffer area corresponding to the first riding end point as the first candidate area for the riding purpose of the first user specifically includes:
constructing a circular buffer area with the first riding end point as its center and a preset length as its radius, and taking the circular buffer area as the first candidate area.
Further, associating all POIs in the first candidate area with the first user specifically includes:
screening out all POIs within the first candidate area, and associating the ID of the first user with each screened POI, its POI type and its geographic coordinates.
Further, the pre-constructed mapping relation model, with set time weights, between target activity types and POI categories is specifically obtained by:
screening a plurality of target activity types from the existing target activity types according to the travel mode of shared bicycles and the travel characteristics of shared bicycle user groups, and establishing the mapping relation model from the screened target activity types;
and setting different time weights, for working days and for rest days, for each target activity type in the mapping relation model by combining the daily travel habits of urban residents with the operating hours of different places.
Further, according to the statistical result, a first gravity model is constructed, specifically: constructing a first gravity model according to the following formula;
wherein SelectPOIs is the set of all POIs in the candidate area of the first riding end point O; P denotes a specific POI in the set; sum(P.category = act) denotes the total number of POIs in the candidate area whose POI type maps to the same target activity type act as the POI type of P; W(P.category = act | t) denotes the time weight of target activity type act in the corresponding time period t; and D(O_i, P)^2 denotes the squared Euclidean distance between riding end point O_i and P.
Further, successively optimizing the first gravity model according to the frequency density ratio of each target activity type and the attribute feature weight of each POI category to obtain the second gravity model specifically includes:
calculating the frequency density proportion of each target activity type according to the following formula;
wherein n_act denotes the total number of POIs mapped to target activity type act among all POIs; N_act denotes the number of POIs mapped to target activity type act within a given shared bicycle candidate area; and ρ_act is the frequency density, i.e. the ratio of N_act to n_act;
then, the first gravity model is optimized for the first time according to the following formula;
the gravity model after the first optimization is as follows:
wherein C_{P.category=act} is the frequency density ratio of the target activity type corresponding to the POI type of P, and C_act is the share of the frequency density of target activity type act among all target activity types;
according to the attribute feature weight of each POI, carrying out second optimization on the gravity model after the first optimization according to the following formula to obtain a second gravity model;
wherein W(P_category) is the preset attribute feature weight corresponding to P.
Further, the inferring a final riding purpose of the first user in the first candidate region according to the second gravity model and the bayesian formula to obtain a purpose inference result of the first user specifically includes:
according to the second gravity model, performing the calculation for each candidate purpose in the first candidate region one by one, and then, according to the Bayesian formula, calculating the normalized conditional probability of each candidate purpose P_i by the following formula:
wherein Pr(P_i | (O, t)) denotes the conditional probability that POI P_i is the final activity purpose of the first user, given that the shared bicycle was parked at the first riding end point O in time period t; G(O, P_i, t) is the result calculated for candidate purpose P_i according to the second gravity model; and Σ_j G(O, P_j, t) denotes the sum of the results calculated according to the second gravity model for all candidate purposes in the first candidate region;
and taking the target activity type with the highest probability as the target activity type corresponding to the final riding purpose in the first candidate area, thereby obtaining the purpose inference result for the first user.
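The normalization and selection steps above can be sketched in Python. This is a minimal illustration, not the patent's implementation: it assumes the second gravity model has already produced a value G(O, P_i, t) for every candidate POI, so that Pr(P_i | (O, t)) = G(O, P_i, t) / Σ_j G(O, P_j, t). The function names, POI identifiers and scores are hypothetical. Returning the activity type of the single most probable POI is one reading of the final step; summing probabilities per activity type before taking the maximum would be another.

```python
def posterior(gravity_by_poi):
    """Normalize gravity values G(O, P_i, t) into Pr(P_i | (O, t))."""
    total = sum(gravity_by_poi.values())
    if total == 0:
        # Assumption: with no attraction at all, every candidate gets 0.
        return {poi: 0.0 for poi in gravity_by_poi}
    return {poi: g / total for poi, g in gravity_by_poi.items()}


def infer_purpose(gravity_by_poi, activity_of_poi):
    """Return the activity type of the most probable POI and its probability."""
    probs = posterior(gravity_by_poi)
    best_poi = max(probs, key=probs.get)
    return activity_of_poi[best_poi], probs[best_poi]


# Hypothetical gravity values for three candidate POIs in one buffer.
gravity = {"P1": 6.0, "P2": 3.0, "P3": 1.0}
activity = {"P1": "dining", "P2": "traffic transfer", "P3": "shopping"}

purpose, prob = infer_purpose(gravity, activity)
```

Here `purpose` is the activity type mapped to the POI with the largest share of the total gravity.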
Correspondingly, the invention provides a statistical analysis method for shared bicycle riding purposes, which comprises the following steps:
obtaining shared bicycle riding data for each time period in an area to be analyzed, and filtering out records that involve no actual riding or abnormal riding, to obtain the riding data to be analyzed;
inferring a riding purpose for each record in the riding data to be analyzed according to the shared bicycle riding purpose inference method, to obtain inferred data containing a plurality of riding purpose inference results;
and statistically analyzing the inferred data to obtain an analysis result.
Accordingly, the present invention provides a shared bicycle riding purpose inference device, comprising:
the obtaining module is used for obtaining shared bicycle riding data of a first user, extracting a first riding end point, and creating a buffer area corresponding to the first riding end point to serve as a first candidate area of a riding purpose of the first user;
an association module for associating all POIs in the first candidate region with the first user;
the model construction module is used for counting the number of POI (point of interest) mapped by various types of target activities in the first candidate region and the time weight of various types of target activities according to a pre-constructed mapping relation model between the target activities with set time weight and the POI types, and constructing a first gravity model according to a counting result;
the model optimization module is used for sequentially optimizing the first gravity model according to the frequency density ratio of various target activity types and the attribute characteristic weight of various POIs to obtain a second gravity model;
and the inference module is used for inferring the final riding purpose of the first user in the first candidate area according to the second gravity model and a Bayesian formula so as to obtain a purpose inference result of the first user.
Accordingly, the present invention provides a storage medium having stored therein processor-executable instructions, which when executed by a processor, are configured to implement a shared bicycle riding purpose inference method as described herein.
In summary, the shared bicycle riding purpose inference method, device and storage medium provided by the invention establish a mapping relation model between target activity types and POI categories that suits the travel mode of shared bicycles and the travel characteristics of shared bicycle users; set time weights for activity purpose inference by combining the daily travel habits of urban residents with the operating hours of different places; use the frequency density ratio of the number of POIs mapped to each target activity type in the candidate area of a given shared bicycle trip, the grade of each POI type, and the time window weight to reflect the spatiotemporal characteristics of the candidate area's overall environment, and improve the existing gravity model according to the principle of proximity; and perform purpose inference on shared bicycle riding data collected over a period of time using the improved gravity model and the Bayesian formula, followed by spatiotemporal distribution analysis. The technical scheme of the invention fills the gap left by existing purpose inference models with respect to shared bicycle activity purposes and improves the accuracy and authenticity of the inferred types.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating one embodiment of a shared bicycle riding purpose inference method provided by the present invention;
FIG. 2 is a schematic diagram of the distance from shared bicycle riding end points to the nearest candidate destination, provided by the present invention;
FIG. 3 is a schematic diagram of the attribute table obtained by spatially joining bicycle buffer areas with POI data, provided by the present invention;
FIG. 4 is a schematic flow chart diagram illustrating one embodiment of a method for statistical analysis of shared bicycle riding objectives provided by the present invention;
FIG. 5 is a schematic diagram of the proportion of rides whose purpose is going to work in each time period on the working days of a week;
FIG. 6 is a schematic diagram of the proportion of rides whose purpose is going home in each time period on the working days of a week;
FIG. 7 is a schematic diagram of the proportion of rides by riding purpose in each time period on the working days of a week, provided by the present invention;
fig. 8 is a schematic structural diagram of an embodiment of the shared bicycle riding purpose inference device provided by the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a flowchart of an embodiment of a shared bicycle riding purpose inference method provided by the present invention is shown. As shown in fig. 1, the inference method includes steps 101 to 105, and the steps are as follows:
Step 101: obtain shared bicycle riding data of a first user, extract a first riding end point, and create a buffer area corresponding to the first riding end point to serve as a first candidate area for the riding purpose of the first user.
In this embodiment, inference of the riding purpose of the first user is taken as an example. The riding data of one shared bicycle trip of the first user is obtained first, and the first riding end point is extracted from it. A circular buffer area is then constructed with the first riding end point as its center and a preset length as its radius, and this circular buffer area serves as the first candidate area.
In this embodiment, the preferred value of the preset length is 100 meters. This value was obtained as follows: multiple shared bicycle riding records were collected, the straight-line distance from the riding end point of each record to the nearest candidate destination (i.e., POI) was calculated, and the proportion of records whose distance falls below each threshold between 0 and 200 meters was plotted, as shown in FIG. 2. FIG. 2 shows that almost all shared bicycle records have a candidate destination within 100 meters, so this patent selects 100 meters as the optimal distance. The preset length can also be adjusted to suit actual conditions.
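The radius selection procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: coordinates are planar and expressed in meters, and the function names and sample points are hypothetical rather than taken from the patent.

```python
import math


def nearest_poi_distance(end_point, pois):
    """Straight-line distance from a riding end point to its nearest POI."""
    ex, ey = end_point
    return min(math.hypot(px - ex, py - ey) for px, py in pois)


def share_within(distances, threshold):
    """Fraction of rides whose nearest POI lies within `threshold` meters."""
    return sum(d <= threshold for d in distances) / len(distances)


# Hypothetical sample data: (x, y) positions in meters.
end_points = [(0.0, 0.0), (500.0, 500.0), (1000.0, 0.0)]
pois = [(30.0, 40.0), (500.0, 560.0), (1000.0, 150.0)]

distances = [nearest_poi_distance(e, pois) for e in end_points]

# Cumulative share of rides for candidate radii from 0 to 200 m, in 50 m steps.
curve = {r: share_within(distances, r) for r in range(0, 201, 50)}
```

Plotting `curve` for real data would give a chart of the kind FIG. 2 is described as showing; the radius is then chosen where the share approaches 1.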
As an example of this embodiment, the buffer area may be, but is not limited to, a circular area, and may also be other shapes, such as a rectangle, a square, an oval, and the like.
In this embodiment, the candidate area for a shared bicycle user's riding purpose may be created using GIS buffer analysis. The shared bicycle riding end point data is imported into the Analysis Tools -> Proximity -> Buffer tool in ArcGIS, and the buffer distance and unit are set to obtain the candidate areas for the riding purposes.
Step 102: all POIs in the first candidate area are associated with the first user.
In this embodiment, every POI that appears in the first candidate region is treated as a potential destination of the first user, so the first user needs to be associated with all POIs in the region. Step 102 specifically comprises: screening out all POIs within the first candidate area, and associating the ID of the first user with each screened POI, its POI type and its geographic coordinates.
In this embodiment, GIS spatial join analysis may be used to connect each candidate area of a shared bicycle user's riding purpose with the POI data inside it. In the Analysis Tools -> Overlay -> Spatial Join tool in ArcGIS, the candidate areas are set as the target features, the POI data is set as the join features, and 'JOIN_ONE_TO_MANY' is selected, which associates each candidate area with all POIs inside it. As shown in FIG. 3, each candidate area is uniquely identified by the BikeID field (i.e., the bicycle number), and the coordinates of the user's riding end point are given by the Bike_X and Bike_Y fields. Each associated POI is uniquely identified by the POIID field. Its type is given by the POItype field, which is used in the subsequent inference to determine the type of travel purpose, and its position is given by the POI_X and POI_Y fields, which are used in the subsequent inference to determine the distance between the riding end point and the POI.
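Outside ArcGIS, the same buffer-and-join logic can be approximated in plain Python. The sketch below is an illustration only: it assumes planar coordinates in meters and a circular buffer, reuses the field names mentioned above (BikeID, Bike_X, Bike_Y, POIID, POItype, POI_X, POI_Y), and adds a hypothetical `distance` column. The function name and sample records are invented for the example.

```python
import math


def spatial_join_one_to_many(rides, pois, radius_m=100.0):
    """Pair each riding end point with every POI inside its circular buffer.

    Mirrors a one-to-many join: one output row per (ride, POI) pair.
    Coordinates are assumed to be planar and expressed in meters.
    """
    rows = []
    for ride in rides:
        for poi in pois:
            d = math.hypot(poi["POI_X"] - ride["Bike_X"],
                           poi["POI_Y"] - ride["Bike_Y"])
            if d <= radius_m:
                # Merge ride and POI attributes into one joined record.
                rows.append({**ride, **poi, "distance": d})
    return rows


rides = [{"BikeID": "B1", "Bike_X": 0.0, "Bike_Y": 0.0}]
pois = [
    {"POIID": "P1", "POItype": "restaurant", "POI_X": 30.0, "POI_Y": 40.0},
    {"POIID": "P2", "POItype": "subway station", "POI_X": 60.0, "POI_Y": 0.0},
    {"POIID": "P3", "POItype": "school", "POI_X": 300.0, "POI_Y": 0.0},
]

joined = spatial_join_one_to_many(rides, pois)
```

The nested loop is quadratic in the number of records; for city-scale data a spatial index would be the usual replacement, which is beyond this sketch.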
Step 103: according to a pre-constructed mapping relation model, with set time weights, between target activity types and POI categories, count the number of POIs mapped to each target activity type in the first candidate region and the time weight of each target activity type, and construct a first gravity model from the counting result.
In this embodiment, the pre-constructed mapping relation model is obtained as follows: a plurality of target activity types are screened from the existing target activity types according to the travel mode of shared bicycles and the travel characteristics of shared bicycle user groups, and a mapping relation table is established from the screened types; different time weights are then set, for working days and for rest days, for each target activity type in the table by combining the daily travel habits of urban residents with the operating hours of different places.
In this embodiment, the full set of target activity types can be obtained from the existing literature on travel purpose inference, but individual POI categories do not suit the travel mode of shared bicycles or the travel characteristics of shared bicycle users. For example, most users ride shared bicycles to commute, to transfer at subway or bus stations, or to make short trips, and they hardly ever ride them to places associated with long-distance travel, such as airports and gas stations. After screening, the target activity types that suit shared bicycles remain, and the mapping relation model between these types and the POI categories is constructed, as shown in Table 1.
TABLE 1 Mapping between bicycle riding purpose activity types and POI types
After the mapping relation model is constructed, different time weights are set for working days and rest days by combining the daily travel habits of urban residents with the operating hours of different places. For each target activity type, the likelihood of the activity in the morning (7-9 o'clock), at noon (12-14 o'clock), in the evening (17-20 o'clock) and at night (21-23 o'clock) is graded into five levels from high probability to low probability, which are assigned time weights of 1.0, 0.75, 0.5, 0.25 and 0.1 respectively, as shown in Table 2.
Target activity type | Workday morning | Workday noon | Workday evening | Workday night |
Going home | 0.1 | 0.25 | 0.75 | 0.5 |
Going to work | 1.0 | 0.25 | 0.25 | 0.1 |
Traffic transfer | 1.0 | 0.75 | 1.0 | 0.75 |
Dining | 0.25 | 1.0 | 0.75 | 0.25 |
Shopping | 0.1 | 0.5 | 0.5 | 0.5 |
Leisure and entertainment | 0.1 | 0.5 | 0.5 | 0.75 |
Going to school | 1.0 | 0.25 | 0.25 | 0.1 |
Life services | 0.25 | 0.5 | 0.5 | 0.1 |
Seeking medical care | 0.25 | 0.25 | 0.25 | 0.25 |
TABLE 2 Time weights of bicycle riding purpose activity types
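The workday weights in Table 2 lend themselves to a simple lookup. The sketch below is a hedged illustration: the weights are copied from the table, but the dictionary keys are paraphrased labels, the third and fourth columns are read as evening (17-20) and night (21-23), and the periods are treated as half-open hour ranges. Returning `None` outside the four periods is an assumption, since the text does not say how other hours are handled.

```python
# Workday time weights copied from Table 2 (keys are paraphrased labels).
TIME_WEIGHTS = {
    "go home":          {"morning": 0.1,  "noon": 0.25, "evening": 0.75, "night": 0.5},
    "work":             {"morning": 1.0,  "noon": 0.25, "evening": 0.25, "night": 0.1},
    "traffic transfer": {"morning": 1.0,  "noon": 0.75, "evening": 1.0,  "night": 0.75},
    "dining":           {"morning": 0.25, "noon": 1.0,  "evening": 0.75, "night": 0.25},
    "shopping":         {"morning": 0.1,  "noon": 0.5,  "evening": 0.5,  "night": 0.5},
    "leisure":          {"morning": 0.1,  "noon": 0.5,  "evening": 0.5,  "night": 0.75},
    "school":           {"morning": 1.0,  "noon": 0.25, "evening": 0.25, "night": 0.1},
    "life service":     {"morning": 0.25, "noon": 0.5,  "evening": 0.5,  "night": 0.1},
    "medical":          {"morning": 0.25, "noon": 0.25, "evening": 0.25, "night": 0.25},
}

# Time windows from the text, treated here as [start, end) in whole hours.
PERIODS = {"morning": (7, 9), "noon": (12, 14), "evening": (17, 20), "night": (21, 23)}


def period_of(hour):
    """Return the name of the period containing `hour`, or None."""
    for name, (start, end) in PERIODS.items():
        if start <= hour < end:
            return name
    return None


def time_weight(activity, hour):
    """Look up W(act | t) for a workday hour; None outside the four periods."""
    period = period_of(hour)
    if period is None:
        return None
    return TIME_WEIGHTS[activity][period]
```

For example, `time_weight("work", 8)` reads the workday-morning weight for going to work.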
In this embodiment, the number of POIs mapped to each target activity type in the first candidate region and the time weight of each target activity type are counted according to the mapping relation model, and the first gravity model is then constructed from the counting result according to the following formula;
wherein SelectPOIs is the set of all POIs in the candidate area of the first riding end point O; P denotes a specific POI in the set; sum(P.category = act) denotes the total number of POIs in the candidate area whose POI type maps to the same target activity type act as the POI type of P; W(P.category = act | t) denotes the time weight of target activity type act in the corresponding time period t; and D(O_i, P)^2 denotes the squared Euclidean distance between riding end point O_i and P.
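The formula referred to here is not reproduced in this text. A form consistent with the terms just defined, assuming the usual gravity-model layout (attraction terms in the numerator, squared distance in the denominator) and borrowing the name G from the later Bayesian step, would be the following. It is offered as a reconstruction, not as the patent's exact expression.

```latex
\[
G(O_i, P, t) \;=\;
\frac{\operatorname{sum}(P.\mathrm{category} = act)\,\cdot\, W(P.\mathrm{category} = act \mid t)}
     {D(O_i, P)^{2}},
\qquad P \in \mathrm{SelectPOIs}
\qquad (1)
\]
```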
Step 104: and sequentially optimizing the first gravity model according to the frequency density ratio of the activity types of various purposes and the attribute characteristic weight of various POIs to obtain a second gravity model.
In this embodiment, the numbers of POIs mapped to different target activity types differ greatly, so applying the first gravity model directly would bias the result considerably. In addition, individual POIs differ in physical characteristics such as area and size, and ignoring these differences would also introduce error into the inference. The present invention therefore optimizes equation (1) of the first gravity model.
Step 104 specifically includes:
calculating the frequency density proportion of each target activity type according to the following formula;
wherein n_act denotes the total number of POIs mapped to target activity type act among all POIs; N_act denotes the number of POIs mapped to target activity type act within a given shared bicycle candidate area; and ρ_act is the frequency density, i.e. the ratio of N_act to n_act;
then, the first gravity model is optimized for the first time according to the following formula;
the gravity model after the first optimization is as follows:
wherein C_{P.category=act} is the frequency density ratio of the target activity type corresponding to the POI type of P, and C_act is the share of the frequency density of target activity type act among all target activity types;
according to the attribute feature weight of each POI, carrying out second optimization on the gravity model after the first optimization according to the following formula to obtain a second gravity model;
where $W(P_{category})$ is the preset attribute feature weight corresponding to P.
The first optimization of this embodiment replaces the original direct POI counts with frequency densities and their ratios across activity types. The frequency density is obtained by first counting the total number of POIs mapped by each activity type overall, then counting the number of POIs mapped by each activity type within the destination candidate area of a given shared bicycle, and dividing the local count by the overall count. Finally, the frequency densities of all activity types in the candidate area are summed, and the frequency density ratio of each activity type is calculated from that sum. Compared with direct counting, this reflects the spatial characteristics of the environment inside the candidate area more objectively and accurately.
In formula (3), the present invention uses $C_{P.category=act}$ in place of sum(P.category = act) in equation (1), so that the model is optimized with relative ratios instead of absolute counts.
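The first optimization described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name, the dictionary inputs, and the example counts are all hypothetical, and it assumes the frequency density is the local POI count divided by the overall POI count, with the ratio obtained by dividing each density by the sum of densities, as the description states.

```python
from typing import Dict


def frequency_density_ratio(
    global_counts: Dict[str, int],
    local_counts: Dict[str, int],
) -> Dict[str, float]:
    """Return the frequency density ratio for every activity type in global_counts."""
    # Frequency density: POI count in the candidate area divided by the overall count.
    density = {
        act: local_counts.get(act, 0) / total
        for act, total in global_counts.items()
        if total > 0
    }
    total_density = sum(density.values())
    if total_density == 0:
        # No mapped POI in the candidate area: every ratio is zero.
        return {act: 0.0 for act in density}
    # Ratio: each type's density as a share of the summed densities.
    return {act: rho / total_density for act, rho in density.items()}


# Hypothetical counts, for illustration only.
global_counts = {"work": 1000, "dining": 4000, "home": 500}
local_counts = {"work": 10, "dining": 40, "home": 20}
ratios = frequency_density_ratio(global_counts, local_counts)
```

In this example "dining" has four times as many local POIs as "work", yet both receive the same ratio because dining POIs are also four times as common overall, which is the effect the relative measure is meant to achieve.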
In this embodiment, the second optimization grades the POI categories in order to take the individual attribute characteristics of POIs into account. Real-world entities inevitably differ in individual attributes such as floor area, scale, and range of influence, but with massive POI data it is difficult to consider the individual attributes of every POI comprehensively and specifically. Following the principle of grouping similar categories, and combining the scale and size of different kinds of POIs, national or industry standards, and related common knowledge, this patent divides POIs by category into 4 grades in total, as shown in Table 3, and assigns weights of 1.0, 0.75, 0.5, and 0.25 from the highest grade to the lowest. The differences in grade between POI categories are used to approximate the differences between POI entities and are brought into the model, so that it reflects the spatial characteristics of the environment surrounding the candidate area more realistically.
TABLE 3 POI Category rating Table
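As a hedged illustration of how the grade weights could enter the second gravity model, the sketch below multiplies the frequency density ratio, the time weight, and the attribute feature weight, and divides by the squared Euclidean distance. The formula images are not reproduced in this text, so this combined form is an assumption based on the symbol definitions. The category-to-grade assignment is also hypothetical, because the body of Table 3 is not available here; only the four weights come from the description. Coordinates are assumed to be in a projected system measured in meters.

```python
from typing import Tuple

# Grade weights stated in the description, from the highest grade to the lowest.
GRADE_WEIGHTS = {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.25}

# Hypothetical category-to-grade assignment; Table 3 is not reproduced here.
CATEGORY_GRADE = {"office_tower": 1, "shopping_mall": 2, "restaurant": 3, "kiosk": 4}

# Assumed floor on the squared distance (1 m^2) to avoid division by zero.
MIN_DIST_SQ = 1.0


def attribute_weight(category: str) -> float:
    """Look up the preset attribute feature weight of a POI category."""
    return GRADE_WEIGHTS[CATEGORY_GRADE[category]]


def gravity_score(
    ratio: float,
    time_weight: float,
    category: str,
    end_point: Tuple[float, float],
    poi_xy: Tuple[float, float],
) -> float:
    """Assumed form: ratio * time weight * attribute weight / squared distance."""
    dx = end_point[0] - poi_xy[0]
    dy = end_point[1] - poi_xy[1]
    dist_sq = max(dx * dx + dy * dy, MIN_DIST_SQ)
    return ratio * time_weight * attribute_weight(category) / dist_sq


# A grade-2 POI located 50 m from the riding end point (illustrative values).
score = gravity_score(0.5, 0.8, "shopping_mall", (0.0, 0.0), (30.0, 40.0))
```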
Step 105: Infer the final riding purpose of the first user in the first candidate region according to the second gravity model and the Bayes formula, to obtain the purpose inference result of the first user.
In this embodiment, step 105 specifically includes:
according to the second gravity model, a probability calculation is performed for each candidate destination in the first candidate area one by one, and then, according to the Bayes formula, the normalized conditional probability of each candidate destination $P_i$ is calculated; the specific formula is as follows:
$$Pr(P_i \mid (O, t)) = \frac{G(O, P_i, t)}{\sum_j G(O, P_j, t)}$$
where $Pr(P_i \mid (O, t))$ denotes the conditional probability that POI $P_i$ is the final activity destination of the first user, given that the shared bicycle was parked at the first riding end point O in time period t; $G(O, P_i, t)$ is the result calculated for candidate destination $P_i$ according to the second gravity model; and $\sum_j G(O, P_j, t)$ is the sum of the results calculated for all candidate destinations in the first candidate region according to the second gravity model;
and the target activity type with the highest probability is taken as the target activity type corresponding to the final riding purpose in the first candidate area, thereby obtaining the purpose inference result of the first user.
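The normalization and selection in step 105 can be illustrated as follows. This is a sketch under assumptions: the tuple layout, the function name, and the example scores are hypothetical, and it takes the activity type of the single candidate with the highest normalized probability, which is one reading of the text.

```python
from typing import List, Tuple

Candidate = Tuple[str, str, float]  # (poi_id, activity_type, gravity score G)


def infer_purpose(candidates: List[Candidate]) -> Tuple[str, List[Candidate]]:
    """Normalize gravity scores and return the most probable activity type."""
    total = sum(g for _, _, g in candidates)
    if total <= 0:
        raise ValueError("no candidate has a positive gravity score")
    # Normalized conditional probability of each candidate: G_i / sum_j G_j.
    probs = [(poi, act, g / total) for poi, act, g in candidates]
    best = max(probs, key=lambda item: item[2])
    return best[1], probs


# Hypothetical gravity scores for three candidate POIs.
purpose, probs = infer_purpose(
    [("p1", "work", 3.0), ("p2", "dining", 1.5), ("p3", "home", 0.5)]
)
```

If several POIs of the same activity type should reinforce each other, the probabilities could instead be summed per activity type before taking the maximum; the text does not say which variant is intended.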
The riding purpose inference method described above infers the riding purpose of each riding user and thereby provides data support for subsequent statistical analysis. Correspondingly, FIG. 4 is a flowchart of an embodiment of the statistical analysis method for shared bicycle riding purposes provided by the present invention. The statistical analysis method comprises steps 401 to 403, as follows:
Step 401: Acquire shared bicycle riding data for each time period in the area to be analyzed, and screen out data of bicycles that were not ridden or whose riding was abnormal, to obtain the riding data to be analyzed.
In this embodiment, after a large amount of shared bicycle riding data is acquired, the data is preprocessed and filtered to remove records of bicycles that were not ridden or whose riding distance is abnormal. A script program adapted to the data format of the shared bicycle data is written for this purpose, to ensure the accuracy of the filtering result.
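A filtering script of the kind described might look like the sketch below. The record fields and both distance thresholds are hypothetical; the text does not state how "not ridden" or "abnormal riding distance" is defined, so the bounds here are placeholders to be tuned against real data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ride:
    ride_id: str
    distance_m: float  # riding distance in meters
    duration_s: float  # riding duration in seconds


# Assumed thresholds, not taken from the patent.
MIN_DISTANCE_M = 100.0     # shorter trips are treated as "not ridden"
MAX_DISTANCE_M = 10_000.0  # longer trips are treated as abnormal


def filter_rides(rides: List[Ride]) -> List[Ride]:
    """Keep rides with a positive duration and a plausible distance."""
    return [
        r
        for r in rides
        if r.duration_s > 0 and MIN_DISTANCE_M <= r.distance_m <= MAX_DISTANCE_M
    ]


rides = [
    Ride("a", 0.0, 30.0),          # unlocked but never moved
    Ride("b", 1800.0, 600.0),      # ordinary trip
    Ride("c", 55_000.0, 7200.0),   # implausibly long distance
]
kept = filter_rides(rides)
```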
Step 402: Perform riding purpose inference on each piece of riding data in the riding data to be analyzed, according to the shared bicycle riding purpose inference method described above, to obtain inferred data to be analyzed that contains a plurality of riding purpose inference results.
Step 403: Perform statistical analysis on the inferred data to be analyzed to obtain an analysis result.
In this embodiment, the riding purpose inferred for each piece of riding data in step 402 gives the final travel purposes of a large number of riding users in different time periods within a region. Summarizing and counting these inference results allows a brief analysis and discussion of the spatio-temporal riding patterns and travel characteristics of shared bicycle users. Taking Guangzhou as the first case area, the model infers the purposes for shared bicycle riding data from consecutive working days of one week, and the share of each activity purpose type is then counted for each time period. FIGS. 5 to 7 show the resulting shares, for each time period on those working days, of the "working", "returning home", and "transfer" activity purposes.
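The per-period share statistics described above can be computed with a simple grouped count. The sketch below is illustrative only: the input layout, the time period labels, and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def purpose_shares(
    inferred: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Map each time period to the share of each inferred purpose within it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for period, purpose in inferred:
        counts[period][purpose] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {p: n / total for p, n in counter.items()}
    return shares


# Hypothetical (time period, inferred purpose) pairs.
records = [
    ("07-09", "working"),
    ("07-09", "working"),
    ("07-09", "transfer"),
    ("17-19", "returning home"),
]
shares = purpose_shares(records)
```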
Besides the analysis results in the above examples, further data analysis can be performed as the actual situation requires, for example a heat density map of the distribution of riding end points for a given target activity type in a given time period; further examples are not given here.
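One simple way to prepare data for such a density map is to bin the riding end points into a regular grid and count the points per cell. The sketch below assumes projected coordinates in meters and a hypothetical cell size; it produces only the counts, and any plotting library could render them.

```python
from collections import Counter
from typing import Iterable, Tuple


def grid_density(
    points: Iterable[Tuple[float, float]],
    cell_size: float,
) -> Counter:
    """Count riding end points per square grid cell of side cell_size."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    return Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )


# Hypothetical end points of rides inferred as one activity type in one time period.
end_points = [(10.0, 10.0), (20.0, 30.0), (150.0, 40.0)]
density = grid_density(end_points, cell_size=100.0)
```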
Correspondingly, FIG. 8 is a schematic structural diagram of an embodiment of the shared bicycle riding purpose inference device provided by the present invention. As shown in FIG. 8, the inference device includes:
The obtaining module 801 is configured to obtain shared bicycle riding data of the first user, extract a first riding end point, and create a buffer area corresponding to the first riding end point, the buffer area serving as a first candidate area for the riding purpose of the first user.
The association module 802 is configured to associate all POIs in the first candidate region with the first user.
The model building module 803 is configured to count, according to a pre-built mapping relationship model between the target activity types and the POI categories for which time weights have been set, the number of POIs mapped by each target activity type in the first candidate region and the time weight of each target activity type, and to build a first gravity model according to the counting result.
The model optimization module 804 is configured to optimize the first gravity model in sequence, according to the frequency density ratio of each target activity type and the attribute feature weight of each POI category, to obtain a second gravity model.
The inference module 805 is configured to infer the final riding purpose of the first user in the first candidate region according to the second gravity model and the Bayes formula, to obtain the purpose inference result of the first user.
For the more detailed working principles and process steps of the device, reference may be made to, but is not limited to, the inference method described above.
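The five modules form a linear pipeline, which can be sketched as a small container that calls each stage in order. The class, field names, and stub stages below are hypothetical and show only the data flow between modules 801 to 805, not their internal logic.

```python
from dataclasses import dataclass
from typing import Any, Callable

Stage = Callable[[Any], Any]


@dataclass
class InferenceDevice:
    obtain: Stage          # module 801: ride data -> first candidate area
    associate: Stage       # module 802: candidate area -> POIs tied to the user
    build_model: Stage     # module 803: POIs -> first gravity model
    optimize_model: Stage  # module 804: first model -> second gravity model
    infer: Stage           # module 805: second model -> purpose inference result

    def run(self, ride_data: Any) -> Any:
        """Pass the ride data through the five modules in order."""
        area = self.obtain(ride_data)
        pois = self.associate(area)
        first_model = self.build_model(pois)
        second_model = self.optimize_model(first_model)
        return self.infer(second_model)


def stage(name: str) -> Stage:
    """Stub stage that records its name, used only to show call order."""
    return lambda trace: trace + [name]


device = InferenceDevice(
    stage("801"), stage("802"), stage("803"), stage("804"), stage("805")
)
trace = device.run([])
```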
Correspondingly, embodiments of the present invention also provide a storage medium storing processor-executable instructions which, when executed by a processor, perform the shared bicycle riding purpose inference method described above.
In summary, the present invention has the following improvements and advantages:
(1) To fill the gap left by existing purpose inference models in inferring the activity purpose of shared bicycle trips, the method takes the lock-and-park points of filtered shared bicycle data (ensuring that the bicycle was actually ridden and that the riding distance is not abnormal) as the center points for travel purpose inference, uses POI (point of interest) data within a certain range around each center point as the environmental factors and inference factors for the activity purpose, improves the gravity model to calculate the probability of each possible purpose, and finally infers the final activity purpose type of the user's ride based on Bayes' principle.
(2) Based on the characteristics of the shared bicycle travel mode and of the shared bicycle user group, and in combination with existing statistics on shared bicycles, the activity types of riding users' travel destinations are summarized and classified, and a set of mapping tables between activity types and POI categories for shared bicycle purpose inference is established from the classification results. Different time weights are set for the same target activity type according to the date and time period, so that the inferred results are more realistic.
(3) To reduce the interference with the purpose inference result caused by differences in individual characteristics, such as floor area and scale, among the real-world entities in POI data, and by differences in the number of POIs mapped by different activity purpose types, the method grades the POI categories according to the scale and size, national or industry standards, and related common knowledge for different kinds of POIs, and replaces absolute counts with frequency densities and type ratios when counting the activity purpose types in a destination candidate area, thereby correcting and improving the model parameters.
It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection therebetween, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.
Claims (7)
1. A shared bicycle riding purpose inference method is characterized by comprising the following steps:
the method comprises the steps of obtaining shared bicycle riding data of a first user, extracting a first riding end point, and creating a buffer area corresponding to the first riding end point to serve as a first candidate area of a riding purpose of the first user;
associating all POIs in the first candidate area with the first user;
according to a pre-constructed mapping relationship model between the target activity types and the POI categories with set time weights, counting the number of POIs mapped by each type of target activity types in the first candidate region and the time weights of each type of target activity types, and constructing a first gravity model according to a counting result, wherein the concrete steps are as follows: constructing a first gravity model according to the following formula;
wherein SelectPOIs is the set of all POIs in the destination candidate area of the first riding end point O; P represents a specific POI in the set; sum(P.category = act) represents the total number of POIs in the destination candidate area whose POI category maps to the same destination activity type act as P; W(P.category = act | t) represents the time weight of the destination activity type act within the corresponding time period; and $d(O_i, P)^2$ represents the square of the Euclidean distance between riding end point $O_i$ and P;
optimizing the first gravity model in sequence, according to the frequency density ratio of each target activity type and the attribute feature weight of each POI category respectively, to obtain a second gravity model, which specifically comprises the following steps:
calculating the frequency density ratio of each target activity type according to the following formula;
wherein $n_{act}$ represents the total number of POIs mapped by the destination activity type act among all POIs; $N_{act}$ represents the number of POIs mapped by the target activity type act in the destination candidate area of a given shared bicycle; and $\rho_{act}$ is the frequency density ratio;
then, the first gravity model is optimized for the first time according to the following formula;
the gravity model after the first optimization is as follows:
wherein $C_{P.category=act}$ is the frequency density ratio of the destination activity type corresponding to the POI category of P; $C_{act}$ is the ratio of the frequency density of the target activity type act to that of all target activity types; and sum(act.category) is the total number of activity types for each category;
according to the attribute feature weight of each POI, carrying out second optimization on the gravity model after the first optimization according to the following formula to obtain a second gravity model;
wherein $W(P_{category})$ is the preset attribute feature weight corresponding to P;
inferring the final riding purpose of the first user in the first candidate area according to the second gravity model and the Bayes formula to obtain a purpose inference result of the first user, which specifically comprises the following steps:
according to the second gravity model, probability calculation is carried out on the candidate purposes in the first candidate area one by one, and then according to the Bayes formula, the candidate purposes P are calculated i The specific formula of the normalized conditional probability is as follows:
wherein $Pr(P_i \mid (O, t))$ represents the conditional probability that POI $P_i$ is the final activity destination of the first user, given that the shared bicycle was parked at the first riding end point O in time period t; $G(O, P_i, t)$ is the result calculated for candidate destination $P_i$ according to the second gravity model; and $\sum_j G(O, P_j, t)$ represents the sum of the results calculated for the respective candidate destinations in the first candidate region according to the second gravity model;
and taking the target activity type with the highest probability as the target activity type corresponding to the final riding target in the first candidate area, thereby obtaining a target inference result of the first user.
2. The shared bicycle riding purpose inference method according to claim 1, wherein the creating a buffer area corresponding to the first riding end point as a first candidate area for a riding purpose of the first user specifically comprises:
constructing a corresponding circular buffer area with the first riding end point as the circle center and a preset length as the radius, and taking the circular buffer area as the first candidate area.
3. The shared bicycle riding purpose inference method of claim 1, wherein the associating all POIs in the first candidate region with the first user is specifically:
and screening all POI from the first candidate area, and associating the ID of the first user with the screened POI type and POI geographic coordinates.
4. The shared bicycle riding purpose inference method according to claim 1, wherein the pre-constructed mapping relationship model between the destination activity type with the set time weight and the POI category is specifically:
screening a plurality of target activity types from the existing target activity types according to the travel mode of the shared bicycle and the travel characteristics of the shared bicycle using groups, and establishing the mapping relation model according to the screened target activity types;
and setting different time weights for various target activity types in the mapping relation model according to working days and rest days by combining daily travel habits of urban residents and operation rules of different places.
5. A statistical analysis method for shared bicycle riding purposes, characterized by comprising the following steps:
obtaining shared bicycle riding data of each time period in an area to be analyzed, screening out data which are not ridden or abnormal in riding, and obtaining the riding data to be analyzed;
performing riding purpose inference on each piece of riding data in the riding data to be analyzed, using the shared bicycle riding purpose inference method according to any one of claims 1 to 4, to obtain inferred data to be analyzed that contains a plurality of riding purpose inference results;
and statistically analyzing the inferred data to be analyzed to obtain an analysis result.
6. A shared bicycle riding purpose inference device, the inference device comprising:
the obtaining module is used for obtaining shared bicycle riding data of a first user, extracting a first riding end point, and creating a buffer area corresponding to the first riding end point to serve as a first candidate area of a riding purpose of the first user;
an association module for associating all POIs in the first candidate region with the first user;
the model building module is used for counting the number of POIs mapped by various types of target activities in the first candidate region and the time weights of various types of target activities according to a pre-built mapping relation model between the target activity types and the POI categories with set time weights, and building a first gravity model according to a counting result, wherein the model building module specifically comprises the following steps of: constructing a first gravity model according to the following formula;
wherein SelectPOIs is the set of all POIs in the destination candidate area of the first riding end point O; P represents a specific POI in the set; sum(P.category = act) represents the total number of POIs in the destination candidate area whose POI category maps to the same destination activity type act as P; W(P.category = act | t) represents the time weight of the destination activity type act within the corresponding time period; and $d(O_i, P)^2$ represents the square of the Euclidean distance between riding end point $O_i$ and P;
the model optimization module is used for sequentially optimizing the first gravity model according to the frequency density proportion of various target activity types and the attribute characteristic weight of various POIs to obtain a second gravity model, and specifically comprises the following steps:
calculating the frequency density ratio of each target activity type according to the following formula;
wherein $n_{act}$ represents the total number of POIs mapped by the destination activity type act among all POIs; $N_{act}$ represents the number of POIs mapped by the target activity type act in the destination candidate area of a given shared bicycle; and $\rho_{act}$ is the frequency density ratio;
then, the first gravity model is optimized for the first time according to the following formula;
the gravity model after the first optimization is as follows:
wherein $C_{P.category=act}$ is the frequency density ratio of the destination activity type corresponding to the POI category of P; $C_{act}$ is the ratio of the frequency density of the target activity type act to that of all target activity types; and sum(act.category) is the total number of activity types for each category;
according to the attribute feature weight of each POI, carrying out second optimization on the gravity model after the first optimization according to the following formula to obtain a second gravity model;
wherein $W(P_{category})$ is the preset attribute feature weight corresponding to P;
an inference module, configured to infer, according to the second gravity model and a bayesian formula, a final riding purpose of the first user in the first candidate region, and obtain a purpose inference result of the first user, where the purpose inference result is specifically:
according to the second gravity model, probability calculation is carried out on the candidate purposes in the first candidate area one by one, and then according to the Bayes formula, the candidate purposes P are calculated i The concrete formula of the normalized conditional probability of (2) is as follows:
wherein $Pr(P_i \mid (O, t))$ represents the conditional probability that POI $P_i$ is the final activity destination of the first user, given that the shared bicycle was parked at the first riding end point O in time period t; $G(O, P_i, t)$ is the result calculated for candidate destination $P_i$ according to the second gravity model; and $\sum_j G(O, P_j, t)$ represents the sum of the results calculated for the respective candidate destinations in the first candidate region according to the second gravity model;
and taking the target activity type with the highest probability as the target activity type corresponding to the final riding target in the first candidate area, thereby obtaining a target inference result of the first user.
7. A storage medium having stored therein processor-executable instructions, wherein the processor-executable instructions, when executed by a processor, are for implementing a shared bicycle riding purpose inference method as claimed in any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010382582.XA CN111581318B (en) | 2020-05-08 | 2020-05-08 | Shared bicycle riding purpose inference method and device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111581318A (en) | 2020-08-25 |
CN111581318B (en) | 2023-04-07 |
Family
ID=72113291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010382582.XA Active CN111581318B (en) | 2020-05-08 | 2020-05-08 | Shared bicycle riding purpose inference method and device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111581318B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113554353B (en) * | 2021-08-25 | 2024-05-14 | 宁波工程学院 | Public bicycle space scheduling optimization method capable of avoiding space accumulation |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107480807A (en) * | 2017-07-06 | 2017-12-15 | 中山大学 | Shared bicycle destination Forecasting Methodology and device based on space-time layered perception neural networks |
CN107767659A (en) * | 2017-10-13 | 2018-03-06 | 东南大学 | Shared bicycle traffic attraction and prediction of emergence size method based on ARIMA models |
Also Published As
Publication number | Publication date |
---|---|
CN111581318A (en) | 2020-08-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||