CN113868553B - Layered taxi passenger carrying recommendation method and system - Google Patents

Publication number
CN113868553B
Authority
CN
China
Prior art keywords
carrying
passenger
points
recommended
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111101052.4A
Other languages
Chinese (zh)
Other versions
CN113868553A (en
Inventor
刘毅志
刘宇轩
王雪松
廖祝华
赵肄江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University of Science and Technology
Original Assignee
Hunan University of Science and Technology

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9537: Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a layered taxi passenger-carrying recommendation method and system. The method comprises: dividing an urban area into grid areas, mapping the historical passenger-carrying points of taxis and POIs into the grid areas, extracting a spatio-temporal feature vector X from the grid areas, and constructing a spatio-temporal context matrix; inputting the spatio-temporal context matrix into the extreme deep factorization machine model xDeepFM for training to obtain driver-time period-grid area passenger-carrying probability data, and then combining the driver's current spatio-temporal information to obtain the recommended grid area that is near the driver and has the highest passenger-carrying probability; and performing spatio-temporal analysis on the historical passenger-carrying points in the recommended grid area to obtain recommended candidate passenger-carrying points that represent passenger gathering places. By recommending a nearby grid area with the highest passenger-carrying probability for the driver, together with candidate passenger-carrying points that represent passenger gathering places, the invention obtains more accurate recommended positions, improves passenger-carrying recommendation performance, and helps raise both the driver's passenger-carrying efficiency and the taxi driver's income.

Description

Layered taxi passenger carrying recommendation method and system
Technical Field
The invention relates to taxi passenger-carrying recommendation technology, and in particular to a layered taxi passenger-carrying recommendation method and system.
Background
Taxis travel through cities every day and have become an integral part of intelligent transportation systems. Online ride-hailing platforms such as DiDi and Uber not only provide online booking services but also record the GPS data of taxis. Such data has given rise to many location-based services (LBS), such as recommendations of taxi passenger-carrying areas, passenger-carrying points, and passenger-carrying routes. These services play an important role in raising the profits of taxi drivers and reducing fuel consumption.
Compared with traditional recommendation systems, taxi passenger-carrying area recommendation faces new challenges. First, the update speed of GPS data is a challenge. The large volume of GPS data that is updated continuously every day requires a large amount of memory resources and long computation time. In addition, rapid urbanization in China means that roads change quickly, so using long-term GPS data in a taxi passenger-carrying recommendation system introduces more noise: for example, an existing road may have been abandoned or changed, or several trunk roads may have been newly built. Such noise can greatly reduce recommendation accuracy. Using only recent short-term GPS data, however, leads to a data sparsity problem. Matrix factorization techniques are widely used to cope with sparsity, but they make little use of context information, so researchers have turned to models such as factorization machines. Second, the knowledge and rules underlying GPS data, such as the rich spatio-temporal context, remain to be mined further. While a taxi is cruising, the driver's choice of passenger-carrying area changes with temporal and spatial information. For example, residential areas generate a large travel demand during the morning commute, while large numbers of passengers often appear in entertainment districts after 11 p.m. How to integrate spatio-temporal context into passenger-carrying area recommendation is therefore also an important open problem.
How to improve passenger-carrying efficiency remains the main problem that taxi recommendation services must solve. Passenger-carrying area recommendation, passenger-carrying point recommendation, and passenger-carrying route recommendation all aim to raise the probability of finding a passenger. Area recommendation suggests passenger-carrying hot-spot areas in different time periods to drivers of empty taxis, so that a driver cruising in the recommended area has a higher probability of finding a passenger. However, a recommended area covers a certain spatial extent. To obtain more accurate recommended positions, layered taxi passenger-carrying recommendation is needed.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the above problems in the prior art, the invention provides a layered taxi passenger-carrying recommendation method and system, which output the recommended grid area that is near the driver and has the highest passenger-carrying probability as the first-level recommendation result, and output recommended candidate passenger-carrying points that represent passenger gathering places as the second-level recommendation result. More accurate recommended positions can thereby be obtained, passenger-carrying recommendation performance is improved, personalized recommendations are provided for drivers, and both the driver's passenger-carrying efficiency and the taxi driver's income are raised.
To solve the above technical problem, the invention adopts the following technical solution:
A layered taxi passenger-carrying recommendation method comprises the following steps:
1) Dividing an urban area into grid areas, and mining historical passenger-carrying points and POIs from the original GPS tracks of taxis;
2) Mapping the historical passenger-carrying points and POIs into the divided grid areas, extracting a spatio-temporal feature vector X from the grid areas, and constructing a spatio-temporal context matrix;
3) Inputting the spatio-temporal context matrix into the extreme deep factorization machine model xDeepFM for training, obtaining driver-time period-grid area passenger-carrying probability data from the trained xDeepFM model, then combining the driver's current spatio-temporal information to obtain the recommended grid area that is near the driver and has the highest passenger-carrying probability, and outputting it as the first-level recommendation result;
4) Performing spatio-temporal analysis on the historical passenger-carrying points in the recommended grid area to obtain recommended candidate passenger-carrying points that represent passenger gathering places, and outputting them as the second-level recommendation result.
Optionally, the spatio-temporal feature vector X extracted from the grid areas in step 2) includes: a driver feature D, used to distinguish drivers and one-hot encoded; a time feature T, the time period of the day in which the order falls, one-hot encoded; a grid feature G, used to distinguish grid areas and one-hot encoded; and grid attribute features A, comprising the number of historical passenger-carrying points, the number of POIs, the average passenger-carrying duration, the average passenger-carrying distance, the proportion of each POI type, and the geometric center position of the grid, all normalized. The constructed spatio-temporal context matrix consists of the spatio-temporal feature vector X and a target variable Y, where Y is the number of times the driver picked up passengers in the grid during the time period.
Optionally, step 3) includes:
3.1) Inputting the spatio-temporal context matrix into the extreme deep factorization machine model xDeepFM for training to obtain a trained xDeepFM model, and generating driver-time period-grid area passenger-carrying probability data with the trained model, wherein the data describes the passenger-carrying probability of each driver for each grid area in each time period;
3.2) Determining, from the driver's current spatio-temporal information, the driver's current time period and 9 grid areas, namely the grid area where the driver is currently located and its 8 adjacent grid areas;
3.3) Looking up the 9 grid areas and the driver's current time period in the driver-time period-grid area passenger-carrying probability data to obtain the passenger-carrying probabilities of the 9 grid areas, taking the grid area with the highest passenger-carrying probability among the 9 as the recommended grid area, and outputting it as the first-level recommendation result.
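Steps 3.2) and 3.3) amount to a lookup over a 3x3 neighbourhood followed by an arg-max. The following is a minimal sketch under stated assumptions: grid areas are identified by hypothetical (row, column) indices, the probability data is held in a plain dictionary keyed by (driver, time period, grid), and a grid missing from the table is scored 0.0. None of these representation choices comes from the patent.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # hypothetical (row, col) identifier of a grid area


def recommend_grid(prob: Dict[Tuple[str, int, Cell], float],
                   driver: str, period: int, current: Cell) -> Cell:
    """Return the grid with the highest predicted probability among the
    driver's current grid and its 8 adjacent grids."""
    row, col = current
    # Step 3.2: the current grid plus its 8 neighbours (9 grids in total).
    candidates = [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    # Step 3.3: look up each candidate; grids absent from the table
    # (assumption: e.g. outside the city) are scored 0.0.
    return max(candidates, key=lambda c: prob.get((driver, period, c), 0.0))
```

Note that a grid with a higher probability elsewhere in the city is ignored on purpose: only the 9 grids around the driver are compared.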
Optionally, step 4) includes:
4.1) First dividing a day into different time periods, and assigning the historical passenger-carrying points in the recommended grid area to the corresponding time periods according to their time attributes; then performing spatial clustering analysis on the historical passenger-carrying points of each time period according to their geographic positions to obtain a first group of candidate passenger-carrying points TS[id, lng, lat, ts, 1], taking the center of each cluster as the geographic coordinates of the candidate passenger-carrying point, where id is the number of the candidate passenger-carrying point, (lng, lat) are its geographic coordinates, and ts is the time period to which the candidate passenger-carrying point belongs;
4.2) Considering first the geographic positions of the historical passenger-carrying points in the recommended area, performing spatial clustering analysis on all historical passenger-carrying points in the recommended area according to their spatial attributes; if, within a resulting cluster, the number of historical passenger-carrying points belonging to a certain time period exceeds a set threshold, retaining the cluster for that time period, thereby obtaining a second group of candidate passenger-carrying points ST[id, lng, lat, ts, 0], taking the center of each cluster as the geographic coordinates of the candidate passenger-carrying point, where id is the number of the candidate passenger-carrying point, (lng, lat) are its geographic coordinates, and ts is the time period in which the number of historical passenger-carrying points in the cluster exceeds the threshold;
4.3) Merging candidate passenger-carrying points of the first group TS[id, lng, lat, ts, 1] and candidate passenger-carrying points of the second group ST[id, lng, lat, ts, 0] whose mutual distance is smaller than a preset threshold into new candidate passenger-carrying points, thereby obtaining the final recommended candidate passenger-carrying points that represent passenger gathering places, and outputting them as the second-level recommendation result.
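Steps 4.1) to 4.3) can be sketched as follows. The text does not name a clustering algorithm, so this sketch substitutes a simple greedy distance-threshold clustering as a stand-in. The function names, the (lng, lat, ts) input tuples, the averaging of merged coordinates, the requirement that merged candidates share the same time period, and the choice to keep unmerged candidates are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def haversine_m(lng1, lat1, lng2, lat2):
    """Great-circle distance in metres between two (lng, lat) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lng2 - lng1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def centre(members):
    """Mean (lng, lat) of a list of (lng, lat, ts) points."""
    n = len(members)
    return (sum(m[0] for m in members) / n, sum(m[1] for m in members) / n)


def cluster(points, radius_m):
    """Greedy stand-in clustering: join the first cluster whose centre is
    within radius_m, otherwise start a new cluster."""
    clusters = []
    for p in points:
        for members in clusters:
            c = centre(members)
            if haversine_m(p[0], p[1], c[0], c[1]) <= radius_m:
                members.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def first_group(points, radius_m):
    """Step 4.1: split by time period, then cluster -> TS[id, lng, lat, ts, 1]."""
    by_ts = defaultdict(list)
    for p in points:
        by_ts[p[2]].append(p)
    out = []
    for ts in sorted(by_ts):
        for members in cluster(by_ts[ts], radius_m):
            lng, lat = centre(members)
            out.append((len(out), lng, lat, ts, 1))
    return out


def second_group(points, radius_m, min_count):
    """Step 4.2: cluster all points, keep a cluster for each time period
    whose point count exceeds min_count -> ST[id, lng, lat, ts, 0]."""
    out = []
    for members in cluster(points, radius_m):
        lng, lat = centre(members)
        counts = defaultdict(int)
        for p in members:
            counts[p[2]] += 1
        for ts in sorted(counts):
            if counts[ts] > min_count:
                out.append((len(out), lng, lat, ts, 0))
    return out


def merge(ts_cands, st_cands, merge_m):
    """Step 4.3: merge TS/ST candidates closer than merge_m (assumption:
    same time period, averaged position); keep the rest unchanged."""
    merged, used = [], set()
    for a in ts_cands:
        partner = None
        for k, b in enumerate(st_cands):
            if (k not in used and a[3] == b[3]
                    and haversine_m(a[1], a[2], b[1], b[2]) < merge_m):
                partner = k
                break
        if partner is None:
            merged.append((a[1], a[2], a[3]))
        else:
            b = st_cands[partner]
            used.add(partner)
            merged.append(((a[1] + b[1]) / 2, (a[2] + b[2]) / 2, a[3]))
    merged += [(b[1], b[2], b[3]) for k, b in enumerate(st_cands) if k not in used]
    return merged
```

A density-based algorithm could replace `cluster` without changing the other three functions, since they only rely on it returning lists of member points.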
Optionally, step 4) is followed by step 5), and step 5) includes: calculating the passenger-carrying probability of each recommended candidate passenger-carrying point, taking the recommended candidate passenger-carrying point with the highest passenger-carrying probability as the recommended passenger-carrying point, and outputting it as the third-level recommendation result.
Optionally, the function expression for calculating the passenger-carrying probability of a recommended candidate passenger-carrying point is:
In the above formula, P(i) is the passenger-carrying probability of the i-th recommended candidate passenger-carrying point, Num(i) is the number of historical passenger-carrying points in the cluster to which the i-th candidate passenger-carrying point belongs, and Area(i) is the area of that cluster. After the new candidate passenger-carrying points are obtained in step 4.3), the area of the cluster to which each new candidate passenger-carrying point belongs is calculated from the historical passenger-carrying points of the clusters represented by the merged candidate passenger-carrying points.
Optionally, step 5) is followed by step 6), and step 6) includes: calculating the passenger-carrying probability of each route segment in the recommended grid area, taking the route with the highest passenger-carrying probability as the recommended passenger-carrying route, and outputting it as the fourth-level recommendation result, the optimal passenger-carrying route.
Optionally, the function expression for calculating the passenger-carrying probability of each route segment in the recommended grid area is:
In the above formula, Pr(j) is the passenger-carrying probability of the j-th route in the recommended grid area, P(1) and P(2) are the passenger-carrying probabilities of the two candidate passenger-carrying points with the highest passenger-carrying probability on the j-th route, and D1 and D2 are the distances between those two candidate passenger-carrying points and the driver.
In addition, the invention provides a layered taxi passenger-carrying recommendation system comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to execute the steps of the layered taxi passenger-carrying recommendation method.
In addition, the invention provides a computer-readable storage medium storing a computer program that is programmed or configured to execute the layered taxi passenger-carrying recommendation method.
Compared with the prior art, the invention has the following advantages:
1. The invention outputs the recommended grid area that is near the driver and has the highest passenger-carrying probability as the first-level recommendation result, and outputs recommended candidate passenger-carrying points that represent passenger gathering places as the second-level recommendation result. More accurate recommended positions can thereby be obtained, passenger-carrying recommendation performance is improved, personalized recommendations are provided for drivers, and both the driver's passenger-carrying efficiency and the taxi driver's income are raised.
2. With the recommended grid area as the first-level recommendation result and the recommended candidate passenger-carrying points as the second-level recommendation result, the second-level result can be treated as an optional result as needed, which allows customization to meet the requirements of different services.
3. With the recommended grid area as the first-level recommendation result and the recommended candidate passenger-carrying points as the second-level recommendation result, further levels of recommendation results, such as recommended passenger-carrying points and recommended passenger-carrying routes, can be added as needed.
Drawings
FIG. 1 is a schematic flowchart of the method according to an embodiment of the invention.
FIG. 2 is a framework diagram of the layered taxi passenger-carrying recommendation method according to an embodiment of the invention.
FIG. 3 is a spatio-temporal context matrix constructed according to an embodiment of the invention.
FIG. 4 is a flowchart of taxi passenger-carrying area recommendation according to the first embodiment of the invention.
FIG. 5 is a basic flowchart of the method according to the second embodiment of the invention.
FIG. 6 is a schematic flowchart of the method according to the third embodiment of the invention.
Detailed Description
As shown in FIG. 1, FIG. 2 and FIG. 4, the layered taxi passenger-carrying recommendation method of this embodiment comprises:
1) Dividing an urban area into grid areas, and mining historical passenger-carrying points and POIs from the original GPS tracks of taxis;
2) Mapping the historical passenger-carrying points and POIs into the divided grid areas, extracting a spatio-temporal feature vector X from the grid areas, and constructing a spatio-temporal context matrix;
3) Inputting the spatio-temporal context matrix into the extreme deep factorization machine model xDeepFM for training, obtaining driver-time period-grid area passenger-carrying probability data from the trained xDeepFM model, then combining the driver's current spatio-temporal information to obtain the recommended grid area that is near the driver and has the highest passenger-carrying probability, and outputting it as the first-level recommendation result;
4) Performing spatio-temporal analysis on the historical passenger-carrying points in the recommended grid area to obtain recommended candidate passenger-carrying points that represent passenger gathering places, and outputting them as the second-level recommendation result.
In this embodiment, step 1) uses equal-side-length grid division. The set of grid areas obtained by dividing the urban area is denoted Gridset, and each grid area is distinguished by a grid area ID. When historical passenger-carrying points and POIs are mined from the original taxi GPS tracks, each driver is likewise assigned a driver ID. Each GPS record contains the driver's identifier, the time at which the GPS point was recorded, the longitude and latitude of the GPS point, and the passenger-carrying state of the taxi. The passenger-carrying state indicates whether the taxi currently has a passenger: '0' means the taxi is empty, and '1' means it is carrying a passenger. When the state changes from 0 to 1, the taxi has picked up a passenger; the GPS point recorded at that moment is called a passenger-carrying point (pick-up point). Similarly, when the state changes from 1 to 0, the passenger has got off, and the GPS point recorded at that moment is called a drop-off point. From the time and position of the passenger-carrying point and the drop-off point, the passenger-carrying distance and passenger-carrying duration of the trip can be calculated.
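The state-transition rule above can be sketched as a single pass over one driver's time-ordered track. This is a minimal illustration, not the patent's implementation: the `GpsPoint` record layout, the use of integer seconds for time, and the decision to discard a trip whose drop-off has not yet been observed are assumptions.

```python
from typing import List, NamedTuple, Tuple


class GpsPoint(NamedTuple):
    driver_id: str
    t: int         # timestamp in seconds (assumed representation)
    lng: float
    lat: float
    occupied: int  # 0 = empty, 1 = carrying a passenger


def extract_trips(track: List[GpsPoint]) -> List[Tuple[GpsPoint, GpsPoint]]:
    """Return (passenger-carrying point, drop-off point) pairs from one
    driver's time-ordered GPS track."""
    trips = []
    pickup = None
    for prev, cur in zip(track, track[1:]):
        if prev.occupied == 0 and cur.occupied == 1:
            pickup = cur                 # state 0 -> 1: passenger picked up
        elif prev.occupied == 1 and cur.occupied == 0 and pickup is not None:
            trips.append((pickup, cur))  # state 1 -> 0: passenger dropped off
            pickup = None
    return trips
```

The passenger-carrying duration of a trip is then `dropoff.t - pickup.t`, and the passenger-carrying distance can be accumulated over the GPS points recorded between the two.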
In this embodiment, the spatio-temporal feature vector X extracted from the grid areas in step 2) includes: a driver feature D, used to distinguish drivers and one-hot encoded; a time feature T, the time period of the day in which the order falls, one-hot encoded; a grid feature G, used to distinguish grid areas and one-hot encoded; and grid attribute features A, comprising the number of historical passenger-carrying points, the number of POIs, the average passenger-carrying duration, the average passenger-carrying distance, the proportion of each POI type, and the geometric center position of the grid, all normalized. The constructed spatio-temporal context matrix consists of the spatio-temporal feature vector X and a target variable Y, where Y is the number of times the driver picked up passengers in the grid during the time period.
In this embodiment, the driver feature D is expressed by the driver ID; any other field that can distinguish drivers may be used instead, for example a hash string generated from the driver ID.
In this embodiment, the time feature T takes one of the 24 time periods into which a day is divided.
In this embodiment, the grid feature G is expressed by the grid area ID; any other field that can distinguish grid areas may be used instead, for example a hash string generated from the grid area ID.
Passenger-carrying points are extracted from the taxi GPS data, and the passenger-carrying points and POI data are mapped into the corresponding grids according to their longitude and latitude. Counting the passenger-carrying point data and POI data in each grid yields the grid attribute features A, which are as follows.
(1) The number of historical passenger-carrying points is the number of passenger-carrying points in the historical data of the grid area. It is an important factor in a taxi driver's passenger-seeking strategy: it shows clearly whether the grid is a passenger-carrying hot-spot area and also reflects how much drivers favor the grid. Mapping the passenger-carrying points into the corresponding grids allows the number of historical passenger-carrying points of each grid to be counted.
(2) The number of POIs (points of interest) is the total number of POIs of all types in the grid area. It is also an important feature of the grid because it reflects how popular the grid is: the more POIs a grid contains, the more popular it is and the greater the probability that drivers visit it.
(3) The average passenger-carrying duration is the mean driving time of taxis after picking up passengers in the grid. The longer the driving time after a pick-up, the greater the economic benefit to the driver is considered to be, so a larger average passenger-carrying duration means a greater economic benefit from picking up passengers in the grid. It is obtained by dividing the total passenger-carrying duration of the grid area by the number of pick-ups in the grid area:
$$T_i = \frac{1}{\mathrm{NUM}} \sum_{j=1}^{\mathrm{NUM}} \mathrm{Time}_j$$
In the above equation, $T_i$ is the average passenger-carrying duration of grid area $G_i$, NUM is the number of orders that occurred in grid area $G_i$, and $\mathrm{Time}_j$ is the passenger-carrying duration of the j-th order in grid area $G_i$.
(4) The average passenger-carrying distance is the mean driving distance of taxis after picking up passengers in the grid. As with the average passenger-carrying duration, the longer the driving distance after a pick-up, the greater the economic benefit to the driver is considered to be, so a larger average passenger-carrying distance means a greater economic benefit from picking up passengers in the grid. It is obtained by dividing the total passenger-carrying distance of the grid area by the number of pick-ups in the grid area:
$$D_i = \frac{1}{\mathrm{NUM}} \sum_{j=1}^{\mathrm{NUM}} \mathrm{distance}_j$$
In the above equation, $D_i$ is the average passenger-carrying distance of grid area $G_i$, NUM is the number of orders that occurred in grid area $G_i$, and $\mathrm{distance}_j$ is the passenger-carrying distance of the j-th order in grid area $G_i$.
(5) The proportion of each POI type is the number of POIs of that type divided by the total number of POIs in the grid. For example, a high proportion of scenic POIs indicates that the grid is mainly a scenic area, and scenic areas generally have more passengers on weekends and holidays. The proportion of each POI type is:
$$\mathrm{Type}_j(i) = \frac{n_j(i)}{\sum_{k} n_k(i)}$$
In the above equation, $\mathrm{Type}_j(i)$ is the proportion of POIs of the j-th type in grid $G_i$, and $n_j(i)$ is the number of POIs of the j-th type in grid $G_i$.
(6) The geometric center position of the grid is represented by the latitude $O_{lat}$ and longitude $O_{lot}$ of the grid center.
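The attribute statistics above can be sketched as a grouping pass over orders and POIs. This is an illustration under assumptions that are not in the patent: grid cells are square in degrees with a south-west origin (`lng0`, `lat0`) and side `cell_deg`, a grid is identified by a hypothetical (row, column) pair, and each order is given as (lng, lat, duration, distance) at its passenger-carrying point.

```python
from collections import Counter, defaultdict


def grid_id(lng, lat, lng0, lat0, cell_deg):
    """(row, col) of the equal-sided cell that contains (lng, lat)."""
    return (int((lat - lat0) // cell_deg), int((lng - lng0) // cell_deg))


def grid_attributes(orders, pois, lng0, lat0, cell_deg):
    """orders: (lng, lat, duration_s, distance_m) per pick-up.
    pois: (lng, lat, poi_type). Returns attributes per grid."""
    by_grid = defaultdict(list)
    for lng, lat, dur, dist in orders:
        by_grid[grid_id(lng, lat, lng0, lat0, cell_deg)].append((dur, dist))
    poi_types = defaultdict(Counter)
    for lng, lat, kind in pois:
        poi_types[grid_id(lng, lat, lng0, lat0, cell_deg)][kind] += 1

    attrs = {}
    for g in set(by_grid) | set(poi_types):
        trips = by_grid.get(g, [])
        num = len(trips)                     # NUM: orders in this grid
        n_poi = sum(poi_types[g].values())   # total POIs in this grid
        attrs[g] = {
            "pickups": num,
            "pois": n_poi,
            # T_i and D_i: totals divided by the number of orders.
            "avg_duration": sum(d for d, _ in trips) / num if num else 0.0,
            "avg_distance": sum(d for _, d in trips) / num if num else 0.0,
            # Type_j(i): count of each POI type over the total POI count.
            "poi_share": {k: c / n_poi for k, c in poi_types[g].items()},
        }
    return attrs
```

Returning 0.0 for grids without orders is a convenience of the sketch; the source does not say how empty grids are handled.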
In this embodiment, the spatio-temporal feature vector X consists of four parts: the driver feature D, the grid feature G, the time feature T, and the grid attribute features A. The time feature T and the grid attribute features A carry the spatio-temporal context information used to recommend passenger-carrying areas to the driver more accurately. The driver feature D is described by the driver ID and is one-hot encoded. The grid feature G is described by the grid ID and is one-hot encoded. A day is divided into 24 time periods, so the time feature T is also one-hot encoded. The grid attribute features A comprise the number of historical passenger-carrying points, the number of POIs, the average passenger-carrying duration, the average passenger-carrying distance, the proportion of each POI type, and the geometric center position of the grid, and these features are normalized. The target variable Y is the number of times the driver picked up passengers in the corresponding grid area during the corresponding time period, and it is also normalized. The normalization method chosen in this embodiment is linear function normalization (Min-Max scaling):
$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$
In the above formula, $X$ is the original value, $X_{min}$ is the minimum value in the original data set, $X_{max}$ is the maximum value in the original data set, and $X_{norm}$ is the normalized value. Finally, the constructed spatio-temporal context matrix consists of the spatio-temporal feature vectors X and the target variable Y, where Y is the number of times the driver picked up passengers in the grid during the time period. An example of a spatio-temporal context matrix constructed in this embodiment is shown in FIG. 3. Each row of the matrix represents a feature vector X(i) and its corresponding target variable Y(i). In the example, the first 4 columns represent the driver ID, the next 6 columns the grid ID, the next 4 columns the time period, and the following 6 columns the grid attributes. The rightmost column is the number of times the driver picked up passengers in the corresponding grid during the corresponding time period. As the figure shows, the driver ID, grid ID, and time period are all one-hot encoded, while the grid attributes and the target Y are normalized.
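The construction of the matrix can be sketched as follows: one-hot blocks for driver, grid, and time period, followed by Min-Max-scaled attribute columns and the scaled target. The record layout, the function names, and the handling of a constant column (mapped to 0.0 to avoid division by zero) are assumptions of this sketch; the column order follows the description of FIG. 3.

```python
def one_hot(index, size):
    """Vector of length size with a single 1.0 at position index."""
    v = [0.0] * size
    v[index] = 1.0
    return v


def min_max(values):
    """Linear function normalization: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # assumption: a constant column maps to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_matrix(records, n_drivers, n_grids, n_periods=24):
    """records: (driver_idx, grid_idx, period_idx, attrs, pickup_count).
    Returns rows laid out as [D | G | T | A | Y]."""
    n_attr = len(records[0][3])
    # Normalize each attribute column and the target over the whole data set.
    attr_cols = [min_max([r[3][k] for r in records]) for k in range(n_attr)]
    y = min_max([r[4] for r in records])
    rows = []
    for i, (d, g, t, _, _) in enumerate(records):
        x = one_hot(d, n_drivers) + one_hot(g, n_grids) + one_hot(t, n_periods)
        x += [col[i] for col in attr_cols]
        rows.append(x + [y[i]])
    return rows
```

Each row then has `n_drivers + n_grids + n_periods + n_attr + 1` columns, with the last column holding the normalized target Y.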
In this embodiment, step 3) includes:
3.1) Inputting the spatio-temporal context matrix into the extreme deep factorization machine model xDeepFM for training to obtain a trained xDeepFM model, and generating driver-time period-grid area passenger-carrying probability data with the trained model, wherein the data describes the passenger-carrying probability of each driver for each grid area in each time period; for example, the passenger-carrying probability of driver 1 for grid 1 in the 7-8 time period is 0.65;
3.2) Determining, from the driver's current spatio-temporal information, the driver's current time period and 9 grid areas, namely the grid area where the driver is currently located and its 8 adjacent grid areas;
3.3) Looking up the 9 grid areas and the driver's current time period in the driver-time period-grid area passenger-carrying probability data to obtain the passenger-carrying probabilities of the 9 grid areas, taking the grid area with the highest passenger-carrying probability among the 9 as the recommended grid area, and outputting it as the first-level recommendation result.
The method of this embodiment uses the extreme deep factorization machine model xDeepFM to generate the mapping from the spatio-temporal context matrix to the driver-time period-grid area passenger-carrying probability data. Note that the method only applies the existing xDeepFM model and does not modify it. xDeepFM is a machine learning model proposed by the Social Computing Group of Microsoft Research Asia. It learns high-order feature interactions automatically in both explicit and implicit ways, its feature interactions take place at the vector level, and it combines memorization and generalization abilities. Its specific implementation is therefore not repeated here.
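For readers unfamiliar with what "feature interaction at the vector level" means, the following is a minimal pure-Python sketch of one layer of the Compressed Interaction Network, the component of xDeepFM that performs explicit interactions. It is based on the published description of xDeepFM, not on anything the patent specifies. The function name and the nested-list representation are assumptions, and a practical model would use a deep learning framework with learned weights.

```python
def cin_layer(x0, xk, w):
    """One Compressed Interaction Network layer.

    x0: m field embeddings, each a list of D floats (the input layer).
    xk: h_prev feature maps from the previous layer, each of length D.
    w:  weights indexed as w[h][i][j], with h over output feature maps,
        i over rows of xk, and j over rows of x0.
    Returns h_next feature maps of length D, where each output row is
    sum_i sum_j w[h][i][j] * (xk[i] * x0[j]) taken element-wise.
    """
    dim = len(x0[0])
    out = []
    for w_h in w:
        row = [0.0] * dim
        for i, xi in enumerate(xk):
            for j, xj in enumerate(x0):
                for d in range(dim):
                    # The product is per embedding dimension, so whole
                    # vectors interact rather than individual bits.
                    row[d] += w_h[i][j] * xi[d] * xj[d]
        out.append(row)
    return out
```

In the first layer `xk` is simply `x0`, and stacking further layers raises the order of the explicit interactions by one each time.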
And 4) performing space-time analysis (STA) on the historical carrying points in the recommended grid area to obtain recommended candidate carrying points representing the passenger gathering place, and outputting the recommended candidate carrying points as a second-level recommended result. In this embodiment, step 4) includes:
4.1 Firstly, dividing a day into different time periods, and assigning the historical carrying points in the recommended grid area to the corresponding time periods according to their time attributes; then carrying out spatial clustering analysis on the historical carrying points of each time period according to their geographic positions to obtain a first group of candidate carrying points TS[id, lng, lat, ts, 1], wherein the central position of a cluster is taken as the geographic coordinates of the candidate carrying point, id is the number of the candidate carrying point, (lng, lat) is its geographic coordinates, and ts is the time period to which the candidate carrying point belongs;
4.2 Carrying out spatial clustering analysis on all historical carrying points in the recommended grid area according to their geographic positions (spatial attributes); for each resulting cluster, if the number of historical carrying points falling within a certain time period is larger than a set threshold value, assigning the cluster to that time period, so as to obtain a second group of candidate carrying points ST[id, lng, lat, ts, 0], wherein the central position of the cluster is taken as the geographic coordinates of the candidate carrying point, id is the number of the candidate carrying point, (lng, lat) is its geographic coordinates, and ts is the time period in which the number of historical carrying points in the cluster exceeds the threshold value after spatial clustering. The candidate carrying points represent passenger gathering areas to a certain extent; the clustering analysis of step 4.2) reveals the spatial aggregation of carrying points and compensates, to a certain extent, for the data sparsity caused by the clustering analysis of step 4.1);
4.3 Merging each pair of candidate carrying points, one from the first group TS[id, lng, lat, ts, 1] and one from the second group ST[id, lng, lat, ts, 0], whose distance is smaller than a preset threshold value into a new candidate carrying point, thereby obtaining the recommended candidate carrying points that finally represent the passenger gathering places, and outputting them as the second-level recommendation result.
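The two clustering passes of steps 4.1) and 4.2) can be sketched as follows. The text does not name a clustering algorithm, so this sketch substitutes a simple single-linkage grouping with a distance cutoff `eps`; it also assumes that coordinates are already projected to metres, and it introduces a hypothetical `min_pts` parameter to discard tiny clusters in step 4.1). The candidate id field is omitted for brevity.

```python
from collections import defaultdict
from math import hypot


def cluster(points, eps):
    """Group points linked by chains of pairwise distances <= eps (single linkage).

    This is a stand-in for whatever spatial clustering the method actually uses.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if hypot(dx, dy) <= eps:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(points)):
        groups[find(i)].append(i)
    return list(groups.values())


def centroid(points, idx):
    """Mean position of the points selected by idx."""
    n = len(idx)
    return (sum(points[i][0] for i in idx) / n, sum(points[i][1] for i in idx) / n)


def ts_candidates(pickups, eps, min_pts):
    """Step 4.1: split by time period first, then cluster each period spatially."""
    by_period = defaultdict(list)
    for x, y, ts in pickups:
        by_period[ts].append((x, y))
    out = []
    for ts, pts in sorted(by_period.items()):
        for idx in cluster(pts, eps):
            if len(idx) >= min_pts:  # min_pts is an assumption of this sketch
                out.append((*centroid(pts, idx), ts, 1))
    return out


def st_candidates(pickups, eps, threshold):
    """Step 4.2: cluster all points spatially, then keep periods above the threshold."""
    pts = [(x, y) for x, y, _ in pickups]
    out = []
    for idx in cluster(pts, eps):
        counts = defaultdict(int)
        for i in idx:
            counts[pickups[i][2]] += 1
        for ts, n in sorted(counts.items()):
            if n > threshold:
                out.append((*centroid(pts, idx), ts, 0))
    return out
```

Each returned tuple is (x, y, ts, flag), mirroring the [id, lng, lat, ts, 1] and [id, lng, lat, ts, 0] records. The pairwise loop is quadratic, which is acceptable for the points of a single grid area but would need a spatial index for larger inputs.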
In this embodiment, two groups of candidate carrying points are obtained through steps 4.1) and 4.2). The two groups reveal the aggregation of carrying points from different dimensions, and each candidate carrying point represents, to some extent, a high-density passenger aggregation area. The two groups may contain candidate carrying points that are geographically identical or close to each other, and such duplicates would introduce errors into the subsequent recommendation, so they need to be merged. Let candidate carrying points i and j come from the second group ST[id, lng, lat, ts, 0] and the first group TS[id, lng, lat, ts, 1], respectively. When the distance between i and j is smaller than the distance threshold d (for example, 40 m), candidate carrying points i and j are judged to represent the same high-density aggregation area and are treated as duplicates. In this embodiment, duplicate candidate carrying points are filtered by a voting mechanism and a new candidate carrying point is generated: the distances between candidate carrying points serve as the voting basis, and the new candidate carrying point is generated from the voting result and the distance between the candidate carrying points. K is the number of historical carrying points in the aggregation area represented by the duplicate candidate carrying points, and d is the threshold for determining duplicate candidate carrying points. The candidate scores are calculated as follows:
In the above formula, Score(i) is the candidate score of candidate carrying point i, dist(i, k) is the distance between candidate carrying points i and k, and dist(i, j) is the distance between candidate carrying points i and j. Based on the above formula, in this embodiment, for candidate carrying points i and j that come from the second group ST[id, lng, lat, ts, 0] and the first group TS[id, lng, lat, ts, 1], respectively, and whose mutual distance is smaller than the distance threshold d, the position of the merged new candidate carrying point is calculated based on the following formula:
In the above formula, Score_i is the candidate score of candidate carrying point i, Score_j is the candidate score of candidate carrying point j, dist_{i,j} is the distance between candidate carrying points i and j, L_{lat,lng} is the geographic position of the new candidate carrying point after merging, and I_{lat,lng} and J_{lat,lng} are the geographic positions of the two candidate carrying points i and j before merging, respectively. After the new candidate carrying point is obtained, the area of the cluster where it is located needs to be calculated from the historical carrying points of the cluster areas represented by the duplicate candidate carrying points.
Referring to fig. 1, this embodiment further refines the recommendation to a single recommended carrying point. In this embodiment, step 4) is further followed by step 5), where step 5) includes: calculating the passenger carrying probability of each recommended candidate carrying point, taking the recommended candidate carrying point with the highest passenger carrying probability as the recommended carrying point, and outputting it as the third-level recommendation result.
In this embodiment, the function expression for calculating the passenger carrying probability of a recommended candidate carrying point is:
In the above formula, P(i) is the passenger carrying probability of the i-th recommended candidate carrying point, Num(i) is the number of historical carrying points in the cluster where the i-th candidate carrying point is located, and Area(i) is the area of that cluster. After a new candidate carrying point is obtained in step 4.3), the area of the cluster where it is located is calculated from the historical carrying points of the cluster areas represented by the duplicate candidate carrying points.
Further, this embodiment also makes finer position recommendations on top of the passenger carrying area recommendation, namely passenger carrying point recommendation and passenger carrying route recommendation, which improves the performance of passenger carrying recommendation and raises the probability that the driver finds passengers. Referring to fig. 1, step 5) in this embodiment is further followed by step 6), where step 6) includes: calculating the passenger carrying probability of each route segment in the recommended grid area, taking the route with the highest passenger carrying probability as the recommended passenger carrying route, and outputting it as the fourth-level recommendation result, i.e. the optimal passenger carrying route.
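Since the expression for P(i) is not reproduced in this text, the following is only a sketch under a stated assumption: that P(i) grows with the density Num(i) / Area(i) of the cluster, normalised over all recommended candidate carrying points so that the values sum to 1. The function names are hypothetical.

```python
def carrying_probabilities(clusters):
    """clusters: list of (num_points, area) pairs, one per candidate carrying point.

    Returns the normalised cluster densities. This normalised-density form is an
    assumption of the sketch, not the patent's formula.
    """
    densities = [num / area for num, area in clusters]
    total = sum(densities)
    return [dens / total for dens in densities]


def recommend_point(clusters):
    """Index of the candidate carrying point with the highest probability."""
    probs = carrying_probabilities(clusters)
    return max(range(len(probs)), key=probs.__getitem__)
```

Under this assumption, two clusters of equal area with 10 and 30 historical carrying points receive probabilities of about 0.25 and 0.75, and the second is output as the third-level result.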
In this embodiment, the function expression for calculating the passenger probability of each route segment in the recommended grid region is:
In the above formula, Pr(j) is the passenger carrying probability of the j-th route in the recommended grid area, P(1) and P(2) are the passenger carrying probabilities of the two candidate carrying points with the highest passenger carrying probability on the j-th route in the recommended grid area, and D1 and D2 are the distances between those two candidate carrying points and the driver.
In summary, the hierarchical taxi passenger carrying recommendation method of this embodiment deeply mines the space-time context of GPS data by combining POI and passenger carrying point analysis. In this embodiment, passenger carrying points are extracted from the original GPS data, the passenger carrying points and POIs are mapped into the corresponding grids, and space-time features are extracted from the grids to construct a space-time context matrix; the space-time context not only compensates for data sparsity but also improves the accuracy of passenger carrying area recommendation. The space-time context is fed into the extremely deep factorization machine model through feature engineering, and because the model learns both high-order and low-order feature interactions at the vector level in implicit and explicit ways simultaneously, the accuracy of passenger carrying area recommendation is effectively improved. The trained model yields the driver-time period-grid region passenger carrying probability data, which, combined with the driver's current space-time information, allows a nearby passenger carrying area with a high passenger carrying probability to be recommended to the driver. Space-time analysis is then carried out within the recommended area to obtain candidate carrying points representing passenger gathering places, and the passenger carrying probabilities of the candidate carrying points and of the passenger carrying routes in the recommended area are calculated, so that passenger carrying points and routes are recommended to the driver and the passenger carrying recommendation performance is further improved.
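The route-level selection of step 6) can be sketched as follows. The expression for Pr(j) is not reproduced in this text, so the combination of P(1), P(2), D1 and D2 used here (a distance-discounted sum) is a hypothetical placeholder; only the surrounding logic, taking the two most probable candidate carrying points per route and choosing the route with the highest score, follows the description.

```python
def route_score(p1, p2, d1, d2):
    """Placeholder combination: nearer, more probable points raise the score.

    This is NOT the patent's formula; it is an assumed stand-in.
    """
    return p1 / (1.0 + d1) + p2 / (1.0 + d2)


def recommend_route(routes):
    """routes: dict mapping route id -> list of (probability, distance_to_driver).

    Returns the id of the best-scoring route, or None if no route has
    at least two candidate carrying points.
    """
    best_id, best_score = None, float("-inf")
    for route_id, points in routes.items():
        top = sorted(points, key=lambda t: t[0], reverse=True)[:2]
        if len(top) < 2:
            continue  # the description assumes two candidate points per route
        (p1, d1), (p2, d2) = top
        s = route_score(p1, p2, d1, d2)
        if s > best_score:
            best_id, best_score = route_id, s
    return best_id
```

Swapping in the actual expression only requires replacing the body of `route_score`; the selection loop stays the same.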
In addition, the embodiment also provides a layered taxi passenger recommendation system, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the layered taxi passenger recommendation method.
In addition, the embodiment also provides a computer readable storage medium, and a computer program programmed or configured to execute the layered taxi passenger recommendation method is stored in the computer readable storage medium.
Embodiment two:
This embodiment is basically the same as the first embodiment, and the main difference is as follows: referring to fig. 5, step 5) is not included after step 4) in this embodiment, that is, only the first-level and second-level recommendation results are output. Although the method then cannot give recommendation results at further levels, the first-level and second-level results still provide the taxi driver with a basis for passenger carrying recommendation, which improves passenger carrying recommendation performance to some extent and helps improve the driver's passenger carrying efficiency and income.
In addition, the embodiment also provides a layered taxi passenger recommendation system, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the layered taxi passenger recommendation method.
In addition, the embodiment also provides a computer readable storage medium, and a computer program programmed or configured to execute the layered taxi passenger recommendation method is stored in the computer readable storage medium.
Embodiment three:
This embodiment is basically the same as the first embodiment, and the main difference is as follows: referring to fig. 6, step 6) is not included after step 5) in this embodiment, that is, only the first-level, second-level and third-level recommendation results are output. Although the method then cannot give recommendation results at further levels, the first-level, second-level and third-level results still provide the taxi driver with a basis for passenger carrying recommendation, which improves passenger carrying recommendation performance to some extent and helps improve the driver's passenger carrying efficiency and income.
In addition, the embodiment also provides a layered taxi passenger recommendation system, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the layered taxi passenger recommendation method.
In addition, the embodiment also provides a computer readable storage medium, and a computer program programmed or configured to execute the layered taxi passenger recommendation method is stored in the computer readable storage medium.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application; the instructions, when executed by a processor of a computer or other programmable data processing apparatus, produce means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention may occur to one skilled in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.

Claims (10)

1. A hierarchical taxi passenger recommendation method is characterized by comprising the following steps:
1) Dividing an urban area into grid areas, and mining historical passenger carrying points and POIs from the original GPS tracks of taxis;
2) Mapping the historical passenger carrying points and POIs into the divided grid areas, extracting a space-time feature vector X from the grid areas and constructing a space-time context matrix;
3) Inputting the space-time context matrix into the extremely deep factorization machine model xDeepFM for training, obtaining driver-time period-grid area passenger carrying probability data through the trained xDeepFM model, then combining the driver's current space-time information to obtain the recommended grid area that has the highest passenger carrying probability for the driver and is relatively close to the driver, and outputting the recommended grid area as the first-level recommendation result;
4) Carrying out space-time analysis on the historical carrying points in the recommended grid area to obtain recommended candidate carrying points representing the passenger gathering places, and outputting the recommended candidate carrying points as the second-level recommendation result.
2. The hierarchical taxi passenger recommendation method according to claim 1, wherein the space-time feature vector X extracted from the grid area in step 2) includes: a driver feature D, which is used for distinguishing drivers and is one-hot encoded; a time feature T, which is the time period of the day to which an order belongs and is one-hot encoded; a grid feature G, which is used for distinguishing grid areas and is one-hot encoded; and grid attribute features A, which include the number of historical passenger carrying points, the number of POIs, the average passenger carrying duration, the average passenger carrying distance, the proportion of each POI type and the geometric center position of the grid, and are normalized; the constructed space-time context matrix is composed of the space-time feature vector X and a target variable Y, wherein the target variable Y is the number of passengers carried by the driver in the grid during the time period.
3. The layered taxi passenger recommendation method of claim 2, wherein step 3) includes:
3.1 Inputting the space-time context matrix into the extremely deep factorization machine model xDeepFM for training to obtain a trained xDeepFM model, and generating driver-time period-grid region passenger carrying probability data with the trained model, wherein the driver-time period-grid region passenger carrying probability data describes the passenger carrying probability of a driver for each grid region in each time period;
3.2 Combining the driver's current space-time information to determine 9 grid areas, namely the grid area where the driver is currently located and its 8 adjacent grid areas, together with the driver's current time period;
3.3 According to the 9 grid areas and the current time period of the driver, matching with the passenger carrying probability data of the driver-time period-grid areas to obtain the passenger carrying probability of the 9 grid areas, taking the grid area with the highest passenger carrying probability of the 9 grid areas as the finally obtained recommended grid area, and outputting the recommended grid area as the recommended result of the first level.
4. The layered taxi passenger recommendation method of claim 3, wherein step 4) comprises:
4.1 Firstly, dividing a day into different time periods, and assigning the historical carrying points in the recommended grid area to the corresponding time periods according to their time attributes; then carrying out spatial clustering analysis on the historical carrying points of each time period according to their geographic positions to obtain a first group of candidate carrying points TS[id, lng, lat, ts, 1], wherein the central position of a cluster is taken as the geographic coordinates of the candidate carrying point, id is the number of the candidate carrying point, (lng, lat) is its geographic coordinates, and ts is the time period to which the candidate carrying point belongs;
4.2 Carrying out spatial clustering analysis on all historical carrying points in the recommended grid area according to their geographic positions (spatial attributes); for each resulting cluster, if the number of historical carrying points falling within a certain time period is larger than a set threshold value, assigning the cluster to that time period, so as to obtain a second group of candidate carrying points ST[id, lng, lat, ts, 0], wherein the central position of the cluster is taken as the geographic coordinates of the candidate carrying point, id is the number of the candidate carrying point, (lng, lat) is its geographic coordinates, and ts is the time period in which the number of historical carrying points in the cluster exceeds the threshold value after spatial clustering;
4.3 Merging each pair of candidate carrying points, one from the first group TS[id, lng, lat, ts, 1] and one from the second group ST[id, lng, lat, ts, 0], whose distance is smaller than a preset threshold value into a new candidate carrying point, thereby obtaining the recommended candidate carrying points that finally represent the passenger gathering places, and outputting them as the second-level recommendation result.
5. The layered taxi passenger recommendation method of claim 4, further comprising step 5) after step 4), step 5) comprising: calculating the passenger carrying probability of each recommended candidate carrying point, taking the recommended candidate carrying point with the highest passenger carrying probability as the recommended carrying point, and outputting it as the third-level recommendation result.
6. The hierarchical taxi passenger recommendation method according to claim 5, wherein the function expression for calculating the passenger carrying probability of a recommended candidate carrying point is:
In the above formula, P(i) is the passenger carrying probability of the i-th recommended candidate carrying point, Num(i) is the number of historical carrying points in the cluster where the i-th candidate carrying point is located, and Area(i) is the area of that cluster; after a new candidate carrying point is obtained in step 4.3), the area of the cluster where it is located is calculated from the historical carrying points of the cluster areas represented by the duplicate candidate carrying points.
7. The layered taxi passenger recommendation method of claim 6, further comprising step 6) after step 5), step 6) comprising: calculating the passenger carrying probability of each route segment in the recommended grid area, taking the route with the highest passenger carrying probability as the recommended passenger carrying route, and outputting it as the fourth-level recommendation result, i.e. the optimal passenger carrying route.
8. The hierarchical taxi passenger recommendation method according to claim 7, wherein the function expression for calculating the passenger carrying probability of each route segment in the recommended grid area is:
In the above formula, Pr(j) is the passenger carrying probability of the j-th route in the recommended grid area, P(1) and P(2) are the passenger carrying probabilities of the two candidate carrying points with the highest passenger carrying probability on the j-th route in the recommended grid area, and D1 and D2 are the distances between those two candidate carrying points and the driver.
9. A layered taxi passenger recommendation system comprising a microprocessor and a memory connected to each other, characterized in that the microprocessor is programmed or configured to perform the steps of the layered taxi passenger recommendation method of any one of claims 1 to 8.
10. A computer readable storage medium having stored therein a computer program programmed or configured to perform the layered taxi passenger recommendation method of any one of claims 1 to 8.
CN202111101052.4A 2021-09-18 2021-09-18 Layered taxi passenger carrying recommendation method and system Active CN113868553B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111101052.4A CN113868553B (en) 2021-09-18 2021-09-18 Layered taxi passenger carrying recommendation method and system

Publications (2)

Publication Number Publication Date
CN113868553A CN113868553A (en) 2021-12-31
CN113868553B true CN113868553B (en) 2024-06-14

Family

ID=78992846

Country Status (1)

Country Link
CN (1) CN113868553B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114692015A (en) * 2022-03-10 2022-07-01 Beijing Institute of Technology Riding point recommendation method based on density clustering

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045858A (en) * 2015-07-10 2015-11-11 Hunan University of Science and Technology Voting based taxi passenger-carrying point recommendation method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112868036B (en) * 2018-11-06 2023-12-05 北京嘀嘀无限科技发展有限公司 System and method for location recommendation
CN110264706A (en) * 2019-04-07 2019-09-20 武汉理工大学 A kind of unloaded taxi auxiliary system excavated based on big data
KR102338099B1 (en) * 2019-12-30 2021-12-09 연세대학교 산학협력단 Taxi Demand Estimation Method Using Space Partitioning Technique

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
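Grid-based route recommendation for vacant taxis seeking passengers; Gao Zhan; Yu Chen; Xiang Zhengtao; Chen Yufeng; Computer Applications and Software (计算机应用与软件); 20190512(05); full text *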

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant