WO2017063356A1 - Designated-driving order prediction method and designated-driving capacity scheduling method - Google Patents

Designated-driving order prediction method and designated-driving capacity scheduling method

Info

Publication number
WO2017063356A1
Authority
WO
WIPO (PCT)
Prior art keywords
order
driving
time period
period
class
Prior art date
Application number
PCT/CN2016/080350
Other languages
English (en)
French (fr)
Inventor
张磊
钟小武
Original Assignee
深圳市天行家科技有限公司
Application filed by 深圳市天行家科技有限公司
Published as WO2017063356A1 (zh)


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/02 Reservations, e.g. for tickets, services or events
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Definitions

  • The invention relates to the technical field of intelligent devices for designated driving, and in particular to a designated-driving order prediction method based on data mining technology and a designated-driving capacity scheduling method.
  • Airport designated-driving services came into being.
  • A car owner who drives to the airport can hire a designated driver and pay a small fee to save the valuable time otherwise lost to parking, and the vehicle is well looked after in the meantime.
  • The initial airport designated-driving service worked roughly as follows: the car owner phoned the designated-driving service desk, the service desk broadcast the request to the designated drivers, and a driver who accepted the order went to drive for the owner.
  • On the one hand, the response of this approach is not timely; on the other hand, it may lead to several designated drivers accepting the same order or even competing for orders.
  • Most designated-driving software targets the whole urban population; the market lacks dedicated designated-driving services for specific areas (such as airports, high-speed rail stations, and docks and ferry terminals) and specific directions.
  • The main purpose of the present invention is to propose a designated-driving order prediction method that, based on data mining technology, analyzes historical orders along multiple dimensions and predicts orders more accurately, so as to solve the technical problem in the prior art that designated drivers are allocated unreasonably because orders are poorly estimated and analyzed.
  • A designated-driving order prediction method for predetermined places, comprising the following steps:
  • S3. For each area class, perform the following operations: divide a day evenly into a plurality of basic time periods; obtain from the order database the order quantity of the predetermined places in the same area class in each basic time period of each day in the historical period; and cluster the basic time periods according to the order quantity in each of them, so that the basic time periods are clustered into different order prediction reference periods;
  • S5. Receive an order prediction request; determine which area class the request comes from and which order prediction reference period the requested time belongs to; select the order prediction model of the corresponding reference period under the corresponding area class; and obtain the change factor in the request so that the corresponding model can predict the order quantity.
  • Using the above method to predict designated-driving orders for specific places (i.e., the predetermined places) such as airports, docks, ferry terminals, and high-speed rail stations has at least the following advantages:
  • The present invention classifies a large number of predetermined places by area clustering.
  • The data of predetermined places belonging to the same area class can share one algorithm flow, which reduces the number of algorithms executed in parallel; more importantly, grouping the predetermined places greatly increases the amount of data in each algorithm flow, thereby increasing the accuracy of the prediction results.
  • The order quantity depends on many independent factors, including but not limited to weather, and the relationship between the order quantity and these factors is expected to be a highly complex nonlinear one. A BP neural network is therefore used to fit the order prediction model nonlinearly, which yields a more reasonable model and more accurate predictions of designated-driving orders.
  • The data preprocessing in step S1 includes:
  • Extracting key information from the designated-driving order data, the key information including at least the daily reserved order quantity, the agreed execution time, the actual order execution time, the cancelled order quantity, and the reason for cancellation in the historical period; and calculating the customer waiting time of each successfully executed order.
  • Step S2 specifically includes:
  • S21. Based on the order database of each predetermined place, describe the change of the order quantity in the historical period with a three-direction chain code, so as to establish a change description sequence for each predetermined place;
  • Step S22 specifically includes:
  • Computing the chain code edit distance edit(i,j) between the i-th chain code string1(i) of change description sequence string1 and the j-th chain code string2(j) of change description sequence string2, where 0≤i≤L1, 0≤j≤L2, and L1 and L2 are the total lengths of string1 and string2 respectively;
  • The formula yields the complete L1×L2 matrix D, and the element D(L1,L2) of matrix D is the edit distance editAB between the two predetermined places A and B;
  • Step S23 specifically includes: clustering the edit distances obtained in step S22 with the iterative self-organizing data analysis algorithm, so that the E predetermined places are divided into different area classes according to order change similarity.
  • The three-direction chain code consists of 0, 1, and 2. Chain code 2 denotes "rise": the order quantity increased from the previous day by more than a first threshold. Chain code 0 denotes "fall": the order quantity decreased from the previous day by more than the first threshold. Chain code 1 denotes "unchanged": the order quantity is the same as the previous day, or it increased or decreased by less than the first threshold.
  • The duration of the basic time period in step S3 is not less than the customer waiting time.
  • Performing the time period clustering in step S3 specifically includes:
  • S33. For each area class, use the nearest-neighbor clustering method to cluster, based on Euclidean distance, the y two-dimensional vectors obtained by standardization in step S32, obtaining m vector sample classes based on order quantity similarity;
  • S35. After step S34 has been performed for every basic time period, the basic time periods in each vector sample class are consecutive in time and no basic time period appears in more than one vector sample class; the basic time periods of each of the m vector sample classes are then merged, forming m order prediction reference periods.
  • Extracting the order data of an order prediction reference period in step S4 includes: for each predetermined place in an area class, extracting the order quantity in that reference period on each day of the historical period, together with the corresponding date; the change factor includes at least the weather conditions during that reference period on each day.
  • The nonlinear fitting with a BP neural network in step S4 specifically includes:
  • The order data and the change factor of one order prediction reference period in one area class are fed into the BP neural network for training, which yields the order prediction model of that reference period in that area class.
  • Before step S1, the method further includes step S0: dividing designated-driving orders into different designated-driving types according to their routes, and performing steps S1 to S4 for the orders of each type;
  • When an order prediction request is received in step S5, it is also necessary to determine which designated-driving type the order in the request belongs to, so as to select the order prediction model of the corresponding reference period in the corresponding area class under the corresponding type.
  • The designated-driving order prediction method provided by the present invention uses data mining to analyze historical order data in depth for certain predetermined places and predicts orders effectively and reasonably, so that designated drivers can be dispatched reasonably and their utilization effectively improved.
  • The present invention also provides a designated-driving capacity scheduling method, including: predicting orders with the foregoing order prediction method; and generating a designated-driver scheduling scheme from the prediction result, in which the number of designated drivers is a predetermined multiple of the predicted order quantity, the predetermined multiple being greater than 1.
  • The capacity scheduling method is applied to the predetermined places: drivers are dispatched according to the order quantity predicted by the above method, so as to improve execution efficiency and customer satisfaction.
  • A specific embodiment of the present invention provides a designated-driving capacity scheduling method based on data mining technology for certain specific places (for example, but not limited to, airports, high-speed rail stations, ferry terminals, and docks). It predicts the order quantity of these places in any time period and produces a reasonable designated-driver scheduling scheme from the prediction, so as to serve car owners who need a designated driver efficiently while keeping the utilization of designated drivers (i.e., the probability that a designated driver is carrying out a designated-driving task) as high as possible.
  • The capacity scheduling method mainly includes two major steps: designated-driving order prediction and designated-driver scheduling.
  • The following describes in detail how both steps are performed, taking an airport as the predetermined place.
  • A designated-driving order prediction method includes the following steps S1 to S5:
  • S3. For each area class, perform the following operations: divide a day evenly into a plurality of basic time periods; obtain from the order database the order quantity of the predetermined places in the same area class in each basic time period of each day in the historical period; and cluster the basic time periods according to the order quantity in each of them, so that the basic time periods are clustered into different order prediction reference periods;
  • S5. Receive an order prediction request; determine which area class the request comes from and which order prediction reference period the requested time belongs to; select the order prediction model of the corresponding reference period under the corresponding area class; and obtain the change factor in the request so that the corresponding model can predict the order quantity.
  • In one embodiment, the predetermined place is an airport.
  • The plurality of predetermined places described in step S1 may include, for example, Shenzhen Airport, Guangzhou Airport, Beijing Airport, and Hong Kong Airport.
  • Step S1 specifically includes: extracting the historical order data of the system's early operation period from the existing airport designated-driving reservation system, for example the order data of airport A1, airport A2, airport A3, ..., airport A10 (the number of airports here is only an example and does not limit the present invention; any airport using the airport designated-driving reservation system qualifies) within the 300 days before the current day; and then extracting key information from the order data, the key information including at least the daily reserved order quantity YYDDL, the agreed execution time YDZXSJ, the actual order execution time SJZX, the cancelled order quantity QXL, and the reason for cancellation QXYY.
  • Each airport thus produces an order database as shown in Table 1 below:
  • "Day 1" denotes the earliest of the 300 days, and so on; "Day 300" is the day before the current day.
  • The area clustering in step S2 specifically includes:
  • For each airport's order database, the order quantity change between adjacent days within the 300 days is described with a three-direction chain code to establish each airport's order change description sequence.
  • For example, the order quantities of airport A1 from day 1 to day 300 form the array {50, 70, 55, 60, ..., 280, 100}; the array has 300 elements, and the first element, 50, is the order quantity of airport A1 on day 1.
  • The three-direction chain code consists of 0, 1, and 2. Let Δd be the order quantity of a day minus that of the previous day. When Δd is greater than a first threshold, chain code "2" denotes a rise; when Δd is less than a second threshold, chain code "0" denotes a fall; when Δd lies between the second threshold and the first threshold, chain code "1" denotes an unchanged order quantity. The first threshold is a positive number, for example 10 or 20, and the second threshold is a negative number, for example -10 or -20; both are defined as the situation requires.
  • For example, if the first and second thresholds are 10 and -10 respectively, then for the order quantity array {50, 70, 55, 60, ..., 280, 100}, the change between day 1 and day 2 is represented by chain code 2, the change between day 2 and day 3 by chain code 0, and the change between day 3 and day 4 by chain code 1.
  • Computed in the same way, the change in airport A1's order quantity over the 300 days can be represented by a three-direction chain code string of length 299 (i.e., an order change description sequence).
  • The order changes of airports A2 to A10 over the 300 days are likewise represented by order change description sequences based on the three-direction chain code.
  • This yields 10 order change description sequences of length 299, one for each of airports A1 to A10.
  • Take the order change description sequences of airports A1 and A2 as string1 and string2, and compute the chain code edit distance edit(i,j) between the i-th chain code string1(i) of string1 and the j-th chain code string2(j) of string2, where i and j range from 0 to the sequence length 299;
  • The complete matrix D is calculated by the formula.
  • The 45 edit distances obtained above are clustered with the iterative self-organizing data analysis algorithm ISODATA, thereby clustering the 10 airports.
  • Other clustering methods can also be used here, but ISODATA selects the number of clusters adaptively, which makes the final clustering more reasonable and compact. Since ISODATA belongs to the prior art, the specific clustering process is not described here.
  • The 10 airports A1 to A10 are divided into different area classes according to order change similarity. Suppose area clustering divides the 10 airports into three area classes: B1 (A2, A3, A6), B2 (A1, A8, A9, A10), and B3 (A4, A5, A7). Subsequent data processing for the 10 airports is then carried out per area class: area classes B1, B2, and B3 are processed in parallel with the same algorithm flow, and the data of the airports within one area class are pooled into a single algorithm flow instead of running a separate algorithm for each airport.
  • The time period clustering described in step S3 is performed for each area class. Area class B1 (airport A2, airport A3, airport A6) is taken as an example to illustrate it:
  • Step 1. Divide a day into a plurality of basic time periods; the duration of a basic time period should not be less than the customer waiting time.
  • For example, the day is divided into 24 basic time periods 0, 1, 2, ..., 23, where 0 denotes the period from 0:00 to 1:00, 1 the period from 1:00 to 2:00, and so on;
  • Step 2. From the order database, obtain the sum of the order quantities of airports A2, A3, and A6 in each basic time period of each day within the 300 days, and build two-dimensional vectors whose dimensions are the order quantity and the corresponding basic time period.
  • For example, vector X1(300, 0) means that the total order quantity of the three airports A2, A3, and A6 between 0:00 and 1:00 on day 1 (the earliest of the 300 days) is 300; X2(200, 1) means that their total between 1:00 and 2:00 on day 1 is 200; and X25(200, 0) means that their total between 0:00 and 1:00 on day 2 is 200.
  • Step 3. Standardize the data of each dimension of the above y two-dimensional vectors to unify the dimensions and eliminate the large errors caused by differing scales.
  • In the standardization formula, xmin and xmax are the minimum and maximum values of the same dimension across the y two-dimensional vectors; this yields y standardized two-dimensional vectors.
  • A vector that falls within the hypersphere centered at Z1 with radius V belongs to the same class as the vector at Z1. The distance d13 of the next vector from Z1 is then compared with V: if d13 > V, a new cluster center Z2 is created, and the following vector's Euclidean distances from cluster centers Z1 and Z2 are compared in turn;
  • The vectors are thus clustered into m vector sample classes C1, C2, ..., Cm based on order quantity similarity; the number of vectors is not necessarily the same in each vector sample class.
  • Step 5. For a basic time period, count how much order quantity it has in each of the m vector sample classes, and then assign the basic time period to the vector sample class in which it has the largest order quantity.
  • Because 300 days are selected, each basic time period has 300 vectors, which may be scattered among several vector sample classes, so it is not immediately clear which class a given basic time period belongs to. The maximum-membership principle is therefore applied: for each basic time period, count how much order quantity it has in each of the m vector sample classes.
  • For example, the 300 vectors belonging to basic time period 0 are scattered between vector sample classes C1 and C2; the total order quantity of basic time period 0 is 200 in C1 and 30 in C2, so basic time period 0 is assigned to C1 rather than C2.
  • Every basic time period is classified in the same way. In the resulting m vector sample classes, no basic time period is repeated and the basic time periods within each class are consecutive; merging the consecutive basic time periods of each class gives m order prediction reference periods.
  • For example, the reference periods 0-2, 3-5, and 6-23 mean that, for area class B1, each of the three airports A2, A3, and A6 uses one order prediction model in period 0-2, another in period 3-5, and yet another in period 6-23.
  • Order prediction model: for example, to obtain the order prediction model of reference period 0-2 in area class B1, first extract, for each airport in area class B1, the order quantity between 0:00 and 2:00 on each day within the 300 days, the date of each order, and the weather conditions between 0:00 and 2:00; the extracted data are then fed into the BP neural network for training (nonlinear fitting), which yields the order prediction model of area class B1 for reference period 0-2.
  • The layers of the BP neural network can be determined as follows: one layer is defined for the distinction between area classes, with as many neurons as there are area classes; another layer is defined for the different order prediction reference periods within each area class, with as many neurons as there are reference periods; and the input layer, which in the above example takes the order quantity (the order quantity of one area class in one order prediction reference period), the date, and the weather, has 3 neurons.
  • With the BP neural network method, an order prediction model can be obtained for each order prediction reference period of each area class. On receiving an order prediction request, determine which area class the request comes from and which reference period the requested time belongs to, select the order prediction model of that reference period under that area class, and obtain the change factor in the request, such as the weather; the model can then be run to predict the order quantity.
  • Designated-driving orders can also be classified by type first; the order data acquired in step S1 must then be data of orders of the same type. This increases the number of order prediction models: each order prediction reference period of each area class under each order type corresponds to a different order prediction model.
  • A designated-driving capacity scheduling method is also provided.
  • With more accurate order prediction, drivers can be deployed rationally. For example, if the prediction received by an airport shows that the order quantity for the parking lot-terminal route in a certain time period is 20, the system assigns 25 designated drivers to wait in the parking lot during that period. The number of designated drivers exceeds the order quantity so that car owners are not left without a driver, which would harm the customer experience.
  • In addition, system rules can be set to prevent one customer from placing duplicate orders, drivers from accepting the same order more than once, and customers from waiting too long.
  • A scoring mechanism is introduced: customers can score drivers, and drivers who are inactive, have a bad attitude, or are slow to deliver the car are dealt with. The details are as follows:
  • The server responds promptly, deletes the order from the order notice to prevent multiple drivers from accepting it, and records the driver who accepted the order.
  • Drivers can be evaluated and scored. Afterwards, poorly performing drivers are retrained, and drivers with many complaints are dealt with seriously, which reduces excessive customer waiting.
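The rule that the server deletes an order from the order notice once a driver accepts it, and records who accepted it, can be sketched as a first-claim-wins registry. This is only an illustration under stated assumptions: the class `OrderBoard`, its method names, and the use of an in-process lock are hypothetical and are not described in the source, which does not say how the server is implemented.

```python
import threading
from typing import Dict, Optional, Set


class OrderBoard:
    """Hypothetical in-memory order notice: each order can be claimed once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._open: Set[str] = set()          # orders still shown in the notice
        self._taken_by: Dict[str, str] = {}   # order id -> accepting driver

    def publish(self, order_id: str) -> None:
        """Show an order in the notice unless it has already been accepted."""
        with self._lock:
            if order_id not in self._taken_by:
                self._open.add(order_id)

    def claim(self, order_id: str, driver_id: str) -> bool:
        """Return True only for the first driver to claim an open order."""
        with self._lock:
            if order_id not in self._open:
                return False
            self._open.remove(order_id)            # delete from the notice
            self._taken_by[order_id] = driver_id   # record the accepting driver
            return True

    def driver_for(self, order_id: str) -> Optional[str]:
        """Return the driver recorded for an order, or None if unclaimed."""
        return self._taken_by.get(order_id)
```

Holding the lock across the check and the removal is what keeps two drivers from both succeeding on the same order; a real deployment would need the equivalent guarantee from its database or message queue.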

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

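The present invention discloses a designated-driving order prediction method and a designated-driving capacity scheduling method that can be used at airports. The order prediction method includes: establishing a historical order database for each airport; clustering the airports into area classes according to order change similarity; for each area class, clustering a plurality of basic time periods according to the order quantity in each basic time period, so that different order prediction reference periods are formed under each area class; for each order prediction reference period of each area class, extracting the corresponding order quantity and change factors and performing nonlinear fitting with a BP neural network, so that each reference period of each area class obtains its own order prediction model; and, on receiving an order prediction request, selecting the corresponding order prediction model according to the area class and reference period of the request and predicting the order quantity in combination with the change factors. The capacity scheduling method generates a reasonable driver allocation scheme from the above order prediction result.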

Description

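Designated-driving order prediction method and designated-driving capacity scheduling method
Technical Field
The present invention relates to the technical field of intelligent devices for designated driving, and in particular to a designated-driving order prediction method based on data mining technology and a designated-driving capacity scheduling method.
Background
Nowadays, more and more people drive to the airport to catch a flight. However, parking at the airport raises the following problems:
1. Wasted time: compared with people who take the subway, a taxi, or other transport, drivers must spend time parking. Because airports handle heavy passenger traffic and their parking lots are often full, parking can cost valuable time unexpectedly and even cause a missed flight. On returning, the owner may also have forgotten where the vehicle is parked, which causes considerable trouble;
2. High cost: because airport parking lots are exclusive and monopolistic, airport parking fees are high;
3. Unattended vehicles: while the vehicle is parked at the airport and the owner is traveling, nobody looks after it, and it also sits as an idle resource.
In response to these airport parking problems, airport designated-driving services came into being. A car owner who drives to the airport can hire a designated driver and pay a small fee to save the valuable time otherwise lost to parking, and the vehicle is well looked after in the meantime. The initial airport designated-driving service worked roughly as follows: the car owner phoned the designated-driving service desk, the service desk broadcast the request to the designated drivers, and a driver who accepted the order went to drive for the owner. On the one hand, the response of this approach is not timely; on the other hand, it may lead to several designated drivers accepting the same order or even competing for orders.
With the continuous development of technology and the spread of intelligent electronic devices such as smart navigation devices and smartphones, the designated-driving model has changed considerably: phone inquiries have given way to inquiries over the internet or mobile apps, and many designated-driving applications have appeared, such as e代驾, 滴滴代驾, and E都市. However, these applications have the following problems:
1) Most designated-driving software estimates and analyzes order quantities poorly, which leads to unreasonable driver allocation. In addition, the dynamic scheduling algorithms for designated drivers still have flaws: some drivers are often idle while others are assigned so many orders that car owners are kept waiting, resulting in a poor customer experience, falling order quantities, and poor use of driver resources.
2) Most designated-driving software targets the whole urban population; the market lacks dedicated designated-driving services for specific areas (such as airports, high-speed rail stations, and docks and ferry terminals) and specific directions.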
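Summary of the Invention
The main purpose of the present invention is to propose a designated-driving order prediction method that, based on data mining technology, analyzes historical orders along multiple dimensions and predicts orders more accurately, so as to solve the technical problem in the prior art that designated drivers are allocated unreasonably because orders are poorly estimated and analyzed.
The technical solution of the present invention to the above technical problem is as follows:
A designated-driving order prediction method for predetermined places, comprising the following steps:
S1. Obtain the designated-driving order data of a plurality of predetermined places in a historical period and preprocess the data, so as to establish an order database for each predetermined place;
S2. Based on the order database of each predetermined place, cluster the plurality of predetermined places by area according to order change similarity, so that the predetermined places fall into different area classes;
S3. For each area class, perform the following operations: divide a day evenly into a plurality of basic time periods; obtain from the order database the order quantity of the predetermined places in the same area class in each basic time period of each day in the historical period; and cluster the basic time periods according to the order quantity in each of them, so that the basic time periods are clustered into different order prediction reference periods;
S4. For each order prediction reference period in each area class, perform the following operations: extract the order data and the corresponding change factors of the reference period and feed them into a BP neural network for nonlinear fitting, so as to obtain the order prediction model of each reference period in each area class;
S5. Receive an order prediction request; determine which area class the request comes from and which order prediction reference period the requested time belongs to; select the order prediction model of the corresponding reference period under the corresponding area class; and obtain the change factor in the request so that the corresponding model can predict the order quantity.
Using the above method to predict designated-driving orders for specific places (i.e., the predetermined places) such as airports, docks, ferry terminals, and high-speed rail stations has at least the following advantages:
1) The predetermined places to which the present invention applies are of many types and are spread across the country or even the world, so they differ greatly. The present invention therefore groups the many predetermined places by area clustering. The data of predetermined places belonging to the same area class can share one algorithm flow, which reduces the number of algorithms executed in parallel; more importantly, grouping the predetermined places greatly increases the amount of data in each algorithm flow, thereby increasing the accuracy of the prediction results;
2) In the designated-driving order prediction of the present invention, the order quantity depends on many independent factors, including but not limited to weather, and the relationship between the order quantity and these factors is expected to be a highly complex nonlinear one. A BP neural network is therefore used to fit the order prediction model nonlinearly, which yields a more reasonable model and more accurate predictions of designated-driving orders.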
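Further, the data preprocessing in step S1 includes:
Extracting key information from the designated-driving order data, the key information including at least the daily reserved order quantity, the agreed execution time, the actual order execution time, the cancelled order quantity, and the reason for cancellation in the historical period; and calculating the customer waiting time of each successfully executed order.
Further, step S2 specifically includes:
S21. Based on the order database of each predetermined place, describe the change of the order quantity in the historical period with a three-direction chain code, so as to establish a change description sequence for each predetermined place;
S22. For the plurality of predetermined places, compute the pairwise edit distances between their change description sequences;
S23. Judge the order change similarity from the edit distances, so as to divide the plurality of predetermined places into area classes.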
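Further, step S22 specifically includes:
Taking the change description sequences string1 and string2 of the two predetermined places A and B to be compared, and computing the chain code edit distance edit(i,j) between the i-th chain code string1(i) of string1 and the j-th chain code string2(j) of string2, where 0≤i≤L1, 0≤j≤L2, and L1 and L2 are the total lengths of string1 and string2 respectively;
Initializing an L1×L2 matrix D and filling it by computing the chain code edit distance edit(i,j) with the following formula:
Figure PCTCN2016080350-appb-000001
The above formula yields the complete L1×L2 matrix D, and the element D(L1,L2) of matrix D is the edit distance editAB between the two predetermined places A and B.
Following the above method, the edit distance between every two predetermined places is computed, giving a total of E(E-1)/2 edit distances, where E is the total number of predetermined places;
Step S23 specifically includes: clustering the E(E-1)/2 edit distances obtained in step S22 with the iterative self-organizing data analysis algorithm, so that the E predetermined places are divided into different area classes according to order change similarity.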
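The pairwise comparison of step S22 can be sketched in Python. The formula image for edit(i,j) is not reproduced in this text, so the sketch assumes the standard Levenshtein recurrence with unit insertion, deletion, and substitution costs; the function names `edit_distance` and `pairwise_distances` are hypothetical. The matrix here has (L1+1) rows and (L2+1) columns so that index 0 stands for the empty prefix.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def edit_distance(s1: Sequence[int], s2: Sequence[int]) -> int:
    """Levenshtein distance between two chain code sequences (assumed recurrence)."""
    l1, l2 = len(s1), len(s2)
    d: List[List[int]] = [[0] * (l2 + 1) for _ in range(l1 + 1)]
    for i in range(l1 + 1):
        d[i][0] = i  # delete i codes to reach the empty sequence
    for j in range(l2 + 1):
        d[0][j] = j  # insert j codes starting from the empty sequence
    for i in range(1, l1 + 1):
        for j in range(1, l2 + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
    return d[l1][l2]  # corresponds to the element D(L1, L2)


def pairwise_distances(seqs: Dict[str, Sequence[int]]) -> Dict[Tuple[str, str], int]:
    """Edit distance for every unordered pair of places: E(E-1)/2 entries."""
    return {
        (a, b): edit_distance(seqs[a], seqs[b])
        for a, b in combinations(sorted(seqs), 2)
    }
```

For E = 10 places this produces 45 distances, which matches the count stated for the ten-airport example. The clustering of those distances (ISODATA in the source) is not sketched here.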
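Further, the three-direction chain code consists of 0, 1, and 2: chain code 2 denotes "rise" when the order quantity increased from the previous day by more than a first threshold; chain code 0 denotes "fall" when the order quantity decreased from the previous day by more than the first threshold; and chain code 1 denotes "unchanged" when the order quantity is the same as the previous day, or increased by less than the first threshold, or decreased by less than the first threshold.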
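Further, the duration of the basic time period in step S3 is not less than the customer waiting time;
Performing the time period clustering in step S3 specifically includes:
S31. For each area class, perform the following: count the order quantity of all predetermined places in the area class in each basic time period of each day, and build two-dimensional vectors X(r,h) whose dimensions are the basic time period and the order quantity in that basic time period; the area class then has y=F×H two-dimensional vectors X1, X2, X3, …, Xy, where H is the number of basic time periods and F is the number of days in the historical period;
S32. For each area class, perform the following: standardize the data of each dimension of every two-dimensional vector to unify the dimensions, using the standardization formula
Figure PCTCN2016080350-appb-000004
where xmin and xmax are the minimum and maximum values of the same dimension across the y two-dimensional vectors, thereby obtaining y standardized two-dimensional vectors
Figure PCTCN2016080350-appb-000005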
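Step S32 can be illustrated with a short Python sketch. The formula image is not reproduced in this text; because the step is defined in terms of the per-dimension minimum xmin and maximum xmax, the sketch assumes ordinary min-max scaling, x' = (x - xmin) / (xmax - xmin). The function name `min_max_normalize` and the handling of a constant dimension (mapped to 0.0) are assumptions made for illustration.

```python
from typing import List, Tuple

Vector = Tuple[float, float]  # (order quantity, basic time period)


def min_max_normalize(vectors: List[Vector]) -> List[Vector]:
    """Scale each dimension of the vectors to [0, 1] (assumed min-max formula)."""
    columns = list(zip(*vectors))  # one tuple of values per dimension
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant dimension has no spread; map it to 0.0 to avoid dividing by zero.
        scaled_columns.append([(x - lo) / span if span else 0.0 for x in col])
    return list(zip(*scaled_columns))


# The three example vectors from the B1 illustration: X1, X2, X25.
print(min_max_normalize([(300, 0), (200, 1), (200, 0)]))
```

Scaling both dimensions to the same range keeps the order quantity, which can be in the hundreds, from dominating the Euclidean distance used in the next step.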
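S33. For each area class, use the nearest-neighbor clustering method to cluster, based on Euclidean distance, the y standardized two-dimensional vectors obtained in step S32
Figure PCTCN2016080350-appb-000006
obtaining m vector sample classes based on order quantity similarity;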
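A minimal Python sketch of threshold-based nearest-neighbor clustering follows. It assumes the usual form of the method: the first vector becomes the first cluster center, each later vector joins its nearest center if the Euclidean distance is within a radius V, and otherwise becomes a new center. The source states the details of this step only partially, so the function name `nearest_neighbor_cluster`, the fixed (non-updated) centers, and the `<=` comparison are assumptions.

```python
import math
from typing import List, Tuple

Vector = Tuple[float, float]


def nearest_neighbor_cluster(
    vectors: List[Vector], radius: float
) -> Tuple[List[Vector], List[int]]:
    """Return (cluster centers, class label of each vector)."""
    centers: List[Vector] = []
    labels: List[int] = []
    for v in vectors:
        if centers:
            dists = [math.dist(v, c) for c in centers]
            nearest = min(range(len(centers)), key=dists.__getitem__)
            if dists[nearest] <= radius:
                labels.append(nearest)  # inside the hypersphere of that center
                continue
        centers.append(v)               # too far from every center: new class
        labels.append(len(centers) - 1)
    return centers, labels
```

The number of classes m is not chosen in advance; it falls out of the radius V, and the result depends on the order in which vectors are visited.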
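S34. For a basic time period, count how much order quantity it has in each of the m vector sample classes, and then assign the basic time period to the vector sample class in which it has the largest order quantity;
S35. After step S34 has been performed for every basic time period, the basic time periods in each vector sample class are consecutive in time and no basic time period appears in more than one vector sample class; the basic time periods of each of the m vector sample classes are then merged, forming m order prediction reference periods.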
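Steps S34 and S35 can be sketched in Python as a maximum-membership assignment followed by a merge of consecutive periods. The input layout (one `(basic_period, order_quantity, class_label)` triple per clustered vector) and the function names `assign_periods` and `merge_periods` are hypothetical choices for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def assign_periods(samples: List[Tuple[int, int, int]]) -> Dict[int, int]:
    """Map each basic period to the class holding most of its order quantity.

    samples: (basic_period, order_quantity, class_label), one per vector.
    """
    totals: Dict[int, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for period, quantity, label in samples:
        totals[period][label] += quantity
    return {p: max(by_class, key=by_class.get) for p, by_class in totals.items()}


def merge_periods(assignment: Dict[int, int]) -> List[Tuple[int, int, int]]:
    """Merge consecutive periods of the same class into (first, last, class)."""
    merged: List[Tuple[int, int, int]] = []
    for period in sorted(assignment):
        label = assignment[period]
        if merged and merged[-1][2] == label and merged[-1][1] == period - 1:
            merged[-1] = (merged[-1][0], period, label)
        else:
            merged.append((period, period, label))
    return merged
```

With 200 orders of basic period 0 in class 1 and 30 in class 2, period 0 goes to class 1, as in the example given for the B1 area class. The source states that the periods in each class end up consecutive; this sketch does not rely on that and would simply emit separate runs if they were not.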
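Further, extracting the order data of an order prediction reference period in step S4 includes: for each predetermined place in an area class, extracting the order quantity in that reference period on each day of the historical period, together with the corresponding date; the change factors include at least the weather conditions during that reference period on each day.
Further, the nonlinear fitting with a BP neural network in step S4 specifically includes:
Selecting the input-output formula of the neurons and the activation function;
Defining the number of layers of the BP neural network and the number of neurons in each layer;
Feeding the order data and the change factors of one order prediction reference period in one area class into the BP neural network for training, which yields the order prediction model of that reference period in that area class.
Further, before step S1 the method also includes step S0: dividing designated-driving orders into different designated-driving types according to their routes, and performing steps S1 to S4 for the orders of each type;
In addition, when an order prediction request is received in step S5, it is also necessary to determine which designated-driving type the order in the request belongs to, so as to select the order prediction model of the corresponding reference period in the corresponding area class under the corresponding type.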
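The model selection of step S5 amounts to a keyed lookup, sketched below in Python. Everything named here is illustrative: the key layout `(driving type, area class, reference period)`, the type label `"to_terminal"`, and the dummy models are assumptions, and the reference periods 0-2, 3-5, and 6-23 for area class B1 are taken from the example elsewhere in this document. For simplicity the sketch also assumes the reference periods depend only on the area class.

```python
from typing import Callable, Dict, List, Tuple

Period = Tuple[int, int]                   # inclusive (first, last) basic time period
ModelKey = Tuple[str, str, Period]         # (driving type, area class, reference period)
Model = Callable[[Dict[str, str]], float]  # stand-in for a trained BP network


def find_period(hour: int, periods: List[Period]) -> Period:
    """Return the reference period that contains the requested basic time period."""
    for start, end in periods:
        if start <= hour <= end:
            return (start, end)
    raise ValueError(f"hour {hour} is not covered by any reference period")


def select_model(
    models: Dict[ModelKey, Model],
    periods_by_class: Dict[str, List[Period]],
    driving_type: str,
    area_class: str,
    hour: int,
) -> Model:
    """Pick the model for the request's type, area class, and reference period."""
    period = find_period(hour, periods_by_class[area_class])
    return models[(driving_type, area_class, period)]


periods_by_class = {"B1": [(0, 2), (3, 5), (6, 23)]}
models: Dict[ModelKey, Model] = {
    ("to_terminal", "B1", (0, 2)): lambda factors: 0.0,   # dummy model, not a real prediction
    ("to_terminal", "B1", (3, 5)): lambda factors: 1.0,   # dummy model, not a real prediction
    ("to_terminal", "B1", (6, 23)): lambda factors: 2.0,  # dummy model, not a real prediction
}
model = select_model(models, periods_by_class, "to_terminal", "B1", hour=4)
print(model({"weather": "rain"}))
```

The change factors in the request (here only a weather label) are passed straight to the selected model, which is where a trained network would produce the predicted order quantity.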
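In summary, the designated-driving order prediction method provided by the present invention uses data mining to analyze historical order data in depth for certain predetermined places and predicts orders effectively and reasonably, so that designated drivers can be dispatched reasonably and their utilization effectively improved.
In addition, the present invention provides a designated-driving capacity scheduling method, including: predicting orders with the foregoing order prediction method; and generating a designated-driver scheduling scheme from the prediction result, in which the number of designated drivers is a predetermined multiple of the predicted order quantity, the predetermined multiple being greater than 1. The capacity scheduling method is applied to the predetermined places: drivers are dispatched according to the order quantity predicted by the above method, so as to improve execution efficiency and customer satisfaction.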
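The scheduling rule reduces to one multiplication, sketched here in Python. The function name `drivers_needed`, the default multiple of 1.25, and rounding up to a whole driver are assumptions; the source only requires the multiple to be greater than 1. The value 1.25 is chosen because it reproduces the example elsewhere in this document of 25 drivers for 20 predicted orders.

```python
import math


def drivers_needed(predicted_orders: float, multiple: float = 1.25) -> int:
    """Number of designated drivers to station for a predicted order quantity."""
    if multiple <= 1:
        raise ValueError("the predetermined multiple must be greater than 1")
    # Round up so that a fractional result never leaves an order uncovered.
    return math.ceil(predicted_orders * multiple)


print(drivers_needed(20))
```

A larger multiple lowers the risk that a car owner finds no driver but also lowers driver utilization, so in practice the value would be tuned per place and period.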
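Detailed Description
The invention is further described below with reference to preferred embodiments.
An embodiment of the invention provides a data-mining-based designated-driving capacity scheduling method for certain specific sites (for example airports, high-speed rail stations, ferry crossings and docks, without limitation). The order quantity of these sites in any time period is predicted, and a reasonable driver scheduling plan is derived from the prediction, so that vehicle owners who need a designated driver are served efficiently while the utilization of designated drivers (that is, the probability that a driver is carrying out a designated-driving task) is kept as high as possible.
The scheduling method consists of two main stages: designated-driving order prediction and designated-driver scheduling. Both are described in detail below, taking airports as the predetermined sites.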
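A designated-driving order prediction method comprises the following steps S1 to S5:
S1. Obtain the designated-driving order data of multiple predetermined sites over a historical period and preprocess it, to build an order database for each predetermined site;
S2. Based on the order database of each predetermined site, cluster the multiple predetermined sites by order-change similarity (region clustering), so that the sites fall into different region classes;
S3. For each region class, perform the following: divide a day evenly into multiple base time periods; obtain from the order databases the order quantity of the sites of the same region class in each base time period of each day of the historical period; and cluster the base time periods according to those order quantities (time-period clustering), so that the base time periods are grouped into different order-prediction reference time periods;
S4. For each order-prediction reference time period in each region class, perform the following: extract the order data of the reference time period and the corresponding variation factors, and feed them into a BP neural network for nonlinear fitting, to obtain an order prediction model for each reference time period in each region class;
S5. Receive an order prediction request, determine which region class the request comes from and which order-prediction reference time period the requested time belongs to, select the order prediction model of that reference time period in that region class, and obtain the variation factors in the request so that the selected model can predict the order quantity.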
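In a specific embodiment, taking airports as the predetermined sites, the multiple predetermined sites of step S1 may include, for example, Shenzhen Airport, Guangzhou Airport, Beijing Airport and Hong Kong Airport. Step S1 then specifically comprises: extracting, from an existing airport designated-driving reservation system, the historical order data of the system's early operation. For example, extract the order data of airports A1, A2, A3, …, A10 (the number of airports is only an example and does not limit the invention; any airport that uses the reservation system qualifies) for the 300 days preceding the current day (the 300-day historical period is likewise only an example and does not limit the invention). Key information is then extracted from the order data, including at least, for each of the 300 days, the reserved order quantity YYDDL, the agreed execution time YDZXSJ, the actual execution time SJZX, the cancelled order quantity QXL and the cancellation reason QXYY. In addition, the customer waiting time DDSJ is calculated for each successfully executed order as
$DDSJ = \mu \cdot \overline{\Delta t}$
where $\overline{\Delta t}$ is the average, over all orders, of the difference between the agreed execution time and the actual execution time, and μ is a factor that, on the customer-first principle, moderately inflates the actual waiting time; that is, μ>1 but should not be too large, and is best between 1 and 1.5. Each airport thus yields an order database as shown in Table 1:
           YYDDL   YDZXSJ   SJZX   QXL   QXYY   DDSJ
Day 1
Day 2
…
Day 300
Table 1
In Table 1, "Day 1" is the earliest of the 300 days, and so on; "Day 300" is the day before the current day.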
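The waiting-time calculation can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `customer_waiting_time`, the use of minutes as the unit, and taking the absolute value of each time difference are assumptions made here; the text only fixes that the mean time difference is multiplied by a factor μ between 1 and 1.5.

```python
from datetime import datetime
from statistics import mean


def customer_waiting_time(orders, mu=1.2):
    """Return DDSJ in minutes for a list of (agreed, actual) datetime pairs.

    mu is the inflation factor from the text (1 < mu <= 1.5 is suggested).
    Using the absolute difference is an assumption of this sketch.
    """
    if not 1 < mu <= 1.5:
        raise ValueError("mu is expected to lie in (1, 1.5]")
    delays = [abs((actual - agreed).total_seconds()) / 60.0
              for agreed, actual in orders]
    return mu * mean(delays)


sample_orders = [
    (datetime(2015, 10, 1, 8, 0), datetime(2015, 10, 1, 8, 10)),   # 10 min
    (datetime(2015, 10, 1, 9, 0), datetime(2015, 10, 1, 9, 20)),   # 20 min
]
ddsj = customer_waiting_time(sample_orders, mu=1.2)
```

Because the base time period must be at least as long as DDSJ, this value gives a lower bound for the period length chosen later in step S3.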
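Continuing the example, the region clustering of step S2 specifically comprises:
For each airport's order database, the change in order quantity between adjacent days over the 300 days is described with a three-direction chain code, to build the order change description sequence of each airport. For example, the order quantities of airport A1 from day 1 to day 300 form the array {50,70,55,60,…,280,100}, which has 300 elements; the first element, 50, is the order quantity of airport A1 on day 1. The three-direction chain code takes the values 0, 1 and 2. Let Δd be the order quantity of a day minus that of the previous day: when Δd is greater than a first threshold, chain code "2" denotes a rise; when Δd is less than a second threshold, chain code "0" denotes a fall; when Δd lies between the second threshold and the first threshold, chain code "1" denotes no change. The first threshold is positive (for example 10 or 20, chosen as appropriate) and the second threshold is negative (for example -10 or -20, chosen as appropriate). For example, with a first threshold of 10 and a second threshold of -10, for the array {50,70,55,60,…,280,100} above the change between day 1 and day 2 is coded 2 (Δd=20), the change between day 2 and day 3 is coded 0 (Δd=-15), and the change between day 3 and day 4 is coded 1 (Δd=5). Proceeding in the same way, the order-quantity changes of airport A1 over the 300 days are represented by a three-direction chain-code string of length 299 (the order change description sequence).
In the same way, the order changes of airports A2 to A10 over the 300 days are also represented by order change description sequences based on the three-direction chain code. This gives 10 order change description sequences of length 299, one for each of airports A1 to A10.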
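The chain-code encoding above can be sketched in a few lines. The function name `chain_code` and its parameter names are illustrative only; the thresholds default to the example values 10 and -10, and a difference exactly equal to a threshold is treated as "unchanged", which is an assumption because the text does not state the boundary case.

```python
def chain_code(daily_orders, first_threshold=10, second_threshold=-10):
    """Encode day-to-day order changes as a string of chain codes 0/1/2.

    2 = rise (diff > first_threshold), 0 = fall (diff < second_threshold),
    1 = unchanged (everything in between, boundaries included by assumption).
    """
    codes = []
    for previous, current in zip(daily_orders, daily_orders[1:]):
        diff = current - previous
        if diff > first_threshold:
            codes.append("2")
        elif diff < second_threshold:
            codes.append("0")
        else:
            codes.append("1")
    return "".join(codes)


# First four days of the example array for airport A1.
sequence_a1 = chain_code([50, 70, 55, 60])
```

A series of n daily totals yields n-1 codes, which is why 300 days give a sequence of length 299.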
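Next, compute the edit distance between every pair of the 10 airports, to judge the order-change similarity between two airports. The computation is illustrated for airports A1 and A2:
1) Select the order change description sequences string1 and string2 of airports A1 and A2, and compute the chain-code edit distance edit(i,j) between the i-th chain code string1(i) of string1 and the j-th chain code string2(j) of string2, where i and j range from 0 to the sequence length 299;
2) Initialize a 299×299 matrix D, compute the chain-code edit distances edit(i,j) with the following formula, and fill matrix D with them:
Figure PCTCN2016080350-appb-000009
The formula yields the complete matrix D, as follows:
  string1(1) string1(2) string1(3) … string1(299)
string2(1) edit(1,1) edit(2,1) edit(3,1) … edit(299,1)
string2(2) edit(1,2) edit(2,2) edit(3,2) … edit(299,2)
string2(3) edit(1,3) edit(2,3) edit(3,3) … edit(299,3)
… … … … … …
string2(299) edit(1,299) edit(2,299) edit(3,299) … edit(299,299)
The element D(299,299)=edit(299,299) of matrix D is the edit distance edit_A1A2 between airport A1 and airport A2.
Computing the edit distance between every pair of airports in this way gives $\binom{10}{2} = 45$ edit distances for the 10 airports A1 to A10.
The 45 edit distances are clustered with the Iterative Self-Organizing Data Analysis algorithm (ISODATA) to perform region clustering of the 10 airports. Other clustering methods may also be used; ISODATA has the advantage of choosing the number of clusters adaptively, which makes the final clustering more reasonable and compact. Since ISODATA is prior art, the clustering procedure is not described further here.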
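As an illustration of the pairwise edit-distance computation, the sketch below uses the standard Levenshtein recurrence with unit costs for insertion, deletion and substitution. The patent's own formula is given only as an image, so treating it as the standard recurrence is an assumption; the names `edit_distance` and `pairwise_edit_distances` are hypothetical. The matrix here has (L1+1)×(L2+1) entries so that the indices 0..L1 and 0..L2 are all available.

```python
from itertools import combinations


def edit_distance(string1, string2):
    """Levenshtein distance between two chain-code strings (unit costs)."""
    rows, cols = len(string1) + 1, len(string2) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i            # delete i codes
    for j in range(cols):
        d[0][j] = j            # insert j codes
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if string1[i - 1] == string2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
    return d[-1][-1]


def pairwise_edit_distances(sequences):
    """Map each unordered pair of site names to its edit distance."""
    return {(a, b): edit_distance(sequences[a], sequences[b])
            for a, b in combinations(sorted(sequences), 2)}


# Ten toy sequences stand in for airports A1..A10.
toy_sequences = {f"A{k}": "2" * k for k in range(1, 11)}
distances = pairwise_edit_distances(toy_sequences)
```

For E sites the dictionary holds E(E-1)/2 entries, 45 for the 10 airports in the example; these values would then be passed to whichever clustering routine is used for the region classes.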
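The 10 airports A1 to A10 are thus divided into different region classes by order-change similarity. Suppose the region clustering divides them into three region classes: B1 (A2, A3, A6), B2 (A1, A8, A9, A10) and B3 (A4, A5, A7). Subsequent data processing for the 10 airports is then carried out per region class: region classes B1, B2 and B3 are processed in parallel with the same algorithm flow, while the data of the airports within one region class is pooled into a single algorithm flow instead of running the algorithm separately for each airport.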
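The time-period clustering of step S3 is performed for each region class. Region class B1 (airports A2, A3 and A6) is used as an example:
Step 1. Divide a day evenly into multiple base time periods, whose duration should be not less than the customer waiting time. In this example a day is divided into 24 base time periods 0, 1, 2, …, 23, where 0 denotes the period from 0:00 to 1:00, 1 denotes the period from 1:00 to 2:00, and so on.
Step 2. From the order databases, obtain the sum of the order quantities of airports A2, A3 and A6 in each base time period of each of the 300 days. This gives two-dimensional vectors X(r,h) whose dimensions are the order quantity and the base time period, where h is one of the 24 base time periods and r is the sum of the three airports' order quantities in that base time period on a given day. There are y such vectors X1, X2, X3, …, Xy (here y=F×H=300×24=7200). For example, X1(300,0) means that on day 1 of the 300 days (the earliest date) the three airports A2, A3 and A6 had 300 orders in total between 0:00 and 1:00; X2(200,1) means they had 200 orders in total between 1:00 and 2:00 on day 1; X25(200,0) means they had 200 orders in total between 0:00 and 1:00 on day 2; and so on.
Step 3. Standardize each dimension of the y two-dimensional vectors to unify the scales and eliminate the large errors caused by differing scales. The standardization formula is
$\tilde{x} = \frac{x - x_{min}}{x_{max} - x_{min}}$
where xmin and xmax are the minimum and maximum of the same dimension over the y two-dimensional vectors, thereby obtaining y standardized two-dimensional vectors $\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_y$.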
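The per-dimension standardization of step 3 corresponds to ordinary min-max scaling, sketched below. The function name `min_max_normalize` is illustrative, and mapping a dimension whose values are all equal to 0.0 is an assumption added to avoid division by zero; the text does not cover that case.

```python
def min_max_normalize(vectors):
    """Scale every dimension of a list of equal-length tuples into [0, 1]."""
    columns = list(zip(*vectors))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    normalized = []
    for vector in vectors:
        normalized.append(tuple(
            (x - low) / (high - low) if high > low else 0.0
            for x, low, high in zip(vector, lows, highs)
        ))
    return normalized


# Vectors are (order quantity r, base time period h), as in X(r, h).
raw_vectors = [(300, 0), (200, 1), (100, 23)]
scaled_vectors = min_max_normalize(raw_vectors)
```

After scaling, order quantities in the hundreds and period indices between 0 and 23 contribute on the same scale to the Euclidean distance used in the next step.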
Step 4. Using the nearest-neighbor clustering method, cluster the y standardized two-dimensional vectors $\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_y$ obtained in step 3 on the basis of Euclidean distance, to obtain m vector sample classes based on order-quantity similarity. Specifically, first set a non-negative Euclidean distance threshold V, and randomly select one vector $\tilde{X}_u$ from the y vectors as cluster center Z1. Assume u=1, that is, $Z_1 = \tilde{X}_1$. Then compute the Euclidean distance d12 between vector $\tilde{X}_2$ and cluster center Z1.
If d12>V, create a new cluster center Z2 with $Z_2 = \tilde{X}_2$. Then compare the distances d13 and d23 from vector $\tilde{X}_3$ to cluster centers Z1 and Z2. If d13 and d23 are both greater than V, create another cluster center Z3 with $Z_3 = \tilde{X}_3$ and continue comparing. If d13 and d23 are both less than V and d13<d23<V, vector $\tilde{X}_3$ is closer to cluster center Z1, so $\tilde{X}_3$ belongs to the same class as $\tilde{X}_1$; if instead d23<d13<V, vector $\tilde{X}_3$ is closer to cluster center Z2, so $\tilde{X}_3$ belongs to the same class as $\tilde{X}_2$.
If d12<V, then $\tilde{X}_2$ lies inside the hyperspherical cluster centered at Z1 with radius V, that is, $\tilde{X}_1$ and $\tilde{X}_2$ belong to the same class. Then compare the distance d13 between $\tilde{X}_3$ and Z1; if d13>V, create a new cluster center Z2 with $Z_2 = \tilde{X}_3$, and then compare the Euclidean distances from $\tilde{X}_4$ to cluster centers Z1 and Z2.
Comparing and clustering in the same way, the y two-dimensional vectors $\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_y$ are finally clustered into m vector sample classes C1, C2, …, Cm based on order-quantity similarity; the classes do not necessarily contain the same number of vectors.
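The threshold-based nearest-neighbor procedure of step 4 can be sketched as a single pass over the vectors. This is an illustration under stated assumptions: the first vector in the list is used as the initial center instead of a random one (to keep the result reproducible), a distance exactly equal to V is treated as inside the cluster, and the name `threshold_cluster` is hypothetical.

```python
import math


def threshold_cluster(vectors, v):
    """Assign each vector to its nearest center, or open a new center.

    A vector becomes a new cluster center when its distance to every
    existing center exceeds the threshold v. Returns (centers, labels).
    """
    centers = []
    labels = []
    for x in vectors:
        nearest = None
        if centers:
            dists = [math.dist(x, z) for z in centers]
            nearest = min(range(len(centers)), key=dists.__getitem__)
        if nearest is None or dists[nearest] > v:
            centers.append(x)                 # new cluster center
            labels.append(len(centers) - 1)
        else:
            labels.append(nearest)            # joins the nearest center
    return centers, labels


points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
centers, labels = threshold_cluster(points, v=0.3)
```

The number of classes m is not chosen in advance; it follows from the threshold V and from the order in which the vectors are visited.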
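Step 5. Count how many orders a base time period has in each of the m vector sample classes, and assign the base time period to the vector sample class in which it has the most orders. In this example, since 300 days were selected, each base time period has 300 vectors, which may be scattered over several vector sample classes, so it is not immediately clear which class a base time period should finally belong to. The maximum membership principle is therefore applied: for each base time period, count its order quantity in each of the m vector sample classes. For example, if the 300 vectors of base time period 0 are scattered over classes C1 and C2, with a total order quantity of 200 in C1 and 30 in C2, then base time period 0 is assigned to C1 rather than C2. Every base time period is classified in the same way. In the resulting m vector sample classes no base time period is repeated and the base time periods within each class are contiguous; merging the contiguous base time periods of each class yields m order-prediction reference time periods. For example, with m=3, classification by the maximum membership principle gives {(0,100),(1,200),(2,300)}, {(3,10),(4,25),(5,50)} and {(6,500),(7,500),(8,600),…,(23,500)}, and hence three order-prediction reference time periods 0~2, 3~5 and 6~23. This means that, for region class B1, any of its three airports A2, A3 and A6 uses one order prediction model in period 0~2, another in period 3~5, and yet another in period 6~23.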
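The maximum-membership assignment and the merging of contiguous base time periods can be sketched as follows. The function names `assign_periods` and `merge_periods` and the input layout (period, orders, class label) are assumptions made for illustration. The sketch merges only runs of consecutive periods with the same label, so if a class turned out not to be contiguous it would simply yield more than one range.

```python
from collections import defaultdict


def assign_periods(samples):
    """samples: iterable of (period, orders, class_label).

    Returns {period: label of the class holding most of its orders}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for period, orders, label in samples:
        totals[period][label] += orders
    return {period: max(by_label, key=by_label.get)
            for period, by_label in totals.items()}


def merge_periods(assignment):
    """Merge consecutive periods with the same label into (start, end, label)."""
    ranges = []
    for period in sorted(assignment):
        label = assignment[period]
        if ranges and ranges[-1][2] == label and ranges[-1][1] == period - 1:
            ranges[-1] = (ranges[-1][0], period, label)
        else:
            ranges.append((period, period, label))
    return ranges


# Period 0 has 200 orders in C1 and 30 in C2, as in the example above.
samples = [(0, 120, "C1"), (0, 80, "C1"), (0, 30, "C2"),
           (1, 50, "C1"), (2, 10, "C2"), (3, 15, "C2")]
assignment = assign_periods(samples)
reference_periods = merge_periods(assignment)
```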
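The order prediction model differs between region classes and between order-prediction reference time periods. The following describes how the order prediction models are generated:
Extract the order data of a given order-prediction reference time period in a given region class, together with the corresponding variation factors, and feed them into a BP neural network for nonlinear fitting; this yields the order prediction model of that reference time period in that region class. For example, to obtain the order prediction model for reference time period 0~2 in region class B1, first extract the order quantity of each airport in B1 between 0 o'clock and 2 o'clock on each of the 300 days, together with the date of each order and the weather between 0 o'clock and 2 o'clock on that date, and feed the extracted data into the BP neural network for training (nonlinear fitting), to obtain the order prediction model of region class B1 for reference time period 0~2. Note that before the extracted data is fed into the BP neural network, the number of layers, the number of neurons per layer, the input-output relation of the neurons and the activation function must be designed for the data to be input. In a preferred embodiment, the input-output relation of a neuron is:
Figure PCTCN2016080350-appb-000035
yi=f(neti), where the activation function is the sigmoid function:
$f(x) = \frac{1}{1 + e^{-x}}$
x1=t; x2=w; x3=a. The layers of the BP neural network can be determined as follows: one layer is defined for the region classes, with as many neurons as there are region classes; another layer is defined for the order-prediction reference time periods within the region classes, with as many neurons as there are reference time periods; and the input layer, in the example above, takes the order quantity (of a given region and a given order-prediction reference time period), the date and the weather, so the input layer has 3 neurons.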
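The neuron relation can be illustrated with a single sigmoid unit on three inputs. The weighted-sum form net = Σ w_j·x_j + b used below is the common textbook form and is an assumption here, because the patent gives its own formula only as an image; the bias term, the function names and the sample numbers are all hypothetical.

```python
import math


def sigmoid(x):
    """S-shaped activation f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(weights, inputs, bias=0.0):
    """Compute y = f(net) with net = sum(w_j * x_j) + bias (assumed form)."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(net)


# Three inputs x1, x2, x3, already scaled to [0, 1]; values are made up.
example_output = neuron_output([0.4, -0.2, 0.1], [0.5, 1.0, 0.0], bias=0.0)
```

Training a BP network repeats this forward computation layer by layer and then propagates the prediction error backwards to adjust the weights; that part is omitted from the sketch.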
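With the BP neural network method described above, an order prediction model is obtained for each order-prediction reference time period of each region class. When an order prediction request arrives, determine which region class it comes from and which reference time period the requested time belongs to, select the order prediction model of that reference time period in that region class, obtain the variation factors in the request (for example the weather), and run the selected model to predict the order quantity.
In other embodiments, there may be more than one type of designated-driving order, with several order types distinguished by route, for example two: terminal → car park and car park → terminal. In that case the orders can first be classified, and the order data obtained in step S1 must belong to one order type. There are then more order prediction models: each order-prediction reference time period in each region class of each order type has its own model. When an order prediction request is received in step S5, the type of the order must also be determined (for example terminal → car park or car park → terminal).
A designated-driving capacity scheduling method is also provided. Building on the fairly accurate order prediction of the foregoing method, designated drivers can be allocated sensibly. For example, if an airport receives a prediction that the car park → terminal order quantity in a given time period is 20, the system assigns 25 designated drivers to wait at the car park during that period. The number of drivers exceeds the order quantity so that vehicle owners are not left without a driver, which would harm the customer experience.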
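Selecting a model for an incoming request amounts to a two-level lookup, sketched below with the example values from the text (region class B1 with airports A2, A3 and A6, and reference periods 0~2, 3~5 and 6~23). The tables `REGION_OF` and `REFERENCE_PERIODS` and the function `select_model_key` are hypothetical names; a real system would map the returned key to a trained model.

```python
# Site -> region class, taken from the example clustering result.
REGION_OF = {"A2": "B1", "A3": "B1", "A6": "B1"}

# Region class -> inclusive ranges of base time periods (hours).
REFERENCE_PERIODS = {"B1": [(0, 2), (3, 5), (6, 23)]}


def select_model_key(site, hour):
    """Return (region_class, start, end) identifying the model to use."""
    region = REGION_OF[site]
    for start, end in REFERENCE_PERIODS[region]:
        if start <= hour <= end:
            return (region, start, end)
    raise ValueError(f"hour {hour} is not covered for region {region}")


key = select_model_key("A3", 4)
```

When several order types exist, the order type would become a third component of the key.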
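The scheduling rule, drivers equal to a predetermined multiple (greater than 1) of the predicted orders, can be sketched as below. The multiple 1.25 is inferred from the 20 orders → 25 drivers example and is not stated in the text; rounding up to a whole driver and the name `drivers_needed` are likewise assumptions.

```python
import math


def drivers_needed(predicted_orders, multiple=1.25):
    """Number of drivers to station for a predicted order quantity."""
    if multiple <= 1:
        raise ValueError("the predetermined multiple must be greater than 1")
    if predicted_orders < 0:
        raise ValueError("predicted_orders must be non-negative")
    return math.ceil(predicted_orders * multiple)


drivers_for_20_orders = drivers_needed(20)
```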
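In some preferred embodiments, system rules can be set to prevent one person from placing duplicate orders, drivers from accepting the same order more than once, customers from waiting excessively, and similar situations. A scoring mechanism is also introduced: customers can rate the service driver, and drivers who are unmotivated, have a poor attitude, or deliver or park vehicles slowly are dealt with. Specifically:
① The mobile phone number and the phone's IMEI serial code of a customer order are taken as the unique identifier (ID) of that order. When the customer places several orders, or orders from the same phone with a different number (the phone may be dual-SIM), the orders can be judged to be the same order, which prevents duplicate orders and reduces order redundancy and the rate of erroneous operations.
② After a driver confirms acceptance on the terminal, the server responds promptly and deletes the order from the order announcements, preventing several drivers from accepting the same order; the driver who accepted the order is also recorded.
③ After an order is completed, the customer can rate the driver. Drivers whose scores are too low must undergo training before returning to work, and drivers with a particularly large number of complaints are dealt with seriously. This reduces excessive customer waiting.
The above is a further detailed description of the invention with reference to specific preferred embodiments, and the implementation of the invention is not to be regarded as limited to these descriptions. For those skilled in the art to which the invention belongs, equivalent substitutions or obvious variations that do not depart from the inventive concept and that have the same performance or use shall all be regarded as falling within the scope of protection of the invention.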
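Rules ① and ② can be sketched with a small in-memory order board. This is only an illustration: the class `OrderBoard`, its methods and the integer order IDs are hypothetical, and treating a match on either the phone number or the IMEI as "same order" is one reading of rule ①.

```python
class OrderBoard:
    """Open orders visible to drivers, with duplicate and double-accept checks."""

    def __init__(self):
        self.open_orders = {}   # order_id -> (phone, imei)
        self.accepted = {}      # order_id -> driver_id
        self._next_id = 1

    def place(self, phone, imei):
        """Return the existing order ID on a duplicate, else create an order."""
        for order_id, (p, i) in self.open_orders.items():
            if p == phone or i == imei:
                return order_id
        order_id = self._next_id
        self._next_id += 1
        self.open_orders[order_id] = (phone, imei)
        return order_id

    def accept(self, order_id, driver_id):
        """Remove the order from the board and record the driver, once only."""
        if order_id not in self.open_orders:
            return False
        del self.open_orders[order_id]
        self.accepted[order_id] = driver_id
        return True


board = OrderBoard()
first = board.place("13800000000", "imei-1")
same_phone = board.place("13800000000", "imei-2")
same_handset = board.place("13900000000", "imei-1")
other = board.place("15000000000", "imei-3")
```

Removing the order from the board inside `accept` is what prevents a second driver from taking it; a production system would need the same check to be atomic on the server.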

Claims (10)

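  1. A designated-driving order prediction method for predetermined sites, characterized by comprising the following steps:
    S1. Obtain the designated-driving order data of multiple predetermined sites over a historical period and preprocess it, to build an order database for each predetermined site;
    S2. Based on the order database of each predetermined site, cluster the multiple predetermined sites by order-change similarity (region clustering), so that the sites fall into different region classes;
    S3. For each region class, perform the following: divide a day evenly into multiple base time periods; obtain from the order databases the order quantity of the sites of the same region class in each base time period of each day of the historical period; and cluster the base time periods according to those order quantities (time-period clustering), so that the base time periods are grouped into different order-prediction reference time periods;
    S4. For each order-prediction reference time period in each region class, perform the following: extract the order data of the reference time period and the corresponding variation factors, and feed them into a BP neural network for nonlinear fitting, to obtain an order prediction model for each reference time period in each region class;
    S5. Receive an order prediction request, determine which region class the request comes from and which order-prediction reference time period the requested time belongs to, select the order prediction model of that reference time period in that region class, and obtain the variation factors in the request so that the selected model can predict the order quantity.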
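  2. The designated-driving order prediction method of claim 1, characterized in that the data preprocessing in step S1 comprises:
    extracting key information from the designated-driving order data, the key information including at least, for each day of the historical period, the reserved order quantity, the agreed execution time, the actual execution time, the cancelled order quantity and the cancellation reason; and calculating the customer waiting time of each successfully executed order.
  3. The designated-driving order prediction method of claim 1, characterized in that step S2 specifically comprises:
    S21. Based on the order database of each predetermined site, describing the changes in order quantity over the historical period with a three-direction chain code, to build a change description sequence for each predetermined site;
    S22. For the multiple predetermined sites, computing the pairwise edit distances from the change description sequences;
    S23. Judging the order-change similarity from the edit distances, so as to divide the multiple predetermined sites into region classes.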
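  4. The designated-driving order prediction method of claim 3, characterized in that step S22 specifically comprises:
    selecting the change description sequences string1 and string2 of the two predetermined sites A and B to be compared, and computing the chain-code edit distance edit(i,j) between the i-th chain code string1(i) of string1 and the j-th chain code string2(j) of string2, where 0≤i≤L1, 0≤j≤L2, and L1 and L2 are the total lengths of string1 and string2 respectively;
    initializing an L1×L2 matrix D and filling it with the chain-code edit distances edit(i,j) computed by the following formula:
    Figure PCTCN2016080350-appb-100001
    the formula yielding the complete L1×L2 matrix D, in which the element D(L1,L2) is the edit distance edit_AB between the two predetermined sites A and B;
    computing in the same way the edit distance between every pair of predetermined sites, giving a total of $\binom{E}{2} = \frac{E(E-1)}{2}$ edit distances, where E is the total number of predetermined sites;
    and step S23 specifically comprises: clustering the $\binom{E}{2}$ edit distances obtained in step S22 with the Iterative Self-Organizing Data Analysis (ISODATA) algorithm, so that the E predetermined sites are divided into different region classes according to order-change similarity.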
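  5. The designated-driving order prediction method of claim 4, characterized in that the three-direction chain code comprises 0, 1 and 2: chain code 2 denotes "rise" when the order quantity increases relative to the previous day by more than a first threshold; chain code 0 denotes "fall" when the order quantity decreases relative to the previous day by more than the first threshold; chain code 1 denotes "unchanged" when the order quantity is the same as the previous day, or increases by less than the first threshold, or decreases by less than the first threshold.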
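  6. The designated-driving order prediction method of claim 2, characterized in that in step S3 the duration of each base time period is not less than the customer waiting time;
    and the time-period clustering in step S3 specifically comprises:
    S31. For each region class, perform the following: count the order quantity of all predetermined sites in the region class in each base time period of each day, and build a two-dimensional vector X(r,h) whose dimensions are the order quantity r in a base time period and the base time period h itself; the region class then has y=F×H two-dimensional vectors X1, X2, X3, …, Xy, where H is the number of base time periods and F is the number of days in the historical period;
    S32. For each region class, perform the following: standardize the data of each dimension of every two-dimensional vector to unify the scales, using the standardization formula
    $\tilde{x} = \frac{x - x_{min}}{x_{max} - x_{min}}$
    where xmin and xmax are the minimum and maximum of the same dimension over the y two-dimensional vectors, thereby obtaining y standardized two-dimensional vectors $\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_y$;
    S33. For each region class, use the nearest-neighbor clustering method to cluster, on the basis of Euclidean distance, the y standardized two-dimensional vectors $\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_y$ obtained in step S32, to obtain m vector sample classes based on order-quantity similarity;
    S34. Count how many orders a base time period has in each of the m vector sample classes, and assign the base time period to the vector sample class in which it has the most orders;
    S35. After step S34 has been performed for every base time period, the base time periods in each vector sample class are contiguous in time and no base time period appears in more than one vector sample class; the base time periods of each of the m vector sample classes are then merged, forming m order-prediction reference time periods.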
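  7. The designated-driving order prediction method of claim 1, characterized in that extracting the order data of an order-prediction reference time period in step S4 comprises: for each predetermined site in a region class, extracting the order quantity in that reference time period on each day of the historical period, together with the corresponding date; the variation factors include at least the weather conditions in that reference time period on each day.
  8. The designated-driving order prediction method of claim 7, characterized in that the nonlinear fitting with a BP neural network in step S4 specifically comprises:
    selecting the input-output relation of the neurons and the activation function;
    defining the number of layers of the BP neural network and the number of neurons in each layer;
    feeding the order data and the variation factors of one order-prediction reference time period in one region class into the BP neural network for training, which yields the order prediction model of that reference time period in that region class.
  9. The designated-driving order prediction method of claim 1, characterized in that a step S0 precedes step S1: designated-driving orders are divided into different designated-driving types according to their routes, and steps S1 to S4 are performed for the orders of each designated-driving type;
    and, when an order prediction request is received in step S5, the designated-driving type of the order in the request is also determined, so as to select the order prediction model of the corresponding reference time period in the corresponding region class under the corresponding designated-driving type.
  10. A designated-driving capacity scheduling method, characterized by comprising the following steps:
    predicting orders with the designated-driving order prediction method of any one of claims 1 to 9;
    generating a driver scheduling plan from the order prediction result, in which the number of designated drivers is a predetermined multiple of the predicted order quantity, the predetermined multiple being greater than 1.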
PCT/CN2016/080350 2015-10-14 2016-04-27 Designated-driving order prediction method and designated-driving capacity scheduling method WO2017063356A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510663215.6A CN105373840B (zh) 2015-10-14 2015-10-14 Designated-driving order prediction method and designated-driving capacity scheduling method
CN201510663215.6 2015-10-14

Publications (1)

Publication Number Publication Date
WO2017063356A1 true WO2017063356A1 (zh) 2017-04-20

Family

ID=55376020

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/080350 WO2017063356A1 (zh) 2015-10-14 2016-04-27 Designated-driving order prediction method and designated-driving capacity scheduling method

Country Status (2)

Country Link
CN (1) CN105373840B (zh)
WO (1) WO2017063356A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018214361A1 (en) * 2017-05-25 2018-11-29 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for improvement of index prediction and model building
CN110119884A (zh) * 2019-04-17 2019-08-13 五邑大学 一种基于近邻传播聚类的高速铁路客流时段划分方法
CN110837907A (zh) * 2018-08-17 2020-02-25 天津京东深拓机器人科技有限公司 一种预测波次订单量的方法和装置
CN111292106A (zh) * 2018-12-06 2020-06-16 北京嘀嘀无限科技发展有限公司 一种业务需求影响因素确定方法以及装置
CN111476588A (zh) * 2019-01-24 2020-07-31 北京嘀嘀无限科技发展有限公司 订单需求预测方法、装置、电子设备及可读存储介质
CN111612122A (zh) * 2019-02-25 2020-09-01 北京嘀嘀无限科技发展有限公司 实时需求量的预测方法、装置及电子设备
CN112669595A (zh) * 2020-12-10 2021-04-16 浙江大学 一种基于深度学习的网约车流量预测方法
CN112766587A (zh) * 2021-01-26 2021-05-07 北京顺达同行科技有限公司 物流订单处理方法、装置、计算机设备以及存储介质

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105373840B (zh) * 2015-10-14 2018-12-11 深圳市天行家科技有限公司 代驾订单预测方法和代驾运力调度方法
CN107305546B (zh) * 2016-04-18 2021-03-16 北京嘀嘀无限科技发展有限公司 一种出行场景中建筑物的语义刻画方法以及装置
CN108022140A (zh) * 2016-11-02 2018-05-11 北京嘀嘀无限科技发展有限公司 一种用车订单推荐方法、装置及服务器
CN107392512B (zh) * 2016-11-25 2018-06-01 北京小度信息科技有限公司 任务分组方法和装置
CN108615129B (zh) * 2016-12-09 2021-05-25 北京三快在线科技有限公司 一种运力监测方法、装置及电子设备
CN106779958B (zh) * 2016-12-28 2021-04-27 易塑科技(深圳)有限公司 一种基于集中区域的促使联合下单方法及其系统
CN107133645B (zh) * 2017-05-03 2021-10-26 百度在线网络技术(北京)有限公司 预估乘客取消订单行为的方法、设备及存储介质
WO2019061129A1 (en) * 2017-09-28 2019-04-04 Beijing Didi Infinity Technology And Development Co., Ltd. SYSTEMS AND METHODS FOR EVALUATING PROGRAMMING STRATEGY ASSOCIATED WITH DESIGNATED DRIVING SERVICES
CN109886442A (zh) * 2017-12-05 2019-06-14 北京嘀嘀无限科技发展有限公司 预估接驾时长方法及预估接驾时长系统
CN109919167B (zh) * 2017-12-12 2021-07-06 北京京东乾石科技有限公司 分拣中心的货物分拣方法和装置、货物分拣系统
CN110110950A (zh) * 2018-02-01 2019-08-09 北京京东振世信息技术有限公司 生成配送路区的方法、装置及计算机可读存储介质
CN108564326B (zh) * 2018-04-19 2021-12-21 安吉汽车物流股份有限公司 订单的预测方法及装置、计算机可读介质、物流系统
CN108830504B (zh) * 2018-06-28 2021-09-21 清华大学 用车需求预测方法、系统、服务器及计算机存储介质
CN109389542A (zh) * 2018-09-14 2019-02-26 百度在线网络技术(北京)有限公司 预测酒驾高发地区的方法、装置、计算机设备及存储介质
CN111091221A (zh) * 2018-10-23 2020-05-01 北京嘀嘀无限科技发展有限公司 一种出行等候容忍时间预测方法、系统、装置及存储介质
CN111192071B (zh) * 2018-11-15 2023-11-17 北京嘀嘀无限科技发展有限公司 发单量预估方法及装置、训练发单概率模型的方法及装置
CN111275229B (zh) * 2018-12-04 2022-07-05 北京嘀嘀无限科技发展有限公司 资源模型训练方法、资源缺口预测方法、装置及电子设备
CN111476389A (zh) * 2019-01-24 2020-07-31 北京嘀嘀无限科技发展有限公司 一种预估接单等待时长的方法及装置
CN109816128B (zh) * 2019-01-30 2021-06-29 杭州飞步科技有限公司 网约车订单的处理方法、装置、设备及可读存储介质
CN109886489A (zh) * 2019-02-21 2019-06-14 上海德启信息科技有限公司 应用于中转资源的配置系统及方法
CN111612183A (zh) * 2019-02-25 2020-09-01 北京嘀嘀无限科技发展有限公司 信息处理方法、装置、电子设备及计算机可读存储介质
CN111612489B (zh) * 2019-02-25 2024-03-29 北京嘀嘀无限科技发展有限公司 订单量的预测方法、装置及电子设备
CN110246329A (zh) * 2019-04-07 2019-09-17 武汉理工大学 一种出租车载客数量预测方法
CN110084437A (zh) * 2019-05-09 2019-08-02 上汽安吉物流股份有限公司 订单的预测方法及装置、物流系统以及计算机可读介质
CN110458345A (zh) * 2019-07-31 2019-11-15 深圳蓝贝科技有限公司 确定机器损失出货量的方法、装置、设备及存储介质
CN111832767B (zh) * 2019-08-01 2024-04-26 北京嘀嘀无限科技发展有限公司 播单策略自动测试装置、方法、电子设备和存储介质
WO2021077300A1 (en) * 2019-10-22 2021-04-29 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for improving an online to offline platform
CN110826971B (zh) * 2019-11-15 2023-04-07 拉扎斯网络科技(上海)有限公司 配送设备调度方法、装置、可读存储介质和电子设备
CN111160747B (zh) * 2019-12-23 2022-07-22 北京百度网讯科技有限公司 无人驾驶机器人出租车的调度方法、装置及电子设备
CN113487078A (zh) * 2021-06-30 2021-10-08 上海淇馥信息技术有限公司 一种新生成任务执行的方法、装置及电子设备
CN114997747B (zh) * 2022-07-29 2022-11-04 共幸科技(深圳)有限公司 一种实现上下游供需平衡的代驾服务调度方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103310286A (zh) * 2013-06-25 2013-09-18 浙江大学 一种具有时间序列特性的产品订单预测方法及装置
CN103985247A (zh) * 2014-04-24 2014-08-13 北京嘀嘀无限科技发展有限公司 基于城市叫车需求分布密度的出租车运力调度系统
CN104537831A (zh) * 2015-01-23 2015-04-22 北京嘀嘀无限科技发展有限公司 车辆调度的方法及设备
US20150161752A1 (en) * 2013-12-11 2015-06-11 Uber Technologies Inc. Intelligent queuing for user selection in providing on-demand services
CN105373840A (zh) * 2015-10-14 2016-03-02 深圳市天行家科技有限公司 代驾订单预测方法和代驾运力调度方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102081786A (zh) * 2011-01-30 2011-06-01 北京东方车云信息技术有限公司 一种车辆调度的方法及系统
CN103218769A (zh) * 2013-03-19 2013-07-24 王兴健 出租车订单分配方法
CN104599168A (zh) * 2015-02-02 2015-05-06 北京嘀嘀无限科技发展有限公司 叫车订单的分配方法和装置

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018214361A1 (en) * 2017-05-25 2018-11-29 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for improvement of index prediction and model building
CN110837907A (zh) * 2018-08-17 2020-02-25 天津京东深拓机器人科技有限公司 一种预测波次订单量的方法和装置
CN111292106A (zh) * 2018-12-06 2020-06-16 北京嘀嘀无限科技发展有限公司 一种业务需求影响因素确定方法以及装置
CN111476588B (zh) * 2019-01-24 2023-10-24 北京嘀嘀无限科技发展有限公司 订单需求预测方法、装置、电子设备及可读存储介质
CN111476588A (zh) * 2019-01-24 2020-07-31 北京嘀嘀无限科技发展有限公司 订单需求预测方法、装置、电子设备及可读存储介质
CN111612122A (zh) * 2019-02-25 2020-09-01 北京嘀嘀无限科技发展有限公司 实时需求量的预测方法、装置及电子设备
CN111612122B (zh) * 2019-02-25 2023-08-08 北京嘀嘀无限科技发展有限公司 实时需求量的预测方法、装置及电子设备
CN110119884B (zh) * 2019-04-17 2022-09-13 五邑大学 一种基于近邻传播聚类的高速铁路客流时段划分方法
CN110119884A (zh) * 2019-04-17 2019-08-13 五邑大学 一种基于近邻传播聚类的高速铁路客流时段划分方法
CN112669595B (zh) * 2020-12-10 2022-07-01 浙江大学 一种基于深度学习的网约车流量预测方法
CN112669595A (zh) * 2020-12-10 2021-04-16 浙江大学 一种基于深度学习的网约车流量预测方法
CN112766587A (zh) * 2021-01-26 2021-05-07 北京顺达同行科技有限公司 物流订单处理方法、装置、计算机设备以及存储介质
CN112766587B (zh) * 2021-01-26 2023-10-27 北京顺达同行科技有限公司 物流订单处理方法、装置、计算机设备以及存储介质

Also Published As

Publication number Publication date
CN105373840B (zh) 2018-12-11
CN105373840A (zh) 2016-03-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16854745

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16854745

Country of ref document: EP

Kind code of ref document: A1