CN115630733A

CN115630733A - Order delivery time estimation method, system, medium and electronic device

Info

Publication number: CN115630733A
Application number: CN202211267968.1A
Authority: CN
Inventors: 陈瑞
Original assignee: Beijing Yonghui Technology Co ltd
Current assignee: Beijing Yonghui Technology Co ltd
Priority date: 2022-10-17
Filing date: 2022-10-17
Publication date: 2023-01-20

Abstract

The invention provides a method, system, medium and electronic device for estimating order delivery time. The method comprises the following steps: acquiring historical order information; extracting features from the historical order information to obtain target features; processing the target features to obtain sample data; training a delivery time estimation model based on the sample data to obtain a trained delivery time estimation model; and estimating the delivery time of a target order based on the trained delivery time estimation model. By estimating the delivery time of an order in advance with the delivery time estimation model, the invention guarantees both the in-warehouse picking duration requirement and the rider delivery duration requirement while improving the efficiency of each link in the order delivery process and reducing the user's waiting time, thereby giving the user a better experience.

Description

Order delivery time estimation method, system, medium and electronic device

Technical Field

The invention belongs to the technical field of order distribution, and particularly relates to an order delivery time estimation method, system, medium and electronic equipment.

Background

With the wide adoption of instant delivery, the fresh food delivery industry has also developed well. Owing to the particular nature of fresh food, its demands on delivery time are extremely high, and delivering fresh food to customers quickly and effectively is one of the key selling points of the industry. ETA (Estimated Time of Arrival) estimation performs data mining on users' historical ordering information, warehouse picking and packing information, and rider delivery information, extracts relevant features, trains a corresponding model, and produces a delivery time estimate as soon as a user places an order.

Fresh food delivery is an end-to-end delivery demand: goods move from inside the warehouse to outside the warehouse, are handed over to a rider, and are finally delivered to the customer. These stages can be split apart, but they are strongly coupled. Delivery service points across the country differ widely in terrain complexity, logistics transfer, weather, staffing, goods specifications and the like, so estimation must be planned as a whole across different endpoints and different stores. The in-warehouse picking duration requirement and the rider delivery duration requirement must both be guaranteed, while the efficiency of each link is improved and the user's waiting time is reduced to give a better user experience. These goals conflict considerably, and the core problem is how to balance the estimates required at each stage.

The existing related industries also have ETA methods, and the specific principle is as follows:

(1) ETA for navigation and pick-up/drop-off related business

Traffic ETA prediction estimates the traffic in the next time period from historical traffic information. For road condition and time prediction, a highway network is simpler than an urban road network: it has no tidal traffic, no traffic lights and no heavy congestion. For ETA prediction in navigation and pick-up/drop-off services, a relatively common model at present is the graph-convolution-based DCRNN, which predicts arrival time by combining historical road condition information with the current road condition state; the main feature data are historical and real-time road conditions and weather information.

(2) ETA for takeaway delivery related business

Instant goods delivery services of this kind are represented by Meituan, which estimates delivery time over the delivery relationships among the rider, the merchant and the customer. Meituan uses a deep learning model to mine historical information about business districts and customers, and combines the model with business rules: after the model outputs an ETA value, specific rule-based time is added on top of it to meet the requirements of particular scenarios, and each rule is produced by repeated iteration on business metrics. The drawback is that the model time and the rule time are optimized separately rather than as a whole, that is, the influence of the rule time cannot be taken into account during model training, and the rule time fluctuates differently across different periods of the year.

Disclosure of Invention

In view of the above drawbacks of the prior art, an object of the present invention is to provide a method, a system, a medium, and an electronic device for estimating an order arrival time, which are used to solve the above problems in the prior art.

To achieve the above and other related objects, the present invention provides an order arrival time estimation method, comprising the steps of: obtaining historical order information; extracting the characteristics of the historical order information to obtain target characteristics; processing the target characteristics to obtain sample data; and training a delivery time estimation model based on the sample data, acquiring the trained delivery time estimation model, and estimating the delivery time of the target order based on the trained delivery time estimation model.

In an embodiment of the present invention, the processing the target feature to obtain sample data includes the following steps: performing duplicate removal and cleaning operation on the target characteristics to obtain original available data; performing exception processing on the original available data to obtain data to be encoded; and encoding the data to be encoded to obtain the sample data.

In an embodiment of the present invention, the training of the estimated delivery time model based on the sample data to obtain the trained estimated delivery time model includes the following steps: inputting the sample data into the delivery time estimation model to train the delivery time estimation model; calculating a loss value corresponding to a loss function of the delivery time estimation model once after each training until the loss value does not decrease any more, stopping training, and obtaining a trained delivery time estimation model; the loss function is a quantile function.

In an embodiment of the present invention, the historical order information includes order information corresponding to at least one historical order; the formula of the quantile function is as follows:

wherein $a$ represents a preset quantile; $\xi_i$ represents the difference between the delivery time of the $i$-th historical order as estimated by the delivery time estimation model and the actual delivery time of that historical order; and $\psi(\xi_i \mid a)$ represents the loss value.
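For orientation, the quantile (pinball) loss is commonly written in the following piecewise form. This is a sketch of the standard textbook expression, not a reproduction of the formula in the filing; in particular, taking $\xi_i$ as the actual value minus the estimated value, and the symbols $y_i$ and $\hat{y}_i$, are assumptions introduced here.

```latex
\psi(\xi_i \mid a) =
\begin{cases}
a \,\xi_i, & \xi_i \ge 0, \\
(a - 1)\,\xi_i, & \xi_i < 0,
\end{cases}
\qquad \xi_i = y_i - \hat{y}_i
```

Under this sign convention, a quantile $a > 0.5$ penalizes estimates that fall short of the actual delivery time more heavily than estimates that exceed it.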

In an embodiment of the present invention, inputting the sample data into the delivery time estimation model to train the delivery time estimation model includes the following steps: dividing the sample data into a training set and a test set; and inputting the training set into the delivery time estimation model to train the delivery time estimation model. Training the delivery time estimation model based on the sample data further comprises the following step: after the trained delivery time estimation model is obtained, inputting the test set into the trained delivery time estimation model to obtain a fulfillment rate and/or an estimation deviation, so as to test the trained delivery time estimation model; wherein the fulfillment rate is the number of fulfilled orders in the test set divided by the total number of orders in the test set, and the estimation deviation is the absolute value of the difference between the delivery time estimated by the trained delivery time estimation model for an order in the test set and the actual delivery time of that order.
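The two test metrics can be sketched as follows. This is a minimal illustration, not the filing's implementation: the text does not define when an order counts as fulfilled, so the rule used here (actual delivery time no later than the estimate) is an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence


def fulfillment_rate(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Fulfilled orders divided by total orders in the test set.

    Assumption: an order is "fulfilled" when its actual delivery time
    does not exceed the estimated delivery time.
    """
    if len(estimated) != len(actual) or len(estimated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    fulfilled = sum(1 for e, a in zip(estimated, actual) if a <= e)
    return fulfilled / len(estimated)


def estimation_deviations(estimated: Sequence[float], actual: Sequence[float]) -> List[float]:
    """Absolute difference between estimated and actual delivery time, per order."""
    return [abs(e - a) for e, a in zip(estimated, actual)]
```

With estimates of 30, 45, 25 and 40 minutes against actual times of 28, 50, 25 and 35 minutes, three of the four orders meet the assumed fulfillment rule.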

In an embodiment of the present invention, the delivery time estimation model adopts a LightGBM model; the delivery time estimation model comprises at least one submodel; the formula of the delivery time of the target order estimated by the trained delivery time estimation model is as follows:

wherein $F_m(x)$ represents the delivery time of the target order $x$; $f_i(x)$ represents the delivery time of the target order $x$ estimated by the $i$-th submodel; $m$ represents the total number of submodels; and each $f_i(x)$ has a corresponding weight.
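One way to read these definitions is as a weighted sum over the submodel estimates. The sketch below assumes that additive form; the function name, the `weights` argument and the toy submodels are hypothetical and are not taken from the filing or from LightGBM's API.

```python
from typing import Callable, Dict, Sequence

Order = Dict[str, float]


def ensemble_estimate(
    x: Order,
    submodels: Sequence[Callable[[Order], float]],
    weights: Sequence[float],
) -> float:
    """Combine m submodel estimates f_i(x) into F_m(x) as a weighted sum (assumed form)."""
    if len(submodels) != len(weights):
        raise ValueError("each submodel needs exactly one weight")
    return sum(w * f(x) for f, w in zip(submodels, weights))


# Toy submodels standing in for individual decision trees.
toy_submodels = [
    lambda x: 20.0,                      # base duration in minutes
    lambda x: 2.0 * x["distance_km"],    # distance contribution
    lambda x: 4.0 if x["raining"] else 0.0,
]
estimate = ensemble_estimate({"distance_km": 3.0, "raining": 1.0}, toy_submodels, [1.0, 0.5, 0.5])
```

In gradient-boosted tree models the submodels are typically decision trees added one after another, each scaled by a shrinkage factor.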

In an embodiment of the present invention, the estimating the delivery time of the target order based on the trained delivery time estimation model includes the following steps: coding the trained delivery time estimation model to obtain a PMML file; and importing the PMML file into a terminal to realize that the trained delivery time estimation model is deployed on the terminal so that the terminal estimates the delivery time of the target order.

The invention provides an order delivery time estimation system, comprising an information acquisition module, a feature extraction module, a data acquisition module and a time estimation module; the information acquisition module is used for acquiring historical order information; the feature extraction module is used for extracting features from the historical order information to obtain target features; the data acquisition module is used for processing the target features to obtain sample data; the time estimation module is used for training a delivery time estimation model based on the sample data to obtain the trained delivery time estimation model, and for estimating the delivery time of the target order based on the trained delivery time estimation model.

The present invention provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described order arrival time estimation method.

The present invention provides an electronic device, including: a processor and a memory; the memory is used for storing a computer program; the processor is used for executing the computer program stored in the memory so as to enable the electronic equipment to execute the order arrival time estimation method.

As described above, the method, system, medium and electronic device for estimating order arrival time according to the present invention have the following advantages:

(1) Compared with the prior art, the invention estimates the delivery time of an order in advance through the delivery time estimation model; it guarantees the in-warehouse picking duration requirement and the rider delivery duration requirement while improving the efficiency of each link in the order delivery process and reducing the user's waiting time, thereby giving the user a better experience.

(2) The invention provides a LightGBM model based on quantile regression, together with a method and a system for estimating delivery time in specific scenarios. Data mining is performed on historical order information and relevant features are extracted; a quantile is set on the prediction range of the conventional LightGBM model, chosen to suit the requirements of the business scenario; the extracted features are cleaned and fed into the delivery time estimation model for training; and after training, the delivery time estimation model is imported into a terminal, so that the terminal can adjust the predicted time in a generalized way according to the trigger conditions of specific scenarios. The prediction result is therefore more accurate for users across a large range, many users and a wide variety of scenarios.

Drawings

Fig. 1 is a schematic structural diagram of a terminal according to an embodiment of the invention.

FIG. 2 is a flowchart illustrating an exemplary method for estimating order-arrival time according to the present invention.

FIG. 3 is a flow chart illustrating an embodiment of the present invention for processing a target feature to obtain sample data.

FIG. 4 is a flowchart illustrating training of a delivery time estimation model based on sample data to obtain a trained delivery time estimation model according to an embodiment of the present invention.

FIG. 5 is a flowchart illustrating inputting sample data into the delivery time estimation model to train the delivery time estimation model according to an embodiment of the present invention.

FIG. 6 is a block diagram of a decision tree according to an embodiment of the present invention.

FIG. 7 is a flowchart illustrating estimating the time of arrival of a target order based on a trained time of arrival estimation model according to an embodiment of the present invention.

FIG. 8 is a schematic diagram illustrating an embodiment of an order arrival time estimation system according to the present invention.

Description of the reference symbols

1. Terminal device

11. Processing unit

12. Memory device

121. Random access memory

122. Cache memory

123. Storage system

124. Program/utility tool

1241. Program module

13. Bus line

14. Input/output interface

15. Network adapter

2. External device

3. Display device

81. Information acquisition module

82. Feature extraction module

83. Data acquisition module

84. Time estimation module

Steps S1 to S4

Steps S31 to S33

Steps S41 to S42

Steps S411 to S412

Steps S43 to S44

Detailed Description

The following description of the embodiments of the present invention is provided by way of specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the features in the following embodiments and examples may be combined with each other without conflict.

It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the drawings only show the components related to the present invention rather than being drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of each component in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.

Compared with the prior art, the order delivery time estimation method, system, medium and electronic device of the invention estimate the delivery time of an order in advance through the delivery time estimation model; they guarantee the in-warehouse picking duration requirement and the rider delivery duration requirement while improving the efficiency of each link in the order delivery process and reducing the user's waiting time, thereby giving the user a better experience. The invention provides a LightGBM model based on quantile regression, together with a method and a system for estimating delivery time in specific scenarios. Data mining is performed on historical order information and relevant features are extracted; a quantile is set on the prediction range of the conventional LightGBM model, chosen to suit the requirements of the business scenario; the extracted features are cleaned and fed into the delivery time estimation model for training; and after training, the delivery time estimation model is imported into a terminal, so that the terminal can adjust the predicted time in a generalized way according to the trigger conditions of specific scenarios. The prediction result is therefore more accurate for users across a large range, many users and a wide variety of scenarios.

The storage medium of the present invention stores a computer program that, when executed by a processor, implements the order delivery time estimation method described below. The storage medium includes various media that can store program code, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, a USB flash drive, a memory card, or an optical disk.

Any combination of one or more storage media may be employed. The storage medium may be a computer-readable signal medium or a computer-readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The present invention is described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the computer program instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The electronic device of the invention comprises a processor and a memory.

The memory is used for storing a computer program; preferably, the memory comprises various media that can store program code, such as a ROM, a RAM, a magnetic disk, a USB flash drive, a memory card, or an optical disk.

The processor is connected with the memory and used for executing the computer program stored in the memory so as to enable the electronic equipment to execute the order arrival time estimation method described below.

Preferably, the processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In an embodiment, the electronic device includes a terminal and/or a server.

Fig. 1 shows a block diagram of an exemplary terminal 1 suitable for implementing an embodiment of the invention.

The terminal 1 shown in fig. 1 is only an example, and should not bring any limitation to the function and the use range of the embodiment of the present invention.

As shown in fig. 1, the terminal 1 is in the form of a general purpose computing device. The components of the terminal 1 may include, but are not limited to: one or more processors or processing units 11, a memory 12, and a bus 13 that couples various system components including the memory 12 and the processing unit 11.

Bus 13 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.

The terminal 1 typically includes a variety of computer system readable media. These media may be any available media that can be accessed by terminal 1 and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 12 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 121 and/or cache memory 122. The terminal 1 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 123 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 1, commonly referred to as a "hard drive"). Although not shown in FIG. 1, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 13 by one or more data media interfaces. Memory 12 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.

Program/utility 124 having a set (at least one) of program modules 1241 may be stored in, for example, memory 12, such program modules 1241 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 1241 generally perform the functions and/or methodologies of embodiments of the invention as described herein.

The terminal 1 may also communicate with one or more external devices 2, such as a keyboard, pointing device, display 3, etc., as well as with one or more devices that enable a user to interact with the terminal 1, and/or any devices that enable the terminal 1 to communicate with one or more other computing devices, such as a network card, modem, etc. Such communication may be through an input/output (I/O) interface 14. Also, the terminal 1 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) through the network adapter 15. As shown in fig. 1, the network adapter 15 communicates with the other modules of the terminal 1 via the bus 13. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the terminal 1, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

As shown in fig. 2, in an embodiment, the method for estimating order arrival time of the present invention includes the following steps:

Step S1, acquiring historical order information.

In an embodiment, the historical order information includes order information corresponding to at least one historical order.

It should be noted that the order information includes at least one of, but is not limited to, the following: user ordering information, in-warehouse picking and packing information, rider delivery information, weather information and community information.

Step S2, extracting features from the historical order information to obtain target features.

Specifically, by performing data mining on historical order information, relevant target features are extracted from the historical order information.

It should be noted that the target features include at least one of, but are not limited to, the following: an in-warehouse picking feature, a counter waiting feature, a rider delivery feature, a weather feature, a community feature and an order feature. The in-warehouse picking feature includes at least one of, but is not limited to, the following: the number of picking orders in a preset time period, the number of pickers in the preset time period, the mean picking duration in the store warehouse, and the mean completion duration in the store warehouse. The counter waiting feature includes at least one of, but is not limited to, the following: the number of orders to be picked in a future preset time period and the number of orders to be delivered in the future preset time period. The rider delivery feature includes at least one of, but is not limited to, the following: the number of delivery orders in the preset time period and the number of delivery personnel in the preset time period. The weather feature includes at least, but is not limited to, the weather on the day the order was placed. The community feature includes at least one of, but is not limited to, the following: the historical delivery mean of the community, the historical delivery time median, the Baidu riding time and the Baidu riding distance. The order feature includes at least one of, but is not limited to, the following: order SKU (Stock Keeping Unit, an alphanumeric product code assigned to stock, with each SKU code corresponding to one product) type data and order commodity type data.
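As an illustration of how one group of these features might be derived, the sketch below computes the historical delivery mean and median per community from a list of historical orders. The record layout and the field names `community` and `delivery_minutes` are hypothetical; the filing does not specify a data schema.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Dict, Iterable, List


def community_features(orders: Iterable[dict]) -> Dict[str, Dict[str, float]]:
    """Group historical delivery durations by community and summarize them."""
    durations: Dict[str, List[float]] = defaultdict(list)
    for order in orders:
        durations[order["community"]].append(order["delivery_minutes"])
    return {
        community: {
            "hist_delivery_mean": mean(values),
            "hist_delivery_median": median(values),
        }
        for community, values in durations.items()
    }


history = [
    {"community": "A", "delivery_minutes": 20},
    {"community": "A", "delivery_minutes": 30},
    {"community": "A", "delivery_minutes": 40},
    {"community": "B", "delivery_minutes": 25},
]
features = community_features(history)
```

The other feature groups (picking, counter waiting, rider delivery) could be aggregated in the same way over a preset time window.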

Step S3, processing the target features to obtain sample data.

As shown in fig. 3, in an embodiment, the processing the target feature to obtain sample data includes the following steps:

Step S31, performing deduplication and cleaning on the target features to obtain original usable data.

Step S32, performing anomaly handling on the original usable data to obtain data to be encoded.

Specifically, anomaly statistics are performed on the original usable data to obtain an abnormal data set, and the abnormal data set is then removed to obtain the data to be encoded.

Step S33, encoding the data to be encoded to obtain the sample data.

It should be noted that some historical orders may contain obviously abnormal data caused by other factors (for example, a traffic accident during delivery that delays the order's delivery time for a long time). If such data were used as sample data to train the delivery time estimation model, the model's estimate of the delivery time of the target order would be inaccurate. In this embodiment, the target features corresponding to such orders are defined as abnormal data, and the abnormal data are removed.

Specifically, all target features are first deduplicated and cleaned, and abnormal values in the target features are either removed or truncated and retained, to obtain the original usable data; weather anomaly statistics and delivery duration anomaly statistics are then performed on the original usable data, and all abnormal data sets are removed to obtain the data to be encoded; finally, the community and time period features in the data to be encoded are encoded to obtain the sample data.

It should be noted that the methods adopted in the above-mentioned "deduplication cleaning" in step S31, the "exception handling" in step S32, and the "encoding handling" in step S33 are all conventional technical means in the field, so detailed description of the specific working principle is not repeated here.
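Steps S31 to S33 can be sketched as a small pipeline. This is an illustration under stated assumptions rather than the filing's procedure: the field names, the rule of dropping rows with missing values, the fixed `max_minutes` threshold for delivery duration anomalies, and integer label encoding are all hypothetical choices.

```python
from typing import Dict, List, Tuple


def preprocess(rows: List[dict], max_minutes: float = 120.0) -> Tuple[List[dict], Dict[str, int]]:
    # S31: deduplicate by order id and drop rows with missing values.
    seen = set()
    usable = []
    for row in rows:
        if row["order_id"] in seen or any(v is None for v in row.values()):
            continue
        seen.add(row["order_id"])
        usable.append(row)

    # S32: remove rows whose delivery duration is treated as abnormal
    # (assumed rule: non-positive or longer than max_minutes).
    to_encode = [r for r in usable if 0 < r["delivery_minutes"] <= max_minutes]

    # S33: encode the community as an integer code; keep the hour as the time period.
    codes: Dict[str, int] = {}
    samples = []
    for r in to_encode:
        code = codes.setdefault(r["community"], len(codes))
        samples.append(
            {"community_code": code, "hour": r["hour"], "delivery_minutes": r["delivery_minutes"]}
        )
    return samples, codes
```

A production pipeline would more likely derive the anomaly threshold from the data (for example from a high percentile of delivery durations) than hard-code it.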

Step S4, training a delivery time estimation model based on the sample data to obtain the trained delivery time estimation model, and estimating the delivery time of the target order based on the trained delivery time estimation model.

As shown in fig. 4, in an embodiment, the training of the time-to-arrival estimation model based on the sample data to obtain the trained time-to-arrival estimation model includes the following steps:

and S41, inputting the sample data into the arrival time estimation model to train the arrival time estimation model.

As shown in fig. 5, in an embodiment, the inputting the sample data into the time-to-delivery estimation model to train the time-to-delivery estimation model includes the following steps:

step S411, the sample data is divided into a training set and a test set.

Specifically, sample data is divided into a training set and a test set according to a certain proportion.

It should be noted that the proportion used to divide the sample data into the training set and the test set does not limit the present invention and can be set according to the actual application scenario.

For example, 80% of the sample data may be used as the training set and the remaining 20% as the test set.
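A minimal sketch of such a split is given below. The function name, the shuffling, and the fixed seed are assumptions made for illustration; the source only states that the data is divided in a certain proportion.

```python
import random


def split_samples(samples, train_ratio=0.8, seed=0):
    """Randomly divide sample data into a training set and a test set."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


train_set, test_set = split_samples(range(100))
print(len(train_set), len(test_set))  # 80 20
```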

Step S412, inputting the training set into the delivery time estimation model to train the delivery time estimation model.

Step S42, calculating the loss value of the loss function of the delivery time estimation model after each round of training, and stopping training when the loss value no longer decreases, thereby obtaining the trained delivery time estimation model.
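The stopping rule in step S42 can be sketched as a small loop. Here `step` is a hypothetical callable that performs one round of training and returns the current loss value, and `max_rounds` is an assumed safety limit; neither is named in the source.

```python
def train_until_plateau(step, max_rounds=1000):
    """Run training rounds until the loss value stops decreasing."""
    best = float("inf")
    rounds = 0
    while rounds < max_rounds:
        loss = step()          # one round of training, returns the loss value
        rounds += 1
        if loss >= best:       # the loss no longer decreases: stop training
            break
        best = loss
    return rounds, best


# Usage with a pre-recorded sequence of loss values standing in for a model.
losses = iter([5.0, 3.0, 2.5, 2.5, 1.0])
print(train_until_plateau(lambda: next(losses)))  # (4, 2.5)
```

In practice a tolerance or a patience window is often used instead of a strict comparison, but the strict form matches the wording of the step.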

In this embodiment, the loss function is a quantile function.

It should be noted that quantile regression studies the relationship between the independent variables and the conditional quantiles of the dependent variable, and the resulting quantile regression model can estimate the conditional quantiles of the dependent variable from the independent variables. Compared with traditional regression analysis, which can only capture the central tendency of the dependent variable, quantile regression can further infer the conditional probability distribution of the dependent variable, and it is one of the nonparametric statistical methods.

In one embodiment, the quantile function is formulated as follows:

$$\psi(\xi_i \mid a) = \begin{cases} a\,\xi_i, & \xi_i \ge 0 \\ (a - 1)\,\xi_i, & \xi_i < 0 \end{cases}$$

wherein $a$ represents a preset quantile; $\xi_i$ represents the difference between the value of the delivery time of a historical order estimated by the delivery time estimation model and the true value of the delivery time of that historical order; $\psi(\xi_i \mid a)$ represents the loss value.

It should be noted that the preset quantile a is a value set in advance; its specific value does not limit the present invention and can be set according to the actual application scenario.

Typically, the predetermined quantile a is set to 0.90, 0.95, or 0.99.

For example, with a = 0.90 (i.e., the 90th percentile) and a true value of 9: when the estimated value is 10, ξ_i ≥ 0 and the calculation result is 0.9 × (10 − 9) = 0.9; when the estimated value is 8, ξ_i < 0 and the calculation result is (0.9 − 1) × (8 − 9) = 0.1.
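The worked example above can be checked with a few lines of Python. This is a minimal sketch: the function name is illustrative, and ξ_i is taken as the estimated value minus the true value, as the text defines it.

```python
def quantile_loss(estimate, actual, a):
    """Loss value for one order, with xi = estimate - actual."""
    xi = estimate - actual
    return a * xi if xi >= 0 else (a - 1) * xi


print(round(quantile_loss(10, 9, 0.90), 2))  # 0.9
print(round(quantile_loss(8, 9, 0.90), 2))   # 0.1
```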

Therefore, the delivery time estimation model uses the quantile function as its loss function, and the structure of the delivery time estimation model is obtained by training until the loss value no longer decreases.

It should be noted that the 90th percentile can be understood simply as ensuring that 90% of the estimated values are higher than the true values.

In one embodiment, training the delivery time estimation model based on the sample data further comprises: after the step of obtaining the trained delivery time estimation model, inputting the test set into the trained delivery time estimation model to obtain a fulfillment rate and/or an estimated deviation, so as to test the trained delivery time estimation model.

Wherein the fulfillment rate is the number of fulfilled orders in the test set divided by the total number of orders in the test set; the estimated deviation is the absolute value of the difference between the delivery time estimated by the trained delivery time estimation model for an order in the test set and the actual delivery time of that order.
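The two test metrics can be sketched as follows. The source does not define when an order counts as fulfilled; this sketch assumes, purely for illustration, that an order is fulfilled when its actual delivery time does not exceed the estimated delivery time. The function name is likewise hypothetical.

```python
def evaluate(estimates, actuals):
    """Return (fulfillment rate, per-order estimated deviations) on a test set."""
    pairs = list(zip(estimates, actuals))
    # ASSUMPTION: "fulfilled" means delivered no later than estimated.
    fulfilled = sum(1 for est, act in pairs if act <= est)
    rate = fulfilled / len(pairs)
    deviations = [abs(est - act) for est, act in pairs]
    return rate, deviations


# Times in minutes for four test orders.
rate, deviations = evaluate([30, 25, 40, 20], [28, 27, 40, 15])
print(rate, deviations)  # 0.75 [2, 2, 0, 5]
```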

It should be noted that the deviation of the estimation result of the delivery time estimation model is verified by testing the trained delivery time estimation model through the test set.

In one embodiment, the delivery time estimation model is a LightGBM model; the delivery time estimation model comprises at least one submodel; the formula of the delivery time of the target order estimated by the trained delivery time estimation model is as follows:

$$F_m(x) = \sum_{i=1}^{m} w_i f_i(x)$$

wherein $w_i$ denotes the weight corresponding to $f_i(x)$ and is a preset value.

Specifically, the results estimated by the submodels (i.e., $f_i(x)$) are weighted and summed, and the sum is used as the delivery time of the target order (i.e., $F_m(x)$).
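The weighted summation described above amounts to a one-line combination of submodel outputs. In this sketch the submodels are stand-in Python callables and the weights are arbitrary illustrative values; neither comes from the source.

```python
def ensemble_estimate(submodels, weights, x):
    """Weighted sum of the results estimated by each submodel for input x."""
    return sum(w * f(x) for f, w in zip(submodels, weights))


# Two toy submodels with preset weights 0.5 and 0.25.
submodels = [lambda x: x + 1, lambda x: 2 * x]
print(ensemble_estimate(submodels, [0.5, 0.25], 4))  # 4.5
```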

It should be noted that LightGBM is a more powerful and faster model than XGBoost, with greatly improved performance, and has the following advantages over conventional algorithms: higher training efficiency, lower memory usage, higher accuracy, support for parallel learning, the ability to process large-scale data, and native support for categorical features, so that categorical features do not need to be one-hot (0-1) encoded.

Boosting is an ensemble model that completes a learning task through a linear combination of a series of submodels. The LightGBM model belongs to gradient boosting, whose core idea is as follows: one variable is added per iteration, submodels are added one by one during the iterative process, and the loss function is guaranteed to keep decreasing.

As above, $f_i(x)$ is a submodel, and the composite model is

$$F_m(x) = \sum_{i=1}^{m} w_i f_i(x)$$

with $w_i$ the weight corresponding to $f_i(x)$.

The loss function is $L[F_i(x), Y]$ (i.e., $\psi(\xi_i \mid a)$ described above). Each time a new submodel is added, the loss function continues to decrease along the gradient of the variable with the next-highest information content:

$$L[F_i(x), Y] < L[F_{i-1}(x), Y]$$

where $Y$ represents the true value.
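The idea of adding submodels one by one while the loss keeps decreasing can be illustrated with a toy gradient boosting loop. This is a simplified sketch, not LightGBM: it swaps in squared error for the quantile loss to keep the arithmetic short, uses single-split trees ("stumps") on one feature, and applies the same assumed weight `rate` to every submodel. It requires at least two distinct feature values.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals (squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean


def boost(xs, ys, rounds=5, rate=0.5):
    """Add one stump per round; record the mean squared error after each."""
    stumps, losses = [], []
    preds = [0.0] * len(xs)
    for _ in range(rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + rate * stump(x) for p, x in zip(preds, xs)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return stumps, losses


_, losses = boost([1, 2, 3, 4, 5, 6], [10, 12, 15, 30, 32, 35])
print(all(later < earlier for earlier, later in zip(losses, losses[1:])))
```

Each recorded loss is smaller than the one before it, which is the inequality stated above applied round by round.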

It should be noted that a decision tree (DecisionTree) is a classification and regression method that, in practical research, is mostly used for classification. A decision tree has a tree structure, usually a binary tree: at each node a judgment condition sorts the data into "meets the condition" and "does not meet the condition", and this is repeated downward level by level, as shown in fig. 6.

A decision tree can be understood as a collection of if-then rules, and can also be regarded as a conditional probability distribution defined on the feature space and the class space. Creating a decision tree comprises 3 main steps: feature selection, tree generation, and pruning. The method has the advantages of high readability and fast classification.

Each submodel in gradient boosting is a decision tree. The decision tree submodel splits nodes leaf-wise, which keeps the computational cost low; because of this splitting strategy, the depth of the tree and the minimum amount of data in each leaf node need to be controlled to avoid over-fitting. LightGBM uses a histogram-based decision tree algorithm: feature values are divided into a number of small "buckets", and splits are then searched over these buckets, which reduces storage and computation costs. In addition, its handling of categorical features allows LightGBM to perform better on certain data.
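The bucket idea can be sketched as follows. This is a simplified illustration and not LightGBM's actual implementation: it assumes equal-width buckets on a single feature and scores candidate splits by the reduction in squared error, whereas the real library has its own binning and gain computation. All names and the bucket count are hypothetical.

```python
def histogram_split(values, targets, n_bins=4):
    """Find the best split among bucket boundaries instead of raw values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # guard against a constant feature

    # One pass over the data: accumulate count and target sum per bucket.
    counts, sums = [0] * n_bins, [0.0] * n_bins
    for v, t in zip(values, targets):
        b = min(int((v - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += t

    total_n, total_s = sum(counts), sum(sums)
    best_gain, best_edge = 0.0, None
    left_n, left_s = 0, 0.0
    # Only n_bins - 1 candidate splits are scanned, however large the data.
    for b in range(n_bins - 1):
        left_n += counts[b]
        left_s += sums[b]
        right_n, right_s = total_n - left_n, total_s - left_s
        if left_n == 0 or right_n == 0:
            continue
        gain = (left_s ** 2 / left_n + right_s ** 2 / right_n
                - total_s ** 2 / total_n)
        if gain > best_gain:
            best_gain, best_edge = gain, lo + (b + 1) * width
    return best_edge, best_gain


print(histogram_split([1, 2, 3, 4, 5, 6, 7, 8],
                      [10, 10, 10, 10, 30, 30, 30, 30]))  # (4.5, 800.0)
```

Storage drops from one entry per sample to one entry per bucket, and the split search scans bucket boundaries rather than every distinct feature value.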

In this embodiment, the LightGBM model is a composite model comprising a plurality of decision trees; the quantile LightGBM sets a loss function of the LightGBM model as a quantile function, and estimates data below the quantile.

Specifically, after the preset quantile is set, the sample data is randomly divided into a training set and a test set. The training set is then fed into the delivery time estimation model for cross-validation training, with five-fold cross-validation used for parameter optimization. The optimal parameters obtained are input into the delivery time estimation model, which is then tested on the test set to verify the deviation of the model's estimation results.
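Five-fold cross-validation for parameter optimization can be sketched as below. The helper names, the interleaved fold assignment (which assumes the training set has already been shuffled), and the `score` callback are all assumptions for illustration; `score(params, train_idx, val_idx)` stands for training with one candidate parameter set on the training folds and returning the validation loss.

```python
def k_fold_indices(n, k=5):
    """Yield (train_indices, validation_indices) for each of k folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def select_params(candidates, score, n, k=5):
    """Return the candidate with the lowest mean validation loss."""
    def mean_loss(params):
        return sum(score(params, tr, va)
                   for tr, va in k_fold_indices(n, k)) / k
    return min(candidates, key=mean_loss)
```

A usage example with a stand-in scoring function: `select_params([{"num_leaves": 15}, {"num_leaves": 31}], score, n=len(train_set))`, where `num_leaves` is just one example of a parameter that could be tuned this way.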

As shown in fig. 7, in one embodiment, estimating the delivery time of the target order based on the trained delivery time estimation model comprises the following steps:

Step S43, encoding the trained delivery time estimation model to obtain a PMML file.

Specifically, the trained delivery time estimation model is XML-encoded by means of a Java project jar package for LightGBM, and a PMML file for delivery time estimation is obtained.

Step S44, importing the PMML file into a terminal to deploy the trained delivery time estimation model on the terminal, so that the terminal estimates the delivery time of the target order.

It should be noted that, by importing the PMML file obtained in step S43 into a terminal, when the user requests to estimate the delivery time of the target order, the terminal is enabled to have the function of estimating the delivery time, and the estimated delivery time can be returned to the user.

In this embodiment, the order delivery time estimation method of the present invention is applied to the fresh food delivery industry. Data mining is performed on the user order placing information, in-warehouse picking and packing information, rider delivery information, and the like in historical orders, and relevant features are extracted. A quantile suited to the requirements of the business scenario is set on the prediction range of the conventional LightGBM model. The extracted features are cleaned and fed into the delivery time estimation model for training, and the trained model is imported into the terminal, so that the terminal can make generalized adjustments to the predicted time in combination with specific scenario trigger conditions. Suitable delivery time predictions can thus be made for users over a large range, for many users, and across a wide variety of scenarios, with more accurate results and a better user experience.

It should be noted that the present invention is also applicable to the estimation of order arrival time in other industries (such as the navigation delivery related business and the takeaway delivery related business).

It should be noted that the protection scope of the method for estimating the order delivery time according to the present invention is not limited to the execution sequence of the steps listed in this embodiment, and all the solutions implemented by adding or subtracting steps and replacing steps in the prior art according to the principle of the present invention are included in the protection scope of the present invention.

As shown in fig. 8, in one embodiment, the order arrival time estimation system of the present invention includes an information obtaining module 81, a feature extracting module 82, a data obtaining module 83 and a time estimation module 84.

The information obtaining module 81 is configured to obtain historical order information.

The feature extraction module 82 is configured to perform feature extraction on the historical order information to obtain a target feature.

The data obtaining module 83 is configured to process the target feature to obtain sample data.

The time estimation module 84 is configured to train a delivery time estimation model based on the sample data, acquire the trained delivery time estimation model, and estimate delivery time of the target order based on the trained delivery time estimation model.

It should be noted that the structures and principles of the information obtaining module 81, the feature extracting module 82, the data obtaining module 83, and the time estimating module 84 correspond to the steps (step S1 to step S4) in the order arrival time estimating method one by one, and thus are not described herein again.

It should be noted that the division of the modules of the above system is only a logical division, and all or part of the actual implementation may be integrated into one physical entity or may be physically separated. And these modules can all be implemented in the form of software invoked by a processing element; or can be implemented in the form of hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the x module may be a processing element that is set up separately, or may be implemented by being integrated in a chip of the system, or may be stored in a memory of the system in the form of program code, and the function of the x module may be called and executed by a processing element of the system. The other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.

For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as: one or more Application Specific Integrated Circuits (ASICs), or one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), etc. For another example, when one of the above modules is implemented in the form of program code called by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. For another example, these modules may be integrated together and implemented in the form of a System-On-a-Chip (SOC).

It should be noted that the order arrival time estimation system of the present invention can implement the order arrival time estimation method of the present invention, but the implementation apparatus of the order arrival time estimation method of the present invention includes, but is not limited to, the structure of the order arrival time estimation system described in this embodiment, and all structural modifications and substitutions in the prior art made according to the principles of the present invention are included in the protection scope of the present invention.

In summary, compared with the prior art, the order delivery time estimation method, system, medium and electronic device of the present invention estimate the delivery time of an order in advance through the delivery time estimation model. While satisfying the in-warehouse picking duration requirement and the rider delivery duration requirement, this improves the efficiency of each link in the order delivery process and reduces the user's waiting time, bringing a better experience to the user. The invention provides a LightGBM model based on quantile regression, together with a method and a system for estimating delivery time in which specific scenarios are added: data mining is performed on historical order information, relevant features are extracted, a quantile suited to the requirements of the business scenario is set on the prediction range of the conventional LightGBM model, and the extracted features are cleaned and fed into the delivery time estimation model for training. After training, the model is imported into a terminal, so that the terminal can make generalized adjustments to the predicted time in combination with specific scenario trigger conditions, and the prediction results are more accurate for users over a large range, for many users, and across a wide variety of scenarios. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.

The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims

1. An order arrival time estimating method is characterized by comprising the following steps:

acquiring historical order information;

performing feature extraction on the historical order information to obtain target features;

processing the target characteristics to obtain sample data;

and training a delivery time estimation model based on the sample data, acquiring the trained delivery time estimation model, and estimating the delivery time of the target order based on the trained delivery time estimation model.

2. The method of claim 1, wherein the step of processing the target feature to obtain sample data comprises the steps of:

performing duplicate removal and cleaning operation on the target characteristics to obtain original available data;

carrying out exception processing on the original available data to obtain data to be coded;

and encoding the data to be encoded to obtain the sample data.

3. The method of claim 1, wherein the training of the arrival time estimation model based on the sample data to obtain the trained arrival time estimation model comprises the following steps:

inputting the sample data into the delivery time estimation model to train the delivery time estimation model;

calculating a loss value corresponding to a loss function of the arrival time estimation model once every time of training until the loss value does not decrease any more, stopping training, and obtaining a trained arrival time estimation model; the loss function is a quantile function.

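4. The method of claim 3, wherein the historical order information comprises order information corresponding to at least one historical order; the formula of the quantile function is as follows:

$$\psi(\xi_i \mid a) = \begin{cases} a\,\xi_i, & \xi_i \ge 0 \\ (a - 1)\,\xi_i, & \xi_i < 0 \end{cases}$$

wherein $a$ represents a preset quantile; $\xi_i$ represents the difference between the value of the delivery time of a historical order estimated by the delivery time estimation model and the true value of the delivery time of that historical order; $\psi(\xi_i \mid a)$ represents the loss value.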

5. The method of claim 3, wherein the step of inputting the sample data into the delivery time estimation model to train the delivery time estimation model comprises the steps of:

dividing the sample data into a training set and a test set;

inputting the training set into the delivery time estimation model to train the delivery time estimation model;

the training of the arrival time estimation model based on the sample data further comprises the following steps:

after the step of obtaining the trained delivery time estimation model, inputting the test set into the trained delivery time estimation model to obtain a fulfillment rate and/or an estimated deviation, so as to test the trained delivery time estimation model; wherein the fulfillment rate is the number of fulfilled orders in the test set divided by the total number of orders in the test set; the estimated deviation is the absolute value of the difference between the delivery time estimated by the trained delivery time estimation model for an order in the test set and the actual delivery time of that order.

6. The method of claim 1, wherein the delivery time estimation model is a LightGBM model; the delivery time estimation model comprises at least one submodel; the formula of the delivery time of the target order estimated by the trained delivery time estimation model is as follows:

$$F_m(x) = \sum_{i=1}^{m} w_i f_i(x)$$

wherein $w_i$ denotes the weight corresponding to $f_i(x)$.

7. The method of claim 1, wherein the estimating the time of arrival of the target order based on the trained time of arrival estimation model comprises the following steps:

coding the trained delivery time estimation model to obtain a PMML file;

and importing the PMML file into a terminal to realize that the trained delivery time estimation model is deployed on the terminal so that the terminal estimates the delivery time of the target order.

8. An order arrival time estimation system, comprising: the device comprises an information acquisition module, a feature extraction module, a data acquisition module and a time estimation module;

the information acquisition module is used for acquiring historical order information;

the characteristic extraction module is used for extracting the characteristics of the historical order information to obtain target characteristics;

the data acquisition module is used for processing the target characteristics to acquire sample data;

the time estimation module is used for training a delivery time estimation model based on the sample data, acquiring the trained delivery time estimation model, and estimating the delivery time of the target order based on the trained delivery time estimation model.

9. A storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of estimating time to delivery of an order according to any one of claims 1 to 7.

10. An electronic device, comprising: a processor and a memory;

the memory is used for storing a computer program;

the processor is configured to execute the computer program stored in the memory to cause the electronic device to perform the order arrival time estimation method of any one of claims 1 to 7.