CN114358692A - Distribution time length adjusting method and device and electronic equipment - Google Patents

Info

Publication number
CN114358692A
Authority
CN
China
Prior art keywords
adjustment
information
time length
adjustment information
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210016491.3A
Other languages
Chinese (zh)
Inventor
王星
朱麟
高兴兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rajax Network Technology Co Ltd
Original Assignee
Rajax Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rajax Network Technology Co Ltd filed Critical Rajax Network Technology Co Ltd
Priority to CN202210016491.3A priority Critical patent/CN114358692A/en
Publication of CN114358692A publication Critical patent/CN114358692A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present application provide a delivery duration adjustment method and apparatus, an electronic device, and a computer-readable storage medium, and relate to the field of computers. Based on a reinforcement learning model, target adjustment information corresponding to the first estimated delivery duration of a target order is predicted according to the characteristics of the order parameter information of the target order. Because the preset adjustment model makes full use of the order parameter information of the target order during prediction, the first estimated delivery duration is adjusted in a way that improves the accuracy of the delivery duration estimate.

Description

Distribution time length adjusting method and device and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for adjusting a delivery duration, an electronic device, and a computer-readable storage medium.
Background
With the continuous development of internet technology, the purchase mode of ordering goods on line gradually becomes a new choice for the public. In the consumption mode of ordering commodities on line, ordered commodities are immediately delivered to a user, and convenient consumption experience is brought.
In scenarios such as ordering goods online, after the shopping platform receives an order placed through a client, it can determine an estimated delivery duration for the order according to order-related information (such as the preparation duration and the order distance). Because the estimated delivery duration is only an estimate based on this information, some inaccuracy is unavoidable. In the related art, the estimated delivery duration can be adjusted so that the delivery duration displayed to the user is more accurate. However, the accuracy of the adjusted estimated delivery duration in the related art is still low.
Disclosure of Invention
The object of the present application is to solve at least one of the above-mentioned technical drawbacks, in particular, the technical drawback of low accuracy in estimating the delivery duration.
According to an aspect of the present application, there is provided a delivery duration adjustment method, including: obtaining order parameter information of a target order and a first estimated delivery time length;
inputting the order parameter information and the first estimated delivery time length into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; the first weight value indicates a probability corresponding to each of the candidate adjustment information;
the candidate adjusting information comprises a first adjusting operation and an operation duration of the first adjusting operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing reinforcement learning on a training sample set;
determining target adjustment information based on the first weight value and the candidate adjustment information.
Optionally, before obtaining the order parameter information of the target order and the first pre-estimated delivery duration, the method includes:
acquiring a training sample set; wherein the training sample set includes, for each sample order, sample order parameter information, a sample estimated delivery duration and a sample actual delivery duration;
for each sample order, inputting the sample order parameter information and the sample estimated delivery duration into an initial training model to obtain a predicted weight value for each candidate adjustment information;
determining a first distribution accuracy rate of the sample order after the candidate adjustment operation is executed according to the predicted weight value and the actual sample distribution time length;
determining a training loss value according to the first distribution accuracy rate;
and repeatedly training the neural network model based on the sample order information and the training loss value corresponding to the sample order until the preset adjustment model meeting the training end condition is obtained.
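The training steps above can be illustrated with a deliberately small sketch. It is not the patented training procedure: it swaps the neural network for a linear softmax policy, uses the exact expected gradient instead of sampled actions, and omits the correction value. The candidate set, the feature vectors, the `accuracy` definition (1.0 when the estimate is within a tolerance of the actual duration) and all function names are assumptions made for illustration only.

```python
import math

# Hypothetical candidate adjustments, as signed minutes (negative = subtract time).
CANDIDATES = [-2.0, -1.0, 1.0, 2.0]


def softmax(logits):
    """Turn raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def accuracy(estimate, actual, tolerance=0.5):
    """Assumed accuracy measure: 1.0 if the estimate is within the tolerance."""
    return 1.0 if abs(estimate - actual) <= tolerance else 0.0


def policy(weights, features):
    """Predicted weight value (probability) of each candidate adjustment."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(logits)


def train(samples, epochs=200, lr=0.5):
    """Gradient ascent on the expected accuracy improvement over the baseline."""
    n_features = len(samples[0][0])
    weights = [[0.0] * n_features for _ in CANDIDATES]
    for _ in range(epochs):
        for features, first_estimate, actual in samples:
            probs = policy(weights, features)
            # Baseline: accuracy of the unadjusted estimate for this sample order.
            baseline = accuracy(first_estimate, actual)
            gains = [accuracy(first_estimate + a, actual) - baseline
                     for a in CANDIDATES]
            expected = sum(p * g for p, g in zip(probs, gains))
            for k in range(len(CANDIDATES)):
                # Exact gradient of the expected gain with respect to logit k.
                grad = probs[k] * (gains[k] - expected)
                for j in range(n_features):
                    weights[k][j] += lr * grad * features[j]
    return weights


# Two toy sample orders: (features, estimated minutes, actual minutes).
samples = [([1.0, 1.0], 30.0, 32.0), ([1.0, -1.0], 25.0, 24.0)]
trained = train(samples)
```

After training on these two toy samples, the policy concentrates its probability on adding 2 minutes for the first order and subtracting 1 minute for the second, which are the only adjustments that make each estimate match its actual duration.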
Optionally, the determining a training loss value according to the first distribution accuracy includes:
for each sample order, determining an accuracy improvement value according to the first delivery accuracy of the sample order and the actual delivery accuracy of the sample order, wherein the actual delivery accuracy of the sample order is determined according to the sample actual delivery duration of the sample order;
and determining the training loss value based on the accuracy improvement value and a correction value, wherein the correction value is determined according to the relation between the predicted adjustment information and the first preset adjustment information, and the predicted adjustment information is determined according to each candidate adjustment information and the corresponding predicted weight value.
Optionally, the data relationship corresponding to the training loss value is as follows:
Figure BDA0003461142480000021
wherein $L(\theta \mid s)$ represents the training loss value,
Figure BDA0003461142480000022
represents the gradient value of $x$; $E(x)$ represents the expected value of $x$;
$s$ represents the sample order, and $A$ represents the set of candidate adjustment information;
$p_\theta(a \mid s)$ represents the probability that $s$ performs each candidate adjustment information $a$ in $A$;
$r(a)$ represents the first delivery accuracy;
(s) represents the actual delivery accuracy;
$\lambda \times l(A)$ represents the correction value, and $\lambda$ represents a preset adjustment intensity factor;
$l(A)$ denotes a correction term, with $l(A) = \max\big(t(A) - t_u,\, 0\big) + \max\big(-(t(A) - t_l),\, 0\big)$, where $t(A)$ represents the predicted adjustment duration of the predicted adjustment information, and $[t_l, t_u]$ represents the preset adjustment duration interval of the first preset adjustment information.
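As a small illustration of the correction term defined above, the following sketch computes l(A) and the correction value λ × l(A). The function and parameter names are hypothetical; only the formula itself comes from the text. The term is zero while the predicted adjustment duration stays inside the preset interval and grows linearly once it leaves the interval on either side.

```python
def correction_term(predicted_adjustment, lower, upper):
    """l(A) = max(t(A) - t_u, 0) + max(-(t(A) - t_l), 0)."""
    over = max(predicted_adjustment - upper, 0.0)      # amount above t_u
    under = max(-(predicted_adjustment - lower), 0.0)  # amount below t_l
    return over + under


def correction_value(predicted_adjustment, lower, upper, strength):
    """lambda * l(A), where strength is the preset adjustment intensity factor."""
    return strength * correction_term(predicted_adjustment, lower, upper)
```

For example, with an interval of [-2, 2] minutes, a predicted adjustment of 3 minutes gives l(A) = 1, a predicted adjustment of -3.5 minutes gives l(A) = 1.5, and any value inside the interval gives 0.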
Optionally, the order parameter information includes at least one of the following:
the number of blocked orders;
the estimated preparation duration of the order;
the historical average preparation duration of orders;
the order delivery distance;
the historical average delivery duration of orders.
Optionally, the method further includes:
determining a second estimated distribution time length according to the first estimated distribution time length and the target adjustment information;
and sending the second estimated delivery time length to a client, and indicating the client to display the second estimated delivery time length on a user interface.
Optionally, the determining target adjustment information based on the first weight value and the candidate adjustment information includes at least one of:
determining the candidate adjustment information with the maximum corresponding first weight value as the target adjustment information;
and according to each candidate adjusting information and the corresponding first weight value, carrying out weighting processing on the operation duration of each first adjusting operation to obtain the target adjusting information.
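The two options above can be sketched as follows. This is a minimal illustration under assumptions: each candidate is represented as a signed number of minutes (positive for a time-adding operation, negative for a time-subtracting operation), and the function names are hypothetical.

```python
def select_by_max_weight(candidates, weights):
    """Return the candidate whose first weight value is largest."""
    best = max(range(len(candidates)), key=lambda i: weights[i])
    return candidates[best]


def select_by_weighted_duration(candidates, weights):
    """Return the weighted average of the signed operation durations."""
    total = sum(weights)
    return sum(c * w for c, w in zip(candidates, weights)) / total


candidates = [-2.0, -1.0, 1.0, 2.0]   # signed minutes
weights = [0.1, 0.2, 0.3, 0.4]        # first weight values (probabilities)
hard_choice = select_by_max_weight(candidates, weights)
soft_choice = select_by_weighted_duration(candidates, weights)
```

The first option always yields one of the preset candidates, while the second can yield an intermediate duration that is not itself in the candidate set.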
According to an aspect of the present application, there is provided a delivery duration adjustment method, including: receiving a second pre-estimated delivery time length sent by a server, wherein the second pre-estimated delivery time length is determined according to a first pre-estimated delivery time length and target adjustment information, the target adjustment information is determined based on each candidate adjustment information and a first weight value corresponding to each candidate adjustment information, and the first weight value corresponding to each candidate adjustment information is determined through a preset adjustment model based on order parameter information and the first pre-estimated delivery time length;
wherein the first weight value indicates a probability that each of the candidate adjustment information corresponds to;
the candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing machine learning on a training sample set;
and displaying the second estimated delivery time length on a user interface.
According to another aspect of the present application, there is provided a delivery duration adjustment apparatus, the apparatus including:
the acquisition module is used for acquiring order parameter information of the target order and the first estimated delivery time length;
the prediction module is used for inputting the order parameter information and the first estimated delivery time length into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; the first weight value indicates a probability corresponding to each of the candidate adjustment information;
the candidate adjusting information comprises a first adjusting operation and an operation duration of the first adjusting operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing reinforcement learning on a training sample set;
a determining module, configured to determine target adjustment information based on the first weight value and the candidate adjustment information.
According to another aspect of the present application, there is provided a delivery duration adjustment apparatus, the apparatus including:
the receiving module is used for receiving second estimated delivery time sent by the server, the second estimated delivery time is determined according to first estimated delivery time and target adjustment information, the target adjustment information is determined based on each candidate adjustment information and first weighted values respectively corresponding to the candidate adjustment information, and the first weighted value corresponding to each candidate adjustment information is determined through a preset adjustment model based on order parameter information and the first estimated delivery time;
wherein the first weight value indicates a probability that each of the candidate adjustment information corresponds to;
the candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing machine learning on a training sample set;
and the display module is used for displaying the second estimated delivery time length on a user interface.
According to another aspect of the present application, there is provided an electronic device including:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the delivery duration adjustment method of any one of the first or second aspects.
For example, in a third aspect of the present application, there is provided a computing device comprising a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the delivery duration adjustment method according to any one of the first aspect and the second aspect of the application.
According to yet another aspect of the present application, there is provided a computer-readable storage medium on which a computer program is stored,
wherein the computer program, when executed by a processor, implements the delivery duration adjustment method of any one of the first or second aspects.
For example, in a fourth aspect of the embodiments of the present application, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the delivery duration adjustment method according to any one of the first or second aspects of the present application.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the methods provided in the various alternative implementations of the first aspect or the second aspect described above.
The technical solutions provided by the present application bring the following beneficial effects:
Order parameter information of a target order and a first estimated delivery duration are obtained; the order parameter information and the first estimated delivery duration are input into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; and target adjustment information is determined based on the first weight values and the candidate adjustment information. In this way, the target adjustment information corresponding to the first estimated delivery duration of the target order is predicted based on a reinforcement learning model according to the characteristics of the order parameter information of the target order. Because the preset adjustment model makes full use of the order parameter information of the target order during prediction, the first estimated delivery duration is adjusted in a way that improves the accuracy of the delivery duration estimate.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a system architecture diagram of a delivery duration adjustment method according to an embodiment of the present disclosure;
Fig. 2 is a schematic structural diagram of a framework of a delivery duration adjustment system according to an embodiment of the present disclosure;
Fig. 3 is a first schematic flowchart of a delivery duration adjustment method according to an embodiment of the present disclosure;
Fig. 4 is a second schematic flowchart of a delivery duration adjustment method according to an embodiment of the present application;
Fig. 5 is a first schematic structural diagram of a delivery duration adjustment apparatus according to an embodiment of the present disclosure;
Fig. 6 is a second schematic structural diagram of a delivery duration adjustment apparatus according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device for adjusting a delivery duration according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
At least part of the delivery duration adjustment method provided by the embodiments of the present application relates to machine learning and other areas of artificial intelligence, and also relates to various areas of cloud technology, such as cloud computing, cloud services, and related data computing and processing in the field of big data.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Machine Learning (ML for short) is a multi-disciplinary subject that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other fields. It specifically studies how a computer can simulate or realize human learning behavior so as to acquire new knowledge or skills and reorganize its existing knowledge structure to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and formal education learning.
To further illustrate the technical solutions provided by the embodiments of the present application, the following detailed description is made with reference to the accompanying drawings and the detailed description. Although the embodiments of the present application provide method steps as shown in the following embodiments or figures, more or fewer steps may be included in the method based on conventional or non-inventive efforts. In steps where no necessary causal relationship exists logically, the order of execution of the steps is not limited to that provided by the embodiments of the present application.
Fig. 1 is a system architecture diagram of a delivery duration adjustment method according to an embodiment of the present application. The system may include a server 101 and a terminal cluster, wherein the server 101 may be regarded as a background server providing the delivery duration adjustment process.
The terminal cluster may include terminal 102, terminal 103, terminal 104, and so on. A client supporting the delivery duration adjustment process or commodity transactions is installed on each terminal, and the terminals may include, for example, a deliverer terminal and a user terminal. There may be communication connections between the terminals, for example, a communication connection between terminal 102 and terminal 103, and a communication connection between terminal 103 and terminal 104.
Meanwhile, the server 101 may provide a service for the terminal cluster through a communication connection function, and any terminal in the terminal cluster may have a communication connection with the server 101, for example, a communication connection exists between the terminal 102 and the server 101, and a communication connection exists between the terminal 103 and the server 101, where the communication connection is not limited to a connection manner, and may be directly or indirectly connected through a wired communication manner, may also be directly or indirectly connected through a wireless communication manner, and may also be through other manners.
The communicatively coupled network may be a wide area network or a local area network, or a combination thereof. The application is not limited thereto.
The delivery duration adjusting method in the embodiment of the present application may be executed on a server side or a terminal side, and the execution subject is not limited in the embodiment of the present application. In the distribution time length adjusting process, a user can trigger an ordering operation through a client installed on the terminal device, the server adjusts the first estimated distribution time length (namely, executes the time-adding or time-subtracting process) according to the order parameter information and the first estimated distribution time length, sends the adjusted second estimated distribution time length to the terminal, and then the terminal displays the second estimated distribution time length to the user through the client.
Therefore, the method provided by the embodiment of the present application may be executed by a computer device, which includes but is not limited to a terminal (also including the user terminal described above) or a server (also including the server 101 described above). The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
Of course, the method provided in the embodiment of the present application is not limited to be used in the application scenario shown in fig. 1, and may also be used in other possible application scenarios, and the embodiment of the present application is not limited. The functions that can be implemented by each device in the application scenario shown in fig. 1 will be described in the following method embodiments, and will not be described in detail herein.
As an alternative embodiment, fig. 2 shows a schematic structural diagram of a framework of a delivery duration adjustment system provided by the present application. As shown in fig. 2, the system includes two parts: an offline part and an online part. The offline part mainly performs model training. The online part mainly uses the prediction function of the model to adjust the initial estimated delivery duration (the first estimated delivery duration) of an order according to the actual order placed by the user.
As shown in fig. 1, the processing performed in the online portion of the embodiment of the present application includes:
(1) After a user triggers an order placing operation through a client, order parameter information of the target order is acquired.
(2) The initial estimated delivery duration is obtained. For convenience of description, the initial estimated delivery duration is referred to as the first estimated delivery duration; it may be predicted by a learning model for duration prediction.
(3) Based on the order parameter information and the first estimated delivery duration, the preset adjustment model of the embodiment of the present application predicts a first weight value corresponding to each candidate adjustment information for adjusting the first estimated delivery duration, and target adjustment information for adjusting the first estimated delivery duration is output based on the first weight values and the candidate adjustment information.
(4) The first estimated delivery duration is adjusted based on the target adjustment information to obtain a second estimated delivery duration, namely the estimated delivery duration displayed to the user.
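The online steps above can be sketched as a single function. This is an illustrative outline only: the preset adjustment model is replaced by a stub that returns fixed weight values, candidates are represented as signed minutes, the target adjustment is chosen by maximum weight, and every name here is hypothetical.

```python
def adjust_delivery_duration(order_params, first_estimate, model, candidates):
    """Return the second estimated delivery duration for one target order."""
    # Step (3): one first weight value per candidate adjustment.
    weights = model(order_params, first_estimate)
    # Pick the candidate with the largest weight as the target adjustment.
    best = max(range(len(candidates)), key=lambda i: weights[i])
    target_adjustment = candidates[best]
    # Step (4): apply the signed adjustment to the first estimate.
    return first_estimate + target_adjustment


def stub_model(order_params, first_estimate):
    """Stand-in for the trained preset adjustment model (fixed output)."""
    return [0.1, 0.6, 0.2, 0.1]


candidates = [-2.0, -1.0, 1.0, 2.0]                      # signed minutes
order_params = {"blocked_orders": 3, "distance_km": 1.2}  # steps (1) and (2) inputs
second_estimate = adjust_delivery_duration(order_params, 30.0, stub_model, candidates)
```

With the stub weights shown, the largest weight belongs to subtracting 1 minute, so a first estimate of 30 minutes becomes a second estimate of 29 minutes.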
The processing executed by the offline part of the embodiment of the present application is as follows:
An initial model is trained using reinforcement learning to obtain the preset adjustment model.
The reinforcement learning model may adopt a Reinforcement Learning (RL) algorithm such as Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), or Asynchronous Advantage Actor-Critic (A3C), and is not limited herein.
In the embodiment of the present application, the actions and rewards of reinforcement learning are specifically designed as follows:
Action: the candidate adjustment information for the target order, where the candidate adjustment information includes a time-adding or time-subtracting operation on the first estimated delivery duration and the corresponding operation duration.
Reward: the delivery accuracy of the target order after the candidate adjustment operation indicated by each candidate adjustment information is performed on the target order.
In the embodiment of the present application, reinforcement learning training is used as follows: the adjustment operation (time-adding or time-subtracting operation) on the first estimated delivery duration serves as the Action in reinforcement learning, and the delivery accuracy of the target order after adjustment by each Action serves as the Reward. By maximizing the Reward, the Action corresponding to the target order is determined, namely the time-adding or time-subtracting operation on the first estimated delivery duration and the corresponding operation duration.
The embodiment of the present application provides a possible implementation manner, and the scheme may be executed by any electronic device, and optionally, any electronic device may be a server device having a delivery duration adjustment capability, or may be a device or a chip integrated on these devices. As shown in fig. 3, which is a schematic flow chart of a delivery duration adjustment method according to an embodiment of the present application, the method includes the following steps:
Step S301: obtaining order parameter information of the target order and a first estimated delivery duration.
In the embodiment of the present application, the target order may include an order corresponding to an order placing operation triggered by a user (a customer user) through a user client. Alternatively, in a particular scenario, the target order may include an order for a user to purchase an item at an online store, an order for a meal, and so forth.
The order parameter information represents related parameter information related to the target order, and the order parameter information may specifically include information such as user identification Information (ID), information of items or food items placed and purchased by the user, amount of money paid by the order, number of blocked orders, estimated order preparation time, historical average order preparation time of the order, order distribution distance, and historical average distribution time of the order.
Optionally, information in the order parameter information such as the user ID, the items or meals ordered, the amount paid for the order, and the order delivery distance may be obtained by querying the information of the target order after the user triggers the order placing operation. Information such as the number of blocked orders, the historical average preparation duration of orders, and the historical average delivery duration of orders may be obtained by querying a database. The estimated preparation duration of the order may be estimated from the number of items ordered by the user and the time needed to prepare each item.
The first estimated delivery duration represents an estimated delivery duration of the target order. It may be estimated from the historical average delivery duration of orders. Alternatively, it may be predicted by a learning model for duration prediction based on order parameter information such as the number of blocked orders, the estimated preparation duration of the order, the historical average preparation duration of orders, the order delivery distance, and the historical average delivery duration of orders. The learning model for duration prediction may be pre-trained on the actual delivery durations of historical orders.
Step S302: input the order parameter information and the first estimated delivery duration into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information.
The first weight value indicates the probability corresponding to each candidate adjustment information, that is, the probability of applying that candidate adjustment information to the first estimated delivery duration.
The candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation. The first adjustment operation is a time-adding operation or a time-subtracting operation on the first estimated delivery duration; for example, a time-adding operation adds 1 minute or 2 minutes to the first estimated delivery duration, and a time-subtracting operation subtracts 1 minute or 2 minutes from it.
The preset adjustment model is obtained by performing reinforcement learning on a training sample set.
In the embodiment of the application, the preset adjustment model is a model obtained by pre-training. Optionally, the preset adjustment model may be obtained by reinforcement learning training. Specifically, in the embodiment of the present application, the reinforcement learning involves an action (Action) and a reward (Reward), which are designed as follows:
Action: the candidate adjustment information for the target order. The candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation is a time-adding operation or a time-subtracting operation on the first estimated delivery duration. For example, the candidate adjustment information may include adding 1 minute, adding 2 minutes, subtracting 1 minute, subtracting 2 minutes, and the like. The candidate adjustment information may be preset according to a requirement on the actual delivery duration. For example, if in the actual order delivery process the first estimated delivery duration is 20 minutes and the actual delivery duration is required not to exceed 30 minutes, the operation duration of the time-adding operation in the candidate adjustment information is less than 10 minutes. The operation duration in the candidate adjustment information may also be set directly, for example by setting the operation duration of the time-adding operation to 5 minutes to 15 minutes.
Reward: the delivery accuracy of the target order after the candidate adjustment operation indicated by each candidate adjustment information is performed on the target order.
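The way a candidate action set might be preset from a duration requirement can be sketched as follows. This is a minimal illustration only: the names `CandidateAdjustment` and `build_candidates`, the whole-minute granularity, and the `max_subtract_min` parameter are assumptions made for the example and do not appear in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateAdjustment:
    """One candidate action: a signed change, in minutes, to the first estimate.

    Positive values are time-adding operations, negative values are
    time-subtracting operations (hypothetical encoding).
    """
    delta_minutes: int


def build_candidates(first_estimate_min, max_actual_min, max_subtract_min=2):
    """Enumerate whole-minute candidates from a requirement on the actual duration.

    Time-adding durations are kept strictly below the headroom
    (max_actual_min - first_estimate_min), mirroring the 20-minute / 30-minute
    example in which the added time is less than 10 minutes.
    """
    headroom = max_actual_min - first_estimate_min
    subs = [CandidateAdjustment(-d) for d in range(max_subtract_min, 0, -1)]
    adds = [CandidateAdjustment(d) for d in range(1, headroom)]
    return subs + adds


# First estimate of 20 minutes, actual delivery required not to exceed 30 minutes.
candidates = build_candidates(first_estimate_min=20, max_actual_min=30)
```

Encoding time-subtracting operations as negative minutes is just one convenient choice; it lets later steps treat every candidate as a single signed number.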
According to the embodiment of the application, by introducing a training mode of reinforcement learning, adjusting operation (time adding or time subtracting operation) on the first estimated delivery time length is used as a behavior Action in the reinforcement learning; under the adjustment of each Action, the delivery accuracy of the target order is used as Reward for reinforcement learning; and determining the optimal Action corresponding to the target order by maximizing Reward, namely determining the adding or subtracting operation of the first estimated delivery time length and the corresponding operation time length by the optimal delivery accuracy.
In the actual prediction process, a first weight value corresponding to each candidate adjustment information can be predicted according to a preset adjustment model, the order parameter information and the first pre-estimated delivery duration; the first weight value is a probability value corresponding to each candidate adjustment information. In this way, when determining the target adjustment information corresponding to the first estimated delivery duration, the target adjustment information may be determined based on each candidate adjustment information and the corresponding first weight value.
Step S303: determining target adjustment information based on the first weight value and the candidate adjustment information.
Specifically, when determining the target adjustment information of the first estimated delivery duration, the candidate adjustment information with the largest first weight value may be determined as the target adjustment information. In addition, one of the candidate adjustment information may be extracted as the target adjustment information from the candidate adjustment information based on the first weight value corresponding to each candidate adjustment information. It can be understood that the probability of each candidate adjustment information being extracted is the first weight value corresponding to the candidate adjustment information.
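The two selection strategies just described (taking the candidate with the largest first weight value, or drawing one candidate with probability equal to its first weight value) could be sketched as below. The function names and the use of signed minutes to represent candidates are assumptions for illustration.

```python
import random


def select_argmax(candidates, weights):
    """Return the candidate whose first weight value is largest."""
    best_candidate, _ = max(zip(candidates, weights), key=lambda cw: cw[1])
    return best_candidate


def select_by_sampling(candidates, weights, rng=None):
    """Draw one candidate; each is drawn with probability equal to its weight."""
    rng = rng or random.Random()
    return rng.choices(candidates, weights=weights, k=1)[0]


# Candidates as signed minutes: +3, -1, +2 with first weight values 0.2, 0.5, 0.3.
candidates = [3, -1, 2]
weights = [0.2, 0.5, 0.3]
target = select_argmax(candidates, weights)          # the -1 minute candidate
sampled = select_by_sampling(candidates, weights, random.Random(0))
```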
In addition, weighting processing may be performed on the operation duration of each first adjustment operation according to each candidate adjustment information and the first weight value corresponding thereto, and the target adjustment information may be determined according to the operation duration after the weighting processing. For example, the candidate adjustment information and the first weight values respectively corresponding thereto are:
candidate adjustment information 1: adding time for 3 minutes, wherein the corresponding first weight value is 0.2;
candidate adjustment information 2: adding time for 5 minutes, wherein the corresponding first weight value is 0.5;
candidate adjustment information 3: adding time for 7 minutes, wherein the corresponding first weight value is 0.3;
the weighted operation duration is calculated as 3 × 0.2 + 5 × 0.5 + 7 × 0.3 = 5.2; therefore, the target adjustment information after the weighting processing is a time-adding operation of 5.2 minutes.
For another example, when the candidate adjustment information includes a time-adding operation and a time-subtracting operation, and when the operation duration is weighted, the operation duration of the time-subtracting operation may be subtracted, for example, the candidate adjustment information and the first weight values respectively corresponding thereto are:
candidate adjustment information 1: adding time for 3 minutes, wherein the corresponding first weight value is 0.2;
candidate adjustment information 2: reducing the time for 1 minute, wherein the corresponding first weight value is 0.5;
candidate adjustment information 3: adding time for 2 minutes, wherein the corresponding first weight value is 0.3;
the weighted operation duration is calculated as 3 × 0.2 - 1 × 0.5 + 2 × 0.3 = 0.7; therefore, the target adjustment information after the weighting processing is a time-adding operation of 0.7 minutes.
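The weighting processing in the two examples above can be reproduced with a short sketch. It assumes, for illustration, that each candidate is written as a pair of signed minutes (negative for a time-subtracting operation) and its first weight value; the function name is hypothetical.

```python
def weighted_adjustment(candidates):
    """Sum signed operation durations weighted by their first weight values.

    candidates: iterable of (signed_minutes, first_weight_value) pairs.
    A positive result is a time-adding operation, a negative result a
    time-subtracting operation.
    """
    return sum(minutes * weight for minutes, weight in candidates)


# First example: all three candidates add time (result is about 5.2 minutes).
example_add_only = weighted_adjustment([(3, 0.2), (5, 0.5), (7, 0.3)])

# Second example: the second candidate subtracts 1 minute (result is about 0.7 minutes).
example_mixed = weighted_adjustment([(3, 0.2), (-1, 0.5), (2, 0.3)])
```

Because the result is a floating-point sum, it should be compared against 5.2 or 0.7 with a small tolerance rather than for exact equality.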
Based on the above method, an application scenario of the embodiment of the present application is described with reference to Example 1:
In a scenario where a user places an order through an online store, after the user triggers the order placing operation through a client, the client usually displays an estimated delivery duration or an estimated delivery time on a user interface. It should be noted that, in this embodiment of the application, the client may be a client of a distributor, that is, the estimated delivery duration is displayed on a user interface of the distribution end; the client may also be a client of the user, that is, the estimated delivery duration is displayed on a user interface of the user. In addition, the estimated delivery duration displayed on the user interface is obtained by adjusting the initial estimated delivery duration (the first estimated delivery duration in the embodiment of the present application) with the method of the embodiment of the present application, that is, by performing a time-adding or time-subtracting operation on the initial estimated delivery duration.
Specifically, after the user triggers the order placing operation, the order parameter information of the order is obtained; for example, it may include the user ID, the ordered meal information, the amount paid for the order, the number of blocked orders, the estimated meal preparation duration of the order, the historical average meal preparation duration of orders, the order delivery distance, and the historical average delivery duration of orders. The first estimated delivery duration of the order is then obtained, and the order parameter information and the first estimated delivery duration are input into the preset adjustment model to obtain a first weight value corresponding to each candidate adjustment information. Target adjustment information is determined from the first weight value corresponding to each candidate adjustment information, and the first estimated delivery duration is adjusted by the target adjustment information to obtain the estimated delivery duration displayed to the user.
In the embodiment of the application, order parameter information of a target order and a first estimated delivery duration are obtained; the order parameter information and the first estimated delivery duration are input into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; and target adjustment information is determined based on the first weight values and the candidate adjustment information. Because the target adjustment information for the first estimated delivery duration is predicted by a reinforcement-learning-based model from the features of the order parameter information of the target order, the adjustment of the first estimated delivery duration makes full use of the order parameter information of the target order, which improves the accuracy of the delivery duration estimate.
In another embodiment of the present application, before obtaining the order parameter information of the target order and the first estimated delivery duration, the method includes:
acquiring a training sample set, wherein the training sample set includes, for each sample order, sample order parameter information, a sample estimated delivery duration, and a sample actual delivery duration;
for each sample order, inputting the sample order parameter information and the sample estimated delivery duration into an initial training model to obtain a predicted weight value of each candidate adjustment information;
determining, according to the predicted weight value and the sample actual delivery duration, a first delivery accuracy of the sample order after the candidate adjustment operation is performed;
determining a training loss value according to the first delivery accuracy;
and repeatedly training the initial training model based on the sample order information and the training loss value corresponding to each sample order until a preset adjustment model meeting the training end condition is obtained.
Before the order parameter information of the target order and the first estimated delivery duration are obtained, the embodiment of the application further includes a training process for the preset adjustment model.
Specifically, the training sample set may include a plurality of sample orders, the sample orders may be historical delivery orders, and the preset adjustment model of the present application is obtained by learning actual delivery conditions of the historical delivery orders in the training process.
In the training process, for each sample order, inputting the sample order parameter information and the sample estimated delivery time length into an initial training model, so as to obtain the predicted weight value of each candidate adjustment information. And then, determining a first distribution accuracy rate of the sample order after the candidate adjustment operation is executed according to the predicted weight value and the actual sample distribution time length.
In the embodiment of the present application, on-time delivery of an order may be understood as the difference between the actual delivery duration of the sample order and the estimated delivery duration displayed to the user falling within a preset time range. For convenience of description, let $T_s$ denote the actual delivery duration, $T_u$ the estimated delivery duration displayed to the user, and $t_c$ the difference between them, so that $t_c = T_s - T_u$. When $t_c$ is within the preset time range, the order is considered to be delivered on time. For example, when $t_c$ is within $[-20\,\text{min}, 0\,\text{min}]$, the order may be considered to be delivered on time.
The estimated delivery duration $T_u$ displayed to the user is the delivery duration obtained after the first adjustment operation, that is, the time-adding or time-subtracting operation, is performed on the first estimated delivery duration. Let $t_0$ denote the first estimated delivery duration and $\Delta t$ the operation duration corresponding to the time-adding or time-subtracting operation; then $T_u = t_0 + \Delta t$.
It is understood that, for each candidate adjustment information, if the delivery duration of the sample order falls within the on-time range under the adjustment operation of that candidate adjustment information, that is, when $t_c$ is within the preset time range, the first delivery accuracy of the sample order for that candidate adjustment operation is the first weight value corresponding to the candidate adjustment information. If, under the adjustment operation of the candidate adjustment information, the delivery duration of the sample order does not fall within the on-time range, the first delivery accuracy of the sample order for that candidate adjustment operation is 0.
For example, if the candidate adjustment information is time-up for 2 minutes, and the first weight value corresponding to the candidate adjustment information is 0.3; if the sample order can be delivered on time under the adjustment operation of adding 2 minutes, the first delivery accuracy rate corresponding to the candidate adjustment information is 30%; if the sample order cannot be delivered on time under the adjustment operation of adding 2 minutes, the first delivery accuracy corresponding to the candidate adjustment information is 0.
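The on-time test and the resulting first delivery accuracy can be sketched as follows. The function names, the default window of [-20 min, 0 min] taken from the example above, and the sample durations are illustrative assumptions.

```python
ON_TIME_WINDOW = (-20.0, 0.0)  # preset range for t_c, in minutes (example value)


def is_on_time(actual_min, displayed_min, window=ON_TIME_WINDOW):
    """An order is on time when t_c = T_s - T_u lies inside the preset window."""
    t_c = actual_min - displayed_min
    return window[0] <= t_c <= window[1]


def first_delivery_accuracy(actual_min, first_estimate_min, delta_min, weight,
                            window=ON_TIME_WINDOW):
    """Return the candidate's weight if the adjusted estimate is on time, else 0.

    delta_min is the signed operation duration (negative for time-subtracting),
    so the displayed estimate is T_u = t_0 + delta_min.
    """
    displayed_min = first_estimate_min + delta_min
    return weight if is_on_time(actual_min, displayed_min, window) else 0.0


# Candidate "add 2 minutes" with weight 0.3 on a 20-minute first estimate.
on_time_case = first_delivery_accuracy(21, 20, 2, 0.3)   # delivered in 21 min -> 0.3
late_case = first_delivery_accuracy(25, 20, 2, 0.3)      # delivered in 25 min -> 0.0
```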
Further, after the first distribution accuracy is determined, the training loss value may be determined according to the first distribution accuracy.
In another embodiment of the present application, the determining a training loss value according to the first delivery accuracy rate includes:
determining an accuracy improvement value according to a first distribution accuracy rate of the sample order and an actual distribution accuracy rate of the sample order aiming at each sample order, wherein the actual distribution accuracy rate of the sample order is determined according to an actual sample distribution time length of the sample order;
and determining the training loss value based on the accuracy improvement value and a correction value, wherein the correction value is determined according to the relation between the predicted adjustment information and the first preset adjustment information, and the predicted adjustment information is determined according to each candidate adjustment information and the corresponding predicted weight value.
Specifically, in the actual delivery process, if a sample order is delivered on time, the actual delivery accuracy of the sample order is 1; if the sample order is not delivered on time, the actual delivery accuracy of the sample order is 0. The sample order is regarded as delivered on time when the difference between its actual delivery duration and the estimated delivery duration displayed to the user is within a preset time range. Here, the estimated delivery duration of the sample order displayed to the user may be determined from the first estimated delivery duration of the sample order and a preset adjustment operation duration; that is, for the sample order, $T_u = t_0 + \Delta t$, where $\Delta t$ is preset.
Further, an accuracy improvement value is determined from the first delivery accuracy of the sample order and the actual delivery accuracy of the sample order. Let $r(A)$ denote the first delivery accuracy and $b(s)$ the actual delivery accuracy; the accuracy improvement value is then $r(A) - b(s)$.
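A small sketch of the accuracy improvement value follows. It assumes that the first delivery accuracy has already been computed and is passed in as a number, and that the actual delivery accuracy is 1 or 0 depending on whether the order was on time under a preset adjustment duration. The function names and the example window are hypothetical.

```python
def actual_delivery_accuracy(actual_min, first_estimate_min, preset_delta_min,
                             window=(-20.0, 0.0)):
    """b(s): 1.0 if the order was on time under the preset adjustment, else 0.0."""
    t_c = actual_min - (first_estimate_min + preset_delta_min)
    return 1.0 if window[0] <= t_c <= window[1] else 0.0


def accuracy_improvement(first_accuracy, actual_accuracy):
    """Accuracy improvement value r(A) - b(s)."""
    return first_accuracy - actual_accuracy


# Sample order delivered in 21 minutes; first estimate 20 minutes, preset adjustment 0.
b_s = actual_delivery_accuracy(actual_min=21, first_estimate_min=20, preset_delta_min=0)
improvement = accuracy_improvement(first_accuracy=0.3, actual_accuracy=b_s)
```

In this example the order was late under the preset adjustment, so any candidate that makes it on time yields a positive improvement.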
In addition, in the embodiment of the application, when the loss value is determined, the correction value can be determined according to the relationship between the predicted adjustment information and the first preset adjustment information; determining the training loss value based on the accuracy boost value and the correction value.
The first preset adjustment information represents a time range of an operation duration of a preset time-adding operation or a preset time-subtracting operation, and for example, the first preset adjustment information may be [6min,10min ].
The predicted adjustment information is determined according to each candidate adjustment information and its corresponding predicted weight value. Specifically, the predicted adjustment information may be the candidate adjustment information with the largest predicted weight value; alternatively, it may be determined by weighting the candidate adjustment information according to the predicted weight values.
It is understood that the correction value is determined based on whether the operation time period of the time-adding operation or the time-subtracting operation of the prediction adjustment information is within a time range (first preset adjustment information) of the operation time period of the time-adding operation or the time-subtracting operation set in advance; when the operation duration of the predicted adjustment information is within the time range of the operation duration of the first preset adjustment information, the correction value is 0, namely, the correction is not needed; and when the operation duration of the predicted adjustment information is not within the time range of the operation duration of the first preset adjustment information, correcting by using a correction value. The correction values can be determined in the following examples of specific data relationships.
In another embodiment of the present application, the data relationship corresponding to the training loss value is as follows:
[Formula image BDA0003461142480000171: data relationship for the training loss value]
where $L(\theta \mid s)$ denotes the training loss value; [formula image BDA0003461142480000172] denotes the gradient of $x$; $E(x)$ denotes the expected value of $x$;
$s$ denotes the sample order, and $A$ denotes the set of candidate adjustment information;
$p_\theta(A \mid s)$ denotes the probability that each candidate adjustment information in $A$ is executed for $s$;
$r(A)$ denotes the first delivery accuracy;
$b(s)$ denotes the actual delivery accuracy;
$\lambda \cdot l(A)$ denotes the correction value, where $\lambda$ denotes a preset adjustment intensity factor;
$l(A)$ denotes the correction term, with $l(A) = \max(t(A) - t_u, 0) + \max(-(t(A) - t_l), 0)$, where $t(A)$ denotes the predicted adjustment duration of the predicted adjustment information and $[t_l, t_u]$ denotes the preset adjustment duration interval of the first preset adjustment information.
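The correction term defined above is zero inside the preset interval and grows linearly with the distance outside it. A minimal sketch, with hypothetical function names and the [6 min, 10 min] interval from the earlier example, is:

```python
def correction_term(t_a, t_l, t_u):
    """l(A) = max(t(A) - t_u, 0) + max(-(t(A) - t_l), 0).

    Zero when t_l <= t_a <= t_u; otherwise the distance from t_a to the
    nearest end of the preset adjustment duration interval.
    """
    return max(t_a - t_u, 0.0) + max(-(t_a - t_l), 0.0)


def correction_value(t_a, t_l, t_u, lam):
    """Correction value lambda * l(A), with lam the preset intensity factor."""
    return lam * correction_term(t_a, t_l, t_u)


inside = correction_term(8.0, 6.0, 10.0)    # within [6, 10]: no correction
above = correction_term(12.0, 6.0, 10.0)    # 2 minutes above the upper bound
below = correction_term(4.0, 6.0, 10.0)     # 2 minutes below the lower bound
```

The value of `lam` is a tuning choice that the text leaves as a preset factor; no particular value is implied here.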
In the embodiment of the application, during the training process, because [formula image BDA0003461142480000173] characterizes the improvement in delivery accuracy, maximizing [formula image BDA0003461142480000174], or minimizing [formula image BDA0003461142480000175], makes the prediction effect of the preset adjustment model progressively better.
Further, the initial training model is repeatedly trained, if a preset training end condition is met, the training is ended, the model at the end of the training is used as the preset adjusting model, if the training end condition is not met, the model parameters of the initial training model are adjusted, and the adjusted model is continuously trained on the basis of each training sample.
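For readers who want a concrete picture of one parameter adjustment, the sketch below uses a generic REINFORCE-style policy-gradient step on a linear softmax policy. This is a stand-in technique chosen for illustration, not the loss relationship of the application, whose formula is not reproduced here. The model form, the function names, the learning rate, and the way the caller computes `advantage` (for example from the accuracy improvement and the correction value) are all assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(W, x):
    """Predicted weight value of each candidate for feature vector x.

    W holds one weight row per candidate adjustment (hypothetical linear model).
    """
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])


def policy_gradient_step(W, x, action, advantage, lr=0.1):
    """One gradient-ascent step on advantage * log p(action | x), in place.

    For a softmax policy, d log p(action) / d logit_k = 1[k == action] - p_k.
    """
    p = action_probs(W, x)
    for k, row in enumerate(W):
        coeff = (1.0 if k == action else 0.0) - p[k]
        for j in range(len(row)):
            row[j] += lr * advantage * coeff * x[j]
    return W


# Three candidates, two order features; a positive advantage for candidate 1
# raises the predicted weight value of that candidate.
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
x = [1.0, 0.5]
before = action_probs(W, x)[1]
policy_gradient_step(W, x, action=1, advantage=0.7)
after = action_probs(W, x)[1]
```

Repeating such steps over the sample orders until an end condition holds (for example a fixed number of passes) would correspond to the repeated training described above.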
In another embodiment of the present application, the method further comprises:
determining a second estimated distribution time length according to the first estimated distribution time length and the target adjustment information;
and sending the second estimated delivery time length to a client, and indicating the client to display the second estimated delivery time length on a user interface.
Specifically, the second estimated delivery duration represents the estimated delivery duration displayed to the user after the first estimated delivery duration has been adjusted by the target adjustment information. Let $T_u$ denote the estimated delivery duration displayed to the user, $t_0$ the first estimated delivery duration, and $\Delta t$ the operation duration corresponding to the time-adding or time-subtracting operation; then $T_u = t_0 + \Delta t$.
Further, the second estimated delivery duration may be sent to a client, and the client is instructed to display the second estimated delivery duration on a user interface. The client may be a client of a distributor, that is, the distribution end; it may also be a client of the user, that is, the user end.
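The computation of the second estimated delivery duration and a message instructing the client to display it might look like the sketch below. The message layout and its field names (`order_id`, `second_estimated_delivery_min`, `action`) are purely hypothetical; the application does not specify a message format.

```python
def second_estimated_duration(t0_min, delta_min):
    """T_u = t_0 + delta_t, with delta_min negative for a time-subtracting operation."""
    return t0_min + delta_min


def build_display_message(order_id, t0_min, delta_min):
    """Assemble a hypothetical payload telling a client what to display."""
    return {
        "order_id": order_id,
        "second_estimated_delivery_min": second_estimated_duration(t0_min, delta_min),
        "action": "display",
    }


# First estimate of 20 minutes with a target adjustment of +5 minutes.
message = build_display_message("order-001", 20, 5)
```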
In another embodiment of the present application, the determining target adjustment information based on the first weight value and the candidate adjustment information includes at least one of:
determining the candidate adjustment information with the maximum corresponding first weight value as the target adjustment information;
and according to each candidate adjusting information and the corresponding first weight value, carrying out weighting processing on the operation duration of each first adjusting operation to obtain the target adjusting information.
Specifically, when determining the target adjustment information for the first estimated delivery duration, the candidate adjustment information with the largest first weight value may be determined as the target adjustment information. One of the candidate adjustment information may also be extracted from the candidate adjustment information as the target adjustment information based on the first weight value corresponding to each candidate adjustment information, and it may be understood that the probability of each candidate adjustment information being extracted is the first weight value corresponding to the candidate adjustment information.
In addition, weighting processing may be performed on the operation duration of each first adjustment operation according to each candidate adjustment information and the first weight value corresponding thereto, and the target adjustment information may be determined according to the operation duration after the weighting processing.
In the embodiment of the application, order parameter information of a target order and a first estimated delivery duration are obtained; the order parameter information and the first estimated delivery duration are input into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; and target adjustment information is determined based on the first weight values and the candidate adjustment information. Because the target adjustment information for the first estimated delivery duration is predicted by a reinforcement-learning-based model from the features of the order parameter information of the target order, the adjustment of the first estimated delivery duration makes full use of the order parameter information of the target order, which improves the accuracy of the delivery duration estimate.
This embodiment of the present application provides a possible implementation that may be executed by any electronic device. Optionally, the electronic device may be a terminal device having a delivery duration adjustment capability, or an apparatus or chip integrated in such a device. Fig. 4 is a schematic flowchart of a delivery duration adjustment method according to an embodiment of the present application. As shown in Fig. 4, the method includes the following steps:
Step S401: receive a second estimated delivery duration sent by a server, wherein the second estimated delivery duration is determined according to a first estimated delivery duration and target adjustment information, the target adjustment information is determined based on each candidate adjustment information and a first weight value corresponding to each candidate adjustment information, and the first weight value corresponding to each candidate adjustment information is determined by a preset adjustment model based on order parameter information and the first estimated delivery duration;
wherein the first weight value indicates a probability that each of the candidate adjustment information corresponds to;
the candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing machine learning on a training sample set;
Specifically, the order parameter information represents parameter information related to the target order. It may specifically include the user identifier (ID), information about the items or meals ordered by the user, the amount paid for the order, the number of blocked orders, the estimated preparation (stock-keeping) duration of the order, the historical average preparation duration of orders, the order delivery distance, and the historical average delivery duration of orders.
The user ID, the item or meal information, the amount paid, and the order delivery distance may be obtained by querying the information of the target order after the user triggers the order placing operation; the number of blocked orders, the historical average preparation duration, and the historical average delivery duration may be obtained by querying a database; and the estimated preparation duration may be estimated from the quantity of items ordered by the user and the time needed to prepare each item.
The first estimated delivery duration represents an estimated delivery duration of the target order. It may be estimated from the historical average delivery duration of orders. Alternatively, it may be predicted by a duration prediction learning model from order parameter information such as the number of blocked orders, the estimated preparation duration, the historical average preparation duration, the order delivery distance, and the historical average delivery duration; the duration prediction learning model may be obtained by pre-training on the actual delivery durations of historical orders.
In the embodiment of the application, the preset adjustment model is a model obtained by pre-training. Optionally, the preset adjustment model may be obtained by training based on reinforcement learning. Specifically, in the embodiment of the present application, the Action and the Reward of the reinforcement learning are designed as follows:
Action: the candidate adjustment information for the target order. The candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation is a time-adding operation or a time-subtracting operation on the first estimated delivery duration. For example, the candidate adjustment information may include adding 1 minute, adding 2 minutes, subtracting 1 minute, subtracting 2 minutes, and the like. The candidate adjustment information may be preset according to a requirement on the actual delivery duration. For example, if in the actual order delivery process the first estimated delivery duration is 20 minutes and the actual delivery duration is required not to exceed 30 minutes, the operation duration of the time-adding operation in the candidate adjustment information is less than 10 minutes. The operation duration in the candidate adjustment information may also be set directly, for example by setting the operation duration of the time-adding operation to 5 minutes to 15 minutes.
Reward: the delivery accuracy of the target order after the candidate adjustment operation indicated by each candidate adjustment information is performed on the target order.
According to the embodiment of the application, by combining reinforcement learning training, adjustment operation (time-increasing or time-decreasing operation) is carried out on the first estimated delivery time length to serve as a behavior Action in reinforcement learning, the delivery accuracy of the target order under the adjustment of each behavior Action serves as Reward of the reinforcement learning, and the behavior Action corresponding to the target order, namely the time-increasing or time-decreasing operation on the first estimated delivery time length and the corresponding operation time length are determined by maximizing the Reward.
In the actual prediction process, the preset adjustment model may predict, according to the order parameter information and the first estimated delivery duration, a first weight value corresponding to each candidate adjustment information, that is, a probability value corresponding to each candidate adjustment information. The target adjustment information corresponding to the first estimated delivery duration may then be determined based on each candidate adjustment information and its corresponding first weight value.
Specifically, when determining the target adjustment information for the first estimated delivery duration, the candidate adjustment information with the largest first weight value may be determined as the target adjustment information. Alternatively, one candidate adjustment information may be drawn from the candidate adjustment information as the target adjustment information based on the first weight values, in which case the probability of each candidate adjustment information being drawn is its corresponding first weight value.
In addition, the operation duration of each first adjustment operation may be weighted according to each candidate adjustment information and its corresponding first weight value, and the target adjustment information may be determined according to the weighted operation duration.
The second estimated delivery duration is the estimated delivery duration displayed to the user after the first estimated delivery duration has been adjusted according to the target adjustment information. Let $T_u$ denote the estimated delivery duration displayed to the user, $T_0$ the first estimated delivery duration, and $\Delta t$ the operation duration of the time-adding or time-subtracting operation (negative for a time-subtracting operation); then $T_u = T_0 + \Delta t$.
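The three ways of determining the target adjustment information (largest weight, drawing by probability, and weighting the operation durations) and the final adjustment of the first estimated delivery duration can be sketched as follows. This is a minimal sketch: the function names, the signed-minute representation of each candidate, and the example weights are assumptions for illustration only.

```python
import random
from typing import Sequence


def select_argmax(deltas: Sequence[float], weights: Sequence[float]) -> float:
    """Return the candidate whose first weight value is largest."""
    best = max(range(len(deltas)), key=lambda i: weights[i])
    return deltas[best]


def select_sampled(deltas: Sequence[float], weights: Sequence[float],
                   rng: random.Random) -> float:
    """Draw one candidate; each is drawn with probability equal to its weight."""
    return rng.choices(list(deltas), weights=list(weights), k=1)[0]


def select_weighted(deltas: Sequence[float], weights: Sequence[float]) -> float:
    """Weight each operation duration by its first weight value and sum."""
    return sum(d * w for d, w in zip(deltas, weights))


def second_estimate(first_estimate: float, delta: float) -> float:
    """Adjusted duration shown to the user: first estimate plus signed delta."""
    return first_estimate + delta


# Hypothetical candidates (+1, +2, -1, -2 minutes) and model outputs.
deltas = [1.0, 2.0, -1.0, -2.0]
weights = [0.5, 0.25, 0.125, 0.125]

print(second_estimate(20.0, select_argmax(deltas, weights)))
print(second_estimate(20.0, select_weighted(deltas, weights)))
print(second_estimate(20.0, select_sampled(deltas, weights, random.Random(0))))
```

Taking the largest weight gives a deterministic adjustment, drawing by probability keeps some exploration, and weighting yields a fractional adjustment that need not coincide with any single candidate.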
Step S402: displaying the second estimated delivery duration on a user interface.
The user interface may be that of the delivery person's client or that of the ordering user's client.
In this embodiment, order parameter information of a target order and a first estimated delivery duration are obtained; the order parameter information and the first estimated delivery duration are input into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; and target adjustment information is determined based on the first weight values and the candidate adjustment information. The target adjustment information for the first estimated delivery duration of the target order is thus predicted by a reinforcement learning model from the characteristics of the order parameter information of the target order. Because the prediction by the preset adjustment model makes full use of the order parameter information of the target order when adjusting the first estimated delivery duration, the accuracy of the delivery duration prediction is improved.
An embodiment of the present application provides a delivery duration adjustment apparatus, and as shown in fig. 5, the delivery duration adjustment apparatus 50 may include: an acquisition module 501, a prediction module 502, and a determination module 503, wherein,
an acquisition module 501, configured to obtain order parameter information of a target order and a first estimated delivery duration;
the prediction module 502 is configured to input the order parameter information and the first estimated delivery duration to a preset adjustment model, so as to obtain a first weight value corresponding to each preset candidate adjustment information; the first weight value indicates a probability corresponding to each of the candidate adjustment information;
the candidate adjusting information comprises a first adjusting operation and an operation duration of the first adjusting operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing reinforcement learning on a training sample set;
a determining module 503, configured to determine target adjustment information based on the first weight value and the candidate adjustment information.
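The division of work among the three modules can be pictured with the following sketch. It assumes, purely for illustration, that the preset adjustment model exposes a scoring function whose outputs are turned into probabilities with a softmax; the application only states that the first weight value indicates a probability, so the softmax, the class name `DeliveryDurationAdjuster`, and the dictionary layout of an order are all hypothetical.

```python
import math
from typing import Any, Callable, Dict, List, Sequence, Tuple

# Stand-in for the trained preset adjustment model (hypothetical interface).
ScoreFn = Callable[[Sequence[float], float], List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class DeliveryDurationAdjuster:
    def __init__(self, score_fn: ScoreFn, deltas: Sequence[float]) -> None:
        self.score_fn = score_fn
        self.deltas = list(deltas)  # signed minutes per candidate

    def acquire(self, order: Dict[str, Any]) -> Tuple[List[float], float]:
        """Acquisition module: order parameters and first estimate."""
        return list(order["features"]), float(order["first_estimate"])

    def predict(self, features: Sequence[float], first_estimate: float) -> List[float]:
        """Prediction module: one first weight value per candidate."""
        return softmax(self.score_fn(features, first_estimate))

    def determine(self, weights: Sequence[float]) -> float:
        """Determination module: here, the candidate with the largest weight."""
        best = max(range(len(weights)), key=lambda i: weights[i])
        return self.deltas[best]


# A fixed scoring function replaces the trained model in this sketch.
adjuster = DeliveryDurationAdjuster(lambda f, t: [0.0, 1.0, 2.0], [-1.0, 1.0, 2.0])
features, first_estimate = adjuster.acquire(
    {"features": [3.0, 12.5], "first_estimate": 20})
weights = adjuster.predict(features, first_estimate)
target_delta = adjuster.determine(weights)
print(weights, target_delta)
```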
In another embodiment of the present application, the apparatus further includes a training module, configured to obtain a training sample set before the order parameter information of the target order and the first estimated delivery duration are obtained; wherein the training sample set includes, for each sample order: sample order parameter information, a sample estimated delivery duration, and a sample actual delivery duration;
for each sample order, inputting the sample order parameter information and the sample estimated delivery duration into an initial training model to obtain a predicted weight value of each candidate adjustment information;
determining, according to the predicted weight values and the sample actual delivery duration, a first delivery accuracy of the sample order after the candidate adjustment operation is performed;
determining a training loss value according to the first delivery accuracy;
and repeatedly training the neural network model based on the sample order information and the training loss value corresponding to each sample order until the preset adjustment model meeting the training end condition is obtained.
In another embodiment of the present application, the training module is configured to determine, for each sample order, an accuracy improvement value according to the first delivery accuracy of the sample order and an actual delivery accuracy of the sample order, where the actual delivery accuracy of the sample order is determined according to the sample actual delivery duration of the sample order;
and to determine the training loss value based on the accuracy improvement value and a correction value, wherein the correction value is determined according to the relationship between predicted adjustment information and first preset adjustment information, and the predicted adjustment information is determined according to each candidate adjustment information and its corresponding predicted weight value.
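The accuracy quantities used in training can be made concrete with a small sketch. The application does not define how delivery accuracy is computed in this passage, so the sketch assumes a simple rule: an estimate counts as accurate when it falls within a tolerance of the sample actual delivery duration. It further assumes that the actual delivery accuracy is the accuracy of the unadjusted sample estimate. The tolerance, the function names, and both assumptions are illustrative only.

```python
from typing import Sequence


def hit(estimate: float, actual: float, tolerance: float) -> float:
    """Assumed accuracy rule: 1.0 if the estimate is within tolerance, else 0.0."""
    return 1.0 if abs(estimate - actual) <= tolerance else 0.0


def first_delivery_accuracy(sample_estimate: float, sample_actual: float,
                            deltas: Sequence[float], weights: Sequence[float],
                            tolerance: float = 1.0) -> float:
    """Expected accuracy after adjustment, averaged over the predicted weights."""
    return sum(w * hit(sample_estimate + d, sample_actual, tolerance)
               for d, w in zip(deltas, weights))


def accuracy_improvement(sample_estimate: float, sample_actual: float,
                         deltas: Sequence[float], weights: Sequence[float],
                         tolerance: float = 1.0) -> float:
    """Gain of the adjusted estimate over the unadjusted one (assumed baseline)."""
    baseline = hit(sample_estimate, sample_actual, tolerance)
    adjusted = first_delivery_accuracy(sample_estimate, sample_actual,
                                       deltas, weights, tolerance)
    return adjusted - baseline


# Hypothetical sample order: estimated 20 minutes, actually delivered in 23.
deltas = [-2.0, -1.0, 1.0, 2.0, 3.0]
weights = [0.125, 0.125, 0.25, 0.25, 0.25]
print(first_delivery_accuracy(20.0, 23.0, deltas, weights))
print(accuracy_improvement(20.0, 23.0, deltas, weights))
```

In this example only the +2 and +3 minute candidates land within one minute of the actual duration, so half of the probability mass counts as accurate, while the unadjusted estimate does not.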
In another embodiment of the present application, the data relationship corresponding to the training loss value is as follows:
Figure BDA0003461142480000231
wherein L (θ | s) represents the training loss value,
Figure BDA0003461142480000232
represents the gradient value of x; E(x) represents the expected value of x;
s represents the sample order, and A represents the set of candidate adjustment information;
$p_\theta(a \mid s)$ represents the probability of performing each candidate adjustment information a in A for s;
r(a) represents the first delivery accuracy;
(s) representing said actual delivery accuracy;
$\lambda \times l(A)$ represents the correction value, and $\lambda$ represents a preset adjustment intensity factor;
$l(A)$ denotes a correction term, where $l(A) = \max(t(A) - t_u, 0) + \max(-(t(A) - t_l), 0)$, $t(A)$ represents the predicted adjustment duration of the predicted adjustment information, and $[t_l, t_u]$ represents the preset adjustment duration interval of the first preset adjustment information.
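The correction term defined above is zero while the predicted adjustment duration stays inside the preset interval and grows linearly once it leaves it. The sketch below evaluates it, assuming that t(A) is obtained by weighting the candidate operation durations with their predicted weight values (one reading of "determined according to each candidate adjustment information and the corresponding predicted weight value"). The function names and the numeric values of λ and the interval are hypothetical.

```python
from typing import Sequence


def predicted_adjustment(deltas: Sequence[float], weights: Sequence[float]) -> float:
    """t(A): weighted operation duration (assumed form of the prediction)."""
    return sum(d * w for d, w in zip(deltas, weights))


def correction_term(t_a: float, t_l: float, t_u: float) -> float:
    """l(A) = max(t(A) - t_u, 0) + max(-(t(A) - t_l), 0)."""
    return max(t_a - t_u, 0.0) + max(-(t_a - t_l), 0.0)


def correction_value(t_a: float, t_l: float, t_u: float, lam: float) -> float:
    """lambda * l(A), with lam as the preset adjustment intensity factor."""
    return lam * correction_term(t_a, t_l, t_u)


# Hypothetical interval of [-5, 10] minutes and intensity factor 0.5.
print(correction_term(3.0, -5.0, 10.0))    # inside the interval
print(correction_term(12.0, -5.0, 10.0))   # above the upper bound
print(correction_term(-7.0, -5.0, 10.0))   # below the lower bound
print(correction_value(12.0, -5.0, 10.0, lam=0.5))
```

Because the term depends only on how far t(A) lies outside the interval, predictions within the interval are never penalized by it.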
In another embodiment of the present application, the order parameter information includes at least one of:
blocking order quantity;
the estimated stock preparation duration of the order;
the historical average stock preparation duration of the order;
an order distribution distance;
the historical average delivery duration of the order.
In another embodiment of the present application, the apparatus further comprises:
a sending module, configured to determine a second estimated delivery duration according to the first estimated delivery duration and the target adjustment information;
and to send the second estimated delivery duration to a client and instruct the client to display the second estimated delivery duration on a user interface.
In another embodiment of the present application, the determining module is configured to determine, as the target adjustment information, the candidate adjustment information with the largest corresponding first weight value;
or
to perform weighting processing on the operation duration of each first adjustment operation according to each candidate adjustment information and its corresponding first weight value, to obtain the target adjustment information.
An embodiment of the present application provides a delivery duration adjustment apparatus. As shown in fig. 6, the delivery duration adjustment apparatus 60 may include a receiving module 601 and a display module 602, wherein,
a receiving module 601, configured to receive a second estimated delivery duration sent by a server, where the second estimated delivery duration is determined according to a first estimated delivery duration and target adjustment information, the target adjustment information is determined based on each candidate adjustment information and the first weight value corresponding to each candidate adjustment information, and the first weight value corresponding to each candidate adjustment information is determined by a preset adjustment model based on order parameter information and the first estimated delivery duration;
wherein the first weight value indicates a probability corresponding to each of the candidate adjustment information;
the candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing machine learning on a training sample set;
a display module 602, configured to display the second estimated delivery duration on a user interface.
The distribution duration adjusting apparatus of this embodiment can perform the distribution duration adjusting method shown in the above embodiments of this application, and the implementation principles thereof are similar, and are not described herein again.
An embodiment of the present application provides an electronic device, including a memory and a processor, with at least one program stored in the memory for execution by the processor. When executed by the processor, the program implements the following: order parameter information of a target order and a first estimated delivery duration are obtained; the order parameter information and the first estimated delivery duration are input into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; and target adjustment information is determined based on the first weight values and the candidate adjustment information. The target adjustment information for the first estimated delivery duration of the target order is thus predicted by a reinforcement learning model from the characteristics of the order parameter information of the target order. Because the prediction by the preset adjustment model makes full use of the order parameter information of the target order when adjusting the first estimated delivery duration, the accuracy of the delivery duration prediction is improved.
In an alternative embodiment, an electronic device is provided. As shown in fig. 7, the electronic device 4000 comprises a processor 4001 and a memory 4003. The processor 4001 is coupled to the memory 4003, for example via a bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004, which may be used for data interaction between the electronic device and other electronic devices, such as transmitting and/or receiving data. In practical applications, the number of transceivers 4004 is not limited to one, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present application.
The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present application. The processor 4001 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 4002 may include a path that carries information between the aforementioned components. The bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 4002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.
The memory 4003 may be a ROM (Read Only Memory) or another type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
The memory 4003 is used to store the application program code (computer program) for executing the solution of the present application, and execution is controlled by the processor 4001. The processor 4001 is configured to execute the application program code stored in the memory 4003 to implement the content shown in the foregoing method embodiments.
The electronic device includes, but is not limited to, a mobile phone, a notebook computer, a multimedia player, a desktop computer, and the like.
The present application provides a computer-readable storage medium, on which a computer program is stored, which, when running on a computer, enables the computer to execute the corresponding content in the foregoing method embodiments.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims (10)

1. A delivery duration adjustment method, characterized by comprising:
obtaining order parameter information of a target order and a first estimated delivery time length;
inputting the order parameter information and the first estimated delivery time length into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; the first weight value indicates a probability corresponding to each of the candidate adjustment information;
the candidate adjusting information comprises a first adjusting operation and an operation duration of the first adjusting operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing reinforcement learning on a training sample set;
determining target adjustment information based on the first weight value and the candidate adjustment information.
2. The delivery duration adjustment method according to claim 1, wherein before obtaining the order parameter information of the target order and the first estimated delivery duration, the method comprises:
acquiring a training sample set; wherein the training sample set includes, for each sample order: sample order parameter information, a sample estimated delivery duration, and a sample actual delivery duration;
for each sample order, inputting the sample order parameter information and the sample estimated delivery duration into an initial training model to obtain a predicted weight value of each candidate adjustment information;
determining, according to the predicted weight values and the sample actual delivery duration, a first delivery accuracy of the sample order after the candidate adjustment operation is performed;
determining a training loss value according to the first delivery accuracy;
and repeatedly training the neural network model based on the sample order information and the training loss value corresponding to each sample order until the preset adjustment model meeting the training end condition is obtained.
3. The delivery duration adjustment method according to claim 2, wherein
the determining a training loss value according to the first delivery accuracy comprises:
for each sample order, determining an accuracy improvement value according to the first delivery accuracy of the sample order and an actual delivery accuracy of the sample order, wherein the actual delivery accuracy of the sample order is determined according to the sample actual delivery duration of the sample order;
and determining the training loss value based on the accuracy improvement value and a correction value, wherein the correction value is determined according to the relationship between predicted adjustment information and first preset adjustment information, and the predicted adjustment information is determined according to each candidate adjustment information and its corresponding predicted weight value.
4. The delivery duration adjustment method according to claim 3, wherein the data relationship corresponding to the training loss value is as follows:
Figure FDA0003461142470000021
wherein L (θ | s) represents the training loss value,
Figure FDA0003461142470000022
represents the gradient value of x; E(x) represents the expected value of x;
s represents the sample order, and A represents the set of candidate adjustment information;
$p_\theta(a \mid s)$ represents the probability of performing each candidate adjustment information a in A for s;
r(a) represents the first delivery accuracy;
(s) representing said actual delivery accuracy;
$\lambda \times l(A)$ represents the correction value, and $\lambda$ represents a preset adjustment intensity factor;
$l(A)$ denotes a correction term, where $l(A) = \max(t(A) - t_u, 0) + \max(-(t(A) - t_l), 0)$, $t(A)$ represents the predicted adjustment duration of the predicted adjustment information, and $[t_l, t_u]$ represents the preset adjustment duration interval of the first preset adjustment information.
5. The delivery duration adjustment method of claim 1, further comprising:
determining a second estimated delivery duration according to the first estimated delivery duration and the target adjustment information;
and sending the second estimated delivery duration to a client, and instructing the client to display the second estimated delivery duration on a user interface.
6. The delivery duration adjustment method according to claim 1,
the determining target adjustment information based on the first weight value and the candidate adjustment information comprises at least one of:
determining the candidate adjustment information with the maximum corresponding first weight value as the target adjustment information;
and according to each candidate adjusting information and the corresponding first weight value, carrying out weighting processing on the operation duration of each first adjusting operation to obtain the target adjusting information.
7. A delivery duration adjustment method, characterized by comprising:
receiving a second estimated delivery duration sent by a server, wherein the second estimated delivery duration is determined according to a first estimated delivery duration and target adjustment information, the target adjustment information is determined based on each candidate adjustment information and the first weight value corresponding to each candidate adjustment information, and the first weight value corresponding to each candidate adjustment information is determined by a preset adjustment model based on order parameter information and the first estimated delivery duration;
wherein the first weight value indicates a probability that each of the candidate adjustment information corresponds to;
the candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing machine learning on a training sample set;
and displaying the second estimated delivery time length on a user interface.
8. A delivery duration adjustment device, comprising:
the acquisition module is used for acquiring order parameter information of the target order and the first estimated delivery time length;
the prediction module is used for inputting the order parameter information and the first estimated delivery time length into a preset adjustment model to obtain a first weight value corresponding to each preset candidate adjustment information; the first weight value indicates a probability corresponding to each of the candidate adjustment information;
the candidate adjusting information comprises a first adjusting operation and an operation duration of the first adjusting operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing reinforcement learning on a training sample set;
a determining module, configured to determine target adjustment information based on the first weight value and the candidate adjustment information.
9. A delivery duration adjustment device, comprising:
a receiving module, configured to receive a second estimated delivery duration sent by a server, wherein the second estimated delivery duration is determined according to a first estimated delivery duration and target adjustment information, the target adjustment information is determined based on each candidate adjustment information and the first weight value corresponding to each candidate adjustment information, and the first weight value corresponding to each candidate adjustment information is determined by a preset adjustment model based on order parameter information and the first estimated delivery duration;
wherein the first weight value indicates a probability that each of the candidate adjustment information corresponds to;
the candidate adjustment information comprises a first adjustment operation and an operation duration of the first adjustment operation; the first adjustment operation comprises a time adding operation or a time subtracting operation aiming at the first estimated delivery time length; the preset adjusting model is obtained by performing machine learning on a training sample set;
and the display module is used for displaying the second estimated delivery time length on a user interface.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the delivery duration adjustment method according to any one of claims 1 to 7.
CN202210016491.3A 2022-01-07 2022-01-07 Distribution time length adjusting method and device and electronic equipment Pending CN114358692A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210016491.3A CN114358692A (en) 2022-01-07 2022-01-07 Distribution time length adjusting method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210016491.3A CN114358692A (en) 2022-01-07 2022-01-07 Distribution time length adjusting method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114358692A true CN114358692A (en) 2022-04-15

Family

ID=81108194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210016491.3A Pending CN114358692A (en) 2022-01-07 2022-01-07 Distribution time length adjusting method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114358692A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114707560A (en) * 2022-05-19 2022-07-05 北京闪马智建科技有限公司 Data signal processing method and device, storage medium and electronic device
CN114707560B (en) * 2022-05-19 2024-02-09 北京闪马智建科技有限公司 Data signal processing method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination