CN112884388B - Training method, device and equipment for management strategy generation model - Google Patents

Training method, device and equipment for management strategy generation model

Info

Publication number
CN112884388B
Authority
CN
China
Prior art keywords
logistics
different
network
management
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911198896.8A
Other languages
Chinese (zh)
Other versions
CN112884388A (en
Inventor
王弋宁
葛倩茹
陈佳琦
林梦婷
魏昊卿
方杰
汤芬斯蒂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SF Technology Co Ltd
Original Assignee
SF Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SF Technology Co Ltd filed Critical SF Technology Co Ltd
Priority to CN201911198896.8A priority Critical patent/CN112884388B/en
Publication of CN112884388A publication Critical patent/CN112884388A/en
Application granted granted Critical
Publication of CN112884388B publication Critical patent/CN112884388B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/08: Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083: Shipping
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The embodiments of the application provide a training method, a training device and training equipment for a management strategy generation model. They are used to train the management strategy generation model and to improve the accuracy of the management strategies it generates, thereby promoting reasonable utilization and reasonable allocation of logistics resources. The training method provided by the embodiments comprises the following steps: acquiring network information of a network of network points, wherein the network information comprises network point information of different network points and association relations of logistics resources among different network points; obtaining, according to the network information, the logistics resource information of different network points and the logistics resource management strategies of different network points; performing model training on an initial model according to the association relations of logistics resources among different network points, the logistics resource information of different network points and the logistics resource management strategies of different network points; and determining the trained model as the management strategy generation model.

Description

Training method, device and equipment for management strategy generation model
Technical Field
The application relates to the field of logistics, in particular to a training method, a training device and training equipment for a management strategy generation model.
Background
In logistics work, a reasonable logistics resource allocation strategy helps a logistics company make full use of its logistics resources and ensures the processing efficiency of logistics operations.
In actual work, after the logistics resources of a network point have been surveyed, a management strategy matching the current logistics resources can be configured. The strategy provides data support for managing the current logistics resources and ensures the stable running of logistics operations.
In practice, however, network points operating under existing management strategies are still found to have under-utilized or insufficient logistics resources, so the accuracy of existing management strategies still needs to be improved.
Disclosure of Invention
The embodiment of the application provides a training method, a training device and training equipment for a management strategy generation model, which are used for training the management strategy generation model, improving the accuracy of the management strategy generated by the management strategy generation model, and further promoting reasonable utilization and reasonable allocation of logistics resources.
In a first aspect, an embodiment of the present application provides a training method for a management policy generation model, where the method includes:
acquiring network information of a network of network points, wherein the network information comprises network point information of different network points and association relations of logistics resources among different network points;
obtaining, according to the network information, the logistics resource information of different network points and the logistics resource management strategies of different network points;
performing model training on an initial model according to the association relations of logistics resources among different network points, the logistics resource information of different network points and the logistics resource management strategies of different network points;
and determining the trained model as the management strategy generation model.
With reference to the first aspect of the embodiment of the present application, in a first possible implementation manner of the first aspect of the embodiment of the present application, performing model training on the initial model according to the association relations of logistics resources among different network points, the logistics resource information of different network points and the logistics resource management strategies of different network points includes:
performing model training on the initial model according to the association relations of logistics resources among different network points, the logistics resource information of different network points and the logistics resource management strategies of different network points, with outputting the management strategy having the lowest risk index as the training target.
With reference to the first possible implementation manner of the first aspect of the embodiment of the present application, in a second possible implementation manner of the first aspect of the embodiment of the present application, the method further includes:
extracting a test management strategy output by the initial model;
extracting the logistics resource project data of the test management strategy and the logistics piece handling capacity in a preset time period;
and calculating the risk index of the test management strategy according to the logistics resource project data and the logistics piece handling capacity.
With reference to the first possible implementation manner of the first aspect of the embodiment of the present application, in a third possible implementation manner of the first aspect of the embodiment of the present application, the association relations of logistics resources among different network points include an association relation between a supporting network point and a supported network point in the logistics resource management process, where the supporting network point is a network point that provides logistics resource support to the supported network point.
With reference to the first aspect of the embodiment of the present application or any implementation manner of the first aspect of the embodiment of the present application, in a fourth possible implementation manner of the first aspect of the embodiment of the present application, the logistics resource information of different network points includes at least one of the following: different processes, different process capacities, different process costs, different equipment capacities, different equipment treatable days, different equipment purchases, different equipment maintenance times, and different equipment allocation costs between different network points.
With reference to the fourth possible implementation manner of the first aspect of the embodiment of the present application, in a fifth possible implementation manner of the first aspect of the embodiment of the present application, the method further includes:
receiving a generation request, wherein the generation request is used for requesting generation of a target management strategy for the current logistics resources of a target network point, and the generation request carries the current logistics resource information of the target network point;
according to the network information, acquiring current logistics resource information of the associated network point and a target association relation of the logistics resources of the associated network point and the target network point;
inputting the current logistics resource information of the target network point, the current logistics resource information of the associated network point and the target association relationship into a management strategy generation model;
and extracting a target management strategy output by the management strategy generation model.
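The request-handling steps above can be sketched as follows. This is only an illustrative outline: the source does not define a data schema or a model interface, so `GenerationRequest`, `NetworkInfo`, `associated_points`, `generate_policy` and `stub_model` are all hypothetical names, and the stub stands in for a trained management strategy generation model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical; the source does not define a schema.
ResourceInfo = Dict[str, float]


@dataclass
class GenerationRequest:
    target_id: str                   # target network point
    current_resources: ResourceInfo  # carried by the generation request


@dataclass
class NetworkInfo:
    resources: Dict[str, ResourceInfo]        # network point ID -> current resources
    associations: Dict[Tuple[str, str], str]  # (point A, point B) -> relation


def associated_points(network: NetworkInfo, target_id: str) -> List[Tuple[str, str]]:
    """Return (associated point ID, relation) pairs involving the target point."""
    pairs = []
    for (a, b), relation in network.associations.items():
        if a == target_id:
            pairs.append((b, relation))
        elif b == target_id:
            pairs.append((a, relation))
    return pairs


def generate_policy(request: GenerationRequest, network: NetworkInfo,
                    model: Callable[[ResourceInfo, Dict[str, ResourceInfo], Dict[str, str]], dict]) -> dict:
    """Collect the three model inputs and return the model's output policy."""
    related = associated_points(network, request.target_id)
    associated_resources = {pid: network.resources[pid] for pid, _ in related}
    relations = {pid: relation for pid, relation in related}
    return model(request.current_resources, associated_resources, relations)


def stub_model(target: ResourceInfo, associated: Dict[str, ResourceInfo],
               relations: Dict[str, str]) -> dict:
    # Placeholder for the trained model: it only reports which points it saw.
    return {"associated_points": sorted(associated), "relations": relations}


network = NetworkInfo(
    resources={"A": {"forklifts": 2.0}, "B": {"forklifts": 5.0}, "C": {"forklifts": 1.0}},
    associations={("B", "A"): "equipment_support"},
)
policy = generate_policy(GenerationRequest("A", {"forklifts": 2.0}), network, stub_model)
```

In this sketch the association lookup is kept separate from the model call, so the same lookup could serve both training data preparation and request handling.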
With reference to the fifth possible implementation manner of the first aspect of the embodiment of the present application, in a sixth possible implementation manner of the first aspect of the embodiment of the present application, the target management policy carries a risk index of the target management policy.
In a second aspect, an embodiment of the present application provides a training apparatus for a management policy generation model, where the apparatus includes:
an acquisition unit, configured to acquire network information of a network of network points, wherein the network information comprises the network point information of different network points and the association relations of logistics resources among different network points;
the acquisition unit being further configured to obtain, according to the network information, the logistics resource information of different network points and the logistics resource management strategies of different network points;
a training unit, configured to perform model training on an initial model according to the association relations of logistics resources among different network points, the logistics resource information of different network points and the logistics resource management strategies of different network points;
and a determining unit, configured to determine the trained model as the management strategy generation model.
With reference to the second aspect of the embodiments of the present application, in a first possible implementation manner of the second aspect of the embodiments of the present application, the training unit is specifically configured to:
perform model training on the initial model according to the association relations of logistics resources among different network points, the logistics resource information of different network points and the logistics resource management strategies of different network points, with outputting the management strategy having the lowest risk index as the training target.
With reference to the first possible implementation manner of the second aspect of the embodiments of the present application, in a second possible implementation manner of the second aspect of the embodiments of the present application, the apparatus further includes a computing unit, configured to:
extracting a test management strategy output by the initial model;
extracting the logistics resource project data of the test management strategy and the logistics piece handling capacity in a preset time period;
and calculating the risk index of the test management strategy according to the logistics resource project data and the logistics piece handling capacity.
With reference to the first possible implementation manner of the second aspect of the embodiment of the present application, in a third possible implementation manner of the second aspect of the embodiment of the present application, the association relations of logistics resources among different network points include an association relation between a supporting network point and a supported network point in the logistics resource management process, where the supporting network point is a network point that provides logistics resource support to the supported network point.
With reference to the second aspect of the embodiment of the present application or any implementation manner of the second aspect of the embodiment of the present application, in a fourth possible implementation manner of the second aspect of the embodiment of the present application, the logistics resource information of different network points includes at least one of the following: different processes, different process capacities, different process costs, different equipment capacities, different equipment treatable days, different equipment purchases, different equipment maintenance times, and different equipment allocation costs between different network points.
With reference to the fourth possible implementation manner of the second aspect of the embodiment of the present application, in a fifth possible implementation manner of the second aspect of the embodiment of the present application, the apparatus further includes:
the receiving unit is used for receiving a generation request, wherein the generation request is used for requesting generation of a target management strategy for the current logistics resources of a target network point, and the generation request carries the current logistics resource information of the target network point;
the acquisition unit is also used for acquiring current logistics resource information of the associated network point and a target association relation of the logistics resources of the associated network point and the target network point according to the network information;
the input unit is used for inputting the current logistics resource information of the target network point, the current logistics resource information of the associated network point and the target association relationship into the management strategy generation model;
and the extraction unit is used for extracting the target management strategy output by the management strategy generation model.
With reference to the fifth possible implementation manner of the second aspect of the embodiment of the present application, in a sixth possible implementation manner of the second aspect of the embodiment of the present application, the target management policy carries a risk index of the target management policy.
In a third aspect, an embodiment of the present application further provides a training device for a management policy generation model, including a processor and a memory, where the memory stores a computer program, and when the processor invokes the computer program in the memory, the processor executes steps in any one of the training methods for the management policy generation model provided in the embodiment of the present application.
In a fourth aspect, embodiments of the present application further provide a computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform steps in any one of the training methods of the management policy generation model provided in the embodiments of the present application.
From the above, the embodiments of the present application have the following beneficial effects:
The training information comprises not only the logistics resource information of different network points and the logistics resource management strategies of different network points, but also the association relations of logistics resources among different network points. During training, the model can therefore be guided to attend to these association relations, so that it handles them in a clearly targeted way. As a result, the management strategies generated by the trained management strategy generation model can improve the cooperation between a local network point and other network points on logistics resources and achieve higher accuracy in logistics resource management, thereby promoting reasonable utilization and reasonable allocation of logistics resources.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a training method of a management policy generation model according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of calculating risk indexes according to an embodiment of the present application;
FIG. 3 is a further flow chart of a training method of a management policy generation model according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a training device of a management policy generation model according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a training device of a management policy generation model according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In the following description, specific embodiments of the present application are described with reference to steps and symbolic representations of operations performed by one or more computers, unless otherwise indicated. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by the computer's processing unit of electronic signals that represent data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the format of the data. However, although the principles of the present application are described in the foregoing context, this is not meant to be limiting, and those skilled in the art will appreciate that the various steps and operations described below may also be implemented in hardware.
The principles of the present application operate using many other general purpose or special purpose operations, communication environments, or configurations. Examples of well known computing systems, environments, and configurations that may be suitable for use with the application include, but are not limited to, hand-held telephones, personal computers, servers, multiprocessor systems, microcomputer-based systems, mainframe computers, and distributed computing environments that include any of the above systems or devices.
The terms "first," "second," and "third," etc. in this application are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion.
First, before the embodiments of the present application are described, the application context of the embodiments is introduced.
The training method of the management policy generation model in this embodiment may be executed by the training device of the management policy generation model provided in this embodiment, or by different types of training equipment into which that training device is integrated, such as a server device, a physical host, or a User Equipment (UE). The training device of the management policy generation model may be implemented in hardware or software, and the UE may specifically be a terminal device such as a smart phone, a tablet computer, a notebook computer, a palmtop computer, a desktop computer, or a Personal Digital Assistant (PDA).
The training equipment of the management strategy generation model may operate independently or as a device cluster. The training method provided by the embodiment of the application is used to train the management strategy generation model and to improve the accuracy of the management strategies it generates, thereby promoting reasonable utilization and reasonable allocation of logistics resources and ensuring the stable running of logistics operations.
Next, a training method of the management policy generation model provided in the embodiment of the present application will be described.
Referring to fig. 1, fig. 1 shows a flowchart of a training method of a management policy generation model according to an embodiment of the present application, and the training method of the management policy generation model provided by the embodiment of the present application includes steps S101 to S104, and the specific details thereof are as follows:
step S101, network information of a network of network points is obtained, wherein the network information comprises the network point information of different network points and the association relation of logistics resources among different network points;
In the embodiment of the application, a network of network points can be configured based on the different network points of a logistics company; the network is composed of these network points.
In the network of network points, the position of a network point may correspond to its geographic location or region; alternatively, the network points may be arranged according to their management architecture, for example by clustering different network points around the transfer center to which they belong; alternatively, the network points may be classified according to their priorities. It is understood that the network of network points is not specifically limited herein.
In the network of network points, both the network point information of each network point and the association relations of logistics resources among different network points can be conveniently configured. The network point information may specifically be information such as the network point identification (ID), the geographic position of the network point, the area of the network point, the vehicles of the network point, the logistics transportation paths of the network point, the personnel of the network point, or the personnel scheduling of the network point. The association relations of logistics resources among different network points may specifically be the transportation paths of logistics vehicles between different network points, the allocation of logistics equipment between different network points, the flow of staff between different network points, or the priorities of different network points in logistics resource allocation, and the like. The network point information and the association relations are not specifically limited herein.
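One way to picture the network information described above is as a small graph whose vertices hold network point information and whose edges hold association relations. The sketch below is a hedged illustration only: the class names, field names (`area_m2`, `vehicles`) and relation labels are assumptions and are not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NetworkPoint:
    point_id: str     # network point ID
    location: str     # geographic position (hypothetical free-text field)
    area_m2: float    # floor area, assumed here to be in square metres
    vehicles: int     # number of vehicles at the point


@dataclass
class Association:
    source_id: str
    target_id: str
    kind: str  # e.g. "vehicle_route", "equipment_allocation", "staff_flow", "priority"


@dataclass
class PointNetwork:
    points: Dict[str, NetworkPoint] = field(default_factory=dict)
    associations: List[Association] = field(default_factory=list)

    def add_point(self, point: NetworkPoint) -> None:
        self.points[point.point_id] = point

    def associate(self, source_id: str, target_id: str, kind: str) -> None:
        # Both ends must already be registered network points.
        if source_id not in self.points or target_id not in self.points:
            raise KeyError("both network points must be registered first")
        self.associations.append(Association(source_id, target_id, kind))

    def relations_of(self, point_id: str) -> List[Association]:
        """All association relations in which the given point takes part."""
        return [a for a in self.associations
                if point_id in (a.source_id, a.target_id)]


network = PointNetwork()
for pid, vehicles in (("P1", 4), ("P2", 9), ("P3", 2)):
    network.add_point(NetworkPoint(pid, location="(unspecified)", area_m2=500.0, vehicles=vehicles))
network.associate("P1", "P2", "equipment_allocation")
network.associate("P3", "P1", "staff_flow")
```

Keeping the relations as explicit records, rather than folding them into each point, makes it straightforward to pass them to a model as a separate input.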
In some embodiments, the above-mentioned logistics resource information of different network points may specifically include at least one of the following:
different processes, different process capacities, different process costs, different equipment capacities, different equipment treatable days, different equipment purchases, different equipment maintenance times, and different equipment allocation costs between different network points.
In some embodiments, the above-mentioned association relations of logistics resources among different network points may further include an association relation between a supporting network point and a supported network point in the logistics resource management process, where the supporting network point is a network point that provides logistics resource support to the supported network point.
For ease of description, a set of equipment support scenarios between network points is described as an example:
the supporting network point gives up allocation support, and the supported network point neither purchases nor maintains equipment, so that the allocation cost, the allocation time and the purchase cost are saved;
the supporting network point gives up allocation support, and the supported network point re-purchases the equipment, so that the allocation cost and the allocation time are saved;
the supporting network point makes a one-way transfer, the supporting equipment permanently belongs to the supported network point, and the supported network point does not maintain the supporting equipment; the one-way fee in the transfer cost and the transfer time are borne, and the purchase cost and the maintenance cost are saved;
the supporting network point makes a one-way transfer, the supporting equipment permanently belongs to the supported network point, and the supported network point maintains the supporting equipment; the one-way fee in the transfer cost and the transfer time are borne, and the purchase cost and the maintenance cost are saved;
the supporting network point makes a two-way transfer, the supporting equipment returns to the supporting network point at the appointed time, the supported network point does not maintain the supporting equipment and does not purchase new equipment; the two-way fee in the transfer cost and the transfer time are borne, and the purchase cost and the maintenance cost are saved;
the supporting network point makes a two-way transfer, the supporting equipment returns to the supporting network point at the appointed time, the supported network point maintains the supporting equipment and does not purchase new equipment; the two-way fee in the transfer cost and the transfer time are borne, and the purchase cost is saved;
the supporting network point makes a two-way transfer, the supporting equipment returns to the supporting network point at the appointed time, the supported network point does not maintain the supporting equipment and purchases new equipment; the two-way fee in the transfer cost and the transfer time are borne, and the maintenance cost of the supporting equipment is saved;
the supporting network point makes a two-way transfer, the supporting equipment returns to the supporting network point at the appointed time, the supported network point maintains the supporting equipment and purchases new equipment; the two-way fee in the transfer cost and the transfer time are borne.
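The scenarios listed above vary along three dimensions: how the equipment is transferred, whether the supported network point maintains it, and whether the supported network point purchases equipment. A minimal sketch of that encoding is given below. The names `Transfer`, `Scenario`, `Fees` and `borne_cost` are hypothetical, the fee values are invented for illustration, and the additive cost rule is a simplifying assumption that may not match how the source accounts for every scenario (for example, the sketch always charges a maintenance fee when the supported point maintains the equipment).

```python
from dataclasses import dataclass
from enum import Enum


class Transfer(Enum):
    NONE = 0     # supporting point gives up allocation support
    ONE_WAY = 1  # equipment stays permanently with the supported point
    TWO_WAY = 2  # equipment returns to the supporting point at the appointed time


@dataclass(frozen=True)
class Scenario:
    transfer: Transfer
    maintains: bool   # supported point maintains the supporting equipment
    purchases: bool   # supported point purchases equipment


@dataclass(frozen=True)
class Fees:
    one_way_transfer: float  # fee for a single transfer leg
    maintenance: float
    purchase: float


def borne_cost(scenario: Scenario, fees: Fees) -> float:
    """Cost borne by the supported point under the assumed additive rule."""
    # The enum value doubles as the number of transfer legs (0, 1 or 2).
    cost = fees.one_way_transfer * scenario.transfer.value
    if scenario.maintains:
        cost += fees.maintenance
    if scenario.purchases:
        cost += fees.purchase
    return cost


fees = Fees(one_way_transfer=100.0, maintenance=30.0, purchase=1000.0)  # illustrative values
two_way_maintained = borne_cost(Scenario(Transfer.TWO_WAY, True, False), fees)
no_support_purchase = borne_cost(Scenario(Transfer.NONE, False, True), fees)
```

Transfer time is left out of the sketch; it could be tracked in the same way as a second quantity proportional to the number of transfer legs.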
It will be appreciated that the management of the logistics resources of a network point may involve logistics resource support, especially for the equipment involved in the logistics operations of the network point, for example: horizontal transportation equipment, including various vehicles, conveyor belts, belt conveyors and the like; packaging equipment, including packaging machinery, coding equipment and the like; vertical conveying equipment, including lifters, forklifts, stackers and the like; handling equipment, such as large quay cranes (container gantry cranes) and various specialized ship (vehicle) unloading equipment; security monitoring equipment; and automated management equipment. By adding to the network of network points the association relation between a supporting network point and a supported network point in the logistics resource management process, the association relations between network points can be deepened and grounded in the specific logistics operations of the network points. The subsequent model training is then clearly targeted at the association relations between network points, which improves the accuracy of the model.
Step S102, obtaining logistics resource information of different network points and logistics resource management strategies of different network points according to the network information;
After the network information of the network of network points is obtained, the different network points can be identified based on the network information or on the network point information it contains, and the logistics resource information and the current logistics resource management strategies of the different network points can then be obtained.
Taking a list as an example, the network information of the network of network points can exist in the form of a network point list. The list comprises the network point IDs of different network points and, for each network point ID, stores the corresponding network point information and the association relations of logistics resources among different network points. Each network point can therefore be identified through its network point ID in the list, or through the network point information corresponding to that ID.
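The two lookup routes described for the network point list (by ID, or by the information stored under an ID) can be sketched as follows. The list layout, the IDs and the field names are hypothetical examples; the source only states that IDs, network point information and association relations are stored together.

```python
from typing import Any, Dict, List

# Hypothetical network point list; IDs, fields and values are illustrative only.
point_list: List[Dict[str, Any]] = [
    {"id": "NP001", "info": {"region": "East", "vehicles": 4},
     "associations": {"NP002": "equipment_allocation"}},
    {"id": "NP002", "info": {"region": "East", "vehicles": 9},
     "associations": {"NP001": "equipment_allocation"}},
    {"id": "NP003", "info": {"region": "West", "vehicles": 2},
     "associations": {}},
]


def index_by_id(entries: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Identify each network point directly through its ID."""
    return {entry["id"]: entry for entry in entries}


def find_by_info(entries: List[Dict[str, Any]], **criteria: Any) -> List[str]:
    """Identify network points whose stored information matches all criteria."""
    return [entry["id"] for entry in entries
            if all(entry["info"].get(key) == value for key, value in criteria.items())]


by_id = index_by_id(point_list)
east_points = find_by_info(point_list, region="East")
```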
It can be understood that the logistics resource information of a network point may include the network point information of that network point in the network of network points, such as the geographic position of the network point, the area of the network point, the vehicles of the network point, the logistics transportation paths of the network point, the personnel of the network point, or the personnel scheduling of the network point; alternatively, the logistics resource information of a network point may also include information that is not part of its network point information, which is not limited herein.
Similarly, in practical applications, the logistics resource management strategy of a network point may be included in the network point information of that network point in the network of network points, or may be stored elsewhere, for example on a local server of the network point or on a cloud server.
Step S103, performing model training on the initial model according to the association relation of the logistics resources among different nodes, the logistics resource information of different nodes and the logistics resource management strategies of different nodes;
after the association relation of the logistics resources among different nodes, the logistics resource information of different nodes and the logistics resource management strategies of different nodes are obtained, the three kinds of data can be input into the initial model for forward propagation. Back propagation is then performed according to the loss function configured for the initial model, and the initial model is trained iteratively with the output of a logistics resource management strategy as the target.
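The forward and backward passes described above can be illustrated with a deliberately small sketch. It assumes a toy linear model, a mean squared error loss and a hypothetical numeric encoding of the three inputs; none of these choices is specified in the original, which does not fix the model type or the loss function.

```python
def predict(weights, features):
    """Forward pass of a toy linear model: weighted sum of the input features."""
    return sum(w * x for w, x in zip(weights, features))


def mse(weights, samples):
    """Mean squared error between predicted and recorded strategy values."""
    return sum((predict(weights, f) - t) ** 2 for f, t in samples) / len(samples)


def train(samples, lr=0.01, epochs=2000):
    """Fit the toy model by gradient descent on the mean squared error.

    Each sample is (features, target): features stands in for the encoded
    association relation and resource information, target for the recorded
    management strategy (illustrative encoding only).
    """
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        grads = [0.0] * n_features
        for features, target in samples:
            error = predict(weights, features) - target    # forward pass
            for j, x in enumerate(features):               # backward pass
                grads[j] += 2.0 * error * x / len(samples)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Hypothetical data: [supported_by_neighbour, local_devices, daily_parcels / 1000]
samples = [
    ([1.0, 2.0, 3.0], 1.0),
    ([0.0, 1.0, 4.0], 3.5),
    ([1.0, 2.0, 4.0], 2.0),
    ([0.0, 2.0, 5.0], 4.0),
]
weights = train(samples)
```

In practice the initial model would more likely be a neural network trained with an automatic differentiation library; the hand-written gradient is used here only to keep the example self-contained.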
Step S104, determining the trained model as the management strategy generation model.
When the management strategy of the logistics resources output by the initial model after training can meet the loss function or other target conditions, training of the model is completed, and the model can be used as a management strategy generation model for generating the management strategy of the logistics resources.
As can be seen from the above, in the training method of the management strategy generation model provided by the embodiment of the application, the training information includes not only the logistics resource information and the logistics resource management strategies of different nodes but also the association relation of the logistics resources among different nodes. During training, the model is therefore guided to attend to that association relation. As a result, the management strategies produced by the resulting management strategy generation model can improve the cooperation between a local node and other nodes on logistics resources, manage logistics resources more accurately, and promote the reasonable utilization and allocation of logistics resources.
In addition, in some embodiments, the model may be trained in combination with risk utility; correspondingly, step S103 may specifically include:
performing model training on the initial model according to the association relation of the logistics resources among different nodes, the logistics resource information of different nodes and the logistics resource management strategies of different nodes, with the output of the management strategy having the lowest risk index as the target.
For example, suppose a risk index algorithm has been configured for the logistics supply chain. After a round of training, the management strategy generated by the initial model may satisfy the pre-configured loss function while its calculated risk index is 0.6132, which exceeds the threshold of 0.25. The model then still needs to be trained until the risk index falls to 0.25 or below. If the risk index can be reduced further, training may continue until the risk index tends to be stable, at which point training of the model is complete.
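The stopping rule in this example can be sketched as follows. The 0.25 threshold comes from the example; the `tol` and `patience` parameters are hypothetical settings introduced here to give "tends to be stable" a concrete meaning.

```python
def should_stop(loss_ok, risk_history, threshold=0.25, tol=1e-3, patience=3):
    """Decide whether training may stop.

    loss_ok       -- True once the pre-configured loss requirement is met
    risk_history  -- risk index computed after each training round, oldest first
    threshold     -- upper bound on the risk index (0.25 in the example)
    tol, patience -- assumed settings defining when the index is "stable"
    """
    if not loss_ok or not risk_history:
        return False
    if risk_history[-1] > threshold:
        return False          # e.g. 0.6132 > 0.25: keep training
    if len(risk_history) < patience + 1:
        return False
    recent = risk_history[-(patience + 1):]
    # Stable when the index changes by no more than tol between rounds.
    return all(abs(a - b) <= tol for a, b in zip(recent, recent[1:]))
```

With these settings, a history ending in 0.6132 keeps training going, while a history whose last four values differ by less than 0.001 and lie below 0.25 ends it.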
It can be understood that, by introducing the risk index, a corresponding risk index algorithm can be configured for different pre-configured risk conditions and used to guide the training of the model. This improves not only the accuracy of the generated management strategies in managing logistics resources but also their resistance to risk.
Further, in some embodiments, referring to the flowchart of risk index calculation in the embodiment of the present application shown in fig. 2, the risk index may be calculated through the following steps:
Step S201, extracting a test management strategy output by an initial model;
when the risk index is calculated during training of the model, the management strategy output by the initial model, namely the test management strategy, is extracted first.
Step S202, extracting the logistics resource project data of the test management strategy and the logistics piece handling capacity in a preset time period;
after the test management strategy is extracted, the logistics resource project data and the logistics piece handling capacity within the preset time period can be extracted from it.
The logistics resource project data may specifically include, for each logistics resource, its purchase cost, purchase time, maintenance cost, maintenance time and allocation cost.
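One way to hold these five items per resource is a small record type. The sketch below is only illustrative: the class name, field names, units and sample values are assumptions, not part of the original.

```python
from dataclasses import dataclass


@dataclass
class ResourceItem:
    """Project data of one logistics resource (field names are illustrative)."""
    name: str
    purchase_cost: float
    purchase_time_days: float
    maintenance_cost: float
    maintenance_time_days: float
    allocation_cost: float

    def total_cost(self) -> float:
        """Sum of the three cost items."""
        return self.purchase_cost + self.maintenance_cost + self.allocation_cost


# Hypothetical sample resource.
sorter = ResourceItem("sorter-1", 1000.0, 14.0, 50.0, 2.0, 25.0)
```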
Step S203, calculating the risk index of the test management strategy according to the logistics resource project data and the logistics piece handling capacity.
After the logistics resource project data of the test management strategy and the logistics piece handling capacity in the preset time period are obtained, the risk index of the current test management strategy can be calculated according to a preset calculation formula.
For ease of understanding, a set of data concerning the equipment of a process is described below as an example:
site set $A = \{a_1, a_2, \ldots, a_m\}$, process set $W = \{w_1, w_2, \ldots, w_n\}$, device set $I = \{i_1, i_2, \ldots, i_k\}$, date set $D = \{d_1, d_2, \ldots, d_t\}$, device operating state $x_{iawd}$, process throughput of the device $P_{iw}$, equipment purchase cost $\gamma_{iaw\_BC}$, equipment purchase time $\gamma_{iaw\_BT}$, equipment maintenance cost $\gamma_{iaw\_MC}$, equipment maintenance time $\gamma_{iaw\_MT}$, and equipment allocation cost $c_{iw\_ab}$. The risk utility of the equipment in terms of purchase, use and maintenance can then be calculated as a risk index by the following functions:
Figure BDA0002295336660000111
$r_{awd}$ indicates the logistics processing capacity of a plurality of devices in the same process;
Figure BDA0002295336660000112
$b_{aw}$ indicates the equipment cost;
Figure BDA0002295336660000113
$\mu_1$ indicates the risk utility of the equipment with respect to purchase, use and maintenance;
$E(\mu_1)$ indicates the mathematical expectation of $\mu_1$.
Comparing $\mu_1$ with $E(\mu_1)$, that is, calculating the difference $|\mu_1 - E(\mu_1)|$ between the two, gives the degree of risk of the equipment with respect to purchase, use and maintenance, i.e. the risk index of the equipment in those respects.
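The final step, taking the absolute deviation of a utility value from its expectation, can be sketched as below. The utility function itself is defined by the formulas in the figures and is not reproduced here; the sketch simply takes utility values as inputs and, as an assumption, estimates the expectation by a sample mean over hypothetical scenarios.

```python
from statistics import mean


def risk_index(utility, utility_samples):
    """Return |mu - E(mu)|, estimating E(mu) by the mean of utility_samples."""
    expected = mean(utility_samples)
    return abs(utility - expected)


# Hypothetical utility values of one test management strategy under four
# demand scenarios; the last one is the scenario being evaluated.
samples = [0.5, 0.75, 1.0, 1.25]
index = risk_index(samples[-1], samples)
```

Here the sample mean is 0.875, so the evaluated scenario deviates from it by 0.375.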
If the allocation of the equipment is also considered, the risk index may be calculated by the following function:
Figure BDA0002295336660000114
$\mu_2$ indicates the risk utility of the equipment with respect to allocation;
$E(\mu_2)$ indicates the mathematical expectation of $\mu_2$.
Comparing $\mu_2$ with $E(\mu_2)$, that is, calculating the difference $|\mu_2 - E(\mu_2)|$ between the two, gives the degree of risk of the equipment with respect to allocation, i.e. the risk index of the equipment with respect to allocation.
Of course, during training of the model, the target is not limited to outputting the management strategy with the lowest risk index; the target may also be the maximum throughput or the lowest equipment cost, or any two or all three of risk index, throughput and equipment cost.
Further, the target may be set not only for a single network point but also for the overall level of the network of network points.
Taking the risk index, throughput and equipment cost of the network of network points as a whole as an example, the target index can be calculated by the following function:
Figure BDA0002295336660000121
the smaller L is, the better the balance among risk index, throughput and equipment cost achieved by the logistics resource management strategies of the different network points in the whole network; when L is at its minimum, the best overall balance among risk index, throughput and equipment cost, or the highest cost-effectiveness, is achieved.
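The exact expression for L is given in the figure and is not reproduced here. Purely as an illustration of how several targets can be folded into one quantity to minimise, the following sketch uses a weighted sum over network points; the weighted-sum form, the weights and the assumption that all indicators are normalised beforehand are choices made for this example, not the formula of the original.

```python
def combined_objective(sites, w_risk=1.0, w_throughput=1.0, w_cost=1.0):
    """Aggregate per-site indicators into one network-level score (lower is better).

    sites is a list of dicts with keys 'risk', 'throughput' and 'cost', assumed
    to be normalised to comparable scales beforehand.
    """
    total = 0.0
    for s in sites:
        # Risk and cost are penalised; throughput is rewarded.
        total += w_risk * s["risk"] - w_throughput * s["throughput"] + w_cost * s["cost"]
    return total


# Two hypothetical candidate strategies for a two-site network.
plan_a = [{"risk": 0.25, "throughput": 0.5, "cost": 0.5},
          {"risk": 0.25, "throughput": 0.75, "cost": 0.25}]
plan_b = [{"risk": 0.5, "throughput": 0.5, "cost": 0.5},
          {"risk": 0.25, "throughput": 0.5, "cost": 0.5}]
best = min((plan_a, plan_b), key=combined_objective)
```

Changing the weights shifts the trade-off, for instance a larger `w_risk` favours strategies with lower risk even at higher cost.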
In some embodiments, referring to still another flowchart of the training method of the management policy generation model according to the embodiment of the present application shown in fig. 3, after the management policy generation model is obtained through training in the corresponding embodiment of fig. 1, the model may be further applied through the following steps:
Step S301, a generation request is received, wherein the generation request is used for requesting to generate a target management strategy of a current logistics resource of a target website, and the generation request carries current logistics resource information of the target website;
it can be appreciated that when there is a need for generating a management policy for a physical distribution resource, a request for generating may be initiated, where the request may carry a node ID of the target node and current physical distribution resource information of the target node.
Carrying the current logistics resource information of the target network point directly in the generation request makes it simple to determine accurately the information to be input into the management strategy generation model, and avoids relying on a copy stored in the network or on other equipment that may not have been updated.
Step S302, current logistics resource information of the associated network point and a target association relation of the logistics resources of the associated network point and the target network point are obtained according to the network information;
when the generation request triggers the generation task of the management strategy of the logistics resources of the target network point, the target network point can be identified from the network information of the network point network according to the network point ID carried by the generation request, and the associated network point with the target association relation with the target network point is identified, so that the current logistics resource information of the associated network point can be obtained.
Step S303, inputting the current logistics resource information of the target network point, the current logistics resource information of the associated network point and the target association relationship into a management strategy generation model;
after the current logistics resource information of the target network point, the current logistics resource information of the associated network point and the target association relation of the target network point and the associated network point in the logistics resource are obtained, the current logistics resource information of the target network point, the current logistics resource information of the associated network point and the target association relation of the target network point and the associated network point in the logistics resource are input into a trained management strategy generation model, and generation processing of the management strategy of the logistics resource is carried out.
Step S304, extracting the target management strategy output by the management strategy generation model.
After the management strategy generation model completes the generation of the logistics resource management strategy, the target management strategy output by the model can be extracted and used to manage the current logistics resources of the target network point.
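Steps S301 to S304 can be sketched end to end as follows. The request layout, the `network_info` dictionary and the `toy_model` stand-in are all hypothetical; in particular, `toy_model` is not the trained model of the application, only a placeholder that makes the example runnable.

```python
def generate_policy(request, network_info, model):
    """Sketch of steps S301-S304 for one generation request."""
    # S301: the request carries the target site's ID and its current resources.
    target_id = request["site_id"]
    target_resources = request["resources"]

    # S302: look up the associated sites and their current resources.
    relations = network_info["relations"].get(target_id, [])
    associated = {sid: network_info["resources"][sid] for sid in relations}

    # S303: feed all three inputs to the trained model.
    # S304: whatever the model returns is taken as the target management strategy.
    return model(target_resources, associated, relations)


def toy_model(target_resources, associated, relations):
    """Stand-in for the trained model: borrow from the best-equipped neighbour."""
    if not associated:
        return {"borrow_from": None}
    donor = max(associated, key=lambda sid: associated[sid]["devices"])
    return {"borrow_from": donor}


network_info = {
    "relations": {"A2": ["A1", "A3"]},               # A1 and A3 support A2
    "resources": {"A1": {"devices": 4}, "A3": {"devices": 6}},
}
request = {"site_id": "A2", "resources": {"devices": 1}}
policy = generate_policy(request, network_info, toy_model)
```

Passing the model in as a callable keeps the request handling independent of how the model was trained.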
Taking a scheduling policy as an example, if the current logistics resource information of the target network point input into the model comprises the resource information of a plurality of devices, the target management policy output by the model can be the scheduling policy of the devices; if the logistics resource information of the target network point input into the model comprises the resource information of a plurality of workers, the target management strategy output by the model can be the scheduling strategy of the workers; if the current logistics resource information of the target network point input into the model comprises the resource information of a plurality of devices and personnel, the target management strategy output by the model can be the scheduling strategy of the devices and the personnel.
In addition, as described above, the model may be trained in combination with risk utility. Correspondingly, when the trained model is put into practical application, the target management strategy it outputs may also carry the risk index of that strategy, so that the risk, or the risk resistance, of the current target management strategy can be readily known.
In order to facilitate better implementation of the training method of the management policy generation model provided by the embodiment of the application, the embodiment of the application also provides a training device of the management policy generation model.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a training device for managing a policy generation model according to an embodiment of the present application, where the training device for managing a policy generation model may specifically include the following structure:
an obtaining unit 401, configured to obtain network information of a mesh point network, where the network information includes mesh point information of different mesh points and association relations of logistics resources between different mesh points;
the obtaining unit 401 is further configured to obtain, according to the network information, logistics resource information of different nodes and logistics resource management policies of different nodes;
the training unit 402 is configured to perform model training on the initial model according to association relationships of logistics resources among different nodes, logistics resource information of different nodes, and logistics resource management policies of different nodes;
A determining unit 403, configured to determine the model after training as a management policy generation model.
In some embodiments, the training unit is specifically configured to:
and carrying out model training on the initial model according to the association relation of the logistics resources among different nodes, the logistics resource information of different nodes, the logistics resource management strategies of different nodes and the management strategy with the lowest risk index output as targets.
In some embodiments, the apparatus further comprises a computing unit 404 to:
extracting a test management strategy output by the initial model;
extracting the logistics resource project data of the test management strategy and the logistics piece handling capacity in a preset time period;
and calculating the risk index of the test management strategy according to the logistics resource project data and the logistics piece handling capacity.
In some embodiments, the association of the logistic resources between the different nodes includes an association of a logistic resource management process between a supporting node and a supported node, the supporting node providing logistic resource support to the supported node.
In some embodiments, the logistic resource information for the different sites includes at least one of: different processes, different process capacities, different process costs, different equipment capacities, different equipment treatable days, different equipment purchases, different equipment maintenance times, and different equipment allocation costs between different sites.
In some embodiments, the apparatus further comprises:
a receiving unit 405, configured to receive a generation request, where the generation request is used to request to generate a target management policy of a current logistics resource of a target website, and the generation request carries current logistics resource information of the target website;
the obtaining unit 401 is further configured to obtain, according to the network information, current physical distribution resource information of the associated mesh point and a target association relationship between the physical distribution resources of the associated mesh point and the target mesh point;
an input unit 406, configured to input current logistics resource information of the target website, current logistics resource information of the associated website, and a target association relationship into a management policy generation model;
the extracting unit 407 is configured to extract the target management policy output by the management policy generation model.
In some embodiments, the target management policy carries a risk index for the target management policy.
The embodiment of the application further provides a training device for managing a policy generation model, referring to fig. 5, fig. 5 shows a schematic structural diagram of the training device for managing a policy generation model in the embodiment of the application, specifically, the training device for managing a policy generation model provided in the embodiment of the application includes a processor 501, where the processor 501 is configured to implement steps of a training method for managing a policy generation model in any embodiment as shown in fig. 1 to 3 when executing a computer program stored in a memory 502; alternatively, the processor 501 is configured to implement the functions of the units in the corresponding embodiment as shown in fig. 4 when executing the computer program stored in the memory 502.
By way of example, a computer program may be partitioned into one or more modules/units that are stored in the memory 502 and executed by the processor 501 to complete the present application. One or more of the modules/units may be a series of computer program instruction segments capable of performing particular functions to describe the execution of the computer program in a computer device.
Training devices that manage the policy generation model may include, but are not limited to, processor 501, memory 502. It will be appreciated by those skilled in the art that the illustration is merely an example of a training device for managing the policy generation model and does not constitute a limitation of the training device for managing the policy generation model, and may include more or less components than illustrated, or may be combined with certain components, or different components, e.g., the training device for managing the policy generation model may further include an input-output device, a network access device, a bus, etc., through which the processor 501, the memory 502, the input-output device, the network access device, etc., are connected.
The processor 501 may be a central processing unit (Central Processing Unit, CPU), but may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the training device of the management policy generation model and connects the various parts of the entire device through various interfaces and lines.
The memory 502 may be used to store computer programs and/or modules, and the processor 501 may implement various functions of the computer device by running or executing the computer programs and/or modules stored in the memory 502 and invoking data stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created through use of the training device of the management policy generation model (e.g., audio data, video data, etc.), and so forth. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, flash card (Flash Card), at least one disk storage device, flash memory device, or other volatile solid-state storage device.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the training device, the device and the corresponding units of the management policy generation model described above may refer to the description of the training method of the management policy generation model in any embodiment corresponding to fig. 1 to 3, and details are not repeated herein.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a computer readable storage medium, in which a plurality of instructions capable of being loaded by a processor are stored, so as to execute steps in a training method for managing a policy generation model in any embodiment of the present application, and specific operations may refer to a description of the training method for managing a policy generation model in any embodiment of the present application, such as fig. 1 to 3, which is not repeated herein.
The computer-readable storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
Since the instructions stored in the computer readable storage medium may perform the steps in the training method for managing the policy generation model in any embodiment of the present application, as shown in fig. 1 to 3, the beneficial effects that can be achieved by the training method for managing the policy generation model in any embodiment of the present application, as shown in fig. 1 to 3, are detailed in the foregoing description, and are not repeated herein.
The foregoing describes in detail the training method, apparatus, device and computer readable storage medium of the management policy generation model provided in the embodiments of the present application, and specific examples are applied to illustrate the principles and implementation of the embodiments of the present application, where the foregoing description of the embodiments is only used to help understand the method and core idea of the embodiments of the present application; meanwhile, those skilled in the art, based on the ideas of the embodiments of the present application, may change the specific implementation and application scope, and in summary, the present disclosure should not be construed as limiting the embodiments of the present application.

Claims (7)

1. A training method for a management policy generation model, the method comprising:
acquiring network information of a network of network points, wherein the network information comprises network point information of different network points and association relations of logistics resources among different network points; according to the network information, obtaining the logistics resource information of different network points and the logistics resource management strategies of different network points;
performing model training on an initial model according to the association relation of the logistics resources among different nodes, the logistics resource information of the different nodes and the logistics resource management strategy of the different nodes; the association relation of the logistic resources among different nodes comprises the association relation of the logistic resource management procedure between the supporting node and the supported node, wherein the supporting node is the node providing logistic resource support for the supported node;
the logistics resource information of different network points comprises at least one of the following: different processes, processing capabilities of the different processes, costs of the different processes, different devices, processing capabilities of the different devices, number of days the different devices can process, purchase costs of the different devices, purchase time of the different devices, maintenance costs of the different devices, maintenance time of the different devices, and transfer costs of the different devices between the different sites;
the logistics resource management strategy characterizes the scheduling and distributing strategy of logistics resources among different network points;
determining the model after training as a management strategy generation model;
the method further comprises the steps of:
receiving a generation request, wherein the generation request is used for requesting to generate a target management strategy of a current logistics resource of a target website, and the generation request carries the current logistics resource information of the target website;
acquiring current logistics resource information of an associated node and a target association relation between the associated node and logistics resources of the target node according to the network information;
inputting the current logistics resource information of the target network point, the current logistics resource information of the associated network point and the target association relationship into the management policy generation model;
and extracting the target management strategy output by the management strategy generation model.
2. The method of claim 1, wherein the model training the initial model according to the association relationship of the logistics resources between the different nodes, the logistics resource information of the different nodes, and the logistics resource management policy of the different nodes comprises:
and performing model training on the initial model according to the association relation of the logistics resources among different nodes, the logistics resource information of the different nodes, the logistics resource management strategies of the different nodes and the management strategy with the lowest risk index output as targets.
3. The method according to claim 2, wherein the method further comprises:
extracting a test management strategy output by the initial model;
extracting logistics resource project data of the test management strategy and logistics piece handling capacity in a preset time period;
and calculating the risk index of the test management strategy according to the logistics resource project data and the logistics piece handling capacity.
4. The method of claim 1, wherein the target management policy carries a risk index for the target management policy.
5. A training apparatus for managing a policy generation model, the apparatus comprising:
an acquisition unit, configured to acquire network information of a network of network points, wherein the network information comprises the network point information of different network points and the association relation of logistics resources among different network points;
the acquisition unit is further used for acquiring the logistics resource information of the different network points and the logistics resource management strategies of the different network points according to the network information;
the training unit is used for carrying out model training on the initial model according to the association relation of the logistics resources among different nodes, the logistics resource information of the different nodes and the logistics resource management strategy of the different nodes; the association relation of the logistic resources among different nodes comprises the association relation of the logistic resource management procedure between the supporting node and the supported node, wherein the supporting node is the node providing logistic resource support for the supported node;
the logistics resource information of different network points comprises at least one of the following: different processes, processing capabilities of the different processes, costs of the different processes, different devices, processing capabilities of the different devices, number of days the different devices can process, purchase costs of the different devices, purchase time of the different devices, maintenance costs of the different devices, maintenance time of the different devices, and transfer costs of the different devices between the different sites;
the logistics resource management strategy characterizes the scheduling and distributing strategy of logistics resources among different network points;
a determining unit for determining the model after training as a management policy generation model;
the apparatus further comprises:
the receiving unit is used for receiving a generation request, wherein the generation request is used for requesting to generate a target management strategy of the current logistics resource of the target website, and the generation request carries the current logistics resource information of the target website;
the acquisition unit is also used for acquiring current logistics resource information of the associated network point and a target association relation of the logistics resources of the associated network point and the target network point according to the network information;
the input unit is used for inputting the current logistics resource information of the target network point, the current logistics resource information of the associated network point and the target association relationship into the management strategy generation model;
and the extraction unit is used for extracting the target management strategy output by the management strategy generation model.
6. Training device for a management policy generation model, characterized in that it comprises a processor and a memory, in which a computer program is stored, which processor, when calling the computer program in the memory, performs a training method for a management policy generation model according to any of claims 1-4.
7. A computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the training method of the management policy generation model of any of claims 1 to 4.
CN201911198896.8A 2019-11-29 2019-11-29 Training method, device and equipment for management strategy generation model Active CN112884388B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911198896.8A CN112884388B (en) 2019-11-29 2019-11-29 Training method, device and equipment for management strategy generation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911198896.8A CN112884388B (en) 2019-11-29 2019-11-29 Training method, device and equipment for management strategy generation model

Publications (2)

Publication Number Publication Date
CN112884388A CN112884388A (en) 2021-06-01
CN112884388B CN112884388B (en) 2023-06-09

Family

ID=76038407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911198896.8A Active CN112884388B (en) 2019-11-29 2019-11-29 Training method, device and equipment for management strategy generation model

Country Status (1)

Country Link
CN (1) CN112884388B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2717666A1 (en) * 2010-10-15 2012-04-15 W. John Mowat Method for managing the inbound freight process of the supply chain on behalf of a retail distribution network
CN108921483A (en) * 2018-07-16 2018-11-30 深圳北斗应用技术研究院有限公司 Logistics route planning method and device, and driver shift scheduling method and device
CN109214559A (en) * 2018-08-17 2019-01-15 安吉汽车物流股份有限公司 Prediction method and device for logistics business, and readable storage medium
CN109389270A (en) * 2017-08-09 2019-02-26 菜鸟智能物流控股有限公司 Logistics object determination method and device and machine readable medium
CN109472441A (en) * 2018-09-21 2019-03-15 顺丰科技有限公司 Method, processing device, equipment and storage medium for allocating materials
CN109767052A (en) * 2018-11-20 2019-05-17 顺丰科技有限公司 Automatic task allocation method and system
CN110458429A (en) * 2019-07-29 2019-11-15 暨南大学 Intelligent task allocation and personnel scheduling method and system for geographic network points

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130031035A1 (en) * 2011-07-31 2013-01-31 International Business Machines Corporation Learning admission policy for optimizing quality of service of computing resources networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
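Zhou Hong; Ou Jianxin; Li Zhengdao. Modeling and simulation research on the logistics system of an air cargo center. Journal of System Simulation. 2008, (12), full text. *
Fang Bopeng. Collaborative scheduling optimization of industrial-chain production and distribution under uncertain environments. Computer Integrated Manufacturing Systems. 2018, Vol. 24 (No. 01), full text. *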

Also Published As

Publication number Publication date
CN112884388A (en) 2021-06-01

Similar Documents

Publication Publication Date Title
US11270256B2 (en) Material organization task generation method and device, and material organization method and device
CN107864187A (en) The online task executing method of terminal device and server
Kress et al. The partitioning min–max weighted matching problem
CN112860412B (en) Service data processing method and device, electronic equipment and storage medium
CN112884388B (en) Training method, device and equipment for management strategy generation model
CN110220549A (en) A kind of method and apparatus of pile type assessment
CN112633587A (en) Logistics information processing method and device
CN113222205A (en) Path planning method and device
CN116167245A (en) Multi-attribute transfer decision model-based multi-modal grain transportation method and system
CN107203633B (en) Data table pushing processing method and device and electronic equipment
CN115795097A (en) Data processing method and device based on XML (Extensible Markup Language) logic rule
CN115562861A (en) Method and apparatus for data processing for data skew
CN114186932A (en) Cargo distribution method and device for compartment grid robot in office area
CN111860918B (en) Distribution method and device, electronic equipment and computer readable medium
CN107547429A (en) Load determination method, apparatus and electronic equipment
CN112949887B (en) Method, device and equipment for planning dispatch path and computer readable storage medium
CN113762573A (en) Logistics network optimization method and device
CN111461430A (en) Method and device for generating route information
CN112950106B (en) Stock stocking method and device for transfer vehicle, electronic equipment and storage medium
Kaderi et al. Automated management of maritime container terminals using internet of things and big data technologies
CN111597052B (en) Chip management and control method and device, server and readable storage medium
CN116112336A (en) Alarm data processing method and device
CN113052354B (en) Freight container allocation optimization method, device, equipment and storage medium
US11315207B1 (en) Cargo optimization systems, devices and related methods
CN112990801A (en) Method, device and equipment for planning transportation path and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant