CN108520327A

CN108520327A - Loading method and device for vehicle-mounted cargo, and computer-readable medium

Info

Publication number: CN108520327A
Application number: CN201810357903.3A
Authority: CN
Inventors: 金忠孝; 戴昌志
Original assignee: Anji Automotive Logistics Ltd By Share Ltd; SAIC Motor Corp Ltd
Current assignee: Anji Automotive Logistics Ltd By Share Ltd; SAIC Motor Corp Ltd
Priority date: 2018-04-19
Filing date: 2018-04-19
Publication date: 2018-09-11
Anticipated expiration: 2038-04-19
Also published as: CN108520327B

Abstract

A loading method and device for vehicle-mounted cargo, and a computer-readable medium. The loading method includes: generating a random number; determining, based on the random number and a preset probability parameter, a selection strategy for the loading action, the selection strategy being either a random selection strategy or a neural-network-based selection strategy; and, based on the selection strategy for the loading action, selecting the corresponding loading action and placing the corresponding cargo. Because different loading actions can be selected based on a neural network, the loading actions can be evaluated in a probabilistic, statistical manner; computation is fast, and large-scale logistics packing problems can be solved.

Description

Loading method and device for vehicle-mounted cargo, and computer-readable medium

Technical Field

The embodiments of the present invention relate to the field of solving combinatorial optimization problems, and in particular to a loading method and device for vehicle-mounted cargo, and a computer-readable medium.

Background

For a logistics system, loading cargo onto vehicles is an important technical problem. Loading vehicle-mounted cargo is a bin packing problem, and bin packing problems arise widely in industrial production and computer science: fabric cutting in the garment industry, container loading in the transport industry, sheet and profile blanking in the processing industry, layout in the printing industry, packing and arranging objects in everyday life, and low-level computing operations such as multiprocessor task scheduling, resource allocation, file allocation, and memory management.

In terms of computational complexity, the bin packing problem is a non-deterministic polynomial (NP) problem for which an exact globally optimal solution is difficult to obtain; it is generally solved with heuristic algorithms and search algorithms. The idea of a heuristic algorithm is to find a heuristic rule that generates feasible solutions, and to use it to find an optimal or near-optimal solution to the problem. This approach is efficient, but a specific heuristic rule must be found for each problem; such rules generally lack generality and do not transfer to other problems. For the bin packing problem, heuristic algorithms include the First Fit (FF) algorithm, the Best Fit (BF) algorithm, genetic algorithms, simulated annealing, and particle swarm optimization. A search algorithm searches the solution space for an optimal or near-optimal solution. It cannot guarantee that the optimal solution is found, but with suitable heuristic knowledge it can strike a good balance between the quality of the approximate solution and efficiency.

Existing loading methods for vehicle-mounted cargo mainly use heuristic algorithms to enumerate packing schemes exhaustively within a limited space; the loading rate is low and the computation is slow, which makes them unsuitable for large-scale computation. Faster heuristic algorithms, such as particle swarm optimization, solve quickly but produce low-quality solutions.

Existing schemes therefore cannot solve large-scale logistics packing problems.

Summary of the Invention

The technical problem addressed by the embodiments of the present invention is how to solve large-scale logistics packing problems.

To solve the above technical problem, an embodiment of the present invention provides a loading method for vehicle-mounted cargo, the method including: generating a random number; determining, based on the random number and a preset probability parameter, a selection strategy for the loading action, the selection strategy being either a random selection strategy or a neural-network-based selection strategy; and, based on the selection strategy for the loading action, selecting the corresponding loading action and placing the corresponding cargo.

Optionally, the preset probability parameter decreases linearly as the number of executed steps increases.

Optionally, when the preset probability parameter is not greater than a preset parameter threshold, it is set to a preset fixed value.

Optionally, determining the selection strategy for the loading action based on the random number and the preset probability parameter includes: when the random number is less than the preset probability parameter, determining that the selection strategy is the random selection strategy; and when the random number is not less than the preset probability parameter, determining that the selection strategy is the neural-network-based selection strategy.

Optionally, the loading action includes: identification information of the cargo to be placed, position information of the cargo to be placed, and orientation information of the cargo to be placed.

Optionally, the loading action satisfies at least one of the following constraints: the cargo to be placed abuts at least one previously placed cargo; the cargo to be placed does not overlap previously placed cargo in the horizontal direction.

Optionally, when the selection strategy for the loading action is the neural-network-based selection strategy, selecting the corresponding loading action based on the selection strategy includes: calculating, based on a BP neural network, the BP neural network output value corresponding to each candidate loading action; and selecting the loading action with the largest BP neural network output value as the corresponding loading action.

Optionally, the output value of the BP neural network is $Q(s, a; \theta_i)$, which is used to evaluate the loading rate index of the loading action set in the state set, where $a$ is the loading action set, $s$ is the current state set, which includes the size of the enclosed space of the compartment, the available placement positions, and the identification information of the cargo to be placed, and $\theta_i$ is the network parameter of the BP neural network.

Optionally, after selecting the corresponding loading action and placing the corresponding cargo, the method further includes: executing the loading action and storing the loading action in a training set; and updating the BP neural network parameters based on the training set.

Optionally, updating the BP neural network parameters based on the training set includes: extracting samples from the training set; and, based on the samples, calculating the output value of the target neural network as:

$$y_i = r + \gamma \max_{a'} Q(s', a'; \theta_i^-)$$

where $r$ is the instantaneous feedback value of executing the loading action; $a$ is the loading action set; $a'$ is the future loading action set; $s$ is the current state set, which includes the size of the enclosed space of the compartment, the available placement positions, and the identification information of the cargo to be placed; $s'$ is the state set updated after the loading action; $\theta_i^-$ is the network parameter of the target neural network; $\gamma$ is a preset second proportionality coefficient; and $Q(s', a'; \theta_i^-)$ is used to evaluate the loading rate index of different actions in different states. The mean square error between the output value of the target neural network and the output value of the BP neural network is then calculated as:

$$L_i(\theta_i) = \mathbb{E}\left[\left(y_i - Q(s, a; \theta_i)\right)^2\right]$$

where $L_i(\theta_i)$ is the mean square error between the output value of the target neural network and the output value of the BP neural network.

The parameters of the BP neural network are updated on the principle of minimizing the mean square error.

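Optionally, updating the parameters of the BP neural network on the principle of minimizing the mean square error further includes: calculating the partial derivative with respect to each neural network parameter using a stochastic gradient descent algorithm, as follows:

$$\nabla_{\theta_i} L_i(\theta_i) = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)\nabla_{\theta_i} Q(s, a; \theta_i)\right]$$

where $\nabla_{\theta_i} L_i(\theta_i)$ is the partial derivative of the mean square error with respect to $\theta_i$, and $\nabla_{\theta_i} Q(s, a; \theta_i)$ is the partial derivative of the BP neural network output with respect to the neural network parameters; and calculating and updating the parameters of the BP neural network based on the partial derivative.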

Optionally, the loading method for vehicle-mounted cargo further includes: updating the parameters of the target neural network based on the parameters of the BP neural network.

An embodiment of the present invention provides a loading device for vehicle-mounted cargo, including: a generation unit, adapted to generate a random number; a determination unit, adapted to determine, based on the random number and a preset probability parameter, a selection strategy for the loading action, the selection strategy being either a random selection strategy or a neural-network-based selection strategy; and a placement unit, adapted to select, based on the selection strategy for the loading action, the corresponding loading action and place the corresponding cargo.

Optionally, the preset probability parameter decreases linearly as the number of executed steps increases.

Optionally, the determination unit includes: a first determination subunit, adapted to determine that the selection strategy for the loading action is the random selection strategy when the random number is less than the probability parameter; and a second determination subunit, adapted to determine that the selection strategy for the loading action is the neural-network-based selection strategy when the random number is not less than the probability parameter.

Optionally, when the selection strategy for the loading action is the neural-network-based selection strategy, the placement unit includes: a first calculation subunit, adapted to calculate, based on a BP neural network, the BP neural network output value corresponding to each candidate loading action; a selection subunit, adapted to select the loading action with the largest BP neural network output value as the corresponding loading action; and a placement subunit, adapted to place the corresponding cargo based on the corresponding loading action.

Optionally, the output value of the BP neural network is $Q(s, a; \theta_i)$, which is used to evaluate the loading rate index of the loading action set in the state set, where $a$ is the loading action set, $s$ is the current state set, which includes the size of the enclosed space of the compartment, the available placement positions, and the identification information of the cargo to be placed, and $\theta_i$ is the network parameter of the BP neural network.

Optionally, the device further includes, for use after the placement unit has selected the corresponding loading action and placed the corresponding cargo: an execution unit, adapted to execute the loading action and store the loading action in a training set; and a training unit, adapted to update the BP neural network parameters based on the training set.

Optionally, the training unit includes: an extraction subunit, adapted to extract samples from the training set; and a second calculation subunit, adapted to calculate, based on the samples, the output value of the target neural network as:

$$y_i = r + \gamma \max_{a'} Q(s', a'; \theta_i^-)$$

where $r$ is the instantaneous feedback value of executing the loading action; $a$ is the loading action set; $a'$ is the future loading action set; $s$ is the current state set, which includes the size of the enclosed space of the compartment, the available placement positions, and the identification information of the cargo to be placed; $s'$ is the state set updated after the loading action; $\theta_i^-$ is the network parameter of the target neural network; $\gamma$ is a preset second proportionality coefficient; and $Q(s', a'; \theta_i^-)$ is used to evaluate the loading rate index of different actions in different states. The training unit further includes a third calculation subunit, adapted to calculate the mean square error between the output value of the target neural network and the output value of the BP neural network as:

$$L_i(\theta_i) = \mathbb{E}\left[\left(y_i - Q(s, a; \theta_i)\right)^2\right]$$

where $L_i(\theta_i)$ is the mean square error between the output value of the target neural network and the output value of the BP neural network; and an update subunit, adapted to update the parameters of the BP neural network on the principle of minimizing the mean square error.

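Optionally, the update subunit includes: a calculation module, adapted to calculate the partial derivative with respect to each neural network parameter using a stochastic gradient descent algorithm, as follows:

$$\nabla_{\theta_i} L_i(\theta_i) = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)\nabla_{\theta_i} Q(s, a; \theta_i)\right]$$

where $\nabla_{\theta_i} L_i(\theta_i)$ is the partial derivative of the mean square error with respect to $\theta_i$, and $\nabla_{\theta_i} Q(s, a; \theta_i)$ is the partial derivative of the BP neural network output with respect to the neural network parameters; and an update module, adapted to calculate and update the parameters of the BP neural network based on the partial derivative.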

Optionally, the loading device for vehicle-mounted cargo further includes: an updating unit, adapted to update the parameters of the target neural network based on the parameters of the BP neural network.

An embodiment of the present invention provides a computer-readable storage medium, which is a non-volatile storage medium or a non-transitory storage medium and stores computer instructions that, when run, perform the steps of any of the methods described above.

An embodiment of the present invention provides a loading device for vehicle-mounted cargo, including a memory and a processor, the memory storing computer instructions that can be run on the processor, the processor performing the steps of any of the methods described above when running the computer instructions.

Compared with the prior art, the technical solutions of the embodiments of the present invention have the following advantages:

The embodiments of the present invention determine, based on the random number and the preset probability parameter, whether the selection strategy for the loading action is the random selection strategy or the neural-network-based selection strategy, and then, based on that selection strategy, select the corresponding loading action and place the corresponding cargo. Because different loading actions can be selected based on a neural network, the loading actions can be evaluated in a probabilistic, statistical manner; computation is fast, and large-scale logistics loading problems can be solved.

Further, because the neural network can be designed in a function-like form, multi-core parallel computation can be achieved, further increasing computation speed.

Further, because the loading action includes the identification information, the position information, and the orientation information of the cargo to be placed, it can accommodate varying compartments and cargo and suits a variety of scenarios, which improves the generality of the loading method for vehicle-mounted cargo.

Brief Description of the Drawings

Fig. 1 is a flowchart of a loading method for vehicle-mounted cargo provided by an embodiment of the present invention;

Fig. 2 is a flowchart of a neural-network-based method for selecting a loading action provided by an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a loading device for vehicle-mounted cargo provided by an embodiment of the present invention.

Detailed Description

Existing loading methods for vehicle-mounted cargo mainly use heuristic algorithms to enumerate packing schemes exhaustively within a limited space; the loading rate is low and the computation is slow, which makes them unsuitable for large-scale computation. Faster heuristic algorithms, such as particle swarm optimization, solve quickly but produce low-quality solutions. Existing schemes therefore cannot solve large-scale logistics loading problems.

To make the above objects, features, and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Referring to Fig. 1, an embodiment of the present invention provides a loading method for vehicle-mounted cargo, which may include the following steps:

Step S101: generate a random number.

In a specific implementation, the selection strategy for the loading action can be determined based on the random number.

Step S102: based on the random number and a preset probability parameter, determine the selection strategy for the loading action, the selection strategy being either a random selection strategy or a neural-network-based selection strategy.

In a specific implementation, the desired selection strategy for the loading action can be chosen from the random number and the preset probability parameter by a roulette-wheel algorithm or a greedy algorithm. For example, at the start of training the neural network model is still immature, so the random selection strategy can be chosen more often; as the number of executed steps increases, the neural network model is continuously optimized and its accuracy improves, so the neural network model can be chosen more often, which raises the loading rate of the loading actions.

In a specific implementation, the preset probability parameter may be a value in the interval [0, 1] that decreases linearly as the number of executed steps increases. For example, at step 0 the preset probability parameter is 1; at step 1 the preset probability parameter is 0.9; ... at step 10 the preset probability parameter is 0.1.

In a specific implementation, as the number of executed steps increases, when the preset probability parameter is not greater than (that is, less than or equal to) a preset parameter threshold, it may be set to a preset fixed value to preserve the learning efficiency of the neural network. For example, when the preset probability parameter has decreased to 0.1, it may be held at 0.1 to preserve the learning efficiency of the neural network.

It can be understood that the preset parameter threshold and the preset fixed value may be set to the same value or to different values; the embodiments of the present invention place no limitation on this.
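By way of illustration only, the linearly decreasing probability parameter with a threshold and a fixed value might be sketched in Python as follows. The function name `probability_parameter` and the default values (start 1.0, decrement 0.1, threshold 0.1, fixed value 0.1) are assumptions chosen to match the example above; they are not prescribed by the embodiment.

```python
def probability_parameter(step, start=1.0, decrement=0.1,
                          threshold=0.1, fixed_value=0.1):
    """Return the probability parameter for a given execution step.

    All default values are illustrative assumptions.
    """
    # Linear decrease with the number of executed steps.
    value = start - decrement * step
    # Once the value is not greater than the threshold, hold the fixed value.
    if value <= threshold:
        return fixed_value
    return value


# Parameter values for the first twelve steps, rounded for display.
schedule = [round(probability_parameter(k), 2) for k in range(12)]
```

Keeping the threshold and the fixed value as separate arguments reflects the statement above that the two may be equal or different.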

In an embodiment of the present invention, determining the selection strategy for the loading action based on the random number and the preset probability parameter includes: when the random number is less than the preset probability parameter, determining that the selection strategy is the random selection strategy; and when the random number is not less than the preset probability parameter, determining that the selection strategy is the neural-network-based selection strategy.
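The comparison described above might be sketched as follows. The names `choose_strategy`, `draw_strategy`, `RANDOM`, and `NEURAL` are hypothetical, and drawing the random number uniformly from [0, 1) is an assumption; the embodiment does not fix the distribution.

```python
import random

RANDOM = "random"            # random selection strategy
NEURAL = "neural_network"    # neural-network-based selection strategy


def choose_strategy(random_number, probability_parameter):
    """Pick the selection strategy from a random number and the parameter."""
    # Random number less than the parameter: random selection strategy.
    if random_number < probability_parameter:
        return RANDOM
    # Otherwise (not less than): neural-network-based selection strategy.
    return NEURAL


def draw_strategy(probability_parameter, rng=random):
    """Generate the random number (assumed uniform on [0, 1)) and choose."""
    return choose_strategy(rng.random(), probability_parameter)
```

With a parameter of 1 the random strategy is always chosen, and with a parameter of 0 the neural network is always used, which matches the intended shift from exploration to the learned model as the parameter decreases.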

In a specific implementation, the loading action may include: identification information of the cargo to be placed, position information of the cargo to be placed, and orientation information of the cargo to be placed. Based on the loading action, a specific cargo can be selected and placed at its corresponding position.
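One possible representation of such a loading action is sketched below. The class name `LoadingAction`, the two-dimensional position, and the boolean orientation flag (rotated by 90 degrees or not) are assumptions made for this example; the embodiment only requires that identification, position, and orientation be recorded.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LoadingAction:
    cargo_id: str                   # identification information of the cargo
    position: Tuple[float, float]   # (x, y) of one corner on the floor
    rotated: bool                   # orientation: True = turned 90 degrees


def footprint(action: LoadingAction,
              sizes: Dict[str, Tuple[float, float]]):
    """Return (x, y, width, depth) occupied by the cargo on the floor."""
    width, depth = sizes[action.cargo_id]
    if action.rotated:
        # Turning the cargo swaps its width and depth.
        width, depth = depth, width
    x, y = action.position
    return (x, y, width, depth)
```

For instance, `footprint(LoadingAction("box-1", (0.0, 0.0), True), {"box-1": (2.0, 1.0)})` yields a footprint that is 1.0 wide and 2.0 deep at the origin.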

It can be understood that the cargo may also be called by other names, such as box, and the loading action may also be called by other names, such as execution action or packing action; as long as the meaning is the same, these fall within the scope of protection of the present invention.

In a specific implementation, to improve the loading rate, the loading action may satisfy at least one of the following constraints: the cargo to be placed abuts at least one previously placed cargo; the cargo to be placed does not overlap previously placed cargo in the horizontal direction.
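The two constraints might be checked as in the following sketch, which assumes axis-aligned rectangular footprints given as `(x, y, width, depth)` tuples. The function names, the treatment of an empty compartment as always feasible, and the choice to enforce both constraints together (the embodiment requires at least one) are assumptions of this example.

```python
def overlaps(a, b):
    """True if two footprints share interior area in the horizontal plane."""
    ax, ay, aw, ad = a
    bx, by, bw, bd = b
    return ax < bx + bw and bx < ax + aw and ay < by + bd and by < ay + ad


def touches(a, b):
    """True if two non-overlapping footprints share an edge segment."""
    if overlaps(a, b):
        return False
    ax, ay, aw, ad = a
    bx, by, bw, bd = b
    x_shared = min(ax + aw, bx + bw) - max(ax, bx)
    y_shared = min(ay + ad, by + bd) - max(ay, by)
    # Abutting means zero gap along one axis and positive contact on the other.
    return (x_shared == 0 and y_shared > 0) or (y_shared == 0 and x_shared > 0)


def is_feasible(candidate, placed):
    """Check a candidate footprint against the previously placed footprints."""
    if any(overlaps(candidate, p) for p in placed):
        return False
    # Assumption: the first cargo in an empty compartment is always feasible.
    return not placed or any(touches(candidate, p) for p in placed)
```

The exact comparison `== 0` is adequate for integer or grid coordinates; with floating-point coordinates a small tolerance would be needed.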

In a specific implementation, the cargo on the left may also be constrained to be higher than the cargo on the right, or the cargo on the right higher than the cargo on the left, or the cargo on the right as high as the cargo on the left, because, in probabilistic terms, there is always one case in which such a constraint can be satisfied.

Because the loading action includes the identification information, the position information, and the orientation information of the cargo to be placed, it can accommodate varying compartments and cargo and suits a variety of scenarios, which improves the generality of the loading method for vehicle-mounted cargo.

In a specific implementation, when the selection strategy for the loading action is the neural-network-based selection strategy, selecting the corresponding loading action based on the selection strategy may include: calculating, based on a back propagation (BP) neural network, the BP neural network output value corresponding to each candidate loading action; and selecting the loading action with the largest BP neural network output value as the corresponding loading action.

In an embodiment of the present invention, the output value of the BP neural network is used to evaluate the loading rate index of the loading action set in the state set and is defined as the feedback value $Q(s, a; \theta_i)$ of the loading action, where $a$ is the loading action set, $s$ is the current state set, which includes the size of the enclosed space of the vehicle compartment, the available placement positions, and the identification information of the cargo to be placed, and $\theta_i$ is the network parameter of the BP neural network.
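Selecting the loading action with the largest output value might be sketched as follows. Here `q_function` is a hypothetical callable standing in for the BP neural network output $Q(s, a; \theta_i)$; its signature is an assumption of this example.

```python
def select_best_action(state, candidate_actions, q_function):
    """Return the candidate loading action with the largest Q output value."""
    if not candidate_actions:
        raise ValueError("no feasible loading action")
    # Evaluate every candidate with the network and keep the maximum.
    return max(candidate_actions, key=lambda action: q_function(state, action))


# Toy stand-in for the network: a fixed score per action (illustrative only).
toy_scores = {"place_A": 0.2, "place_B": 0.7, "place_C": 0.5}
best = select_best_action(None, list(toy_scores), lambda s, a: toy_scores[a])
```

Because each candidate is scored independently, the evaluations could in principle be distributed across several cores.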

Traditional reinforcement learning algorithms store the action-value function in a table, and tabular storage is only suitable when the action values are discrete (uncorrelated), so such algorithms are not suitable for large-scale computation. A BP neural network can instead evaluate the executed action in a probabilistic, statistical manner, that is, estimate the feedback value of the executed action, which suits large-scale computation.

Because the neural network can be designed in a function-like form, multi-core parallel computation can be achieved, further increasing computation speed.

In a specific implementation, the loading rate (packing rate) may be: the area (volume) occupied by the loaded cargo divided by the total area (volume) of the compartment floor. The higher the loading rate, the larger the output value of the BP neural network.

In an embodiment of the present invention, the output value of the BP neural network is the product of the instantaneous loading rate and a preset first proportionality coefficient.
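The area-based loading rate and its scaling by the first proportionality coefficient might be computed as in this sketch. The function names and the default coefficient of 1.0 are assumptions; the embodiment does not give a value for the coefficient.

```python
def loading_rate(placed_sizes, floor_width, floor_depth):
    """Area occupied by the loaded cargo divided by the total floor area."""
    used = sum(width * depth for width, depth in placed_sizes)
    return used / (floor_width * floor_depth)


def scaled_loading_rate(placed_sizes, floor_width, floor_depth,
                        first_coefficient=1.0):
    """Instantaneous loading rate times the first proportionality coefficient."""
    return first_coefficient * loading_rate(placed_sizes, floor_width,
                                            floor_depth)
```

For a 4 by 2 floor holding one 2 by 1 box and one 1 by 1 box, the occupied area is 3 out of 8, so the loading rate is 0.375. The volume-based variant follows the same pattern with a third dimension.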

Step S103: based on the selection strategy for the loading action, select the corresponding loading action and place the corresponding cargo.

In a specific implementation, after the corresponding loading action has been selected and the corresponding cargo placed, the loading action can be stored in a training set for training the neural network.

In an embodiment of the present invention, after selecting the corresponding loading action and placing the corresponding cargo, the method further includes: executing the loading action and storing the loading action in the training set; and updating the BP neural network parameters based on the training set.

In a specific implementation, the stored data used to train the BP neural network may be drawn at random from the training set according to a normal distribution, according to a uniform distribution, or according to another distribution; details are not repeated here.
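A training set with uniform random sampling might be sketched as follows. The class name `TrainingSet`, the bounded capacity, and the stored tuple layout `(state, action, reward, next_state, done)` are assumptions of this example; the embodiment only states that loading actions are stored and later sampled.

```python
import random
from collections import deque


class TrainingSet:
    """Bounded store of executed loading actions (illustrative sketch)."""

    def __init__(self, capacity=10000):
        # Assumption: the oldest entries are discarded once capacity is hit.
        self._items = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self._items.append((state, action, reward, next_state, done))

    def sample_uniform(self, n, rng=random):
        """Draw up to n stored entries uniformly at random, no repeats."""
        items = list(self._items)
        return rng.sample(items, min(n, len(items)))

    def most_recent(self, n):
        """Return the n most recently stored entries, oldest first."""
        return list(self._items)[-n:] if n > 0 else []

    def __len__(self):
        return len(self._items)
```

Uniform sampling is shown because it needs no extra parameters; sampling under a normal or other distribution would replace only `sample_uniform`.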

In a specific implementation, to ensure the stability and convergence of BP neural network training, a separate target neural network may be defined, and the BP neural network may be trained by periodically updating the target neural network.

In an embodiment of the present invention, updating the BP neural network parameters based on the training set includes: extracting samples from the training set; and, based on the samples, calculating the output value of the target neural network as:

$$y_i = r + \gamma \max_{a'} Q(s', a'; \theta_i^-)$$

where $r$ is the instantaneous feedback value of executing the loading action; $a$ is the loading action set; $a'$ is the future loading action set; $s$ is the current state set, which includes the size of the enclosed space of the compartment, the available placement positions, and the identification information of the cargo to be placed; $s'$ is the state set updated after the loading action; $\theta_i^-$ is the network parameter of the target neural network; $\gamma$ is a preset second proportionality coefficient; and $Q(s', a'; \theta_i^-)$ is used to evaluate the loading rate index of different actions in different states. The mean square error between the output value of the target neural network and the output value of the BP neural network is then calculated as:

$$L_i(\theta_i) = \mathbb{E}\left[\left(y_i - Q(s, a; \theta_i)\right)^2\right]$$

where $L_i(\theta_i)$ is the mean square error between the output value of the target neural network and the output value of the BP neural network. The parameters of the BP neural network are updated on the principle of minimizing the mean square error.
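The target value and the mean square error might be computed as in the following sketch. The callable `target_q` is a hypothetical stand-in for the target neural network, the default `gamma=0.9` is an assumed value for the second proportionality coefficient, and returning the bare reward when no further action is available is an assumption of this example.

```python
def target_value(reward, next_state, next_actions, target_q, gamma=0.9):
    """Instantaneous feedback plus the discounted best future Q value."""
    if not next_actions:
        # Assumption: with no further action available, only r remains.
        return reward
    best_next = max(target_q(next_state, a) for a in next_actions)
    return reward + gamma * best_next


def mean_squared_error(targets, predictions):
    """Average squared difference between target and network outputs."""
    pairs = list(zip(targets, predictions))
    return sum((y - q) ** 2 for y, q in pairs) / len(pairs)
```

As a worked case: with a reward of 0.5, a coefficient of 0.5, and future values of 1.0 and 2.0, the target is 0.5 + 0.5 × 2.0 = 1.5. Comparing targets [1.5, 1.0] with predictions [1.0, 1.0] gives a mean square error of (0.25 + 0) / 2 = 0.125.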

In a specific implementation, the N most recently executed samples in the training set, that is, the N most recently executed loading actions, may be extracted, where N is a positive integer; other samples may also be extracted, and the embodiments of the present invention place no limitation on this.

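In a specific implementation, updating the parameters of the BP neural network on the principle of minimizing the mean square error further includes:

calculating the partial derivative with respect to each neural network parameter using the stochastic gradient descent (SGD) algorithm, as follows:

$$\nabla_{\theta_i} L_i(\theta_i) = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)\nabla_{\theta_i} Q(s, a; \theta_i)\right]$$

where $\nabla_{\theta_i} L_i(\theta_i)$ is the partial derivative of the mean square error with respect to $\theta_i$, and $\nabla_{\theta_i} Q(s, a; \theta_i)$ is the partial derivative of the BP neural network output with respect to the neural network parameters.

The parameters of the BP neural network are then calculated and updated based on the partial derivative.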
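A single stochastic gradient step might look like the sketch below. For brevity it substitutes a linear model $Q = \theta \cdot \phi(s, a)$ for the BP neural network, so that the partial derivative of Q with respect to each parameter is simply the matching feature; this substitution, the feature vector, and the learning rate are all assumptions of this example.

```python
def q_value(theta, features):
    """Linear stand-in for the network: Q = theta . features."""
    return sum(t * x for t, x in zip(theta, features))


def sgd_step(theta, features, target, learning_rate=0.1):
    """One stochastic gradient descent step on (target - Q)^2."""
    error = target - q_value(theta, features)
    # dQ/dtheta_k equals features[k] for a linear model; the constant
    # factor 2 from the squared error is absorbed into the learning rate.
    return [t + learning_rate * error * x for t, x in zip(theta, features)]


theta = sgd_step([0.0, 0.0], [1.0, 2.0], target=1.0)
```

Starting from zero parameters with features (1, 2) and target 1, the prediction moves from 0 to 0.5, halving the error. In a multi-layer BP network the same update applies, with the derivative of Q obtained by back propagation.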

In an embodiment of the present invention, to improve the validity of the target neural network, the loading method for vehicle-mounted cargo may further include: updating the parameters of the target neural network based on the parameters of the BP neural network.
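One common way to realize this update is to copy the BP neural network parameters into the target neural network at fixed intervals, as sketched below. The function name `sync_target` and the period of 100 steps are assumptions; the embodiment does not specify the update interval or whether the copy is full or partial.

```python
import copy


def sync_target(online_params, target_params, step, period=100):
    """Return the target parameters to use after the given step."""
    if step % period == 0:
        # Deep copy so later training of the BP network does not
        # silently alter the target network.
        return copy.deepcopy(online_params)
    return target_params


online = [1.0, 2.0]
target = sync_target(online, [0.0, 0.0], step=100)
```

Between synchronizations the target parameters stay fixed, which keeps the target values stable while the BP neural network is being trained.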

In a specific implementation, updating the neural network with the Q-learning method is essentially an approximate dynamic programming approach to optimization.

In an embodiment of the present invention, after each selected loading action has been evaluated, it is judged whether the iterative process has ended. When the iterative process has ended, the loading action selected this time is taken as the final value. When the iterative process has not ended, the preprocessed next state set is input into the target neural network, the maximum feedback value Q is obtained using the target neural network parameters, and the feedback value of the current action is then calculated according to the Bellman equation.

With the above scheme, the selection strategy for the loading action is determined, based on the random number and the preset probability parameter, to be either the random selection strategy or the neural-network-based selection strategy; then, based on that selection strategy, the corresponding loading action is selected and the corresponding cargo is placed. Because different loading actions can be selected based on a neural network, the loading actions can be evaluated in a probabilistic, statistical manner; computation is fast, and large-scale logistics loading problems can be solved.

To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides a neural-network-based method for selecting a loading action, which may include the following steps:

Step S201: initialize variables.

In a specific implementation, the variable initialization includes: defining the feedback value of the loading action, and setting up the state set and the loading action set.

Step S202: calculate the feedback value of the BP neural network, and select the loading action with the largest feedback value to place the cargo.

In a specific implementation, the output value of the BP neural network is defined as the feedback value $Q(s, a; \theta_i)$ of the loading action, where $a$ is the loading action set, $s$ is the current state set, which includes the size of the enclosed space of the vehicle compartment, the available placement positions, and the identification information of the cargo to be placed, and $\theta_i$ is the network parameter of the BP neural network.

Step S203: store the loading action in an experience set.

Step S204: choose N loading actions from the experience set, and calculate the feedback value of the target neural network.

Step S205: based on the feedback value of the target neural network, train the BP neural network and update its parameters.

Step S206: update the parameters of the target neural network based on the parameters of the BP neural network.
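Steps S201 to S206 might be combined as in the following toy sketch. It is a deliberately simplified illustration, not the embodiment itself: the compartment is one-dimensional, a linear model replaces the BP neural network, and every name and constant (`CAPACITY`, `BOXES`, the feature vector, learning rate, batch size, and schedules) is an assumption made for this example.

```python
import random

CAPACITY = 10                                      # hypothetical length
BOXES = {"A": 6, "B": 4, "C": 3, "D": 3, "E": 2}   # hypothetical cargo


def feasible(state):
    remaining, left = state
    return [b for b in left if BOXES[b] <= remaining]


def features(state, action):
    size = BOXES[action]
    return [1.0, size / CAPACITY, (state[0] - size) / CAPACITY]


def q(theta, state, action):
    return sum(t * x for t, x in zip(theta, features(state, action)))


def step(state, action):
    remaining, left = state
    size = BOXES[action]
    next_state = (remaining - size, tuple(b for b in left if b != action))
    return next_state, size / CAPACITY     # reward: gain in loading rate


def train(episodes=200, gamma=0.9, lr=0.05, batch=8, sync_every=20, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]                # S201: initialize variables
    target_theta = list(theta)
    experience, steps = [], 0
    for _ in range(episodes):
        state = (CAPACITY, tuple(BOXES))
        while feasible(state):
            actions = feasible(state)
            epsilon = max(0.1, 1.0 - 0.001 * steps)
            if rng.random() < epsilon:     # random selection strategy
                action = rng.choice(actions)
            else:                          # S202: largest feedback value
                action = max(actions, key=lambda a: q(theta, state, a))
            next_state, reward = step(state, action)
            experience.append((state, action, reward, next_state))   # S203
            picked = rng.sample(experience, min(batch, len(experience)))
            for s, a, r, s2 in picked:     # S204: target feedback values
                nxt = feasible(s2)
                future = max(q(target_theta, s2, b) for b in nxt) if nxt else 0.0
                error = r + gamma * future - q(theta, s, a)
                theta = [t + lr * error * x                          # S205
                         for t, x in zip(theta, features(s, a))]
            steps += 1
            if steps % sync_every == 0:    # S206: refresh target network
                target_theta = list(theta)
            state = next_state
    return theta


def greedy_load(theta):
    state, loaded = (CAPACITY, tuple(BOXES)), []
    while feasible(state):
        action = max(feasible(state), key=lambda a: q(theta, state, a))
        loaded.append(action)
        state, _ = step(state, action)
    return loaded, 1.0 - state[0] / CAPACITY
```

Calling `greedy_load(train())` returns the sequence of boxes chosen by the trained model and the resulting loading rate. The sketch makes no claim about the quality of the learned policy; it only shows how the six steps fit together.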

To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention further provides a loading device for vehicle-mounted cargo capable of implementing the above method, as shown in Fig. 3.

Referring to Fig. 3, the loading device 30 for vehicle-mounted cargo may include a generation unit 31, a determination unit 32, and a placement unit 33, where:

The generation unit 31 is adapted to generate a random number.

The determination unit 32 is adapted to determine, based on the random number and a preset probability parameter, the selection strategy for the loading action, the selection strategy being either a random selection strategy or a neural-network-based selection strategy.

The placement unit 33 is adapted to select, based on the selection strategy for the loading action, the corresponding loading action and place the corresponding cargo.
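The cooperation of the three units might be sketched as a single class, one method per unit. The class name `LoadingDevice`, the method names, and the `q_function` callable standing in for the neural network are hypothetical and serve only to show how units 31, 32, and 33 hand data to one another.

```python
import random


class LoadingDevice:
    """Illustrative composition of units 31, 32 and 33."""

    def __init__(self, probability_parameter, q_function, rng=None):
        self.probability_parameter = probability_parameter
        self.q_function = q_function        # stand-in for the neural network
        self.rng = rng or random.Random()

    def generate(self):
        """Generation unit 31: produce a random number in [0, 1)."""
        return self.rng.random()

    def determine(self, random_number):
        """Determination unit 32: choose the selection strategy."""
        if random_number < self.probability_parameter:
            return "random"
        return "neural_network"

    def place(self, strategy, state, candidate_actions):
        """Placement unit 33: select the loading action to carry out."""
        if strategy == "random":
            return self.rng.choice(candidate_actions)
        return max(candidate_actions,
                   key=lambda action: self.q_function(state, action))

    def load_one(self, state, candidate_actions):
        strategy = self.determine(self.generate())
        return self.place(strategy, state, candidate_actions)
```

With a probability parameter of 0, `load_one` always defers to `q_function`; with a parameter of 1, it always picks a random candidate.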

In an embodiment of the present invention, the preset probability parameter decreases linearly as the number of executed steps increases.

In an embodiment of the present invention, when the preset probability parameter is not greater than a preset parameter threshold, it is set to a preset fixed value.

In a specific implementation, the determination unit 32 includes a first determination subunit 321 and a second determination subunit 322, where:

The first determination subunit 321 is adapted to determine that the selection strategy for the loading action is the random selection strategy when the random number is less than the probability parameter.

The second determination subunit 322 is adapted to determine that the selection strategy for the loading action is the neural-network-based selection strategy when the random number is not less than the probability parameter.

In a specific implementation, the loading action includes: identification information of the cargo to be placed, position information of the cargo to be placed, and orientation information of the cargo to be placed.

In a specific implementation, the loading action satisfies at least one of the following constraints: the cargo to be placed abuts at least one previously placed cargo; the cargo to be placed does not overlap previously placed cargo in the horizontal direction.

In a specific implementation, when the selection strategy of the loading action is the neural-network-based selection strategy, the placement unit may include: a first calculation subunit (not shown in Fig. 3), a selection subunit (not shown in Fig. 3), and a placement subunit (not shown in Fig. 3), wherein:

The first calculation subunit is adapted to calculate, based on a BP neural network, the BP neural network output values corresponding to different loading actions.

The selection subunit is adapted to select the loading action with the largest BP neural network output value as the corresponding loading action.

The placement subunit is adapted to place the corresponding cargo based on the corresponding loading action.
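The selection of the loading action with the largest network output can be sketched as follows. The callable `q_function` stands in for the BP neural network and is an assumption; any function that maps a state and a candidate action to a score would fit.

```python
def select_best_action(state, candidate_actions, q_function):
    """Return the candidate action with the largest network output value.

    q_function(state, action) is a hypothetical stand-in for evaluating
    the BP neural network on one state-action pair.
    """
    return max(candidate_actions,
               key=lambda action: q_function(state, action))


# Usage with a toy scoring function instead of a trained network.
toy_q = lambda state, action: -abs(action - 3)
best = select_best_action(None, [1, 2, 3, 4], toy_q)
```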

In a specific implementation, the output value of the BP neural network is $Q(s, a; \theta_i)$, which is used to evaluate the loading-rate index of the loading action set in the state set, where a is the loading action set, s is the current state set, which includes the enclosed space size of the compartment, the optional placement positions, and the identification information of the boxes to be placed, and $\theta_i$ is the network parameter of the BP neural network.

In a specific implementation, after the placement unit selects the corresponding loading action and places the corresponding cargo, the loading device 30 for vehicle-mounted cargo may further include: an execution unit (not shown in Fig. 3) and a training unit (not shown in Fig. 3), wherein:

The execution unit is adapted to execute the loading action and to store the loading action in a training set.

The training unit is adapted to update the parameters of the BP neural network based on the training set.
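A training set of this kind can be sketched as a bounded buffer from which samples are drawn at random. The class name, the capacity, and the choice to store each executed loading action together with its state, feedback value and resulting state are assumptions; the embodiment does not fix the sample layout.

```python
import random
from collections import deque


class TrainingSet:
    """Bounded store of executed loading actions (hypothetical layout)."""

    def __init__(self, capacity=10000):
        # The oldest samples are discarded once capacity is reached.
        self._samples = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self._samples.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Never request more samples than are currently stored.
        batch_size = min(batch_size, len(self._samples))
        return rng.sample(list(self._samples), batch_size)

    def __len__(self):
        return len(self._samples)
```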

In a specific implementation, the training unit may include: an extraction subunit (not shown in Fig. 3), a second calculation subunit (not shown in Fig. 3), a third calculation subunit (not shown in Fig. 3), and an update subunit (not shown in Fig. 3), wherein:

The extraction subunit is adapted to extract samples from the training set.

The second calculation subunit is adapted to calculate, based on the sample, the output value of the target neural network as:

$$r + \gamma \max_{a'} Q(s', a'; \theta_i^-)$$

where r is the immediate feedback value for executing the loading action, a is the loading action set, a' is the future loading action set, s is the current state set, which includes the enclosed space size of the compartment, the optional placement positions, and the identification information of the boxes to be placed, s' is the state set after the loading action has been applied, $\theta_i^-$ is the network parameter of the target neural network, $\gamma$ is a preset second proportionality coefficient, and $Q(s', a'; \theta_i^-)$ is used to evaluate the loading-rate index of different a in different s.

The third calculation subunit is adapted to calculate the mean square error between the output value of the target neural network and the output value of the BP neural network as:

$$L_i(\theta_i) = \mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)^2\right]$$

where $L_i(\theta_i)$ is the mean square error between the output value of the target neural network and the output value of the BP neural network.

The update subunit is adapted to update the parameters of the BP neural network based on the principle of minimizing the mean square error.
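The target value and the mean square error can be sketched as follows, assuming the usual deep Q-learning form in which the target is the feedback value plus the coefficient γ times the largest target-network output over the future loading actions. The callables `q` and `target_q`, the sample layout and the default `gamma` are assumptions made for illustration.

```python
def target_value(reward, next_state, next_actions, target_q, gamma=0.9):
    """Feedback value plus gamma times the best target-network output."""
    if not next_actions:
        # Assumption: with no further loading action, only r remains.
        return reward
    best_next = max(target_q(next_state, a) for a in next_actions)
    return reward + gamma * best_next


def mean_square_error(samples, q, target_q, gamma=0.9):
    """Average squared gap between target values and network outputs.

    Each sample is (state, action, reward, next_state, next_actions).
    """
    squared_errors = []
    for state, action, reward, next_state, next_actions in samples:
        y = target_value(reward, next_state, next_actions, target_q, gamma)
        squared_errors.append((y - q(state, action)) ** 2)
    return sum(squared_errors) / len(squared_errors)
```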

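In a specific implementation, the update subunit includes a calculation module and an update module, wherein:

The calculation module is adapted to calculate the partial derivative with respect to each neural network parameter using a stochastic gradient descent algorithm, as follows:

$$\nabla_{\theta_i} L_i(\theta_i) = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right) \nabla_{\theta_i} Q(s, a; \theta_i)\right]$$

where $\nabla_{\theta_i}$ denotes the partial derivative with respect to $\theta_i$, and $\nabla_{\theta_i} L_i(\theta_i)$ is the partial derivative of the mean square error with respect to the neural network parameters.

The update module is adapted to calculate and update the parameters of the BP neural network based on the partial derivative.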
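A single stochastic gradient descent step on the squared error can be sketched as follows. To keep the example self-contained, a linear function of a feature vector stands in for the BP neural network; this substitution, the function name and the learning rate are assumptions and not the network described in the embodiment.

```python
def sgd_step(theta, features, target, learning_rate=0.1):
    """One gradient step on (target - Q)^2 for a linear stand-in Q.

    Q is modeled as the dot product of theta and features, which replaces
    the BP neural network purely for illustration.
    """
    prediction = sum(t * f for t, f in zip(theta, features))
    error = target - prediction
    # Partial derivative of (target - prediction)^2 w.r.t. theta[k]
    # is -2 * error * features[k].
    gradient = [-2.0 * error * f for f in features]
    # Move each parameter against its partial derivative.
    return [t - learning_rate * g for t, g in zip(theta, gradient)]
```

In a multi-layer network the same partial derivatives would be obtained by back-propagation rather than written out by hand.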

In a specific implementation, the loading device 30 for vehicle-mounted cargo further includes: an updating unit (not shown in Fig. 3), adapted to update the parameters of the target neural network based on the parameters of the BP neural network.
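One common way to update the target network from the BP neural network is to copy the parameters at a fixed interval. The sketch below assumes that scheme; the interval, the function name and the representation of parameters as a plain Python list are all hypothetical, since the embodiment does not state how or how often the update is performed.

```python
import copy


def maybe_sync_target(step, online_parameters, target_parameters,
                      sync_interval=100):
    """Copy the online parameters into the target every sync_interval steps."""
    if step % sync_interval == 0:
        # Deep copy so later updates to the online network
        # do not silently change the target network.
        return copy.deepcopy(online_parameters)
    return target_parameters
```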

In a specific implementation, for the workflow and principle of the loading device 30 for vehicle-mounted cargo, reference may be made to the description of the method provided in the above embodiments; details are not repeated here.

An embodiment of the present invention provides a computer-readable storage medium, which is a non-volatile storage medium or a non-transitory storage medium and on which computer instructions are stored; when run, the computer instructions perform the steps of any one of the methods described above, which are not repeated here.

An embodiment of the present invention provides a loading device for vehicle-mounted cargo, including a memory and a processor, the memory storing computer instructions that can be run on the processor; when running the computer instructions, the processor performs the steps of any one of the methods described above, which are not repeated here.

A person of ordinary skill in the art will understand that all or part of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, a magnetic disk, an optical disc, or the like.

Although the present invention is disclosed as above, the present invention is not limited thereto. Any person skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention; therefore, the protection scope of the present invention shall be subject to the scope defined by the claims.

Claims

1. A loading method for vehicle-mounted cargo, characterized by comprising:

generating a random number;

determining a selection strategy for a loading action based on the random number and a preset probability parameter, the selection strategy of the loading action being either of the following: a random selection strategy or a neural-network-based selection strategy;

selecting the corresponding loading action based on the selection strategy of the loading action, and placing the corresponding cargo.

2. The loading method for vehicle-mounted cargo according to claim 1, characterized in that the preset probability parameter decreases linearly as the number of executed steps increases.

3. The loading method for vehicle-mounted cargo according to claim 2, characterized in that, when the preset probability parameter is not greater than a preset parameter threshold, it is set to a preset fixed value.

4. The loading method for vehicle-mounted cargo according to claim 1, characterized in that determining the selection strategy of the loading action based on the random number and the preset probability parameter comprises:

when the random number is less than the preset probability parameter, determining that the selection strategy of the loading action is the random selection strategy;

when the random number is not less than the preset probability parameter, determining that the selection strategy of the loading action is the neural-network-based selection strategy.

5. The loading method for vehicle-mounted cargo according to claim 1, characterized in that the loading action includes: identification information of the cargo to be placed, position information of the cargo to be placed, and orientation information of the cargo to be placed.

6. The loading method for vehicle-mounted cargo according to claim 5, characterized in that the loading action satisfies at least one of the following constraints: the cargo to be placed abuts at least one previously placed item of cargo; the cargo to be placed does not overlap previously placed cargo in the horizontal direction.

7. The loading method for vehicle-mounted cargo according to claim 1, characterized in that, when the selection strategy of the loading action is the neural-network-based selection strategy, selecting the corresponding loading action based on the selection strategy of the loading action comprises:

calculating, based on a BP neural network, the BP neural network output values corresponding to different loading actions;

selecting the loading action with the largest BP neural network output value as the corresponding loading action.

8. The loading method for vehicle-mounted cargo according to claim 7, characterized in that the output value of the BP neural network is $Q(s, a; \theta_i)$, which is used to evaluate the loading-rate index of the loading action set in the state set;

where a is the loading action set, s is the current state set, which includes the enclosed space size of the vehicle compartment, the optional placement positions, and the identification information of the cargo to be placed, and $\theta_i$ is the network parameter of the BP neural network.

9. The loading method for vehicle-mounted cargo according to claim 8, characterized in that, after selecting the corresponding loading action and placing the corresponding cargo, the method further comprises:

executing the loading action, and storing the loading action in a training set;

updating the parameters of the BP neural network based on the training set.

10. The loading method for vehicle-mounted cargo according to claim 9, characterized in that updating the parameters of the BP neural network based on the training set comprises:

extracting samples from the training set;

calculating, based on the sample, the output value of the target neural network as:

$$r + \gamma \max_{a'} Q(s', a'; \theta_i^-)$$

where r is the immediate feedback value for executing the loading action, a is the loading action set, a' is the future loading action set, s is the current state set, which includes the enclosed space size of the compartment, the optional placement positions, and the identification information of the cargo to be placed, s' is the state set after the loading action has been applied, $\theta_i^-$ is the network parameter of the target neural network, $\gamma$ is a preset second proportionality coefficient, and $Q(s', a'; \theta_i^-)$ is used to evaluate the loading-rate index of different a in different s;

calculating the mean square error between the output value of the target neural network and the output value of the BP neural network as:

$$L_i(\theta_i) = \mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)^2\right]$$

where $L_i(\theta_i)$ is the mean square error between the output value of the target neural network and the output value of the BP neural network;

updating the parameters of the BP neural network based on the principle of minimizing the mean square error.

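11. The loading method for vehicle-mounted cargo according to claim 10, characterized in that updating the parameters of the BP neural network based on the principle of minimizing the mean square error further comprises:

calculating the partial derivative with respect to each neural network parameter using a stochastic gradient descent algorithm, as follows:

$$\nabla_{\theta_i} L_i(\theta_i) = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right) \nabla_{\theta_i} Q(s, a; \theta_i)\right]$$

where $\nabla_{\theta_i}$ denotes the partial derivative with respect to $\theta_i$, and $\nabla_{\theta_i} L_i(\theta_i)$ is the partial derivative of the mean square error with respect to the neural network parameters;

calculating and updating the parameters of the BP neural network based on the partial derivative.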

12. The loading method for vehicle-mounted cargo according to claim 10, characterized by further comprising:

updating the parameters of the target neural network based on the parameters of the BP neural network.

13. A loading device for vehicle-mounted cargo, characterized by comprising:

a generation unit, adapted to generate a random number;

a determination unit, adapted to determine a selection strategy for a loading action based on the random number and a preset probability parameter, the selection strategy of the loading action being either of the following: a random selection strategy or a neural-network-based selection strategy;

a placement unit, adapted to select the corresponding loading action based on the selection strategy of the loading action and to place the corresponding cargo.

14. The loading device for vehicle-mounted cargo according to claim 13, characterized in that the preset probability parameter decreases linearly as the number of executed steps increases.

15. The loading device for vehicle-mounted cargo according to claim 14, characterized in that, when the preset probability parameter is not greater than a preset parameter threshold, it is set to a preset fixed value.

16. The loading device for vehicle-mounted cargo according to claim 13, characterized in that the determination unit comprises:

a first determination subunit, adapted to determine, when the random number is less than the probability parameter, that the selection strategy of the loading action is the random selection strategy;

a second determination subunit, adapted to determine, when the random number is not less than the probability parameter, that the selection strategy of the loading action is the neural-network-based selection strategy.

17. The loading device for vehicle-mounted cargo according to claim 13, characterized in that the loading action includes: identification information of the cargo to be placed, position information of the cargo to be placed, and orientation information of the cargo to be placed.

18. The loading device for vehicle-mounted cargo according to claim 17, characterized in that the loading action satisfies at least one of the following constraints: the cargo to be placed abuts at least one previously placed item of cargo; the cargo to be placed does not overlap previously placed cargo in the horizontal direction.

19. The loading device for vehicle-mounted cargo according to claim 13, characterized in that, when the selection strategy of the loading action is the neural-network-based selection strategy, the placement unit comprises:

a first calculation subunit, adapted to calculate, based on a BP neural network, the BP neural network output values corresponding to different loading actions;

a selection subunit, adapted to select the loading action with the largest BP neural network output value as the corresponding loading action;

a placement subunit, adapted to place the corresponding cargo based on the corresponding loading action.

20. The loading device for vehicle-mounted cargo according to claim 19, characterized in that the output value of the BP neural network is $Q(s, a; \theta_i)$, which is used to evaluate the loading-rate index of the loading action set in the state set;

where a is the loading action set, s is the current state set, which includes the enclosed space size of the compartment, the optional placement positions, and the identification information of the boxes to be placed, and $\theta_i$ is the network parameter of the BP neural network.

21. The loading device for vehicle-mounted cargo according to claim 20, characterized in that, for use after the placement unit selects the corresponding loading action and places the corresponding cargo, the device further comprises:

an execution unit, adapted to execute the loading action and to store the loading action in a training set;

a training unit, adapted to update the parameters of the BP neural network based on the training set.

22. The loading device for vehicle-mounted cargo according to claim 21, characterized in that the training unit comprises:

an extraction subunit, adapted to extract samples from the training set;

a second calculation subunit, adapted to calculate, based on the sample, the output value of the target neural network as:

$$r + \gamma \max_{a'} Q(s', a'; \theta_i^-)$$

where r is the immediate feedback value for executing the loading action, a is the loading action set, a' is the future loading action set, s is the current state set, which includes the enclosed space size of the compartment, the optional placement positions, and the identification information of the boxes to be placed, s' is the state set after the loading action has been applied, $\theta_i^-$ is the network parameter of the target neural network, $\gamma$ is a preset second proportionality coefficient, and $Q(s', a'; \theta_i^-)$ is used to evaluate the loading-rate index of different a in different s;

a third calculation subunit, adapted to calculate the mean square error between the output value of the target neural network and the output value of the BP neural network as:

$$L_i(\theta_i) = \mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)^2\right]$$

an update subunit, adapted to update the parameters of the BP neural network based on the principle of minimizing the mean square error.

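23. The loading device for vehicle-mounted cargo according to claim 22, characterized in that the update subunit comprises:

a calculation module, adapted to calculate the partial derivative with respect to each neural network parameter using a stochastic gradient descent algorithm, as follows:

$$\nabla_{\theta_i} L_i(\theta_i) = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right) \nabla_{\theta_i} Q(s, a; \theta_i)\right]$$

where $\nabla_{\theta_i}$ denotes the partial derivative with respect to $\theta_i$, and $\nabla_{\theta_i} L_i(\theta_i)$ is the partial derivative of the mean square error with respect to the neural network parameters;

an update module, adapted to calculate and update the parameters of the BP neural network based on the partial derivative.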

24. The loading device for vehicle-mounted cargo according to claim 22, characterized by further comprising:

an updating unit, adapted to update the parameters of the target neural network based on the parameters of the BP neural network.

25. A computer-readable storage medium, being a non-volatile storage medium or a non-transitory storage medium, on which computer instructions are stored, characterized in that the computer instructions, when run, perform the steps of the method according to any one of claims 1 to 12.

26. A loading device for vehicle-mounted cargo, including a memory and a processor, the memory storing computer instructions that can be run on the processor, characterized in that the processor, when running the computer instructions, performs the steps of the method according to any one of claims 1 to 12.