CN115907608A - Warehouse logistics item analysis method and system, storage medium and computer equipment - Google Patents
- Publication number
- CN115907608A; CN202211428250.6A
- Authority
- CN
- China
- Prior art keywords
- warehouse logistics
- item
- decision tree
- tree model
- warehouse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses an analysis method and system for a warehouse logistics item, a storage medium, and computer equipment. The analysis method comprises: acquiring first item data of a warehouse logistics item to be analyzed; preprocessing the first item data to obtain a plurality of pieces of feature information and generate a test set; and inputting the test set into a decision tree model and obtaining a predicted operation index of the warehouse logistics item from the decision tree model, wherein the decision tree model is established by training on second item data of a plurality of historical warehouse logistics items. By implementing the technical scheme of the invention, the predicted operation index of the warehouse logistics item to be analyzed can be obtained quickly by classification, with strong universality and accurate results.
Description
Technical Field
The invention relates to the technical field of computer information processing, and in particular to an analysis method and system for a warehouse logistics item, a storage medium, and computer equipment.
Background
Before an AGV (Automated Guided Vehicle) company designs and develops a warehouse logistics item for a customer, it needs to roughly determine the operation index of that item. Traditionally, the AGV company simulates the warehouse logistics item and determines the operation index from the simulation result. The AGV company therefore has to invest a lot of manpower in developing the simulation system and spend a lot of time running simulations. In practice, the businesses of customers in different industries differ, while most simulation systems are general-purpose and not tailored to a specific customer. This approach therefore has the following problems:
1. the simulation period is long and the efficiency is low;
2. the simulation has poor universality: it cannot be adapted in time to the customer's scheme design and business mode, and a large amount of manpower has to be arranged for development and testing;
3. the simulation result is inaccurate and not comparable.
Disclosure of Invention
The invention aims to solve the technical problems of long simulation period, poor simulation universality and inaccurate simulation result in the prior art, and provides an analysis method, a system, a storage medium and computer equipment for warehouse logistics items.
The technical scheme adopted by the invention to solve the technical problems is to construct an analysis method for a warehouse logistics item, comprising:
acquiring first item data of a warehouse logistics item to be analyzed;
preprocessing the first project data to acquire a plurality of characteristic information and generate a test set;
inputting the test set into a decision tree model, and obtaining the predicted operation index of the warehouse logistics item according to the decision tree model, wherein the decision tree model is established by training second item data of a plurality of historical warehouse logistics items.
Preferably, after the step of inputting the test set into a decision tree model, the method further includes:
and obtaining a warehousing configuration scheme of the warehousing logistics items of the same type as the warehousing logistics items to be analyzed.
Preferably, the warehouse configuration scheme for obtaining warehouse logistics items of the same type as the warehouse logistics items to be analyzed comprises:
determining a path corresponding to the warehouse logistics item to be analyzed in the decision tree model;
judging whether the path corresponding to each historical warehouse logistics item in the decision tree model is consistent with the path corresponding to the warehouse logistics item to be analyzed;
determining historical warehouse logistics items which are consistent with paths corresponding to the warehouse logistics items to be analyzed as warehouse logistics items of the same type as the warehouse logistics items to be analyzed;
and acquiring a warehouse configuration scheme of warehouse logistics items of the same type as the warehouse logistics items to be analyzed.
Preferably, the preprocessing the first item data includes:
cleaning the first project data;
fusing the cleaned first project data;
performing characterization processing on the first item data subjected to the fusion processing, wherein the characterization processing comprises the following steps: mean removal, range scaling, normalization, one-hot encoding.
Preferably, the decision tree model is built by:
acquiring second item data of a plurality of historical warehouse logistics items;
preprocessing the second item data to acquire a plurality of characteristic information and generate a training set;
and determining a root node, an internal node and a leaf node of the decision tree model according to the training set to obtain the decision tree model.
Preferably, determining a root node and an internal node of the decision tree model according to the training set includes:
determining parent nodes of the decision tree model based on information gain according to the training set, the parent nodes including the root node and the internal nodes.
Preferably, the training set comprises an operation index and a plurality of attributes of each historical warehouse logistics item;
the determining parent nodes of the decision tree model based on information gain according to the training set comprises:
calculating a first entropy value for the operation index under a current decision tree model;
respectively selecting each attribute to be classified for classification, and respectively calculating a second entropy value of the attribute to be classified aiming at the operation index under the decision tree model classified according to each attribute to be classified;
respectively calculating information gains between the first entropy and each second entropy, and acquiring the attribute to be classified of the target corresponding to the maximum value in the plurality of information gains;
and taking the attribute to be classified of the target as a current parent node.
Preferably, the entropy value is calculated according to the formula:
$$\mathrm{Ent}(D) = -\sum_{k=1}^{|y|} p_k \log_2 p_k$$
wherein Ent(D) is the entropy value of sample D; |y| is the number of categories of the operation index; and p_k is the proportion of the kth category of the operation index in sample D.
The present invention also constitutes a storage medium storing a computer program which, when executed by a processor, carries out the steps of the method of analyzing a warehouse logistics item described above.
The invention also constitutes a computer arrangement comprising a processor and a memory having a computer program stored thereon, which, when being executed, carries out the steps of the method of analyzing a warehouse logistics item described above.
The present invention also constructs an analytical system for a warehouse logistics item, comprising:
an acquisition module for acquiring first item data of a warehouse logistics item to be analyzed;
a processing module for preprocessing the first item data to acquire a plurality of pieces of feature information and generate a test set;
and an analysis module for inputting the test set into a decision tree model and obtaining the predicted operation index of the warehouse logistics item from the decision tree model, wherein the decision tree model is established by training on second item data of a plurality of historical warehouse logistics items.
By implementing the technical scheme of the invention, a decision tree model is established from a large amount of item data (second item data) of historical warehouse logistics items, based on a supervised machine learning decision tree algorithm (the training data carries the real operation index of each historical item). When a warehouse logistics item to be analyzed is analyzed, its predicted operation index can be obtained quickly from the decision tree model, directly from the basic information (first item data) of the item. Therefore, compared with the existing approach of developing a simulation model, the method has the following beneficial effects:
1. for the warehouse logistics items to be analyzed, after the item data (first item data) of the warehouse logistics items are input into the decision tree model, the predicted operation indexes of the warehouse logistics items can be quickly obtained, and the time efficiency is greatly improved;
2. the universality is strong, and the project data of any warehouse logistics project to be analyzed can be input into the decision tree model for testing;
3. the result (predicted operation index) output by the decision tree model is accurate.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flow chart of a first embodiment of a method for analyzing warehouse logistics items according to the present invention;
FIG. 2 is a schematic diagram of a decision tree model constructed by the present invention;
FIG. 3 is a logic structure diagram of a first embodiment of the warehouse logistics item analysis system of the present invention;
FIG. 4 is a logical structure diagram of a first embodiment of the computer apparatus of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Fig. 1 is a flowchart of a first embodiment of a method for analyzing warehouse logistics items according to the present invention, the method for analyzing warehouse logistics items of the embodiment comprising:
step S10, acquiring first item data of a warehouse logistics item to be analyzed;
In this step, the first item data may include various types of data. For example, for a supermarket item to be analyzed, the first item data includes, but is not limited to: order data, stock data, warehouse area, commodity SKU (Stock Keeping Unit) variety distribution, and the like.
Step S20, preprocessing the first project data to obtain a plurality of characteristic information and generate a test set;
in this step, the acquired first item data may include raw data in various formats, and cannot be directly used for model testing, so that data preprocessing needs to be performed on the first item data to generate feature information, and the feature information constitutes a test set.
Step S30, inputting the test set into a decision tree model, and obtaining a predicted operation index of the warehouse logistics item from the decision tree model, wherein the decision tree model is established by training on second item data of a plurality of historical warehouse logistics items.
In this step, the second item data is real item data of historical warehouse logistics items. The database of the item server stores, in real time, data such as order structure data, inventory structure data, map layout, warehouse area information, shelf placement positions, operation indexes (e.g., robot efficiency, picking efficiency, etc.), and order details. This data is stored on the server at the operation site and can then be synchronized in real time to a cloud server by uploading, where the item data of the historical warehouse logistics items is stored item by item in preparation for training the decision tree model. After the decision tree model is established, the test set of the first item data can be input into the decision tree model for testing, so that the predicted operation index of the warehouse logistics item to be analyzed is quickly obtained by classification. It should be noted that, since the historical warehouse logistics items have actually operated, their operation indexes are real data; the warehouse logistics item to be analyzed has not actually operated, so its operation index obtained from the decision tree model is predicted data.
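As a rough illustration of step S30, the following Python sketch walks one preprocessed record down a decision tree until a leaf is reached. The tree layout, the attribute names, and the `predict` helper are assumptions made for illustration only; the root attribute and the pure 3C branch follow the worked example later in this description, while the other branches are placeholders.

```python
# Hypothetical tree layout: an internal node is a dict with the attribute it
# tests and one sub-tree per attribute value; a leaf is a plain string.
DEMO_TREE = {
    "attr": "project_industry",
    "branches": {
        "3C": "robot efficiency > 36",
        # Placeholder branches, not taken from the patent's tables.
        "retail": {
            "attr": "warehouse_area_gt_1000",
            "branches": {
                True: "robot efficiency > 36",
                False: "robot efficiency <= 36",
            },
        },
        "footwear_apparel": {
            "attr": "stock_gt_10000",
            "branches": {
                True: "robot efficiency <= 36",
                False: "robot efficiency > 36",
            },
        },
    },
}


def predict(tree, record):
    """Follow the branch matching each tested attribute until a leaf is hit."""
    node = tree
    while isinstance(node, dict):
        value = record[node["attr"]]
        node = node["branches"][value]
    return node


item_to_analyze = {"project_industry": "retail", "warehouse_area_gt_1000": False}
predicted_index = predict(DEMO_TREE, item_to_analyze)
```

The returned leaf plays the role of the predicted operation index class; a real model would be built from the training set rather than written by hand.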
In the technical solution of this embodiment, a decision tree model is established from a large amount of item data (second item data) of historical warehouse logistics items, based on a supervised machine learning decision tree algorithm (the training data carries the real operation index of each historical item). When a warehouse logistics item to be analyzed is analyzed, its predicted operation index can be obtained quickly from the decision tree model, directly from the basic information (first item data) of the item. Therefore, compared with the existing approach of developing a simulation model, the method has the following beneficial effects:
1. for the warehouse logistics items to be analyzed, after the item data (first item data) of the warehouse logistics items are input into the decision tree model, the predicted operation indexes of the warehouse logistics items can be quickly obtained, and the time efficiency is greatly improved;
2. the universality is strong, and the project data of any warehouse logistics item to be analyzed can be input into the decision tree model for testing;
3. the result (predicted operation index) output by the decision tree model is accurate.
Further, in an alternative embodiment, the method of analyzing a warehouse logistics item of the present invention further comprises: obtaining a warehousing configuration scheme of warehouse logistics items of the same type as the warehouse logistics item to be analyzed.
In this embodiment, when a warehouse logistics item is analyzed, the decision tree model is used not only to obtain the predicted operation index of the item by fast classification, but also to obtain the warehousing configuration scheme of warehouse logistics items of the same type, which then guides the warehousing configuration scheme of the item to be analyzed. The warehousing configuration scheme includes: shelf demand quantity, shelf specification, bin specification, shelf placement configuration, storage robot demand quantity, operating platform layout, charging pile demand quantity, charging pile layout, charging strategy setting, and the like.
Further, the warehouse configuration of warehouse logistics items of the same type as the warehouse logistics items to be analyzed may be obtained according to the following manner:
determining a path corresponding to the warehouse logistics item to be analyzed in the decision tree model;
judging whether the path corresponding to each historical warehouse logistics item in the decision tree model is consistent with the path corresponding to the warehouse logistics item to be analyzed;
determining historical warehouse logistics items consistent with paths corresponding to the warehouse logistics items to be analyzed as warehouse logistics items of the same type as the warehouse logistics items to be analyzed;
and acquiring a warehouse configuration scheme of warehouse logistics items of the same type as the warehouse logistics items to be analyzed.
In this embodiment, after the decision tree model is trained, the path corresponding to each historical warehouse logistics item in the decision tree model can be determined (a first path). When the decision tree model is used to test the warehouse logistics item to be analyzed, the path corresponding to that item in the decision tree model can also be determined (a second path). Each first path is then compared with the second path one by one, and the historical warehouse logistics items whose paths are consistent with the second path are determined to be of the same type. The warehousing configuration scheme of these same-type items is then obtained, and the warehousing configuration scheme of the warehouse logistics item to be analyzed can be designed with reference to it. The warehousing configuration scheme includes: shelf demand quantity, shelf specification, bin specification, shelf placement configuration, storage robot demand quantity, operating platform demand quantity, operating platform layout, charging pile demand quantity, charging pile layout, charging strategy setting, and the like.
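The path comparison described above can be sketched as follows. This is a minimal illustration under assumptions: the tree, the historical records, and the helper names `decision_path` and `same_type_items` are all hypothetical, and a path is represented simply as the tuple of (attribute, value) decisions taken from the root to a leaf.

```python
# Hypothetical two-level tree; leaves are class labels.
PATH_TREE = {
    "attr": "project_industry",
    "branches": {
        "3C": "high",
        "retail": {
            "attr": "area_gt_1000",
            "branches": {True: "high", False: "low"},
        },
    },
}

# Hypothetical historical items (second item data after preprocessing).
HISTORY = {
    "H1": {"project_industry": "3C", "area_gt_1000": True},
    "H2": {"project_industry": "retail", "area_gt_1000": True},
    "H3": {"project_industry": "retail", "area_gt_1000": False},
    "H4": {"project_industry": "retail", "area_gt_1000": True},
}


def decision_path(tree, record):
    """Return the (attribute, value) decisions taken from root to leaf."""
    path = []
    node = tree
    while isinstance(node, dict):
        value = record[node["attr"]]
        path.append((node["attr"], value))
        node = node["branches"][value]
    return tuple(path)


def same_type_items(tree, history, candidate):
    """List historical items whose path equals the candidate's path."""
    target = decision_path(tree, candidate)
    return [name for name, rec in history.items()
            if decision_path(tree, rec) == target]


CANDIDATE = {"project_industry": "retail", "area_gt_1000": True}
matches = same_type_items(PATH_TREE, HISTORY, CANDIDATE)
```

The warehousing configuration schemes of the items in `matches` would then serve as references for the item being analyzed.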
Further, in an optional embodiment, the preprocessing the first item data includes:
cleaning the first project data;
fusing the cleaned first project data;
performing characterization processing on the first item data subjected to the fusion processing, wherein the characterization processing comprises the following steps: mean removal, range scaling, normalization, one-hot encoding.
In this embodiment, since the acquired first item data of the warehouse logistics item to be analyzed is raw data of various types, it cannot be directly input into the decision tree model, and needs to be preprocessed first, and can be preprocessed from the following aspects:
For data with inconsistent formats in the first item data, for example common formats such as Excel, CSV, Access, MySQL, Oracle, and TXT, cleaning such as format standardization, abnormal data removal, error correction, and duplicate removal is performed on the data source by a corresponding code module, to obtain data with a uniform format and no abnormalities, errors, or duplicates;
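A cleaning module of this kind might look like the following sketch. The field names `order_id` and `qty`, the rules for what counts as abnormal, and the `clean_rows` helper are assumptions chosen for illustration; the source does not specify them.

```python
def clean_rows(rows):
    """Standardise field names, drop abnormal rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for raw in rows:
        # Format standardisation: trimmed lower-case keys, trimmed strings.
        row = {
            key.strip().lower(): value.strip() if isinstance(value, str) else value
            for key, value in raw.items()
        }
        # Error correction: a quantity stored as text becomes an integer.
        try:
            row["qty"] = int(row["qty"])
        except (KeyError, TypeError, ValueError):
            continue  # abnormal row: missing or non-numeric quantity
        if row["qty"] <= 0 or not row.get("order_id"):
            continue  # abnormal row: impossible quantity or no order number
        # Duplicate removal: identical rows share the same fingerprint.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


raw_rows = [
    {"Order_ID": "A1", "qty": " 3 "},
    {"order_id": "A1", "qty": 3},      # duplicate once standardised
    {"order_id": "A2", "qty": "-5"},   # impossible quantity
    {"order_id": "A3", "qty": "n/a"},  # non-numeric quantity
    {"order_id": "A4", "qty": "12"},
]
clean = clean_rows(raw_rows)
```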
For scattered data, or data with non-uniform field names, in the first item data: scattered data for particular warehouse areas, cities, years, or months cannot be analyzed and trained on directly, and can be fused and unified by corresponding code. For data from different sources, for example where the field inventory represents inventory in one source, INV represents inventory in another, and kunun represents inventory in a third, a lexicon and regular-expression rules better suited to storage, warehouse, and AGV scenarios can be compiled using the regular expression retrieval and matching algorithm of an SEO search engine, so that these fields are unified into the field INV and can then be matched to a standard two-dimensional table, unified fields, and coding rules;
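The field unification can be pictured with a small regular-expression lexicon. In this sketch, only the canonical field INV and its aliases come from the text; the `FIELD_PATTERNS` table and the `unify_field` and `unify_record` helpers are hypothetical, and a production lexicon would cover many more fields.

```python
import re

# Hypothetical lexicon: canonical field name -> pattern matching its aliases.
FIELD_PATTERNS = {
    "INV": re.compile(r"^(inventory|inv|kunun)$", re.IGNORECASE),
}


def unify_field(name):
    """Map a source field name to its canonical name, if one is known."""
    stripped = name.strip()
    for canonical, pattern in FIELD_PATTERNS.items():
        if pattern.match(stripped):
            return canonical
    return stripped  # unknown fields are passed through unchanged


def unify_record(record):
    """Rename every key of a record using the lexicon."""
    return {unify_field(key): value for key, value in record.items()}


source_a = unify_record({"Inventory": 120, "sku": "S-1"})
source_b = unify_record({"kunun": 75, "sku": "S-2"})
```

After unification, records from both sources expose the same INV column and can be merged into one two-dimensional table.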
Because the decision tree algorithm is a machine learning algorithm, the machine cannot effectively recognize most raw data. For example, the machine does not know that one column is an order number and another is a piece count; to the machine, the order number is a discrete character string and the piece count is a continuous numerical value. Characterization processing (mean removal, range scaling, normalization, one-hot encoding) is therefore performed on the fused data.
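To make the distinction between discrete and continuous columns concrete, the sketch below one-hot encodes a discrete column and removes the mean from a continuous one. The helper names and the sample values are assumptions for illustration; the source does not give an implementation.

```python
def one_hot(values):
    """Encode discrete values as 0/1 vectors, one position per category."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        vector = [0] * len(categories)
        vector[position[value]] = 1
        encoded.append(vector)
    return categories, encoded


def remove_mean(values):
    """Centre a continuous column so that its mean becomes zero."""
    mean = sum(values) / len(values)
    return [value - mean for value in values]


# Discrete column (industry) and continuous column (piece count).
categories, industry_vectors = one_hot(["retail", "3C", "retail"])
centred_pieces = remove_mean([1, 2, 3])
```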
In a specific example, such as a warehouse logistics item in the supermarket industry, the number of commodity SKU varieties is in the tens of thousands. If the SKU variety count is to be incorporated into the decision tree model, the count spans a very wide range across categories: large items such as beds and sofas have only about 3 to 10 SKU varieties, whereas women's socks have more than 1000 SKU varieties by style. The values of this feature therefore differ too much in magnitude to be comparable, so range scaling is required. For example, a range scaler may be programmed in advance. The principle is that if one feature has a much larger value range than the other features, the Euclidean distance will be dominated by that feature; the range scaler can therefore be used to scale the value range of each feature, for example to between 0 and 1, which can also increase the convergence speed of the decision tree classification.
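A range scaler of the kind mentioned above could be as simple as min-max scaling. The following is a sketch under that assumption; the `range_scale` name and the sample SKU counts are illustrative, chosen to echo the 3 to 10 versus 1000+ contrast in the example.

```python
def range_scale(values, low=0.0, high=1.0):
    """Linearly rescale values so the smallest maps to low, the largest to high."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # A constant column carries no range information; map it to low.
        return [low for _ in values]
    span = largest - smallest
    return [low + (value - smallest) * (high - low) / span for value in values]


# Illustrative SKU variety counts: large furniture vs. socks.
sku_counts = [3, 10, 1000]
scaled = range_scale(sku_counts)
```

After scaling, every feature lies in the same interval, so no single feature dominates distance-based comparisons.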
Further, in an alternative embodiment, the decision tree model is built by:
acquiring second item data of a plurality of historical warehouse logistics items;
preprocessing the second item data to acquire a plurality of feature information and generate a training set, wherein it should be understood that the preprocessing mode of the second item data is the same as the preprocessing mode of the first item data, and the details are not repeated herein;
and determining a root node, an internal node and a leaf node of the decision tree model according to the training set to obtain the decision tree model.
In this embodiment, it should be noted that a decision tree is a tree structure built from a series of decisions. The decision tree comprises a root node, a plurality of internal nodes, and a plurality of leaf nodes. In machine learning, a decision tree is a prediction model that represents a mapping between object attributes and object values: each leaf node corresponds to a decision result, every other node corresponds to a test on an attribute, and the sample set contained in each node is distributed into its child nodes according to the result of the attribute test. Each branch path in the tree represents a possible attribute value, and each leaf node corresponds to the value of the object represented by the path from the root node to that leaf node.
Building a decision tree is a process of feature selection. Considering that warehouse logistics items have a smaller data volume and more complex business than internet items, the ID3 algorithm can be selected as the decision tree algorithm in order to find the optimal partition features. That is, feature selection is performed based on information gain to choose the attribute on which each node branches. In other words, the parent nodes of the decision tree model, which include the root node and the internal nodes, can be determined from the training set based on information gain.
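For orientation, a compact recursive ID3 builder is sketched below. It is a generic textbook-style illustration, not the patent's implementation: the function names, the nested-dict tree layout, and the six toy training rows are all assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy of the target minus its weighted entropy after splitting on attr."""
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [row[target] for row in rows if row[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[target] for row in rows]) - remainder


def build_id3(rows, attrs, target):
    """Return a leaf label, or a dict node splitting on the best attribute."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:
        return labels[0]                             # pure node: leaf
    if not attrs:
        return Counter(labels).most_common(1)[0][0]  # no attribute left: majority
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in set(row[best] for row in rows):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_id3(subset, remaining, target)
    return {"attr": best, "branches": branches}


# Toy training rows (hypothetical, not the patent's Table 1).
TOY_ROWS = [
    {"industry": "3C", "large_area": True, "high": "yes"},
    {"industry": "3C", "large_area": False, "high": "yes"},
    {"industry": "3C", "large_area": False, "high": "yes"},
    {"industry": "retail", "large_area": True, "high": "yes"},
    {"industry": "retail", "large_area": False, "high": "no"},
    {"industry": "retail", "large_area": False, "high": "no"},
]
toy_tree = build_id3(TOY_ROWS, ["industry", "large_area"], "high")
```

On these toy rows, splitting on `industry` yields the larger gain, so it becomes the root; the pure 3C branch ends in a leaf while the retail branch is split again on `large_area`.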
Further, in an optional embodiment, the training set includes an operation index and a plurality of attributes of each historical warehouse logistics item. Furthermore, determining a parent node of the decision tree model based on the information gain according to the training set specifically includes:
calculating a first entropy value for the operation index under a current decision tree model;
respectively selecting each attribute to be classified for classification, and respectively calculating a second entropy value of the attribute to be classified aiming at the operation index under the decision tree model classified according to each attribute to be classified;
respectively calculating information gains between the first entropy and each second entropy (for example, a difference value between the first entropy and each second entropy may be used as an information gain), and obtaining an attribute to be classified of a target corresponding to a maximum value of the information gains, that is, taking the attribute to be classified corresponding to the maximum value of the information gains as the attribute to be classified of the target;
and taking the attribute to be classified of the target as a current parent node.
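In the usual notation for information gain, the steps above can be summarized as follows. The symbols a (an attribute to be classified), V (the number of values of a), and D^v (the subset of D taking the v-th value of a) are introduced here for illustration and do not appear in the source; the first entropy value corresponds to Ent(D) and the second entropy value to the weighted sum.

```latex
\mathrm{Gain}(D, a) = \mathrm{Ent}(D) - \sum_{v=1}^{V} \frac{|D^{v}|}{|D|}\,\mathrm{Ent}(D^{v}),
\qquad
a^{*} = \arg\max_{a} \mathrm{Gain}(D, a)
```

The attribute a* with the largest gain is taken as the current parent node.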
Moreover, the entropy value can be calculated according to the following equation 1:
$$\mathrm{Ent}(D) = -\sum_{k=1}^{|y|} p_k \log_2 p_k \qquad (1)$$
wherein Ent(D) is the entropy value of sample D, and the smaller the value of Ent(D), the higher the purity of the sample set; |y| is the number of categories of the operation index; and p_k is the proportion of the kth category of the operation index in sample D.
In a specific example, for a warehouse logistics item, when the decision tree model is trained, the obtained training set is shown in table 1, and it should be understood that the actual feature data is more and the data type is more.
TABLE 1
In the sample data of the training set, the robot efficiency can be used as the operation index, that is, as the leaf nodes of the decision tree model. The robot efficiency of the 14 warehouse logistics items (samples) can be divided into two classes according to whether it is greater than 36: yes (greater than 36) and no (not greater than 36), i.e., |y| = 2. The four other attributes (project industry, daily outbound order quantity, whether the warehouse area is greater than 1000 square meters, and whether the stock quantity is greater than 10,000) serve as the root node or internal nodes.
When the root node of the decision tree model is determined, 9 of the 14 training samples have a robot efficiency greater than 36, a proportion of 9/14, and 5 samples have a robot efficiency not greater than 36, a proportion of 5/14. Using equation 1 above gives:
$$\mathrm{Ent}(D) = -\left(\tfrac{9}{14}\log_2\tfrac{9}{14} + \tfrac{5}{14}\log_2\tfrac{5}{14}\right) \approx 0.940$$
i.e., the first entropy value is 0.940.
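The first entropy value can be checked numerically. The sketch below applies equation 1 to the class counts 9 and 5 stated above; the helper name `entropy_from_counts` is an assumption.

```python
from math import log2


def entropy_from_counts(counts):
    """Equation 1 applied to class counts; empty classes contribute zero."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


# 9 samples with robot efficiency > 36, 5 samples with robot efficiency <= 36.
first_entropy = entropy_from_counts([9, 5])
```

Rounded to three decimals, `first_entropy` agrees with the value 0.940 used in the text.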
Then the first traversal begins: the samples are classified by each of the following attributes in turn (project industry, daily outbound order quantity, whether the warehouse area is greater than 1000 square meters, and whether the stock quantity is greater than 10,000), and the second entropy value corresponding to each classification is calculated.
When classified by "project industry", three sample data subsets are obtained: T1 (project industry = footwear and apparel), T2 (project industry = 3C), and T3 (project industry = retail). According to Table 1, T1 is {1,2,8,9,11}, 5 samples in total, so the probability/weight of the footwear and apparel industry is 5/14; T2 is {3,7,12,13}, 4 samples in total, so the probability/weight of the 3C industry is 4/14; T3 is {4,5,6,10,14}, 5 samples in total, so the probability/weight of the retail industry is 5/14.
The entropy value of each sample data subset and the comprehensive entropy value (second entropy value) are calculated as follows, in conjunction with Table 2:
TABLE 2
For the footwear and apparel industry, of the 5 samples in T1 {1,2,8,9,11}, the samples with a robot efficiency greater than 36 are {9,11}, 2 in total, a proportion of 2/5; the samples with a robot efficiency not greater than 36 are {1,2,8}, 3 in total, a proportion of 3/5. Using equation 1 above gives:
$$\mathrm{Ent}(T_1) = -\left(\tfrac{2}{5}\log_2\tfrac{2}{5} + \tfrac{3}{5}\log_2\tfrac{3}{5}\right) \approx 0.971$$
i.e., the entropy value for the footwear and apparel industry is 0.971.
For the 3C industry, of the 4 samples in T2 {3,7,12,13}, all 4 samples {3,7,12,13} have a robot efficiency greater than 36, a proportion of 4/4; the set of samples with a robot efficiency not greater than 36 is empty, a proportion of 0. Using equation 1 above (an empty class contributes 0) gives:
$$\mathrm{Ent}(T_2) = -\tfrac{4}{4}\log_2\tfrac{4}{4} = 0$$
i.e., the entropy value for the 3C industry is 0.
For the retail industry, of the 5 samples in T3 {4,5,6,10,14}, the samples with a robot efficiency greater than 36 are {4,5,10}, 3 in total, a proportion of 3/5; the samples with a robot efficiency not greater than 36 are {6,14}, 2 in total, a proportion of 2/5. Using equation 1 above gives:
$$\mathrm{Ent}(T_3) = -\left(\tfrac{3}{5}\log_2\tfrac{3}{5} + \tfrac{2}{5}\log_2\tfrac{2}{5}\right) \approx 0.971$$
i.e., the entropy value for the retail industry is 0.971.
Combining the weight of each value (5/14 for footwear and apparel, 4/14 for 3C, 5/14 for retail) with its entropy value (0.971 for footwear and apparel, 0 for 3C, 0.971 for retail), the second entropy value (comprehensive entropy value) for the project industry is:
$$\tfrac{5}{14}\times 0.971 + \tfrac{4}{14}\times 0 + \tfrac{5}{14}\times 0.971 \approx 0.6936$$
which is recorded as 0.693 in the gain calculation.
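The comprehensive entropy for "project industry" and the resulting information gain can be reproduced from the subset counts given in the text. The variable and helper names in this sketch are assumptions; the counts (greater than 36, not greater than 36) per industry are taken from the paragraphs above.

```python
from math import log2


def entropy_from_counts(counts):
    """Equation 1 applied to class counts; empty classes contribute zero."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


# (samples with robot efficiency > 36, samples with robot efficiency <= 36)
INDUSTRY_COUNTS = {
    "footwear and apparel": (2, 3),  # T1
    "3C": (4, 0),                    # T2
    "retail": (3, 2),                # T3
}

total_samples = sum(sum(pair) for pair in INDUSTRY_COUNTS.values())  # 14

# Second entropy value: subset entropies weighted by subset size.
second_entropy = sum(
    sum(pair) / total_samples * entropy_from_counts(pair)
    for pair in INDUSTRY_COUNTS.values()
)

# Information gain: first entropy value minus second entropy value.
industry_gain = entropy_from_counts((9, 5)) - second_entropy
```

Computed without intermediate rounding, the second entropy is about 0.6935 and the gain about 0.247, which matches the Gain(project industry) value in the text.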
When classified by "daily average ex-warehouse order quantity", the entropy value of each sample data subset (> 100, 50-100, < 50) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 3:
TABLE 3
When classified by "whether the warehouse area is greater than 1000 square meters", the entropy value of each sample data subset (greater than 1000 square meters, not greater than 1000 square meters) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 4:
TABLE 4
When classified by "whether the number of stock items is greater than 10,000", the entropy value of each sample data subset (greater than 10,000, not greater than 10,000) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 5:
TABLE 5
After all the attributes to be classified have been traversed, the information gain corresponding to each attribute is calculated from its first entropy value and second entropy value, as the first entropy value minus the second entropy value:
Gain (project industry) = 0.940 - 0.693 = 0.247;
Gain (daily average ex-warehouse order quantity) = 0.940 - 0.911 = 0.029;
Gain (whether the warehouse area is greater than 1000 square meters) = 0.940 - 0.788 = 0.152;
Gain (whether the number of stock items is greater than 10,000) = 0.940 - 0.892 = 0.048.
Clearly, Gain (project industry) is the largest, meaning that dividing the original samples by project industry reduces the overall disorder the most, so "project industry" is selected as the root node of the decision tree model.
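The root-node selection can be sketched as follows, using the first entropy value (0.940) and the four second entropy values given in the text. The dictionary keys are shortened attribute labels chosen here for illustration.

```python
# First entropy value of the full sample set, as given in the text.
first_entropy = 0.940

# Second (comprehensive) entropy value after splitting on each attribute,
# as given in the text.
second_entropy = {
    "project industry": 0.693,
    "daily average ex-warehouse order quantity": 0.911,
    "warehouse area > 1000 square meters": 0.788,
    "number of stock items > 10,000": 0.892,
}

# Information gain = first entropy value - second entropy value.
gains = {attr: round(first_entropy - h, 3) for attr, h in second_entropy.items()}

# The attribute with the largest gain becomes the current parent node.
root = max(gains, key=gains.get)

for attr, gain in gains.items():
    print(f"Gain({attr}) = {gain}")
print("selected root:", root)
```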
After the first traversal determines the root node of the decision tree model, three branches are obtained: footwear and apparel, 3C, and retail. For the 3C branch, since the robot efficiency is greater than 36 in all 4 of its samples, the branch can be connected directly to a leaf node. For the footwear and apparel branch and the retail branch, one of the following three attributes is selected as the next child node (internal node): the daily average ex-warehouse order quantity, whether the warehouse area is greater than 1000 square meters, and whether the number of stock items is greater than 10,000.
The next child node of the "footwear and apparel" branch is determined as follows:
First, note that the sample size has now changed from the initial 14 to 5. As calculated above, the entropy value (first entropy value) for footwear and apparel in the project industry before classification is 0.971. A second traversal then begins: the samples are classified by each of the following three attributes in turn, namely the daily average ex-warehouse order quantity, whether the warehouse area is greater than 1000 square meters, and whether the number of stock items is greater than 10,000, and the second entropy value corresponding to each classification is calculated.
When classified by "daily average ex-warehouse order quantity", the entropy value of each sample data subset (> 100, 50-100, < 50) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 6:
TABLE 6
When classified by "whether the warehouse area is greater than 1000 square meters", the entropy value of each sample data subset (greater than 1000 square meters, not greater than 1000 square meters) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 7:
TABLE 7
When classified by "whether the number of stock items is greater than 10,000", the entropy value of each sample data subset (greater than 10,000, not greater than 10,000) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 8:
TABLE 8
After all the attributes to be classified have been traversed, the information gain corresponding to each attribute is calculated from its first entropy value and second entropy value, as the first entropy value minus the second entropy value:
Gain (daily average ex-warehouse order quantity) = 0.971 - 0.4 = 0.571;
Gain (whether the warehouse area is greater than 1000 square meters) = 0.971 - 0 = 0.971;
Gain (whether the number of stock items is greater than 10,000) = 0.971 - 0.951 = 0.02.
Clearly, Gain (whether the warehouse area is greater than 1000 square meters) is the largest, meaning that dividing the footwear and apparel branch by whether the warehouse area is greater than 1000 square meters reduces the overall disorder the most, so "whether the warehouse area is greater than 1000 square meters" is selected as the child node of the "footwear and apparel" branch of the decision tree model.
The next child node of the "retail" branch is determined as follows:
First, note that the sample size has now changed from the initial 14 to 5. As calculated above, the entropy value (first entropy value) for retail in the project industry before classification is 0.971. A third traversal then begins: the samples are classified by each of the following two attributes in turn, namely the daily average ex-warehouse order quantity and whether the number of stock items is greater than 10,000 (the attribute "whether the warehouse area is greater than 1000 square meters" has already been used and is no longer available here), and the second entropy value corresponding to each classification is calculated.
When classified by "daily average ex-warehouse order quantity", the entropy value of each sample data subset (> 100, 50-100, < 50) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 9:
TABLE 9
When classified by "whether the number of stock items is greater than 10,000", the entropy value of each sample data subset (greater than 10,000, not greater than 10,000) and the comprehensive entropy value (second entropy value) are calculated as shown in Table 10:
TABLE 10
After all the attributes to be classified have been traversed, the information gain corresponding to each attribute is calculated from its first entropy value and second entropy value, as the first entropy value minus the second entropy value:
Gain (daily average ex-warehouse order quantity) = 0.971 - 0.951 = 0.02;
Gain (whether the number of stock items is greater than 10,000) = 0.971 - 0 = 0.971.
Clearly, Gain (whether the number of stock items is greater than 10,000) is the largest, meaning that dividing the retail branch by "whether the number of stock items is greater than 10,000" yields purer subsets than the other attribute, so "whether the number of stock items is greater than 10,000" is selected as the child node of the "retail" branch of the decision tree model.
At this point every branch ends in a leaf node; that is, construction of the decision tree stops once the samples under each child node belong to a single category. Note that "daily average ex-warehouse order quantity" was never used as a classification attribute: the decision tree did not select it, which indicates that it has low classification value. Although the decision tree does not select it, the scheme attached to each branch of the decision tree can still include this factor as part of the overall data scheme.
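The recursive procedure walked through above (pick the attribute with the largest information gain, split, and stop when a branch is pure) can be sketched as below. The source does not name the algorithm; this is an ID3-style sketch. The ten sample rows are hypothetical, invented purely so that the resulting tree has the same shape as the one derived in the text; they are not the patent's Table 1 data, and the field names (`industry`, `large_area`, `many_items`, `high_eff`) are likewise assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Base-2 Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """First entropy minus the size-weighted entropy after splitting on attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - weighted

def build_tree(rows, attributes, target):
    """Return a leaf label, or {attribute: {value: subtree}} for an internal node."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]  # pure (or exhausted) -> leaf
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: build_tree([r for r in rows if r[best] == value], remaining, target)
            for value in sorted({r[best] for r in rows})
        }
    }

def row(industry, large_area, many_items, high_eff):
    return {"industry": industry, "large_area": large_area,
            "many_items": many_items, "high_eff": high_eff}

# HYPOTHETICAL samples (not from the patent): high_eff means robot efficiency > 36.
samples = [
    row("3C", True, False, True), row("3C", False, True, True),
    row("3C", False, False, True),
    row("footwear", True, False, True), row("footwear", False, True, False),
    row("footwear", False, False, False),
    row("retail", False, True, True), row("retail", True, False, False),
    row("retail", True, True, True), row("retail", False, False, False),
]

tree = build_tree(samples, ["industry", "large_area", "many_items"], "high_eff")
print(tree)
```

On this toy data the root is `industry`, the 3C branch becomes a leaf immediately, and the other two branches each need one further split, mirroring the structure described above.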
After the decision tree model has been built and trained and the first item data of the warehouse logistics item has been acquired, the attributes are extracted from the first item data in the preprocessing step and input into the decision tree model for classification prediction, to determine whether the predicted robot efficiency is greater than 36. In addition, items of the same type as the item can be identified by comparing the path of the item in the decision tree model with the paths of historical items in the decision tree model; the basic information and conditions (such as warehousing configuration schemes) of those same-type items can then be acquired, and the warehousing configuration scheme of the warehouse logistics item to be analyzed can be designed with reference to them.
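The path comparison can be sketched as follows. The tree shape follows the text (root on project industry, then warehouse area for footwear and apparel, and stock item count for retail); leaf outcomes are left out because only the path matters for matching. The historical items `H1` to `H4`, the new item, and all field names are hypothetical.

```python
# Tree structure as derived in the text. None marks a leaf; leaf labels are
# omitted because path matching does not need them.
TREE = {
    "project_industry": {
        "3C": None,
        "footwear and apparel": {"warehouse_area_gt_1000": {True: None, False: None}},
        "retail": {"stock_items_gt_10000": {True: None, False: None}},
    }
}

def path_of(item, tree=TREE):
    """Return the sequence of (attribute, value) tests an item passes through."""
    path = []
    node = tree
    while node is not None:
        attr, branches = next(iter(node.items()))  # each node tests one attribute
        value = item[attr]
        path.append((attr, value))
        node = branches[value]
    return tuple(path)

def same_type_items(new_item, history):
    """Names of historical items whose tree path equals the new item's path."""
    target = path_of(new_item)
    return [name for name, item in history.items() if path_of(item) == target]

def item(industry, area_gt_1000, stock_gt_10000):
    return {"project_industry": industry,
            "warehouse_area_gt_1000": area_gt_1000,
            "stock_items_gt_10000": stock_gt_10000}

# HYPOTHETICAL historical items, for illustration only.
history = {
    "H1": item("retail", True, True),
    "H2": item("retail", False, False),
    "H3": item("3C", True, True),
    "H4": item("retail", False, True),
}

new_item = item("retail", False, True)
matches = same_type_items(new_item, history)
print(path_of(new_item))
print(matches)
```

The warehousing configuration schemes of the matched items would then serve as references; how those schemes are stored and retrieved is outside this sketch.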
Fig. 3 is a logical block diagram of a first embodiment of an analysis system for warehouse logistics items according to the present invention. The analysis system of this embodiment comprises an acquisition module 10, a processing module 20 and an analysis module 30. The acquisition module 10 is used for acquiring first item data of a warehouse logistics item to be analyzed. The processing module 20 is configured to preprocess the first item data to obtain a plurality of pieces of feature information and generate a test set, where the preprocessing includes cleaning, fusion, and characterization, and the characterization includes mean removal, range scaling, normalization, and one-hot encoding. The analysis module 30 is configured to send the test set to a decision tree model and obtain the predicted operation efficiency of the warehouse logistics item from the decision tree model, where the decision tree model is established by training on second item data of a plurality of historical warehouse logistics items.
Further, in an optional embodiment, the analysis system for warehouse logistics items of the present invention may further include a scheme determination module for obtaining a warehousing configuration scheme of warehouse logistics items of the same type as the warehouse logistics item to be analyzed. Preferably, the scheme determination module is configured to determine the path corresponding to the warehouse logistics item to be analyzed in the decision tree model, judge whether the path corresponding to each historical warehouse logistics item in the decision tree model is consistent with that path, determine the historical warehouse logistics items whose paths are consistent with it as warehouse logistics items of the same type as the warehouse logistics item to be analyzed, and finally obtain the warehousing configuration scheme of those same-type warehouse logistics items.
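The four characterization steps named for the processing module can be sketched in plain Python as below. The source does not define them precisely, so the interpretations here are assumptions: range scaling is taken as min-max scaling to [0, 1], and normalization as scaling a vector to unit L2 norm. The sample values are hypothetical.

```python
import math

def remove_mean(values):
    """Mean removal: shift values so their mean is zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def scale_range(values, lo=0.0, hi=1.0):
    """Range scaling (assumed min-max): map values linearly into [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [lo for _ in values]  # constant column: avoid division by zero
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

def normalize(vector):
    """Normalization (assumed L2): scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else list(vector)

def one_hot(values):
    """One-hot encoding: one 0/1 column per distinct category, in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# HYPOTHETICAL feature columns, for illustration only.
warehouse_area = [800.0, 1200.0, 1000.0]
project_industry = ["retail", "3C", "retail"]

centered = remove_mean(warehouse_area)
scaled = scale_range(warehouse_area)
unit = normalize([3.0, 4.0])
encoded = one_hot(project_industry)

print(centered, scaled, unit, encoded)
```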
The invention also provides a storage medium storing a computer program which, when executed by a processor, carries out the steps of the method of analyzing a warehouse logistics item described above.
The storage medium of the present invention may be any computer-readable storage medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
Fig. 4 is a block diagram of a first embodiment of a computer device 400 according to the present invention, where the computer device 400 may be a computer or a server, and the server may be an independent server or a server cluster formed by multiple servers.
Referring to fig. 4, the computer device 400 includes a processor 402, memory, and a network interface 405 connected by a system bus 401, where the memory may include a non-volatile storage medium 403 and an internal memory 404.
The non-volatile storage medium 403 may store an operating system 4031 and computer programs 4032. The computer program 4032 includes program instructions that, when executed by the processor 402, cause the processor 402 to perform the steps of the method of analyzing a warehouse logistics item described above.
The processor 402 is used to provide computing and control capabilities to support the operation of the overall computer device 400. It should be understood that, in the embodiment of the present application, the processor 402 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general-purpose processor may be a microprocessor or any conventional processor.
The internal memory 404 provides an environment for running the computer program 4032 in the non-volatile storage medium 403, and when the computer program 4032 is executed by the processor 402, the processor 402 may be caused to perform the steps of the method for analyzing a warehouse logistics item.
The network interface 405 is used for network communication with other devices.
Those skilled in the art will appreciate that the configuration shown in Fig. 4 is a block diagram of only the portion of the configuration relevant to the present application and does not limit the computer device 400 to which the present application is applied; a particular computer device 400 may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.
Claims (11)
1. A method of analyzing a warehouse logistics item, comprising:
acquiring first item data of a warehouse logistics item to be analyzed;
preprocessing the first project data to acquire a plurality of characteristic information and generate a test set;
inputting the test set into a decision tree model, and obtaining the predicted operation index of the warehouse logistics item according to the decision tree model, wherein the decision tree model is established by training second item data of a plurality of historical warehouse logistics items.
2. The method of analyzing a warehouse logistics item of claim 1, wherein, after the step of inputting the test set into a decision tree model, the method further comprises:
obtaining a warehousing configuration scheme of warehouse logistics items of the same type as the warehouse logistics item to be analyzed.
3. The method of analyzing a warehouse logistics item of claim 2, wherein the obtaining a warehouse configuration plan for a warehouse logistics item of the same type as the warehouse logistics item to be analyzed comprises:
determining a path corresponding to the warehouse logistics item to be analyzed in the decision tree model;
judging whether the path corresponding to each historical warehouse logistics item in the decision tree model is consistent with the path corresponding to the warehouse logistics item to be analyzed;
determining historical warehouse logistics items consistent with paths corresponding to the warehouse logistics items to be analyzed as warehouse logistics items of the same type as the warehouse logistics items to be analyzed;
and acquiring a warehouse configuration scheme of warehouse logistics items of the same type as the warehouse logistics items to be analyzed.
4. The method of analyzing a warehouse logistics item of claim 1, wherein the pre-processing the first item data comprises:
cleaning the first project data;
fusing the cleaned first project data;
performing characterization processing on the first item data subjected to the fusion processing, wherein the characterization processing comprises the following steps: mean removal, range scaling, normalization, one-hot encoding.
5. The method of analyzing a warehouse logistics item of claim 1, wherein the decision tree model is built by:
acquiring second item data of a plurality of historical warehouse logistics items;
preprocessing the second item data to acquire a plurality of characteristic information and generate a training set;
and determining a root node, an internal node and a leaf node of the decision tree model according to the training set to obtain the decision tree model.
6. The method of analyzing a warehouse logistics item of claim 5, wherein determining a root node, an interior node of the decision tree model from the training set comprises:
determining parent nodes of the decision tree model based on information gain according to the training set, the parent nodes including the root node and the internal nodes.
7. The method of analyzing warehouse logistics items of claim 6 wherein the training set includes an operational indicator and a plurality of attributes for each historical warehouse logistics item;
the determining parent nodes of the decision tree model based on information gain according to the training set includes:
calculating a first entropy value for the operation index under a current decision tree model;
respectively selecting each attribute to be classified for classification, and respectively calculating a second entropy value of the attribute to be classified aiming at the operation index under the decision tree model classified according to each attribute to be classified;
respectively calculating information gains between the first entropy and each second entropy, and acquiring the attribute to be classified of the target corresponding to the maximum value in the information gains;
and taking the attribute to be classified of the target as a current parent node.
8. The method of analyzing a warehouse logistics item of claim 7, wherein the entropy value is calculated according to the following formula:
$$\mathrm{Ent}(D) = -\sum_{k=1}^{y} p_k \log_2 p_k$$
wherein Ent(D) is the entropy value of sample D; y is the number of categories of the operation index; and p_k is the proportion of the k-th category of the operation index in sample D.
9. A storage medium storing a computer program, characterized in that the computer program, when being executed by a processor, realizes the steps of the method of analyzing a warehouse logistics item of any one of claims 1-8.
10. A computer arrangement comprising a processor and a memory having a computer program stored thereon, characterized in that the processor realizes the steps of the method of analyzing a warehouse logistics item of any one of claims 1-8 when executing the computer program.
11. An analysis system for a warehouse logistics item, comprising:
an acquisition module for acquiring first item data of a warehouse logistics item to be analyzed;
the processing module is used for preprocessing the first project data to acquire a plurality of characteristic information and generate a test set;
and the analysis module is used for sending the test set into a decision tree model and acquiring the predicted operation efficiency of the warehouse logistics item from the decision tree model, wherein the decision tree model is established by training second item data of a plurality of historical warehouse logistics items.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211428250.6A CN115907608A (en) | 2022-11-15 | 2022-11-15 | Warehouse logistics item analysis method and system, storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115907608A true CN115907608A (en) | 2023-04-04 |
Family
ID=86477281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211428250.6A Pending CN115907608A (en) | 2022-11-15 | 2022-11-15 | Warehouse logistics item analysis method and system, storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115907608A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116702059A (en) * | 2023-06-05 | 2023-09-05 | 苏州市联佳精密机械有限公司 | Intelligent production workshop management system based on Internet of things |
CN116702059B (en) * | 2023-06-05 | 2023-12-19 | 苏州市联佳精密机械有限公司 | Intelligent production workshop management system based on Internet of things |
CN116562769A (en) * | 2023-06-15 | 2023-08-08 | 深圳爱巧网络有限公司 | Cargo data analysis method and system based on cargo attribute classification |
CN117455340A (en) * | 2023-12-23 | 2024-01-26 | 翌飞锐特电子商务(北京)有限公司 | Logistics freight transportation information sharing and pushing method based on one record supply chain order |
CN117455340B (en) * | 2023-12-23 | 2024-03-08 | 翌飞锐特电子商务(北京)有限公司 | Logistics freight transportation information sharing and pushing method based on one record supply chain order |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||