CN111815348B

CN111815348B - Regional commodity production planning method based on commodity similarity clustering of stores

Info

Publication number: CN111815348B
Application number: CN202010467295.9A
Authority: CN
Inventors: 王一君; 陈灿; 黄国安; 吴珊珊
Original assignee: Hangzhou Lanzhong Data Technology Co ltd
Current assignee: Hangzhou Lanzhong Data Technology Co ltd
Filing date: 2020-05-28
Publication date: 2024-07-12
Anticipated expiration: 2040-05-28

Abstract

The invention discloses a regional commodity production planning method based on commodity similarity clustering of stores. The method comprises the following steps: first, a similarity matrix of time-series periodicity is calculated for each commodity from the historical daily sales records of all commodities in every store in the region; next, the variation in commodity sales under different states of external factors is calculated to obtain a sensitivity distance matrix between commodities; then, using a clustering algorithm, the commodities are clustered at store level by combining the time-series periodicity similarity with the sensitivity distance matrix, and the optimal number of classes K is selected according to the elbow method and the silhouette coefficient; future sales are predicted at the cluster level, and finally the demand of the K classes is summed to give a regional production plan. The method helps reduce the risk of stock-out losses and of overstock write-offs, and plays an important role in improving the accuracy of the regional production plan.

Description

Regional commodity production planning method based on commodity similarity clustering of stores

Technical Field

The invention belongs to the field of information technology, and in particular relates to a regional commodity production planning method based on commodity similarity clustering of stores.

Background

With the development of computer technology, computer networks and management systems are being applied to almost every aspect of the retail industry, in which regional factory production is critical. When planning the production of a commodity for a region, an enterprise faces different sales conditions for each item in each store, different degrees of change in external factors, different production cycles for each commodity, and different levels of understanding of the commodity in each store, so it is difficult for the decision maker to make optimal predictions and decisions about the regional production plan. Many industries estimate the total for the whole region, or sum the demand that each store sets from experience into a regional production quantity. However, because external factors affect the commodities of each store to a different degree and store managers differ in skill, a good production plan is hard to make in this way. A better approach is therefore to group similar stores, predict sales for each group, and derive the production plan for the region as a whole.

In recent years, more and more industries have recognized the importance of regional production forecasting. Most industry forecasting methods are based on a moving-average model for a single item in a single store, which is then adjusted according to the decision maker's business experience, but lost demand, overstock write-offs and similar problems caused by inaccurate forecasts persist. The invention therefore provides a regional commodity production planning method based on commodity similarity clustering of stores, to guide enterprises toward more reasonable regional production planning decisions.

Disclosure of Invention

The invention aims to overcome the shortcomings of existing approaches to regional commodity production planning and provides a regional commodity production planning method based on commodity similarity clustering of stores.

The invention comprises the following steps:

Step 1: acquire a transaction detail data set D covering a specified historical period for all store commodities, remove the promotional activity information and holiday information from D, and then aggregate by day to obtain the daily sales quantity set S of the store commodities;

Step 2: based on the daily sales quantity set S, calculate the time demand pattern T of each commodity, which captures its periodicity:

where each element of T is the average sales of the commodity on day i of the week, n_i denotes the number of times day i of the week occurs within the specified period, and d ∈ [1, n_i];

Step 3: using the daily sales data set S and a weather factor data set X for the specified period, train a linear regression model with X as the model input and the daily sales as the model output y; the regression coefficients obtained for the external weather factors in X form the change rate matrix E of daily sales;

Step 4: based on the historical store daily sales data set S, calculate the time demand pattern distance between stores for the commodity from the most recent month of sales data in S, using the following formula:

where i and j denote two different stores, and T_i^k denotes the k-th element of the time demand pattern T of store i;

Step 5: based on the change rate matrix E of each commodity in each store, calculate the distance between stores in the change rate of external factors, using the following formula:

where i and j denote two different stores, and E_i^k denotes the k-th element of the external factor change rate matrix E of store i;

Step 6: based on the time demand pattern distance Dis_T and the external factor change rate distance Dis_E, calculate the distance between stores for the commodity, using the following formula:

DIS(i, j) = Dis_T(i, j) + Dis_E(i, j) (5)

where Dis_T(i, j) denotes the distance between stores i and j in time demand pattern, Dis_E(i, j) denotes the distance between stores i and j in the change rate of external factors, and DIS(i, j) denotes the overall distance between stores i and j;

Step 7: cluster similar stores according to the inter-store distance defined in formula (5);

similarity between stores is defined by the distance between them;

Step 8: obtain the optimal class evaluation Score from the minimum intra-class distance and the silhouette coefficient, and select the optimal number of classes k according to Score;

Step 9: after obtaining the optimal number of classes k according to Score, aggregate sales at the class level;

Step 10: predict the future sales y_t using an ARIMA model;

Step 11: sum the future sales y_t of all classes; the resulting total predicted demand is the regional commodity production plan.
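As an illustration of Steps 2 to 6, the following minimal Python sketch computes a weekday demand pattern, regression coefficients for weather factors, and the combined distance of formula (5). It is a sketch under stated assumptions, not the patented implementation: the formulas for T, Dis_T and Dis_E are not reproduced in this text, so T is read as the per-weekday mean, both distances are assumed to be Euclidean, and all function names and data are hypothetical.

```python
import math
from collections import defaultdict
from datetime import date, timedelta

def weekday_pattern(daily):
    """Step 2 (assumed reading): mean sales per weekday, Monday..Sunday."""
    sums, counts = defaultdict(float), defaultdict(int)
    for day, qty in daily.items():
        sums[day.weekday()] += qty
        counts[day.weekday()] += 1          # n_i: occurrences of weekday i
    # Weekdays never observed get 0.0 (an assumption of this sketch).
    return [sums[i] / counts[i] if counts[i] else 0.0 for i in range(7)]

def regression_coefficients(X, y):
    """Step 3: least-squares slopes of y on the columns of X (intercept dropped)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented normal equations [A^T A | A^T y].
    M = [[sum(r[a] * r[b] for r in rows) for b in range(p)]
         + [sum(r[a] * t for r, t in zip(rows, y))] for a in range(p)]
    for c in range(p):                      # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(p):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[a][p] for a in range(1, p)]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def overall_distance(T_i, T_j, E_i, E_j):
    """Steps 4 to 6: DIS = Dis_T + Dis_E, with Euclidean distances assumed."""
    return euclidean(T_i, T_j) + euclidean(E_i, E_j)

# Hypothetical data: two weeks of daily sales for one store, starting on a Monday.
start = date(2020, 5, 4)
sales = [10, 12, 11, 13, 20, 30, 28, 14, 12, 13, 15, 22, 34, 26]
T_a = weekday_pattern({start + timedelta(days=k): q for k, q in enumerate(sales)})

# Hypothetical weather factors (temperature, rain flag) and matching sales.
X = [[20, 0], [25, 0], [22, 1], [30, 1], [18, 0], [27, 1]]
y = [5 + 2 * t - 3 * r for t, r in X]
E_a = regression_coefficients(X, y)

# A second hypothetical store, used only to demonstrate the distance.
T_b = [t + 1 for t in T_a]
E_b = [2.0, -3.0]
dis = overall_distance(T_a, T_b, E_a, E_b)
```

The regression helper fails with a division by zero if a factor column is constant or collinear with another, so real data would need a check or a regularized solver.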

Further, the clustering in step 7 uses k-means clustering, implemented as follows:

Input:

Data set S

Output:

The class center points and the class label c of each point

Initialization:

Randomly select k center points μ_1, …, μ_k from the data set S

First, k class center points are selected at random. Each sample s^(i) is then assigned to the class c^(j) whose center point μ_j is nearest to it, and the value of each class center point μ_j is updated according to the samples in c^(j). These two steps are repeated until the class centers no longer change or the change is smaller than a given threshold. The resulting c gives the class of each store, and stores in the same class are the similar stores.
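The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each store is represented by a single numeric feature vector and using plain Euclidean distance in place of the combined distance of formula (5); the function names, the seed, and the sample data are hypothetical.

```python
import math
import random

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def kmeans(samples, k, max_iter=100, tol=1e-9, seed=0):
    """Return (centers, labels) after iterating assignment and update steps."""
    rng = random.Random(seed)
    # Initialization: pick k distinct samples as the starting centers.
    centers = [list(s) for s in rng.sample(samples, k)]
    labels = [0] * len(samples)
    for _ in range(max_iter):
        # Assignment step: label each sample with its nearest center.
        labels = [min(range(k), key=lambda j: dist(s, centers[j])) for s in samples]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [s for s, c in zip(samples, labels) if c == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty class
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:                         # centers stopped moving
            break
    return centers, labels

# Hypothetical store feature vectors forming two well-separated groups.
samples = [[1.0, 1.0], [1.2, 1.2], [0.9, 0.9], [8.0, 8.0], [8.2, 8.2], [7.9, 7.9]]
centers, labels = kmeans(samples, k=2)
```

Because the starting centers are random, k-means can settle in a poor local optimum on less clearly separated data; running it with several seeds and keeping the result with the smallest intra-class distance is a common precaution.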

Further, step 8 is implemented as follows:

Intra-class distance SSE:

the number of classes k is selected so as to minimize the overall distance;

Silhouette coefficient SC:

where a(i) is the average distance from sample i to the other samples in its own class and b(i) is the average distance from sample i to all samples in the other classes; the number of class centers k is selected so that the intra-class distance is as small as possible and the inter-class distance as large as possible;

Optimal class evaluation Score:

the intra-class distance and the silhouette coefficient are combined, and k is selected within a reasonable range of class center numbers such that the intra-class distance is smaller and the inter-class distance larger.
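The two quantities that feed Score can be computed as in the following Python sketch. It assumes the usual definitions, SSE as the sum of squared distances from each sample to its class center and a per-sample silhouette of (b(i) − a(i)) / max(a(i), b(i)) averaged over all samples, with a(i) and b(i) as defined in the text. How the two are combined into Score is not reproduced here, so the sketch stops at the two inputs; names and data are hypothetical.

```python
import math

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def sse(samples, labels, centers):
    """Sum of squared distances from each sample to its class center."""
    return sum(dist(s, centers[c]) ** 2 for s, c in zip(samples, labels))

def silhouette(samples, labels):
    """Mean of (b - a) / max(a, b) over all samples.

    a: mean distance to the other samples in the same class.
    b: mean distance to all samples in the other classes, as in the text
       (the textbook variant uses only the nearest other class).
    """
    scores = []
    for i, s in enumerate(samples):
        same = [dist(s, t) for j, t in enumerate(samples)
                if j != i and labels[j] == labels[i]]
        other = [dist(s, t) for t, c in zip(samples, labels) if c != labels[i]]
        if not same or not other:
            scores.append(0.0)      # singleton class or single class overall
            continue
        a = sum(same) / len(same)
        b = sum(other) / len(other)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical clustering result for four stores in two classes.
samples = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
centers = [[0.0, 1.0], [10.0, 1.0]]
total_sse = sse(samples, labels, centers)
sc = silhouette(samples, labels)
```

Evaluating both values for each candidate k gives the curve on which an elbow in SSE and a peak in the silhouette coefficient can be read off.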

Further, in step 9, after the optimal number of classes k has been obtained according to Score, sales are aggregated at the class level to obtain the aggregated sales data, implemented as follows:

the commodity sales s^(i) of the store samples belonging to class c^(k) are summed.
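A minimal Python sketch of this class-level aggregation follows, assuming each store contributes a daily sales series of equal length and that class labels are integers starting at 0; the function name and data are hypothetical.

```python
def aggregate_by_class(store_series, labels):
    """Sum the daily sales of all stores that share a class label.

    store_series: one equal-length list of daily sales per store.
    labels: class index of each store, in the same order.
    Returns one summed daily series per class.
    """
    k = max(labels) + 1
    n_days = len(store_series[0])
    totals = [[0.0] * n_days for _ in range(k)]
    for series, c in zip(store_series, labels):
        for d, qty in enumerate(series):
            totals[c][d] += qty
    return totals

# Hypothetical data: three stores, the first two in class 0.
store_series = [[1, 2, 3], [4, 5, 6], [10, 10, 10]]
labels = [0, 0, 1]
class_series = aggregate_by_class(store_series, labels)
```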

Further, the future sales prediction with the ARIMA model in step 10 is implemented as follows:

the aggregated sales data X are taken as the model input X:

where μ is a constant term, ε_t is the error term, γ_i are the autoregressive coefficients, and θ_i are the error term coefficients.
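For orientation, the standard ARIMA(p, d, q) recursion that matches the symbols defined above can be written as follows. The orders p, d and q, and the use of y_t for the aggregated sales series after d rounds of differencing, are assumptions of this sketch and are not stated in the text.

```latex
y_t = \mu + \sum_{i=1}^{p} \gamma_i \, y_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \, \varepsilon_{t-i}
```

The first sum is the autoregressive part, driven by past values of the series, and the second is the moving-average part, driven by past error terms.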

The beneficial effects of the invention are as follows:

The method uses daily sales data and the sensitivity of commodities to external factors to calculate commodity similarity at store level, and then forecasts the regional production plan with an ARIMA model. This gives the region a scientific forecast that can serve as a reference, helps enterprises and regions decide on production plans and manage inventory more reasonably, and plays an important role in reducing the risk of stock-out losses and overstock write-offs and in improving the accuracy of the regional production plan.

Drawings

FIG. 1 is a flow chart of an embodiment of the method of the invention.

FIG. 2 shows the results of an embodiment using the method.

Detailed Description

The objects and effects of the invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings and tables. Taking actual conditions into account, the method builds, from store historical sales data, distance matrices for the commodity time-series pattern and for the sensitivity to external factors; selects the optimal number of class centers K using the intra-class distance and the silhouette coefficient; clusters each commodity at store level with the K-means algorithm; and forecasts future sales with an ARIMA model, thereby producing the production plan decision for regional commodities.

FIG. 2 is an example of the regional production plan obtained with the invention for a target commodity over the next 3 days, comparing the predicted production quantity with the actual sales.
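A 3-day regional plan of this kind could be assembled as in the following Python sketch of Steps 10 and 11. It is a deliberately small stand-in, not the model used for FIG. 2: it fits an ARIMA(1, 1, 0) model without a constant by least squares on the first differences, whereas the model orders actually used are not stated in the text. All names and data are hypothetical.

```python
def fit_ar1_on_differences(series):
    """Least-squares AR(1) coefficient of the first-differenced series."""
    d = [b - a for a, b in zip(series, series[1:])]
    num = sum(x * y for x, y in zip(d, d[1:]))
    den = sum(x * x for x in d[:-1])
    return num / den if den else 0.0   # flat history: no autoregressive signal

def forecast(series, steps):
    """Forecast `steps` future values by rolling the differenced AR(1) forward."""
    phi = fit_ar1_on_differences(series)
    last = series[-1]
    diff = series[-1] - series[-2]
    out = []
    for _ in range(steps):
        diff = phi * diff              # predicted next difference
        last = last + diff             # undo the differencing
        out.append(last)
    return out

def regional_plan(class_series, steps):
    """Step 11: sum the per-class forecasts day by day."""
    forecasts = [forecast(s, steps) for s in class_series]
    return [sum(day) for day in zip(*forecasts)]

# Hypothetical aggregated sales for two classes of stores.
class_series = [[10, 12, 14, 16, 18], [5, 5, 5, 5, 5]]
plan = regional_plan(class_series, steps=3)
```

In practice a library implementation with order selection and diagnostics would replace the hand-rolled fit; the sketch only shows how per-class forecasts roll up into a regional quantity.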

The present invention is not limited to the above embodiments, and those skilled in the art can practice the invention in various other embodiments in light of this disclosure. Accordingly, simple changes or modifications that adopt the design structure and idea of the invention fall within its scope of protection.

Claims

1. A regional commodity production planning method based on commodity similarity clustering of stores, characterized by comprising the following steps:

DIS(i, j) = Dis_T(i, j) + Dis_E(i, j) (5)

similarity between stores is defined by the distance between them;

Step 8: obtain an optimal class evaluation Score from the minimum intra-class distance and the silhouette coefficient, and select the optimal number of classes k according to Score together with the intra-class distance and the silhouette coefficient;

Step 9: after obtaining the optimal number of classes k according to Score, aggregate sales at the class level;

Step 10: predict the future sales y_t using an ARIMA model;

2. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 1, wherein the clustering in step 7 uses k-means clustering, implemented as follows:

Input:

Data set S

Output:

The class center points and the class label c of each point

Initialization:

Randomly select k center points μ_1, …, μ_k from the data set S

3. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 1 or 2, wherein step 8 is implemented as follows:

Intra-class distance SSE:

the optimal number of classes k is selected so as to minimize the overall distance;

Silhouette coefficient SC:

where a(i) is the average distance from sample i to the other samples in its own class and b(i) is the average distance from sample i to all samples in the other classes; the optimal number of classes k is selected so that the intra-class distance is as small as possible and the inter-class distance as large as possible;

Optimal class evaluation Score:

the intra-class distance and the silhouette coefficient are combined, and the optimal number of classes k is selected, within a reasonable range of class center numbers, such that the intra-class distance is smaller and the inter-class distance larger.

4. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 3, wherein in step 9, after the optimal number of classes k has been obtained according to Score, sales are aggregated at the class level to obtain the aggregated sales data, implemented as follows:

the sales s^(i) of the store samples belonging to class c^(k) are summed.

5. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 4, wherein the future sales prediction with the ARIMA model in step 10 is implemented as follows:

the aggregated sales data X(k) are taken as the model input X: