CN111815348B - Regional commodity production planning method based on commodity similarity clustering of stores - Google Patents

Regional commodity production planning method based on commodity similarity clustering of stores

Publication number
CN111815348B
Authority
CN
China
Prior art keywords
store
class
distance
commodity
sales
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010467295.9A
Other languages
Chinese (zh)
Other versions
CN111815348A (en
Inventor
王一君
陈灿
黄国安
吴珊珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Lanzhong Data Technology Co ltd
Original Assignee
Hangzhou Lanzhong Data Technology Co ltd
Filing date
Publication date
Application filed by Hangzhou Lanzhong Data Technology Co ltd
Priority to CN202010467295.9A
Publication of CN111815348A
Application granted
Publication of CN111815348B

Abstract

The invention discloses a regional commodity production planning method based on clustering stores by commodity similarity. The method comprises the following steps: first, from the historical daily sales records of all commodities in each store in the region, a similarity matrix of each commodity's time-series periodicity is calculated; next, the change in commodity sales under different states of external factors is calculated to obtain a sensitivity distance matrix between commodities; then a clustering algorithm combines the time-series periodicity similarity with the sensitivity distance matrix to cluster the commodities at store level, the optimal number of classes K is selected using the elbow rule and the silhouette coefficient, future sales are predicted at the cluster level, and finally the demand of the K classes is summed to give the regional production plan. The method helps reduce stock-out losses and the risk of overstock write-offs, and improves the accuracy of the regional production plan.

Description

Regional commodity production planning method based on commodity similarity clustering of stores
Technical Field
The invention belongs to the field of information technology, and particularly relates to a regional commodity production planning method based on commodity similarity clustering of stores.
Background
With the development of computer technology, computer networks and management systems are being applied to almost every aspect of the retail industry, where regional factory production is critical. In planning regional commodity production, an enterprise faces different sales conditions for each item in each store, different degrees of change in external factors, different production cycles for each commodity, and different levels of understanding of the commodity at each store, so it is difficult for a decision maker to make optimal predictions and decisions about the regional production plan. Many industries either estimate the total for the whole region or have each store set its demand from experience and then sum these into the regional production plan quantity; but because store commodities respond differently to external changes and store managers differ in skill, a good production plan is hard to produce this way. A better approach is therefore to group similar stores, predict sales for each group, and derive the production plan for the region as a whole.
In recent years, more and more industries have recognized the importance of regional production forecasting. Most industry forecasting methods apply a moving average model to single products in single stores and then adjust the result according to the decision maker's business experience, but inaccurate predictions still lead to lost demand, overstock write-offs, and similar problems. The invention therefore provides a regional commodity production planning method based on commodity similarity clustering of stores, to guide enterprises toward more reasonable regional commodity production planning decisions.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing regional commodity production planning and provides a regional commodity production planning method based on commodity similarity clustering of stores.
The invention comprises the following steps:
Step 1: Acquire the transaction detail data set D for all store commodities over a specified historical period, remove promotional activity information and holiday information from D, and aggregate by day to obtain the daily sales quantity set S of the store commodities;
Step 2: Based on the daily sales quantity set S, calculate the periodic time demand pattern T of each commodity:
where the averaged term is the mean sales of the commodity on weekday i, $n_i$ is the number of days falling on weekday i within the specified period, and $d \in [1, n_i]$ indexes those days;
Step 3: Using the daily sales data set S and the weather factor data set X for the specified period, train a linear regression model with the weather factor data set X as the model input X and the daily sales data as the model output y; the regression coefficients obtained for each external weather factor form the change rate matrix E of daily sales;
Step 4: Based on the historical store daily sales data set S, calculate the time demand pattern distance between stores for the commodity from the most recent month of sales data in S, using the following formula:
where i and j denote two different stores and $T_i^k$ denotes the kth element of the time demand pattern T of store i;
Step 5: Based on the change rate matrix E of each commodity of each store, calculate the external-factor change rate distance between stores, using the following formula:
where i and j denote two different stores and $E_i^k$ denotes the kth element of the external-factor change rate matrix E of store i;
Step 6: Based on the time demand pattern distance $Dis_T$ and the external-factor change rate distance $Dis_E$, calculate the store-level distance for the commodity as follows:
$DIS(i,j) = Dis_T(i,j) + Dis_E(i,j)$ (5)
where $Dis_T(i,j)$ is the distance between stores i and j in time demand pattern, $Dis_E(i,j)$ is the distance between stores i and j in external-factor change rate, and $DIS(i,j)$ is the overall distance between stores i and j;
Step 7: Cluster similar stores according to the inter-store distance defined in formula (5);
store similarity is defined by this inter-store distance;
Step 8: Obtain the optimal class evaluation Score from the minimum intra-class distance and the silhouette coefficient, and select the optimal number of classes k according to this Score;
Step 9: After obtaining the optimal number of classes k according to Score, aggregate sales at the class level;
Step 10: Predict future sales $y_t$ using an ARIMA model;
Step 11: Sum the future sales $y_t$ of all classes; the total predicted demand is the regional commodity production plan.
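As an illustration of Steps 2 and 4 to 6, the following Python sketch builds a weekday demand pattern and the combined store distance of formula (5). The function names, the Monday-to-Sunday ordering of T, the treatment of unobserved weekdays, and the use of Euclidean distance for the two component distances are assumptions made for the example; the text above does not fix these details.

```python
from datetime import date
from math import sqrt


def weekday_pattern(daily_sales):
    """Mean sales per weekday (Mon=0 .. Sun=6) from a {date: quantity} mapping."""
    totals = [0.0] * 7
    counts = [0] * 7  # counts[w] plays the role of n_i for weekday w
    for day, qty in daily_sales.items():
        w = day.weekday()
        totals[w] += qty
        counts[w] += 1
    # Assumption: a weekday with no observations contributes 0.0.
    return [totals[w] / counts[w] if counts[w] else 0.0 for w in range(7)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def store_distance(t_i, t_j, e_i, e_j):
    """Overall distance DIS(i, j) = Dis_T(i, j) + Dis_E(i, j), as in formula (5)."""
    return euclidean(t_i, t_j) + euclidean(e_i, e_j)


if __name__ == "__main__":
    sales = {date(2020, 5, 4): 10, date(2020, 5, 11): 14, date(2020, 5, 5): 6}
    print(weekday_pattern(sales))
    print(store_distance([0, 0], [3, 4], [1], [1]))
```

For two stores i and j, `store_distance(T_i, T_j, E_i, E_j)` gives one entry of the distance matrix that Step 7 clusters on.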
Further, the clustering in Step 7 uses k-means clustering, implemented as follows:
Input:
Data set S
Output:
Class center points μ and the class label c of each point
Initialization:
Randomly select k center points $\mu_1, \dots, \mu_k$ from the data set S
First, k class center points are selected at random; each sample $s^{(i)}$ is then assigned the class label $c^{(j)}$ of its nearest center $\mu_j$, each class center $\mu_j$ is updated from the samples assigned to it, and the iteration repeats until the class centers no longer change or change by less than a given threshold. The resulting c gives the class of each store, that is, the groups of similar stores.
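The iteration described above can be sketched in plain Python. This is a minimal illustration, not the patent's implementation: it assumes each store is represented by a numeric feature vector and uses Euclidean distance between vectors, whereas the method clusters on the distance DIS of formula (5). The parameter names (`max_iter`, `tol`, `seed`) are hypothetical.

```python
import random
from math import dist


def kmeans(points, k, max_iter=100, tol=1e-9, seed=0):
    """Basic k-means: returns (centers, labels) for a list of numeric vectors."""
    rng = random.Random(seed)
    # Initialization: pick k distinct samples as the starting centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: label each sample with its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty class
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop when the centers no longer move
            break
    return centers, labels


if __name__ == "__main__":
    stores = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(stores, k=2))
```

Because the starting centers are random, results can differ between seeds; running several seeds and keeping the lowest intra-class distance is a common precaution.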
Further, Step 8 is implemented as follows:
Intra-class distance SSE:
the number of classes k is chosen so as to minimize the overall intra-class distance;
Silhouette coefficient SC:
where a(i) is the average distance from sample i to the other samples in its own class and b(i) is the average distance from sample i to all samples in other classes; the number of class centers k is chosen so that the intra-class distance is small and the inter-class distance is large;
Optimal class evaluation Score:
Score combines the intra-class distance and the silhouette coefficient: within a reasonable range of class center numbers, the k with the smaller intra-class distance and the larger inter-class distance is selected.
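The two quantities that feed the Score can be sketched as follows. The code uses the textbook definitions, which are an assumption here: SSE as the sum of squared Euclidean distances to the assigned center, and the silhouette with b(i) taken as the smallest mean distance to any other class (for two classes this coincides with the description of b(i) above). How the two are combined into Score is not recoverable from the text, so it is left out.

```python
from math import dist


def sse(points, labels, centers):
    """Sum of squared distances from each sample to its assigned class center."""
    return sum(dist(p, centers[c]) ** 2 for p, c in zip(points, labels))


def silhouette(points, labels):
    """Mean silhouette coefficient over all samples (standard definition)."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    scores = []
    for p, c in zip(points, labels):
        own = clusters[c]
        if len(own) == 1 or len(clusters) == 1:
            scores.append(0.0)  # convention for singleton or single-class cases
            continue
        # a(i): mean distance to the other members of the same class.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other class.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for label, other in clusters.items()
            if label != c
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(sse(pts, [0, 0, 1, 1], [(0, 0.5), (10, 0.5)]))
    print(silhouette(pts, [0, 0, 1, 1]))
```

Evaluating both functions for each candidate k gives the curves from which an elbow in SSE and a peak in the silhouette can be read.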
Further, in Step 9, after the optimal number of classes k is obtained from Score, sales are aggregated at the class level to obtain the aggregated sales data, as follows:
the commodity sales $s^{(i)}$ of the store samples belonging to class $c^{(k)}$ are summed;
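The class-level aggregation can be illustrated with a short sketch. The data layout is an assumption for the example: each store has a daily sales series of equal length aligned to the same dates, and `store_class` maps each store to the class label produced by the clustering.

```python
def aggregate_by_class(store_sales, store_class):
    """Sum the aligned daily sales series of all stores in each class."""
    totals = {}
    for store, series in store_sales.items():
        c = store_class[store]
        if c not in totals:
            totals[c] = list(series)
        else:
            # Element-wise sum; assumes series are aligned day by day.
            totals[c] = [a + b for a, b in zip(totals[c], series)]
    return totals


if __name__ == "__main__":
    sales = {"s1": [1, 2], "s2": [3, 4], "s3": [5, 6]}
    classes = {"s1": 0, "s2": 0, "s3": 1}
    print(aggregate_by_class(sales, classes))
```

Each resulting series is then the input to the class-level forecast of Step 10.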
Further, the future sales prediction with the ARIMA model in Step 10 is implemented as follows:
The aggregated sales data X is taken as the model input X:
where $\mu$ is a constant term, $\varepsilon_t$ is the error term, $\gamma_i$ is an autocorrelation coefficient, and $\theta_i$ is an error-term coefficient.
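For orientation, the standard ARMA form that matches the symbols named above is given below. It is offered as an assumption, not as the patent's own equation: the orders p and q, and the reading of $y_t$ as the (possibly differenced) aggregated sales series, are not stated in the text.

```latex
y_t = \mu + \sum_{i=1}^{p} \gamma_i \, y_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \, \varepsilon_{t-i}
```

In an ARIMA(p, d, q) model this relation is applied after differencing the series d times, and the forecast is integrated back to the original scale.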
The invention has the beneficial effects that:
The method uses daily sales data and the commodity's sensitivity to external factors to calculate commodity similarity at store level, and then predicts the regional production plan with an ARIMA model. This provides the region with a well-founded forecast for reference, supports production-plan decisions and more reasonable inventory management by enterprises and regions, and helps reduce stock-out losses and overstock write-off risk while improving the accuracy of the regional production plan.
Drawings
FIG. 1 is a detailed flow chart of an embodiment of the method of the present invention.
FIG. 2 is a diagram showing the results of an embodiment of the present invention using this method.
Detailed Description
The objects and effects of the invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Taking actual conditions into account, the method derives from store historical sales data a commodity time-series pattern and a distance matrix of sensitivity to external factors, selects the optimal number of class centers k using the intra-class distance and the silhouette coefficient, clusters each commodity at store level with the k-means algorithm, and forecasts future sales with an ARIMA model to produce the regional commodity production plan decision.
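The sensitivity to external factors referred to here comes from the linear regression of Step 3, whose coefficients form the change rate matrix E. A minimal sketch follows, assuming ordinary least squares with an intercept, solved through the normal equations; the function name and the example weather features (temperature and a rain indicator) are hypothetical and not taken from the patent.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares with intercept; returns (intercept, coefficients)."""
    rows = [[1.0] + [float(v) for v in x] for x in X]
    m = len(rows[0])
    # Normal equations: (A^T A) beta = A^T y
    ata = [[sum(r[a] * r[b] for r in rows) for b in range(m)] for a in range(m)]
    aty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(m)]
    aug = [ata[a] + [aty[a]] for a in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("weather factors are collinear or constant")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * p for x, p in zip(aug[r], aug[col])]
    beta = [aug[a][m] / aug[a][a] for a in range(m)]
    return beta[0], beta[1:]


if __name__ == "__main__":
    # Hypothetical data: columns are temperature and a rain indicator.
    weather = [[20, 0], [25, 1], [30, 0], [35, 1]]
    sales = [50, 55, 70, 75]
    intercept, change_rates = fit_linear_regression(weather, sales)
    print(intercept, change_rates)
```

Under these assumptions, the returned coefficient list for one store and commodity would be one row of E.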
FIG. 2 shows an example result of the regional production plan obtained with the invention for a target commodity over the next 3 days, comparing the predicted production quantity with the actual sales.
The present invention is not limited to the above embodiments, and those skilled in the art can practice it in various other embodiments in light of this disclosure. Accordingly, designs that adopt the structure and ideas of the invention with simple changes or modifications fall within its scope of protection.

Claims (5)

1. A regional commodity production planning method based on commodity similarity clustering of stores, characterized by comprising the following steps:
Step 1: Acquire the transaction detail data set D for all store commodities over a specified historical period, remove promotional activity information and holiday information from D, and aggregate by day to obtain the daily sales quantity set S of the store commodities;
Step 2: Based on the daily sales quantity set S, calculate the periodic time demand pattern T of each commodity:
where the averaged term is the mean sales of the commodity on weekday i, $n_i$ is the number of days falling on weekday i within the specified period, and $d \in [1, n_i]$ indexes those days;
Step 3: Using the daily sales data set S and the weather factor data set X for the specified period, train a linear regression model with the weather factor data set X as the model input X and the daily sales data as the model output y; the regression coefficients obtained for each external weather factor form the change rate matrix E of daily sales;
Step 4: Based on the historical store daily sales data set S, calculate the time demand pattern distance between stores for the commodity from the most recent month of sales data in S, using the following formula:
where i and j denote two different stores and $T_i^k$ denotes the kth element of the time demand pattern T of store i;
Step 5: Based on the change rate matrix E of each commodity of each store, calculate the external-factor change rate distance between stores, using the following formula:
where i and j denote two different stores and $E_i^k$ denotes the kth element of the external-factor change rate matrix E of store i;
Step 6: Based on the time demand pattern distance $Dis_T$ and the external-factor change rate distance $Dis_E$, calculate the store-level distance for the commodity as follows:
$DIS(i,j) = Dis_T(i,j) + Dis_E(i,j)$ (5)
where $Dis_T(i,j)$ is the distance between stores i and j in time demand pattern, $Dis_E(i,j)$ is the distance between stores i and j in external-factor change rate, and $DIS(i,j)$ is the overall distance between stores i and j;
Step 7: Cluster similar stores according to the inter-store distance defined in formula (5);
store similarity is defined by this inter-store distance;
Step 8: Obtain the optimal class evaluation Score from the minimum intra-class distance and the silhouette coefficient, and select the optimal number of classes k according to the Score together with the intra-class distance and the silhouette coefficient;
Step 9: After obtaining the optimal number of classes k according to Score, aggregate sales at the class level;
Step 10: Predict future sales $y_t$ using an ARIMA model;
Step 11: Sum the future sales $y_t$ of all classes; the total predicted demand is the regional commodity production plan.
2. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 1, wherein the clustering in Step 7 uses k-means clustering, implemented as follows:
Input:
Data set S
Output:
Class center points μ and the class label c of each point
Initialization:
Randomly select k center points $\mu_1, \dots, \mu_k$ from the data set S
First, k class center points are selected at random; each sample $s^{(i)}$ is then assigned the class label $c^{(j)}$ of its nearest center $\mu_j$, each class center $\mu_j$ is updated from the samples assigned to it, and the iteration repeats until the class centers no longer change or change by less than a given threshold; the resulting c gives the class of each store, that is, the groups of similar stores.
3. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 1 or 2, wherein Step 8 is implemented as follows:
Intra-class distance SSE:
the optimal number of classes k is chosen so as to minimize the overall intra-class distance;
Silhouette coefficient SC:
where a(i) is the average distance from sample i to the other samples in its own class and b(i) is the average distance from sample i to all samples in other classes; the optimal number of classes k is chosen so that the intra-class distance is small and the inter-class distance is large;
Optimal class evaluation Score:
combining the intra-class distance and the silhouette coefficient, the optimal number of classes k with the smaller intra-class distance and the larger inter-class distance is selected within a reasonable range of class center numbers.
4. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 3, wherein in Step 9, after the optimal number of classes k is obtained from Score, sales are aggregated at the class level to obtain the aggregated sales data, as follows:
the sales $s^{(i)}$ of the store samples belonging to class $c^{(k)}$ are summed.
5. The regional commodity production planning method based on commodity similarity clustering of stores according to claim 4, wherein the future sales prediction with the ARIMA model in Step 10 is implemented as follows:
the aggregated sales data $X^{(k)}$ is taken as the model input X:
where $\mu$ is a constant term, $\varepsilon_t$ is the error term, $\gamma_i$ is an autocorrelation coefficient, and $\theta_i$ is an error-term coefficient.
CN202010467295.9A 2020-05-28 Regional commodity production planning method based on commodity similarity clustering of stores Active CN111815348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010467295.9A CN111815348B (en) 2020-05-28 Regional commodity production planning method based on commodity similarity clustering of stores

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010467295.9A CN111815348B (en) 2020-05-28 Regional commodity production planning method based on commodity similarity clustering of stores

Publications (2)

Publication Number Publication Date
CN111815348A CN111815348A (en) 2020-10-23
CN111815348B true CN111815348B (en) 2024-07-12

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101819668A (en) * 2010-04-27 2010-09-01 浙江大学 Sales predicting model based on product intrinsic life cycle character
CN105556557A (en) * 2013-09-20 2016-05-04 日本电气株式会社 Shipment-volume prediction device, shipment-volume prediction method, recording medium, and shipment-volume prediction system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant