CN116308477A

CN116308477A - Method for recommending store goods of auto parts vulnerable part in big data scene

Info

Publication number: CN116308477A
Application number: CN202310074909.0A
Authority: CN
Inventors: 宋继斌; 林少豪; 彭小院
Original assignee: Xiamen Chuanglianxiang Information Technology Co ltd
Current assignee: Xiamen Chuanglianxiang Information Technology Co ltd
Priority date: 2023-01-30
Filing date: 2023-01-30
Publication date: 2023-06-23

Abstract

The invention provides a method for recommending store purchases of wear-prone auto parts in a big data scenario, comprising the following steps: S10: preprocessing the historical purchasing data of a store to construct a training data set; S20: constructing basic prediction models from the training data set; S30: constructing an integrated prediction model from the basic prediction models and determining the sales forecast value of the store; S40: constructing a category safety-stock level model from the category sales and inventory data of stores nationwide, in combination with the sales forecast value of the store; S50: constructing a commodity package clustering model from the historical sales data of the commodities of a single store and dividing the commodities into commodity packages; S60: determining the recommended purchase quantity and the commodity package assignment of the store's commodities from the integrated prediction model, the safety-stock level model and the clustering model. The method adapts to the characteristics of each store and makes purchasing more scientific, reducing inventory backlog while keeping goods sufficient and thereby facilitating inventory turnover.

Description

Method for recommending store purchases of wear-prone auto parts in a big data scenario

Technical Field

The invention relates to a method for recommending store purchases of wear-prone auto parts in a big data scenario, applied to the field of purchasing for wear-prone auto parts stores.

Background

In the daily operation of a wear-prone auto parts store, stock preparation and purchasing have an important influence on the store's sales, inventory and capital turnover. A store purchaser collates the quotation files of the auto parts suppliers and, after weighing factors such as inventory, price, delivery timeliness and part quality, sends a purchase request to a supplier that meets the requirements; on receiving the request, the supplier issues a sales order and arranges delivery. This traditional purchasing process is complicated and time-consuming. The number of SKUs in the auto parts field is large, so a purchaser must know both the market and the store's operating conditions very well to purchase sensibly. Wear-prone parts are needed to eliminate potential safety hazards in a vehicle: if a customer cannot have a part replaced immediately because stock is insufficient, the customer is lost; if too much stock accumulates, the burden on goods and capital turnover leads to poor operating results. The traditional stocking mode for wear-prone auto parts therefore consumes a great deal of enterprise resources and labor, and the purchasing process is complicated, inefficient and costly.

Therefore, in view of these problems in the prior art, the invention provides a method for recommending store purchases of wear-prone auto parts in a big data scenario.

Disclosure of Invention

The invention provides a method for recommending store purchases of wear-prone auto parts in a big data scenario, which can effectively solve the above problems.

The invention is realized in the following way:

A method for recommending store purchases of wear-prone auto parts in a big data scenario comprises the following steps:

S10: preprocessing the historical purchasing data of a wear-prone auto parts store to construct a training data set;

S20: constructing basic prediction models from the training data set;

S30: constructing an integrated prediction model from the basic prediction models, and determining the sales forecast value of the store;

S40: constructing a category safety-stock level model from the category sales and inventory data of wear-prone auto parts stores nationwide, in combination with the sales forecast value of the store;

S50: constructing a commodity package clustering model from the historical sales data of the commodities of a single wear-prone auto parts store, and dividing the commodities into commodity packages;

S60: calculating the recommended purchase quantity and the commodity package assignment of the store's commodities from the integrated prediction model, the safety-stock level model and the clustering model.

As a further improvement, the training data set contains data for a plurality of commodities; each commodity corresponds to a data set D = {(X_0, y_0), …, (X_n, y_n)}, where X_n is the feature data of the nth sample and y_n is the actual value (sales volume) of the nth sample. The feature data comprise, for the most recent preset number of months: sales volume, purchase quantity, number of customers sold to, average purchase price for the month, number of newly developed customers, repair-shop customer unit price, purchase quantity of active service providers, and purchase quantity of inactive, non-first-time service providers.

As a further improvement, constructing the basic prediction models from the training data set comprises:

constructing a first basic prediction model by using linear regression to fit the linear relationship between the feature set and the sales volume;

constructing a second basic prediction model by using random forest regression to explore the nonlinear relationship between the feature set and the sales volume;

and constructing a third basic prediction model by using gradient boosting regression.

As a further improvement, constructing the first basic prediction model by using linear regression to fit the linear relationship between the feature set and the sales volume comprises:

the constructed linear regression model is:

and then, by means of the loss function:

the parameter vector θ is obtained by minimizing the loss function;

the linear regression is solved using gradient descent:

wherein x is the input sample feature set, θ is the model parameter vector, h(x) is the model predicted value, y is the actual value of the sample, a is the learning rate, J(θ) is the loss function, and k and m denote the kth of m samples.
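The symbols defined above can be illustrated with a minimal sketch of batch gradient descent. It assumes the conventional forms h(x) = θ·x and J(θ) = (1/(2m)) Σ_k (h(x_k) − y_k)², which are not reproduced in the text; the function names, the intercept handling and the toy data are hypothetical.

```python
def fit_linear_regression(X, y, learning_rate=0.1, n_iters=2000):
    """Batch gradient descent for h(x) = theta . x.

    Assumed loss (not given in the text): J = 1/(2m) * sum((h(x_k) - y_k)^2).
    """
    m = len(X)
    # Prepend a constant 1.0 so theta[0] acts as the intercept (assumption).
    rows = [[1.0] + list(x) for x in X]
    theta = [0.0] * len(rows[0])
    for _ in range(n_iters):
        # Prediction error h(x_k) - y_k for every sample k.
        errors = [sum(t * v for t, v in zip(theta, row)) - yk
                  for row, yk in zip(rows, y)]
        # Partial derivative of J with respect to each theta_j.
        grad = [sum(e * row[j] for e, row in zip(errors, rows)) / m
                for j in range(len(theta))]
        # Update step: theta_j := theta_j - a * dJ/dtheta_j.
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
    return theta


def predict_linear(theta, x):
    """Model prediction h(x) for one feature vector x."""
    return theta[0] + sum(t * v for t, v in zip(theta[1:], x))


# Toy data generated from y = 2 + 3x (hypothetical, not from the source).
theta = fit_linear_regression([[0.0], [1.0], [2.0], [3.0]],
                              [2.0, 5.0, 8.0, 11.0])
```

On this toy data the fitted parameters approach the intercept 2 and slope 3 used to generate it.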

As a further improvement, constructing the second basic prediction model by using random forest regression to explore the nonlinear relationship between the feature set and the sales volume comprises:

randomly drawing m samples from the training data set by resampling to obtain a new sub-data set;

randomly selecting three features in the sub-data set and training a CART regression tree;

repeating the above two steps n times to construct a random forest model consisting of n regression trees, the predicted value of the random forest being determined jointly by the predictions of all the regression trees;

the generation of a CART regression tree is a process of recursively constructing a binary tree, and each step of growing the tree solves the following equation:

that is, every feature A and every split point s in the data set are traversed, each splitting the data set into subsets D_1 and D_2; the feature A and split point s for which the mean square error within each of D_1 and D_2 is minimized and the sum of the two mean square errors is smallest are used to generate the branch of the current tree.
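The split search described above can be sketched as follows. This is only an illustration of one split step, not of the whole random forest: bootstrap sampling, feature subsetting and recursion are omitted. It assumes the usual CART regression criterion, the sum of squared errors of D_1 and D_2 around their own means; the function names and toy data are hypothetical.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y):
    """Return (feature_index, split_point, total_sse) minimizing SSE(D1) + SSE(D2)."""
    best = None
    for a in range(len(X[0])):                     # traverse every feature A
        for s in sorted({row[a] for row in X}):    # traverse every split point s
            d1 = [yk for row, yk in zip(X, y) if row[a] <= s]
            d2 = [yk for row, yk in zip(X, y) if row[a] > s]
            if not d1 or not d2:
                continue  # a split must produce two non-empty subsets
            total = sse(d1) + sse(d2)
            if best is None or total < best[2]:
                best = (a, s, total)
    return best


# Toy data (hypothetical): feature 0 separates the targets cleanly at 2.
X_toy = [[1, 10], [2, 10], [3, 20], [4, 20]]
y_toy = [1, 1, 5, 5]
split = best_split(X_toy, y_toy)
```

A full tree would apply `best_split` recursively to D_1 and D_2 until a stopping rule is met.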

As a further improvement, constructing the third basic prediction model by using gradient boosting regression comprises:

S201: the gradient boosting regression model takes a regression tree as the base learner and adopts the Boosting idea: the negative gradient of the loss function is used as the residual to be fitted, yielding a plurality of weak learners that are finally combined into a strong learner. The gradient boosting regression model is constructed as:

wherein x is the input sample, w is the model parameter, h is a regression tree, a is the weight of each tree, and m denotes the mth regression tree;

S202: constructing an initial weak learner:

S203: minimizing the loss function by setting its derivative equal to 0, which gives the predicted value of the initial weak learner: F_0 = AVG(y_i);

wherein y is the actual value of the sample, c is the model predicted value, L(y, c) is the squared-error loss function, and i denotes the ith sample;

S204: calculating the negative gradient, i.e. the residual y_i - F_0, as the fitting target of the next weak learner; constructing a regression tree by traversing all features and split points and selecting the split with the smallest residual SSE after splitting;

S205: repeating step S203 to obtain the predicted value C_i of the weak learner, and updating the strong learner using the learning rate: F_1 = F_0 + a*C_i, where a is the learning rate; then repeating S204 to obtain the residual for the next model;

S206: repeating S205 until the number of iterations is reached, finally obtaining the strong learner.
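Steps S202 to S206 can be sketched with the simplest possible base learner. This is a hedged illustration under stated assumptions: a single input feature, depth-1 regression trees (stumps), squared-error loss, and hypothetical function names and toy data. It is not the model configuration of the invention.

```python
def fit_stump(xs, residuals):
    """Depth-1 regression tree on one feature: (split, left_value, right_value)."""
    best = None
    for s in sorted(set(xs))[:-1]:  # the largest value would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= s]
        right = [r for x, r in zip(xs, residuals) if x > s]
        c_left, c_right = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - c_left) ** 2 for r in left)
               + sum((r - c_right) ** 2 for r in right))
        if best is None or err < best[0]:  # keep the split with the smallest SSE
            best = (err, s, c_left, c_right)
    return best[1:]


def fit_gradient_boosting(xs, ys, n_trees=50, learning_rate=0.1):
    f0 = sum(ys) / len(ys)                  # S203: F_0 = AVG(y_i)
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_trees):                # S206: fixed number of iterations
        residuals = [y - p for y, p in zip(ys, preds)]   # S204: negative gradient
        s, c_left, c_right = fit_stump(xs, residuals)
        stumps.append((s, c_left, c_right))
        # S205: F_m = F_{m-1} + a * C, with a the learning rate.
        preds = [p + learning_rate * (c_left if x <= s else c_right)
                 for x, p in zip(xs, preds)]
    return f0, learning_rate, stumps


def predict_gb(model, x):
    f0, learning_rate, stumps = model
    return f0 + sum(learning_rate * (cl if x <= s else cr) for s, cl, cr in stumps)


# Toy data (hypothetical): two sales levels separated at x = 2.
gb_model = fit_gradient_boosting([1, 2, 3, 4], [10.0, 10.0, 20.0, 20.0])
```

Each iteration shrinks the remaining residual by the factor (1 − a), so with a = 0.1 the predictions approach the two levels gradually rather than in one step.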

As a further improvement, constructing the integrated prediction model from the basic prediction models and determining the sales forecast value of the store comprises:

constructing the integrated prediction model from the first, second and third basic prediction models: W(x) = a_1 F_1(x) + a_2 F_2(x) + a_3 F_3(x);

wherein F(x) is a basic prediction model, a is the weight of that basic prediction model, and W(x) is the integrated prediction model;

for each commodity data set of the store, the training data set is divided into a sub-training set and a sub-test set, with the three months preceding the predicted month as the boundary. The weights a_i of the K basic prediction models are determined by evaluating their error rates on the sub-test set;

solving: a_i = A_i / ∑A_i, where A_i is the error rate of the ith basic prediction model on the sub-test set.
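The weighting step can be sketched as below. The sketch applies the normalization a_i = A_i / ∑A_i exactly as written, taking the per-model values A_i as given. It also assumes that the last three monthly records form the sub-test set, which is one reading of the boundary described above. All function names and numbers are hypothetical.

```python
def split_last_months(monthly_records, n_test=3):
    """Split chronologically ordered monthly records into (sub_train, sub_test).

    Assumption: the n_test most recent months form the sub-test set.
    """
    return monthly_records[:-n_test], monthly_records[-n_test:]


def ensemble_weights(scores):
    """a_i = A_i / sum(A_i), following the formula as written in the text."""
    total = sum(scores)
    return [a / total for a in scores]


def ensemble_predict(weights, base_predictions):
    """W(x) = a_1 F_1(x) + a_2 F_2(x) + a_3 F_3(x)."""
    return sum(w * f for w, f in zip(weights, base_predictions))


# Hypothetical values of A_i for the three base models on the sub-test set.
weights = ensemble_weights([0.1, 0.1, 0.2])
# Hypothetical base-model forecasts F_1(x), F_2(x), F_3(x) for one item.
forecast = ensemble_predict(weights, [100.0, 120.0, 110.0])
sub_train, sub_test = split_last_months(list(range(12)))
```

The weights always sum to 1, so W(x) is a weighted average of the three base forecasts.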

As a further improvement, constructing the category safety-stock level model from the category sales and inventory data of stores nationwide, in combination with the sales forecast value of the store, comprises:

aggregating the category sales and inventory data of the stores over the most recent three months, and selecting the set of stores that rank in the top 20% by category sales quantity and sell-through ratio, this store set representing the stores with excellent sales operation and inventory management capability in each category;

and calculating the overall months of inventory of the category for this store set, and using it as the category turnover stocking reference for stores nationwide.
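A minimal sketch of the benchmark calculation follows. The text does not say how the two rankings are combined or how months of inventory is computed, so the sketch makes two labeled assumptions: stores are ordered by the sum of their rank positions on both metrics, and months of inventory is total stock divided by average monthly sales over the three months. The field names and figures are hypothetical.

```python
def top_stores(stores, fraction=0.2):
    """Select the top `fraction` of stores by combined rank (assumption)."""
    n_top = max(1, int(len(stores) * fraction))

    def ranks(key):
        ordered = sorted(stores, key=lambda s: s[key], reverse=True)
        return {s["store_id"]: i for i, s in enumerate(ordered)}

    r_sales, r_ratio = ranks("sales_qty_3m"), ranks("sell_through")
    combined = sorted(
        stores, key=lambda s: r_sales[s["store_id"]] + r_ratio[s["store_id"]])
    return combined[:n_top]


def inventory_months(stores):
    """Total stock / average monthly sales over the three months (assumption)."""
    total_stock = sum(s["inventory_qty"] for s in stores)
    monthly_sales = sum(s["sales_qty_3m"] for s in stores) / 3.0
    return total_stock / monthly_sales


# Hypothetical category data for five stores over the last three months.
stores = [
    {"store_id": "A", "sales_qty_3m": 300, "sell_through": 0.9, "inventory_qty": 150},
    {"store_id": "B", "sales_qty_3m": 120, "sell_through": 0.6, "inventory_qty": 200},
    {"store_id": "C", "sales_qty_3m": 90, "sell_through": 0.5, "inventory_qty": 90},
    {"store_id": "D", "sales_qty_3m": 60, "sell_through": 0.4, "inventory_qty": 100},
    {"store_id": "E", "sales_qty_3m": 30, "sell_through": 0.2, "inventory_qty": 80},
]
benchmark_set = top_stores(stores)
benchmark_months = inventory_months(benchmark_set)
```

In this toy data the benchmark set is the single store A, which holds 1.5 months of inventory.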

As a further improvement, constructing the commodity package clustering model from the historical sales data of the commodities of a single store and dividing the commodities into commodity packages comprises:

constructing the commodity package clustering model using the KMeans algorithm to provide the service provider with a purchasing priority reference, the features of the model's input data set being selected from the sales quantity, gross profit amount, sales frequency and the like of each store item, and training the algorithm as follows:

S501: initialization: randomly selecting k data points as the center points of k clusters;

S502: for each data point, calculating the Euclidean distance to each of the k center points and assigning the data point to the cluster whose center point is nearest;

S503: after all data points have been assigned, updating the center points of the k clusters with the mean of the data points in each cluster;

S504: repeating steps S502 and S503 until the center points no longer change, then stopping training.
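Steps S501 to S504 can be sketched in plain Python as follows. The feature tuples (sales quantity, gross profit, sales days), the seed and the iteration cap are hypothetical choices for illustration.

```python
import math
import random


def kmeans(points, k, seed=0, max_iters=100):
    """Plain KMeans following S501-S504; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # S501: random initial center points
    clusters = [[] for _ in range(k)]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:                     # S502: assign to the nearest center
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        new_centers = [                      # S503: mean of each cluster
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:           # S504: stop when centers are stable
            break
        centers = new_centers
    return centers, clusters


# Hypothetical store items: (sales quantity, gross profit, sales days).
items = [
    (120.0, 3000.0, 28.0), (100.0, 2800.0, 27.0), (110.0, 2900.0, 26.0),
    (6.0, 90.0, 3.0), (5.0, 80.0, 2.0), (4.0, 70.0, 2.0),
]
centers, clusters = kmeans(items, k=2)
```

Because the initial centers are random, the result can depend on the seed; KMeans converges to a local optimum, not necessarily the best partition.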

The sales forecast value of each store item is determined by the integrated prediction model; the recommended purchase quantity is output by the category safety-stock level model in combination with the store's current inventory; and the purchasing priority of the commodity is output in combination with the commodity package clustering model.
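How the three outputs might be combined can be sketched as below. The text does not give the formula for the recommended purchase quantity, so the rule used here is an assumption: target stock equals the forecast monthly sales multiplied by the benchmark months of inventory, and the recommendation is the shortfall against current stock, never negative. The function name, field names and numbers are hypothetical.

```python
import math


def recommend_purchase(sku, forecast_sales, benchmark_months, current_stock, package):
    """Combine forecast, safety-stock benchmark and package label for one item.

    Assumed rule (not stated in the text):
        quantity = max(0, ceil(forecast_sales * benchmark_months - current_stock))
    """
    target_stock = forecast_sales * benchmark_months
    quantity = max(0, math.ceil(target_stock - current_stock))
    return {"sku": sku, "quantity": quantity, "package": package}


# Hypothetical items: one understocked, one overstocked.
rec_low = recommend_purchase("SKU-001", 40, 1.5, 25, "necessary package")
rec_high = recommend_purchase("SKU-002", 10, 1.5, 30, "suggested package")
```

The overstocked item receives a recommendation of zero rather than a negative quantity.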

The beneficial effects of the invention are as follows:

(1) Basic prediction models are constructed from the training data set, an integrated prediction model is constructed from them, and the sales forecast value of the store is determined; for each commodity data set of the store, the training data set is divided into a sub-training set and a sub-test set with the three months preceding the predicted month as the boundary, and the weights of the three basic prediction models are determined by evaluating their error rates on the sub-test set.

(2) A category safety-stock level model is constructed from the category sales and inventory data of stores nationwide in combination with the sales forecast value of the store; the category sales and inventory data of the stores over the most recent three months are aggregated, the set of stores ranking in the top 20% by category sales quantity and sell-through ratio is selected, and its overall months of inventory is calculated as the category turnover stocking reference for stores nationwide; the category safety-stock level model is updated in real time from market and purchase-sales-inventory data and outputs the recommended purchase quantity, and screening stores by category sales quantity makes the recommended purchase quantity more accurate.

(3) A commodity package clustering model is constructed from the sales forecast values of the commodities of a single store, and the commodities are divided into commodity packages; the model is built with the KMeans algorithm and trained on features selected from the sales quantity, gross profit amount, sales frequency and the like of each store item, so that commodities are clustered into packages and the service provider is given a purchasing priority reference. The purchasing priority adapts to the characteristics of each store and yields a more suitable purchasing plan, making purchasing more scientific, reducing inventory backlog while keeping goods sufficient, and facilitating inventory turnover.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some examples of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of steps provided by an embodiment of the present invention.

Fig. 2 is a flowchart of the steps provided by an embodiment of the present invention.

FIG. 3 is a global average error plot of the predicted results for all items of the store according to an embodiment of the present invention.

FIG. 4 is a pie chart of the overall average error of the predicted results for all items of the store provided by an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, based on the embodiments of the invention, which are apparent to those of ordinary skill in the art without inventive faculty, are intended to be within the scope of the invention. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention.

In the description of the present invention, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.

Referring to FIGS. 1-2, a method for recommending store purchases of wear-prone auto parts in a big data scenario comprises the following steps:

S10: preprocessing the historical purchasing data of a store to construct a training data set. The training data set contains data for a plurality of commodities; each commodity corresponds to a data set D = {(X_0, y_0), …, (X_n, y_n)}, where X_i is the feature data of the ith sample and y_i is the actual value (sales volume) of the ith sample. The feature data comprise sales volume, purchase quantity, number of customers sold to, average purchase price for the month, number of newly developed customers, repair-shop customer unit price, purchase quantity of active service providers, and purchase quantity of inactive, non-first-time service providers.

The main data preprocessing steps for the data set are:

(1) Completing the historical missing months in the data. This makes the data more complete and avoids the increase in error rate that would result from having less data to process.

(2) In combination with the autoregressive idea, using feature engineering to construct derived features of the sales volume. These additional features improve the quality of the machine learning results: the values of a variable in previous periods are used to predict its value in the current period, and the relationship is assumed to be linear.

(3) Identifying and removing abnormal data using the isolation forest algorithm, which reduces the large errors that abnormal data would otherwise cause.
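Preprocessing steps (1) and (2) can be sketched as follows. The text does not say what value is used for a missing month or how many lagged periods are used, so filling with zero and using three lags are assumptions. The function names and data are hypothetical, and the isolation forest step (3) is not shown.

```python
def fill_missing_months(sales_by_month, months, fill_value=0.0):
    """Step (1): return one value per calendar month, filling the gaps.

    Assumption: a month with no record is filled with `fill_value`.
    """
    return [sales_by_month.get(m, fill_value) for m in months]


def lag_features(series, n_lags=3):
    """Step (2): autoregressive features; previous n_lags values predict the next."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])   # sales of the preceding periods
        y.append(series[t])              # sales of the current period
    return X, y


# Hypothetical monthly sales for one item; 2022-03 is missing from the history.
months = ["2022-01", "2022-02", "2022-03", "2022-04", "2022-05", "2022-06"]
raw = {"2022-01": 10, "2022-02": 12, "2022-04": 9, "2022-05": 11, "2022-06": 13}
series = fill_missing_months(raw, months)
X_lag, y_lag = lag_features(series, n_lags=3)
```

Each row of `X_lag` pairs with the sales value that immediately follows it, which is the form of training sample (X_i, y_i) the base models consume.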

S20: constructing basic prediction models from the training data set, which comprises:

constructing a first basic prediction model by using linear regression to fit the linear relationship between the feature set and the sales volume, which comprises:

the constructed linear regression model is:

and then, by means of the loss function:

the linear regression is solved using gradient descent:

constructing a second basic prediction model by using random forest regression to explore the nonlinear relationship between the feature set and the sales volume, which comprises:

constructing a third basic prediction model by using gradient boosting regression, which comprises:

S201: the gradient boosting regression model takes a regression tree as the base learner and adopts the Boosting idea: the negative gradient of the loss function is used as the residual to be fitted, yielding a plurality of weak learners that are finally combined into a strong learner. The gradient boosting regression model is constructed as:

S202: constructing an initial weak learner:

S30: constructing an integrated prediction model from the basic prediction models and determining the sales forecast value of the store, which comprises:

for each commodity data set of the store, dividing the training data set into a sub-training set and a sub-test set, with the three months preceding the predicted month as the boundary. The weights a_i of the three basic prediction models are determined by evaluating their error rates on the sub-test set; solving: a_i = A_i / ∑A_i, where A_i is the error rate of the ith basic prediction model on the sub-test set. In the daily stocking and purchasing of a wear-prone auto parts store, the preprocessed feature data are numerous (sales volume over the preset months, purchase quantity, number of customers sold to, average purchase price for the month, number of newly developed customers, repair-shop customer unit price, purchase quantity of active service providers, and purchase quantity of inactive, non-first-time service providers), so the error rate must be kept low when processing the data in order to keep stocking and purchasing accurate and to avoid unscientific purchasing caused by large data errors. Compared with the basic prediction models, the integrated prediction model has a lower error rate and predicts more accurately, as shown in FIGS. 3-4.

S40: constructing a category safety-stock level model from the category sales and inventory data of stores nationwide, in combination with the sales forecast value of the store, which comprises:

aggregating the category sales and inventory data of the stores over the most recent three months, and selecting the set of stores that rank in the top 20% by category sales quantity and sell-through ratio, this store set representing the stores with excellent sales operation and inventory management capability in each category; for this store set, the overall months of inventory of the category is calculated and used as the category turnover stocking reference for stores nationwide. Because customers of wear-prone auto parts value parts that are cost-effective or good-looking, the sales data of the stores with excellent sales operation and inventory management capability in each category indirectly reflect which auto parts manufacturers on the market are performing well; the top-20% store set therefore reflects the direction of the auto parts market. The category safety-stock level model is updated in real time from market and purchase-sales-inventory data and outputs the recommended purchase quantity; screening stores by category sales quantity makes the recommended purchase quantity more accurate.

S50: constructing a commodity package clustering model from the historical sales data of the commodities of a single store and dividing the commodities into commodity packages, which comprises:

S501: initialization: randomly selecting K data points as the center points of K clusters; in this embodiment K = 3, i.e. 3 data points are selected as the center points of 3 clusters;

S502: for each data point, calculating the Euclidean distance to each of the 3 center points and assigning the data point to the cluster whose center point is nearest. Each object is thus assigned to its nearest cluster center, and a cluster center together with the objects assigned to it represents one cluster.

S503: after all data points have been assigned, updating the center points of the 3 clusters with the mean of the data points in each cluster;

S504: repeating steps S502 and S503 until the center points no longer change, then stopping training. Each time a sample is assigned, the cluster center is recalculated from the objects currently in the cluster; the process repeats until a termination condition is met, which may be that no object is reassigned to a different cluster, that no cluster center changes, or that the sum of squared errors reaches a local minimum.

After the commodity package clustering model has been trained, the commodities are divided into 3 clusters. The cluster with the highest sales amount is named the necessary package and the other two clusters are named suggested packages; in this embodiment the two suggested packages are merged into one, providing the service provider with a purchasing priority reference.

The clustering model results for the sample store are shown in Table 1. The purchasing priority has two levels. The necessary package is the one with the highest average sales gross profit, average sales days and average sales quantity, and must be purchased when stocking; for the suggested package, the system recommends selective purchase. Constructing the commodity package clustering model with the KMeans algorithm to provide purchasing priority adapts to the characteristics of different stores and yields a more suitable purchasing plan, making purchasing more scientific, reducing inventory backlog while keeping goods sufficient, and facilitating inventory turnover.

TABLE 1 Clustering model results for the sample store

Commodity package type	Average sales gross profit	Average sales days	Average sales quantity
Necessary package	2909.00	27.72	111.21
Suggested package	84.09	2.48	5.60

S60: determining the recommended purchase quantity and the commodity package assignment of the store's commodities from the integrated prediction model, the safety-stock level model and the clustering model, which comprises: determining the sales forecast value of each store item from the integrated prediction model, outputting the recommended purchase quantity through the category safety-stock level model in combination with the store's current inventory, and outputting the purchasing priority of the commodity in combination with the commodity package clustering model.

An automobile has a very large number of parts, some of which have a short service life; if they are not replaced in time the vehicle is damaged and the driving safety of the owner and passengers is threatened, so people attach great importance to vehicle maintenance and the replacement of auto parts. Because the number of SKUs of wear-prone auto parts is large, store operators in daily stocking and purchasing generally judge whether current stock is sufficient, and decide which commodities and quantities to purchase, by subjective feeling or by checking simple sales figures. Such purchasing is not scientific enough and to some extent harms store gross profit and inventory turnover.

In the method, the set of stores ranking in the top 20% by category sales quantity and sell-through ratio is selected, and its overall months of inventory is calculated as the category turnover stocking reference for stores nationwide; the category safety-stock level model is updated in real time from market and purchase-sales-inventory data and outputs the recommended purchase quantity, and screening stores by category sales quantity makes the recommended purchase quantity more accurate. The commodity package clustering model is built with the KMeans algorithm and trained on features selected from the sales quantity, gross profit amount, sales frequency and the like of each store item, so that commodities are clustered into packages and the service provider is given a purchasing priority reference. This adapts to the characteristics of each store and yields a more suitable purchasing plan, making purchasing more scientific, reducing inventory backlog while keeping goods sufficient, and facilitating inventory turnover.

The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations may be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for recommending store purchases of wear-prone auto parts in a big data scenario, characterized by comprising the following steps:

S20: constructing basic prediction models from the training data set;

2. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 1, wherein the training data set contains data for a plurality of commodities; each commodity corresponds to a data set D = {(X_0, y_0), …, (X_n, y_n)}, where X_n is the feature data of the nth sample and y_n is the actual value (sales volume) of the nth sample; and the feature data comprise, for the most recent preset number of months: sales volume, purchase quantity, number of customers sold to, average purchase price for the month, number of newly developed customers, repair-shop customer unit price, purchase quantity of active service providers, and purchase quantity of inactive, non-first-time service providers.

3. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 1, wherein constructing the basic prediction models from the training data set comprises:

4. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 3, wherein constructing the first basic prediction model by using linear regression to fit the linear relationship between the feature set and the sales volume comprises:

the constructed linear regression model is:

and then, by means of the loss function:

the linear regression is solved using gradient descent:

5. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 3, wherein constructing the second basic prediction model by using random forest regression to explore the nonlinear relationship between the feature set and the sales volume comprises:

6. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 3, wherein constructing the third basic prediction model by using gradient boosting regression comprises:

S202: constructing an initial weak learner:

7. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 1, wherein constructing the integrated prediction model from the basic prediction models and determining the sales forecast value of the store comprises:

8. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 1, wherein constructing the category safety-stock level model from the category sales and inventory data of stores nationwide, in combination with the sales forecast value of the store, comprises:

9. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 1, wherein constructing the commodity package clustering model from the historical sales data of the commodities of a single store and dividing the commodities into commodity packages comprises:

10. The method for recommending store purchases of wear-prone auto parts in a big data scenario according to claim 1, wherein the sales forecast value of each store item is determined from the integrated prediction model, the recommended purchase quantity is output through the category safety-stock level model in combination with the store's current inventory, and the purchasing priority of the commodity is output in combination with the commodity package clustering model.