CN110599234A

CN110599234A - Product sales prediction method

Info

Publication number: CN110599234A
Application number: CN201910745299.6A
Authority: CN
Inventors: 陈强; 谢胜利
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2019-08-13
Filing date: 2019-08-13
Publication date: 2019-12-20

Abstract

The invention relates to the field of artificial intelligence, and in particular to a product sales prediction method. Unlike traditional sales prediction methods, the method is built on a wide and deep model and uses the strong spatio-temporal sequence prediction capability of a convolutional long short-term memory network (ConvLSTM) to fully mine the correlations among products and obtain the structural information of the products. By fusing external features such as regional features and weather features, the method accounts for the influence of external factors on consumer purchasing patterns and realizes sales prediction for different regions.

Description

Product sales prediction method

Technical Field

The invention relates to the field of artificial intelligence, in particular to a product sales prediction method.

Background

In recent years, with the rise of new retail, change within enterprises has become faster, more concentrated and more abrupt. As the internet drives the informatization and digitization of society, the retail industry has developed rapidly on the back of technological change, which also brings more challenges. This rapid change lowers industry costs and raises returns, but it also confronts the new retail industry with disruptive competition across industries and platforms and with diverse, shifting consumer demand, both of which strongly affect enterprise decisions. Whether consumer demand can be predicted more accurately therefore plays an important role in the new retail industry.

Traditional sales prediction methods fall into qualitative prediction and quantitative prediction. Qualitative prediction relies mainly on the personal experience of managers, taking their judgment of the nature and degree of future developments as the main basis for prediction. It is flexible, but it transfers poorly to other settings and is strongly limited by subjectivity. Quantitative prediction reveals the objective rules of a phenomenon in quantitative terms, at both the microscopic and the macroscopic level: it exposes the relation between change and time from a given angle, discovers the underlying regularities and mines their intrinsic information. It mainly models historical sales data with time series analysis methods, including the moving average method, ARIMA, Kalman filtering and grey theory, and with machine learning algorithms such as Support Vector Regression (SVR) and tree models (XGBoost). In addition, as data volumes grow and neural network models develop, some time series data are now processed with deep learning. These models achieve better prediction accuracy thanks to the strong learning capability of machine learning. However, existing prediction methods are usually suited to a single commodity: one sample can cover only one commodity, the mutual influence among commodities is ignored, and prediction performance cannot be improved even with complex feature engineering and model fusion on the historical data.

The defects of the existing methods are as follows: 1) the existing methods are usually suited to a single commodity, one sample can cover only one commodity, and the mutual influence among commodities is ignored; 2) the existing methods focus only on the memorization capacity of the model and need extensive manual feature engineering to improve the generalization capacity of the model and to mine latent associations in the data; 3) the existing methods do not handle the influence of external factors on consumer purchasing patterns well, for example the completely opposite effects that rainstorm weather has on consumers in different regions, so the models transfer poorly.

Disclosure of Invention

The invention aims to overcome the following defects of the prior art:

1) the existing methods are usually suited to a single commodity, one sample can cover only one commodity, and the mutual influence among commodities is ignored;

2) the existing methods focus only on the memorization capacity of the model and need extensive manual feature engineering to improve the generalization capacity of the model and to mine latent associations in the data;

3) the existing methods do not handle the influence of external factors on consumer purchasing patterns well, for example the completely opposite effects that rainstorm weather has on consumers in different regions, so the models transfer poorly.

In order to solve the technical problems, the technical scheme of the invention is as follows:

a product sales forecasting method comprising the steps of:

step S1: acquiring external data and internal data which influence the product sales volume;

step S2: processing the acquired data, removing abnormal values and filling missing values, and dividing the processed data into a training set and a verification set;

step S3: constructing a wide and deep model;

step S4: inputting the training set into a wide and deep model to train the model to obtain an optimized wide and deep model;

step S5: inputting the verification set into the optimized wide and deep model to verify the accuracy of the model.

Preferably, the external data in step S1 include weather attributes, time features and regional features corresponding to the time of sale. The weather attributes include temperature, humidity, wind speed, rain and snow; the time features include holidays, one-day or two-day weekends, week and month; the regional features include the number of surrounding households, the number of residential communities, the number of competitors and surrounding housing prices.

Preferably, the internal data in step S1 is historical sales data of the product, including sales information, discount information, and price information;

Preferably, the specific process of rejecting the abnormal values in step S2 is as follows:

abnormal values are eliminated by smoothing the sales data with the Lowess method, and the specific process of Lowess smoothing is as follows:

taking a point x as the center, a segment of data of length frac is intercepted forward and backward, and a weighted linear regression is performed on this segment using a weight function w. Let $(x, \hat{y})$ be the central value of the regression line, where $\hat{y}$ is the corresponding value of the fitted curve. For all n data points, n weighted regression lines can be made, and the line connecting the central values of the regression lines is the Lowess curve of the data.
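The Lowess procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tricube weight function, the reading of `frac` as the fraction of the series taken on each side of the center point, and the function names `tricube` and `lowess` are all assumptions, since the text names only a weight function w and a length frac.

```python
def tricube(u):
    """Assumed weight function w: largest at the center, zero at the window edge."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def lowess(ys, frac=0.15):
    """Smooth equally spaced observations; return the center value of each local line.

    frac is interpreted (assumption) as the fraction of the series taken on
    each side of the center point.
    """
    n = len(ys)
    half = max(2, round(frac * n))  # points taken forward and backward
    fitted = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        xs = range(lo, hi)
        width = max(i - lo, hi - 1 - i) + 1
        w = [tricube((x - i) / width) for x in xs]
        # Weighted least squares for the local line y = a + b * x
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * ys[x] for wi, x in zip(w, xs)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (ys[x] - my) for wi, x in zip(w, xs))
        b = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + b * (i - mx))  # central value of this regression line
    return fitted
```

Calling `lowess(daily_sales)` on a list of daily sales returns the smoothed series, in which an isolated spike is pulled toward the level of its neighbours. The smoothing is a single pass, so a very large spike still raises the fitted values around it somewhat.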

Preferably, the specific process of filling the missing values in step S2 is as follows: missing values are filled with the mean of the historical n-day sales data, $(x_1 + x_2 + \dots + x_n)/n$. The data format of the external data is converted according to the model requirements, using methods including OneHotEncoder one-hot encoding and LabelEncoder integer encoding, and equal-frequency binning is applied to numerical features such as temperature and housing price.
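The filling and encoding steps above can be illustrated with a small standard-library sketch. The helper names are hypothetical. The encoders are simplified stand-ins for the behaviour of the OneHotEncoder and LabelEncoder named in the text, not calls to any library. Two details are assumptions: the mean is taken over the n most recent values preceding the gap, and 0.0 is used when no history exists.

```python
def fill_missing(sales, n=7):
    """Fill each None with the mean of the previous n (already filled) values."""
    filled = []
    for v in sales:
        if v is None:
            history = filled[-n:]
            # Assumption: fall back to 0.0 when there is no history at all.
            v = sum(history) / len(history) if history else 0.0
        filled.append(v)
    return filled


def label_encode(values):
    """Map each category to an integer code, in sorted category order."""
    codes = {c: i for i, c in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def one_hot_encode(values):
    """One 0/1 column per category, columns in sorted category order."""
    cats = sorted(set(values))
    return [[1 if v == c else 0 for c in cats] for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign bin indices so that each bin holds about the same number of points."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins
```

For example, `equal_frequency_bins([30, 5, 18, 22, 11, 27], 3)` places the two lowest temperatures in bin 0, the middle two in bin 1 and the highest two in bin 2. Tied values are split by rank in this sketch, which a production binning routine might handle differently.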

Preferably, the specific steps of constructing the wide and deep model in step S3 are as follows:

the wide and deep model comprises a wide model and a deep model, wherein the wide model provides the Memorization property and the deep model provides the Generalization property. Memorization means that the model learns the rules present in the historical data. Generalization means that, on the basis of the historical data, the model explores new feature combinations that have never appeared before, extending what it has memorized to previously unseen features and thereby improving its generalization capability. Fusing the Memorization property of the wide model with the Generalization property of the deep model lets both take effect at the same time;

the wide model is a linear regression model. Its input features comprise continuous features such as sales information and discount information and sparse discrete features such as weather features and regional features. Crossing the discrete features produces higher-dimensional discrete features, and introducing L1 regularization when training the linear model lets it converge to an effective feature combination;

the deep model is a deep network built with ConvLSTM. ConvLSTM combines the ability of the LSTM to account for the temporal information of sales with the ability of a convolutional neural network to learn the spatial information of the data, that is, the mutual influence among different commodities. ConvLSTM shares the core of the LSTM; by adding convolution operations to the LSTM, it not only captures temporal relations but also extracts spatial features as a convolutional layer does, obtains spatio-temporal features by fully fusing the features of the commodities, and replaces the transitions between states with convolution calculations.

Compared with the prior art, the technical scheme of the invention has the beneficial effects that:

compared with traditional sales prediction methods, the method uses historical sales data together with external data such as crawled weather data, and uses the strong spatio-temporal sequence prediction capability of the convolutional long short-term memory network to fully mine the correlations among products and obtain the structural information of the products. By fusing external features such as regional features and weather features, the method accounts for the influence of external factors on consumer purchasing patterns and realizes sales prediction for different regions.

Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is the Wide and Deep model framework.

FIG. 3 is a single convLSTM unit.

FIG. 4 is a schematic diagram of the fully connected transformation of the deep-end output.

Detailed Description

The drawings are for illustrative purposes only and are not to be construed as limiting the patent;

for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;

it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.

The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.

Example 1

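As shown in fig. 1, a product sales prediction method includes the steps of:

step S1: acquiring external data and internal data which influence the product sales volume;

step S2: processing the acquired data, removing abnormal values and filling missing values, and dividing the processed data into a training set and a verification set;

step S3: constructing a wide and deep model;

step S4: inputting the training set into the wide and deep model to train the model to obtain an optimized wide and deep model;

step S5: inputting the verification set into the optimized wide and deep model to verify the accuracy of the model.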

As a preferred embodiment, the external data described in step S1 include weather attributes, time features and regional features corresponding to the time of sale. The weather attributes include temperature, humidity, wind speed, rain and snow; the time features include holidays, one-day or two-day weekends, week and month; the regional features include the number of surrounding households, the number of residential communities, the number of competitors and surrounding housing prices.

As a preferred embodiment, the internal data described in step S1 is historical sales data of the product, including sales amount information, discount information, and price information;

As a preferred embodiment, the specific process of rejecting the abnormal values in step S2 is as follows:

taking a point x as the center, a segment of data of length frac is intercepted forward and backward, and a weighted linear regression is performed on this segment using a weight function w. Let $(x, \hat{y})$ be the central value of the regression line, where $\hat{y}$ is the corresponding value of the fitted curve. For all n data points, n weighted regression lines can be made, and the line connecting the central values of the regression lines is the Lowess curve of the data.

As a preferred embodiment, the specific process of filling the missing values in step S2 is as follows: missing values are filled with the mean of the historical n-day sales data, $(x_1 + x_2 + \dots + x_n)/n$. The data format of the external data is converted according to the model requirements, using methods including OneHotEncoder one-hot encoding and LabelEncoder integer encoding, and equal-frequency binning is applied to numerical features such as temperature and housing price.

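As a preferred embodiment, the wide and deep model in step S3 is constructed as set out in the Disclosure of Invention: the wide model is a linear regression model trained with L1 regularization, and the deep model is a deep network built with ConvLSTM.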

Example 2

As shown in fig. 1 to 4, the specific process of this embodiment is as follows:

data processing:

obtaining historical sales data of a product to be predicted and external data of a time period corresponding to the historical sales data, wherein the external data comprises weather attributes, time characteristics and regional characteristics.

Feature engineering:

the acquired data are processed through feature engineering into a format that can be fed to the model, and are split into a training set and a verification set for model training.

Specifically, the method comprises the following steps:

1: abnormal values are removed and missing values are filled for numerical features such as sales data and price features. These features are also normalized, which speeds up the search for the optimal solution by gradient descent and helps the model converge.

The normalization method is as follows:

2: categorical features such as weather attributes and time features are encoded with OneHotEncoder one-hot encoding, LabelEncoder integer encoding and the like, and equal-frequency binning is applied to numerical features such as temperature and housing price;

3: to preserve the time-series nature of the prediction task, the processed data are sorted by time and split at a fixed time point into a training set and a verification set, ensuring that no training data appear in the verification set. For example, if the data set spans 2018.01.01-2018.12.30, the data before 2018.10.01 can be used as the training set and the data from 2018.10.01 to 2018.12.30 as the verification set.
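Steps 1 and 3 above can be sketched as follows. The normalization formula is not given in this text, so min-max scaling to [0, 1] is used here purely as an assumption. The function names and the row layout (a tuple whose first element is the date) are also hypothetical.

```python
from datetime import date


def min_max_normalize(values):
    """Assumed normalization: scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - lo) / (hi - lo) for v in values]


def split_by_date(rows, cutoff):
    """Sort rows by date and split them at a fixed cutoff date.

    rows: list of tuples whose first element is a datetime.date.
    Rows strictly before the cutoff go to the training set, the rest to the
    verification set, so the two sets never overlap.
    """
    rows = sorted(rows, key=lambda r: r[0])
    train = [r for r in rows if r[0] < cutoff]
    valid = [r for r in rows if r[0] >= cutoff]
    return train, valid
```

With `cutoff=date(2018, 10, 1)` this reproduces the example split given above. In practice the normalization bounds would normally be computed on the training set only and then reused for the verification set, so that no information from the later period leaks into training.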

Constructing the wide-end linear regression model:

1. Let $x_1, x_2, \dots, x_n$ denote the components of the features, e.g. $x_1$ = historical product sales, $x_2$ = temperature, $x_n$ = whether the day is a weekend. The estimation function is then:

$h(x) = h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$

Here $\theta$ is the parameter vector that sets how strongly each feature influences the sales volume, for example whether weather or holidays have the greater influence on sales.

2. To simplify the calculation, let $x_0 = 1$, so that the estimation function can be written in vector form:

$h_\theta(x) = \theta^T x$

3. A mechanism is needed to evaluate how good a value of $\theta$ is, that is, whether the model's predictions are close to the target. The function used to evaluate h is generally called the loss function or error function, and it describes how well the h function performs. The loss function is defined as follows:

4. The loss function of the model is minimized by gradient descent, which yields the importance of each sales feature when the model best fits the prediction target.

5. After training, the wide-end model has an importance weight for every feature, and at prediction time the output of the estimation function is the predicted value.
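Steps 1 to 5 of the wide end can be sketched as a small gradient descent routine. The loss function is not reproduced in this text, so the mean squared error, $\frac{1}{2m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})^2$, is assumed here, with an optional L1 penalty reflecting the L1 regularization mentioned for the wide model. The names `predict` and `fit_wide` and the default learning rate and epoch count are hypothetical.

```python
def predict(theta, x):
    """Estimation function h_theta(x) = theta^T x, where x[0] == 1."""
    return sum(t * xi for t, xi in zip(theta, x))


def fit_wide(X, y, lr=0.1, epochs=2000, l1=0.0):
    """Fit theta by batch gradient descent on an assumed squared-error loss."""
    rows = [[1.0] + list(x) for x in X]  # prepend x0 = 1 for the intercept
    m, d = len(rows), len(rows[0])
    theta = [0.0] * d
    for _ in range(epochs):
        # Errors are computed once per epoch, so all thetas update simultaneously.
        errors = [predict(theta, r) - t for r, t in zip(rows, y)]
        for j in range(d):
            grad = sum(e * r[j] for e, r in zip(errors, rows)) / m
            if j > 0 and l1 > 0:
                # Subgradient of the L1 penalty; the intercept is not penalized.
                grad += l1 * ((theta[j] > 0) - (theta[j] < 0))
            theta[j] -= lr * grad
    return theta
```

After `theta = fit_wide(X, y)`, each `theta[j]` plays the role of the importance of feature j, and `predict(theta, [1.0] + features)` gives the wide-end prediction for a new sample. Features should be normalized first, otherwise a fixed learning rate may diverge.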

Constructing a deep end convLSTM depth model:

the deep end adopts a ConvLSTM deep model whose input and output elements are 3D tensors, so that all spatial information is retained. Because the network stacks multiple ConvLSTM layers, it has strong representational power, which makes it suitable for prediction in complex dynamic systems.

1. The input feature data are assembled into a 3D tensor along the time axis.

The specific method is as follows: the sales features of each product are treated as a 1D tensor. The correlations among the products are found by clustering, and the products are arranged in an order that reflects these relations. The 1D tensors of the ordered products are then concatenated by product position, so that the sales features of all products on one day form one 2D tensor. Finally, the 2D tensors are concatenated in time order to give the 3D tensor required at the input of the ConvLSTM deep model.

2. Constructing the ConvLSTM deep model network.

A single ConvLSTM unit is shown in fig. 3. Through its gating units, ConvLSTM can learn the long-term temporal dependencies of product sales; through the convolution applied to the input sales subdivision matrix Xt, it can learn the latent relations within the product sales data, that is, the associations among products. The mathematical model of ConvLSTM is as follows:

In the formulas, * denotes the convolution operation and ∘ denotes the Hadamard product; Xt denotes the input variable at time t and Ot the final output variable; it, ft and ot denote the input gate, forget gate and output gate respectively; Ct denotes the updated state of the hidden-layer memory cell at time t and Ht its final state at time t; Wi, Wf, Wo, Wc, Ui, Uf, Uo and Uc denote the connection weights; bi, bf, bo and bc denote the biases; the activation functions are sigmoid and tanh.
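The equations themselves do not appear in this text. As a hedged illustration only, a commonly used ConvLSTM formulation without peephole connections, written with the symbols defined above, reads as follows. The use of $\sigma$ for the sigmoid function, the $t-1$ indexing of the previous states and the absence of peephole terms are assumptions, and the form in the original figure may differ.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i * X_t + U_i * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_f * X_t + U_f * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_o * X_t + U_o * H_{t-1} + b_o\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_c * X_t + U_c * H_{t-1} + b_c\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

Compared with an ordinary LSTM, the only structural change in this form is that the matrix products with the weights are replaced by convolutions, which is what lets each gate see neighbouring products in the input grid.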

Note that X, C, H, i, f and o are all three-dimensional tensors whose last two dimensions carry the spatial information of rows and columns. ConvLSTM can therefore be viewed as a model that processes feature vectors on a two-dimensional grid and predicts the features of a central cell from the features of the surrounding cells.
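The construction of the 3D input tensor in step 1 above can be sketched with nested lists. The clustering itself is outside this sketch: the cluster label of each product is taken as given. The names `order_products`, `build_tensor` and `shape`, as well as the product names in the usage note, are hypothetical.

```python
def order_products(cluster_of):
    """Order product ids so that products in the same cluster are adjacent.

    cluster_of: dict mapping product id -> cluster label (assumed precomputed).
    """
    return sorted(cluster_of, key=lambda p: (cluster_of[p], p))


def build_tensor(daily_features, product_order):
    """Stack per-product 1D feature lists into tensor[day][product][feature].

    daily_features: list with one dict per day, in time order, mapping
    product id -> 1D list of sales features for that day.
    """
    return [[list(day[p]) for p in product_order] for day in daily_features]


def shape(tensor):
    """Return (days, products, features) of a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))
```

For example, with clusters `{"cola": 1, "bread": 0, "soda": 1, "milk": 0}` the product order becomes bread, milk, cola, soda, so related products sit next to each other along the product axis where a convolution kernel can cover them together.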

Fully connected transformation of the deep-end 3D tensor output:

As shown in fig. 4, the 3D tensor output by ConvLSTM is passed through a fully connected transformation to output the predicted sales of the corresponding products. The loss function of the predicted sales is then computed, and the connection weights Wi, Wf, Wo, Wc, Ui, Uf, Uo and Uc are trained by backpropagation using gradient descent.
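A fully connected transformation of this kind can be sketched as flattening the output tensor and applying one weight row per product. The layer shape, the absence of an activation function and the names `flatten` and `fully_connected` are assumptions. The weights here would be learned together with the rest of the network rather than set by hand.

```python
def flatten(tensor):
    """Flatten a nested-list 3D tensor into a single 1D list."""
    return [v for plane in tensor for row in plane for v in row]


def fully_connected(tensor, weights, biases):
    """Map the flattened tensor to one predicted sales value per product.

    weights[k] holds one weight per flattened element for product k,
    biases[k] is the matching bias term.
    """
    flat = flatten(tensor)
    return [
        sum(w * v for w, v in zip(row, flat)) + b
        for row, b in zip(weights, biases)
    ]
```

For a 1 x 2 x 2 output tensor and two products, `weights` has two rows of four values each, and the result is a list with two predicted sales values.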

Fusing the prediction results of the wide-end model and the deep-end model:

the predicted values of the two base models are fused by a weighted fusion method to obtain the final model, and the formula is as follows:

$\text{prediction} = w_1 f_1 + w_2 f_2$

where $f_1$ and $f_2$ are the predicted values of the wide end and the deep end, $w_1$ is the weight of the wide-end output and $w_2$ is the weight of the deep-end output. The weights of the two base models can be calculated from the model evaluation index map, and the formula is as follows:

where $map_1$ is the map evaluation value of the wide end and $map_2$ is the map evaluation value of the deep end.
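The weighted fusion can be sketched as follows. The formula that derives the weights from the map values is not reproduced in this text, so this sketch assumes the simplest proportional rule, $w_i = map_i / (map_1 + map_2)$, which is only sensible if a larger map value means a better model. If map is an error measure, an inverse weighting would be needed instead. The function names are hypothetical.

```python
def fusion_weights(map_wide, map_deep):
    """Assumed rule: weights proportional to each model's evaluation value."""
    total = map_wide + map_deep
    return map_wide / total, map_deep / total


def fuse(f_wide, f_deep, w1, w2):
    """prediction = w1 * f1 + w2 * f2"""
    return w1 * f_wide + w2 * f_deep
```

With evaluation values 0.6 and 0.9 the weights come out as 0.4 and 0.6, so wide-end and deep-end predictions of 100 and 120 fuse to 112.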

The same or similar reference numerals correspond to the same or similar parts;

the terms describing positional relationships in the drawings are for illustrative purposes only and are not to be construed as limiting the patent;

it should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description, and it is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims

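1. A product sales forecasting method, comprising the steps of:

step S1: acquiring external data and internal data which influence the product sales volume;

step S2: processing the acquired data, removing abnormal values and filling missing values, and dividing the processed data into a training set and a verification set;

step S3: constructing a wide and deep model;

step S4: inputting the training set into the wide and deep model to train the model to obtain an optimized wide and deep model;

step S5: inputting the verification set into the optimized wide and deep model to verify the accuracy of the model.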

2. The product sales forecasting method according to claim 1, wherein the external data in step S1 include weather attributes, time features and regional features corresponding to the sales time period; the weather attributes include temperature, humidity, wind speed, rain and snow; the time features include holidays, one-day or two-day weekends, week and month; the regional features include the number of surrounding households, the number of residential communities, the number of competitors and surrounding housing prices.

3. The method of claim 1, wherein the internal data in step S1 is historical sales data of the product, including sales information, discount information and price information.

4. The product sales prediction method of claim 3, wherein the step S2 of eliminating the abnormal value comprises the following steps:

5. The product sales prediction method of claim 3, wherein the missing values in step S2 are filled as follows: missing values are filled with the mean of the historical n-day sales data, $(x_1 + x_2 + \dots + x_n)/n$; the data format of the external data is converted according to the model requirements, using methods including OneHotEncoder one-hot encoding and LabelEncoder integer encoding; and equal-frequency binning is applied to numerical features such as temperature and housing price.

6. The product sales forecasting method according to claim 3, wherein the step S3 of constructing the wide and deep model comprises the following steps: