CN114219522A - Customer consumption behavior prediction method and device, electronic equipment and storage medium - Google Patents

Customer consumption behavior prediction method and device, electronic equipment and storage medium

Publication number
CN114219522A
Authority
CN
China
Prior art keywords
consumption
data
data set
basis function
prediction model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111448155.8A
Other languages
Chinese (zh)
Inventor
穆维松
李玥
金海滨
齐建芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN202111448155.8A
Publication of CN114219522A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Physiology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a customer consumption behavior prediction method, a customer consumption behavior prediction device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring original data consisting of client personal information and consumption behavior influence factors; inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data; the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method. The invention adjusts and optimizes the radial basis function neural network through an immune algorithm and a least square method, so that the parameters of a prediction model are more accurate, and the accuracy of customer consumption behavior prediction is improved.

Description

Customer consumption behavior prediction method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of analysis and prediction, in particular to a customer consumption behavior prediction method and device, electronic equipment and a storage medium.
Background
With the continuous development of network technology, a client generates a large amount of consumption data when performing consumption behavior, and the data contains a lot of useful information. By analyzing this information, the possible future consumption behavior of the consumer can be predicted.
At present, traditional classification methods applied to customer consumption data assume that the sample sizes of the categories are balanced, but class imbalance is common in practical applications. Class-imbalanced data are data in which the number of samples differs greatly between classes; when a classification algorithm learns directly from such data, its predictions are often not accurate enough.
At the data level, imbalance is mainly handled by resampling the data to reduce the imbalance ratio. However, the resampling methods currently in use tend to generate noisy data and blur class boundaries when synthesizing samples, which makes the consumption behavior prediction result inaccurate.
At the algorithm level, the Radial Basis Function Neural Network (RBFNN) is widely used in the field of customer consumption behavior prediction, but its precision when predicting the customer category is not high, so the consumption behavior prediction result is inaccurate.
Disclosure of Invention
The invention provides a customer consumption behavior prediction method, a customer consumption behavior prediction device, electronic equipment and a storage medium, which are used for solving the defect of inaccurate customer consumption behavior prediction in the prior art and achieving the purpose of accurately predicting customer consumption behaviors.
The invention provides a customer consumption behavior prediction method, which comprises the following steps:
acquiring original data consisting of client personal information and consumption behavior influence factors;
inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data;
the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
According to the customer consumption behavior prediction method provided by the invention, parameters of the prediction model are obtained by adjusting and optimizing the radial basis function neural network according to an immune algorithm and a least square method, and the method comprises the following steps:
determining the number of hidden layer nodes of the radial basis function neural network, and determining a kernel function of the radial basis function neural network based on the training data set;
optimizing the basis function center of the kernel function based on the immune algorithm to obtain an optimal basis function center;
acquiring the width and the connection weight of the kernel function based on a least square method;
generating the consumption prediction model based on the basis function center, the width of the kernel function, and the connection weight.
According to the customer consumption behavior prediction method provided by the invention, the optimization of the basis function center of the kernel function based on the immune algorithm to obtain the optimal basis function center comprises the following steps:
initializing an antibody population of the immune algorithm, and taking the basis function center as an antibody;
obtaining the affinity of the antibody population;
screening the antibody population based on the affinity to obtain a first target antibody population, and carrying out cloning and mutation operations on the first target antibody population to obtain a second target antibody population;
screening the second target antibody population based on the affinity to obtain a third target antibody population;
combining the second target antibody population with the third target antibody population to obtain a fourth target antibody population, and obtaining the affinity of the fourth target antibody population;
and obtaining the iteration times of the immune algorithm, and if the iteration times are more than the preset times, taking the antibody with the optimal affinity in the fourth target antibody population as the optimal basis function center.
According to the customer consumption behavior prediction method provided by the invention, the consumption prediction model is obtained by training a training data set generated by data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and the method comprises the following steps:
acquiring consumption characteristics of a plurality of customer consumption behaviors, and filtering and interpolating the consumption characteristics to generate a balance data set;
based on principal component analysis, performing feature dimension reduction operation on the balanced data set to obtain a training data set;
training the radial basis function neural network to generate the consumption prediction model based on the training dataset.
According to the customer consumption behavior prediction method provided by the invention, the steps of obtaining the consumption characteristics of a plurality of customer consumption behaviors, filtering and interpolating the consumption characteristics and generating a balance data set comprise:
acquiring consumption characteristics of a plurality of customer consumption behaviors, and filtering noise samples of the consumption characteristics based on a Tomek link algorithm to generate an initial balance data set;
and interpolating the initial balance data set according to the regional distribution condition of the initial balance data set to generate the balance data set.
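The Tomek link filtering step can be sketched as follows. A Tomek link is a pair of samples from different classes that are each other's nearest neighbor. This is a minimal sketch, assuming Euclidean distance and a policy of dropping both members of each link; the patent text does not state which members are removed, and the function names `tomek_links` and `remove_tomek_links` are hypothetical.

```python
import numpy as np

def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances; the diagonal is masked out.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)  # nearest neighbour of each sample
    links = []
    for i, j in enumerate(nn):
        # Mutual nearest neighbours with different labels form a link.
        if i < j and nn[j] == i and y[i] != y[j]:
            links.append((i, int(j)))
    return links

def remove_tomek_links(X, y):
    """Drop both members of every Tomek link (an assumed removal policy)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    drop = {k for pair in tomek_links(X, y) for k in pair}
    keep = np.array([k not in drop for k in range(len(y))])
    return X[keep], y[keep]
```

Removing only the majority-class member of each link is a common alternative when the minority class is small.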
According to the customer consumption behavior prediction method provided by the invention, the feature dimension reduction operation is carried out on the balance data set based on principal component analysis to obtain a training data set, and the method comprises the following steps:
carrying out standardization processing on the balanced data set to obtain standard data;
acquiring a matrix of each standard data;
obtaining the eigenvalue and the eigenvector of the matrix;
and acquiring the training data set based on the characteristic values and the characteristic vectors.
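The four dimension reduction steps above can be sketched with NumPy. This is an illustration only: it assumes the "matrix" in the second step is the covariance matrix of the standardized data, and the function name `pca_reduce` and the parameter `n_components` are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its leading principal components."""
    X = np.asarray(X, dtype=float)
    # Step 1: standardise each feature to zero mean and unit variance.
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant features unscaled
    Z = (X - X.mean(axis=0)) / std
    # Step 2: covariance matrix of the standardised data (assumed choice).
    C = np.cov(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors; eigh suits symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]  # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: project onto the eigenvectors with the largest eigenvalues.
    W = eigvecs[:, :n_components]
    explained = eigvals[:n_components].sum() / eigvals.sum()
    return Z @ W, explained
```

The second return value is the share of total variance retained, which is one way to choose how many components to keep.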
The present invention also provides a customer consumption behavior prediction apparatus, comprising:
the data acquisition module is used for acquiring original data consisting of client personal information and consumption behavior influence factors;
the analysis processing module is used for inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data;
the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the steps of the customer consumption behavior prediction method according to any one of the above.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the customer consumption behavior prediction method according to any one of the above.
The present invention also provides a computer program product comprising a computer program which, when executed by a processor, carries out the steps of the customer consumption behavior prediction method according to any one of the above.
According to the customer consumption behavior prediction method, the customer consumption behavior prediction device, the electronic equipment and the storage medium, the consumption characteristics of a plurality of customer consumption behaviors are subjected to data balance processing operation to generate a training data set, and a consumption prediction model is generated according to the training of the training data set; meanwhile, the radial basis function neural network is adjusted and optimized according to an immune algorithm and a least square method, so that the parameters of a prediction model are more accurate, and the accuracy of customer consumption behavior prediction is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a customer consumption behavior prediction method provided by the present invention;
FIG. 2 is a box plot of the numerical data distribution provided by the present invention;
FIG. 3 is a schematic diagram of the proportion of the classification-type data features provided by the present invention;
FIG. 4 is a schematic structural diagram of an RBFNN topological structure model provided by the invention;
FIG. 5 is a schematic structural diagram of a customer consumption behavior prediction apparatus provided in the present invention;
FIG. 6 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The customer consumption behavior prediction method, apparatus, electronic device and storage medium of the present invention are described below with reference to fig. 1 to 6.
Fig. 1 is a schematic flow diagram of the customer consumption behavior prediction method provided by the present invention. As shown in fig. 1, the method may be executed by a terminal, such as a computer or a vehicle-mounted terminal, and includes the following steps:
Step 101, obtaining original data consisting of client personal information and consumption behavior influence factors.
It can be understood that, with the development of e-commerce technology, a customer leaves a series of consumption records and corresponding personal information after consuming; the original data is formed by collecting the customer's personal information and the consumption behavior influence factors.
For example, the personal information may include the respondent's gender, age, marital status, occupation, monthly income, education level, and residential attribute.
The consumption behavior influence factors may include brand, vintage year, place of origin, packaging, price, sales promotion, recommendation by relatives and friends, advertising, word of mouth, efficacy, knowledge, and the like.
Step 102, inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data;
the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
It can be understood that, by inputting the obtained raw data into a pre-constructed consumption prediction model, consumption behavior prediction data based on the raw data is obtained.
The consumption behavior prediction data can be provided with a consumption category label, namely, the prediction data with the label is obtained by inputting the original data, and the consumption behavior category of the client can be predicted.
For example, in the wine consumption field, inputting original data consisting of customer personal information and consumption behavior influence factors yields prediction data labeled as domestic wine, imported wine, or either source, from which the customer's consumption intention can be predicted.
The consumption prediction model may be obtained by: by obtaining the statistical consumption characteristics of the consumption behaviors of the clients and carrying out data balance processing on the consumption characteristic data, a balance data set without interference data and without changing the separability among classes can be obtained, and the quality of a new synthesized sample is improved. And training the radial basis function neural network by using the balance data set to obtain a consumption prediction model.
The parameters of the consumption prediction model can be obtained by adjusting and optimizing the radial basis function neural network through an immune algorithm and a least square method.
The Immune Algorithm (IA) is an optimization algorithm that simulates the biological immune mechanism. It performs iterative computation with a population search strategy and uses its own diversity generation and maintenance mechanisms, so it generalizes well to the training data, effectively suppresses premature convergence, and has strong global search capability.
The Radial Basis Function Neural Network (RBFNN) is a feedforward neural network with a single hidden layer. The activation function of each hidden node is a radial basis function, which is radially symmetric about its center point: the farther a neuron's input is from the center point, the lower the neuron's activation. The RBFNN therefore has a good local response characteristic and can approximate any continuous function with arbitrary accuracy. The input layer of the RBFNN consists of signal source nodes, the number of hidden layer nodes is determined by the specific problem, and the output layer responds to the input pattern.
Taking wine consumption as an example, a survey of wine customer consumption behavior was carried out nationwide. The respondents were wine consumers of different genders, ages, marital statuses, occupations, monthly incomes, education levels, and residential attributes. The questionnaire finally yielded 3621 valid records, of which 754 chose domestic wine, 748 chose imported wine, and 2119 would purchase wine from either source. After the customers' personal information and consumption behavior influence factors were obtained, the data were integrated, so that each customer's consumption behavior record has 18 consumption features, comprising 11 numerical attributes and 7 categorical attributes.
Fig. 2 is a box plot of the numerical data provided by the present invention. As shown in fig. 2, the numerical data comprise 11 attributes, labeled a to k. Each attribute takes a value that represents its degree of influence on the customer's consumption behavior: not important, slightly important, average, very important, or especially important. Fig. 3 is a schematic diagram of the feature proportions of the categorical data provided by the present invention. As shown in fig. 3, the categorical data comprise 7 attributes, labeled l to r, and the proportions of the attribute values differ within each attribute; for example, attribute p has ten categories, each with a different proportion. Integrating the numerical data and the categorical data gives the feature table of customer consumption behavior influence factors and personal basic information shown in Table 1 below.
TABLE 1
[Table 1 appears only as images in the original publication.]
Based on the surveyed and recorded data, data balance processing is performed on the customer consumption behavior influence factors and personal basic information to generate a training data set. The radial basis function neural network is then trained with the balanced data set to obtain the consumption prediction model.
After the original data are input into the consumption prediction model, a customer consumption behavior prediction result is generated. That is, the original data carry no class label, whereas the generated prediction data carry an imported, domestic, or either-source label, through which the customer's consumption behavior can be predicted.
According to the customer consumption behavior prediction method provided by the invention, consumption characteristics of a plurality of customer consumption behaviors are subjected to data balance processing operation to generate a training data set, and a consumption prediction model is generated according to the training of the training data set; meanwhile, the radial basis function neural network is adjusted and optimized according to an immune algorithm and a least square method, so that the parameters of a prediction model are more accurate, and the accuracy of customer consumption behavior prediction is improved.
Further, the parameters of the prediction model are obtained by adjusting and optimizing the radial basis function neural network according to an immune algorithm and a least square method, and the parameters include:
determining the number of hidden layer nodes of the radial basis function neural network, and determining a kernel function of the radial basis function neural network based on the training data set;
optimizing the basis function center of the kernel function based on the immune algorithm to obtain an optimal basis function center;
acquiring the width and the connection weight of the kernel function based on a least square method;
generating the consumption prediction model based on the basis function center, the width of the kernel function, and the connection weight.
It can be understood that fig. 4 is a schematic structural diagram of the RBFNN topology model provided by the present invention. As shown in fig. 4, the network structure is m-p-q, where m is the number of input layer nodes, p is the number of hidden layer nodes, and q is the number of output layer nodes; $x_1, x_2, \dots, x_m$ are the inputs, $\phi$ is the activation function, and $y_1, \dots, y_q$ are the outputs. From the structure of the RBFNN, the output expression of the network is:
$$y_i = \sum_{k=1}^{p} w_{ki}\,\phi_k(x, c_k), \quad i = 1, 2, \dots, q$$
where x is the input; $w_{ki}$ is the connection weight from the kth hidden layer node to the ith output; $c_k$ is the kth basis function center; $\phi_k(x, c_k)$ is the kernel function of the kth hidden layer node; p is the number of hidden layer nodes; and q is the number of output layer nodes.
The transformation from the input space to the hidden layer space of the RBFNN is nonlinear, while the transformation from the hidden layer space to the output layer space is a linear weighting. The basis function of a hidden node is most often a distance function (e.g. the Euclidean distance), and a radial basis function (e.g. the Gaussian kernel function) is used as the activation function, which has the following form:
[Equation shown only as an image in the original publication: Gaussian kernel function]
where σ is the spreading constant of the radial basis function.
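The forward pass of such a network can be sketched as follows. Because the kernel equation is not reproduced in the text, this sketch assumes the common Gaussian form exp(-||x - c||^2 / (2σ^2)); the function names `gaussian_rbf` and `rbfnn_forward` and the weight layout `W` of shape (p, q) are hypothetical.

```python
import numpy as np

def gaussian_rbf(X, centers, sigma):
    """Hidden-layer activations, shape (n_samples, p).

    Assumes the Gaussian form exp(-||x - c||^2 / (2 * sigma^2)).
    sigma may be a scalar or an array with one width per centre.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    centers = np.asarray(centers, dtype=float)
    # Squared Euclidean distance between every input and every centre.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * np.asarray(sigma, dtype=float) ** 2))

def rbfnn_forward(X, centers, sigma, W):
    """Linear weighting of the hidden activations: y = Phi @ W."""
    return gaussian_rbf(X, centers, sigma) @ np.asarray(W, dtype=float)
```

The nonlinear part is confined to `gaussian_rbf`; the output layer is a plain matrix product, which is what allows the connection weights to be fitted by least squares.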
As can be seen from the above output expression, whether the RBFNN model predicts accurately depends on the following parameters: the radial basis function, the basis function centers, the widths, and the connection weights from the hidden layer to the output layer.
The number of RBFNN hidden layer nodes determines the topology and scale of the network. The basic principle for choosing it is to adopt as compact a structure as possible while meeting the required precision, that is, to use as few hidden layer nodes as possible. The number of hidden layer nodes can be determined with the following empirical formula:
$$p = \sqrt{m + q} + a$$
where m is the number of input layer nodes, p is the number of hidden layer nodes, q is the number of output layer nodes, and a is an adjustment constant between 1 and 10.
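As an illustration of how the adjustment constant spans a range of candidate network sizes, the sketch below assumes the widely used empirical form p = sqrt(m + q) + a. The function name `hidden_node_candidates`, the rounding to an integer, and the example sizes are assumptions of this sketch.

```python
import math

def hidden_node_candidates(m, q):
    """Candidate hidden-layer sizes p = sqrt(m + q) + a for a = 1..10."""
    base = math.sqrt(m + q)
    # Rounding to an integer node count is an assumption of this sketch.
    return [round(base + a) for a in range(1, 11)]

# Hypothetical sizes: 13 input nodes and 3 output classes.
print(hidden_node_candidates(13, 3))  # [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```

Each candidate can then be evaluated on held-out data, and the smallest size that meets the required precision is kept.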
The kernel function is a key component of the RBFNN. Its basic role is to map a problem that is linearly inseparable in a low-dimensional space into a high-dimensional feature space, in which the data become linearly separable. This embodiment adopts an adaptive multi-kernel fusion kernel function that combines the local fitting capability of the Gaussian kernel with the generalization capability of the polynomial (Poly) kernel, so that an RBFNN based on this kernel function achieves high prediction performance. The adaptive multi-kernel fusion kernel function is expressed as follows:
[Equation shown only as an image in the original publication: adaptive multi-kernel fusion kernel function]
where $\phi_k(x, c_k)$ is the kernel function of the kth hidden layer node; x is the sample data; $c_k$ is the kth radial basis center; $\sigma_k$ is the kth radial basis width; t is the polynomial constant ($t > 0$); d is the polynomial order ($d = 1, 2, 3, \dots$); and the two weight coefficients, which also appear only as images in the original publication, are the weights that the kth radial basis assigns to the Gaussian kernel and to the polynomial kernel, respectively.
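One plausible reading of the fused kernel is a weighted sum of a Gaussian term and a polynomial term. The sketch below is an assumption, not the patent's exact expression: the Gaussian form exp(-||x - c||^2 / (2σ^2)), the polynomial form (x·c + t)^d, and the names `fused_kernel`, `w_gauss`, and `w_poly` are all hypothetical.

```python
import numpy as np

def fused_kernel(x, c, sigma, w_gauss, w_poly, t=1.0, d=2):
    """Weighted sum of a Gaussian kernel and a polynomial kernel.

    Assumed forms: exp(-||x - c||^2 / (2 sigma^2)) and (x . c + t)^d.
    """
    x = np.asarray(x, dtype=float)
    c = np.asarray(c, dtype=float)
    gauss = np.exp(-np.sum((x - c) ** 2) / (2.0 * sigma ** 2))  # local fit
    poly = (np.dot(x, c) + t) ** d                              # global trend
    return w_gauss * gauss + w_poly * poly
```

The Gaussian term decays quickly away from the center, while the polynomial term stays active far from it, which is the usual motivation for mixing the two.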
The mapping relationship of the overall function can be expressed by the following formula:
[Equation shown only as an image in the original publication: overall mapping function]
where y is the output of the overall mapping, $w_{k,l}$ is the weight of the lth participating kernel in the kth radial basis, $w_k$ is the connection weight from the kth radial basis of the hidden layer to the output layer, and the remaining symbol, shown only as an image in the original publication, is the kernel function of the lth participating kernel in the kth radial basis.
After the kernel function of the RBFNN is determined, an immune algorithm is used to optimize the basis function centers so that the network reaches the required precision and the resulting centers better reflect the feature information contained in the training data set.
The width of the kernel function and the connection weights are calculated by the Least Squares (LS) method. The width of the kernel function is obtained by the following formula:
$$\sigma_i = \frac{d_{max}}{\sqrt{2p}}$$
where $\sigma_i$ is the width of the kernel function, $d_{max}$ is the maximum distance between the input data and the basis function centers, p is the number of hidden layer nodes, and $i \in [1, p]$.
The connection weights are obtained by the following formula:
$$w = (\Phi^{T}\Phi)^{-1}\Phi^{T}L$$
where w is the connection weight and $\Phi^{T}$ is the transpose of the basis function matrix $\Phi$.
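The width and weight computations can be sketched as follows. The width uses the common heuristic σ = d_max / sqrt(2p), which is assumed here, and the weights are the least squares solution of Φw = L. The names `kernel_width` and `output_weights` are hypothetical, and L is taken to be the matrix of target outputs, which the text does not define.

```python
import numpy as np

def kernel_width(d_max, p):
    """Common width heuristic sigma = d_max / sqrt(2p) (assumed form)."""
    return d_max / np.sqrt(2.0 * p)

def output_weights(Phi, L):
    """Least squares solution of Phi @ w = L.

    Equivalent to (Phi^T Phi)^{-1} Phi^T L when Phi has full column
    rank, but lstsq avoids forming the explicit inverse.
    """
    w, *_ = np.linalg.lstsq(np.asarray(Phi, dtype=float),
                            np.asarray(L, dtype=float), rcond=None)
    return w
```

Using `np.linalg.lstsq` rather than the explicit inverse is a design choice of this sketch: it stays numerically stable when hidden activations are nearly collinear.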
In this method, the adaptive multi-kernel fusion kernel function is used as the kernel function of the radial basis function neural network, and the network is adjusted and optimized by an immune algorithm and the least squares method, which improves the accuracy of customer consumption behavior prediction.
Further, the optimizing the basis function center of the kernel function based on the immune algorithm to obtain an optimal basis function center includes:
initializing an antibody population of the immune algorithm, and taking the basis function center as an antibody;
obtaining the affinity of the antibody population;
screening the antibody population based on the affinity to obtain a first target antibody population, and carrying out cloning and mutation operations on the first target antibody population to obtain a second target antibody population;
screening the second target antibody population based on the affinity to obtain a third target antibody population;
combining the second target antibody population with the third target antibody population to obtain a fourth target antibody population, and obtaining the affinity of the fourth target antibody population;
and obtaining the iteration times of the immune algorithm, and if the iteration times are more than the preset times, taking the antibody with the optimal affinity in the fourth target antibody population as the optimal basis function center.
It is understood that the immune algorithm (IA) maps the objective function of the problem being solved to the antigen and a candidate solution of the problem to an antibody. Some antibodies have an immune memory function, which accelerates the search and improves the global search capability of the algorithm; meanwhile, promotion and suppression among antibodies based on antibody concentration maintain the diversity of the antibody population.
The invention adopts an immune algorithm to optimize a basis function center, and the specific optimization steps are as follows:
Step 1, initializing the antibody population: the antibody population is $A = \{c_1, c_2, \dots, c_M\}$, where $M$ is the size of the antibody population, and each antibody $c_i = \{c_{1i}, c_{2i}, \dots, c_{pi}\}$ corresponds to the hidden layer data centers. The initial antibody population vector $c$ is randomly generated by the following equation:
$$c = X_{min} + \mathrm{rand}(D, M) \times (X_{max} - X_{min})$$
where $c$ is the antibody population vector, $D$ is the dimension of the antibody, $M$ is the size of the antibody population, $X_{max}$ is the upper limit of the antibody, and $X_{min}$ is the lower limit of the antibody.
Step 2, calculating the affinity (AFF) of the antibody population. The affinity is obtained by the following formula:
$$AFF = \alpha \cdot Fit - \beta \cdot Nd$$
In the formula, $AFF$ is the affinity of the antibody population, $\alpha$ and $\beta$ are affinity coefficients, $Fit$ is the fitness of antibody $i$, and $Nd$ is the concentration of the antibody population.
The fitness of antibody $i$ can be obtained by the following formula:
$$Fit_i = 1 - Acc_i$$
In the formula, $Acc_i$ is the output accuracy on the test data for the $i$-th antibody.
The antibody population concentration Nd can be calculated based on the individual concentration, and is obtained by the following formula:
Figure BDA0003381126360000141
In the formula, $Nd_i$ is the concentration of the $i$-th antibody, and $nd_{i,j}$ is the concentration between the $i$-th antibody $c_i$ and the $j$-th antibody $c_j$.
The concentration between the $i$-th antibody $c_i$ and the $j$-th antibody $c_j$ can be obtained by the following formula:
Figure BDA0003381126360000142
in the formula, δ is a similarity threshold between antibodies.
Step 3, immunizing the population:
Selection: selection retains superior antibodies and eliminates inferior ones; the M/2 antibodies ranked first by affinity are selected from the current antibody population for the immune operation.
Cloning: the selected antibodies are cloned to M2. The cloning operator searches near the data centers with the best affinity to obtain better antibodies.
Mutation: the cloned sub-antibody population is mutated according to the mutation probability, where the d-th dimension of the i-th sub-clone of antibody c is obtained by the mutation formula. Mutation mainly aims to increase the diversity of the antibodies in the population and so enlarge the search range of the solution.
Clone suppression: the affinity of the sub-antibody population after cloning and mutation is updated, and the cloned antibody with the best affinity is kept.
The mutation formula is as follows:
Figure BDA0003381126360000143
In the formula, $cc_{d,i}$ is the $d$-th dimension of the $i$-th sub-clone of antibody $c$, $\tau$ is the initial value of the neighborhood range, and $t$ is the current iteration number.
Step 4, new population: M/2 new antibodies are generated using the selection operator, and the affinity of the new antibody population is calculated.
Step 5, updating the population: the immunized population is merged with the new population, and the affinity of the merged antibody population is updated.
Step 6, judging whether the iteration times reach the maximum set iteration times: if so, finishing the evolution, and outputting an optimal antibody, namely a central point set of the network hidden layer; otherwise, go to step 2 for the next iteration.
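Steps 1 to 6 can be sketched in Python as follows. This is a minimal illustration, not the implementation of the invention. Because the concentration and mutation formulas are only available as figures, the sketch assumes that the concentration of an antibody is the share of antibodies lying closer than $\delta$ to it, and that a clone is mutated by a uniform perturbation within a neighborhood of size $\tau / t$. It also assumes that a larger affinity is better, that new antibodies in step 4 are drawn at random, and that each antibody is cloned `n_clones` times. The function names and all default parameter values are hypothetical.

```python
import numpy as np

def concentration(pop, delta):
    """Assumed concentration Nd_i: share of antibodies closer than delta."""
    d = np.linalg.norm(pop[:, None, :] - pop[None, :, :], axis=2)
    return (d < delta).mean(axis=1)

def affinity(pop, fitness, alpha, beta, delta):
    """AFF = alpha * Fit - beta * Nd, evaluated for every antibody."""
    fit = np.array([fitness(c) for c in pop])
    return alpha * fit - beta * concentration(pop, delta)

def immune_optimize(fitness, dim, x_min, x_max, M=20, n_clones=5,
                    max_iter=30, alpha=1.0, beta=0.1, delta=0.1,
                    tau=0.5, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: c = Xmin + rand * (Xmax - Xmin); each row is one antibody.
    pop = x_min + rng.random((M, dim)) * (x_max - x_min)
    for t in range(1, max_iter + 1):
        # Step 2: affinity of the current population.
        aff = affinity(pop, fitness, alpha, beta, delta)
        # Step 3, selection: the M/2 antibodies ranked first by affinity.
        elite = pop[np.argsort(aff)[::-1][: M // 2]]
        immune = []
        for c in elite:
            # Cloning and mutation (assumed rule, neighborhood tau / t).
            clones = c + (rng.random((n_clones, dim)) - 0.5) * tau / t
            clones = np.clip(clones, x_min, x_max)
            cand = np.vstack([c[None, :], clones])
            # Clone suppression: keep the best of the parent and its clones.
            cand_aff = affinity(cand, fitness, alpha, beta, delta)
            immune.append(cand[np.argmax(cand_aff)])
        immune = np.array(immune)
        # Step 4: new antibodies (assumed to be drawn at random).
        fresh = x_min + rng.random((M - M // 2, dim)) * (x_max - x_min)
        # Step 5: merge the immunized and the new populations.
        pop = np.vstack([immune, fresh])
    # Step 6: output the antibody with the best affinity.
    aff = affinity(pop, fitness, alpha, beta, delta)
    return pop[np.argmax(aff)]

# Toy fitness (hypothetical): closeness to the point (0.3, 0.3).
best = immune_optimize(lambda c: -np.sum((c - 0.3) ** 2),
                       dim=2, x_min=0.0, x_max=1.0)
```

In the setting of the invention, an antibody would hold the hidden layer centers and the fitness would be derived from the accuracy of the network built on those centers; the toy fitness above only keeps the example self-contained.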
The method optimizes the basis function center of the kernel function by adopting an immune algorithm to obtain the optimal basis function center, and solves the problem that the RBFNN parameter is easy to fall into local minimum in the conventional method, so that the network model has higher prediction precision and is beneficial to improving the prediction accuracy of customer consumption behaviors.
Further, the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and comprises:
acquiring consumption characteristics of a plurality of customer consumption behaviors, and filtering and interpolating the consumption characteristics to generate a balance data set;
based on principal component analysis, performing feature dimension reduction operation on the balanced data set to obtain a training data set;
training the radial basis function neural network to generate the consumption prediction model based on the training dataset.
It is understood that the data generated according to the consumption characteristics of the plurality of customer consumption behaviors are nonlinear, cluttered and noisy, which adversely affects the prediction result. Therefore, the data is filtered and interpolated to generate a balanced data set with fewer impurities and unchanged separability between classes.
After the original data has undergone class imbalance processing, the classifier is prone to overfitting. At this point, the dimensionality of the data features is reduced by Principal Component Analysis (PCA) to obtain more accurate data.
According to the invention, data generated according to the consumption characteristics of a plurality of customer consumption behaviors are filtered and interpolated, and the data dimensionality is reduced, so that the training time of the model is saved, and the prediction accuracy is improved.
Further, the acquiring consumption characteristics of a plurality of customer consumption behaviors, filtering and interpolating the consumption characteristics, and generating a balanced data set includes:
acquiring consumption characteristics of a plurality of customer consumption behaviors, and filtering noise samples of the consumption characteristics based on a Tomek link algorithm to generate an initial balance data set;
and interpolating the initial balance data set according to the regional distribution condition of the initial balance data set to generate the balance data set.
It will be appreciated that noise samples may be filtered from the consumption characteristics by the Tomek links algorithm to obtain the initial balanced data set.
The Tomek links algorithm is defined as follows. Assume a data set
Figure BDA0003381126360000161
Take two samples $E_i = (x_i, y_i)$ and $E_j = (x_j, y_j)$. If $E_i$ and $E_j$ have different class labels, $y_i \ne y_j$, and there is no sample $E_l$ such that $d(E_i, E_l) < d(E_i, E_j)$ or $d(E_j, E_l) < d(E_i, E_j)$, where $d(x, y)$ is the distance between the data, then the data pair $(E_i, E_j)$ is called a Tomek link. By removing the sample points that form Tomek links, noise samples are removed and an initial balanced data set is generated.
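The definition amounts to saying that two samples form a Tomek link when they carry different labels and are each other's nearest neighbor. A minimal NumPy sketch of this test is given below; the function names `tomek_links` and `remove_tomek_links` are hypothetical, Euclidean distance is assumed for $d$, and exact distance ties are resolved by taking the first nearest neighbor.

```python
import numpy as np

def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)      # a sample is not its own neighbor
    nn = d.argmin(axis=1)            # nearest neighbor of every sample
    links = []
    for i, j in enumerate(nn):
        # Mutual nearest neighbors with different labels form a Tomek link.
        if i < j and nn[j] == i and y[i] != y[j]:
            links.append((i, int(j)))
    return links

def remove_tomek_links(X, y):
    """Drop every sample that takes part in a Tomek link."""
    drop = {k for pair in tomek_links(X, y) for k in pair}
    keep = np.array([k for k in range(len(y)) if k not in drop], dtype=int)
    return X[keep], y[keep]

# Toy data: samples 1 and 2 are mutual nearest neighbors with different labels.
X = np.array([[0.0], [1.0], [1.2], [3.0]])
y = np.array([0, 0, 1, 1])
links = tomek_links(X, y)
X_clean, y_clean = remove_tomek_links(X, y)
```

Following the text, both samples of a link are removed here; some implementations remove only the majority class sample instead.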
After the initial balanced data set is generated, it is interpolated, and the interpolation must insert data according to the regional distribution characteristics. The invention adjusts the sample synthesis range according to the distribution of the samples in the neighborhood of the minority class centroids, and gives the interpolation a direction so that synthesized samples move toward the center of the sample distribution, thereby reducing the probability that synthesized samples blur the class boundaries.
Figure BDA0003381126360000162
In the formula, $x_1$ and $x_2$ are sub-cluster samples; $x_{cen}$ is the center of the sub-cluster;
Figure BDA0003381126360000163
$r$ is a random number in $[0, 1]$.
The method effectively improves the usability of the data by identifying the noise points of the samples and eliminating the noise points; and the centroid of each cluster is selected, the algorithm oversampling formula is modified to enable newly generated sample points to be in a triangular region close to the subclass centroid, data of fuzzy class boundaries can be effectively reduced, positive and negative classes are obviously distributed, and accuracy of customer consumption behavior prediction is improved.
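The idea of drawing synthetic samples inside a triangular region near the sub-cluster centroid can be illustrated as follows. The exact interpolation formula of the invention is only available as a figure, so this sketch uses an assumed rule: a point is first drawn on the segment between $x_1$ and $x_2$ and is then pulled toward $x_{cen}$ by a second random factor. The function name `synthesize_toward_centroid` and the toy coordinates are hypothetical.

```python
import numpy as np

def synthesize_toward_centroid(x1, x2, x_cen, rng):
    """Draw one synthetic sample inside the triangle (x1, x2, x_cen)."""
    r1, r2 = rng.random(2)             # random numbers in [0, 1)
    base = x1 + r1 * (x2 - x1)         # point on the segment x1 -> x2
    return base + r2 * (x_cen - base)  # pulled toward the sub-cluster center

rng = np.random.default_rng(0)
x1 = np.array([0.0, 0.0])
x2 = np.array([2.0, 0.0])
x_cen = np.array([1.0, 1.0])
samples = np.array([synthesize_toward_centroid(x1, x2, x_cen, rng)
                    for _ in range(100)])
```

Every generated point is a convex combination of $x_1$, $x_2$ and $x_{cen}$, so it cannot fall outside the triangle spanned by them, which is what keeps synthetic samples away from the class boundary in this sketch.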
Further, the performing feature dimension reduction operation on the balanced data set based on principal component analysis to obtain a training data set includes:
carrying out standardization processing on the balanced data set to obtain standard data;
acquiring a matrix of each standard data;
obtaining the eigenvalue and the eigenvector of the matrix;
and acquiring the training data set based on the characteristic values and the characteristic vectors.
It is understood that overfitting is likely to occur after the balanced data set is acquired. At this point, the dimensionality of the data features needs to be reduced, which saves model training time and improves the prediction accuracy.
For example, a feature dimensionality reduction operation may be performed on the balanced data set via PCA to obtain a training data set.
The basic idea of PCA is to map the original feature space so that the mapped features are mutually orthogonal, while retaining as much of the discriminative low-dimensional feature information as possible. Its defining property is that the information reflected by different principal components is mutually uncorrelated.
The principal steps of PCA calculation are as follows:
1. Standardize the data to eliminate differences in dimension and magnitude among the indexes.
$$Z_{ij} = \frac{X_{ij} - \bar{X}_j}{S_j}$$
In the formula, $Z_{ij}$ is the standardized data; $X_{ij}$ is the $j$-th index value of the $i$-th data object; $\bar{X}_j$ is the sample mean of the $j$-th index; $S_j$ is the standard deviation of the $j$-th index; $n$ is the sample size, $i \in [1, n]$; $p$ is the number of data features, $j \in [1, p]$.
2. Calculate the correlation coefficient matrix $R = [r_{ij}]_{p \times p}$.
Figure BDA0003381126360000173
In the formula, $r_{ij}$ is an element of the correlation coefficient matrix,
Figure BDA0003381126360000174
is the transposed value of the $i$-th dimension feature of the $k$-th data object after standardization, $Z_{kj}$ is the value of the $j$-th dimension feature of the $k$-th data object after standardization, $n$ is the sample size, and $k \in [1, n]$.
3. Compute the eigenvalues and eigenvectors. The $p$ eigenvalues are obtained from the characteristic equation $|\lambda E - R| = 0$ and arranged in descending order as $\lambda_1 > \lambda_2 > \dots > \lambda_p \ge 0$; the eigenvectors corresponding to the eigenvalues are $\mu_1, \mu_2, \dots, \mu_p$. Thus, $p$ principal components are obtained.
Figure BDA0003381126360000181
In the formula, $U_j$ is the $j$-th principal component, and $\mu_{kj}$ is the value in the $k$-th row and $j$-th column of the eigenvector matrix.
4. Calculate the cumulative contribution rate $a_m$ of $U_1, U_2, \dots, U_m$. When $a_m \ge 90\%$, the first $m$ principal components are selected.
Figure BDA0003381126360000182
In the formula, $a_m$ is the cumulative contribution rate, and $\lambda_k$ is the $k$-th eigenvalue.
The invention performs the dimension reduction operation on the balanced data set by adopting the principal component analysis, avoids the over-fitting phenomenon and is beneficial to improving the accuracy of the customer consumption behavior prediction.
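The four PCA steps can be sketched with NumPy as shown below. The sketch is illustrative only: the function name `pca_reduce`, the use of the sample standard deviation, and the $1/(n-1)$ factor in the correlation matrix are assumptions, since the corresponding formulas are only available as figures. The cumulative contribution rate is taken as the sum of the first $m$ eigenvalues divided by the sum of all eigenvalues.

```python
import numpy as np

def pca_reduce(X, threshold=0.90):
    """Reduce X (n samples x p features) following steps 1 to 4."""
    n = X.shape[0]
    # Step 1: standardize each index with its sample mean and standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation coefficient matrix (p x p); 1/(n-1) is an assumption.
    R = Z.T @ Z / (n - 1)
    # Step 3: eigenvalues and eigenvectors, sorted in descending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: smallest m whose cumulative contribution rate reaches the threshold.
    contrib = np.cumsum(eigvals) / eigvals.sum()
    m = int(np.argmax(contrib >= threshold)) + 1
    # Principal components U_1..U_m as projections of the standardized data.
    return Z @ eigvecs[:, :m], contrib[:m]

# Toy data: the second index is exactly twice the first, so one component suffices.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
U, contrib = pca_reduce(X)
```

`np.linalg.eigh` is used because the correlation matrix is symmetric; it returns eigenvalues in ascending order, which is why they are reordered before the cumulative contribution rate is computed.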
The following describes the customer consumption behavior prediction apparatus provided by the present invention, and the customer consumption behavior prediction apparatus described below and the customer consumption behavior prediction method described above may be referred to in correspondence with each other.
The present invention provides a customer consumption behavior prediction apparatus, and fig. 5 is a schematic structural diagram of the customer consumption behavior prediction apparatus provided by the present invention.
As shown in fig. 5, the system mainly includes a data acquisition module 501 and an analysis processing module 502; the data acquisition module 501 is used for acquiring original data consisting of client personal information and consumption behavior influence factors; the analysis processing module 502 is configured to input the raw data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the raw data; the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
According to the customer consumption behavior prediction device provided by the invention, the consumption characteristics of a plurality of customer consumption behaviors are subjected to data balance processing operation to generate a training data set, and a consumption prediction model is generated according to the training of the training data set; meanwhile, the radial basis function neural network is adjusted and optimized according to an immune algorithm and a least square method, so that the parameters of a prediction model are more accurate, and the accuracy of customer consumption behavior prediction is improved.
Fig. 6 is a schematic structural diagram of an electronic device provided in the present invention, and as shown in fig. 6, the electronic device may include: a processor (processor)601, a communication Interface (Communications Interface)602, a memory (memory)603 and a communication bus 604, wherein the processor 601, the communication Interface 602 and the memory 603 complete communication with each other through the communication bus 604. The processor 601 may call logic instructions in the memory 603 to perform a customer consumption behavior prediction method provided by the above-described method embodiments, the method comprising, for example: acquiring original data consisting of client personal information and consumption behavior influence factors; inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data; the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
In addition, the logic instructions in the memory 603 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention further provides a computer program product, the computer program product including a computer program, the computer program being stored on a non-transitory computer-readable storage medium, wherein when the computer program is executed by a processor, a computer is capable of executing the method for predicting customer consumption behavior provided by the above embodiments of the method, the method for predicting customer consumption behavior includes: acquiring original data consisting of client personal information and consumption behavior influence factors; inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data; the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, the computer program being implemented by a processor to perform the customer consumption behavior prediction method provided by the above method embodiments, the method for example comprising: acquiring original data consisting of client personal information and consumption behavior influence factors; inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data; the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for predicting consumer consumption behavior, comprising:
acquiring original data consisting of client personal information and consumption behavior influence factors;
inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data;
the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
2. The method of claim 1, wherein the parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method, and the method comprises:
determining the number of hidden layer nodes of the radial basis function neural network, and determining a kernel function of the radial basis function neural network based on the training data set;
optimizing the basis function center of the kernel function based on the immune algorithm to obtain an optimal basis function center;
acquiring the width and the connection weight of the kernel function based on a least square method;
generating the consumption prediction model based on the basis function center, the width of the kernel function, and the connection weight.
3. The method of predicting consumer consumption behavior according to claim 2, wherein the optimizing the basis function center of the kernel function based on the immune algorithm to obtain an optimal basis function center comprises:
initializing an antibody population of the immune algorithm, and taking the basis function center as an antibody;
obtaining the affinity of the antibody population;
screening the antibody population based on the affinity to obtain a first target antibody population, and carrying out cloning and mutation operations on the first target antibody population to obtain a second target antibody population;
screening the second target antibody population based on the affinity to obtain a third target antibody population;
combining the second target antibody population with the third target antibody population to obtain a fourth target antibody population, and obtaining the affinity of the fourth target antibody population;
and obtaining the iteration times of the immune algorithm, and if the iteration times are more than the preset times, taking the antibody with the optimal affinity in the fourth target antibody population as the optimal basis function center.
4. The customer consumption behavior prediction method according to claim 1, wherein the consumption prediction model is trained from a training data set generated by data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and comprises:
acquiring consumption characteristics of a plurality of customer consumption behaviors, and filtering and interpolating the consumption characteristics to generate a balance data set;
based on principal component analysis, performing feature dimension reduction operation on the balanced data set to obtain a training data set;
training the radial basis function neural network to generate the consumption prediction model based on the training dataset.
5. The method of claim 4, wherein the obtaining consumption characteristics of a plurality of consumer behaviors, filtering and interpolating the consumption characteristics to generate a balanced data set comprises:
acquiring consumption characteristics of a plurality of customer consumption behaviors, and filtering noise samples of the consumption characteristics based on a Tomek link algorithm to generate an initial balance data set;
and interpolating the initial balance data set according to the regional distribution condition of the initial balance data set to generate the balance data set.
6. The method of predicting customer consumption behavior according to claim 4, wherein the performing feature dimension reduction on the balanced data set based on principal component analysis to obtain a training data set comprises:
carrying out standardization processing on the balanced data set to obtain standard data;
acquiring a matrix of each standard data;
obtaining the eigenvalue and the eigenvector of the matrix;
and acquiring the training data set based on the characteristic values and the characteristic vectors.
7. A customer consumption behavior prediction apparatus, comprising:
the data acquisition module is used for acquiring original data consisting of client personal information and consumption behavior influence factors;
the analysis processing module is used for inputting the original data into a pre-constructed consumption prediction model to obtain consumption behavior prediction data based on the original data;
the consumption prediction model is obtained by training a training data set generated through data balance processing based on consumption characteristics of a plurality of customer consumption behaviors, and parameters of the prediction model are obtained by adjusting and optimizing a radial basis function neural network according to an immune algorithm and a least square method.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the method for predicting customer consumption behavior according to any of claims 1 to 6.
9. A non-transitory computer readable storage medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, implements the steps of the customer consumption behavior prediction method according to any one of claims 1 to 6.
10. A computer program product comprising a computer program, wherein the computer program when executed by a processor implements the steps of the customer consumption behavior prediction method according to any one of claims 1 to 6.
CN202111448155.8A 2021-11-29 2021-11-29 Customer consumption behavior prediction method and device, electronic equipment and storage medium Pending CN114219522A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111448155.8A CN114219522A (en) 2021-11-29 2021-11-29 Customer consumption behavior prediction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111448155.8A CN114219522A (en) 2021-11-29 2021-11-29 Customer consumption behavior prediction method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114219522A true CN114219522A (en) 2022-03-22

Family

ID=80699176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111448155.8A Pending CN114219522A (en) 2021-11-29 2021-11-29 Customer consumption behavior prediction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114219522A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116843377A (en) * 2023-07-25 2023-10-03 河北鑫考科技股份有限公司 Consumption behavior prediction method, device, equipment and medium based on big data


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination