CN116304299A - Personalized recommendation method integrating user interest evolution and gradient boosting algorithm - Google Patents

Personalized recommendation method integrating user interest evolution and gradient boosting algorithm

Info

Publication number
CN116304299A
Authority
CN
China
Prior art keywords
commodity
model
user
interest
gru
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310003507.1A
Other languages
Chinese (zh)
Inventor
蔡世民
刘一龙
宗雨欣
周洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202310003507.1A priority Critical patent/CN116304299A/en
Publication of CN116304299A publication Critical patent/CN116304299A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses a personalized recommendation method integrating user interest evolution and a gradient boosting algorithm, belonging to the field of recommendation system research within machine learning. The products recommended by the model show a high overlap with the products actually purchased by customers within 7 days after the end of the training data. In the experiments, preprocessing of the raw data and feature engineering improve the ranking and prediction accuracy of the model, and a feature engineering method useful in this scenario is provided. Several important features affecting the ranking results of the recommendation system are identified, providing a reference for further improving the accuracy of the algorithm and the recommendation system. The method has a wide field of application: beyond personalized clothing recommendation, it can be ported to various recommendation domains such as music recommendation and book recommendation.

Description

Personalized recommendation method integrating user interest evolution and gradient boosting algorithm
Technical Field
The application belongs to the field of recommendation system research within machine learning.
Background
Keyword term definition:
neural network: a mathematical or computational model that mimics the structure and function of a biological neural network and is used to estimate or approximate functions. A neural network is composed of a large number of interconnected artificial neurons. In most cases, an artificial neural network can change its internal structure based on external information and is therefore an adaptive system.
Gradient boosting tree: the gradient boosting decision tree (GBDT, Gradient Boosting Decision Tree) is a member of the boosting family in ensemble learning. It is an iterative decision tree algorithm consisting of multiple decision trees, where the conclusions of all trees are accumulated to form the final answer. When first proposed, it was regarded, together with the SVM, as an algorithm with strong generalization ability. In recent years, it has attracted attention as a machine learning model used for search ranking.
Evolution of user interest: in most non-search e-commerce scenarios, users do not express their current interest preferences in real time. Capturing the dynamically changing interests of users through model design is therefore key to improving recommendation effectiveness.
The Internet has changed people's way of life, making daily life and study more convenient. In recent years, with the wide spread of the Internet and the rapid development of electronic commerce, online shopping has become an indispensable part of life. Websites are filled with a huge amount of product information, which often leaves customers disoriented and unable to find the desired product. A recommendation system that classifies commodities can quickly and proactively help customers find favored commodities and potential buyers, increasing sales volume and bringing great economic benefits. At the same time, it saves customers shopping time and improves shopping efficiency.
Commodity data, transaction data and customer data hide much unmined information that affects customers' choice of commodities, and it is difficult to determine which factors are important through subjective experience alone. Machine learning algorithms have therefore been widely applied in research on commodity recommendation and ranking. Li et al. built a recommendation system based on sentiment analysis that analyzes customer reviews and recommends the products of greatest interest to customers. Sun et al. converted the commodity recommendation problem into a density-based clustering problem, and the results show that the model can solve the problem to a certain extent.
The prior art has the following defects:
on the one hand, such algorithms can only describe the influence of a few features on the ranking and recommendation results, and the accuracy and efficiency of traditional machine learning algorithms drop sharply when facing massive unlabeled data; on the other hand, such algorithms cannot simulate the user's interest evolution path: users face ever more choices when shopping online, their interests change over time, and a user may hold several interests during the same period, i.e., user interests continuously evolve and intersect. How to express the dynamic change of user interests more accurately and capture long-term user interests is therefore crucial.
In this method, a recommendation model that integrates the simulation of user interest evolution with a gradient boosting algorithm is used to make personalized recommendations of the commodities a user will purchase in the coming week, using as background the past transaction data of H&M and the multi-modal data of customers and commodities provided by the Kaggle platform.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a recommendation model integrating the simulation of user interest evolution with a gradient boosting algorithm. The method can not only process massive data, but also accurately simulate the evolution of user interests. Specifically, for processing massive data, algorithms of the Boosting family such as GBDT, XGBoost and LightGBM have successively been proposed and applied, making it possible to recommend and rank products within massive data; a lambda gradient is added on the basis of the LightGBM algorithm, making it better suited to recommendation and ranking scenarios. For simulating user interest evolution, the attention mechanism that has been successful in the NLP field is used to screen out the focal points in the target commodity and user behavior data for the downstream task.
The technical scheme of the invention is a personalized recommendation method integrating user interest evolution and a gradient boosting algorithm, comprising the following steps:
step 1: acquiring user information, commodity information and scene information from a database;
wherein the user information includes: attribute information such as the user's ID, age, zip code, whether the user is an active club member, and whether news pushes are accepted; the commodity information includes: attribute information such as commodity ID, commodity code, commodity name, commodity type number, commodity type name, commodity color, release date and production department, visual information such as the commodity picture, and text information such as the commodity description; the scene information includes: attribute information such as user ID, commodity ID, transaction date and transaction channel;
step 2: dividing a data set;
taking the last week of samples as a test set and the previous samples as a training set;
step 3: constructing a model of user interest evolution;
the model comprises a behavior sequence layer, an interest extraction layer, an interest evolution layer and an MLP network:
behavior sequence layer: used to convert the user's original ID behavior sequence over n days into an embedding behavior sequence;
interest extraction layer: used to extract interests from the embedding data; a GRU unit is used to extract the interests:
$u_t=\sigma(W^u i_t+U^u h_{t-1}+b^u)$

$r_t=\sigma(W^r i_t+U^r h_{t-1}+b^r)$

$\tilde{h}_t=\tanh(W^h i_t+r_t\circ U^h h_{t-1}+b^h)$

$h_t=(1-u_t)\circ h_{t-1}+u_t\circ\tilde{h}_t$

wherein $u_t$ represents the output value of the update gate of the GRU at time t, $r_t$ represents the output value of the reset gate of the GRU at time t, $i_t$ represents the input of the GRU at time t, and $\tilde{h}_t$ represents the newly learned memory state of the GRU at time t; W represents the parameter weight applied to the GRU unit input, U represents the parameter weight applied to the hidden state of the GRU at the previous time, and b represents the parameter bias, the superscripts u, r, h denoting the update gate, the reset gate and the newly learned state, respectively; $\sigma$ represents the sigmoid function and $\circ$ the element-wise product; $i_t$, the input of the GRU, is the embedding vector of the user's t-th behavior output by the behavior sequence layer, and $h_t$ is the t-th hidden state of the GRU; after the GRU interest network, the user behavior vector b(t) is further abstracted to form the interest state vector h(t);
interest evolution layer: used for characterizing the evolution process of the user's interests, adding an attention mechanism with attention score:

$a_t=\dfrac{\exp(h_t W e_a)}{\sum_{j=1}^{T}\exp(h_j W e_a)}$

wherein $a_t$ represents the attention score of the attention mechanism at time t, W represents the parameter weight of the attention unit, $e_a$ represents the embedding vector of the target item, T represents the total number of time steps, and $h_t$ represents the output of the interest extraction layer at time t; through AUGRU (GRU with Attentional Update gate), a GRU structure based on an attention update gate, the attention score is added to the structure of the original update gate, in the specific form:

$\tilde{u}'_t=a_t\cdot u'_t$

$h'_t=(1-\tilde{u}'_t)\circ h'_{t-1}+\tilde{u}'_t\circ\tilde{h}'_t$

wherein $u'_t$ is the original update gate of the AUGRU, $\tilde{u}'_t$ is the attentional update gate designed for the AUGRU, and $h'_t$ is the hidden state of the AUGRU; the output $h'_t$ of the interest evolution layer serves as the input to the subsequent MLP network;
step 4: training a model simulating user interest evolution;
training a model of user interest evolution by utilizing the data set obtained in the step 2;
step 5: constructing a gradient lifting tree model for processing mass data;
step 5.1: the model score of each commodity is 0 initially, and N tree models are generated;
step 5.2: for the training of each tree, traverse the commodity pairs with different labels in the training data set to obtain the lambda value $\lambda_i$ of each sample;
The calculation method is as follows:

$\lambda_{i,j}=\dfrac{-|\Delta Z_{ij}|}{1+\exp(s_i-s_j)}$

$\lambda_i=\sum_{j:(i,j)}\lambda_{i,j}-\sum_{j:(j,i)}\lambda_{i,j}$

wherein $\lambda_{i,j}$ is the lambda value of commodity i when it is ranked before commodity j, $|\Delta Z_{ij}|$ denotes the change in the MAP metric caused by exchanging the positions of commodities i and j in the list, and $s_i$ denotes the output score of the gradient boosting tree model for commodity i;
step 5.3: calculate the derivative $\omega_i$ corresponding to each $\lambda_i$, used by the subsequent Newton step to solve for the values of the leaf nodes;
train a decision tree with the $\lambda_i$ of all documents as labels, splitting nodes by minimizing the sum of squared errors, i.e., for a selected feature, choose a value val, assign all samples less than or equal to val to the left child node and samples greater than val to the right child node; then compute the sums of squared errors of the lambda values for the left and right nodes and add them as the cost of the split, select the (feature, val) pair with the minimum cost as the current split point, and finally generate a decision tree with L leaf nodes;
for the generated decision tree, calculate the value of each leaf node with a Newton step, i.e., compute the output value of each leaf node over the set of documents falling into it;
step 5.4: update the model by adding the currently learned decision tree to the existing LightGBM model, regularized with the learning rate;
step 6: training the gradient boosting tree model for processing massive data;
training the gradient boosting tree model using the data set obtained in step 2;
Step 7: fusing the models;
taking the multidimensional matrix data as input, inputting it respectively into the user interest evolution model and the gradient boosting tree model for training and learning to obtain the commodity scores of the two models, linearly weighting the scores of the two models to obtain a total score, and sorting by the total score to obtain the final commodity recommendation list;
step 8: obtaining prediction data;
acquiring test set data from the database, and obtaining the user data to be tested through preprocessing and feature engineering;
step 9: obtaining a recommendation list through combined model prediction;
inputting the data to be tested into the fused model, where the output of the network is the prediction of the list of commodities the user will purchase in the next week.
Compared with the prior art, the invention has the beneficial effects that:
1. A ranking model fusing the DIEN and LightGBMRanker algorithms is provided to recommend H&M group products and improve the shopping experience of customers. Experimental results show that the products recommended by the model have a high overlap with the products purchased by customers within 7 days after the end of the training data.
2. In the experiments, preprocessing of the raw data and feature engineering improve the ranking and prediction accuracy of the model, and a feature engineering method useful in this scenario is provided.
3. Several important features affecting the ranking results of the recommendation system are identified, providing a reference for further improving the accuracy of the algorithm and the recommendation system.
4. The method has a wide field of application: it is not limited to personalized clothing recommendation and can be ported to various recommendation domains such as music recommendation and book recommendation.
Drawings
FIG. 1 is a flow diagram of the fusion of the DIEN and LightGBMRanker modules in one embodiment;
FIG. 2 is the histogram used by the LightGBM module in one embodiment;
FIG. 3 is a graph of transaction volume over time in one embodiment;
FIG. 4 is the age distribution of customers in one embodiment;
FIG. 5 is the DIEN module in one embodiment;
FIG. 6 is the pseudocode of the LightGBMRanker module for user behavior prediction in one embodiment;
FIG. 7 is the feature importance map of the top ten features in one example.
Detailed description of the preferred embodiments
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, as shown in FIG. 1, a personalized recommendation process for a user is provided by combining the DIEN and LightGBMRanker models, the method comprising the following steps:
step 1: data exploration, data preprocessing and feature engineering;
the acquired data set includes user information, merchandise information, and scene information.
The user information includes 7 attribute features such as user ID, profile number, whether the user is active, whether the club member receives company news pushes, age and zip code.
The commodity information includes 24 attribute features such as commodity ID, product code, product name, product type number, product type name and product group name, 1 picture information feature, namely the product picture, and 1 text information feature, namely the product description.
The scene information includes 4 attribute and time features: user ID, commodity ID, transaction time and transaction channel (online or offline).
The size of the dataset is inspected, together with the meaning and distribution of each field. With no prior notion of which information may be useful, statistical analysis of the data is necessary. As shown in FIG. 3, there are sudden increases in commodity transaction volume, and this trend recurs periodically over time, so the influence of factors such as promotions and holidays should be considered. As can be seen from FIG. 4, customer ages are mainly concentrated between 20 and 30, whose shopping habits differ from those of other age groups, so age distribution differences should be reflected during feature engineering.
Preprocessing consists of filling missing data in the dataset with the global mean and replacing outliers.
Feature engineering involves normalizing and bucketing continuous features, and one-hot encoding discrete features.
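As a minimal sketch of these two operations with pandas (the column names and bucket edges are hypothetical, for illustration only):

```python
import pandas as pd

df = pd.DataFrame({"age": [21, 34, 57], "channel": ["online", "offline", "online"]})

# Normalize and bucket a continuous feature.
df["age_norm"] = (df["age"] - df["age"].mean()) / df["age"].std()
df["age_bucket"] = pd.cut(df["age"], bins=[0, 20, 30, 40, 60, 100], labels=False)

# One-hot encode a discrete feature.
df = pd.get_dummies(df, columns=["channel"])
```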
Since the data volume is relatively large and it is unclear which data are useful, feature engineering must be performed before the experiment. First, memory compression is applied, for example compressing data into smaller floating-point (float) types, reducing the demand on computer resources and improving running speed; second, features are extracted, mainly including time feature extraction, user feature extraction and high-order feature combination; then statistical analysis of the features, such as maximum, minimum, median and correlation coefficient, is carried out; finally, features are selected through related algorithms such as RFECV and SFS, and some unimportant features are deleted.
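As a minimal illustration of the memory-compression step, the following sketch (the function name and file name are hypothetical) downcasts the numeric columns of a pandas DataFrame to smaller types:

```python
import pandas as pd

def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Compress numeric columns to the smallest dtype that holds their range."""
    for col in df.select_dtypes(include=["int64"]).columns:
        df[col] = pd.to_numeric(df[col], downcast="integer")
    for col in df.select_dtypes(include=["float64"]).columns:
        df[col] = pd.to_numeric(df[col], downcast="float")  # float64 -> float32
    return df

# Hypothetical usage on the H&M transactions table:
# transactions = downcast_numeric(pd.read_csv("transactions_train.csv"))
```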
Step 2: dividing the data set;
the model aims to predict the commodities to be purchased in the coming week; to avoid the information crossing caused by placing data from the predicted future time window into the training set when dividing the data set, a time-cutting method is used to divide the data set into a training set and a validation set by time;
the behavior data of each user is input as a sample, the acquired data set is two-year data, the aim is to predict a commodity list to be purchased by the user in the future for one week, and in order to avoid the problem of 'information crossing', the data set is divided as follows:
It is divided into 4 groups of data sets, where data0 is the validation set and data1, data2 and data3 are training sets. Validation set data0 takes the data of the last 1-week window as its target, with the data of earlier time windows as valid input; training set data1 takes the data of the second-to-last week window as its target, with earlier time window data as valid input; data2 takes the data of the third-to-last week window as its target, with earlier time window data as valid input; data3 takes the data of the fourth-to-last week window as its target, with earlier time window data as valid input.
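A minimal sketch of this time-window division, assuming the transactions sit in a pandas DataFrame with a date column named t_dat (the column name is an assumption):

```python
import pandas as pd

def split_by_week(trans: pd.DataFrame, date_col: str = "t_dat"):
    """Build (target, history) pairs for the last four one-week windows.

    k = 0 yields the validation set data0; k = 1..3 yield data1..data3.
    """
    trans = trans.copy()
    trans[date_col] = pd.to_datetime(trans[date_col])
    end = trans[date_col].max()
    splits = []
    for k in range(4):
        hi = end - pd.Timedelta(weeks=k)
        lo = hi - pd.Timedelta(weeks=1)
        target = trans[(trans[date_col] > lo) & (trans[date_col] <= hi)]
        history = trans[trans[date_col] <= lo]  # only earlier windows are valid input
        splits.append((target, history))
    return splits
```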
Step 3: constructing a model for simulating user interest evolution;
constructing a DIEN module in the combined deep learning network; DIEN is improved on the basis of the DIN model: a behavior sequence layer converts the original ID behavior sequence into an embedding behavior sequence; an interest extraction layer captures real-time interests from the user's historical sequence, and a loss function is proposed to supervise the learning of the user interest at each step; an interest evolution layer captures the interest evolution process related to the target item, introducing an attention mechanism into the sequence structure to strengthen the influence of related items during interest evolution. In this model, GRU and Attention are simply added on the basis of the classical Embedding&MLP model, and the loss function is modified;
the DIEN module herein serves as a benchmark model for capturing migration curves of shopping interests of the user. The DIEN model mainly comprises three layers:
behavior sequence layer: as shown in the light blue layer of FIG. 5, like an ordinary embedding layer, it is responsible for converting the user's original ID-class behavior sequence over n days into an embedding behavior sequence.
Interest extraction layer: as shown in the pale yellow layer of FIG. 5, a sequence model composed of GRUs simulates the user's interest migration process and extracts the user interest corresponding to each commodity node. The main goal of the interest extraction layer (Interest Extractor Layer) is to extract interests from the embedding data, using GRU units:
$u_t=\sigma(W^u i_t+U^u h_{t-1}+b^u)$

$r_t=\sigma(W^r i_t+U^r h_{t-1}+b^r)$

$\tilde{h}_t=\tanh(W^h i_t+r_t\circ U^h h_{t-1}+b^h)$

$h_t=(1-u_t)\circ h_{t-1}+u_t\circ\tilde{h}_t$

where $\sigma$ denotes the sigmoid function, $\circ$ the element-wise product, $i_t$ the input of the GRU, i.e., the embedding vector of the user's t-th behavior output by the behavior sequence layer, and $h_t$ the t-th hidden state of the GRU. After passing through the GRU interest network, the user behavior vector b(t) is further abstracted to form the interest state vector h(t).
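To make the four equations concrete, the following is a minimal NumPy sketch of a single GRU step (the parameter names and dict layout are assumptions for illustration, not the trained model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(i_t, h_prev, P):
    """One GRU step following the equations for u_t, r_t, h~_t and h_t above.

    P is a dict of weights W*, U* and biases b* for the gates u, r and state h.
    """
    u_t = sigmoid(P["Wu"] @ i_t + P["Uu"] @ h_prev + P["bu"])   # update gate
    r_t = sigmoid(P["Wr"] @ i_t + P["Ur"] @ h_prev + P["br"])   # reset gate
    h_tilde = np.tanh(P["Wh"] @ i_t + r_t * (P["Uh"] @ h_prev) + P["bh"])
    return (1.0 - u_t) * h_prev + u_t * h_tilde                 # new hidden state h_t
```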
Interest evolution layer: as shown in the light red layer of FIG. 5, a sequence model composed of AUGRUs adds an attention mechanism on the basis of the interest extraction layer to simulate the interest evolution process related to the current target commodity; the output of the last state of the interest evolution layer is the user's current interest vector. The main goal of the interest evolution layer (Interest Evolution Layer) is to characterize the evolution process of the user's interests; an attention mechanism is added, with attention score:

$a_t=\dfrac{\exp(h_t W e_a)}{\sum_{j=1}^{T}\exp(h_j W e_a)}$

where $a_t$ is the attention score at time t, W the parameter weight of the attention unit, $e_a$ the embedding vector of the target item, T the total number of time steps, and $h_t$ the output of the interest extraction layer at time t.
Through AUGRU (GRU with Attentional Update gate), a GRU structure based on an attention update gate, the attention score is added to the structure of the original update gate, in the specific form:
$\tilde{u}'_t=a_t\cdot u'_t$

$h'_t=(1-\tilde{u}'_t)\circ h'_{t-1}+\tilde{u}'_t\circ\tilde{h}'_t$

where $u'_t$ is the original update gate of the AUGRU, $\tilde{u}'_t$ is the attentional update gate designed for the AUGRU, and $h'_t$ is the hidden state of the AUGRU. The output $h'_t$ of the interest evolution layer serves as the input to the subsequent MLP network.
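Building on the GRU sketch above, one AUGRU step can be illustrated as follows: the attention score $a_t$ rescales the update gate before the hidden state is mixed (an illustrative reading of the equations, not the authors' implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_scores(h_seq, W, e_a):
    """a_t = softmax over t of (h_t W e_a); h_seq: (T, d_h), W: (d_h, d_e), e_a: (d_e,)."""
    logits = h_seq @ W @ e_a
    logits = logits - logits.max()          # numerical stability
    expv = np.exp(logits)
    return expv / expv.sum()

def augru_step(i_t, h_prev, a_t, P):
    """One AUGRU step: the attentional update gate is a_t * u'_t."""
    u_t = sigmoid(P["Wu"] @ i_t + P["Uu"] @ h_prev + P["bu"])   # original update gate u'_t
    r_t = sigmoid(P["Wr"] @ i_t + P["Ur"] @ h_prev + P["br"])   # reset gate
    h_tilde = np.tanh(P["Wh"] @ i_t + r_t * (P["Uh"] @ h_prev) + P["bh"])
    u_att = a_t * u_t                                           # attentional update gate
    return (1.0 - u_att) * h_prev + u_att * h_tilde             # hidden state h'_t
```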
Step 4: train the DIEN deep learning network with the training set, verify the trained network with the validation set, and adjust the hyperparameters of the network until the preset conditions are met, obtaining a trained user behavior prediction model.
Step 5: constructing a gradient lifting tree model for processing mass data;
constructing a GBDT module in the combined deep learning network; compared with the XGBoost model, the LightGBM model of the GBDT family is lighter and faster while accuracy is guaranteed; the LightGBMRanker used in the invention adds a lambda gradient on the basis of LightGBM, making it better suited to ranking-based recommendation scenarios, and is a ListWise-type LTR (learning-to-rank) algorithm.
The continuous floating-point feature values are first discretized into integers, and a histogram of the corresponding width is constructed at the same time. When traversing the data, statistics are accumulated in the histogram using the discretized values as indices; after one pass over the data, the histogram has accumulated the required statistics, and the optimal split point is then found by traversing the discrete values of the histogram. Feature discretization has many advantages, such as convenient storage, faster computation, strong robustness and a more stable model. For this algorithm, the two most direct advantages are the following:
Lower memory footprint: the algorithm does not need to store additional pre-sorted results and can store only the discretized feature values, which can generally be held as 8-bit integers. That is, XGBoost needs 32-bit floating-point numbers to store feature values and 32-bit integers to store indices, while LightGBM only needs 8-bit bins to store the histograms, reducing memory to 1/8;
Lower computational cost: the pre-sorting algorithm of XGBoost needs to compute the split gain once for every feature value traversed, while the histogram algorithm of LightGBM only needs to compute it k times (k can be regarded as a constant, the number of bins).
GOSS is a sample-sampling algorithm that can exclude most samples with small gradients while preserving the basic distribution of the data, thereby reducing the data volume while guaranteeing accuracy. EFB is a method of reducing feature dimensionality (a dimension-reduction technique) by bundling features to improve computational efficiency; it allows two not completely mutually exclusive features to be bundled without affecting final accuracy. Efficient parallel processing, including feature parallelism, data parallelism and voting parallelism, further improves running speed and reduces resource usage.
The LightGBMRanker module serves as the benchmark gradient boosting tree model for processing massive data. LightGBM can be regarded as an improvement of the XGBoost algorithm, with faster speed, lower resource consumption and higher precision, including a decision-tree histogram algorithm, Gradient-based One-Side Sampling (GOSS), Exclusive Feature Bundling (EFB), categorical feature support (Categorical Feature), and support for efficient parallelism. The LightGBMRanker module adds a lambda gradient on the basis of the LightGBM model, making it better suited to ranking-based recommendation scenarios.
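For orientation, a minimal sketch of training such a ranker with the lightgbm package follows; the toy data and parameter values are illustrative assumptions, not the patented configuration:

```python
import lightgbm as lgb
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 16))                  # candidate features
y = rng.integers(0, 2, size=1000)           # 1 = purchased, 0 = not purchased
group = [20] * 50                           # 50 users x 20 candidate commodities each

ranker = lgb.LGBMRanker(
    objective="lambdarank",                  # lambda gradients on top of the GBDT
    n_estimators=100,
    learning_rate=0.05,
)
ranker.fit(X, y, group=group)

scores = ranker.predict(X[:20])              # scores for one user's candidates
top12 = np.argsort(-scores)[:12]             # top-12 list, matching MAP@12
```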
The continuous floating-point feature values are first discretized into integers, and a histogram of the corresponding width is constructed at the same time, as shown in FIG. 2, binning the continuous values. When traversing the data, statistics are accumulated in the histogram using the discretized values as indices; after one pass over the data, the required statistics have been accumulated, and the optimal split point is found by traversing the discrete values of the histogram. The lambda gradient is defined through ranking metrics such as MAP and NDCG; MAP is used as the experimental metric:
$\mathrm{MAP}=\dfrac{1}{U}\sum_{u=1}^{U}\dfrac{1}{\min(m,n)}\sum_{k=1}^{n}P(k)\cdot rel(k)$

where U is the number of customers, n is the number of recommended (ranked) products per customer, and m is the number of ground-truth items per customer. P(k) denotes the precision at cutoff k, and rel(k) is an indicator function equal to 1 if the item at rank k is a relevant (correct) label and 0 otherwise.
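For concreteness, a small Python sketch of this metric (a common MAP@k formulation consistent with the symbols above; the exact implementation used in the experiments is not given):

```python
def apk(actual, predicted, k=12):
    """Average precision at k for one customer: sum of P(k) * rel(k) / min(m, k)."""
    predicted = predicted[:k]
    hits, score = 0, 0.0
    for rank, item in enumerate(predicted, start=1):
        if item in actual and item not in predicted[:rank - 1]:
            hits += 1
            score += hits / rank
    return score / min(len(actual), k) if actual else 0.0

def mapk(actuals, predicteds, k=12):
    """MAP@k averaged over all customers."""
    return sum(apk(a, p, k) for a, p in zip(actuals, predicteds)) / len(actuals)
```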
Consider an ordered pair (i, j). To emphasize the importance of position in the ranking, a swap index $|\Delta Z_{ij}|$ is introduced, denoting the change in the MAP metric after exchanging the positions of item i and item j in the list; $s_i$ denotes the output score of the model for commodity i.

$\lambda_{i,j}=\dfrac{-|\Delta Z_{ij}|}{1+\exp(s_i-s_j)}$

Next, the lambda gradient $\lambda_i$ of commodity i is calculated:

$\lambda_i=\sum_{j:(i,j)}\lambda_{i,j}-\sum_{j:(j,i)}\lambda_{i,j}$

where (i, j) ranges over pairs in which i should be ranked before j, and (j, i) over pairs in which i should be ranked after j. After the lambda gradient is defined, the loss function $L_{ij}$ is derived in reverse:

$L_{ij}=\log\{1+\exp(s_i-s_j)\}\cdot|\Delta Z_{ij}|$
The optimization target during model training iterations is the loss function $L_{ij}$ with the lambda gradient added. FIG. 6 details the specific steps for constructing the LightGBMRanker module using lambda gradients; a code sketch of the gradient computation follows the steps below:
Initially there is no decision tree model, so the model score of each commodity is 0;
for the training of each tree, the algorithm traverses the commodity pairs with different labels in the training data set and computes the metric change $|\Delta Z_{ij}|$ and the value $\lambda_{i,j}$ caused by swapping the positions of the pair, thereby obtaining the lambda value $\lambda_i$ of each document;
the derivative $\omega_i$ of each $\lambda_i$ is calculated for the subsequent Newton step that solves for the leaf node values;
a decision tree is trained with the $\lambda_i$ of all documents as labels, splitting nodes by minimizing the sum of squared errors: for a selected feature, a value val is chosen, all samples less than or equal to val are assigned to the left child node and samples greater than val to the right child node; the sums of squared errors of the lambda values are then computed for the left and right nodes and added as the cost of the split; the (feature, val) pair with the minimum cost is selected as the current split point, finally generating a decision tree with L leaf nodes;
for the generated decision tree, the value of each leaf node is calculated with a Newton step, i.e., the output value of each leaf node is computed over the set of documents falling into it;
the model is updated by adding the currently learned decision tree to the existing model, regularized with the learning rate.
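As referenced above, a simplified single-list sketch of the gradient computation in these steps; delta_map is a hypothetical helper returning $|\Delta Z_{ij}|$, and the code illustrates the lambda and Newton-step quantities rather than the library's internal implementation:

```python
import numpy as np

def lambda_omega(scores, labels, delta_map):
    """Compute lambda_i and its derivative omega_i for one ranked list.

    scores: current model scores s_i; labels: 1 = relevant, 0 = not;
    delta_map(i, j): hypothetical helper returning |dZ_ij|, the MAP change
    caused by swapping items i and j in the current ranking.
    """
    n = len(scores)
    lam, omg = np.zeros(n), np.zeros(n)
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue                         # keep pairs where i should outrank j
            dz = abs(delta_map(i, j))
            rho = 1.0 / (1.0 + np.exp(scores[i] - scores[j]))
            lam_ij = -dz * rho                   # lambda_{i,j} from the text
            lam[i] += lam_ij
            lam[j] -= lam_ij
            w = dz * rho * (1.0 - rho)           # omega term for the Newton step
            omg[i] += w
            omg[j] += w
    return lam, omg

# Newton-step output value of one leaf over the documents falling into it:
#   gamma_leaf = lam[leaf_idx].sum() / (omg[leaf_idx].sum() + eps)
```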
Step 6: train the LightGBMRanker tree model with the training set, verify the trained network with the validation set, and adjust the hyperparameters of the network until the preset conditions are met, obtaining a trained user behavior prediction model.
Step 7: fusing the models;
Model integration is carried out, and the training parameters of the fused model are fine-tuned. The recommendation scores obtained by the two models are linearly weighted, and the final scores are sorted to obtain the final list of recommended items. The weighting formula is:
Weighted=(LightGBMRanker+DIEN)/2
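A small sketch of this equal-weight score fusion (variable names are illustrative):

```python
import numpy as np

def fuse_and_rank(dien_scores, lgbm_scores, article_ids, k=12):
    """Average the two models' scores per candidate and return the top-k items."""
    weighted = (np.asarray(lgbm_scores) + np.asarray(dien_scores)) / 2.0
    top = np.argsort(-weighted)[:k]          # highest total score first
    return [article_ids[i] for i in top]
```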
step 8: and acquiring the behaviors of a plurality of users in n days, and preprocessing to obtain matrix data of the users to be detected.
Step 9: input the matrix data of the users to be tested into the combined model to obtain the list of commodities each user will purchase in the next week. The data series to be tested are input into the DIEN model and the LightGBMRanker model to obtain a user's prediction scores for different commodities; the predicted values are then linearly weighted, and the final weighted scores are sorted to obtain the final one-week recommended commodity list.
The verification experiment is carried out on the past transaction data of H&M and the multi-modal data of customers and commodities provided by the Kaggle platform. The performance of DIEN, LightGBM, LightGBMRanker and the combined deep learning model fusing the simulated user interest evolution and the gradient boosting algorithm is compared on the test set; the MAP@12 results of the models on the training and test sets are shown in Table 1. On the training set, the MAP@12 score of the DIEN model is 0.02256, that of the LightGBM model is 0.02321, that of the LightGBMRanker model is 0.02384, and that of the combined deep learning model is 0.02841, improvements of 25.9%, 22.4% and 19.1% over the former three; on the test set, the MAP@12 score of the DIEN model is 0.02239, that of the LightGBM model is 0.02298, that of the LightGBMRanker model is 0.02361, and that of the combined deep learning model is 0.02822, improvements of 26.0%, 22.8% and 19.5% over the former three. The recommendation results of the combined deep learning network are closer to customers' actual purchases, and the commodities the user will purchase in the coming week can be recommended more accurately. Compared with the DIEN, LightGBM and LightGBMRanker algorithms, the prediction accuracy of the combined deep learning algorithm is significantly improved.
Finally, the top 10 important features are given, as shown in the feature importance diagram of FIG. 7, providing a reference for further improving the accuracy of algorithms and recommendation systems in the future.
The above examples merely represent a few embodiments of the present application, which are described in more detail and are not to be construed as limiting the scope of the invention. It should be noted that it would be apparent to those skilled in the art that various modifications and improvements could be made without departing from the spirit of the present application, which would be within the scope of the present application. Accordingly, the scope of protection of the present application is to be determined by the claims appended hereto.
While the invention has been described in terms of specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the equivalent or similar purpose, unless expressly stated otherwise; all of the features disclosed, or all of the steps in a method or process, except for mutually exclusive features and steps, may be combined in any manner.
Table 1 Experimental results of the combined deep learning model fusing DIEN and LightGBMRanker in the examples
Model Training set MAP@12 Test set MAP@12
DIEN 0.02256 0.02239
LightGBM 0.02321 0.02298
LightGBMRanker 0.02384 0.02361
DIEN+LightGBMRanker 0.02841 0.02822

Claims (1)

1. A personalized recommendation method integrating user interest evolution and a gradient boosting algorithm, the method comprising:
step 1: acquiring user information, commodity information and scene information from a database;
wherein the user information includes: attribute information such as the user's ID, age, zip code, whether the user is an active club member, and whether news pushes are accepted; the commodity information includes: attribute information such as commodity ID, commodity code, commodity name, commodity type number, commodity type name, commodity color, release date and production department, visual information such as the commodity picture, and text information such as the commodity description; the scene information includes: attribute information such as user ID, commodity ID, transaction date and transaction channel;
step 2: dividing a data set;
taking the last week of samples as a test set and the previous samples as a training set;
step 3: constructing a model of user interest evolution;
the model comprises a behavior sequence layer, an interest extraction layer, an interest evolution layer and an MLP network:
behavior sequence layer: used to convert the user's original ID behavior sequence over n days into an embedding behavior sequence;
interest extraction layer: used to extract interests from the embedding data; a GRU unit is used to extract the interests:
$u_t=\sigma(W^u i_t+U^u h_{t-1}+b^u)$

$r_t=\sigma(W^r i_t+U^r h_{t-1}+b^r)$

$\tilde{h}_t=\tanh(W^h i_t+r_t\circ U^h h_{t-1}+b^h)$

$h_t=(1-u_t)\circ h_{t-1}+u_t\circ\tilde{h}_t$

wherein $u_t$ represents the output value of the update gate of the GRU at time t, $r_t$ represents the output value of the reset gate of the GRU at time t, $i_t$ represents the input of the GRU at time t, and $\tilde{h}_t$ represents the newly learned memory state of the GRU at time t; W represents the parameter weight applied to the GRU unit input, U represents the parameter weight applied to the hidden state of the GRU at the previous time, and b represents the parameter bias, the superscripts u, r, h denoting the update gate, the reset gate and the newly learned state, respectively; $\sigma$ represents the sigmoid function and $\circ$ the element-wise product; $i_t$, the input of the GRU, is the embedding vector of the user's t-th behavior output by the behavior sequence layer, and $h_t$ is the t-th hidden state of the GRU; after the GRU interest network, the user behavior vector b(t) is further abstracted to form the interest state vector h(t);
interest evolution layer: used for describing the evolution process of the user's interests, adding an attention mechanism with attention score:

$a_t=\dfrac{\exp(h_t W e_a)}{\sum_{j=1}^{T}\exp(h_j W e_a)}$

wherein $a_t$ represents the attention score of the attention mechanism at time t, W represents the parameter weight of the attention unit, $e_a$ represents the embedding vector of the target item, T represents the total number of time steps, and $h_t$ represents the output of the interest extraction layer at time t; through AUGRU (GRU with Attentional Update gate), a GRU structure based on an attention update gate, the attention score is added to the structure of the original update gate, in the specific form:

$\tilde{u}'_t=a_t\cdot u'_t$

$h'_t=(1-\tilde{u}'_t)\circ h'_{t-1}+\tilde{u}'_t\circ\tilde{h}'_t$

wherein $u'_t$ is the original update gate of the AUGRU, $\tilde{u}'_t$ is the attentional update gate designed for the AUGRU, and $h'_t$ is the hidden state of the AUGRU; the output $h'_t$ of the interest evolution layer serves as the input to the subsequent MLP network;
step 4: training a model simulating user interest evolution;
training a model of user interest evolution by utilizing the data set obtained in the step 2;
step 5: constructing a gradient lifting tree model for processing mass data;
step 5.1: the model score of each commodity is 0 initially, and N tree models are generated;
step 5.2: for the training of each tree, traverse the commodity pairs with different labels in the training data set to obtain the lambda value $\lambda_i$ of each sample;
The calculation method is as follows:

$\lambda_{i,j}=\dfrac{-|\Delta Z_{ij}|}{1+\exp(s_i-s_j)}$

$\lambda_i=\sum_{j:(i,j)}\lambda_{i,j}-\sum_{j:(j,i)}\lambda_{i,j}$

wherein $\lambda_{i,j}$ is the lambda value of commodity i when it is ranked before commodity j, $|\Delta Z_{ij}|$ denotes the change in the MAP metric caused by exchanging the positions of commodities i and j in the list, and $s_i$ denotes the output score of the gradient boosting tree model for commodity i;
step 5.3: calculate the derivative $\omega_i$ corresponding to each $\lambda_i$, used by the subsequent Newton step to solve for the values of the leaf nodes;
train a decision tree with the $\lambda_i$ of all documents as labels, splitting nodes by minimizing the sum of squared errors, i.e., for a selected feature, choose a value val, assign all samples less than or equal to val to the left child node and samples greater than val to the right child node; then compute the sums of squared errors of the lambda values for the left and right nodes and add them as the cost of the split, select the (feature, val) pair with the minimum cost as the current split point, and finally generate a decision tree with L leaf nodes;
for the generated decision tree, calculate the value of each leaf node with a Newton step, i.e., compute the output value of each leaf node over the set of documents falling into it;
step 5.4: update the model by adding the currently learned decision tree to the existing LightGBM model, regularized with the learning rate;
step 6: training the gradient boosting tree model for processing massive data;
training the gradient boosting tree model using the data set obtained in step 2;
Step 7: fusing the models;
taking the multidimensional matrix data as input, inputting it respectively into the user interest evolution model and the gradient boosting tree model for training and learning to obtain the commodity scores of the two models, linearly weighting the scores of the two models to obtain a total score, and sorting by the total score to obtain the final commodity recommendation list;
step 8: obtaining prediction data;
acquiring test set data from the database, and obtaining the user data to be tested through preprocessing and feature engineering;
step 9: obtaining a recommendation list through combined model prediction;
inputting the data to be tested into the fused model, where the output of the network is the prediction of the list of commodities the user will purchase in the next week.
CN202310003507.1A 2023-01-03 2023-01-03 Personalized recommendation method integrating user interest evolution and gradient boosting algorithm Pending CN116304299A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310003507.1A CN116304299A (en) 2023-01-03 2023-01-03 Personalized recommendation method integrating user interest evolution and gradient boosting algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310003507.1A CN116304299A (en) 2023-01-03 2023-01-03 Personalized recommendation method integrating user interest evolution and gradient boosting algorithm

Publications (1)

Publication Number Publication Date
CN116304299A true CN116304299A (en) 2023-06-23

Family

ID=86829479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310003507.1A Pending CN116304299A (en) Personalized recommendation method integrating user interest evolution and gradient boosting algorithm

Country Status (1)

Country Link
CN (1) CN116304299A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116881854A (en) * 2023-09-08 2023-10-13 国际关系学院 XGBoost-fused time sequence prediction method for calculating feature weights
CN116881854B (en) * 2023-09-08 2023-12-22 国际关系学院 XGBoost-fused time sequence prediction method for calculating feature weights
CN116977035A (en) * 2023-09-25 2023-10-31 临沂大学 Agricultural product recommendation method based on LightGBM and deep learning
CN117557306A (en) * 2024-01-09 2024-02-13 北京信索咨询股份有限公司 Management system for classifying consumers based on behaviors and characteristics
CN117557306B (en) * 2024-01-09 2024-04-19 北京信索咨询股份有限公司 Management system for classifying consumers based on behaviors and characteristics

Similar Documents

Publication Publication Date Title
US20210271975A1 (en) User tag generation method and apparatus, storage medium, and computer device
CN111222332B (en) Commodity recommendation method combining attention network and user emotion
CN108647251B (en) Recommendation sorting method based on wide-depth gate cycle combination model
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
Zheng et al. An optimized collaborative filtering recommendation algorithm
CN116304299A (en) Personalized recommendation method integrating user interest evolution and gradient boosting algorithm
CN111191092B (en) Label determining method and label determining model training method
CN109918563B (en) Book recommendation method based on public data
CN104063481A (en) Film individuation recommendation method based on user real-time interest vectors
CN109034960B (en) Multi-attribute inference method based on user node embedding
CN112085525A (en) User network purchasing behavior prediction research method based on hybrid model
Jonathan et al. Sentiment analysis of customer reviews in zomato bangalore restaurants using random forest classifier
CN112069320A (en) Span-based fine-grained emotion analysis method
Choudhary et al. SARWAS: Deep ensemble learning techniques for sentiment based recommendation system
CN112613953A (en) Commodity selection method, system and computer readable storage medium
Li Accurate digital marketing communication based on intelligent data analysis
CN112800109A (en) Information mining method and system
CN114942974A (en) E-commerce platform commodity user evaluation emotional tendency classification method
Agustyaningrum et al. Online shopper intention analysis using conventional machine learning and deep neural network classification algorithm
Tahiri et al. An intelligent shopping list based on the application of partitioning and machine learning algorithms.
Ahan et al. Social network analysis using data segmentation and neural networks
Ijaz Book recommendation system using machine learning
CN111104614A (en) Method for generating recall information for tourist destination recommendation system
CN115187312A (en) Customer loss prediction method and system based on deep learning
Urkude et al. Comparative analysis on machine learning techniques: a case study on Amazon product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination