CN111062775B - Recommendation system recall method based on attention mechanism - Google Patents
- Publication number: CN111062775B (application CN201911222216.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Abstract
The invention discloses a recommendation system recall method based on an attention mechanism, which comprises the following steps: extracting user features and commodity features from training samples, converting the user features into user embedded vectors and the commodity features into commodity embedded vectors; inputting the user embedded vectors and commodity embedded vectors into an attention mechanism model for training, learning a weight for each feature through the attention network in the model, and computing the weighted sum of the embedded vectors of all features according to these weights to obtain a user characterization vector and a commodity characterization vector; calculating the inner product of the user characterization vector and the commodity characterization vector to obtain the matching degree of the user's willingness to purchase the commodity for each training sample, establishing a cross entropy loss function over this matching degree, and minimizing the cross entropy loss function until the attention mechanism model converges; and inputting a sample to be tested into the converged attention mechanism model, obtaining the matching degree of the user's willingness to purchase each commodity, and selecting the commodities whose matching degree falls within a preset interval as the recall result to be recommended. The invention enhances generalization and greatly reduces the computation required for recall recommendation.
Description
Technical Field
The invention relates to the field of computer recommendation systems, in particular to a recommendation system recall method based on an attention mechanism.
Background
With the improvement of living standards, consumers face ever more choices: catalogs have grown from a handful of goods to tens of thousands, and it often takes considerable time to find the goods we need; even then, what we find is not necessarily the best fit. A recommendation system helps us find the relevant commodities in this vast pile and recommends those most suitable for us. Recommendation systems are now very widely used and ubiquitous in daily life: when shopping online, users want the products they intend to buy to be recommended; when listening to music, they want songs suited to their own taste; when searching, they want to find the desired results. Fast and accurate prediction of user preferences is therefore the primary goal of a recommendation system.
Recall, as one of the stages of a recommendation system, must select hundreds or dozens of related commodities from a very large candidate set and pass them to a ranking model. Unlike ranking, which demands high precision, recall is equivalent to a coarse sort: it does not need high precision, but it must quickly pick out the commodities related to the user's query from a massive number of candidates.
The earliest recall methods were based on collaborative filtering, but collaborative filtering models users and commodities by their IDs alone, which causes a cold-start problem; using only IDs as features means important information such as the other attributes of users and commodities cannot be effectively exploited. Turning to feature-based modeling, the simplest choice is the LR model, which is easy to implement but ignores feature combinations.
The FM model was once widely used because it addresses the feature combination problem, learning a weight for each pairwise feature combination, and it popularized the idea of feature vectorization across many deep learning models. However, FM only performs low-order crosses between pairs of features and fails to capture higher-order feature interactions.
In recent years, with the development of deep learning, many deep-learning-based models have been applied to recommendation systems. Deep recommendation models add activation functions such as sigmoid and tanh to provide nonlinear transformations, and a multi-layer neural network performs implicit multi-order feature crossing. The more recent DeepFM model combines the low-order and high-order crosses of feature combinations, with remarkable effect. Since the high-order crosses of a neural network are implicit and not very interpretable, Google proposed the Deep & Cross Network (DCN) model, which combines explicit and implicit feature crossing. Since these feature crosses all operate at the element level, Microsoft further proposed the xDeepFM model, studying feature crossing combinations in the vector dimension. Feature combination is thus an important part of such models, but these models are complex: they are generally applied in the ranking stage and rarely in the recall stage.
The attention mechanism, first applied in natural language processing, can selectively extract the important information in long sentences and focus attention on it while ignoring unimportant information. Moreover, the distribution of attention can differ from sample to sample. This property suits most fields, so the mechanism is widely used; however, there is currently little research applying attention mechanisms to recommendation system models.
Disclosure of Invention
The main purpose of the invention is to provide a recommendation system recall method based on an attention mechanism, aiming to overcome the above problems.
In order to achieve the above object, the present invention provides a recall method of a recommendation system based on an attention mechanism, comprising the following steps:
S10, extracting user features and commodity features from the training samples, converting the user features into user embedded vectors, and converting the commodity features into commodity embedded vectors;
S20, inputting the user embedded vectors and commodity embedded vectors into an attention mechanism model for training, learning a weight for each feature through the attention network in the model, and computing the weighted sum of the embedded vectors of all features according to these weights to obtain a user characterization vector and a commodity characterization vector; calculating the inner product of the user characterization vector and the commodity characterization vector to obtain the matching degree of the user's willingness to purchase the commodity for each training sample, establishing a cross entropy loss function over this matching degree, and minimizing the cross entropy loss function until the attention mechanism model converges;
S30, inputting a sample to be tested into the converged attention mechanism model, obtaining the matching degree of the user's willingness to purchase each commodity, and selecting the commodities whose matching degree falls within a preset interval as the recall result to be recommended.
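The three steps above can be sketched end to end. This is a minimal illustration with random, untrained parameters and a single attention layer standing in for the full model; the dimensions and the recall interval are assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4 user features, 3 commodity features, embedding dim 8.
T, N, D = 4, 3, 8
user_emb = rng.normal(size=(T, D))   # S10: user feature embedded vectors
item_emb = rng.normal(size=(N, D))   # S10: commodity feature embedded vectors

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# S20, radically simplified to one untrained attention layer: one weight per
# feature, then the weighted sum gives the characterization vector.
w_user = softmax(rng.normal(size=T))
w_item = softmax(rng.normal(size=N))
z_u = w_user @ user_emb              # user characterization vector
z_v = w_item @ item_emb              # commodity characterization vector

# S30: the inner product is the matching degree of the user's willingness to
# purchase the commodity; commodities in a preset interval are recalled.
score = float(z_u @ z_v)
lo, hi = 0.0, 10.0                   # placeholder interval
recalled = lo <= score <= hi
```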
Preferably, the attention mechanism model is a bidirectional attention mechanism model comprising a multi-layer user attention network and a multi-layer commodity attention network. Each user attention layer consists of a two-layer feedforward neural network (FNN) and a softmax normalization layer, and each commodity attention layer likewise consists of a two-layer FNN and a softmax normalization layer; the user attention network and the commodity attention network are each applied layer by layer in a recursive fashion.
Preferably, the multi-layer user attention network in S20 comprises K user attention layers, in which the user characterization vector u^(k) of the k-th layer is given by:

u^(k) = U_Attention(u_1, …, u_T, m^(k-1)), with m^(k) = m^(k-1) + u^(k),

where the superscript (k) or (k-1) denotes the k-th or (k-1)-th layer, U_Attention is the user attention network (every layer has the same structure, whose concrete operation is given by the formulas below), the inputs of the network are the embedded vectors u_1, …, u_T of the user features and the output m^(k-1) of the previous layer, the output of the network is the user characterization vector u^(k) of the layer, and m^(k) is a storage vector holding the accumulated sum of the characterization vectors produced by the first k layers. After receiving its input, the attention network first passes through the two-layer feedforward neural network FNN and the softmax layer to obtain normalized attention weights a_t^(k), and then uses this weight vector to take the weighted average of the T user feature vectors, yielding the layer's characterization vector u^(k);

At the k-th layer, for t = 1, 2, 3, …, T, the weight of the user's t-th embedded vector at that layer is first obtained as

h_t^(k) = tanh((W_1^(k) u_t) ⊙ (W_2^(k) m^(k-1))), a_t^(k) = exp(w_3^(k) h_t^(k)) / Σ_{t'=1}^{T} exp(w_3^(k) h_{t'}^(k)),

where W_1^(k), W_2^(k) and w_3^(k) are all network parameter matrices: in the k-th user attention layer, W_1^(k) is the parameter matrix applied to the embedded vector u_t of the user's t-th feature, W_2^(k) is the parameter matrix applied to the storage vector m^(k-1) output by the previous layer, and w_3^(k) is the parameter matrix applied to the hidden-layer vector h_t^(k) obtained from the user's t-th feature; tanh is the activation function; ⊙ is a custom vector multiplication, i.e., two vectors of the same length have their elements at the same positions multiplied to give a new vector; multiplying h_t^(k) by the single-row matrix w_3^(k) yields one scalar per feature, which is then converted by softmax into the final layer-k characterization-vector weight a_t^(k) of the user; e is the natural constant;

Then, according to the weights a_t^(k), the weighted sum of the user's embedded vectors is computed to obtain the user's layer-k characterization vector u^(k):

u^(k) = Σ_{t=1}^{T} a_t^(k) u_t.
Preferably, the multi-layer commodity attention network in S20 comprises K commodity attention layers, in which the commodity characterization vector v^(k) of the k-th layer is given by:

v^(k) = V_Attention(v_1, …, v_N, m^(k-1)), with m^(k) = m^(k-1) + v^(k),

where V_Attention denotes the commodity attention network, whose structure is the same as the user attention network: the weight a_n^(k) of each commodity embedded vector is obtained first, and then the weighted sum of all commodity embedded vectors according to these weights gives the layer-k commodity characterization vector v^(k);

At the k-th layer, for n = 1, 2, 3, …, N, the weight of the commodity's n-th embedded vector at that layer is first obtained as

h_n^(k) = tanh((W_1^(k) v_n) ⊙ (W_2^(k) m^(k-1))), a_n^(k) = exp(w_3^(k) h_n^(k)) / Σ_{n'=1}^{N} exp(w_3^(k) h_{n'}^(k)),

where W_1^(k), W_2^(k) and w_3^(k) are the parameter matrices of the commodity attention network: in the k-th commodity attention layer, W_1^(k) is the parameter matrix applied to the embedded vector v_n of the commodity's n-th feature, W_2^(k) is the parameter matrix applied to the storage vector m^(k-1) output by the previous layer, and w_3^(k) is the parameter matrix applied to the hidden-layer vector h_n^(k) obtained from the commodity's n-th feature; multiplying h_n^(k) by the single-row matrix yields a scalar, which is then converted by softmax into the final layer-k characterization-vector weight a_n^(k) of the commodity;

Then, according to the weights a_n^(k), the weighted sum of the commodity embedded vectors is computed to obtain the commodity's layer-k characterization vector v^(k):

v^(k) = Σ_{n=1}^{N} a_n^(k) v_n.
Preferably, in step S20, calculating the inner product of the user characterization vector and the commodity characterization vector to obtain the matching degree of the user's willingness to purchase the commodity specifically comprises:

the multi-layer user attention network concatenates the characterization vectors u^(k) of all user attention layers to obtain the final user characterization vector z_u = [u^(0); …; u^(K)];

the multi-layer commodity attention network concatenates the characterization vectors v^(k) of all commodity attention layers to obtain the final commodity characterization vector z_v = [v^(0); …; v^(K)];

the inner product of the final user characterization vector z_u and the final commodity characterization vector z_v gives the final matching degree of the user's willingness to purchase the commodity.
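The concatenation-and-inner-product step can be sketched as follows; the layer count and dimension are illustrative assumptions, and the per-layer vectors are random placeholders for the attention outputs.

```python
import numpy as np

rng = np.random.default_rng(1)
K, D = 2, 8   # number of attention layers and embedding dim (illustrative)

# Placeholder per-layer characterization vectors u^(0)..u^(K) and v^(0)..v^(K).
u_layers = [rng.normal(size=D) for _ in range(K + 1)]
v_layers = [rng.normal(size=D) for _ in range(K + 1)]

# Concatenate across layers, then score by inner product.
z_u = np.concatenate(u_layers)   # z_u = [u^(0); ...; u^(K)]
z_v = np.concatenate(v_layers)   # z_v = [v^(0); ...; v^(K)]
match = float(z_u @ z_v)         # matching degree
```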
Preferably, the cross entropy loss function of the matching degree of the user's willingness to purchase commodities is specifically:

L = -(1/m) Σ_{i=1}^{m} [ y_i log ŷ_i + (1 - y_i) log(1 - ŷ_i) ]

where m is the number of samples, y_i is the sample label and ŷ_i the predicted matching degree; a click is treated as a positive sample, labeled 1, and no click as a negative sample, labeled 0. For each user, the user together with each clicked commodity forms a pair <u, v+> regarded as a positive sample pair, and the user with an unclicked commodity forms a pair <u, v-> regarded as a negative sample pair. The model is trained by minimizing the loss function L, i.e., by continually shrinking the distance between positive pairs and enlarging the distance between negative pairs.
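The cross entropy loss above can be computed directly; the labels and predicted matching degrees below are made-up values for illustration.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross entropy over m samples."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # guard against log(0)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

# Clicked commodities are positive samples (label 1), unclicked negative
# (label 0); predictions would come from the inner-product matching degree
# squashed into (0, 1), e.g. by a sigmoid.
y = np.array([1.0, 0.0, 1.0, 0.0])
p = np.array([0.9, 0.2, 0.7, 0.4])
loss = cross_entropy(y, p)
```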
Preferably, S10 specifically comprises:

separating user data and commodity data in the training samples, and processing the user data into a sparse user vector x^u = (x_1^u, …, x_T^u), where T is the total number of user features, t indexes the current user feature, and u denotes the user; processing the commodity data into a sparse commodity vector x^v = (x_1^v, …, x_N^v), where N is the total number of commodity features, n indexes the current commodity feature, and v denotes the commodity;

dividing the training sample data into categorical features and continuous features according to the attributes of the data. For a categorical feature, a one-hot encoded vector x_i is used: the vector length of x_i is the total count of all features of the current training sample, the position of the categorical value is set to 1 and all others to 0, and a feature dictionary records the position index of each categorical value within the vector. For a continuous feature, the total count of all features of the current training sample is likewise taken as the vector length, the value of the continuous feature is placed at its position as the feature value and all other entries are 0, so that it is encoded as a sparse vector.
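The encoding just described can be sketched with a toy feature layout. The feature names, dictionary, and scaling below are hypothetical, chosen only to show one-hot slots for categorical features and a value slot for the continuous feature.

```python
import numpy as np

# Hypothetical layout: categorical "gender" (2 values), categorical "city"
# (3 values), continuous "age". Vector length is the sum of all feature sizes.
feature_dict = {("gender", "F"): 0, ("gender", "M"): 1,
                ("city", "NY"): 2, ("city", "SF"): 3, ("city", "LA"): 4}
AGE_SLOT, VEC_LEN = 5, 6

def encode(gender, city, age):
    x = np.zeros(VEC_LEN)
    x[feature_dict[("gender", gender)]] = 1.0   # categorical -> one-hot 1
    x[feature_dict[("city", city)]] = 1.0
    x[AGE_SLOT] = age / 100.0                   # continuous -> its (scaled) value
    return x

x_u = encode("F", "SF", 28)   # sparse user vector
```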
Preferably, the attention mechanism model is a representation learning model.
Preferably, the vector lengths of the user attention network and the commodity attention network are equal.
Preferably, the training sample data is collected from a click-through rate (CTR) estimation model.
The invention provides an attention-based recommendation system recall method. The basic process is: convert the features of the user and the commodity into embedded vectors; find the important feature combinations through an attention network; take the weighted sum of the embedded vectors of all features to obtain the respective characterization vectors of the user and the commodity in a shared space; and finally compute the matching degree from the distance between user and commodity in that vector space. The highlights of the invention are:
First, the proposed method is a deep learning model operating at the feature level: it first turns each feature into a low-dimensional dense embedded vector, and the model output is equivalent to a weighted sum of all feature vectors. On one hand, the model automatically learns feature combinations and crosses without manual feature engineering. On the other hand, most existing recall models are tree models or simple discriminative models, because deep models are considered too complex: mainstream models such as DCN and DeepFM must compute cross combinations over all elements of all feature vectors, which is expensive. At the feature level, however, the number of combinations is small and the computation light, so applying this deep learning model to recall is highly feasible.
Second, the invention innovatively applies an attention mechanism to feature combination, finding the important feature combinations for each sample through attention and ignoring the many unimportant ones. The model takes the embedded vectors of the features as input, learns a group of attention weights through the neural network, one weight per feature, and finally computes the weighted sum of all features according to these weights to obtain the final vector; giving each feature a different weight realizes combinations of the feature vectors to different degrees. This attention mechanism model combines deep learning, the attention mechanism, and feature engineering, and has great advantages.
Finally, the model in the invention is a representation learning model with strong generalization. It learns characterization vectors for users and commodities in the same space, so it can serve various downstream tasks. Here the downstream task is recall, and an end-to-end model can be trained. Moreover, the model splits into two parts, a user model and a commodity model, learned simultaneously, which is why it is a bidirectional attention mechanism model. In the prediction stage, the characterization vectors of all commodities can be predicted independently and stored; then, for each user's characterization vector, the M closest commodities in the vector space are retrieved from the stored commodity characterization vectors. This independent prediction avoids repeatedly computing commodity characterization vectors and greatly reduces computation.
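The prediction-stage retrieval described above can be sketched with a brute-force top-M search over precomputed commodity vectors (the sizes below and the random vectors are illustrative; a real system might use an approximate nearest-neighbor index instead).

```python
import numpy as np

rng = np.random.default_rng(7)
num_items, dim, M = 1000, 32, 10   # illustrative sizes

# Predict and store all commodity characterization vectors once.
item_vecs = rng.normal(size=(num_items, dim))

def recall_top_m(user_vec, item_vecs, m):
    """Indices of the m commodities with the largest inner product."""
    scores = item_vecs @ user_vec
    return np.argsort(-scores)[:m]

user_vec = rng.normal(size=dim)    # predicted per request
top = recall_top_m(user_vec, item_vecs, M)
```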
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the structures shown in these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of the overall structure of an embodiment of the bidirectional attention model when k=2;
fig. 2 is a diagram showing a user attention network structure of a kth layer when t=3 according to the present invention;
fig. 3 is a diagram showing a commodity attention network structure of a kth layer when n=3 according to the present invention;
the achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, if directional indications (such as up, down, left, right, front, and rear) are included in the embodiments of the present invention, the directional indications are merely used to explain the relative positional relationship, movement conditions, etc. between the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change accordingly.
In addition, if there is a description of "first", "second", etc. in the embodiments of the present invention, the description of "first", "second", etc. is for descriptive purposes only and is not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, but it is necessary to base that the technical solutions can be realized by those skilled in the art, and when the technical solutions are contradictory or cannot be realized, the combination of the technical solutions should be considered to be absent and not within the scope of protection claimed in the present invention.
Referring to Figs. 1-3, the recommendation system recall method based on the attention mechanism provided by the invention comprises the following steps:
S10, extracting user features and commodity features from the training samples, converting the user features into user embedded vectors, and converting the commodity features into commodity embedded vectors;
S20, inputting the user embedded vectors and commodity embedded vectors into an attention mechanism model for training, learning a weight for each feature through the attention network in the model, and computing the weighted sum of the embedded vectors of all features according to these weights to obtain a user characterization vector and a commodity characterization vector; calculating the inner product of the user characterization vector and the commodity characterization vector to obtain the matching degree of the user's willingness to purchase the commodity for each training sample, establishing a cross entropy loss function over this matching degree, and minimizing the cross entropy loss function until the attention mechanism model converges;
S30, inputting a sample to be tested into the converged attention mechanism model, obtaining the matching degree of the user's willingness to purchase each commodity, and selecting the commodities whose matching degree falls within a preset interval as the recall result to be recommended.
Preferably, the attention mechanism model is a bidirectional attention mechanism model comprising a multi-layer user attention network and a multi-layer commodity attention network. Each user attention layer consists of a two-layer feedforward neural network (FNN) and a softmax normalization layer, and each commodity attention layer likewise consists of a two-layer FNN and a softmax normalization layer; the user attention network and the commodity attention network are each applied layer by layer in a recursive fashion.
Preferably, the multi-layer user attention network in S20 comprises K user attention layers, in which the user characterization vector u^(k) of the k-th layer is given by:

u^(k) = U_Attention(u_1, …, u_T, m^(k-1)), with m^(k) = m^(k-1) + u^(k),

where the superscript (k) or (k-1) denotes the k-th or (k-1)-th layer, U_Attention is the user attention network (every layer has the same structure, whose concrete operation is given by the formulas below), the inputs of the network are the embedded vectors u_1, …, u_T of the user features and the output m^(k-1) of the previous layer, the output of the network is the user characterization vector u^(k) of the layer, and m^(k) is a storage vector holding the accumulated sum of the characterization vectors produced by the first k layers. After receiving its input, the attention network first passes through the two-layer feedforward neural network FNN and the softmax layer to obtain normalized attention weights a_t^(k), and then uses this weight vector to take the weighted average of the T user feature vectors, yielding the layer's characterization vector u^(k);

At the k-th layer, for t = 1, 2, 3, …, T, the weight of the user's t-th embedded vector at that layer is first obtained as

h_t^(k) = tanh((W_1^(k) u_t) ⊙ (W_2^(k) m^(k-1))), a_t^(k) = exp(w_3^(k) h_t^(k)) / Σ_{t'=1}^{T} exp(w_3^(k) h_{t'}^(k)),

where W_1^(k), W_2^(k) and w_3^(k) are all network parameter matrices: in the k-th user attention layer, W_1^(k) is the parameter matrix applied to the embedded vector u_t of the user's t-th feature, W_2^(k) is the parameter matrix applied to the storage vector m^(k-1) output by the previous layer, and w_3^(k) is the parameter matrix applied to the hidden-layer vector h_t^(k) obtained from the user's t-th feature; tanh is the activation function; ⊙ is a custom vector multiplication, i.e., two vectors of the same length have their elements at the same positions multiplied to give a new vector; multiplying h_t^(k) by the single-row matrix w_3^(k) yields one scalar per feature, which is then converted by softmax into the final layer-k characterization-vector weight a_t^(k) of the user; e is the natural constant;

Then, according to the weights a_t^(k), the weighted sum of the user's embedded vectors is computed to obtain the user's layer-k characterization vector u^(k):

u^(k) = Σ_{t=1}^{T} a_t^(k) u_t.
Preferably, the multi-layer commodity attention network in S20 comprises K commodity attention layers, in which the commodity characterization vector v^(k) of the k-th layer is given by:

v^(k) = V_Attention(v_1, …, v_N, m^(k-1)), with m^(k) = m^(k-1) + v^(k),

where V_Attention denotes the commodity attention network, whose structure is the same as the user attention network: the weight a_n^(k) of each commodity embedded vector is obtained first, and then the weighted sum of all commodity embedded vectors according to these weights gives the layer-k commodity characterization vector v^(k);

At the k-th layer, for n = 1, 2, 3, …, N, the weight of the commodity's n-th embedded vector at that layer is first obtained as

h_n^(k) = tanh((W_1^(k) v_n) ⊙ (W_2^(k) m^(k-1))), a_n^(k) = exp(w_3^(k) h_n^(k)) / Σ_{n'=1}^{N} exp(w_3^(k) h_{n'}^(k)),

where W_1^(k), W_2^(k) and w_3^(k) are the parameter matrices of the commodity attention network: in the k-th commodity attention layer, W_1^(k) is the parameter matrix applied to the embedded vector v_n of the commodity's n-th feature, W_2^(k) is the parameter matrix applied to the storage vector m^(k-1) output by the previous layer, and w_3^(k) is the parameter matrix applied to the hidden-layer vector h_n^(k) obtained from the commodity's n-th feature; multiplying h_n^(k) by the single-row matrix yields a scalar, which is then converted by softmax into the final layer-k characterization-vector weight a_n^(k) of the commodity;

Then, according to the weights a_n^(k), the weighted sum of the commodity embedded vectors is computed to obtain the commodity's layer-k characterization vector v^(k):

v^(k) = Σ_{n=1}^{N} a_n^(k) v_n.
Preferably, in step S20, calculating the inner product of the user characterization vector and the commodity characterization vector to obtain the matching degree of the user's willingness to purchase the commodity specifically comprises:

the multi-layer user attention network concatenates the characterization vectors u^(k) of all user attention layers to obtain the final user characterization vector z_u = [u^(0); …; u^(K)];

the multi-layer commodity attention network concatenates the characterization vectors v^(k) of all commodity attention layers to obtain the final commodity characterization vector z_v = [v^(0); …; v^(K)];

the inner product of the final user characterization vector z_u and the final commodity characterization vector z_v gives the final matching degree of the user's willingness to purchase the commodity.
Preferably, the cross entropy loss function of the matching degree of the user's willingness to purchase the commodity is specifically:
L = −(1/m) Σ_{i=1}^{m} [ y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i) ]
wherein m is the number of samples, y_i is the sample label (a sample with click behavior is treated as a positive sample and labeled 1; one without click behavior is treated as a negative sample and labeled 0), and ŷ_i is the predicted matching degree of sample i. For each user, the pair <u, v+> formed with each clicked commodity is regarded as a positive sample pair, and pairs <u, v−> formed with un-clicked commodities are regarded as negative sample pairs. Model training is performed by minimizing the loss function L, i.e., continually narrowing the distance within positive sample pairs and expanding the distance within negative sample pairs.
Preferably, S10 specifically comprises:
dividing user data and commodity data out of the training samples; processing the user data into a set of sparse user vectors X_u = {x_u,1, …, x_u,T}, where T is the total number of user features, t indexes the current user feature, and u represents the user; and processing the commodity data into a set of sparse commodity vectors X_v = {x_v,1, …, x_v,N}, where N is the total number of commodity features, n indexes the current commodity feature, and v represents the commodity;
dividing the training sample data into category-type features and continuous-type features according to the attributes of the data. A category-type feature is encoded as a one-hot vector x_i, whose length is taken as the sum of the numbers of values of all features of the current training sample; the position of the category value is set to 1 and the others to 0, and a feature dictionary is established recording the position number of each category value in the vector. A continuous-type feature likewise takes that sum as the vector length, with the value of the continuous feature placed at its position as the feature value and the other positions set to 0, thereby encoding it into a sparse vector.
Preferably, the attention mechanism model is a representation learning model.
Preferably, the vector lengths of the user attention network and the commodity attention network are equal.
Preferably, the training sample data are collected in the manner of a click-through-rate (CTR) estimation model.
Actual operation example:
The data are collected in the same manner as for a click-through-rate (CTR) estimation model. The features of each sample are divided into two parts: user features, such as gender and age, and commodity features, such as category and price. Each sample corresponds to a label whose value is 1 or 0, indicating whether the user purchased the commodity (in practice, whether the user clicked or favorited it may serve as the label); that is, each sample represents a purchase decision of one user on one commodity. The problem to be solved is thus a binary classification problem: a classification model is trained on these samples, and its output judges whether the user will purchase the commodity. The model outputs a probability value between 0 and 1 representing the likelihood that the user purchases the commodity; the larger the value, the greater the likelihood of purchase.
In the prediction stage, M commodities are recalled from all commodities for a given user. The user's features are combined with the features of every commodity to form I samples, where I is the number of commodities. The I samples are input into the model to obtain I probability values, representing the probability of the user purchasing each commodity. These I probability values are sorted, and the M commodities with the largest values, i.e., the M commodities the user is most likely to purchase, are taken and recommended to the user.
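The ranking procedure just described can be sketched as follows; `predict_fn` is a hypothetical stand-in for the trained classification model (it is not named in the patent), mapping a (user features, commodity features) pair to a purchase probability:

```python
import numpy as np

def recall_top_m(user_feat, all_item_feats, predict_fn, m):
    """Score every (user, commodity) pair and return the top-M commodity indices.

    predict_fn is a hypothetical placeholder for the trained model: it maps
    (user features, commodity features) to a purchase probability.
    """
    scores = np.array([predict_fn(user_feat, f) for f in all_item_feats])
    order = np.argsort(scores)[::-1]        # indices sorted by descending score
    return order[:m], scores[order[:m]]

# Toy demo: the "model" simply returns the commodity feature itself as the score.
top, probs = recall_top_m(None, [0.1, 0.9, 0.5], lambda u, v: v, m=2)
# top -> [1, 2] (the two commodities with the largest scores)
```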
Let us take the 4 samples in Table 1 as an example:

| # | User | Gender | Age | Commodity | Category | Price | Label |
|---|------|--------|-----|-----------|----------|-------|-------|
| 1 | Zhang San | Male | 9 | Pencil | Stationery | 2 | 1 |
| 2 | Li Si | Male | 38 | Trousers | Clothing | 56 | 0 |
| 3 | Wang Wu | Female | 12 | Facial mask | Cosmetics | 35 | 1 |
| 4 | Zhao Liu | Female | 27 | Basketball | Sports goods | 67 | 0 |

TABLE 1
In actual data preprocessing, the "User" column and the "Commodity" column are discarded.
The scheme comprises the following steps:
1) Embedding layer: the input sparse feature data of users and commodities are respectively converted into low-dimensional, dense embedding vector representations;
2) Attention mechanism layer: the embedding vectors of all features are input, the weight of each feature is learned through the attention network, and the embedding vectors are weighted and summed according to these weights to obtain the respective characterization vectors of the user and the commodity;
3) Output layer: the matching degree of the user and the commodity is obtained by calculating the inner product of their characterization vectors.
The specific operation of each step is described in detail below:
1) Embedding layer
At this layer, the input user and commodity feature data are each turned into embedding vector representations. As can be seen from the model structure diagram, the features of the user and of the commodity are input separately, so they also need to be processed separately.
We first clarify the concepts of feature number and vocabulary. Take "gender" as a feature: the number of values of this feature is 2, namely "gender=male" and "gender=female". The feature number is the number of features, while the vocabulary is the sum of the numbers of values of all features.
First, we need to process the data into sparse vectors X_u = {x_u,1, x_u,2, x_u,3, …, x_u,T} and X_v = {x_v,1, x_v,2, x_v,3, …, x_v,N}, where X_u is the set of sparse vectors of user features, T is the number of user features, and the subscript u represents the user; X_v is the set of sparse vectors of commodity features, N is the number of commodity features, and the subscript v represents the commodity.
The data generally include continuous-value features and category-value features. A category-value feature, e.g. "gender", is typically encoded as a one-hot vector x_i; e.g. "gender=male" is encoded as [0, …, 0, 1, 0, …, 0], whose length is the vocabulary size. A continuous-value feature, such as "age=10", can be viewed as a category-like feature and likewise encoded as a sparse vector, e.g. [0, …, 0, 10, 0, …, 0]. Each sparse vector thus has a value at exactly one position (specifically, category features take the value 1, while continuous features keep their own value), and the remaining positions are 0. A feature dictionary needs to be built to fix each feature's position number, so that each position represents one feature.
Taking the samples in Tables 2 and 3 below as examples, the feature dictionaries of users and commodities are first established, and a position number is assigned to each feature:
features (e.g. a character) | Position number |
Sex = male | 0 |
Sex = |
1 |
Age of | 2 |
TABLE 2
Features (e.g. a character) | Position number |
Category = |
0 |
Category = |
1 |
Category = |
2 |
Category = |
3 |
Price of | 4 |
TABLE 3 Table 3
Taking the user features as an example, the user feature values number 3, namely "gender=male", "gender=female" and "age", so the vocabulary, i.e., the dictionary length, is 3, and the resulting sparse vector length is also 3:
"gender=male": position number 0, sparse vector [1,0,0]
"gender=female": position number 1, sparse vector [0,1,0]
"age=10": position number 2, sparse vector [0,0,10]
Taking the commodity features as an example, the commodity feature values number 5, namely "category=stationery", "category=clothing", "category=cosmetics", "category=sports goods" and "price", so the vocabulary, i.e., the dictionary length, is 5, and the resulting sparse vector length is also 5:
"category=stationery": position number 0, sparse vector [1,0,0,0,0]
"category=clothing": position number 1, sparse vector [0,1,0,0,0]
"category=cosmetics": position number 2, sparse vector [0,0,1,0,0]
"category=sports goods": position number 3, sparse vector [0,0,0,1,0]
"price=2": position number 4, sparse vector [0,0,0,0,2]
Then we obtain a sparse vector for each feature of sample 1:
"gender=male": [1,0,0];
"age=9": [0,0,9];
"category=stationery": [1,0,0,0,0];
"price=2": [0,0,0,0,2].
That is, X_u = {x_u,1, x_u,2} and X_v = {x_v,1, x_v,2}, where x_u,1 denotes the sparse vector of the user's 1st feature, "gender=male", and so on, giving the four sparse vectors above. A more common practice is to integrate the sparse vectors of all features of a sample into a single vector; for example, the user features of sample 1 can be integrated into one length-3 vector [1, 0, 9]. But the vector encoded this way is high-dimensional and sparse, and when the vocabulary is large, as with some ID-class features, feeding it directly into a neural network cannot be trained effectively.
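The encoding just walked through can be sketched as follows, using the feature dictionaries of Tables 2 and 3; the `encode` helper and its `(name, value)` feature format are illustrative choices, not part of the patent:

```python
# Feature dictionaries from Tables 2 and 3: feature -> position number.
user_dict = {"gender=male": 0, "gender=female": 1, "age": 2}
item_dict = {"category=stationery": 0, "category=clothing": 1,
             "category=cosmetics": 2, "category=sports goods": 3, "price": 4}

def encode(features, feat_dict):
    """One sparse vector per feature: a category value puts 1 at its position,
    a continuous value puts the value itself; all other positions stay 0."""
    vecs = []
    for name, value in features:
        v = [0.0] * len(feat_dict)
        if value is None:                 # category feature, e.g. "gender=male"
            v[feat_dict[name]] = 1.0
        else:                             # continuous feature, e.g. ("age", 9)
            v[feat_dict[name]] = float(value)
        vecs.append(v)
    return vecs

# Sample 1: Zhang San (male, age 9) buys a pencil (stationery, price 2).
user_vecs = encode([("gender=male", None), ("age", 9)], user_dict)
item_vecs = encode([("category=stationery", None), ("price", 2)], item_dict)
# user_vecs -> [[1, 0, 0], [0, 0, 9]]
```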
So, to reduce the dimensionality, we use another widely used method: these long sparse feature vectors are transformed into low-dimensional, dense vectors (i.e., embedding vectors) by multiplying each sparse vector with an embedding matrix. Since a sparse vector has a value at only one position, the multiplication is equivalent to selecting one column of the embedding matrix and scaling it by that value; and since category features take the value 1, each column of the embedding matrix can be used directly as the embedding vector of the corresponding feature, the only difference being that continuous features additionally multiply the column by a number:
u t =W embed,u x u,t
v n =W embed,v x v,n
wherein u_t is the embedding vector of the t-th user feature (the subscript u representing the user) and v_n is the embedding vector of the n-th commodity feature (the subscript v representing the commodity); W_embed,u ∈ R^{d×T} is the user embedding matrix, W_embed,v ∈ R^{d×N} is the commodity embedding matrix, d is the embedding vector length, and T and N are here the vocabulary sizes of the user features and the commodity features, respectively. Because d << T and d << N (<< meaning far smaller), the original T- or N-dimensional vector is converted into a d-dimensional vector, achieving the purpose of reducing the vector length. Both embedding matrices are parameters that the embedding layer needs to learn, optimized along with the other parameters of the network.
Finally, we obtain the set of embedding vectors of the user features and the set of embedding vectors of the commodity features, respectively:
U = {u_1, u_2, …, u_T},  V = {v_1, v_2, …, v_N}
wherein u_t is the embedding vector of the t-th user feature, T is the feature number of the user, v_n is the embedding vector of the n-th commodity feature, and N is the feature number of the commodity.
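A minimal numpy sketch of the embedding lookup, showing that the matrix multiplication u_t = W_embed,u · x_u,t is equivalent to selecting one column and scaling it by the feature value (the sizes here are toy values, not from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 3                           # embedding length d, toy vocabulary size T
W_embed_u = rng.normal(size=(d, T))   # user embedding matrix, learned in training

x = np.array([0.0, 0.0, 9.0])         # sparse vector for "age=9" (position 2)

u = W_embed_u @ x                     # full matrix multiplication ...
u_fast = W_embed_u[:, 2] * 9.0        # ... equals selecting column 2, scaled by 9

assert np.allclose(u, u_fast)
```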
2) Attention mechanism layer
The attention mechanism can focus on important information within the multi-layer attention network at each step. At the feature level of our model, each layer of the network represents a cross-combination of features, i.e., the attention mechanism can find important feature combinations and increase their weights. On the other hand, since the attention mechanism comprises multiple attention layers that derive higher-order feature crosses, it can extract important high-order feature combinations. By screening out the core feature combinations, the attention mechanism also reduces the amount of information to be processed.
Our approach attends to the features of the user and the commodity simultaneously through multi-layer attention networks, both of which use the same network structure to extract important feature combinations. In this section, we explain the attention mechanism used in each attention layer; these layers finally make up the whole model. For simplicity, the bias term b is omitted in the following formulas.
As can be seen from the model structure diagram, the attention mechanism model is divided into a left part and a right part, namely the user attention mechanism and the commodity attention mechanism. The user attention mechanism takes the embedding vectors of the user features as input and comprises multiple layers of user attention networks; the commodity attention mechanism takes the embedding vectors of the commodity features as input and comprises multiple layers of commodity attention networks. The attention networks form a layer-by-layer recursive relationship: the input of the current layer is the output of the previous layer, and the output of the current layer serves as the input of the next layer. The number of layers can be chosen according to the characteristics of the data, keeping the user and commodity attention mechanisms at the same depth. Herein, a two-layer attention network is used as an example to make up the attention mechanism.
[ user attention mechanism ]
The user attention mechanism aims to find important feature combinations among the user's features and is made up of K user attention networks. In the layer-k attention network, the user characterization vector u^(k) is given by:
u^(k) = U_Attention(U, m^(k-1))
wherein the superscript k (or k−1) on a variable denotes the k-th (or (k−1)-th) layer of the attention network, U_Attention denotes the user attention network, each layer of which has the same structure; the specific operation of the network consists of the formulas below. The inputs to the network are the embedding vectors of the user features U = {u_1, …, u_T} and the storage vector m^(k-1) output by the previous layer; the output of the network is the user characterization vector u^(k) of this layer. m^(k) is a storage vector holding the accumulated sum of the characterization vectors obtained by the first k layers, i.e., m^(k) = m^(k-1) + u^(k). After the input is obtained, the attention network passes through a two-layer feed-forward neural network (FNN) and a softmax normalization layer to obtain the attention weights ã_t^(k), and the T user feature vectors are weighted-averaged with these weights to obtain the characterization vector u^(k) of this layer.
At the k-th layer, for t = 1, 2, 3, …, T, the weight of the user's t-th embedded vector at this layer is first found:
h_t^(k) = tanh(W_1^(k) u_t ⊙ W_2^(k) m^(k-1))
a_t^(k) = W_3^(k) h_t^(k)
ã_t^(k) = exp(a_t^(k)) / Σ_{t'=1}^{T} exp(a_{t'}^(k))
wherein W_1^(k), W_2^(k) and W_3^(k) are all network parameter matrices: W_1^(k) acts on the embedding vector u_t of the user's t-th feature in the k-th layer user attention network, W_2^(k) acts on the storage vector m^(k-1) output by the layer above, and W_3^(k) acts on the hidden layer variable h_t^(k). h_t^(k) is the hidden layer vector obtained from the user's t-th feature, tanh is the activation function, and ⊙ is a custom vector multiplication operation: two vectors of the same length have their same-position elements multiplied to obtain a new vector. Multiplying h_t^(k) by W_3^(k), a matrix with one row, yields a value a_t^(k), which is then converted by softmax into the final k-th-layer characterization vector weight ã_t^(k) of the user; e is the natural constant.
Then, the weighted sum of the user's embedding vectors is calculated from the obtained weights, giving the user characterization vector u^(k) of the k-th layer:
u^(k) = Σ_{t=1}^{T} ã_t^(k) u_t
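A numpy sketch of one such attention layer. The exact wiring of the parameter matrices W_1, W_2, W_3 is inferred from the prose (the original formulas are garbled in extraction), so treat this as an assumption rather than the definitive architecture:

```python
import numpy as np

def attention_layer(E, m_prev, W1, W2, W3):
    """One attention layer over feature embeddings.

    E: (T, d) feature embedding vectors; m_prev: (d,) storage vector from the
    layer above. h_t = tanh(W1 @ e_t * W2 @ m_prev) (elementwise product),
    a_t = W3 @ h_t, weights = softmax(a). Returns the layer's characterization
    vector and the updated storage vector (accumulated sum).
    """
    g = W2 @ m_prev                          # transform of the storage vector
    a = np.array([W3 @ np.tanh((W1 @ e) * g) for e in E])  # one scalar per feature
    w = np.exp(a - a.max())
    w = w / w.sum()                          # softmax-normalized attention weights
    u_k = w @ E                              # weighted sum of embedding vectors
    return u_k, m_prev + u_k                 # storage vector accumulates u^(k)

rng = np.random.default_rng(1)
T, d, h = 3, 4, 5                            # toy sizes
E = rng.normal(size=(T, d))
u0 = E.mean(axis=0)                          # layer-0 vector: mean of embeddings
u1, m1 = attention_layer(E, u0, rng.normal(size=(h, d)),
                         rng.normal(size=(h, d)), rng.normal(size=h))
```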
[ Commodity attentiveness mechanism ]
The commodity attention mechanism aims to find important feature combinations among the features of the commodity; the whole network is composed of K commodity attention networks. In the layer-k attention network, the characterization vector of the commodity is given by:
v^(k) = V_Attention(V, m^(k-1))
wherein V_Attention denotes the commodity attention network, whose structure is the same as that of the user attention network. The weight ã_n^(k) of the n-th commodity embedded vector is obtained first, and the weighted sum of all commodity embedding vectors then gives the commodity characterization vector v^(k) of the k-th layer.
At the k-th layer, for n = 1, 2, 3, …, N, the weight of the n-th commodity embedded vector at this layer is first found:
h_n^(k) = tanh(W_1^(k) v_n ⊙ W_2^(k) m^(k-1))
a_n^(k) = W_3^(k) h_n^(k)
ã_n^(k) = exp(a_n^(k)) / Σ_{n'=1}^{N} exp(a_{n'}^(k))
wherein W_1^(k), W_2^(k) and W_3^(k) are parameter matrices of the commodity attention network: W_1^(k) acts on the embedding vector v_n of the n-th commodity feature in the k-th layer commodity attention network, W_2^(k) acts on the storage vector m^(k-1) output by the layer above, and W_3^(k) acts on the hidden layer variable h_n^(k). h_n^(k) is the hidden layer vector obtained from the n-th commodity feature; multiplying it by the single-row matrix W_3^(k) yields a value a_n^(k), which softmax conversion then turns into the final k-th-layer characterization vector weight ã_n^(k) of the commodity. Then, according to ã_n^(k), the weighted sum of the commodity embedding vectors is calculated to obtain the commodity characterization vector v^(k) of the k-th layer:
v^(k) = Σ_{n=1}^{N} ã_n^(k) v_n
In particular, the inputs to the layer-1 attention network, u^(0) and v^(0), are initialized to the means of the embedding vectors, and the initial storage vectors m^(0) on the user side and the commodity side are set equal to u^(0) and v^(0), respectively:
u^(0) = (1/T) Σ_{t=1}^{T} u_t,  v^(0) = (1/N) Σ_{n=1}^{N} v_n
3) Output layer
After the characterization vectors of the user and the commodity are obtained, a common way to measure how well they match is to take the inner product of the two vectors: the larger the inner product, the closer the two vectors are in the characterization space, and hence the higher the matching degree. The purpose of the output layer is to take the characterization vectors of the user and the commodity and calculate their matching degree.
Through the attention mechanism layer, we have obtained the characterization vector of each network layer for both the user and the commodity. Concatenating them finally gives the user and commodity characterization vectors z_u and z_v, and the final matching degree is obtained by taking the inner product:
z u =[u (0) ;…;u (K) ]
z v =[v (0) ;…;v (K) ]
S=z u ·z v
wherein [ ; ] is the concatenation operation and · is the dot product operation, i.e., two vectors of the same length have their same-position elements multiplied and the products summed, giving the inner product of the two vectors.
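A minimal sketch of the output layer, concatenating the per-layer characterization vectors and taking the dot product:

```python
import numpy as np

def match_score(user_layers, item_layers):
    """z_u = [u(0); ...; u(K)], z_v = [v(0); ...; v(K)], S = z_u . z_v.

    Both sides must have the same number of layers and equal vector lengths,
    so the two concatenated vectors have the same length.
    """
    z_u = np.concatenate(user_layers)
    z_v = np.concatenate(item_layers)
    return float(z_u @ z_v)

s = match_score([np.array([1.0, 0.0]), np.array([0.0, 2.0])],
                [np.array([1.0, 1.0]), np.array([3.0, 1.0])])
# z_u = [1,0,0,2], z_v = [1,1,3,1], S = 1*1 + 0*1 + 0*3 + 2*1 = 3.0
```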
As shown in Fig. 1, the overall structure of the model is described for K = 2.
model training uses cross entropy loss functions, which is a widely used loss function.
Wherein m is the number of samples, y i For sample labels, the positive sample with click action is 1, otherwise, is 0, and for each user, the sample labels form a pair with each clicked commodity<u,v + >Is a positive sample pair; too many products can be sampled after clicking, and a plurality of products are randomly selected to form a plurality of pairs<u,v - >Is a negative sample pair. Model training is accomplished by minimizing the loss function L, i.e., continually narrowing the distance between positive samples and expanding the distance between negative pairs of samples.
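The loss can be sketched as below; mapping the inner-product score to a probability with a sigmoid is an assumption, since the text only states that the model outputs a value between 0 and 1:

```python
import numpy as np

def bce_loss(scores, labels):
    """Cross-entropy over m samples. scores are inner products S, mapped to
    probabilities with a sigmoid (an assumption, not stated explicitly)."""
    p = 1.0 / (1.0 + np.exp(-np.asarray(scores, dtype=float)))
    y = np.asarray(labels, dtype=float)
    eps = 1e-12                               # guard against log(0)
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

# A confident correct prediction costs little; a confident wrong one costs a lot.
assert bce_loss([10.0], [1]) < bce_loss([10.0], [0])
```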
In the prediction stage, the characterization vectors of all commodities are computed in advance and stored via the right half of the model, i.e., the commodity attention mechanism. For each user, the user's characterization vector is obtained through the left half of the model, i.e., the user attention mechanism; the matching degree between the user and all commodities is then calculated, and finally the top P commodities with the highest matching degree are selected as the recall result. The recall rate may be chosen as the evaluation index.
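Because the commodity characterization vectors are precomputed, prediction-stage recall reduces to one matrix-vector product plus a top-P selection, e.g.:

```python
import numpy as np

def recall_top_p(z_user, Z_items, p):
    """Z_items: (I, D) matrix of precomputed commodity characterization vectors.
    Scores all commodities with one matrix-vector product and returns the
    indices and scores of the P best matches."""
    scores = Z_items @ z_user
    top_p = np.argsort(scores)[-p:][::-1]     # indices of the P highest scores
    return top_p, scores[top_p]

Z = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
idx, sc = recall_top_p(np.array([2.0, 1.0]), Z, p=2)
# scores = [2.0, 1.0, 1.5] -> top-2 commodity indices [0, 2]
```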
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structural changes made by the description of the present invention and the accompanying drawings or direct/indirect application in other related technical fields are included in the scope of the invention.
Claims (9)
1. A recommendation system recall method based on an attention mechanism, comprising the steps of:
s10, extracting user features and commodity features in the training samples, converting the user features into user embedded vectors, and converting the commodity features into commodity embedded vectors;
s20, inputting the user embedded vector and the commodity embedded vector into an attention mechanism model for training, learning the weight of each feature through an attention network in the model, and weighting and summing the embedded vectors of all the features according to the weights to obtain a user characterization vector and a commodity characterization vector; calculating the inner product of the user characterization vector and the commodity characterization vector to obtain the matching degree of the training-sample user's willingness to purchase the commodity, establishing a cross entropy loss function of this matching degree, and minimizing the cross entropy loss function so that the attention mechanism model converges;
the multi-layer user attention network in S20 comprises K layers of user attention networks, in which the user characterization vector u^(k) is given by:
u^(k) = U_Attention(U, m^(k-1))
wherein the superscript k (or k−1) on a variable denotes the k-th (or (k−1)-th) layer of the attention network, U_Attention denotes the user attention network, each layer of which has the same structure, the specific operation of the network consisting of the formulas below; the inputs to the network are the embedding vectors of the user features U = {u_1, …, u_T} and the storage vector m^(k-1) output by the previous layer, the output of the network is the user characterization vector u^(k) of this layer, and m^(k) is a storage vector holding the accumulated sum of the characterization vectors obtained by the first k layers; after the input is obtained, the attention network passes through a two-layer feed-forward neural network FNN and a softmax normalization layer to obtain the attention weights ã_t^(k), and the T user feature vectors are weighted-averaged with these weights to obtain the characterization vector u^(k) of this layer;
at the k-th layer, for t = 1, 2, 3, …, T, the weight of the user's t-th embedded vector at this layer is first found:
h_t^(k) = tanh(W_1^(k) u_t ⊙ W_2^(k) m^(k-1))
a_t^(k) = W_3^(k) h_t^(k)
ã_t^(k) = exp(a_t^(k)) / Σ_{t'=1}^{T} exp(a_{t'}^(k))
wherein W_1^(k), W_2^(k) and W_3^(k) are all network parameter matrices: W_1^(k) acts on the embedding vector u_t of the user's t-th feature in the k-th layer user attention network, W_2^(k) acts on the storage vector m^(k-1) output by the layer above, and W_3^(k) acts on the hidden layer variable h_t^(k); h_t^(k) is the hidden layer vector obtained from the user's t-th feature, tanh is the activation function, and ⊙ is a custom vector multiplication operation, i.e., two vectors of the same length have their same-position elements multiplied to obtain a new vector; multiplying h_t^(k) by W_3^(k), a matrix with one row, yields a value a_t^(k), which is then converted by softmax into the final k-th-layer characterization vector weight ã_t^(k) of the user, e being the natural constant;
then, according to ã_t^(k), the weighted sum of the user's embedding vectors is calculated to obtain the user characterization vector u^(k) of the k-th layer:
u^(k) = Σ_{t=1}^{T} ã_t^(k) u_t
S30, inputting the sample to be tested into the converged attention mechanism model, obtaining the matching degree of the user purchasing commodity intention of the sample to be tested, and selecting the commodity of which the matching degree of the user purchasing commodity intention is in a preset interval as a recall result to be recommended.
2. The attention mechanism based recommendation system recall method of claim 1 wherein the attention mechanism model is a bi-directional attention mechanism model comprising a multi-layer user attention network and a multi-layer commodity attention network, each layer of user attention network comprising a two-layer feed forward neural network FNN and a normalized layer Softmax, each layer of commodity attention network comprising a two-layer feed forward neural network FNN and a normalized layer Softmax, the user attention network and the commodity attention network each being in a layer-by-layer recursive relationship.
3. The attention mechanism based recommendation system recall method of claim 2, wherein the multi-layer commodity attention network in S20 comprises K layers of commodity attention networks, in which the commodity characterization vector v^(k) is given by:
v^(k) = V_Attention(V, m^(k-1))
wherein V_Attention denotes the commodity attention network, whose structure is the same as that of the user attention network; the weight ã_n^(k) of the n-th commodity embedded vector is obtained first, and the weighted sum of all commodity embedding vectors then gives the commodity characterization vector v^(k) of the k-th layer;
at the k-th layer, for n = 1, 2, 3, …, N, the weight of the n-th commodity embedded vector at this layer is first found:
h_n^(k) = tanh(W_1^(k) v_n ⊙ W_2^(k) m^(k-1))
a_n^(k) = W_3^(k) h_n^(k)
ã_n^(k) = exp(a_n^(k)) / Σ_{n'=1}^{N} exp(a_{n'}^(k))
wherein W_1^(k), W_2^(k) and W_3^(k) are parameter matrices of the commodity attention network: W_1^(k) acts on the embedding vector v_n of the n-th commodity feature in the k-th layer commodity attention network, W_2^(k) acts on the storage vector m^(k-1) output by the layer above, and W_3^(k) acts on the hidden layer variable h_n^(k); h_n^(k) is the hidden layer vector obtained from the n-th commodity feature, and multiplying it by the single-row matrix W_3^(k) yields a value a_n^(k), which softmax conversion then turns into the final k-th-layer characterization vector weight ã_n^(k) of the commodity;
then, according to ã_n^(k), the weighted sum of the commodity embedding vectors is calculated to obtain the commodity characterization vector v^(k) of the k-th layer:
v^(k) = Σ_{n=1}^{N} ã_n^(k) v_n
4. The attention mechanism based recommendation system recall method of claim 1, wherein calculating the inner product of the user characterization vector and the commodity characterization vector in S20 to obtain the matching degree of the user's willingness to purchase the commodity in the training sample specifically comprises:
the multi-layer user attention network concatenating the characterization vectors u^(k) of the user attention networks of all layers to obtain the final user characterization vector z_u = [u^(0); …; u^(K)];
the multi-layer commodity attention network concatenating the characterization vectors v^(k) of the commodity attention networks of all layers to obtain the final commodity characterization vector z_v = [v^(0); …; v^(K)];
calculating the inner product of the final user characterization vector z_u and the final commodity characterization vector z_v to obtain the final matching degree of the user's willingness to purchase the commodity.
5. The attention mechanism based recommender system recall method of claim 1, wherein the cross entropy loss function of the matching degree of the user's willingness to purchase the commodity is specifically:
L = −(1/m) Σ_{i=1}^{m} [ y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i) ]
wherein m is the number of samples, y_i is the sample label (a sample with click behavior is treated as a positive sample and labeled 1; one without click behavior is treated as a negative sample and labeled 0), and ŷ_i is the predicted matching degree of sample i; for each user, the pair <u, v+> formed with each clicked commodity is regarded as a positive sample pair, and pairs <u, v−> formed with un-clicked commodities are regarded as negative sample pairs; model training is performed by minimizing the loss function L, i.e., continually narrowing the distance within positive sample pairs and expanding the distance within negative sample pairs.
6. The attention mechanism based recommender system recall method of claim 1, wherein in S10:
user data and commodity data are divided out of the training samples; the user data are processed into a set of sparse user vectors X_u = {x_u,1, …, x_u,T}, where T is the total number of user features, t indexes the current user feature, and u represents the user; the commodity data are processed into a set of sparse commodity vectors X_v = {x_v,1, …, x_v,N}, where N is the total number of commodity features, n indexes the current commodity feature, and v represents the commodity;
the training sample data are divided into category-type features and continuous-type features according to the attributes of the data: a category-type feature is encoded as a one-hot vector x_i, whose length is taken as the sum of the numbers of values of all features of the current training sample, the position of the category value being 1 and the others 0, with a feature dictionary established recording the position number of each category value in the vector; a continuous-type feature likewise takes that sum as the vector length, with the value of the continuous feature placed at its position as the feature value and the other positions 0, thereby encoding it into a sparse vector.
7. The attention mechanism based recommender system recall method of claim 1 wherein said attention mechanism model is a representational learning model.
8. The attention mechanism based recommender recall method of claim 1 wherein the vector lengths of said user attention network and said merchandise attention network are equal.
9. The attention mechanism based recommender recall method of claim 1, wherein the training sample data are collected in the manner of a click-through-rate (CTR) estimation model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911222216.1A CN111062775B (en) | 2019-12-03 | 2019-12-03 | Recommendation system recall method based on attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111062775A CN111062775A (en) | 2020-04-24 |
CN111062775B true CN111062775B (en) | 2023-05-05 |
Family
ID=70299499
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911222216.1A Active CN111062775B (en) | 2019-12-03 | 2019-12-03 | Recommendation system recall method based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111062775B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021223165A1 (en) * | 2020-05-07 | 2021-11-11 | Beijing Didi Infinity Technology And Development Co., Ltd. | Systems and methods for object evaluation |
CN111737569B (en) * | 2020-06-04 | 2022-05-03 | 山东省人工智能研究院 | Personalized recommendation method based on attribute perception intention-minded convolutional neural network |
CN111695260B (en) * | 2020-06-12 | 2022-06-21 | 上海大学 | Material performance prediction method and system |
CN111737573A (en) * | 2020-06-17 | 2020-10-02 | 北京三快在线科技有限公司 | Resource recommendation method, device, equipment and storage medium |
CN112184391B (en) * | 2020-10-16 | 2023-10-10 | 中国科学院计算技术研究所 | Training method of recommendation model, medium, electronic equipment and recommendation model |
CN112270571B (en) * | 2020-11-03 | 2023-06-27 | 中国科学院计算技术研究所 | Meta-model training method for cold-start advertisement click rate estimation model |
CN112416931A (en) * | 2020-11-18 | 2021-02-26 | 脸萌有限公司 | Information generation method and device and electronic equipment |
CN112328893B (en) * | 2020-11-25 | 2022-08-02 | 重庆理工大学 | Recommendation method based on memory network and cooperative attention |
CN112598462B (en) * | 2020-12-19 | 2023-08-25 | 武汉大学 | Personalized recommendation method and system based on collaborative filtering and deep learning |
CN113139850A (en) * | 2021-04-26 | 2021-07-20 | 西安电子科技大学 | Commodity recommendation model for alleviating data sparsity and commodity cold start |
CN113761392B (en) * | 2021-09-14 | 2022-04-12 | 上海任意门科技有限公司 | Content recall method, computing device, and computer-readable storage medium |
CN113742594B (en) * | 2021-09-16 | 2024-02-27 | 中国银行股份有限公司 | Recommendation system recall method and device |
CN115062220B (en) * | 2022-06-16 | 2023-06-23 | 成都集致生活科技有限公司 | Attention merging-based recruitment recommendation system |
CN116521936B (en) * | 2023-06-30 | 2023-09-01 | 云南师范大学 | Course recommendation method and device based on user behavior analysis and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109509054A (en) * | 2018-09-30 | 2019-03-22 | 平安科技(深圳)有限公司 | Commodity recommendation method for massive data, electronic device and storage medium |
CN109960759A (en) * | 2019-03-22 | 2019-07-02 | 中山大学 | Recommender system click-through rate prediction method based on a deep neural network |
CN110196946A (en) * | 2019-05-29 | 2019-09-03 | 华南理工大学 | Personalized recommendation method based on deep learning |
- 2019-12-03: CN application CN201911222216.1A filed; granted as CN111062775B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN111062775A (en) | 2020-04-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111062775B (en) | Recommendation system recall method based on attention mechanism | |
Li et al. | Multi-interest network with dynamic routing for recommendation at Tmall | |
CN112598462B (en) | Personalized recommendation method and system based on collaborative filtering and deep learning | |
CN108363804B (en) | Local model weighted fusion Top-N movie recommendation method based on user clustering | |
Tan et al. | Improved recurrent neural networks for session-based recommendations | |
Tautkute et al. | DeepStyle: Multimodal search engine for fashion and interior design | |
Zhang et al. | Deep Learning over Multi-field Categorical Data: A Case Study on User Response Prediction | |
CN106021364B (en) | Construction of an image search relevance prediction model, image search method and device | |
CN110458627B (en) | Personalized commodity sequence recommendation method for dynamic user preferences | |
CN108537624B (en) | Deep learning-based travel service recommendation method | |
CN111737474A (en) | Method and device for training business model and determining text classification category | |
CN108665323B (en) | Ensemble method for a financial product recommendation system | |
US20100223258A1 (en) | Information retrieval system and method using a bayesian algorithm based on probabilistic similarity scores | |
CN109064285B (en) | Commodity recommendation ranking and commodity recommendation method | |
EP3300002A1 (en) | Method for determining the similarity of digital images | |
Chen et al. | Using fruit fly optimization algorithm optimized grey model neural network to perform satisfaction analysis for e-business service | |
Meena et al. | Identifying emotions from facial expressions using a deep convolutional neural network-based approach | |
Li et al. | Retrieving real world clothing images via multi-weight deep convolutional neural networks | |
CN111814842A (en) | Object classification method and device based on multi-pass graph convolution neural network | |
CN111737578A (en) | Recommendation method and system | |
CN114693397A (en) | Multi-view multi-modal commodity recommendation method based on attention neural network | |
Alfarhood et al. | DeepHCF: a deep learning based hybrid collaborative filtering approach for recommendation systems | |
Du et al. | POLAR++: active one-shot personalized article recommendation | |
CN111597428A (en) | Recommendation method for splicing user and article with q-separation k sparsity | |
CN110163716B (en) | Red wine recommendation method based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||