CN113850656A - Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data - Google Patents
- Publication number
- CN113850656A (application CN202111348060.9A)
- Authority
- CN
- China
- Prior art keywords
- user
- vector
- attention
- comment
- clothing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Finance (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Accounting & Taxation (AREA)
- Probability & Statistics with Applications (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an attention-aware personalized clothing recommendation method and system that fuse multi-modal data. The user's rating data guides the generation of the user's comment features, so that the comment features produced by the attention mechanism contain the comment information most relevant to the user. The processed comment features then guide the generation of fine-grained feature vectors of the clothing images, capturing the key image features of the clothing that the user cares about. After these two attention stages, the resulting clothing feature vectors are more accurate and better match the user's personalized preferences. The attention mechanism filters noise from the user comments and the clothing image data, yielding multi-dimensional comment and image features that are more relevant to the user, reflecting the user's personalized preferences more accurately, and addressing insufficient mining of user interests and insufficient recommendation precision.
Description
Technical Field
The invention belongs to the technical field of intelligent recommendation, and particularly relates to an attention-perception-based personalized clothing recommendation method and system fusing multi-modal data.
Background
With the rapid development of science and technology and the wide adoption of electronic commerce, large e-commerce platforms have emerged. At the same time, the volume of information keeps increasing and internet data is growing explosively. Recommendation systems are widely used as an effective remedy for information overload, and their personalized services have become part of daily life and play an increasingly important role.
A recommendation system analyzes and mines factors such as the characteristics and interests of a user from the user's historical behavior, and then matches information or services that may interest the user out of a massive pool of information. Its most important property is that it copes well with vaguely specified user requirements and can build a model from the user's historical data to capture the user's interests. Personalized recommendation means recommending to a target user the specific commodities that the user is relatively more interested in, according to that user's individual requirements. The most widely used algorithm in personalized clothing recommendation is collaborative filtering: a preference function between users and items is mined from the historical interaction records of different users with items and from the information of similar users, the user's preference for items not yet interacted with is predicted, and items that may interest each user are recommended according to the prediction. However, collaborative filtering suffers from data sparsity and cannot sufficiently mine the potential interests of users. In addition, user-based collaborative filtering uses only the interaction information between users and commodities and ignores the characteristics of the clothing products themselves, such as the visual characteristics of the clothing. As a result, the recommendations are inaccurate and the user experience is poor.
In summary, existing clothing recommendation methods fall mainly into two categories:
1) Recommendation methods based on collaborative filtering. Although such methods can produce good recommendations, they mainly have the following defects: (1) clothing recommendation uses only sparse rating data, which leads to data sparsity and insufficient mining of the user's potential interests; (2) information about the clothing item itself, for example the image features of the garment, is ignored. Both defects reduce the accuracy of the recommendation system.
2) Recommendation systems based on clothing image features, which mainly have the following defects: (1) the global features of the clothing image are used as the feature representation of the image, so a fine-grained clothing feature representation is lacking; (2) the fact that different users pay attention to different characteristics of a garment is ignored, so the recommendation results are poorly personalized.
Disclosure of Invention
Aiming at the problems that existing clothing recommendation results are inaccurate and the user experience is poor, the invention provides an attention-aware personalized clothing recommendation method and system fusing multi-modal data.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention provides an attention-aware personalized clothing recommendation method fusing multi-modal data, which comprises the following steps:
step 1: extracting a hidden factor vector matrix of the user from the user rating data by using the LFM;
step 2: mining feature information from the user comments by using a BiGRU and an attention mechanism, including using the hidden factor vector matrix of the user to guide the generation of word-level attention vectors from the user comments;
step 3: applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining a user comment preference feature vector from the contextualized user comment feature vectors;
step 4: dividing the clothing image most recently purchased by the user into m regions, extracting the clothing image feature vector of each region, and using the contextualized user comment vectors to guide attention over the clothing image features, so as to obtain the clothing image feature vector guided by the user comments;
step 5: concatenating the user comment preference feature vector and the clothing image feature vector guided by the user comments to generate a user preference feature vector, which is the output of the encoder part;
step 6: inputting the user preference feature vector to the decoder part, calculating the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
Further, the step 1 comprises:
constructing a user-clothing item rating matrix with users as rows and clothing items as columns:
$$r_{u,i} = \sum_{k=1}^{F} p_{u,k}\, q_{i,k}$$
wherein $r_{u,i}$ is the rating of clothing item $i$ by user $u$; $p_{u,k}$ is the $k$-th component of the hidden factor vector of user $u$; $q_{i,k}$ is the $k$-th component of the hidden factor vector of clothing item $i$; $F$ is the dimension of the vectors; and $p_u$ denotes the hidden factor vector matrix of the user.
Further, the step 2 comprises:
letting the user's historical comment set be $S = \{S_1, S_2, \dots, S_n, \dots, S_N\}$, each comment being represented as a sequence of words $t_1, t_2, \dots, t_{|S|}$;
using a pre-trained BERT to obtain embedded vector representations, and using a BiGRU to process the word vector sequence of each comment;
concatenating the forward and backward hidden states of each word to obtain a contextual word vector, thereby obtaining the word sequence $h_t$;
using the user hidden factor vectors obtained from the ratings to guide the generation of word-level attention vectors from the comments;
taking the user hidden factor vector matrix $p_u$ and the word sequence $h_t$ as input, attention processing is performed [the calculation formula is not reproduced in the source text];
wherein the attention score represents the degree of attention of user $u$ to word $t$; $W_1, W_2, W_3$ are the weights to be learned; $a_k$ is the attention weight of the $k$-th word obtained by the normalization operation; and $a$ is the word-level attention vector, which is a summary of comment $S$.
Further, the step 3 comprises:
applying attention to the contextualized user comment feature vectors to generate the user comment preference feature vector $S_u$, the attention being calculated as:
$$\beta_n = W_5 \tanh(W_4 c_n + b_1) + b_2$$
wherein $\beta_n$ is the degree of attention of the $n$-th comment, obtained using a single-layer neural network; $c_n$ is the contextualized user comment feature vector; $W_4, W_5$ are weight matrices; $b_1, b_2$ are bias vectors; and $g_n$ is the weight of the $n$-th comment obtained after the normalization operation.
Further, the step 4 comprises:
dividing the clothing image most recently purchased by the user into $m$ regions;
performing feature extraction on each region by using a VGG network to obtain the original clothing image feature vectors;
then aggregating the contextualized user comment feature vectors obtained from the user comments into a single vector $c_s$ by using average pooling;
performing attention processing on the original clothing image feature vectors to filter noise and obtain the clothing image feature vector guided by the user comments, the attention being calculated as:
$$\delta_I = \tanh(W_6 v_I \odot W_7 c_s)$$
wherein $\delta_I$ is the degree of attention of $c_s$ to $v_I$; $v_I = \{v_i \mid v_i \in \mathbb{R}^d,\ i = 1, \dots, m\}$, $v_I \in \mathbb{R}^{d \times m}$, are the original clothing image feature vectors; $c_s \in \mathbb{R}^d$; $W_6, W_7$ are weight matrices; $\odot$ denotes the connection of the vectors; $p_I \in \mathbb{R}^m$ is a vector over the $m$ regions giving the attention probability of each region, and $p_i$ is the entry of $p_I$ for the $i$-th region; and $V_L$ is the clothing image feature vector guided by the user comments.
The invention provides a personalized clothing recommendation system fusing multi-mode data and based on attention perception, which comprises the following components:
the hidden factor vector extraction module is used for extracting a hidden factor vector matrix of the user from the user scoring data by using the LFM;
the user comment feature extraction module is used for mining feature information of the user comment by using a BiGRU and attention mechanism, and comprises the steps of guiding the user comment to generate a word-level attention vector by using a hidden factor vector matrix of the user;
the user comment preference feature vector obtaining module is used for applying BiGRU to splice hidden states in the forward direction and the backward direction on the attention vector of the word level to obtain a contextualized user comment feature vector, and then obtaining the user comment preference feature vector based on the contextualized user comment feature vector;
the user comment guidance module is used for dividing the clothing image purchased by the user for the last time into m regions, extracting the clothing image feature vector of each region, and performing attention guidance on clothing image features by using the contextualized user comment vector to obtain the clothing image feature vector generated by the user comment guidance;
the user preference feature vector obtaining module is used for splicing the obtained user comment preference feature vector and the clothing image feature vector generated by the user comment guidance to generate a user preference feature vector which is used as the output of the encoder part;
and the recommending module is used for inputting the user preference feature vector into the decoder part, calculating the probability distribution of the candidate clothing item set and selecting the clothing item with the maximum probability as the next item recommendation.
Further, the implicit factor vector extraction module is specifically configured to:
constructing a user-clothing item rating matrix with users as rows and clothing items as columns:
$$r_{u,i} = \sum_{k=1}^{F} p_{u,k}\, q_{i,k}$$
wherein $r_{u,i}$ is the rating of clothing item $i$ by user $u$; $p_{u,k}$ is the $k$-th component of the hidden factor vector of user $u$; $q_{i,k}$ is the $k$-th component of the hidden factor vector of clothing item $i$; $F$ is the dimension of the vectors; and $p_u$ denotes the hidden factor vector matrix of the user.
Further, the user comment feature extraction module is specifically configured to:
letting the user's historical comment set be $S = \{S_1, S_2, \dots, S_n, \dots, S_N\}$, each comment being represented as a sequence of words $t_1, t_2, \dots, t_{|S|}$;
using a pre-trained BERT to obtain embedded vector representations, and using a BiGRU to process the word vector sequence of each comment;
concatenating the forward and backward hidden states of each word to obtain a contextual word vector, thereby obtaining the word sequence $h_t$;
using the user hidden factor vectors obtained from the ratings to guide the generation of word-level attention vectors from the comments;
taking the user hidden factor vector matrix $p_u$ and the word sequence $h_t$ as input, attention processing is performed [the calculation formula is not reproduced in the source text];
wherein the attention score represents the degree of attention of user $u$ to word $t$; $W_1, W_2, W_3$ are the weights to be learned; $a_k$ is the attention weight of the $k$-th word obtained by the normalization operation; and $a$ is the word-level attention vector, which is a summary of comment $S$.
Further, the user comment preference feature vector derivation module is specifically configured to:
applying attention to the contextualized user comment feature vectors to generate the user comment preference feature vector $S_u$, the attention being calculated as:
$$\beta_n = W_5 \tanh(W_4 c_n + b_1) + b_2$$
wherein $\beta_n$ is the degree of attention of the $n$-th comment, obtained using a single-layer neural network; $c_n$ is the contextualized user comment feature vector; $W_4, W_5$ are weight matrices; $b_1, b_2$ are bias vectors; and $g_n$ is the weight of the $n$-th comment obtained after the normalization operation.
Further, the user comment guidance module is specifically configured to:
dividing the clothing image most recently purchased by the user into $m$ regions;
performing feature extraction on each region by using a VGG network to obtain the original clothing image feature vectors;
then aggregating the contextualized user comment feature vectors obtained from the user comments into a single vector $c_s$ by using average pooling;
performing attention processing on the original clothing image feature vectors to filter noise and obtain the clothing image feature vector guided by the user comments, the attention being calculated as:
$$\delta_I = \tanh(W_6 v_I \odot W_7 c_s)$$
wherein $\delta_I$ is the degree of attention of $c_s$ to $v_I$; $v_I = \{v_i \mid v_i \in \mathbb{R}^d,\ i = 1, \dots, m\}$, $v_I \in \mathbb{R}^{d \times m}$, are the original clothing image feature vectors; $c_s \in \mathbb{R}^d$; $W_6, W_7$ are weight matrices; $\odot$ denotes the connection of the vectors; $p_I \in \mathbb{R}^m$ is a vector over the $m$ regions giving the attention probability of each region, and $p_i$ is the entry of $p_I$ for the $i$-th region; and $V_L$ is the clothing image feature vector guided by the user comments.
Compared with the prior art, the invention has the following beneficial effects:
Compared with prior recommendation systems and methods, the invention fuses multi-modal data for recommendation. Specifically, user ratings, user comments and clothing images are used as the inputs of the model, and the user rating data is used to apply attention to the user comments, yielding more accurate user comment features. Features are extracted after the clothing image is segmented, so that different parts of the image are represented at a finer granularity and the resulting feature vectors are both finer-grained and more accurate. Fusing multi-source data for recommendation in this way alleviates the data sparsity problem in clothing recommendation systems.
Existing recommendation systems and methods simply concatenate the vectors of the different modalities or fuse them with a fully connected layer. By contrast, the invention applies attention in which one modality guides another: the user ratings guide the attention over the user comments, and the user comments then guide the attention over the clothing images. The attention mechanism filters noise from the user comments and the clothing image data, yielding multi-dimensional comment and image features that are more relevant to the user, reflecting the user's personalized preferences more accurately, and addressing insufficient mining of user interests and insufficient recommendation precision.
Drawings
FIG. 1 is a basic flowchart of a personalized clothing recommendation method based on attention perception by fusing multi-modal data according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an architecture of a personalized clothing recommendation system based on attention perception and integrating multimodal data according to an embodiment of the invention.
Detailed Description
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
as shown in fig. 1, a personalized clothing recommendation method based on attention perception by fusing multi-modal data includes:
Step 1: extracting a hidden factor vector matrix of the user from the user rating data by using an LFM (latent factor model);
Specifically, the LFM is used to extract user hidden factor vectors and clothing hidden factor vectors from the user rating data; these vectors reflect the interests of the user. Let the user set be $U = \{u_1, u_2, \dots, u_m\}$ and the clothing item set be $I = \{i_1, i_2, \dots, i_n\}$. A user-clothing item rating matrix $R$ is constructed with users as rows and clothing items as columns, where $R_{u,i}$ is the rating of clothing item $i$ by user $u$, according to the formula:
$$R_{u,i} = \sum_{k=1}^{F} p_{u,k}\, q_{i,k}$$
wherein $p_{u,k}$ is the $k$-th component of the hidden factor vector of user $u$, $q_{i,k}$ is the $k$-th component of the hidden factor vector of clothing item $i$, and $F$ is the dimension of the vectors. The result is the hidden factor vector matrix $p_u$ of the user, which represents the user's preferences.
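As an illustration of Step 1, the following is a minimal sketch of fitting a latent factor model to observed ratings. The text does not state how the LFM is trained, so the squared-error loss with L2 regularization, the SGD update rule, the hyperparameters, and the name `train_lfm` are all assumptions made for this example.

```python
import numpy as np

def train_lfm(ratings, n_users, n_items, F=8, lr=0.02, reg=0.05, epochs=500, seed=0):
    """Fit r_ui ~ p_u . q_i on observed (user, item, rating) triples with plain SGD.

    Assumed loss: 0.5 * err^2 + 0.5 * reg * (|p_u|^2 + |q_i|^2) per observed rating.
    """
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, F))  # user hidden factor vectors p_u
    Q = rng.normal(scale=0.1, size=(n_items, F))  # item hidden factor vectors q_i
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error on this rating
            p_old = P[u].copy()            # keep the old p_u for the q_i update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

# Toy data: (user index, item index, rating); purely illustrative.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = train_lfm(ratings, n_users=3, n_items=3)
mse = float(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))
```

Row `P[u]` then plays the role of the user's hidden factor vector that guides the comment attention in Step 2.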
Step 2: mining feature information from the user comments by using a BiGRU and an attention mechanism (specifically, the comment texts are processed by the BiGRU to obtain user comment features containing context information, and the attention mechanism is then used to obtain the more relevant user comment features), including using the hidden factor vector matrix of the user to guide the generation of word-level attention vectors from the user comments;
Specifically, the BiGRU and the attention mechanism are used to mine the feature information of the user comments in depth, because the user's comment text contains important preference information and specific details about the clothing. When extracting information from comments, it is important to distinguish relevant comments from noisy ones and to identify the important parts of each comment. Let the user's historical comment set be $S = \{S_1, S_2, \dots, S_n, \dots, S_N\}$, each comment being represented as a sequence of words $t_1, t_2, \dots, t_{|S|}$. A pre-trained BERT is first used to obtain embedded vector representations, and a BiGRU then processes the word vector sequence of each comment in order to capture context information. The forward and backward hidden states of each word are concatenated to obtain a contextual word vector, thereby obtaining the word sequence $h_t$. Because the word sequence of a text can be long, the final feature vector produced by the BiGRU model in practice is biased toward the last words of the sequence. To address this, an attention mechanism uses the user hidden factor vectors obtained from the ratings to guide the generation of word-level attention vectors from the comments, so that the weights of the word vectors are adjusted according to the user's preferences and a more compact and more accurate representation of the comment content is obtained. The user hidden factor vector matrix $p_u$ and the word sequence $h_t$ are taken as input and attention processing is performed [the calculation formula is not reproduced in the source text];
wherein the attention score is obtained by applying the linear transformations $W_1$ and $W_2$ to $p_u$ and $h_t$, extracting nonlinear semantic information with a nonlinear activation function, and finally applying a linear transformation, giving the degree of attention of user $u$ to word $t$; $W_1, W_2, W_3$ are the weights to be learned; $a_k$ is the attention weight of the $k$-th word obtained by the normalization operation; and $a$ is the word-level attention vector, which is a summary of comment $S$. The calculation is repeated for each comment $S_1, \dots, S_N$ to obtain $a_1, \dots, a_N$.
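The word-level attention described in Step 2 can be sketched as follows. Because the formula itself is missing from the text, this is only one plausible reading of the prose: `tanh` is assumed as the nonlinear activation (it is the activation used in the document's other attention formulas), softmax is assumed as the normalization, and the summary is assumed to be the weighted sum of the word states. The function name and all shapes are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def word_attention(p_u, H, W1, W2, w3):
    """User-guided word-level attention over one comment (assumed form).

    p_u: (F,)   user hidden factor vector
    H:   (T, d) contextual word vectors h_t from the BiGRU
    W1:  (k, F), W2: (k, d), w3: (k,)  learned weights
    Returns the attention weights (T,) and the comment summary a (d,).
    """
    # Linear transforms of p_u and h_t, a nonlinearity, then a final linear map.
    scores = np.tanh(p_u @ W1.T + H @ W2.T) @ w3   # (T,)
    weights = softmax(scores)                      # normalized weights a_k
    return weights, weights @ H                    # weighted sum of word states

rng = np.random.default_rng(0)
F, d, k, T = 4, 6, 5, 7
p_u = rng.normal(size=F)
H = rng.normal(size=(T, d))
W1, W2, w3 = rng.normal(size=(k, F)), rng.normal(size=(k, d)), rng.normal(size=k)
weights, a = word_attention(p_u, H, W1, W2, w3)
```

Running the function once per comment yields the vectors $a_1, \dots, a_N$ that Step 3 consumes.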
Step 3: applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining a user comment preference feature vector from the contextualized user comment feature vectors;
Specifically, in order to capture global context information across the user comments, a BiGRU is applied to the word-level attention vectors, and the forward and backward hidden states are concatenated to obtain the contextualized user comment vectors $c_1, \dots, c_n, \dots, c_N$. Attention processing is then used to focus on the important comment information and generate the final user comment preference feature vector $S_u$. The attention is calculated as:
$$\beta_n = W_5 \tanh(W_4 c_n + b_1) + b_2$$
wherein $\beta_n$ is the degree of attention of the $n$-th comment, obtained using a single-layer neural network; $W_4, W_5$ are weight matrices; $b_1, b_2$ are bias vectors; and $g_n$ is the weight of the $n$-th comment obtained after the normalization operation.
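A minimal sketch of the comment-level attention in Step 3 follows. The score $\beta_n$ uses the formula given in the text; the normalization that produces $g_n$ and the way $S_u$ is formed are not spelled out, so softmax and a weighted sum of the $c_n$ are assumptions here, as are the function name and the shapes.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def comment_attention(C, W4, b1, w5, b2):
    """Attention over the N contextualized comment vectors.

    C:  (N, d) contextualized comment vectors c_n
    W4: (k, d), b1: (k,), w5: (k,), b2: scalar
    Returns the weights g (N,) and the preference vector S_u (d,).
    """
    beta = np.tanh(C @ W4.T + b1) @ w5 + b2   # beta_n = W5 tanh(W4 c_n + b1) + b2
    g = softmax(beta)                         # assumed normalization
    return g, g @ C                           # assumed: S_u = sum_n g_n c_n

rng = np.random.default_rng(1)
N, d, k = 5, 6, 4
C = rng.normal(size=(N, d))
g, S_u = comment_attention(C, rng.normal(size=(k, d)), rng.normal(size=k),
                           rng.normal(size=k), 0.1)
```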
Step 4: dividing the clothing image most recently purchased by the user into m regions, extracting the clothing image feature vector of each region, and using the contextualized user comment vectors to guide attention over the clothing image features, so as to obtain the clothing image feature vector guided by the user comments;
Specifically, when a clothing image is processed, the user's attention is generally related only to specific regions of the input image. Therefore, instead of using a global vector as the image feature, the image is divided into $m$ regions and a feature vector is extracted for each region. The rating-guided user comment features are then used to attend over the clothing image features, filtering noise and finding the regions most relevant to the user's preferences. First, the clothing image is scaled to 224 × 224 and divided into m × N regions, and feature extraction is performed on each region using a pre-trained 19-layer VGG network, giving the original clothing image feature vectors $v_I = \{v_i \mid v_i \in \mathbb{R}^d,\ i = 1, \dots, m\}$. The contextualized user comment vectors $c_1, \dots, c_n, \dots, c_N$ obtained from the user comments are then aggregated into a single vector using average pooling:
$$c_s = \frac{1}{N} \sum_{n=1}^{N} c_n$$
wherein $c_s$ is the single vector obtained by average pooling over $c_1, \dots, c_n, \dots, c_N$.
Attention processing is applied to the original clothing image feature vectors, filtering noise to obtain features that highlight the key regions of the clothing image. For computational convenience, a fully connected layer converts each image feature vector into a vector with the same dimension as the comment vector. The attention is calculated as:
$$\delta_I = \tanh(W_6 v_I \odot W_7 c_s)$$
wherein $\delta_I$ is the degree of attention of $c_s$ to $v_I$; $v_I \in \mathbb{R}^{d \times m}$; $c_s \in \mathbb{R}^d$; $W_6, W_7$ are weight matrices; $\odot$ denotes the connection of the vectors; $p_I \in \mathbb{R}^m$ is a vector over the $m$ regions giving the attention probability of each region, and $p_i$ is the entry of $p_I$ for the $i$-th region; and $V_L$ is the clothing image feature vector guided by the user comments.
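The comment-guided image attention of Step 4 can be sketched as below. Several pieces are assumptions: the text describes $\odot$ only as a connection of the two vectors, and this sketch interprets it as element-wise multiplication broadcast over the regions; the formulas for $p_I$ and $V_L$ are not given, so a softmax over a learned projection (the hypothetical weight `w_p`) and a probability-weighted sum of region features are used.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def image_attention(V, C, W6, W7, w_p):
    """Comment-guided attention over m image regions (assumed form).

    V:   (d, m) region feature vectors v_I, one column per region
    C:   (N, d) contextualized comment vectors c_1..c_N
    W6:  (k, d), W7: (k, d), w_p: (k,)  (w_p is a hypothetical projection)
    Returns c_s (d,), region probabilities p_I (m,), attended feature V_L (d,).
    """
    c_s = C.mean(axis=0)                                # average pooling of the c_n
    # ASSUMPTION: the "connection" is taken as a broadcast element-wise product.
    delta = np.tanh((W6 @ V) * (W7 @ c_s)[:, None])     # (k, m)
    p_I = softmax(w_p @ delta)                          # attention probability per region
    return c_s, p_I, V @ p_I                            # V_L = sum_i p_i v_i

rng = np.random.default_rng(2)
d, m, N, k = 6, 4, 5, 3
V = rng.normal(size=(d, m))
C = rng.normal(size=(N, d))
c_s, p_I, V_L = image_attention(V, C, rng.normal(size=(k, d)),
                                rng.normal(size=(k, d)), rng.normal(size=k))
```

In this reading, regions whose projected features agree with the pooled comment vector receive higher probability and dominate $V_L$.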
Step 5: concatenating the user comment preference feature vector and the clothing image feature vector guided by the user comments to generate a user preference feature vector, which is the output of the Encoder part;
That is, the obtained user comment features and the clothing image features guided by the user comments are concatenated to generate the user preference feature vector, which is output by the Encoder part of the network.
Step 6: the user preference feature vector is input to a Decoder (Decoder) section, the probability distribution of the set of candidate apparel items is computed, and the apparel item with the highest probability is selected as the next recommendation.
The Decoder part is mainly a GRU network, and during training, a user preference feature vector and a clothing item sequence are used for training, and clothing items are mapped into features with fixed lengths to be used as GRU input.
xt=W8It,t∈{1,…,n}
ht+1=GRU(xt),t∈{1,…,n}
Wherein I ═ I (I)1,…,In) Is a sequence of clothing items, each item represented as a one-hot vector Ht,W8Is a weight vector, xtIs the clothing item embedding feature at time t; h ist+1And representing the hidden state of the GRU model output at the t +1 moment in the training process.
In prediction, the previous output h is givent-1Generating the next output h through GRUt. And at each time step, generating the probability distribution of each clothing item at the time t by adopting a single-layer full-connection network and a softmax function, and finally selecting the item with the highest probability as the next item recommendation.
$$P_t = \mathrm{softmax}(W_i h_s)$$
wherein $P_t$ is the probability distribution over the clothing items generated at time $t$; $h_s \in \{h_1, \dots, h_n\}$ is the input; and $W_i$ is a weight parameter.
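By way of illustration only, the prediction step described above can be sketched in plain Python. The sketch assumes a standard GRU cell without bias terms, random toy weights, and that the user preference feature vector initializes the hidden state; the names `gru_step`, `recommend_next`, `W_out` and all dimensions are hypothetical and are not taken from the embodiment.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def gru_step(x, h, p):
    """One standard GRU update (biases omitted for brevity; an assumption)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def recommend_next(item_ids, h0, W8, p, W_out):
    """Run the item sequence through the GRU, then pick the most probable item."""
    h = h0  # assumed: the user preference feature vector is the initial hidden state
    n_items = len(W_out)
    for idx in item_ids:
        one_hot = [1.0 if j == idx else 0.0 for j in range(n_items)]
        x = matvec(W8, one_hot)        # x_t = W_8 I_t
        h = gru_step(x, h, p)
    probs = softmax(matvec(W_out, h))  # P_t = softmax(W_i h_s)
    best = max(range(n_items), key=lambda j: probs[j])
    return best, probs

rng = random.Random(0)
n_items, d_emb, d_hid = 5, 3, 4
W8 = rand_matrix(d_emb, n_items, rng)
# "W*" matrices act on the input x, "U*" matrices act on the hidden state h.
params = {k: rand_matrix(d_hid, d_emb if k[0] == "W" else d_hid, rng)
          for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
W_out = rand_matrix(n_items, d_hid, rng)
h0 = [rng.uniform(-0.5, 0.5) for _ in range(d_hid)]
best, probs = recommend_next([0, 3, 1], h0, W8, params, W_out)
print(best, [round(q, 3) for q in probs])
```

Here `best` is the index of the clothing item with the highest probability, corresponding to the next recommendation.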
The objective function of the model training is:
wherein $H_t$ is the true label at time $t$; $H_0$ is the output of the Encoder part (i.e. the user preference feature vector); $H_{\{1:t-1\}}$ is the preceding sequence of items; $\theta$ represents all parameters of the model; and $\lambda$ is the regularization parameter. The objective function is optimized with stochastic gradient descent (SGD): at each step one training example is selected at random and the model parameters are updated in the direction of the negative gradient.
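A minimal sketch of one plausible reading of this training procedure follows. It assumes that the objective is the negative log-likelihood of the true item plus an L2 penalty weighted by λ, and it updates only a single softmax layer rather than the full Encoder and Decoder; both are assumptions made for illustration, and the names `loss_and_grad`, `sgd` and `mean_loss` are hypothetical.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def loss_and_grad(W, h, target, lam):
    """Assumed objective for one example: -log P(target) + lam * ||W||^2, with its gradient."""
    probs = softmax([sum(w * v for w, v in zip(row, h)) for row in W])
    l2 = sum(w * w for row in W for w in row)
    loss = -math.log(probs[target]) + lam * l2
    grad = [[(probs[j] - (1.0 if j == target else 0.0)) * hv + 2 * lam * W[j][k]
             for k, hv in enumerate(h)] for j in range(len(W))]
    return loss, grad

def sgd(W, examples, lam=0.01, lr=0.1, steps=200, seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        h, target = rng.choice(examples)  # randomly select one training example
        _, grad = loss_and_grad(W, h, target, lam)
        # move the parameters in the direction of the negative gradient
        W = [[w - lr * g for w, g in zip(w_row, g_row)] for w_row, g_row in zip(W, grad)]
    return W

def mean_loss(W, examples, lam=0.01):
    return sum(loss_and_grad(W, h, t, lam)[0] for h, t in examples) / len(examples)

examples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]  # toy (hidden state, true item) pairs
W_start = [[0.0, 0.0], [0.0, 0.0]]
W_trained = sgd(W_start, examples)
print(round(mean_loss(W_start, examples), 3), round(mean_loss(W_trained, examples), 3))
```

The mean loss after training is lower than at the all-zero starting point, which is the behaviour expected of updates along the negative gradient.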
On the basis of the above embodiment, as shown in Fig. 2, the present invention further provides a personalized clothing recommendation system based on attention perception and fusing multi-modal data, comprising:
a hidden factor vector extraction module, configured to extract the user's hidden factor vector matrix from the user scoring data by using the LFM;
a user comment feature extraction module, configured to mine feature information of the user comments by using a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the user comments to generate word-level attention vectors;
a user comment preference feature vector obtaining module, configured to apply a BiGRU to the word-level attention vectors, concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then to obtain the user comment preference feature vector based on the contextualized user comment feature vectors;
a user comment guidance module, configured to divide the image of the clothing most recently purchased by the user into m regions, extract the clothing image feature vector of each region, and apply attention guidance to the clothing image features by using the contextualized user comment vector, to obtain the clothing image feature vector generated under user comment guidance;
a user preference feature vector obtaining module, configured to concatenate the obtained user comment preference feature vector and the clothing image feature vector generated under user comment guidance to generate the user preference feature vector, which serves as the output of the encoder part;
and a recommending module, configured to input the user preference feature vector into the decoder part, compute the probability distribution over the set of candidate clothing items, and select the clothing item with the highest probability as the next recommendation.
Further, the hidden factor vector extraction module is specifically configured to:
construct a user-clothing item scoring matrix, with users as the rows of the matrix and clothing items as its columns:
wherein $r_{u,i}$ is the score given by user $u$ to clothing item $i$; $p_{u,k}$ represents the $k$-dimensional hidden factor vector of user $u$; $q_{i,k}$ represents the $k$-dimensional hidden factor vector of clothing item $i$; $F$ represents the dimension of the vectors; and $p_u$ represents the hidden factor vector matrix of the user.
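For illustration, a latent factor model of this kind can be sketched as follows. The sketch assumes the usual LFM form in which the predicted score is the inner product $\sum_{k=1}^{F} p_{u,k} q_{i,k}$, fitted by stochastic gradient descent with L2 regularization; the function names, hyperparameters and toy scores are hypothetical and are not taken from the embodiment.

```python
import random

def predict(P, Q, u, i):
    """Predicted score: inner product of the user and item hidden factor vectors."""
    return sum(pk * qk for pk, qk in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    sq = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (sq / len(ratings)) ** 0.5

def train_lfm(ratings, n_users, n_items, F=2, lr=0.02, lam=0.02, epochs=300, seed=0):
    """Factorize the sparse user-item scoring matrix into P (users) and Q (items)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(F)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(F)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(F):
                p_uk, q_ik = P[u][k], Q[i][k]
                P[u][k] += lr * (err * q_ik - lam * p_uk)
                Q[i][k] += lr * (err * p_uk - lam * q_ik)
    return P, Q

# Observed (user, item, score) triples; unobserved cells are simply absent.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
           (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 2, 4)]
P_init, Q_init = train_lfm(ratings, 4, 3, epochs=0)  # untrained factors, same seed
P, Q = train_lfm(ratings, 4, 3)
print(round(rmse(ratings, P_init, Q_init), 3), round(rmse(ratings, P, Q), 3))
```

Each row of `P` plays the role of a user's hidden factor vector, which the later steps use to guide attention over that user's comments.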
Further, the user comment feature extraction module is specifically configured to:
let the user's historical comment set be $S = \{S_1, S_2, \dots, S_n, \dots, S_N\}$, where each comment is represented as a sequence of words $t_1, t_2, \dots, t_{|S|}$;
use pre-trained BERT to obtain the vector embeddings of the words, and use a BiGRU to process the word vector sequence of each comment;
concatenate the forward and backward hidden states of each word to obtain a contextual word vector, thereby obtaining the word sequence $h_t$;
use the user hidden factor vectors obtained from the scores to guide the comments to generate word-level attention vectors;
take the user's hidden factor vector matrix $p_u$ and the word sequence $h_t$ as input and perform attention processing, with the calculation formula:
wherein the attention score represents the degree of attention of user $u$ to word $t$; $W_1, W_2, W_3$ are the weights to be learned; $a_k$ represents the attention weight of the $k$-th word obtained after the normalization operation; and $a$ is the word-level attention vector, which is a summary of comment $S$.
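The following sketch illustrates one way such score-guided word attention could be computed. It assumes an additive form in which the score of word $t$ is $W_1 \tanh(W_2 h_t + W_3 p_u)$, followed by softmax normalization and a weighted sum of the word vectors; this form, the use of a single vector for $p_u$, and all numeric values are assumptions made for illustration only.

```python
import math

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def word_attention(H, p_u, w1, W2, W3):
    """Summarize the word states H of one comment under guidance of user factors p_u."""
    user_part = matvec(W3, p_u)  # identical for every word of the comment
    scores = []
    for h_t in H:
        hidden = [math.tanh(a + b) for a, b in zip(matvec(W2, h_t), user_part)]
        scores.append(sum(w * v for w, v in zip(w1, hidden)))
    weights = softmax(scores)    # a_k: normalized attention weight of each word
    dim = len(H[0])
    summary = [sum(weights[k] * H[k][j] for k in range(len(H))) for j in range(dim)]
    return weights, summary

H = [[0.2, -0.1], [0.9, 0.4], [-0.3, 0.8]]  # toy contextual word vectors h_t
p_u = [0.5, -0.2, 0.1]                       # toy hidden factor vector of user u
w1 = [1.0, -1.0]
W2 = [[0.6, 0.1], [-0.2, 0.7]]
W3 = [[0.3, 0.0, -0.4], [0.1, 0.5, 0.2]]
weights, summary = word_attention(H, p_u, w1, W2, W3)
print([round(w, 3) for w in weights], [round(s, 3) for s in summary])
```

Because the user term enters every score, two users with different hidden factors would weight the words of the same comment differently.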
Further, the user comment preference feature vector obtaining module is specifically configured to:
process the contextualized user comment feature vectors with attention to generate the user comment preference feature vector $S_u$, the attention being calculated as:
$$\beta_n = W_5 \tanh(W_4 c_n + b_1) + b_2$$
wherein $\beta_n$ denotes the degree of attention of the $n$-th comment, obtained using a single-layer neural network; $c_n$ is the contextualized user comment feature vector; $W_4, W_5$ are weight matrices; $b_1, b_2$ are bias vectors; and $g_n$ represents the weight of the $n$-th comment obtained after the normalization operation.
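As a sketch of the comment-level attention, the code below evaluates $\beta_n = W_5 \tanh(W_4 c_n + b_1) + b_2$ for each comment. It assumes that $W_5$ reduces to a row vector and $b_2$ to a scalar so that each $\beta_n$ is a single number, that the normalization is a softmax, and that $S_u$ is the $g_n$-weighted sum of the comment vectors; these choices and the toy values are assumptions, not details confirmed by the text.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def comment_attention(C, W4, b1, w5, b2):
    """Weight N comment vectors c_n and pool them into one preference vector S_u."""
    betas = []
    for c_n in C:
        hidden = [math.tanh(sum(w * v for w, v in zip(row, c_n)) + b)
                  for row, b in zip(W4, b1)]
        betas.append(sum(w * h for w, h in zip(w5, hidden)) + b2)  # beta_n
    g = softmax(betas)  # g_n: normalized weight of the n-th comment (assumed softmax)
    S_u = [sum(g[n] * C[n][j] for n in range(len(C))) for j in range(len(C[0]))]
    return betas, g, S_u

C = [[0.2, 0.4], [0.9, -0.1], [0.0, 0.7]]  # toy contextualized comment vectors c_n
W4 = [[0.5, -0.3], [0.2, 0.8]]
b1 = [0.0, 0.1]
w5 = [1.0, -0.5]
b2 = 0.05
betas, g, S_u = comment_attention(C, W4, b1, w5, b2)
print([round(x, 3) for x in g], [round(x, 3) for x in S_u])
```

Since the weights are non-negative and sum to one, every component of `S_u` lies between the smallest and largest value of that component across the comments.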
Further, the user comment guidance module is specifically configured to:
divide the image of the clothing most recently purchased by the user into $m$ regions;
perform feature extraction on each region by using a VGG network to obtain the original clothing image feature vectors;
then summarize the contextualized user comment feature vectors obtained from the user comments into a single vector $c_s$ by average pooling;
perform attention processing on the original clothing image feature vectors to filter out noise and obtain the clothing image feature vector generated under user comment guidance, the attention being calculated as:
$$\delta_I = \tanh(W_6 v_I \odot W_7 c_s)$$
wherein $\delta_I$ denotes the degree of attention of $c_s$ to $v_I$; $v_I = \{v_i \mid v_i \in \mathbb{R}^{d}, i = 1, \dots, m\}$, $v_I \in \mathbb{R}^{d \times m}$, is the original clothing image feature vector; $c_s \in \mathbb{R}^{d}$; $W_6, W_7$ are weight matrices; $\odot$ denotes the connection of vectors; $p_I \in \mathbb{R}^{m}$ is a vector over the $m$ regions giving the attention probability of each region, and $p_i$ is the entry of $p_I$ corresponding to the $i$-th region; $v_L$ is the clothing image feature vector generated under user comment guidance.
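The comment-guided image attention can be sketched as below, under several assumptions that the text does not confirm: $\odot$ is read as an element-wise product (concatenation may be intended instead), the region probabilities $p_I$ are obtained by a softmax over a score computed with a hypothetical vector `w_p`, and $v_L$ is the $p_i$-weighted sum of the region vectors. The region features are toy values standing in for VGG outputs.

```python
import math

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def average_pool(C):
    """Summarize several comment vectors into a single vector c_s."""
    return [sum(col) / len(C) for col in zip(*C)]

def image_attention(V, c_s, W6, W7, w_p):
    """Weight the m region vectors in V by their relevance to the comment vector c_s."""
    comment_part = matvec(W7, c_s)
    scores = []
    for v_i in V:
        # delta_i = tanh(W6 v_i (.) W7 c_s), with (.) taken as element-wise product (assumption)
        delta = [math.tanh(a * b) for a, b in zip(matvec(W6, v_i), comment_part)]
        scores.append(sum(w * d for w, d in zip(w_p, delta)))  # w_p is hypothetical
    p = softmax(scores)  # p_I: attention probability of each region
    v_L = [sum(p[i] * V[i][j] for i in range(len(V))) for j in range(len(V[0]))]
    return p, v_L

V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]       # toy features of m = 3 image regions
c_s = average_pool([[0.9, -0.2], [0.7, -0.4]])  # pooled comment vector
W6 = [[1.0, 0.2], [0.1, 1.0]]
W7 = [[0.5, 0.0], [0.0, 0.5]]
w_p = [1.0, 1.0]
p, v_L = image_attention(V, c_s, W6, W7, w_p)
print([round(x, 3) for x in p], [round(x, 3) for x in v_L])
```

Regions whose features align with the comment vector receive a larger share of the probability mass, which is how less relevant regions are down-weighted as noise.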
In summary, first, collaborative filtering algorithms based on matrix decomposition suffer from data sparsity, which limits the precision of the recommendation system. To address this, the present invention uses the user scoring data and the comment information jointly, guiding the model to learn more reasonable user features by adding more data and thereby improving its prediction precision. Compared with scoring data alone, the text comments written by users are an important source of user preference features and contain more specific and subtle information about user preferences, so fusing the comment data can effectively improve the precision of the recommendation system. Second, fine-grained clothing image features are used to represent the semantic information of clothing products, replacing the approach of representing image features with the whole clothing image; such features can more accurately capture what influences the user's purchase intention. Finally, if clothing recommendation directly uses the comment text features extracted by a bidirectional GRU model, or the overall feature vector of the clothing image captured by a VGG model, the final recommendation result may be affected by the noise data they contain. The present invention therefore introduces an attention mechanism that lets the model focus on information related to the user's features. The user's scoring data first guides the generation of the user's comment features: scores explicitly express the user's preferences, and these preferences can be used to adjust the weights of the word vectors in each comment, yielding a more accurate comment feature representation.
Because user comments may contain intuitive words that express user preferences and indicate that the user pays more attention to a certain local region of the clothing image, the comment features processed by the attention mechanism are used to guide the generation of the clothing image features, yielding clothing feature vectors more relevant to the user's preferences. In this way, more accurate comment and clothing image feature vectors that match the user's personalized preferences are obtained from the user comments and the clothing images respectively. The user comment feature vector and the clothing image feature vector are then concatenated, and a GRU model performs sequential recommendation to produce the next personalized recommendation for the user.
Compared with existing recommendation methods, the present invention fuses multi-modal data for recommendation: user scores, user comments and clothing images serve as the inputs of the model, and the user scoring data is used to apply attention processing to the user comments, yielding more accurate user comment features. Features are extracted after the clothing image is divided into regions, so that different parts of the image are represented at a finer granularity and more accurate, finer-grained feature vectors are obtained. Fusing multi-source data for recommendation in this way addresses the data sparsity problem in clothing recommendation systems.
Existing recommendation systems and methods simply concatenate the vectors of the multi-modal data or fuse them with a fully connected layer. By contrast, the present invention performs mutually guided attention processing between the modalities: user scores guide the attention processing of user comments, and user comments in turn guide the attention processing of clothing images. The attention mechanism filters the noise data in the user comments and clothing images, so that multi-dimensional comment and clothing image features more relevant to the user are obtained, the user's personalized preferences are reflected more accurately, and problems such as insufficient mining of user interests and insufficient recommendation precision are addressed.
The above describes only the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements shall also fall within the protection scope of the present invention.
Claims (6)
1. A personalized clothing recommendation method based on attention perception and fusing multi-modal data, characterized by comprising the following steps:
step 1: extracting the user's hidden factor vector matrix from the user scoring data by using the LFM;
step 2: mining feature information of the user comments by using a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the user comments to generate word-level attention vectors;
step 3: applying a BiGRU to the word-level attention vectors, concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining the user comment preference feature vector based on the contextualized user comment feature vectors;
step 4: dividing the image of the clothing most recently purchased by the user into m regions, extracting the clothing image feature vector of each region, and applying attention guidance to the clothing image features by using the contextualized user comment vector, to obtain the clothing image feature vector generated under user comment guidance;
step 5: concatenating the obtained user comment preference feature vector and the clothing image feature vector generated under user comment guidance to generate the user preference feature vector, which serves as the output of the encoder part;
step 6: inputting the user preference feature vector into the decoder part, computing the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
2. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 1 comprises:
constructing a user-clothing item scoring matrix, with users as the rows of the matrix and clothing items as its columns:
wherein $r_{u,i}$ is the score given by user $u$ to clothing item $i$; $p_{u,k}$ represents the $k$-dimensional hidden factor vector of user $u$; $q_{i,k}$ represents the $k$-dimensional hidden factor vector of clothing item $i$; $F$ represents the dimension of the vectors; and $p_u$ represents the hidden factor vector matrix of the user.
3. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 2 comprises:
letting the user's historical comment set be $S = \{S_1, S_2, \dots, S_n, \dots, S_N\}$, where each comment is represented as a sequence of words $t_1, t_2, \dots, t_{|S|}$;
using pre-trained BERT to obtain the vector embeddings of the words, and using a BiGRU to process the word vector sequence of each comment;
concatenating the forward and backward hidden states of each word to obtain a contextual word vector, thereby obtaining the word sequence $h_t$;
using the user hidden factor vectors obtained from the scores to guide the comments to generate word-level attention vectors;
taking the user's hidden factor vector matrix $p_u$ and the word sequence $h_t$ as input and performing attention processing, with the calculation formula:
4. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 3 comprises:
processing the contextualized user comment feature vectors with attention to generate the user comment preference feature vector $S_u$, the attention being calculated as:
$$\beta_n = W_5 \tanh(W_4 c_n + b_1) + b_2$$
wherein $\beta_n$ denotes the degree of attention of the $n$-th comment, obtained using a single-layer neural network; $c_n$ is the contextualized user comment feature vector; $W_4, W_5$ are weight matrices; $b_1, b_2$ are bias vectors; and $g_n$ represents the weight of the $n$-th comment obtained after the normalization operation.
5. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 4 comprises:
dividing the image of the clothing most recently purchased by the user into $m$ regions;
performing feature extraction on each region by using a VGG network to obtain the original clothing image feature vectors;
then summarizing the contextualized user comment feature vectors obtained from the user comments into a single vector $c_s$ by average pooling;
performing attention processing on the original clothing image feature vectors to filter out noise and obtain the clothing image feature vector generated under user comment guidance, the attention being calculated as:
$$\delta_I = \tanh(W_6 v_I \odot W_7 c_s)$$
wherein $\delta_I$ denotes the degree of attention of $c_s$ to $v_I$; $v_I = \{v_i \mid v_i \in \mathbb{R}^{d}, i = 1, \dots, m\}$, $v_I \in \mathbb{R}^{d \times m}$, is the original clothing image feature vector; $c_s \in \mathbb{R}^{d}$; $W_6, W_7$ are weight matrices; $\odot$ denotes the connection of vectors; $p_I \in \mathbb{R}^{m}$ is a vector over the $m$ regions giving the attention probability of each region, and $p_i$ is the entry of $p_I$ corresponding to the $i$-th region; $v_L$ is the clothing image feature vector generated under user comment guidance.
6. A personalized clothing recommendation system based on attention perception and fusing multi-modal data, comprising:
a hidden factor vector extraction module, configured to extract the user's hidden factor vector matrix from the user scoring data by using the LFM;
a user comment feature extraction module, configured to mine feature information of the user comments by using a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the user comments to generate word-level attention vectors;
a user comment preference feature vector obtaining module, configured to apply a BiGRU to the word-level attention vectors, concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then to obtain the user comment preference feature vector based on the contextualized user comment feature vectors;
a user comment guidance module, configured to divide the image of the clothing most recently purchased by the user into m regions, extract the clothing image feature vector of each region, and apply attention guidance to the clothing image features by using the contextualized user comment vector, to obtain the clothing image feature vector generated under user comment guidance;
a user preference feature vector obtaining module, configured to concatenate the obtained user comment preference feature vector and the clothing image feature vector generated under user comment guidance to generate the user preference feature vector, which serves as the output of the encoder part;
and a recommending module, configured to input the user preference feature vector into the decoder part, compute the probability distribution over the set of candidate clothing items, and select the clothing item with the highest probability as the next recommendation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111348060.9A CN113850656B (en) | 2021-11-15 | 2021-11-15 | Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113850656A (en) | 2021-12-28 |
CN113850656B (en) | 2022-08-23 |
Family
ID=78984272
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111348060.9A Active CN113850656B (en) | 2021-11-15 | 2021-11-15 | Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113850656B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN113850656B (en) | 2022-08-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||