CN113850656A - Attention-aware personalized clothing recommendation method and system fusing multi-modal data - Google Patents

Attention-aware personalized clothing recommendation method and system fusing multi-modal data

Info

Publication number
CN113850656A
Authority
CN
China
Prior art keywords
user
vector
attention
comment
clothing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111348060.9A
Other languages
Chinese (zh)
Other versions
CN113850656B (en)
Inventor
田保军
康萌
房建东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Technology filed Critical Inner Mongolia University of Technology
Priority to CN202111348060.9A priority Critical patent/CN113850656B/en
Publication of CN113850656A publication Critical patent/CN113850656A/en
Application granted granted Critical
Publication of CN113850656B publication Critical patent/CN113850656B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Finance (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Accounting & Taxation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an attention-aware personalized clothing recommendation method and system fusing multi-modal data. The user's scoring data guides the generation of the user's comment features, so that the comment features processed by the attention mechanism retain the comment information most relevant to that user; the processed comment features then guide the generation of fine-grained feature vectors for the clothing images, yielding the key image features of the clothing the user cares about. After these two attention stages, clothing feature vectors that are more accurate and better match the user's personalized preferences are obtained. By using the attention mechanism to filter the noise in the user comment and clothing image data, the method obtains multi-dimensional comment and image features that are more relevant to the user, reflects the user's personalized preferences more accurately, and alleviates problems such as insufficient mining of user interests and low recommendation accuracy.

Description

Attention-aware personalized clothing recommendation method and system fusing multi-modal data
Technical Field
The invention belongs to the technical field of intelligent recommendation, and in particular relates to an attention-aware personalized clothing recommendation method and system fusing multi-modal data.
Background
With the rapid development of science and technology and the wide adoption of electronic commerce, large e-commerce platforms have emerged one after another, and the amount of information on the Internet is growing explosively. Recommendation systems are widely applied as an effective means of coping with this "information overload"; their personalized services have permeated people's daily lives and play an increasingly important role.
A recommendation system aims to analyze and mine a user's characteristics, interests and other factors from the user's historical behavior, and then to match, from massive information, the information or services the user is likely to be interested in. Its most important property is that it copes well with vague user demands and can build a model from the user's historical data to capture the user's interests. Personalized recommendation means recommending to a target user the specific goods that user is relatively more interested in, according to the user's individual needs. The most widely used algorithm in personalized clothing recommendation is collaborative filtering: it mines the preference relation between users and items from the historical user-item interaction records and from the information of similar users, predicts a user's preference for items the user has not yet interacted with, and then recommends potentially interesting items to each user according to the prediction. However, collaborative filtering suffers from data sparsity and cannot fully mine users' latent interests; moreover, user-based collaborative filtering uses only the user-item interaction information and ignores the characteristics of the clothing products themselves, such as the visual features of the clothing. As a result, the recommendation results are not accurate and the user experience is poor.
In summary, the existing clothing recommendation methods currently fall into two main categories:
1) Recommendation based on collaborative filtering. Although this approach can produce a reasonably good recommendation effect, it mainly has the following defects: (1) clothing recommendation relies only on sparse scoring data, which leads to data sparsity and insufficient mining of the user's latent interests; (2) relevant information about the clothing item itself, such as the image features of the garment, is ignored. The result is a less accurate recommendation system.
2) Recommendation based on clothing image features. This approach mainly has the following defects: (1) the global features of the clothing image are used as the image representation, so a fine-grained representation of the clothing features is missing; (2) the fact that different users pay attention to different characteristics of the clothes is ignored, so the personalized experience of the recommendation results is poor.
Disclosure of Invention
Aiming at the problems that existing clothing recommendation results are not accurate and the user experience is poor, the invention provides an attention-aware personalized clothing recommendation method and system fusing multi-modal data.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention provides an attention perception-based personalized clothing recommendation method fusing multi-modal data, which comprises the following steps:
Step 1: extracting a hidden factor vector matrix of the user from the user scoring data by using the LFM;
Step 2: mining the feature information of the user comments by using a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the comments in generating word-level attention vectors;
Step 3: applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining the user comment preference feature vector based on the contextualized user comment feature vectors;
Step 4: dividing the clothing image most recently purchased by the user into m regions, extracting a clothing image feature vector for each region, and applying attention guidance to the clothing image features with the contextualized user comment vectors to obtain the comment-guided clothing image feature vector;
Step 5: concatenating the obtained user comment preference feature vector and the comment-guided clothing image feature vector to generate a user preference feature vector, which serves as the output of the encoder part;
Step 6: inputting the user preference feature vector into the decoder part, computing the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
Further, the step 1 comprises:
constructing a user-clothing item scoring matrix with users as the rows of the matrix and clothing items as the columns, and modeling each score with the latent factor decomposition:

r_{u,i} = Σ_{k=1}^{F} p_{u,k} · q_{i,k}

where r_{u,i} is user u's score for clothing item i; p_{u,k} is the k-th dimension of the hidden factor vector of user u; q_{i,k} is the k-th dimension of the hidden factor vector of clothing item i; F is the dimension of the vectors; and p_u is the hidden factor vector matrix of the user.
Further, the step 2 comprises:
setting the user's historical comment set as S = {S_1, S_2, …, S_n, …, S_N}, where every comment is represented as a sequence of words t_1, t_2, …, t_{|S|};
using pre-trained BERT to obtain the embedded vector representations, and using a BiGRU to process the word vector sequence of each comment;
concatenating the forward and backward hidden states of each word to obtain context word vectors, thereby obtaining the word sequence h_t;
guiding the comments to generate word-level attention vectors with the user hidden factor vectors obtained from the scores;
taking the user hidden factor vector matrix p_u and the word sequence h_t as input, attention processing is performed according to:

α_t^u = W_3 tanh(W_1 p_u + W_2 h_t)

a_k = exp(α_k^u) / Σ_j exp(α_j^u)

a = Σ_k a_k h_k

where α_t^u denotes the degree of attention of user u to word t; W_1, W_2, W_3 are the weights to be learned; a_k is the attention weight of the k-th word obtained by the normalization operation; and a is the word-level attention vector, a summary of comment S.
Further, the step 3 comprises:
applying attention to the contextualized user comment feature vectors to generate the user comment preference feature vector S_u, with the attention computed as:

β_n = W_5 tanh(W_4 c_n + b_1) + b_2

g_n = exp(β_n) / Σ_{j=1}^{N} exp(β_j)

S_u = Σ_{n=1}^{N} g_n c_n

where β_n denotes the degree of attention of the n-th comment, obtained with a single-layer neural network; c_n is the contextualized user comment feature vector; W_4, W_5 are weight matrices; b_1, b_2 are bias vectors; and g_n is the weight of the n-th comment obtained after the normalization operation.
Further, the step 4 comprises:
dividing the clothing image most recently purchased by the user into m regions;
performing feature extraction on each region with a VGG network to obtain the original clothing image feature vectors;
then summarizing the contextualized user comment feature vectors obtained from the user comments into a single vector c_s by average pooling;
performing attention processing on the original clothing image feature vectors to filter noise and obtain the clothing image feature vector generated under user comment guidance, with the attention computed as:

δ_I = tanh(W_6 v_I ⊙ W_7 c_s)

p_I = softmax(δ_I)

V_L = Σ_{i=1}^{m} p_i v_i

where δ_I denotes the degree of attention of c_s on v_I; v_I = {v_i | v_i ∈ R^d, i = 1, …, m}, v_I ∈ R^{d×m} are the original clothing image feature vectors; c_s ∈ R^d; W_6, W_7 are weight matrices; ⊙ denotes the connection of the vectors; p_I ∈ R^m is the vector of attention probabilities over the m regions and p_i is its i-th component; and V_L is the clothing image feature vector generated under user comment guidance.
The invention provides an attention-aware personalized clothing recommendation system fusing multi-modal data, which comprises the following components:
the hidden factor vector extraction module, which is used for extracting a hidden factor vector matrix of the user from the user scoring data by using the LFM;
the user comment feature extraction module, which is used for mining the feature information of the user comments with a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the comments in generating word-level attention vectors;
the user comment preference feature vector obtaining module, which is used for applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining the user comment preference feature vector based on them;
the user comment guidance module, which is used for dividing the clothing image most recently purchased by the user into m regions, extracting a clothing image feature vector for each region, and applying attention guidance to the clothing image features with the contextualized user comment vectors to obtain the comment-guided clothing image feature vector;
the user preference feature vector obtaining module, which is used for concatenating the obtained user comment preference feature vector and the comment-guided clothing image feature vector to generate a user preference feature vector, which serves as the output of the encoder part;
and the recommending module, which is used for inputting the user preference feature vector into the decoder part, computing the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
Further, the implicit factor vector extraction module is specifically configured to:
construct a user-clothing item scoring matrix with users as the rows of the matrix and clothing items as the columns, and model each score with the latent factor decomposition:

r_{u,i} = Σ_{k=1}^{F} p_{u,k} · q_{i,k}

where r_{u,i} is user u's score for clothing item i; p_{u,k} is the k-th dimension of the hidden factor vector of user u; q_{i,k} is the k-th dimension of the hidden factor vector of clothing item i; F is the dimension of the vectors; and p_u is the hidden factor vector matrix of the user.
Further, the user comment feature extraction module is specifically configured to:
set the user's historical comment set as S = {S_1, S_2, …, S_n, …, S_N}, where every comment is represented as a sequence of words t_1, t_2, …, t_{|S|};
use pre-trained BERT to obtain the embedded vector representations, and use a BiGRU to process the word vector sequence of each comment;
concatenate the forward and backward hidden states of each word to obtain context word vectors, thereby obtaining the word sequence h_t;
guide the comments to generate word-level attention vectors with the user hidden factor vectors obtained from the scores;
taking the user hidden factor vector matrix p_u and the word sequence h_t as input, attention processing is performed according to:

α_t^u = W_3 tanh(W_1 p_u + W_2 h_t)

a_k = exp(α_k^u) / Σ_j exp(α_j^u)

a = Σ_k a_k h_k

where α_t^u denotes the degree of attention of user u to word t; W_1, W_2, W_3 are the weights to be learned; a_k is the attention weight of the k-th word obtained by the normalization operation; and a is the word-level attention vector, a summary of comment S.
Further, the user comment preference feature vector derivation module is specifically configured to:
apply attention to the contextualized user comment feature vectors to generate the user comment preference feature vector S_u, with the attention computed as:

β_n = W_5 tanh(W_4 c_n + b_1) + b_2

g_n = exp(β_n) / Σ_{j=1}^{N} exp(β_j)

S_u = Σ_{n=1}^{N} g_n c_n

where β_n denotes the degree of attention of the n-th comment, obtained with a single-layer neural network; c_n is the contextualized user comment feature vector; W_4, W_5 are weight matrices; b_1, b_2 are bias vectors; and g_n is the weight of the n-th comment obtained after the normalization operation.
Further, the user comment guidance module is specifically configured to:
divide the clothing image most recently purchased by the user into m regions;
perform feature extraction on each region with a VGG network to obtain the original clothing image feature vectors;
then summarize the contextualized user comment feature vectors obtained from the user comments into a single vector c_s by average pooling;
perform attention processing on the original clothing image feature vectors to filter noise and obtain the clothing image feature vector generated under user comment guidance, with the attention computed as:

δ_I = tanh(W_6 v_I ⊙ W_7 c_s)

p_I = softmax(δ_I)

V_L = Σ_{i=1}^{m} p_i v_i

where δ_I denotes the degree of attention of c_s on v_I; v_I = {v_i | v_i ∈ R^d, i = 1, …, m}, v_I ∈ R^{d×m} are the original clothing image feature vectors; c_s ∈ R^d; W_6, W_7 are weight matrices; ⊙ denotes the connection of the vectors; p_I ∈ R^m is the vector of attention probabilities over the m regions and p_i is its i-th component; and V_L is the clothing image feature vector generated under user comment guidance.
Compared with the prior art, the invention has the following beneficial effects:
compared with the prior recommendation system/method, the method disclosed by the invention has the advantages that the multi-mode data is fused for recommendation, the user score, the user comment and the clothing image are specifically used as the input of the model, the user score data is used for carrying out attention processing on the user comment, and the more accurate user comment characteristic is obtained. The features are extracted after the clothing image is segmented, so that the features of different parts of the image can be represented in a finer granularity, and more accurate feature vectors in a finer granularity are obtained. Therefore, multi-source data are fused for recommendation, and the problem of data sparsity in a clothing recommendation system is solved.
Compared with existing recommendation systems and methods that simply concatenate the vectors of different modalities or fuse them with a fully connected layer, the method performs mutually guided attention processing between the modalities: the user scores guide the attention processing of the user comments, and the user comments in turn guide the attention processing of the clothing images. By using the attention mechanism to filter the noise in the user comment and clothing image data, the method obtains multi-dimensional comment and image features that are more relevant to the user, reflects the user's personalized preferences more accurately, and alleviates problems such as insufficient mining of user interests and low recommendation accuracy.
Drawings
FIG. 1 is a basic flowchart of the attention-aware personalized clothing recommendation method fusing multi-modal data according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the architecture of the attention-aware personalized clothing recommendation system fusing multi-modal data according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
As shown in FIG. 1, the attention-aware personalized clothing recommendation method fusing multi-modal data includes:
Step 1: extracting a hidden factor vector matrix of the user from the user scoring data by using an LFM (latent factor model);
Specifically, a user hidden factor vector and a clothing hidden factor vector, which reflect the user's interests, are extracted from the user scoring data with the LFM. Let the user set be U = {u_1, u_2, …, u_m} and the clothing item set be I = {i_1, i_2, …, i_n}. A user-clothing item scoring matrix R is constructed with users as rows and clothing items as columns, where R_{u,i} is user u's score for clothing item i, modeled by the latent factor decomposition:

r_{u,i} = Σ_{k=1}^{F} p_{u,k} · q_{i,k}

where p_{u,k} is the k-th dimension of the hidden factor vector of user u, q_{i,k} is the k-th dimension of the hidden factor vector of clothing item i, and F is the dimension of the vectors. Finally, the hidden factor vector matrix p_u, which represents the user's preferences, is obtained.
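As an aid to understanding, the following is an illustrative sketch (not the patent's reference implementation) of such a latent factor model in PyTorch; the factor dimension, the squared-error training loss and all sizes are assumptions made for the example.

```python
# Minimal LFM sketch: learn user/item hidden factor vectors from observed scores.
# All dimensions and the training setup are illustrative assumptions.
import torch
import torch.nn as nn

class LFM(nn.Module):
    def __init__(self, n_users: int, n_items: int, n_factors: int = 32):
        super().__init__()
        self.p = nn.Embedding(n_users, n_factors)  # p_u: user hidden factor vectors
        self.q = nn.Embedding(n_items, n_factors)  # q_i: clothing item hidden factor vectors

    def forward(self, users, items):
        # predicted score r_{u,i} = sum_k p_{u,k} * q_{i,k}
        return (self.p(users) * self.q(items)).sum(dim=-1)

# toy usage: fit to observed (user, item, score) triples with SGD
model = LFM(n_users=100, n_items=50)
users, items = torch.tensor([0, 1, 2]), torch.tensor([3, 4, 5])
scores = torch.tensor([5.0, 3.0, 4.0])
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
for _ in range(200):
    opt.zero_grad()
    loss = ((model(users, items) - scores) ** 2).mean()
    loss.backward()
    opt.step()
p_u = model.p.weight.detach()  # hidden factor vector matrix later used to guide comment attention
```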
Step 2: mining feature information of the user comments by using a BiGRU and an attention mechanism (specifically, processing comment texts by using the BiGRU to obtain user comment features containing context information, and then obtaining more relevant user comment features by using the attention mechanism), wherein the method comprises the steps of using a hidden factor vector matrix of a user to guide the user comments to generate word-level attention vectors;
Specifically, the BiGRU and the attention mechanism are used to deeply mine the feature information of the user comments, since a user's comment text contains important preference information and specific details related to clothing. When extracting information from comments, it is important to distinguish relevant comments from noisy ones and to identify the important parts within each comment. Let the user's historical comment set be S = {S_1, S_2, …, S_n, …, S_N}, where every comment is represented as a sequence of words t_1, t_2, …, t_{|S|}. The embedded vector representations are first obtained with pre-trained BERT, and the word vector sequence of each comment is processed with a BiGRU to capture context information. The forward and backward hidden states of each word are concatenated to obtain context word vectors, giving the word sequence h_t. Because the word sequences of the texts are long, the final feature vector produced by the BiGRU in practice is biased toward the last few words of the sequence. To address this, an attention mechanism uses the user hidden factor vector obtained from the scores to guide the comment in generating a word-level attention vector, so that the user's preferences are fully exploited to adjust the weight of each word vector and a more compact and accurate representation of the comment content is obtained. Taking the user hidden factor vector matrix p_u and the word sequence h_t as input, attention processing is performed according to:

α_t^u = W_3 tanh(W_1 p_u + W_2 h_t)

a_k = exp(α_k^u) / Σ_j exp(α_j^u)

a = Σ_k a_k h_k

where α_t^u denotes the degree of attention of user u to word t, obtained by applying the linear transformations W_1 and W_2 to p_u and h_t, extracting non-linear semantic information with a non-linear activation function, and applying a final linear transformation; W_1, W_2, W_3 are the weights to be learned; a_k is the attention weight of the k-th word obtained by the normalization operation; and a is the word-level attention vector, a summary of comment S. The computation is repeated for each comment S_1, …, S_N, yielding a_1, …, a_N.
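A minimal sketch of this score-guided word-level attention is given below, again as an illustration rather than the patent's implementation: the BiGRU runs over token embeddings (random tensors stand in for pre-trained BERT outputs), the scoring form W_3·tanh(W_1·p_u + W_2·h_t) follows the description above, and all layer sizes are assumptions.

```python
# Word-level attention over BiGRU states, guided by the user's hidden factor vector p_u.
import torch
import torch.nn as nn

class WordAttention(nn.Module):
    def __init__(self, word_dim=768, gru_dim=64, user_dim=32, att_dim=64):
        super().__init__()
        self.bigru = nn.GRU(word_dim, gru_dim, batch_first=True, bidirectional=True)
        self.W1 = nn.Linear(user_dim, att_dim, bias=False)     # transforms p_u
        self.W2 = nn.Linear(2 * gru_dim, att_dim, bias=False)  # transforms h_t
        self.W3 = nn.Linear(att_dim, 1, bias=False)            # scalar attention score per word

    def forward(self, word_embs, p_u):
        # word_embs: (batch, seq_len, word_dim) token embeddings, e.g. from pre-trained BERT
        # p_u:       (batch, user_dim) user hidden factor vector from the LFM
        h, _ = self.bigru(word_embs)                  # (batch, seq_len, 2*gru_dim), forward+backward states
        scores = self.W3(torch.tanh(self.W1(p_u).unsqueeze(1) + self.W2(h)))
        a = torch.softmax(scores, dim=1)              # a_k: normalized word-level attention weights
        return (a * h).sum(dim=1)                     # a: summary vector of one comment

attn = WordAttention()
comment_summary = attn(torch.randn(4, 20, 768), torch.randn(4, 32))  # -> (4, 128)
```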
Step 3: applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining the user comment preference feature vector based on the contextualized user comment feature vectors;
Specifically, to capture global context information in the user comments, a BiGRU is applied to the word-level attention vectors, and the forward and backward hidden states are concatenated to obtain the contextualized user comment vectors c_1, …, c_n, …, c_N. Attention processing is then applied to focus on the important comment information and generate the final user comment preference feature vector S_u, with the attention computed as:

β_n = W_5 tanh(W_4 c_n + b_1) + b_2

g_n = exp(β_n) / Σ_{j=1}^{N} exp(β_j)

S_u = Σ_{n=1}^{N} g_n c_n

where β_n denotes the degree of attention of the n-th comment, obtained with a single-layer neural network; W_4, W_5 are weight matrices; b_1, b_2 are bias vectors; and g_n is the weight of the n-th comment obtained after the normalization operation.
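The comment-level aggregation can be sketched in the same illustrative spirit; the layer sizes are assumptions, and the two linear layers carry the biases b_1 and b_2 of the formula above.

```python
# Comment-level attention: aggregate contextualized comment vectors c_1..c_N into S_u.
import torch
import torch.nn as nn

class CommentAttention(nn.Module):
    def __init__(self, comment_dim=128, att_dim=64):
        super().__init__()
        self.W4 = nn.Linear(comment_dim, att_dim)  # bias plays the role of b_1
        self.W5 = nn.Linear(att_dim, 1)            # bias plays the role of b_2

    def forward(self, c):
        # c: (batch, N, comment_dim) contextualized user comment vectors
        beta = self.W5(torch.tanh(self.W4(c)))     # beta_n: attention degree of each comment
        g = torch.softmax(beta, dim=1)             # g_n: normalized comment weights
        return (g * c).sum(dim=1)                  # S_u: user comment preference feature vector

s_u = CommentAttention()(torch.randn(4, 10, 128))  # -> (4, 128)
```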
Step 4: dividing the clothing image most recently purchased by the user into m regions, extracting a clothing image feature vector for each region, and applying attention guidance to the clothing image features with the contextualized user comment vectors to obtain the comment-guided clothing image feature vector;
Specifically, when a clothing image is processed, the user's attention is generally related to only specific regions of the input image. Therefore, instead of using a global vector as the image feature, the image is divided into m regions and a feature vector is extracted for each region. The score-guided user comment features are then used to attend to the clothing image features, filter noise and find the regions most relevant to the user's preferences. First, the clothing image is scaled to 224 × 224 and divided into m regions, and feature extraction is performed on each region with a pre-trained 19-layer VGG network, giving the original clothing image feature vectors v_I = {v_i | v_i ∈ R^d, i = 1, …, m}. The contextualized user comment vectors c_1, …, c_n, …, c_N obtained from the user comments are then aggregated into a single vector by average pooling:

c_s = (1/N) Σ_{n=1}^{N} c_n

where c_s is the single vector obtained by average pooling over c_1, …, c_n, …, c_N.

Attention processing is applied to the original clothing image feature vectors to filter noise and obtain features that highlight the key regions of the clothing image; for convenience of computation, a fully connected layer first converts each image feature vector to the same dimension as the comment vector. The attention is computed as:

δ_I = tanh(W_6 v_I ⊙ W_7 c_s)

p_I = softmax(δ_I)

V_L = Σ_{i=1}^{m} p_i v_i

where δ_I denotes the degree of attention of c_s on v_I, v_I ∈ R^{d×m}, c_s ∈ R^d; W_6, W_7 are weight matrices; ⊙ denotes the connection of the vectors; p_I ∈ R^m is the vector of attention probabilities over the m regions and p_i is its i-th component; and V_L is the clothing image feature vector generated under user comment guidance.
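A sketch of the comment-guided image attention is shown below. It is illustrative only: random tensors stand in for the VGG-19 region features and the pooled comment vector c_s, the ⊙ operation is treated here as an element-wise product, and the extra scoring layer that maps δ_I to one value per region before the softmax is an assumption; all sizes are likewise assumed.

```python
# Comment-guided attention over the m image region features.
import torch
import torch.nn as nn

class ImageAttention(nn.Module):
    def __init__(self, region_dim=512, comment_dim=128):
        super().__init__()
        self.fc = nn.Linear(region_dim, comment_dim)        # align region features with the comment vector
        self.W6 = nn.Linear(comment_dim, comment_dim, bias=False)
        self.W7 = nn.Linear(comment_dim, comment_dim, bias=False)
        self.score = nn.Linear(comment_dim, 1, bias=False)  # assumed: one attention score per region

    def forward(self, v_regions, c_s):
        # v_regions: (batch, m, region_dim) features of the m clothing image regions (e.g. VGG-19 conv maps)
        # c_s:       (batch, comment_dim) average-pooled contextualized comment vector
        v = self.fc(v_regions)                                       # (batch, m, comment_dim)
        delta = torch.tanh(self.W6(v) * self.W7(c_s).unsqueeze(1))   # delta_I: attention of c_s on each region
        p = torch.softmax(self.score(delta), dim=1)                  # p_I: attention probability of each region
        return (p * v).sum(dim=1)                                    # V_L: comment-guided image feature vector

v_l = ImageAttention()(torch.randn(4, 49, 512), torch.randn(4, 128))  # m = 49 regions -> (4, 128)
```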
Step 5: concatenating the obtained user comment preference feature vector and the comment-guided clothing image feature vector to generate the user preference feature vector, which serves as the output of the Encoder part;
The obtained user comment features and the clothing image features generated under user comment guidance are concatenated into the user preference feature vector, which is output by the Encoder part of the network.
Step 6: inputting the user preference feature vector into the Decoder part, computing the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
The Decoder part is mainly a GRU network. During training, the user preference feature vector and the clothing item sequence are used; the clothing items are mapped into fixed-length features that serve as the GRU input:
x_t = W_8 I_t,  t ∈ {1, …, n}

h_{t+1} = GRU(x_t),  t ∈ {1, …, n}

where I = (I_1, …, I_n) is the clothing item sequence, each item represented as a one-hot vector; W_8 is a weight matrix; x_t is the clothing item embedding feature at time t; and h_{t+1} is the hidden state output by the GRU model at time t + 1 during training.

At prediction time, given the previous output h_{t-1}, the GRU generates the next output h_t. At each time step, a single-layer fully connected network followed by a softmax function produces the probability distribution over the clothing items at time t, and the item with the highest probability is finally selected as the next recommendation.

P_t = softmax(W_i h_s)

where P_t is the probability distribution over the clothing items generated at time t, h_s ∈ {h_1, …, h_n} is the input hidden state, and W_i is a weight parameter.
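The decoder can be sketched as follows, again as an illustration with assumed sizes: the user preference vector (the concatenation of S_u and V_L from the encoder) is used as the initial hidden state of the GRU, items are embedded by W_8, and a single fully connected layer plus softmax gives the probability of each candidate item at every step.

```python
# Decoder sketch: GRU over embedded clothing items, initialized with the user preference vector.
import torch
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self, n_items=1000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.W8 = nn.Embedding(n_items, emb_dim)   # x_t = W_8 I_t (one-hot item -> embedding)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, n_items)   # W_i before the softmax

    def forward(self, item_seq, user_pref):
        # item_seq:  (batch, T) indices of the clothing item sequence
        # user_pref: (batch, hidden_dim) encoder output, i.e. [S_u ; V_L], used as the initial hidden state
        x = self.W8(item_seq)
        h, _ = self.gru(x, user_pref.unsqueeze(0))
        return torch.softmax(self.fc(h), dim=-1)   # P_t: distribution over candidate items at each step

decoder = Decoder()
probs = decoder(torch.randint(0, 1000, (4, 5)), torch.randn(4, 256))
next_item = probs[:, -1].argmax(dim=-1)            # recommend the highest-probability next item
```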
The objective function of the model training is:
L(θ) = - Σ_{t=1}^{n} log P(H_t | H_0, H_{1:t-1}; θ) + λ||θ||^2
where H_t is the true label at time t, H_0 is the output value of the Encoder part (i.e., the user preference feature vector), H_{1:t-1} is the preceding item sequence, θ denotes all parameters of the model, and λ is the regularization parameter. The objective function is optimized with stochastic gradient descent (SGD): at each step a training example is chosen at random and the model parameters are updated in the direction of the negative gradient.
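A training step consistent with this description might look like the sketch below; the exact objective used here (per-step negative log-likelihood of the true item plus an L2 penalty λ||θ||^2, optimized with SGD) is an assumption reconstructed from the surrounding text, and it reuses the Decoder sketched above.

```python
# Illustrative training step: NLL of the true next item at each step + L2 regularization, optimized by SGD.
import torch

def train_step(decoder, optimizer, item_seq, targets, user_pref, lam=1e-4):
    # item_seq: (batch, T) input items; targets: (batch, T) true items H_t at each time step
    optimizer.zero_grad()
    probs = decoder(item_seq, user_pref)                                   # (batch, T, n_items)
    nll = -torch.log(probs.gather(-1, targets.unsqueeze(-1)) + 1e-9).mean()
    l2 = sum((p ** 2).sum() for p in decoder.parameters())                 # ||theta||^2
    loss = nll + lam * l2
    loss.backward()
    optimizer.step()                                                       # SGD step toward the negative gradient
    return loss.item()

# optimizer = torch.optim.SGD(decoder.parameters(), lr=0.01)
```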
On the basis of the above embodiment, as shown in FIG. 2, the present invention further provides an attention-aware personalized clothing recommendation system fusing multi-modal data, including:
the hidden factor vector extraction module, which is used for extracting a hidden factor vector matrix of the user from the user scoring data by using the LFM;
the user comment feature extraction module, which is used for mining the feature information of the user comments with a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the comments in generating word-level attention vectors;
the user comment preference feature vector obtaining module, which is used for applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining the user comment preference feature vector based on them;
the user comment guidance module, which is used for dividing the clothing image most recently purchased by the user into m regions, extracting a clothing image feature vector for each region, and applying attention guidance to the clothing image features with the contextualized user comment vectors to obtain the comment-guided clothing image feature vector;
the user preference feature vector obtaining module, which is used for concatenating the obtained user comment preference feature vector and the comment-guided clothing image feature vector to generate a user preference feature vector, which serves as the output of the encoder part;
and the recommending module, which is used for inputting the user preference feature vector into the decoder part, computing the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
Further, the implicit factor vector extraction module is specifically configured to:
construct a user-clothing item scoring matrix with users as the rows of the matrix and clothing items as the columns, and model each score with the latent factor decomposition:

r_{u,i} = Σ_{k=1}^{F} p_{u,k} · q_{i,k}

where r_{u,i} is user u's score for clothing item i; p_{u,k} is the k-th dimension of the hidden factor vector of user u; q_{i,k} is the k-th dimension of the hidden factor vector of clothing item i; F is the dimension of the vectors; and p_u is the hidden factor vector matrix of the user.
Further, the user comment feature extraction module is specifically configured to:
set the user's historical comment set as S = {S_1, S_2, …, S_n, …, S_N}, where every comment is represented as a sequence of words t_1, t_2, …, t_{|S|};
use pre-trained BERT to obtain the embedded vector representations, and use a BiGRU to process the word vector sequence of each comment;
concatenate the forward and backward hidden states of each word to obtain context word vectors, thereby obtaining the word sequence h_t;
guide the comments to generate word-level attention vectors with the user hidden factor vectors obtained from the scores;
taking the user hidden factor vector matrix p_u and the word sequence h_t as input, attention processing is performed according to:

α_t^u = W_3 tanh(W_1 p_u + W_2 h_t)

a_k = exp(α_k^u) / Σ_j exp(α_j^u)

a = Σ_k a_k h_k

where α_t^u denotes the degree of attention of user u to word t; W_1, W_2, W_3 are the weights to be learned; a_k is the attention weight of the k-th word obtained by the normalization operation; and a is the word-level attention vector, a summary of comment S.
Further, the user comment preference feature vector derivation module is specifically configured to:
apply attention to the contextualized user comment feature vectors to generate the user comment preference feature vector S_u, with the attention computed as:

β_n = W_5 tanh(W_4 c_n + b_1) + b_2

g_n = exp(β_n) / Σ_{j=1}^{N} exp(β_j)

S_u = Σ_{n=1}^{N} g_n c_n

where β_n denotes the degree of attention of the n-th comment, obtained with a single-layer neural network; c_n is the contextualized user comment feature vector; W_4, W_5 are weight matrices; b_1, b_2 are bias vectors; and g_n is the weight of the n-th comment obtained after the normalization operation.
Further, the user comment guidance module is specifically configured to:
divide the clothing image most recently purchased by the user into m regions;
perform feature extraction on each region with a VGG network to obtain the original clothing image feature vectors;
then summarize the contextualized user comment feature vectors obtained from the user comments into a single vector c_s by average pooling;
perform attention processing on the original clothing image feature vectors to filter noise and obtain the clothing image feature vector generated under user comment guidance, with the attention computed as:

δ_I = tanh(W_6 v_I ⊙ W_7 c_s)

p_I = softmax(δ_I)

V_L = Σ_{i=1}^{m} p_i v_i

where δ_I denotes the degree of attention of c_s on v_I; v_I = {v_i | v_i ∈ R^d, i = 1, …, m}, v_I ∈ R^{d×m} are the original clothing image feature vectors; c_s ∈ R^d; W_6, W_7 are weight matrices; ⊙ denotes the connection of the vectors; p_I ∈ R^m is the vector of attention probabilities over the m regions and p_i is its i-th component; and V_L is the clothing image feature vector generated under user comment guidance.
In summary, first, to address the low precision caused by data sparsity in matrix-factorization-based collaborative filtering, the method and system jointly use the user scoring data and the comment information, adding more data to guide the model toward more reasonable user features and thereby improving its prediction precision. Compared with scoring data alone, the text comments written by a user are an important source of user preference features and contain more specific and subtle details about those preferences, so fusing the comment data effectively improves the precision of the recommendation system. Second, fine-grained clothing image features are used to represent the semantic information of clothing products, which captures the user's purchase intention more precisely and replaces the practice of representing image features with the whole clothing image. Finally, for the user comments and the clothing images, directly using the comment text extracted by a bidirectional GRU model or the global image vector captured by a VGG model would let noise in this data affect the final recommendation result. The invention therefore introduces an attention mechanism that lets the model focus on information related to the user's features. First, the user's scoring data guides the generation of the comment features: the scores explicitly express the user's preferences, and these preferences are exploited to adjust the weight of each word vector in every comment and obtain a more accurate comment feature representation. Since a user comment may contain intuitive words expressing preferences that indicate the user pays more attention to a certain local area of the clothing image, the attention-processed comment features are then used to guide the generation of the clothing image features, yielding clothing vector features more relevant to the user's preferences. In this way, user comment and clothing image vector features that are more accurate and better match the user's personalized preferences are obtained from the comments and the images respectively. Once the user comment feature vector and the clothing image feature vector are obtained, they are concatenated and a GRU model performs sequential recommendation, producing the next personalized recommendation for the user.
Compared with prior recommendation systems and methods, the method disclosed by the invention fuses multi-modal data for recommendation: the user scores, user comments and clothing images are used as the model input, and the user score data is used to apply attention processing to the user comments, yielding more accurate user comment features. Features are extracted after the clothing image is segmented, so that different parts of the image are represented at a finer granularity and more accurate fine-grained feature vectors are obtained. Fusing multi-source data for recommendation in this way alleviates the data sparsity problem in clothing recommendation systems.
Compared with existing recommendation systems and methods that simply concatenate the vectors of different modalities or fuse them with a fully connected layer, the method performs mutually guided attention processing between the modalities: the user scores guide the attention processing of the user comments, and the user comments in turn guide the attention processing of the clothing images. By using the attention mechanism to filter the noise in the user comment and clothing image data, the method obtains multi-dimensional comment and image features that are more relevant to the user, reflects the user's personalized preferences more accurately, and alleviates problems such as insufficient mining of user interests and low recommendation accuracy.
The above describes only preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the invention, and such modifications and improvements shall also fall within the protection scope of the invention.

Claims (6)

1. A personalized clothing recommendation method based on attention perception and fusing multi-modal data is characterized by comprising the following steps:
step 1: extracting a hidden factor vector matrix of the user from the user scoring data by using the LFM;
step 2: mining the feature information of the user comments by using a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the comments in generating word-level attention vectors;
step 3: applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining the user comment preference feature vector based on the contextualized user comment feature vectors;
step 4: dividing the clothing image most recently purchased by the user into m regions, extracting a clothing image feature vector for each region, and applying attention guidance to the clothing image features with the contextualized user comment vectors to obtain the comment-guided clothing image feature vector;
step 5: concatenating the obtained user comment preference feature vector and the comment-guided clothing image feature vector to generate a user preference feature vector, which serves as the output of the encoder part;
step 6: inputting the user preference feature vector into the decoder part, computing the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
2. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 1 comprises:
constructing a user-clothing item scoring matrix with users as the rows of the matrix and clothing items as the columns, and modeling each score with the latent factor decomposition:

r_{u,i} = Σ_{k=1}^{F} p_{u,k} · q_{i,k}

where r_{u,i} is user u's score for clothing item i; p_{u,k} is the k-th dimension of the hidden factor vector of user u; q_{i,k} is the k-th dimension of the hidden factor vector of clothing item i; F is the dimension of the vectors; and p_u is the hidden factor vector matrix of the user.
3. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 2 comprises:
setting the user's historical comment set as S = {S_1, S_2, …, S_n, …, S_N}, where every comment is represented as a sequence of words t_1, t_2, …, t_{|S|};
using pre-trained BERT to obtain the embedded vector representations, and using a BiGRU to process the word vector sequence of each comment;
concatenating the forward and backward hidden states of each word to obtain context word vectors, thereby obtaining the word sequence h_t;
guiding the comments to generate word-level attention vectors with the user hidden factor vectors obtained from the scores;
taking the user hidden factor vector matrix p_u and the word sequence h_t as input, attention processing is performed according to:

α_t^u = W_3 tanh(W_1 p_u + W_2 h_t)

a_k = exp(α_k^u) / Σ_j exp(α_j^u)

a = Σ_k a_k h_k

where α_t^u denotes the degree of attention of user u to word t; W_1, W_2, W_3 are the weights to be learned; a_k is the attention weight of the k-th word obtained by the normalization operation; and a is the word-level attention vector, a summary of comment S.
4. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 3 comprises:
applying attention to the contextualized user comment feature vectors to generate the user comment preference feature vector S_u, with the attention computed as:

β_n = W_5 tanh(W_4 c_n + b_1) + b_2

g_n = exp(β_n) / Σ_{j=1}^{N} exp(β_j)

S_u = Σ_{n=1}^{N} g_n c_n

where β_n denotes the degree of attention of the n-th comment, obtained with a single-layer neural network; c_n is the contextualized user comment feature vector; W_4, W_5 are weight matrices; b_1, b_2 are bias vectors; and g_n is the weight of the n-th comment obtained after the normalization operation.
5. The personalized clothing recommendation method based on attention perception fusing multi-modal data as claimed in claim 1, wherein the step 4 comprises:
dividing the clothing image most recently purchased by the user into m regions;
performing feature extraction on each region with a VGG network to obtain the original clothing image feature vectors;
then summarizing the contextualized user comment feature vectors obtained from the user comments into a single vector c_s by average pooling;
performing attention processing on the original clothing image feature vectors to filter noise and obtain the clothing image feature vector generated under user comment guidance, with the attention computed as:

δ_I = tanh(W_6 v_I ⊙ W_7 c_s)

p_I = softmax(δ_I)

V_L = Σ_{i=1}^{m} p_i v_i

where δ_I denotes the degree of attention of c_s on v_I; v_I = {v_i | v_i ∈ R^d, i = 1, …, m}, v_I ∈ R^{d×m} are the original clothing image feature vectors; c_s ∈ R^d; W_6, W_7 are weight matrices; ⊙ denotes the connection of the vectors; p_I ∈ R^m is the vector of attention probabilities over the m regions and p_i is its i-th component; and V_L is the clothing image feature vector generated under user comment guidance.
6. A personalized clothing recommendation system based on attention perception and fusing multi-modal data, comprising:
the hidden factor vector extraction module, which is used for extracting a hidden factor vector matrix of the user from the user scoring data by using the LFM;
the user comment feature extraction module, which is used for mining the feature information of the user comments with a BiGRU and an attention mechanism, including using the user's hidden factor vector matrix to guide the comments in generating word-level attention vectors;
the user comment preference feature vector obtaining module, which is used for applying a BiGRU to the word-level attention vectors and concatenating the forward and backward hidden states to obtain contextualized user comment feature vectors, and then obtaining the user comment preference feature vector based on them;
the user comment guidance module, which is used for dividing the clothing image most recently purchased by the user into m regions, extracting a clothing image feature vector for each region, and applying attention guidance to the clothing image features with the contextualized user comment vectors to obtain the comment-guided clothing image feature vector;
the user preference feature vector obtaining module, which is used for concatenating the obtained user comment preference feature vector and the comment-guided clothing image feature vector to generate a user preference feature vector, which serves as the output of the encoder part;
and the recommending module, which is used for inputting the user preference feature vector into the decoder part, computing the probability distribution over the set of candidate clothing items, and selecting the clothing item with the highest probability as the next recommendation.
CN202111348060.9A 2021-11-15 2021-11-15 Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data Active CN113850656B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111348060.9A CN113850656B (en) 2021-11-15 2021-11-15 Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111348060.9A CN113850656B (en) 2021-11-15 2021-11-15 Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data

Publications (2)

Publication Number Publication Date
CN113850656A true CN113850656A (en) 2021-12-28
CN113850656B CN113850656B (en) 2022-08-23

Family

ID=78984272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111348060.9A Active CN113850656B (en) 2021-11-15 2021-11-15 Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data

Country Status (1)

Country Link
CN (1) CN113850656B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114943035A (en) * 2022-06-08 2022-08-26 青岛文达通科技股份有限公司 User dressing recommendation method and system based on self-encoder and memory network
CN114971784A (en) * 2022-05-21 2022-08-30 内蒙古工业大学 Graph neural network-based session recommendation method and system integrating self-attention mechanism

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754317A (en) * 2019-01-10 2019-05-14 山东大学 Merge interpretation clothes recommended method, system, equipment and the medium of comment
CN110807477A (en) * 2019-10-18 2020-02-18 山东大学 Attention mechanism-based neural network garment matching scheme generation method and system
CN110807154A (en) * 2019-11-08 2020-02-18 内蒙古工业大学 Recommendation method and system based on hybrid deep learning model
CN111415222A (en) * 2020-03-19 2020-07-14 苏州大学 Article recommendation method, device, equipment and computer-readable storage medium
CN112016002A (en) * 2020-08-17 2020-12-01 辽宁工程技术大学 Mixed recommendation method integrating comment text level attention and time factors
US20210065278A1 (en) * 2019-08-27 2021-03-04 Nec Laboratories America, Inc. Asymmetrically hierarchical networks with attentive interactions for interpretable review-based recommendation
CN113420221A (en) * 2021-07-01 2021-09-21 宁波大学 Interpretable recommendation method integrating implicit article preference and explicit feature preference of user

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754317A (en) * 2019-01-10 2019-05-14 山东大学 Merge interpretation clothes recommended method, system, equipment and the medium of comment
US20210065278A1 (en) * 2019-08-27 2021-03-04 Nec Laboratories America, Inc. Asymmetrically hierarchical networks with attentive interactions for interpretable review-based recommendation
CN110807477A (en) * 2019-10-18 2020-02-18 山东大学 Attention mechanism-based neural network garment matching scheme generation method and system
CN110807154A (en) * 2019-11-08 2020-02-18 内蒙古工业大学 Recommendation method and system based on hybrid deep learning model
CN111415222A (en) * 2020-03-19 2020-07-14 苏州大学 Article recommendation method, device, equipment and computer-readable storage medium
CN112016002A (en) * 2020-08-17 2020-12-01 辽宁工程技术大学 Mixed recommendation method integrating comment text level attention and time factors
CN113420221A (en) * 2021-07-01 2021-09-21 宁波大学 Interpretable recommendation method integrating implicit article preference and explicit feature preference of user

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
田保军 et al.: "Hybrid recommendation algorithm fusing topic information and convolutional neural network", Journal of Computer Applications *
田保军 et al.: "Recommendation algorithm fusing trust and probabilistic matrix factorization", Journal of Computer Applications *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114971784A (en) * 2022-05-21 2022-08-30 内蒙古工业大学 Graph neural network-based session recommendation method and system integrating self-attention mechanism
CN114971784B (en) * 2022-05-21 2024-05-14 内蒙古工业大学 Session recommendation method and system based on graph neural network by fusing self-attention mechanism
CN114943035A (en) * 2022-06-08 2022-08-26 青岛文达通科技股份有限公司 User dressing recommendation method and system based on self-encoder and memory network

Also Published As

Publication number Publication date
CN113850656B (en) 2022-08-23

Similar Documents

Publication Publication Date Title
CN112579778B (en) Aspect-level emotion classification method based on multi-level feature attention
CN114936623B (en) Aspect-level emotion analysis method integrating multi-mode data
CN113850656B (en) Personalized clothing recommendation method and system based on attention perception and integrating multi-mode data
CN109815903A (en) A kind of video feeling classification method based on adaptive converged network
CN112256866B (en) Text fine-grained emotion analysis algorithm based on deep learning
CN110765769B (en) Clause feature-based entity attribute dependency emotion analysis method
CN109598387A (en) Forecasting of Stock Prices method and system based on two-way cross-module state attention network model
CN107451118A (en) Sentence-level sensibility classification method based on Weakly supervised deep learning
Lopes et al. An AutoML-based approach to multimodal image sentiment analysis
CN110991290A (en) Video description method based on semantic guidance and memory mechanism
CN114648031B (en) Text aspect emotion recognition method based on bidirectional LSTM and multi-head attention mechanism
Wang et al. Sentiment analysis from Customer-generated online videos on product review using topic modeling and Multi-attention BLSTM
CN116610778A (en) Bidirectional image-text matching method based on cross-modal global and local attention mechanism
CN111598596A (en) Data processing method and device, electronic equipment and storage medium
KR20200010672A (en) Smart merchandise searching method and system using deep learning
LU506520B1 (en) A sentiment analysis method based on multimodal review data
CN110569869A (en) feature level fusion method for multi-modal emotion detection
Gao et al. Play and rewind: Context-aware video temporal action proposals
CN116703506A (en) Multi-feature fusion-based E-commerce commodity recommendation method and system
Parvin et al. Transformer-based local-global guidance for image captioning
CN115062174A (en) End-to-end image subtitle generating method based on semantic prototype tree
Gandhi et al. Multimodal sentiment analysis: review, application domains and future directions
CN114595693A (en) Text emotion analysis method based on deep learning
CN117237479A (en) Product style automatic generation method, device and equipment based on diffusion model
CN115906824A (en) Text fine-grained emotion analysis method, system, medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant