CN112784173B - Recommendation system score prediction method based on a self-attention adversarial neural network - Google Patents
Recommendation system score prediction method based on a self-attention adversarial neural network
- Publication number: CN112784173B (application CN202110217932.1A)
- Authority
- CN
- China
- Prior art keywords: self-attention, distribution, matrix, data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a recommendation system score prediction method based on a self-attention adversarial neural network, comprising the following steps. S1: collect user information, item information, and users' item score data, and construct a high-dimensional sparse score matrix and a corresponding mask matrix. S2: generate distribution information about the high-dimensional sparse score matrix. S3: build a score prediction model for the recommendation system with a self-attention adversarial neural network, and train the score prediction model. S4: evaluate the high-dimensional sparse score matrix to complete the prediction of users' scores for items. The invention combines a self-attention mechanism with a discriminative autoencoder and provides a concrete method for applying them in a recommendation system. A self-attention discriminative autoencoder extracts the distribution information of the score data from the mask matrix of the high-dimensional sparse matrix, providing richer distribution information for subsequently learning score-data features and predicting scores.
Description
Technical Field
The invention belongs to the technical field of recommendation systems, and in particular relates to a recommendation system score prediction method based on a self-attention adversarial neural network.
Background
The rapid development of the internet has caused the problem of information overload, which severely reduces the efficiency with which users obtain useful information. To solve this problem, recommendation system technology has attracted extensive research. In a recommendation system, user-item score data is the underlying data source. Because a system contains a large number of users and items, no user can rate all items, so the score data is very scarce. These score data are typically represented with a high-dimensional sparse matrix in which only a small number of elements are known. To handle this high-dimensional sparsity, many collaborative-filtering-based methods have been proposed. These methods mainly use the existing score data to extract low-dimensional hidden feature representations of users and items, and have the following shortcomings: first, the relationships among the score data of local regions in the high-dimensional sparse matrix are not fully exploited; second, the overall distribution characteristics of the score data in the high-dimensional sparse matrix are not considered.
Disclosure of Invention
The invention aims to solve the problem of predicting users' scores for items, and provides a recommendation system score prediction method based on a self-attention adversarial neural network.
The technical scheme of the invention is as follows. A recommendation system score prediction method based on a self-attention adversarial neural network comprises the following steps:
S1: collect user information, item information, and users' item score data, and construct a high-dimensional sparse score matrix and a corresponding mask matrix;
S2: extract the distribution characteristics of the mask matrix with a self-attention encoder to generate distribution information about the high-dimensional sparse score matrix;
S3: build a score prediction model for the recommendation system with a self-attention adversarial neural network, and train the score prediction model on the distribution information and the high-dimensional sparse score matrix;
S4: evaluate the high-dimensional sparse score matrix with the trained score prediction model to complete the prediction of users' scores for items.
The invention has the following beneficial effects:
(1) The invention combines a self-attention mechanism with a discriminative autoencoder and provides a concrete method for applying them in a recommendation system. A self-attention discriminative autoencoder extracts the distribution information of the score data from the mask matrix of the high-dimensional sparse matrix, providing richer distribution information for subsequently learning score-data features and predicting scores. The model uses a convolutional neural network to extract the distribution characteristics of local regions in the mask matrix, and a self-attention mechanism to compute the dependencies among all data in the mask matrix, yielding global distribution characteristics. Finally, the local and global distribution characteristics are fused to train the model, so the distribution information of the mask matrix is captured effectively and comprehensively.
(2) The invention builds a prediction model on an adversarial neural network to estimate the missing score data in the high-dimensional sparse matrix. The distribution information of the high-dimensional sparse matrix is fused with the score data as training data, and a self-attention mechanism is integrated into the generator model, which helps the adversarial neural network perceive the dependencies among the score data and learn the features of the score data better. Meanwhile, the mean squared error between the predicted and true score data is used as a regularization term in the adversarial network's objective function, improving the model's prediction accuracy.
Further, step S1 includes the following sub-steps:
S11: collect user information, item information, and users' item score data to obtain a user set U = {u_1, u_2, …, u_n}, an item set I = {i_1, i_2, …, i_m}, and a set of users' scores for items S = {s_{u,i} | 1 ≤ s_{u,i} ≤ v}, where n denotes the number of users, m the number of items, u_1, …, u_n the 1st to nth users, i_1, …, i_m the 1st to mth items, s_{u,i} the score of user u for item i, and v the maximum score value;
S12: from the set of users' scores for items, construct a high-dimensional sparse score matrix R, where each element r_{u,i} is given by:
r_{u,i} = s_{u,i} if (u,i) ∈ Ω, and r_{u,i} = 0 if (u,i) ∈ Ω̄;
S13: from the high-dimensional sparse score matrix R, construct the corresponding mask matrix H ∈ {0,1}^{n×m}, where each element h_{u,i} is given by:
h_{u,i} = 1 if (u,i) ∈ Ω, and h_{u,i} = 0 if (u,i) ∈ Ω̄;
where 1 indicates that user u's score for item i is known, 0 indicates that it is unknown, Ω denotes the set of known elements in R, and Ω̄ denotes the set of unknown elements.
The beneficial effects of this further scheme are as follows: the invention designs a recommendation system model for internet applications based on score feedback, used to predict the missing score data and present items of likely interest to the user. First, user information, item information, and users' score data for items (such as movie scores, joke scores, and web service quality scores) must be collected from a real application. In the matrix R, Ω and Ω̄ respectively denote the known and unknown element sets; because |Ω| ≪ |Ω̄|, R is a high-dimensional sparse matrix. The mask matrix H reflects the overall distribution characteristics of the known scores in R, and each row vector h_u ∈ H reflects the distribution characteristics of user u's score data.
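As an illustration of steps S12 and S13, the score matrix R and the mask matrix H can be built from (user, item, score) triplets. This is a minimal sketch; the function name `build_matrices` and the triplet input format are illustrative assumptions, not from the patent.

```python
import numpy as np

def build_matrices(n_users, n_items, triplets):
    # R: high-dimensional sparse score matrix, r_{u,i} = s_{u,i} for known scores, else 0
    # H: mask matrix, h_{u,i} = 1 where the score is known, else 0
    R = np.zeros((n_users, n_items))
    H = np.zeros((n_users, n_items), dtype=int)
    for u, i, s in triplets:
        R[u, i] = s
        H[u, i] = 1
    return R, H
```

Each row h_u of H then records which items user u has scored, which is exactly the per-user distribution information the encoder consumes in step S2.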
Further, step S2 includes the following sub-steps:
S21: assume each row vector {h_1, …, h_u, …, h_n} in the mask matrix H obeys a first data distribution q(h);
S22: define a self-attention encoder that converts a sample h_u ∈ H from the first data distribution q(h) into a corresponding low-dimensional hidden feature representation z_u, where z_u is a sample of a second data distribution q(z);
S23: take the low-dimensional hidden feature representation z_u from the second data distribution q(z) as the input of a self-attention decoder, and generate the reconstructed sample ĥ_u of sample h_u;
S24: calculate the reconstruction error rec_error between sample h_u and the reconstructed sample ĥ_u;
S25: set a distribution p(z) with a known analytic form, and train the self-attention encoder and decoder according to the distance between p(z) and the second data distribution q(z);
S26: with the trained self-attention encoder, convert the mask vectors into low-dimensional hidden feature representations conforming to the known-analytic-form distribution p(z), and with the trained self-attention decoder, convert sample data drawn from p(z) into distribution information, generating the distribution information about the high-dimensional sparse score matrix.
The beneficial effects of this further scheme are as follows: in the invention, each row vector {h_1, …, h_u, …, h_n} in H is assumed to obey a data distribution q(h). The corresponding low-dimensional hidden feature matrix is denoted Z ∈ R^{n×d}, with row vectors {z_1, …, z_u, …, z_n}, where d denotes the dimension of the hidden feature representation. The low-dimensional hidden feature representations are assumed to follow a data distribution q(z) whose analytic form is unknown.
Further, step S22 includes the following sub-steps:
S221: randomly sample t mask vectors {h_u, h_{u+1}, …, h_{u+t}} from the mask matrix H with a mini-batch gradient descent algorithm, forming an input matrix X;
S222: define a self-attention encoder that takes the input matrix X as input and converts each sample h_u ∈ H from the first data distribution q(h) into a corresponding low-dimensional hidden feature representation z_u, where z_u is a sample of the second data distribution q(z).
Further, in step S222, the self-attention encoder includes a convolutional layer, a self-attention layer, and a pooling layer.
Construction of the convolutional layer: the convolutional layer contains K 1×1 convolution kernels, and the feature map E of the input matrix X is extracted with these kernels; the calculation formula is:
E^k = σ(W_c^k * X + b_c^k), k = 1, …, K
where * denotes two-dimensional convolution, W_c^k denotes the parameters of the kth convolution kernel, b_c^k denotes the bias of the kth convolution kernel, and σ(·) denotes the activation function.
Construction of the self-attention layer: calculate the dependency matrix Y among the elements of the feature map E, and fuse Y into E to obtain the fused feature I; each element y_p of Y and the fused feature I are calculated as:
y_p = (1/γ(E)) Σ_q f(e_p, e_q) g(e_q)
I = σ(W_s^k * (E + Y) + b_s^k)
where e_p denotes the pth element of the feature map E, e_q the qth element of E, y_p the pth element of Y, f(·,·) a function computing the similarity between any two points, g(·) a mapping function, γ(E) a normalization factor, W_s^k the parameters of the kth self-attention-layer convolution kernel, and b_s^k the bias of the kth self-attention-layer convolution kernel.
Construction of the pooling layer: input the fused feature I into a pooling layer with pooling kernel size c×c and stride a; the expression of the pooling layer is:
Z = MeanPooling2D(I)
where MeanPooling2D(·) denotes average pooling, Z denotes the low-dimensional hidden feature matrix, and each row vector {z_u, z_{u+1}, …, z_{u+t}} denotes the low-dimensional hidden feature representation of the mask vectors {h_u, h_{u+1}, …, h_{u+t}}.
The beneficial effects of the above further scheme are as follows: in the self-attention layer, the dependency matrix Y among the elements of the feature map E is first calculated and then fused into E, introducing global features for the local features extracted within each receptive field of the convolution operation, which brings richer information to the subsequent convolutional layers; g(·) denotes a mapping function that computes the feature vector of a point.
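The self-attention fusion above can be sketched as follows, assuming a dot-product similarity for f(·,·), an identity mapping for g(·), and a softmax in place of the normalization factor 1/γ(E). The patent leaves these choices to learned convolution kernels, so this is only an approximation of the mechanism, not the claimed implementation.

```python
import numpy as np

def self_attention_fuse(E):
    # f(e_p, e_q): dot-product similarity between flattened feature-map elements
    scores = E @ E.T
    # softmax normalization plays the role of the 1/γ(E) factor
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    A = scores / scores.sum(axis=1, keepdims=True)
    Y = A @ E          # dependency matrix Y applied to g(E) = E (identity mapping g)
    return E + Y       # fuse the global dependencies back into the local feature map
```

Every output row mixes information from all rows of E, which is how the layer injects global distribution structure into features that the 1×1 convolutions computed locally.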
Further, in step S23, the reconstructed data corresponding to the samples {h_u, h_{u+1}, …, h_{u+t}} are generated and represented by a matrix Ĥ formed from the reconstructed data; the calculation formulas are:
D_0 = Z
U_l = UpSampling2D(D_{l-1}), l = 1, …, L
D_l = σ( Σ_{k=1}^{K} W_l^k * U_l + b_l^k ), Ĥ = D_L
where D_0 denotes the input of the model, D_{l-1} the output of layer l-1, D_l the output of layer l, U_l the output of the lth upsampling layer, K the number of convolution kernels, σ(·) the activation function, W_l^k the parameters of the kth convolution kernel in layer l, b_l^k the bias of the kth convolution kernel in layer l, L the number of deconvolution layers, Z the low-dimensional hidden feature matrix, and UpSampling2D(·) the upsampling layer.
D_l and U_l denote intermediate results of the calculation steps; they make the detailed calculation process easier to describe and relate directly to computing the reconstructed data.
Further, in step S24, the calculation formula of the reconstruction error rec_error is:
rec_error = (1/t) Σ_u ‖h_u − ĥ_u‖²
Further, in step S25, calculating the distance C(p(z), q(z)) between the known-analytic-form distribution p(z) and the second data distribution q(z) includes the following sub-steps:
S251: set a distribution p(z) with a known analytic form;
S252: build a discriminator with a fully connected neural network;
S253: randomly sample t low-dimensional hidden feature representations {z*_1, …, z*_t} from the known-analytic-form distribution p(z), and randomly sample t low-dimensional hidden feature representations {z_1, …, z_t} from the distribution q(z);
S254: take the low-dimensional hidden feature representations {z*_1, …, z*_t} and {z_1, …, z_t} as the input of the discriminator, and output the discrimination result;
S255: based on the discrimination result, calculate the distance C(p(z), q(z)) between the known-analytic-form distribution p(z) and the second data distribution q(z) by the following formula:
C(p(z), q(z)) = E_{z∼q(z)}[log D(z)] + E_{z*∼p(z)}[log(1 − D(z*))]
where D(·) denotes the discrimination result, log(·) the logarithmic function, E_{z∼q(z)}[·] the mathematical expectation over the second data distribution q(z) of z, and E_{z*∼p(z)}[·] the mathematical expectation over the data distribution p(z) of z*.
The beneficial effects of this further scheme are as follows: for the decoder to generate new mask vectors, it must receive other vectors drawn from q(z); however, the analytic form of q(z) is unknown, so other data samples are hard to obtain directly. To obtain other samples z* of the q(z) distribution indirectly, a distribution p(z) with a known analytic form is assumed to be an equivalent distribution of q(z): samples from q(z) obey p(z), and samples from p(z) also obey q(z). New mask vectors can then be generated by using samples from p(z) as the decoder's input. To satisfy this assumption, the distance between q(z) and p(z) is made part of the encoder and decoder objective functions, and the parameters of the encoder and decoder are updated with it, enabling the encoder to transform the data distribution q(h) into an equivalent distribution p(z) of q(z), and the decoder to transform the encoder's low-dimensional hidden feature representations into new mask vectors. As training proceeds, the transformed low-dimensional hidden feature representations become more and more similar to samples of p(z).
Further, in step S26, the objective function of the self-attention encoder is rec_error; the objective function of the discriminator is dis_loss = C(p(z), q(z)); and the objective function of the self-attention decoder is gen_loss = E_{z*∼p(z)}[log(1 − D(z*))], where rec_error denotes the reconstruction error between sample h_u and the reconstructed sample ĥ_u, C(p(z), q(z)) denotes the distance between the known-analytic-form distribution p(z) and the second data distribution q(z), D(·) denotes the discrimination result, log(·) the logarithmic function, and E_{z*∼p(z)}[·] the mathematical expectation over the data distribution p(z) of z*.
The beneficial effects of this further scheme are as follows: after training, the encoder can convert mask vectors into low-dimensional hidden feature representations obeying the data distribution p(z), and the decoder can convert sample data from p(z) into useful distribution information. Since the analytic form of p(z) is known, a large amount of low-dimensional hidden feature representation data can be sampled from it uniformly, and more comprehensive distribution information about the high-dimensional sparse score matrix is obtained through the decoder's conversion.
Further, step S3 includes the following sub-steps:
S31: assume the row vectors {r_1, …, r_u, …, r_n} in the high-dimensional sparse score matrix R obey the true data distribution p_r, and take the row vectors {r_1, …, r_u, …, r_n} as the true score vectors, where r_u denotes the score vector of user u;
S32: construct a self-attention adversarial neural network model, in which the generator uses a self-attention autoencoder structure and the discriminator uses a fully connected neural network, and assume the predicted score vectors generated by the generator obey the generator distribution p_g;
S33: fuse the reconstructed sample ĥ_u with the user's score vector r_u to obtain the fused data r̃_u; the fusion formula is:
r̃_u = h_u ⊙ r_u + (1 − h_u) ⊙ ĥ_u
where h_u denotes the indication (mask) vector and ⊙ denotes the elementwise logical operation;
S34: take the fused data r̃_u as the generator's input and calculate each user's predicted score vector r̂_u; the calculation formula is:
r̂_u = Gen(r̃_u)
S35: take the predicted score vectors {r̂_1, …, r̂_n} and the true score vectors {r_1, …, r_u, …, r_n} as the discriminator's input, discriminate the difference between sample data of the true data distribution p_r and of the generator distribution p_g relative to the true score data, and output the discrimination result;
S36: train the score prediction model according to the discrimination result until the score prediction model converges.
In step S35, a sparsification method is used to ease the discriminator's judgment; its expression is:
r̂'_u = h_u ⊙ r̂_u
In step S36, the discriminator is trained according to the discrimination result with the objective function:
J_Dis = E_{r̂_u∼p_g}[Dis(r̂_u)] − E_{r_u∼p_r}[Dis(r_u)]
where J_Dis denotes the discriminator's objective function, E_{r_u∼p_r}[·] the mathematical expectation over the true data distribution p_r of r_u, E_{r̂_u∼p_g}[·] the mathematical expectation over the generator distribution p_g of r̂_u, and Dis(·) the discrimination result for the discriminator's input;
the generator is trained according to the discrimination result with the objective function:
J_Gen = −E_{r̂_u∼p_g}[Dis(r̂_u)] + λψ, ψ = Σ_{(u,i)∈Ω} (r_{u,i} − r̂_{u,i})²
where J_Gen denotes the generator's objective function, E_{r̂_u∼p_g}[·] the mathematical expectation over the generator distribution p_g, λ denotes the regularization coefficient, ψ denotes the regularization term, r_{u,i} denotes an element of the high-dimensional sparse score matrix R, r̂_{u,i} denotes the predicted score of user u for item i, and Ω denotes the set of known score data.
The beneficial effects of this further scheme are as follows: in the invention, to learn the features of the score data in R, the row vectors {r_1, …, r_u, …, r_n} in R are first assumed to obey the true data distribution p_r. A self-attention adversarial neural network model is built, in which the generator uses a self-attention autoencoder structure and the discriminator uses a fully connected neural network. For the generator to produce realistic score vectors, the predicted score vectors it generates are assumed to obey the generator distribution p_g: if p_g can be made identical to p_r, the generator's predicted score vectors are realistic. To bring p_g close to p_r until they coincide, the distance between the two distributions is used as the model's objective function, and the generator's parameters are updated with it so the distribution distance keeps shrinking. Because the analytic forms of p_g and p_r are unknown, their distance cannot be computed from an explicit formula; instead, the discriminator estimates the difference between sample data of the two distributions and the true score data, which is substituted into the Wasserstein distance formula to approximate the distance between p_g and p_r. To obtain sample data of the generator distribution, the generator's input is obtained first: the decoder learned in S2 samples mask vectors, and each mask vector is fused with the user's score vector r_u, providing more distribution information for the score data; the fused data serve as the generator's input.
The generator calculates each user's predicted score vector, and the predicted and true score vectors are used as the discriminator's input to evaluate their difference from the true score data. Since a true score vector has only a small amount of known score data, a large number of unknown scores are filled with 0, while every entry of a predicted score vector holds a predicted value, it is difficult for the discriminator to judge which is real. Therefore the predicted score vector is sparsified with the mask vector, keeping only the predictions corresponding to known score data, so the two stay consistent in form and differ only in the true or predicted score values, which eases the discriminator's judgment. The sparsified predicted score vectors and the true score vectors are used as the discriminator's input, and the discrimination result is output. Finally the model is trained: the discriminator first, then the generator, with regularization (the mean squared error between the predicted and true scores) until the model converges.
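The sparsification of step S35 and the two objective functions of step S36 can be sketched as follows. The regularization coefficient λ = 0.1 and the helper names are illustrative assumptions, and Wasserstein-critic details such as weight clipping or a gradient penalty are omitted from this sketch.

```python
import numpy as np

def sparsify(pred, h):
    # step S35: keep predictions only at known-score positions (elementwise mask)
    return pred * h

def j_dis(dis_real, dis_fake):
    # Wasserstein-style critic loss to minimize: E[Dis(r̂_u)] - E[Dis(r_u)]
    return float(np.mean(dis_fake) - np.mean(dis_real))

def j_gen(dis_fake, pred, real, h, lam=0.1):
    # adversarial term plus λ·ψ, where ψ is the squared error over known entries
    psi = float(np.sum(h * (real - pred) ** 2))
    return float(-np.mean(dis_fake)) + lam * psi
```

Sparsifying before the critic sees the vectors keeps the real and predicted inputs identical in form (zeros at unknown positions), so the critic's signal comes only from the score values themselves, and the λ·ψ term anchors the generator to the known scores.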
Drawings
FIG. 1 is a flow chart of a recommendation system score prediction method.
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, the present invention provides a recommendation system score prediction method based on a self-attention adversarial neural network, comprising the following steps:
S1: collect user information, item information, and users' item score data, and construct a high-dimensional sparse score matrix and a corresponding mask matrix;
S2: extract the distribution characteristics of the mask matrix with a self-attention encoder to generate distribution information about the high-dimensional sparse score matrix;
S3: build a score prediction model for the recommendation system with a self-attention adversarial neural network, and train the score prediction model on the distribution information and the high-dimensional sparse score matrix;
S4: evaluate the high-dimensional sparse score matrix with the trained score prediction model to complete the prediction of users' scores for items.
In the embodiment of the present invention, as shown in fig. 1, step S1 includes the following sub-steps:
S11: collect user information, item information, and users' item score data to obtain a user set U = {u_1, u_2, …, u_n}, an item set I = {i_1, i_2, …, i_m}, and a set of users' scores for items S = {s_{u,i} | 1 ≤ s_{u,i} ≤ v}, where n denotes the number of users, m the number of items, u_1, …, u_n the 1st to nth users, i_1, …, i_m the 1st to mth items, s_{u,i} the score of user u for item i, and v the maximum score value;
S12: from the set of users' scores for items, construct a high-dimensional sparse score matrix R, where each element r_{u,i} is given by:
r_{u,i} = s_{u,i} if (u,i) ∈ Ω, and r_{u,i} = 0 if (u,i) ∈ Ω̄;
S13: from the high-dimensional sparse score matrix R, construct the corresponding mask matrix H ∈ {0,1}^{n×m}, where each element h_{u,i} is given by:
h_{u,i} = 1 if (u,i) ∈ Ω, and h_{u,i} = 0 if (u,i) ∈ Ω̄;
where 1 indicates that user u's score for item i is known, 0 indicates that it is unknown, Ω denotes the set of known elements in R, and Ω̄ denotes the set of unknown elements.
In the invention, a recommendation system model is designed for internet applications based on score feedback, used to predict the missing score data and present items of likely interest to the user. First, user information, item information, and users' score data for items (such as movie scores, joke scores, and web service quality scores) must be collected from a real application. In the matrix R, Ω and Ω̄ respectively denote the known and unknown element sets; because |Ω| ≪ |Ω̄|, R is a high-dimensional sparse matrix. The mask matrix H reflects the overall distribution characteristics of the known scores in R, and each row vector h_u ∈ H reflects the distribution characteristics of user u's score data.
In the embodiment of the present invention, as shown in fig. 1, step S2 includes the following sub-steps:
S21: assume each row vector {h_1, …, h_u, …, h_n} in the mask matrix H obeys a first data distribution q(h);
S22: define a self-attention encoder that converts a sample h_u ∈ H from the first data distribution q(h) into a corresponding low-dimensional hidden feature representation z_u, where z_u is a sample of a second data distribution q(z);
S23: take the low-dimensional hidden feature representation z_u from the second data distribution q(z) as the input of a self-attention decoder, and generate the reconstructed sample ĥ_u of sample h_u;
S24: calculate the reconstruction error rec_error between sample h_u and the reconstructed sample ĥ_u;
S25: set a distribution p(z) with a known analytic form, and train the self-attention encoder and decoder according to the distance between p(z) and the second data distribution q(z);
S26: with the trained self-attention encoder, convert the mask vectors into low-dimensional hidden feature representations conforming to the known-analytic-form distribution p(z), and with the trained self-attention decoder, convert sample data drawn from p(z) into distribution information, generating the distribution information about the high-dimensional sparse score matrix.
In the invention, each row vector {h_1, …, h_u, …, h_n} in H is assumed to obey a data distribution q(h). The corresponding low-dimensional hidden feature matrix is denoted Z ∈ R^{n×d}, with row vectors {z_1, …, z_u, …, z_n}, where d denotes the dimension of the hidden feature representation. The low-dimensional hidden feature representations are assumed to follow a data distribution q(z) whose analytic form is unknown.
In the embodiment of the present invention, as shown in fig. 1, step S22 includes the following sub-steps:
S221: randomly sample t mask vectors {h_u, h_{u+1}, …, h_{u+t}} from the mask matrix H with a mini-batch gradient descent algorithm to form an input matrix X;
S222: define a self-attention encoder, take the input matrix X as its input, and convert each sample h_u ∈ H from the first data distribution q(h) into the corresponding low-dimensional hidden feature representation z_u, where z_u is a sample of the second data distribution q(z).
In the embodiment of the present invention, as shown in fig. 1, in step S222, the self-attention encoder includes a convolutional layer, a self-attention layer, and a pooling layer;
The method for constructing the convolutional layer is as follows: the convolutional layer contains K 1×1 convolution kernels, and the feature map E of the input matrix X is extracted with these kernels according to:

E_k = σ(X ∗ W_k + b_k), k = 1, …, K

where ∗ represents a two-dimensional convolution calculation, W_k represents the parameters of the k-th convolution kernel, b_k represents the bias of the k-th convolution kernel, and σ(·) represents the activation function;
The method for constructing the self-attention layer is as follows: calculate a dependency matrix Y between the elements of the feature map E, and fuse the dependency matrix Y into the feature map E to obtain the fused feature I. Each element y_p of Y and the fused feature I are computed respectively as:

y_p = (1/γ(E)) Σ_q f(e_p, e_q) g(e_q)

I_k = E_k + σ(Y ∗ W_k^a + b_k^a), k = 1, …, K

where e_p represents the p-th element of the feature map E, e_q represents the q-th element of the feature map E, y_p represents the p-th element of Y, f(·) represents a function computing the similarity between any two points, g(·) represents a mapping function, γ(E) represents a normalization factor, W_k^a represents the parameters of the k-th self-attention layer convolution kernel, and b_k^a represents the bias of the k-th self-attention layer convolution kernel;
the method for constructing the pooling layer comprises the following steps: inputting the fusion feature I into a pooling layer with a pooling kernel size of c × c and a sliding stride of a, wherein the expression of the pooling layer is as follows:
Z=MeanPooling2D(I)
where MeanPooling2D(·) represents average pooling, Z represents the low-dimensional hidden feature matrix, and each row vector {z_u, z_{u+1}, …, z_{u+t}} denotes the low-dimensional hidden feature representation of the mask vectors {h_u, h_{u+1}, …, h_{u+t}}.
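The dependency computation of the self-attention layer can be sketched with NumPy as a simplified non-local operation. This is a hypothetical minimal form: dot-product similarity stands in for f, the identity mapping for g, and a softmax plays the role of the normalization factor γ(E) — the patent's exact choices of f, g, and γ are not fixed here:

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.standard_normal((6, 4))    # feature map: 6 elements e_p, each 4-dim

# f(e_p, e_q): dot-product similarity between any two elements.
F = E @ E.T                        # (6, 6) dependency scores

# Softmax normalization stands in for the factor 1/gamma(E).
A = np.exp(F - F.max(axis=1, keepdims=True))
A = A / A.sum(axis=1, keepdims=True)

# y_p = sum_q f(e_p, e_q) g(e_q), with g taken as the identity mapping.
Y = A @ E                          # (6, 4) global dependency features

# Fuse the global features back into the local features.
I = E + Y
```

Each output row of `I` thus mixes the local feature `e_p` with a weighted sum over all other positions, which is how global context is injected into features extracted from local receptive fields.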
In the invention, in the self-attention layer, a dependency matrix Y between the elements of the feature map E is first calculated and then fused into E. This introduces global features into the local features extracted from each receptive field of the convolution operation, bringing richer information to the following convolutional layer; g(·) denotes a mapping function used to compute the feature vector of a point.
In the embodiment of the present invention, as shown in fig. 1, in step S23, the reconstructed data {ĥ_u, ĥ_{u+1}, …, ĥ_{u+t}} corresponding to the samples {h_u, h_{u+1}, …, h_{u+t}} are generated and expressed with a matrix formed from the reconstructed data; the calculation proceeds layer by layer:

D_0 = Z
U_l = UpSampling2D(D_{l−1}), l = 1, …, L
D_l = σ(U_l ∗ W_k^l + b_k^l), k = 1, …, K

where D_0 represents the input of the model, D_{l−1} denotes the output of layer l−1, D_l represents the output of the l-th layer, U_l represents the output of the l-th upsampling layer, U_{L−1} represents the output of the (L−1)-th upsampling layer, K represents the number of convolution kernels, σ(·) represents the activation function, W_k^{l−1} represents the parameters of the k-th convolution kernel in layer l−1, b_k^{l−1} represents the bias of the k-th convolution kernel in layer l−1, L represents the number of deconvolution layers, Z represents the low-dimensional hidden feature matrix, UpSampling2D(·) represents the upsampling layer, W_k^l represents the parameters of the k-th convolution kernel in the l-th layer, and b_k^l represents the bias of the k-th convolution kernel in the l-th layer.
D_l and U_l denote intermediate results of the calculation; they are introduced to make the detailed calculation process easier to describe and are directly related to computing the reconstructed data.
In the embodiment of the present invention, as shown in fig. 1, in step S24, the reconstruction error rec_error is calculated as:

rec_error = (1/t) Σ_u ‖h_u − ĥ_u‖²
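Assuming a mean-squared form of the reconstruction error (a common choice for autoencoder reconstruction; the exact norm is not reproduced in this text), rec_error for a mini-batch can be computed as:

```python
import numpy as np

h = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])          # mask vectors h_u
h_rec = np.array([[0.9, 0.1, 0.8, 0.0],
                  [0.1, 0.7, 0.0, 0.2]])      # reconstructed samples from the decoder

# rec_error: squared L2 distance between each h_u and its reconstruction,
# averaged over the mini-batch of t samples.
rec_error = np.mean(np.sum((h - h_rec) ** 2, axis=1))
```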
In the embodiment of the present invention, as shown in fig. 1, in step S25, calculating the distance C(p(z), q(z)) between the distribution p(z) of the known analytic expression and the second data distribution q(z) includes the following sub-steps:
S251: set a distribution p(z) with a known analytic expression;
S252: establish a discriminator using a fully-connected neural network;
S253: randomly sample t low-dimensional hidden feature representations {z̃_1, …, z̃_t} from the distribution p(z) of the known analytic expression, and randomly sample t low-dimensional hidden feature representations {z_1, …, z_t} from the distribution q(z);
S254: take the low-dimensional hidden feature representations {z̃_1, …, z̃_t} and {z_1, …, z_t} as the input of the discriminator, and output the discrimination results;
S255: according to the discrimination results, calculate the distance C(p(z), q(z)) between the distribution p(z) of the known analytic expression and the second data distribution q(z) by the following formula:

C(p(z), q(z)) = E_{z∼q(z)}[log D(z)] + E_{z̃∼p(z)}[log(1 − D(z̃))]

where D(·) represents the discrimination result, log(·) represents the logarithmic function, E_{z∼q(z)}[·] denotes the mathematical expectation computed over the second data distribution q(z), and E_{z̃∼p(z)}[·] denotes the mathematical expectation computed over the distribution p(z).
In the present invention, to enable the decoder to generate new mask vectors, the decoder must receive other vectors drawn from q(z). However, the analytic expression of q(z) is unknown, and it is difficult to obtain further data samples directly. To obtain other samples z* of the q(z) distribution indirectly, a distribution p(z) with a known analytic expression is assumed to be an equivalent distribution of q(z): the samples of q(z) also obey p(z), and the samples of p(z) also obey q(z). New mask vectors can then be generated by using samples from p(z) as input data of the decoder. To satisfy this assumption, the distance between q(z) and p(z) is taken as part of the objective function of the encoder and decoder, and their parameters are updated with it, enabling the encoder to map the data distribution q(h) to an equivalent distribution p(z) of q(z), and the decoder to convert the low-dimensional hidden feature representation output by the encoder into a new mask vector. After training, the converted low-dimensional hidden feature representations become more and more similar to samples of p(z).
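The distance between q(z) and p(z) can be estimated in GAN fashion from discriminator outputs on samples of the two distributions. The sketch below assumes a logistic discriminator whose outputs lie in (0, 1); the tiny fixed "outputs" are purely illustrative:

```python
import numpy as np

# Discriminator outputs D(.) in (0, 1) for t samples from each distribution.
D_q = np.array([0.8, 0.7, 0.9])   # on samples z ~ q(z), i.e. encoder outputs
D_p = np.array([0.3, 0.2, 0.4])   # on samples z~ drawn from the known p(z)

# C(p(z), q(z)) ~= E_{z~q}[log D(z)] + E_{z~p}[log(1 - D(z~))],
# with expectations replaced by mini-batch averages.
C = np.mean(np.log(D_q)) + np.mean(np.log(1.0 - D_p))
```

As training drives q(z) toward p(z), the discriminator can no longer separate the two sample sets and this estimate approaches its equilibrium value.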
In the embodiment of the present invention, as shown in fig. 1, in step S26, the objective function expression of the self-attention encoder is rec_error; the objective function expression of the discriminator is dis_loss = C(p(z), q(z)); and the objective function expression of the self-attention decoder is gen_loss = rec_error − E_{z̃∼p(z)}[log D(z̃)], where rec_error represents the reconstruction error between the sample h_u and the reconstructed sample ĥ_u, C(p(z), q(z)) represents the distance between the distribution p(z) of the known analytic expression and the second data distribution q(z), D(·) represents the discrimination result, log(·) represents the logarithmic function, and E_{z̃∼p(z)}[·] denotes the mathematical expectation computed over the distribution p(z).
In the present invention, after training is completed, the encoder can convert the mask vector into a low-dimensional hidden feature representation that obeys the data distribution p (z), and the decoder can convert the sample data from p (z) into useful distribution information. Since the analytic expression of p (z) is known, a large amount of low-dimensional hidden feature representation data can be uniformly obtained from the p (z), and more comprehensive distribution information about a high-dimensional sparse scoring matrix is obtained through conversion of a decoder.
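Since the analytic form of p(z) is known after training, hidden representations can be drawn from it directly and decoded into distribution information. The sketch below assumes p(z) is a standard Gaussian (a typical choice for a known-analytic-form distribution, not stated explicitly in this text) and uses a placeholder linear-sigmoid map standing in for the trained decoder:

```python
import numpy as np

rng = np.random.default_rng(42)
d, m, batch = 8, 12, 64            # hidden dim, number of items, sample count

# p(z) with a known analytic expression -- here assumed standard Gaussian.
z_samples = rng.standard_normal((batch, d))

# Placeholder "trained decoder": a fixed linear map + sigmoid, standing in
# for the deconvolutional self-attention decoder of the invention.
W = rng.standard_normal((d, m)) * 0.1
decode = lambda z: 1.0 / (1.0 + np.exp(-(z @ W)))

# Decoded samples approximate the distribution of mask vectors, i.e.
# distribution information about the high-dimensional sparse scoring matrix.
dist_info = decode(z_samples)
```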
In the embodiment of the present invention, as shown in fig. 1, step S3 includes the following sub-steps:
S31: set the row vectors {r_1, …, r_u, …, r_n} in the high-dimensional sparse scoring matrix R to obey the true data distribution p_r, and take the row vectors {r_1, …, r_u, …, r_n} as true score vectors, where r_u represents the score vector of user u;
S32: construct a self-attention confrontation neural network model, in which the generator uses a self-attention self-encoder structure, the discriminator uses a fully-connected neural network, and the prediction score vectors generated by the generator are set to obey the generator distribution p_g;
S33: sample to be reconstructedAnd a score vector r of the useruPerforming fusion to obtain fusion dataThe calculation formula for fusion is:
wherein h isuAn indication sample, < > indicates a logical operation;
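The exact fusion formula is not reproduced in this text. One plausible reading — keep the user's known scores and inject the reconstructed mask information at unknown positions — can be sketched as follows; this is entirely an assumption for illustration:

```python
import numpy as np

r_u = np.array([5.0, 0.0, 3.0, 0.0])      # user's score vector (0 = unknown)
h_u = (r_u != 0).astype(float)            # indication (mask) sample
h_rec = np.array([0.9, 0.2, 0.8, 0.6])    # reconstructed sample from the decoder

# Hypothetical fusion: known positions keep the true score, unknown positions
# carry the reconstructed distribution information.
x_u = h_u * r_u + (1.0 - h_u) * h_rec
```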
S34: take the fused data as the input of the generator, and calculate the prediction score vector r̂_u of each user; the calculation formula is:

S35: take the prediction score vectors {r̂_1, …, r̂_n} and the true score vectors {r_1, …, r_n} as the input of the discriminator, discriminate the difference between sample data of the real data distribution p_r and of the generator distribution p_g and the real scoring data, and output a discrimination result;
S36: train the scoring prediction model according to the discrimination result until the scoring prediction model converges;
In step S35, a sparsification method is used to facilitate discrimination by the discriminator; its expression is:

r̃_u = r̂_u ⊙ h_u

where ⊙ denotes element-wise multiplication, so that only the predicted scores at positions with known scores are retained.
In step S36, the discriminator is trained according to the discrimination result, with the objective function:

J_Dis = E_{r̃_u∼p_g}[Dis(r̃_u)] − E_{r_u∼p_r}[Dis(r_u)]

where J_Dis represents the objective function of the discriminator, E_{r_u∼p_r}[·] denotes the mathematical expectation computed over the true data distribution p_r of r_u, E_{r̃_u∼p_g}[·] denotes the mathematical expectation computed over the generator distribution p_g, and Dis(·) represents the discrimination result of the discriminator on its input;
The generator is trained according to the discrimination result, with the objective function:

J_Gen = −E_{r̃_u∼p_g}[Dis(r̃_u)] + λψ, with ψ = Σ_{(u,i)∈Ω} (r_{u,i} − r̂_{u,i})²

where J_Gen represents the objective function of the generator, E_{r̃_u∼p_g}[·] denotes the mathematical expectation computed over the generator distribution p_g, λ denotes the regularization coefficient, ψ denotes the regularization term, r_{u,i} represents an element of the high-dimensional sparse scoring matrix R, r̂_{u,i} represents the predicted score of user u on item i, and Ω represents the known set of scoring data.
In the present invention, in order to learn the features of the scoring data in R, it is first assumed that the row vectors {r_1, …, r_u, …, r_n} in R obey the true data distribution p_r. A self-attention confrontation neural network model is established, in which the generator uses a self-attention self-encoder structure and the discriminator uses a fully-connected neural network. To enable the generator to generate realistic score vectors, the prediction score vectors generated by the generator are assumed to obey the generator distribution p_g. If p_g can be made identical to p_r, the prediction score vectors generated by the generator are realistic; to make p_g approach p_r until they coincide, the distance between the two distributions is taken as the objective function of the model, and the parameters of the generator are updated with this objective function so that the distribution distance becomes smaller and smaller. Because the analytic expressions of p_g and p_r are unknown, the distance between them cannot be computed by an explicit formula. Therefore, the difference between sample data of the two distributions and the real scoring data is estimated by the discriminator and substituted into the Wasserstein distance formula to approximately calculate the distance between p_g and p_r. To obtain sample data of the generator distribution, the input of the generator is obtained first: the decoder learned in S2 samples mask vectors, and each mask vector is fused with the score vector r_u of the user, providing more distribution information for the scoring data; the fused data serve as the input of the generator.
The generator calculates the prediction score vector r̂_u of each user. The prediction score vectors {r̂_1, …, r̂_n} and the true score vectors {r_1, …, r_n} are used as the input of the discriminator to evaluate the difference between the two types of data and the real scoring data. Since a true score vector contains only a small amount of known scoring data, the large number of unknown scores are filled with 0, while every entry of a prediction score vector carries a predicted value; this makes it difficult for the discriminator to tell the two apart for the right reasons. Therefore, the prediction score vectors are sparsified with the mask vectors, retaining only the predictions corresponding to known scoring data, so that real and predicted vectors agree in form and differ only in the features of the scoring data, which facilitates discrimination. The sparsified prediction score vectors and the real score vectors are used as the input of the discriminator, and the discrimination result is output. Finally, the model is trained: the discriminator is trained first, then the generator, with a regularization term (the mean square error between the prediction scores and the true scores), until the model converges.
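The sparsification step — retaining only the predictions at known-score positions — reduces to an element-wise product with the mask vector (a sketch; the variable names are illustrative):

```python
import numpy as np

h_u = np.array([1.0, 0.0, 1.0, 0.0])          # mask vector of user u
r_u = np.array([5.0, 0.0, 3.0, 0.0])          # true score vector (unknowns are 0)
r_hat = np.array([4.6, 2.1, 3.2, 4.0])        # dense prediction from the generator

# Sparsified prediction: only positions with known scores survive, so the
# real and predicted vectors agree in form and differ only in score features.
r_tilde = r_hat * h_u
```

After this step, both discriminator inputs carry zeros at exactly the same (unknown) positions, so the discriminator can no longer separate real from generated vectors by sparsity pattern alone.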
In a specific implementation process, the hyper-parameters of the model influence its performance, and the hidden-layer dimension and the number of hidden layers of the generator need to be tuned carefully. In addition, when training the self-attention confrontation neural network, a regularization term is added to the objective function; the regularization coefficient is closely related to the prediction precision of the model and must be chosen carefully to balance the regularization term against the distribution distance.
In summary, the invention addresses the high-dimensional sparse scoring matrix from two aspects: the distribution characteristics of the high-dimensional sparse matrix and the characteristics of the scoring data. First, the invention combines the self-attention mechanism with the discriminative autoencoder and provides a specific method for application in a recommendation system. A self-attention discriminative autoencoder extracts the distribution information of the scoring data from the mask matrix of the high-dimensional sparse matrix, providing more distribution information for subsequently learning the scoring data features and predicting the scores. Meanwhile, the model adopts a convolutional neural network to extract the distribution characteristics of local areas of the mask matrix. In addition, the dependency relationships among all data in the mask matrix are calculated with the self-attention mechanism to obtain the global distribution characteristics. Finally, the local and global distribution characteristics are fused to train the model, so that the distribution information of the mask matrix is obtained effectively and comprehensively. Second, the method establishes a prediction model based on the confrontation neural network to estimate the missing scoring data in the high-dimensional sparse matrix. The distribution information of the high-dimensional sparse matrix is fused with the scoring data as training data, and the self-attention mechanism is integrated into the generator model, which helps the confrontation neural network perceive the dependency relationships among the scoring data and learn the scoring data features better.
Meanwhile, the mean square error between the predicted scoring data and the real scoring data is used as a regularization term of the objective function of the confrontation neural network, which improves the prediction precision of the model.
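The regularization term — a squared error between predicted and true scores accumulated over the known entries only — can be sketched as follows (assuming a sum over the known set; the exact normalization is not reproduced in this text):

```python
import numpy as np

R = np.array([[5.0, 0.0, 3.0],
              [0.0, 4.0, 0.0]])               # true scores, 0 = unknown
R_hat = np.array([[4.5, 2.0, 3.5],
                  [1.0, 3.0, 2.0]])           # generator predictions
H = (R != 0).astype(float)                    # known-score mask

# psi: squared error accumulated over known entries only, so the generator is
# never penalized on scores that were never observed.
psi = np.sum(H * (R - R_hat) ** 2)

lam = 0.1                                     # regularization coefficient lambda
reg = lam * psi
```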
The working principle and process of the invention are as follows: the recommendation system scoring prediction method based on the self-attention confrontation neural network uses the self-attention confrontation neural network to learn the overall data distribution characteristics of the high-dimensional sparse matrix, and uses the self-attention mechanism and the convolutional neural network to learn the relationships among the scoring data of local areas of the high-dimensional sparse matrix, which helps to improve the prediction precision.
The invention has the beneficial effects that:
(1) The invention combines the self-attention mechanism with the discriminative autoencoder and provides a specific method for application in a recommendation system. A self-attention discriminative autoencoder extracts the distribution information of the scoring data from the mask matrix of the high-dimensional sparse matrix, providing more distribution information for subsequently learning the scoring data features and predicting the scores. Meanwhile, the model adopts a convolutional neural network to extract the distribution characteristics of local areas of the mask matrix. In addition, the dependency relationships among all data in the mask matrix are calculated with the self-attention mechanism to obtain the global distribution characteristics. Finally, the local and global distribution characteristics are fused to train the model, so that the distribution information of the mask matrix is obtained effectively and comprehensively.
(2) The invention establishes a prediction model based on the confrontation neural network to estimate the missing scoring data in the high-dimensional sparse matrix. The distribution information of the high-dimensional sparse matrix is fused with the scoring data as training data, and the self-attention mechanism is integrated into the generator model, which helps the confrontation neural network perceive the dependency relationships among the scoring data and learn the scoring data features better. Meanwhile, the mean square error between the predicted scoring data and the real scoring data is used as a regularization term of the objective function of the confrontation neural network, which improves the prediction precision of the model.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention, and that the invention is not limited to the specifically described embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and such changes and combinations remain within the scope of the invention.
Claims (9)
1. A recommendation system scoring prediction method based on a self-attention confrontation neural network is characterized by comprising the following steps:
S1: collect user information, item information and users' scoring data on items, and construct a high-dimensional sparse scoring matrix and a corresponding mask matrix;
S2: extract the distribution characteristics of the mask matrix with a self-attention encoder to generate distribution information about the high-dimensional sparse scoring matrix;
S3: build a scoring prediction model of the recommendation system with the self-attention confrontation neural network, and train the scoring prediction model according to the distribution information and the high-dimensional sparse scoring matrix;
S4: evaluate the high-dimensional sparse scoring matrix with the trained scoring prediction model to complete the prediction of users' scores on items;
the step S1 includes the following sub-steps:
S11: collect user information, item information and users' scoring data on items to obtain a user set {u_1, u_2, …, u_n}, an item set {i_1, i_2, …, i_m}, and a set of users' scores on items {s_{u,i}}, where n denotes the number of users, m denotes the number of items, u_1, u_2, …, u_n represent the 1st to n-th users, i_1, i_2, …, i_m represent the 1st to m-th items, s_{u,i} represents the score of user u on item i, and v represents the maximum score value;
S12: construct a high-dimensional sparse scoring matrix R from the set of users' scores on items, where each element r_{u,i} is defined as:

r_{u,i} = s_{u,i} if user u has scored item i, and r_{u,i} = 0 otherwise;

S13: according to the high-dimensional sparse scoring matrix R, construct a corresponding mask matrix H ∈ {0,1}^{n×m}, where each element h_{u,i} is defined as:

h_{u,i} = 1 if r_{u,i} is a known score, and h_{u,i} = 0 otherwise;
2. The method for predicting the scoring of the recommendation system based on the self-attention-confrontation neural network as claimed in claim 1, wherein the step S2 comprises the following sub-steps:
S21: set each row vector {h_1, …, h_u, …, h_n} in the mask matrix H to obey a first data distribution q(h);
S22: define a self-attention encoder that converts a sample h_u ∈ H from the first data distribution q(h) into a corresponding low-dimensional hidden feature representation z_u, where z_u is one sample of a second data distribution q(z);
S23: take the low-dimensional hidden feature representation z_u from the second data distribution q(z) as the input of a self-attention decoder, and generate a reconstructed sample ĥ_u of the sample h_u;
S24: calculate the reconstruction error rec_error between the sample h_u and the reconstructed sample ĥ_u;
S25: set a distribution p(z) with a known analytic expression, and train the self-attention encoder and the self-attention decoder according to the distance between the distribution p(z) and the second data distribution q(z);
S26: convert mask vectors into low-dimensional hidden feature representations conforming to the distribution p(z) with the trained self-attention encoder, and convert sample data of the distribution p(z) into distribution information about the high-dimensional sparse scoring matrix with the trained self-attention decoder.
3. The self-attention-directed neural network-based recommendation system score prediction method of claim 2, wherein the step S22 comprises the sub-steps of:
S221: randomly sample t mask vectors {h_u, h_{u+1}, …, h_{u+t}} from the mask matrix H with a mini-batch gradient descent algorithm to form an input matrix X;
S222: define a self-attention encoder, take the input matrix X as its input, and convert each sample h_u ∈ H from the first data distribution q(h) into the corresponding low-dimensional hidden feature representation z_u, where z_u is a sample of the second data distribution q(z).
4. The method according to claim 3, wherein in step S222, the self-attention encoder comprises a convolutional layer, a self-attention layer and a pooling layer;
the method for constructing the convolutional layer comprises: the convolutional layer contains K 1×1 convolution kernels, and the feature map E of the input matrix X is extracted with these kernels according to:

E_k = σ(X ∗ W_k + b_k), k = 1, …, K

where ∗ represents a two-dimensional convolution calculation, W_k represents the parameters of the k-th convolution kernel, b_k represents the bias of the k-th convolution kernel, and σ(·) represents the activation function;
the method for constructing the self-attention layer comprises: calculate a dependency matrix Y between the elements of the feature map E, and fuse the dependency matrix Y into the feature map E to obtain the fused feature I, where each element y_p of Y and the fused feature I are computed respectively as:

y_p = (1/γ(E)) Σ_q f(e_p, e_q) g(e_q)

I_k = E_k + σ(Y ∗ W_k^a + b_k^a), k = 1, …, K

where e_p represents the p-th element of the feature map E, e_q represents the q-th element of the feature map E, y_p represents the p-th element of Y, f(·) represents a function computing the similarity between any two points, g(·) represents a mapping function, γ(E) represents a normalization factor, W_k^a represents the parameters of the k-th self-attention layer convolution kernel, and b_k^a represents the bias of the k-th self-attention layer convolution kernel;
the method for constructing the pooling layer comprises the following steps: inputting the fusion feature I into a pooling layer with a pooling kernel size of c × c and a sliding stride of a, wherein the expression of the pooling layer is as follows:
Z=MeanPooling2D(I)
where MeanPooling2D(·) represents average pooling, Z represents the low-dimensional hidden feature matrix, and each row vector {z_u, z_{u+1}, …, z_{u+t}} denotes the low-dimensional hidden feature representation of the mask vectors {h_u, h_{u+1}, …, h_{u+t}}.
5. The method for predicting the recommender system score based on the self-attention confrontation neural network as claimed in claim 2, wherein in step S23, the reconstructed data {ĥ_u, ĥ_{u+1}, …, ĥ_{u+t}} corresponding to the samples {h_u, h_{u+1}, …, h_{u+t}} are generated and expressed with a matrix formed from the reconstructed data; the calculation proceeds layer by layer:

D_0 = Z
U_l = UpSampling2D(D_{l−1}), l = 1, …, L
D_l = σ(U_l ∗ W_k^l + b_k^l), k = 1, …, K

where D_0 represents the input of the model, D_{l−1} denotes the output of layer l−1, D_l represents the output of the l-th layer, U_l represents the output of the l-th upsampling layer, U_{L−1} represents the output of the (L−1)-th upsampling layer, K represents the number of convolution kernels, σ(·) represents the activation function, W_k^{l−1} represents the parameters of the k-th convolution kernel in layer l−1, b_k^{l−1} represents the bias of the k-th convolution kernel in layer l−1, L represents the number of deconvolution layers, Z represents the low-dimensional hidden feature matrix, UpSampling2D(·) represents the upsampling layer, W_k^l represents the parameters of the k-th convolution kernel in the l-th layer, and b_k^l represents the bias of the k-th convolution kernel in the l-th layer.
6. The method for predicting the scoring of the recommendation system based on the self-attention confrontation neural network as claimed in claim 2, wherein in step S24, the reconstruction error rec_error is calculated as:

rec_error = (1/t) Σ_u ‖h_u − ĥ_u‖²
7. The method for predicting the scoring of the recommendation system based on the self-attention confrontation neural network as claimed in claim 2, wherein the step S25 of calculating the distance C(p(z), q(z)) between the distribution p(z) of the known analytic expression and the second data distribution q(z) comprises the following sub-steps:
S251: set a distribution p(z) with a known analytic expression;
S252: establish a discriminator using a fully-connected neural network;
S253: randomly sample t low-dimensional hidden feature representations {z̃_1, …, z̃_t} from the distribution p(z) of the known analytic expression, and randomly sample t low-dimensional hidden feature representations {z_1, …, z_t} from the distribution q(z);
S254: take the low-dimensional hidden feature representations {z̃_1, …, z̃_t} and {z_1, …, z_t} as the input of the discriminator, and output the discrimination results;
S255: according to the discrimination results, calculate the distance C(p(z), q(z)) between the distribution p(z) of the known analytic expression and the second data distribution q(z) by the following formula:

C(p(z), q(z)) = E_{z∼q(z)}[log D(z)] + E_{z̃∼p(z)}[log(1 − D(z̃))]
8. The method for predicting the scoring of the recommendation system based on the self-attention confrontation neural network as claimed in claim 2, wherein in step S26, the objective function expression of the self-attention encoder is rec_error; the objective function expression of the discriminator is dis_loss = C(p(z), q(z)); and the objective function expression of the self-attention decoder is gen_loss = rec_error − E_{z̃∼p(z)}[log D(z̃)], where rec_error represents the reconstruction error between the sample h_u and the reconstructed sample ĥ_u, C(p(z), q(z)) represents the distance between the distribution p(z) of the known analytic expression and the second data distribution q(z), D(·) represents the discrimination result, log(·) represents the logarithmic function, and E_{z̃∼p(z)}[·] denotes the mathematical expectation computed over the distribution p(z).
9. The self-attention-directed neural network-based recommendation system score prediction method of claim 2, wherein the step S3 comprises the sub-steps of:
S31: set the row vectors {r_1, …, r_u, …, r_n} in the high-dimensional sparse scoring matrix R to obey the true data distribution p_r, and take the row vectors {r_1, …, r_u, …, r_n} as true score vectors, where r_u represents the score vector of user u;
S32: construct a self-attention confrontation neural network model, in which the generator uses a self-attention self-encoder structure, the discriminator uses a fully-connected neural network, and the prediction score vectors generated by the generator are set to obey the generator distribution p_g;
S33: sample to be reconstructedAnd a score vector r of the useruPerforming fusion to obtain fusion dataThe calculation formula for fusion is:
wherein h isuAn indication sample, < > indicates a logical operation;
S34: take the fused data as the input of the generator, and calculate the prediction score vector r̂_u of each user; the calculation formula is:

S35: take the prediction score vectors {r̂_1, …, r̂_n} and the true score vectors {r_1, …, r_n} as the input of the discriminator, discriminate the difference between sample data of the real data distribution p_r and of the generator distribution p_g and the real scoring data, and output a discrimination result;
S36: train the scoring prediction model according to the discrimination result until the scoring prediction model converges;
In step S35, a sparsification method is used to facilitate discrimination by the discriminator; its expression is:

r̃_u = r̂_u ⊙ h_u

where ⊙ denotes element-wise multiplication, so that only the predicted scores at positions with known scores are retained.
In step S36, the discriminator is trained according to the discrimination result, with the objective function:

J_Dis = E_{r̃_u∼p_g}[Dis(r̃_u)] − E_{r_u∼p_r}[Dis(r_u)]

where J_Dis represents the objective function of the discriminator, E_{r_u∼p_r}[·] denotes the mathematical expectation computed over the true data distribution p_r of r_u, E_{r̃_u∼p_g}[·] denotes the mathematical expectation computed over the generator distribution p_g, and Dis(·) represents the discrimination result of the discriminator on its input;
The generator is trained according to the discrimination result, with the objective function:

J_Gen = −E_{r̃_u∼p_g}[Dis(r̃_u)] + λψ, with ψ = Σ_{(u,i)∈Ω} (r_{u,i} − r̂_{u,i})²

where J_Gen represents the objective function of the generator, E_{r̃_u∼p_g}[·] denotes the mathematical expectation computed over the generator distribution p_g, λ denotes the regularization coefficient, ψ denotes the regularization term, r_{u,i} represents an element of the high-dimensional sparse scoring matrix R, r̂_{u,i} represents the predicted score of user u on item i, and Ω represents the known set of scoring data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110217932.1A CN112784173B (en) | 2021-02-26 | 2021-02-26 | Recommendation system scoring prediction method based on self-attention confrontation neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112784173A CN112784173A (en) | 2021-05-11 |
CN112784173B true CN112784173B (en) | 2022-06-10 |
Family
ID=75762027
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110217932.1A Active CN112784173B (en) | 2021-02-26 | 2021-02-26 | Recommendation system scoring prediction method based on self-attention confrontation neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112784173B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113486257B (en) * | 2021-07-01 | 2023-07-11 | 湖北工业大学 | Collaborative filtering convolutional neural network recommendation system and method based on adversarial matrix factorization |
CN114693624B (en) * | 2022-03-23 | 2024-07-26 | 腾讯科技(深圳)有限公司 | Image detection method, device, equipment and readable storage medium |
CN115225369B (en) * | 2022-07-15 | 2023-04-28 | 北京天融信网络安全技术有限公司 | Botnet detection method, device and equipment |
CN118333054B (en) * | 2024-06-12 | 2024-08-23 | 之江实验室 | Text-to-text system and method based on local-global attention |
Family Cites Families (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102129463A (en) * | 2011-03-11 | 2011-07-20 | 北京航空航天大学 | Project correlation fused and probabilistic matrix factorization (PMF)-based collaborative filtering recommendation system |
CN102789499B (en) * | 2012-07-16 | 2015-08-12 | 浙江大学 | Based on the collaborative filtering method of implicit relationship situated between article |
CN103942288B (en) * | 2014-04-10 | 2017-02-08 | 南京邮电大学 | Service recommendation method based on user risk preferences |
CN105844261A (en) * | 2016-04-21 | 2016-08-10 | 浙江科技学院 | 3D palmprint sparse representation recognition method based on optimization feature projection matrix |
CN106055873A (en) * | 2016-05-20 | 2016-10-26 | 北京旷视科技有限公司 | Fitness auxiliary method and apparatus based on image recognition |
CN106446015A (en) * | 2016-08-29 | 2017-02-22 | 北京工业大学 | Video content access prediction and recommendation method based on user behavior preference |
CN107122722A (en) * | 2017-04-19 | 2017-09-01 | 大连理工大学 | A kind of self-adapting compressing track algorithm based on multiple features |
CN107273349B (en) * | 2017-05-09 | 2019-11-22 | 清华大学 | A kind of entity relation extraction method and server based on multilingual |
EP3622521A1 (en) * | 2017-10-16 | 2020-03-18 | Illumina, Inc. | Deep convolutional neural networks for variant classification |
CN108595550A (en) * | 2018-04-10 | 2018-09-28 | 南京邮电大学 | A kind of music commending system and recommendation method based on convolutional neural networks |
CN108563640A (en) * | 2018-04-24 | 2018-09-21 | 中译语通科技股份有限公司 | A kind of multilingual pair of neural network machine interpretation method and system |
CN108665308A (en) * | 2018-05-07 | 2018-10-16 | 华东师范大学 | Score in predicting method and apparatus |
CN108874790A (en) * | 2018-06-29 | 2018-11-23 | 中译语通科技股份有限公司 | A kind of cleaning parallel corpora method and system based on language model and translation model |
CN109522372A (en) * | 2018-11-21 | 2019-03-26 | 北京交通大学 | The prediction technique of civil aviaton field passenger value |
CN109784806B (en) * | 2018-12-27 | 2023-09-19 | 北京航天智造科技发展有限公司 | Supply chain control method, system and storage medium |
CN111160016B (en) * | 2019-04-15 | 2022-05-03 | 深圳碳云智能数字生命健康管理有限公司 | Semantic recognition method and device, computer readable storage medium and computer equipment |
CN110188351B (en) * | 2019-05-23 | 2023-08-25 | 鼎富智能科技有限公司 | Sentence smoothness and syntax scoring model training method and device |
CN110196946B (en) * | 2019-05-29 | 2021-03-30 | 华南理工大学 | Personalized recommendation method based on deep learning |
US11144721B2 (en) * | 2019-05-31 | 2021-10-12 | Accenture Global Solutions Limited | System and method for transforming unstructured text into structured form |
CN110442781B (en) * | 2019-06-28 | 2023-04-07 | 武汉大学 | Pair-level ranking item recommendation method based on generation countermeasure network |
CN110866637B (en) * | 2019-11-06 | 2022-07-05 | 湖南大学 | Scoring prediction method, scoring prediction device, computer equipment and storage medium |
CN111061951A (en) * | 2019-12-11 | 2020-04-24 | 华东师范大学 | Recommendation model based on double-layer self-attention comment modeling |
CN111126864A (en) * | 2019-12-26 | 2020-05-08 | 中国地质大学(武汉) | Street quality assessment method based on man-machine confrontation score |
CN111191718B (en) * | 2019-12-30 | 2023-04-07 | 西安电子科技大学 | Small sample SAR target identification method based on graph attention network |
CN112328900A (en) * | 2020-11-27 | 2021-02-05 | 北京工业大学 | Deep learning recommendation method integrating scoring matrix and comment text |
- 2021-02-26: CN application CN202110217932.1A filed; granted as patent CN112784173B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN112784173A (en) | 2021-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112784173B (en) | Recommendation system scoring prediction method based on self-attention confrontation neural network | |
CN112446591B (en) | Zero sample evaluation method for student comprehensive ability evaluation | |
CN105975573B (en) | A kind of file classification method based on KNN | |
CN111859680A (en) | Comprehensive evaluation method for system performance | |
CN113298230B (en) | Prediction method based on unbalanced data set generated against network | |
CN112541532B (en) | Target detection method based on dense connection structure | |
CN111222847B (en) | Open source community developer recommendation method based on deep learning and unsupervised clustering | |
CN103714148B (en) | SAR image search method based on sparse coding classification | |
CN110880369A (en) | Gas marker detection method based on radial basis function neural network and application | |
CN115047421A (en) | Radar target identification method based on Transformer | |
CN112685591A (en) | Accurate picture retrieval method for user interest area and feedback guidance | |
CN114926693A (en) | SAR image small sample identification method and device based on weighted distance | |
CN114580262A (en) | Lithium ion battery health state estimation method | |
CN115907122A (en) | Regional electric vehicle charging load prediction method | |
Wang et al. | Classification and extent determination of rock slope using deep learning | |
CN111898822B (en) | Charging load interval prediction method based on multi-correlation-day scene generation | |
Salman et al. | Creating a cutting-edge neurocomputing model with high precision | |
CN109063095A (en) | A kind of weighing computation method towards clustering ensemble | |
CN117371511A (en) | Training method, device, equipment and storage medium for image classification model | |
CN117786441A (en) | Multi-scene photovoltaic user electricity consumption behavior analysis method based on improved K-means clustering algorithm | |
Wen et al. | Short-term load forecasting based on feature mining and deep learning of big data of user electricity consumption | |
Mendez-Ruiz et al. | SuSana Distancia is all you need: Enforcing class separability in metric learning via two novel distance-based loss functions for few-shot image classification | |
CN118211494B (en) | Wind speed prediction hybrid model construction method and system based on correlation matrix | |
JP2020035042A (en) | Data determination device, method, and program | |
Liu et al. | A hybrid model integrating improved fuzzy c-means and optimized mixed kernel relevance vector machine for classification of coal and gas outbursts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||