CN112508351B - Strong robustness item recommendation method, system, device and medium in attack environment - Google Patents

Info

Publication number
CN112508351B
Authority
CN
China
Prior art keywords
rating
user
matrix
prediction
item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011299425.9A
Other languages
Chinese (zh)
Other versions
CN112508351A (en)
Inventor
刘培玉
丁琦
朱振方
徐富永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University
Priority to CN202011299425.9A
Publication of CN112508351A
Application granted
Publication of CN112508351B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Algebra (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a strong robustness item recommendation method, system, device and medium in an attack environment. The method comprises: converting a one-dimensional vector of users' item ratings into a two-dimensional rating matrix; predicting and filling the null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix; describing the non-sparse rating matrix as a user rating matrix and an item rating matrix; learning a rating-based user representation and a rating-based item representation from the user rating matrix and the item rating matrix; obtaining a rating prediction matrix from the rating-based user representation and the rating-based item representation; obtaining the probability that a user is detected as an attacking user from the error between the rating prediction matrix and the real ratings, the rating-based user representation and the user ID embedding; adding this probability to the loss used to train rating prediction, so as to control the proportion that attacking users contribute to the rating prediction, and obtaining a user rating prediction result; and generating an item recommendation list from the user rating prediction result.

Description

Strong robustness item recommendation method, system, device and medium in attack environment
Technical Field
The present application relates to the field of item recommendation technologies, and in particular to a strong robustness item recommendation method, system, device and medium in an attack environment.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
The emergence of recommendation systems is closely tied to the progress of the times: many business models have moved from offline to online, large amounts of data occupy system resources, and users cannot select quickly and accurately from massive data. Prediction and recommendation accuracy is a basic problem of a recommendation system, and the ratings or comments of a large number of real users truly reflect user preferences and the level of items. Because a recommendation system is open to a certain degree and has other loopholes, some individuals or groups, driven by commercial interest, market penetration or other motives, inject false scores or comments into the recommendation system, thereby disturbing its recommendation accuracy and fairness. Existing recommendation system attacks have some obvious characteristics: for example, the ratings given by an attacking user are close to the average score of all items but differ greatly from the average score of the attack target item, and attacking users are highly similar to real users. At present there are three main ways to prevent and control attacks on a recommendation system. First, enhancing the robustness of the recommendation algorithm on the premise that attacks are known to exist, i.e. the stability or accuracy of recommendation performance when attacks are present. Second, effectively detecting attacking users, malicious scores and comments, and false profiles. Third, eliminating attacking users at the source, which requires a more complete user screening mechanism in the system.
The combination of deep learning with collaborative filtering across different knowledge domains has advanced the prediction and recommendation performance of recommendation systems. Deep learning can learn deep feature representations of users and items from multi-source heterogeneous data sets through a linear deep neural network structure and then map these representations into the same space to obtain a unified representation, so that item levels can be predicted and recommended more accurately. Because attacks and fraud do occur in real recommendation systems and a simple recommendation system may not be strongly robust to them, researchers are paying increasing attention to the robustness of recommendation systems.
Disclosure of Invention
To overcome the deficiencies of the prior art, the application provides a strong robustness item recommendation method, system, device and medium in an attack environment. Previous robustness research has mainly focused on detecting different types of recommendation attacks; the present method combines attacking-user detection with rating prediction and recommendation, so that the robustness of the recommendation system is enhanced in an environment where recommendation attacks exist, and the performance degradation caused by data sparsity in the recommendation system is alleviated to a certain extent.
In a first aspect, the application provides a strong robustness item recommendation method in an attack environment;
the strong robustness item recommendation method in an attack environment comprises the following steps:
acquiring a one-dimensional rating vector of users for items, and converting the one-dimensional rating vector into a two-dimensional rating matrix;
performing prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix;
describing the non-sparse rating matrix as a user rating matrix and an item rating matrix, respectively; learning a rating-based user representation and a rating-based item representation from the user rating matrix and the item rating matrix, respectively; obtaining a rating prediction matrix according to the rating-based user representation and the rating-based item representation;
detecting the user according to the error between the rating prediction matrix and the real rating, the rating-based user representation and the user ID embedding, to obtain the probability that the user is detected as an attacking user;
adding the probability that the user is detected as an attacking user to the loss used in rating prediction training, so as to control the proportion that attacking users contribute to the rating prediction and obtain a final user rating prediction result;
and generating an item recommendation list according to the user rating prediction result.
In a second aspect, the present application provides a strong robustness item recommendation system in an attack environment;
a strong robustness item recommendation system in an attack environment comprises:
an acquisition module configured to: acquire a one-dimensional rating vector of users for items, and convert the one-dimensional rating vector into a two-dimensional rating matrix;
a fill module configured to: perform prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix;
a user representation and item representation learning module configured to: describe the non-sparse rating matrix as a user rating matrix and an item rating matrix, respectively; learn a rating-based user representation and a rating-based item representation from the user rating matrix and the item rating matrix, respectively; obtain a rating prediction matrix according to the rating-based user representation and the rating-based item representation;
a detection module configured to: detect the user according to the error between the rating prediction matrix and the real rating, the rating-based user representation and the user ID embedding, to obtain the probability that the user is detected as an attacking user;
a prediction module configured to: add the probability that the user is detected as an attacking user to the loss used in rating prediction training, so as to control the proportion that attacking users contribute to the rating prediction and obtain a final user rating prediction result;
a generation module configured to: generate an item recommendation list according to the user rating prediction result.
In a third aspect, the present application further provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs being stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first aspect.
In a fourth aspect, the present application also provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the method of the first aspect.
In a fifth aspect, the present application also provides a computer program (product) comprising a computer program for implementing the method of any one of the preceding first aspects when run on one or more processors.
Compared with the prior art, the beneficial effects of this application are:
(1) The present disclosure applies attack detection results to score prediction to control the impact of attacking users' scores on the score prediction.
(2) The present disclosure proposes an AVE-SVD method, based on the SVD method, to predict the non-interacted entries of the user-item rating matrix more accurately, thereby reducing the sparsity of the rating matrix and laying a foundation for subsequent work.
(3) In the rating prediction component of the present disclosure, rating prediction is performed after a multi-layer neural network has learned deep features of the user ratings and the item ratings separately.
(4) The present disclosure adds an attacking-user detection component that detects users; the probability it finally outputs that a user is an attacking user is applied to the rating prediction component to control the contribution of attacking users to rating prediction, thereby improving prediction accuracy.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a flow diagram of the method of an embodiment of the present disclosure;
FIG. 2 is an overall block diagram of an embodiment of the present disclosure;
FIG. 3 is a detailed framework diagram of the operation of the rating matrix generation and AVE-SVD prediction filling component of an embodiment of the present disclosure;
FIG. 4 is a detailed framework diagram of the operation of the rating prediction and attack user detection component of an embodiment of the present disclosure.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and "comprising", and any variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
Embodiment one
This embodiment provides a strong robustness item recommendation method in an attack environment;
as shown in fig. 1, the strong robustness item recommendation method in an attack environment includes:
S1: acquiring a one-dimensional rating vector of users for items, and converting the one-dimensional rating vector into a two-dimensional rating matrix;
S2: performing prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix;
S3: describing the non-sparse rating matrix as a user rating matrix and an item rating matrix, respectively; learning a rating-based user representation and a rating-based item representation from the user rating matrix and the item rating matrix, respectively; obtaining a rating prediction matrix according to the rating-based user representation and the rating-based item representation;
S4: detecting the user according to the error between the rating prediction matrix and the real rating, the rating-based user representation and the user ID embedding, to obtain the probability that the user is detected as an attacking user;
S5: adding the probability that the user is detected as an attacking user to the loss used in rating prediction training, so as to control the proportion that attacking users contribute to the rating prediction and obtain a final user rating prediction result;
S6: generating an item recommendation list according to the user rating prediction result.
As one or more embodiments, the S1: acquiring a one-dimensional rating vector of users for items, and converting the one-dimensional rating vector into a two-dimensional rating matrix; the specific implementation is as follows:
let the number of users in the data set be N, the number of items be M, and I denote the total number of ratings in the data set; the generated user-item rating matrix is then an N×M matrix $A_{N,M}$. Here n denotes a row of the user-item rating matrix, 1 ≤ n ≤ N, and m denotes a column of the user-item rating matrix, 1 ≤ m ≤ M. The matrix generation formula is as follows:
$$A_{n,m} = \mathrm{Rating}_{(n-1) \times M + m} \qquad (1)$$
where $A_{n,m}$ is the rating value in the n-th row and m-th column of the generated rating matrix, and $\mathrm{Rating}_{(n-1) \times M + m}$ is the rating score in the initial rating vector. Null values in the rating vector also occupy a vector position and are marked as 0. The specific process is shown in fig. 3.
Illustratively, in the data set, users' ratings of items are stored as a one-dimensional vector. The one-dimensional rating vector means that each user's scores for the items that user has interacted with are stored sequentially, each as a single real number.
It should be understood that converting the one-dimensional rating vector into a two-dimensional rating matrix gathers information that belongs to the same dimension. To make the rating profile of a user or an item easy to view at a glance, the present disclosure converts the rating vector into a two-dimensional user-item rating matrix.
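Illustratively, the conversion of formula (1) can be sketched in Python as follows. This is a minimal sketch, not the disclosed implementation: it assumes the rating vector already has N×M entries with 0 marking a null value, and the function name `vector_to_matrix` is hypothetical.

```python
def vector_to_matrix(ratings, num_users, num_items):
    """Reshape a flat rating vector into an N x M matrix in row-major order."""
    if len(ratings) != num_users * num_items:
        raise ValueError("rating vector must have N * M entries")
    # 0-based equivalent of A[n][m] = Rating[(n-1)*M + m] in 1-based notation
    return [list(ratings[n * num_items:(n + 1) * num_items])
            for n in range(num_users)]


flat = [5, 0, 3, 0, 4, 0]  # 2 users x 3 items, 0 marks a missing rating
print(vector_to_matrix(flat, num_users=2, num_items=3))  # [[5, 0, 3], [0, 4, 0]]
```

Each row of the result then holds one user's ratings and each column holds one item's ratings.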
As one or more embodiments, the S2: performing prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix; the specific implementation is as follows:
decomposing the two-dimensional rating matrix into a user latent vector matrix and an item latent vector matrix by using a matrix decomposition method;
and performing prediction filling on null values in the two-dimensional rating matrix based on the user latent vector matrix, the item latent vector matrix, the user attribute values, the item attribute values and the mean score of each item, to obtain a non-sparse rating matrix.
It should be understood that predicting and filling the null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix reduces the sparsity of the data and, to a certain extent, mitigates the loss of recommendation performance caused by matrix sparsity.
Illustratively, the two-dimensional rating matrix is decomposed into a user latent vector matrix and an item latent vector matrix by using a matrix decomposition method; specifically:
matrix decomposition is performed on the user-item matrix $A_{N,M}$ to obtain a user latent vector matrix $P_{N,K}$ and an item latent vector matrix $Q_{K,M}$:
$$A_{N,M} = P_{N,K}\, Q_{K,M} \qquad (2)$$
Illustratively, null values in the two-dimensional rating matrix are prediction-filled based on the user latent vector matrix, the item latent vector matrix, the user attribute values, the item attribute values and the mean score of each item; specifically:
$$\hat{R}_{n,m} = \bar{R}_m + b_n + b_m + p_n q_m \qquad (3)$$
where $\bar{R}_m$ is the mean score of the m-th item, $b_n$ is the attribute value of user n, and $b_m$ is the attribute value of item m; $p_n$ is the n-th row of the user latent vector matrix $P_{N,K}$, and $q_m$ is the m-th column of the item latent vector matrix $Q_{K,M}$.
The present disclosure improves the SVD prediction filling method by defining the predicted score $\hat{R}_{n,m}$, building on formula (2), as shown in formula (3). Since scoring habits vary from user to user, the present disclosure adds the preference information of users and items to the prediction: on top of the basic predicted score of matrix decomposition, several kinds of additional information are added to improve the accuracy of score prediction.
The loss function of the input layer is computed from the predicted value and the real score value $R_{n,m}$. To effectively avoid overfitting, a regularization term is added to the loss function to penalize the parameters, where λ is the regularization coefficient:
Figure BDA0002786366760000084
The non-sparse rating matrix is generated after the AVE-SVD component fills the missing values of the user-item rating matrix, which helps improve the performance of prediction and attacking-user detection in the subsequent method layer. The implementation is shown in the input layer of fig. 3.
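Illustratively, the prediction filling step can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: it assumes that the predicted score is the sum of the item mean score, the user attribute value, the item attribute value and the product of the user and item latent vectors, that null values are stored as 0, and that P, Q and the attribute values have already been learned. The function names `item_means` and `fill_missing` are hypothetical.

```python
def item_means(A):
    """Mean of the observed (non-zero) ratings in each column of A."""
    means = []
    for m in range(len(A[0])):
        observed = [row[m] for row in A if row[m] != 0]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def fill_missing(A, P, Q, b_user, b_item):
    """Keep observed ratings; replace each 0 with
    item mean + user attribute + item attribute + p_n . q_m (assumed form)."""
    means = item_means(A)
    K = len(Q)  # number of latent factors
    filled = []
    for n, row in enumerate(A):
        new_row = []
        for m, r in enumerate(row):
            if r != 0:
                new_row.append(float(r))
            else:
                dot = sum(P[n][k] * Q[k][m] for k in range(K))
                new_row.append(means[m] + b_user[n] + b_item[m] + dot)
        filled.append(new_row)
    return filled


A = [[5, 0], [3, 4]]              # user 0 has not rated item 1
P = [[1.0], [0.5]]                # N x K user latent vectors (K = 1)
Q = [[0.2, 0.4]]                  # K x M item latent vectors
filled = fill_missing(A, P, Q, b_user=[0.1, -0.1], b_item=[0.0, 0.2])
print(round(filled[0][1], 2))     # 4.7
```

Only the null entries change, so the real ratings remain available for the later error computation.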
As one or more embodiments, the S3: describing the non-sparse rating matrix as a user rating matrix and an item rating matrix, respectively; the specific implementation is as follows:
taking each row of the non-sparse rating matrix, i.e. each user, as a unit, the non-sparse user-item rating matrix is described as a user rating matrix; taking each column of the non-sparse rating matrix, i.e. each item, as a unit, it is described as an item rating matrix.
Deep feature learning is then performed separately on the users and the items of the non-sparse rating matrix through a multilayer neural network, and the predicted score of each user for each item is calculated with a latent factor model.
As one or more embodiments, the S3: learning a rating-based user representation and a rating-based item representation from the user rating matrix and the item rating matrix, respectively; the specific implementation is as follows:
learning the rating-based user representation based on the user rating matrix and a Multi-Layer Perceptron (MLP);
learning the rating-based item representation based on the item rating matrix and a multi-layer perceptron (MLP).
As one or more embodiments, the S3: obtaining a rating prediction matrix according to the rating-based user representation and the rating-based item representation; the specific implementation is as follows:
a rating prediction matrix is obtained from the rating-based user representation and the rating-based item representation using a Latent Factor Model (LFM). The implementation is shown in fig. 4.
Illustratively, the non-sparse rating matrix output by the input layer serves as the input of the method layer, and the user-item rating matrix is represented as a user-rating matrix Ru and an item-rating matrix Ri, so that MLP models can be trained to learn feature representations for users and items respectively.
The MLP uses a structure of multiple hidden layers to mine deep feature representations of users and items and thereby improve training accuracy and performance; the number of hidden layers is set to d. The inputs of the MLPs are row n′ of Ru and column m′ of Ri respectively, and training yields the score-based deep feature representation $E_n$ of the user and the score-based deep feature representation $E_m$ of the item:
$$E_n = \sigma(W_d\, n'^{(d-1)} + b_d) \qquad (5)$$
$$E_m = \sigma(W_d\, m'^{(d-1)} + b_d) \qquad (6)$$
where σ is the activation function, d is the number of hidden layers set in the MLP, and W and b are the parameters of the hidden layers in the MLP.
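Illustratively, the forward pass of formulas (5) and (6) can be sketched as follows. This is a minimal sketch, not the disclosed implementation: it assumes a sigmoid activation, reads the superscript (d-1) as the output of the previous hidden layer, and uses hand-written weights in place of trained ones. The function names `dense` and `mlp` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(x, W, b):
    """One hidden layer: sigma(W x + b), with W given as a list of rows."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def mlp(x, layers):
    """Apply the hidden layers in order; the last output is the representation."""
    for W, b in layers:
        x = dense(x, W, b)
    return x


user_row = [5.0, 4.7, 3.0]                 # one row of Ru (hypothetical values)
layers = [
    ([[0.1, 0.0, -0.1], [0.0, 0.2, 0.0]], [0.0, 0.0]),  # layer 1: 3 -> 2
    ([[0.5, -0.5]], [0.1]),                               # layer 2: 2 -> 1
]
E_n = mlp(user_row, layers)                # rating-based user representation
print(len(E_n))                            # 1
```

The same function applied to a column of Ri with a separate set of layers gives the item representation.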
$E_n$ and $E_m$ output by the MLPs are input into the LFM model. The present disclosure uses the LFM model to predict the score of the user for the item
Figure BDA0002786366760000091
and generate a new user-item rating matrix:
Figure BDA0002786366760000101
The score prediction formula consists of four terms: the first term takes the form of a dot product of the user feature representation and the item feature representation, in which
Figure BDA0002786366760000102
is the parameter matrix of the LFM; $bis_n$, $bis_m$ and $bis$ denote the user bias, the item bias and the global bias, respectively.
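Illustratively, a four-term score prediction of this kind can be sketched as follows. This is a sketch under an explicit assumption, not the disclosed formula: it takes the first term to be a weighted dot product of the two representations, with one LFM weight per latent dimension, and adds the user bias, item bias and global bias. The function name `lfm_predict` and the weight vector `W` are hypothetical.

```python
def lfm_predict(E_n, E_m, W, bis_n, bis_m, bis):
    """Predicted score = weighted dot product of the user and item
    representations + user bias + item bias + global bias (assumed form)."""
    interaction = sum(w * u * i for w, u, i in zip(W, E_n, E_m))
    return interaction + bis_n + bis_m + bis


score = lfm_predict(E_n=[1.0, 2.0], E_m=[3.0, 4.0], W=[1.0, 1.0],
                    bis_n=0.5, bis_m=0.25, bis=3.0)
print(score)  # 14.75
```

Evaluating this function for every user-item pair yields the new user-item rating prediction matrix.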
As one or more embodiments, the S4: detecting the user according to the error between the rating prediction matrix and the real rating, the rating-based user representation and the user ID embedding, to obtain the probability that the user is detected as an attacking user; the specific implementation is as follows:
the user is detected by a trained deep Neural Decision Forest (NDF) according to the rating-based user representation, the user ID embedding and the error between the rating prediction matrix and the real rating, to obtain the probability that the user is detected as an attacking user.
Each user is examined according to this partial information about the user, producing a detection result indicating whether the user is suspected of being an attacking user.
Illustratively, a Neural Decision Forest (NDF) is used as the attacking-user detection component to examine all users in the data set. The attack detection component is a forest consisting of several binary classification trees. Because the depth of the forest is fixed and the forest has a certain model expression capability, the detection process has a certain interpretability. The component finally outputs, for every input sample user, the probability that the user is an attacking user, which supports the subsequent work of the present disclosure. The operation of this step is shown in fig. 4.
As shown in formula (9), the input to this component consists of three parts: the rating-based user representation $E_n$, the error error′ between the prediction result of the prediction component and the real score, and the user ID embedding.
Figure BDA0002786366760000103
where $H(n)$ is the set of items rated by user n.
The three inputs are concatenated as $E'_n$ and passed through a connection layer to form a dense representation of the information, finally yielding the input of the NDF component
Figure BDA0002786366760000104
Figure BDA0002786366760000111
where σ is the activation function, and $W_{E'}$ and $b_{E'}$ are the weight and bias terms.
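Illustratively, assembling the detector input can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed formula: it takes error′ to be the mean absolute difference between real and predicted scores over the items H(n) that the user rated, marks unrated items with 0, and omits the connection layer that follows the concatenation. The function names `user_error` and `detector_input` are hypothetical.

```python
def user_error(true_row, pred_row):
    """Mean absolute prediction error over H(n), the items the user rated."""
    rated = [m for m, r in enumerate(true_row) if r != 0]
    if not rated:
        return 0.0
    return sum(abs(true_row[m] - pred_row[m]) for m in rated) / len(rated)


def detector_input(E_n, error, id_embedding):
    """Concatenate the three parts in series into one input vector."""
    return list(E_n) + [error] + list(id_embedding)


err = user_error([5, 0, 3], [4.0, 2.0, 3.5])   # items 0 and 2 were rated
print(err)                                      # 0.75
print(detector_input([0.1, 0.2], err, [0.3]))   # [0.1, 0.2, 0.75, 0.3]
```

A user whose real ratings deviate strongly from the predictions gets a large error component, which is one of the signals the detector can use.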
Suppose the forest is a set $F = \{T_1, T_2, \ldots, T_G\}$ of G binary trees. The non-leaf nodes of the forest are decision nodes, indexed by $\mathcal{D}$, with each decision node $d \in \mathcal{D}$; all leaf nodes are prediction nodes, indexed by $\mathcal{P}$, with each prediction node $p \in \mathcal{P}$. For each tree $T_g$ in the forest ($g \in [1, G]$), each prediction node holds a class probability distribution $\pi_p$ over the class label $y \in \{0, 1\}$, where y = 0 and y = 1 mean that the user is detected as a normal user and as an attacking user, respectively. The final prediction result is given by $\pi_{py}$, with $\pi_{p1} = P(y = 1)$ and $\pi_{p0} = P(y = 0)$. The probability that tree $T_g$ in the forest classifies its input as y is as follows:
Figure BDA0002786366760000112
where $\pi_{py}$ is the probability that a sample reaching prediction node p is classified as y. The decision function of decision node d,
Figure BDA0002786366760000113
decides whether the input is routed to the left or to the right, where
Figure BDA0002786366760000114
The parenthesized part of formula (11) is explained as follows: if node p lies in the left subtree of node d, then
Figure BDA0002786366760000115
is true, i.e.
Figure BDA0002786366760000116
If node p lies in the right subtree of node d, then
Figure BDA0002786366760000117
is true, i.e.
Figure BDA0002786366760000118
The decision function is expressed as follows:
Figure BDA0002786366760000119
where $\sigma(x) = (1 + e^{-x})^{-1}$ is the sigmoid activation function, and the input
Figure BDA00027863667600001110
is assigned the weights
Figure BDA00027863667600001111
The final prediction result is the average of the prediction results output by the prediction nodes of all trees in the forest; here the probability that the user is an attacking user (y = 1) is taken:
Figure BDA0002786366760000121
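Illustratively, the soft routing and tree averaging described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: each tree is a complete binary tree stored in breadth-first (heap) order, each decision node routes left with probability sigmoid(w · x) and right with the complement, and the weights and leaf probabilities are hand-written rather than trained. All function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tree_attack_probability(x, decision_weights, leaf_probs):
    """P(y = 1) for one tree: sum over leaves of (reach probability * pi_p1).

    decision_weights: one weight vector per decision node, heap order.
    leaf_probs: P(y = 1) stored at each prediction node, heap order.
    """
    n_internal = len(decision_weights)
    if len(leaf_probs) != n_internal + 1:
        raise ValueError("a binary tree with k decision nodes has k + 1 leaves")
    reach = [0.0] * (2 * n_internal + 1)   # probability of reaching each node
    reach[0] = 1.0
    for d, w in enumerate(decision_weights):
        go_left = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        reach[2 * d + 1] = reach[d] * go_left          # left child
        reach[2 * d + 2] = reach[d] * (1.0 - go_left)  # right child
    leaf_reach = reach[n_internal:]
    return sum(mu * pi for mu, pi in zip(leaf_reach, leaf_probs))


def forest_attack_probability(x, trees):
    """Average the per-tree probabilities over all G trees."""
    return sum(tree_attack_probability(x, w, p) for w, p in trees) / len(trees)


trees = [([[1.0]], [0.2, 0.8]), ([[0.0]], [0.0, 1.0])]  # two depth-1 trees
print(forest_attack_probability([0.0], trees))           # 0.5
```

Because every routing decision is a probability rather than a hard split, the output is differentiable with respect to the decision weights, which is what allows such a forest to be trained together with a neural network.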
As one or more embodiments, S5 (adding the probability that the user is detected as an attacking user to the loss training of the rating prediction, so as to control the proportion of the attacking user's contribution to the rating prediction and obtain the final user rating prediction result) is implemented as follows:
The probability that the user is detected as an attacking user is taken as part of the loss function of the rating prediction; the higher this probability, the smaller the user's value in the loss function.
The sum of squared differences between the real values and the predicted values forms the main body of the loss function, and the attack-user detection probability is added to the loss function to control the contribution of attacking users to the rating prediction, thereby improving the accuracy of the rating prediction.
The objective loss function of the rating prediction part, Loss_rating, is set as follows:
Figure BDA0002786366760000122
In equation (14), λ is the parameter of the regularization term, and all parameters in the model are denoted Θ. The model in this disclosure iteratively updates its parameters according to the loss function until they reach optimal values, so that the model reaches its best prediction performance through continued training.
The detection results are used for rating prediction training to control the contribution of real users and attack users to rating prediction.
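Equation (14) is available only as a figure, so its exact form is not reproduced here. As a rough sketch of the idea described in the text, one plausible form weights each user's squared error by (1 - p_u), where p_u is the detected attack probability, and adds an L2 penalty on the parameters. The weighting scheme, the function name `rating_loss`, and the default `lam` value are assumptions made for illustration.

```python
import numpy as np

def rating_loss(R_true, R_pred, p_attack, params, lam=0.01):
    """Squared-error loss down-weighted by attack probability (assumed form).

    R_true, R_pred : (n_users, n_items) real and predicted ratings.
    p_attack       : (n_users,) probability that each user is an attacker.
    params         : list of parameter arrays, penalized with an L2 term.
    """
    weights = 1.0 - p_attack                  # assumption: likely attackers count less
    sq_err = (R_true - R_pred) ** 2
    data_term = np.sum(weights[:, None] * sq_err)
    reg_term = lam * sum(np.sum(p ** 2) for p in params)
    return data_term + reg_term

R_true = np.array([[5.0, 3.0], [1.0, 1.0]])
R_pred = np.array([[4.0, 3.0], [3.0, 3.0]])
loss_clean = rating_loss(R_true, R_pred, np.array([0.0, 0.0]), params=[])
loss_flagged = rating_loss(R_true, R_pred, np.array([0.0, 0.9]), params=[])
```

In this toy example the second user's large errors dominate the loss when no one is flagged, and contribute only a tenth as much once that user is detected as an attacker with probability 0.9.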
As one or more embodiments, S6 (generating an item recommendation list according to the user rating prediction result) is implemented as follows:
After the final predicted ratings combined with the attack detection result are generated, the rating matrix is processed: the mean rating of each item,
Figure BDA0002786366760000123
is taken to represent the rating level of the item and serves as the information on the item's interaction with users; the mean ratings of all items are sorted in descending order, and the top X items are taken to generate the recommendation list.
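The list generation described above can be sketched as follows. This is a minimal illustration that assumes the final predicted ratings are held in a dense NumPy array with one row per user and one column per item; the function name `recommend_top_x` is hypothetical.

```python
import numpy as np

def recommend_top_x(R_final, x):
    """Return the indices of the x items with the highest mean predicted rating."""
    item_means = R_final.mean(axis=0)               # mean rating of each item (column)
    order = np.argsort(-item_means, kind="stable")  # descending order of mean rating
    return order[:x].tolist()

R_final = np.array([
    [4.0, 2.0, 5.0],
    [3.0, 1.0, 4.0],
])
top_items = recommend_top_x(R_final, 2)
```

Here the item means are 3.5, 1.5 and 4.5, so the two recommended items are item 2 followed by item 0.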
As one or more embodiments, the method is implemented using a user rating prediction model.
As shown in Fig. 2, the user rating prediction model includes an input layer and a method layer connected in sequence;
the input layer includes: a matrix generation component and an AVE-SVD component;
the method layer includes: multi-layer perceptrons MLP, which perform feature representation learning on the user rating matrix and the item rating matrix respectively, a latent semantic model LFM, and a deep neural decision forest NDF.
Wherein the AVE-SVD component is configured to predictively populate a sparse user-item rating matrix to become a non-sparse user-item rating matrix;
the output end of the AVE-SVD component is connected to the input ends of the two perceptrons MLP in the method layer respectively;
the output end of the perceptron MLP that performs feature representation learning on the user rating matrix is connected to the input end of the latent semantic model LFM;
the output end of the perceptron MLP that performs feature representation learning on the item rating matrix is connected to the input end of the latent semantic model LFM;
the output end of the latent semantic model LFM and the output end of the perceptron MLP that performs feature representation learning on the user rating matrix are both connected to the input end of the deep neural decision forest NDF;
the output end of the deep neural decision forest NDF is connected to the input end of the latent semantic model LFM;
the AVE-SVD (Average-Singular Value Decomposition) component denotes singular value decomposition augmented with additional information and the average item rating. Its working principle is: the matrix is decomposed into a user hidden vector matrix and an item hidden vector matrix; a row of the user hidden vector matrix is multiplied by a column of the item hidden vector matrix to obtain the predicted rating of a user for an item; the attribute information of the user and the item and the average ratings of the different items are added to the prediction formula so that the rating is predicted more comprehensively; and the objective function, namely the error between the predicted rating and the real rating, is minimized by gradient descent.
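The working principle of the AVE-SVD component can be illustrated with a simplified sketch. It assumes the prediction rule r_hat[u, i] = item_mean[i] + P[u] · Q[i] trained by stochastic gradient descent; the user and item attribute terms mentioned in the text are omitted because their exact form is not given here. The function name and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np

def fill_with_item_mean_mf(R, k=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Fill NaN entries of R using matrix factorization biased by item means.

    Simplified rule (assumption): r_hat[u, i] = item_mean[i] + P[u] . Q[i].
    Assumes every item has at least one observed rating.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    observed = ~np.isnan(R)
    item_mean = np.nanmean(R, axis=0)              # average rating of each item
    P = rng.normal(scale=0.1, size=(n_users, k))   # user hidden vectors
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item hidden vectors
    users, items = np.nonzero(observed)
    for _ in range(epochs):
        for u, i in zip(users, items):
            err = R[u, i] - (item_mean[i] + P[u] @ Q[i])
            p_old = P[u].copy()
            # Gradient descent step on the regularized squared error.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    R_hat = item_mean + P @ Q.T
    # Keep observed ratings, fill only the null entries with predictions.
    return np.where(observed, R, R_hat)

R = np.array([
    [5.0, 3.0, np.nan],
    [4.0, np.nan, 1.0],
    [np.nan, 2.0, 2.0],
])
filled = fill_with_item_mean_mf(R)
```

The returned matrix keeps every observed rating unchanged and replaces each null value with a prediction, which corresponds to turning the sparse user-item rating matrix into a non-sparse one.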
As one or more embodiments, the user rating prediction model is put into use only after it has been trained.
The training of the user rating prediction model comprises the following steps:
constructing the user rating prediction model;
constructing a training set, which comprises one-dimensional item rating vectors of users with known user rating results;
inputting the training set into the user rating prediction model and training the model, stopping when the overall loss function of the model reaches its minimum, to obtain the trained user rating prediction model.
The overall loss function of the model is formulated as:
Loss_RMPD = Loss_AVE-SVD + Loss_rating (15)
Because the present disclosure contains two rating prediction parts (the first uses AVE-SVD to predict item ratings in the input layer; the second generates the final rating prediction in the method layer through the cooperation of the prediction component and the detection component), and because the prediction performance of the first part affects the prediction performance in the method layer, the two predictions are trained jointly, with the final goal of minimizing the sum of the two loss functions. λ denotes the regularization parameter, and all parameters in the model are denoted Θ.
To verify the validity of the present disclosure, this embodiment uses two data sets: the YelpZip data set and the Amazon data set. The YelpZip data set aggregates users' reviews and ratings of restaurants in contiguous U.S. states, starting from New York State, and the Amazon data set collects users' ratings and reviews of movies and television shows. The rating interval of both data sets is [1, 5], and ratings increase with the user's degree of preference for the item.
The evaluation metrics used for verification are the root mean square error (RMSE) and the mean absolute error (MAE); the smaller the metric value, the better the final rating prediction performance. They are calculated as follows:
Figure BDA0002786366760000151
Figure BDA0002786366760000152
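The two metric formulas appear only as figures. For reference, the standard definitions of RMSE and MAE can be computed as in the following minimal sketch; the function names and the sample ratings are illustrative and are not results from the patent.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y_true - y_pred)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: mean(|y_true - y_pred|)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

y_true = [5, 3, 1, 4]   # illustrative real ratings
y_pred = [4, 3, 3, 4]   # illustrative predicted ratings
rmse_value = rmse(y_true, y_pred)
mae_value = mae(y_true, y_pred)
```

RMSE penalizes large individual errors more strongly than MAE because the errors are squared before averaging, which is why the two metrics are usually reported together.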
Example two
The embodiment provides a strong robustness item recommendation system in an attack environment;
a strong robustness item recommendation system in an attack environment comprises:
an acquisition module configured to: acquire a one-dimensional rating vector of a user for an item, and convert the one-dimensional rating vector into a two-dimensional rating matrix;
a fill module configured to: performing prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix;
a user representation and item representation learning module configured to: describe the non-sparse rating matrix as a user rating matrix and an item rating matrix, respectively; learn a corresponding rating-based user representation and a corresponding rating-based item representation from the user rating matrix and the item rating matrix, respectively; and obtain a rating prediction matrix according to the rating-based user representation and the rating-based item representation;
a detection module configured to: detecting the user according to the error between the rating prediction matrix and the real rating, the user representation based on the rating and the user embedded ID to obtain the probability that the user is detected as an attack user;
a prediction module configured to: adding the probability of the user detected as the attacking user into loss training of rating prediction to control the contribution proportion of the attacking user to the rating prediction so as to obtain a final user rating prediction result;
a generation module configured to: generate an item recommendation list according to the user rating prediction result.
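As an illustration of the acquisition module's conversion step, the following minimal sketch builds a two-dimensional rating matrix from flat rating records. The record layout (user index, item index, rating) and the function name `to_rating_matrix` are assumptions; the text does not specify the exact format of the one-dimensional rating vectors.

```python
import numpy as np

def to_rating_matrix(records, n_users, n_items):
    """Build an (n_users, n_items) matrix; unrated entries stay NaN (null)."""
    R = np.full((n_users, n_items), np.nan)
    for user, item, rating in records:
        R[user, item] = rating
    return R

# Hypothetical records: (user index, item index, rating in [1, 5]).
records = [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0)]
R = to_rating_matrix(records, n_users=2, n_items=3)
```

The NaN entries mark the null values that the filling module would then populate by prediction.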
It should be noted here that the acquisition module, the filling module, the user representation and item representation learning module, the detection module, the prediction module and the generation module correspond to steps S1 to S6 in the first embodiment. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should also be noted that the modules described above, as part of a system, may be executed in a computer system, for example as a set of computer-executable instructions.
In the foregoing embodiments, the descriptions of the embodiments have different emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The proposed system can be implemented in other ways. The system embodiments described above are merely illustrative; for example, the division into the modules above is only a logical division, and other divisions are possible in actual implementation: multiple modules may be combined or integrated into another system, or some features may be omitted or not executed.
Example three
The present embodiment further provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein, a processor is connected with the memory, the one or more computer programs are stored in the memory, and when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first embodiment.
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software.
The method in the first embodiment may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, details are not described here.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Example four
The present embodiments also provide a computer-readable storage medium for storing computer instructions, which when executed by a processor, perform the method of the first embodiment.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (9)

1. The strong robustness item recommendation method under the attack environment is characterized by comprising the following steps:
acquiring a one-dimensional rating vector of a user for an item, and converting the one-dimensional rating vector into a two-dimensional rating matrix;
carrying out prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix;
describing the non-sparse rating matrix as a user rating matrix and an item rating matrix, respectively; learning a corresponding rating-based user representation and a corresponding rating-based item representation from the user rating matrix and the item rating matrix, respectively; obtaining a rating prediction matrix according to the rating-based user representation and the rating-based item representation;
detecting the user according to the error between the rating prediction matrix and the real rating, the user representation based on the rating and the user embedded ID to obtain the probability that the user is detected as an attack user;
adding the probability of the user detected as the attacking user into loss training of rating prediction to control the contribution proportion of the attacking user to the rating prediction so as to obtain a final user rating prediction result; the specific implementation mode is as follows:
the probability that the user is detected as the attacking user is used as one part of a loss function of the rating prediction, and if the probability that the user is detected as the attacking user is higher, the numerical value of the user in the loss function is smaller;
using the sum of the squared differences between the real value and the predicted value as a main body of a loss function, and adding an attack user detection probability result into the loss function to control the contribution of the attack user to the rating prediction so as to improve the accuracy of the rating prediction;
or,
the method is implemented using a user rating prediction model; the user rating prediction model comprises: an input layer and a method layer connected in sequence;
the input layer includes: a matrix generation component and an AVE-SVD component;
the method layer comprises: multi-layer perceptrons MLP for performing feature representation learning on the user rating matrix and the item rating matrix respectively, a latent semantic model LFM, and a deep neural decision forest NDF;
wherein the AVE-SVD component is configured to predictively populate a sparse user-item rating matrix to become a non-sparse user-item rating matrix;
the output end of the AVE-SVD component is connected to the input ends of the two perceptrons MLP in the method layer respectively;
the output end of the perceptron MLP that performs feature representation learning on the user rating matrix is connected to the input end of the latent semantic model LFM;
the output end of the perceptron MLP that performs feature representation learning on the item rating matrix is connected to the input end of the latent semantic model LFM;
the output end of the latent semantic model LFM and the output end of the perceptron MLP that performs feature representation learning on the user rating matrix are both connected to the input end of the deep neural decision forest NDF;
the output end of the deep neural decision forest NDF is connected to the input end of the latent semantic model LFM;
or,
the AVE-SVD layer denotes singular value decomposition augmented with additional information and the average item rating, and its working principle is as follows: decomposing the matrix into a user hidden vector matrix and an item hidden vector matrix, multiplying a row of the user hidden vector matrix by a column of the item hidden vector matrix to obtain the predicted rating of a user for an item, adding the attribute information of the user and the item and the average ratings of the different items to the prediction formula to predict the rating more comprehensively, and minimizing the objective function, namely the error between the predicted rating and the real rating, by gradient descent;
or,
the training of the user rating prediction model comprises the following steps:
constructing the user rating prediction model;
constructing a training set, which comprises one-dimensional item rating vectors of users with known user rating results;
inputting the training set into the user rating prediction model and training the model, stopping when the overall loss function of the model reaches its minimum, to obtain the trained user rating prediction model;
and generating an item recommendation list according to the user rating prediction result.
2. The method for recommending items with strong robustness in the attack environment according to claim 1, wherein null values in a two-dimensional rating matrix are subjected to predictive filling to obtain a non-sparse rating matrix; the specific implementation mode is as follows:
decomposing the two-dimensional rating matrix into a user hidden vector matrix and an item hidden vector matrix by using a matrix decomposition method;
and based on the user hidden vector matrix, the item hidden vector matrix, the user attribute values, the item attribute values and the mean rating of the item, carrying out prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix.
3. The strong robustness item recommendation method in the attack environment as recited in claim 1, wherein the non-sparse rating matrix is described as a user rating matrix and an item rating matrix, respectively; the specific implementation mode is as follows:
the non-sparse user-item rating matrix is described as a user rating matrix by taking a row unit in the non-sparse rating matrix, namely each user as a unit, and the non-sparse user-item rating matrix is described as an item rating matrix by taking a column unit in the non-sparse rating matrix, namely each item as a unit.
4. The method for recommending items with strong robustness under attack environment according to claim 1, wherein a user rating matrix and an item rating matrix are used to learn a rating-based user representation and a rating-based item representation respectively; the specific implementation mode is as follows:
learning a rating-based user representation based on the user rating matrix and the multi-layer perceptron MLP;
learning a rating-based item representation based on the item rating matrix and the multi-layer perceptron MLP.
5. The method of recommending items with strong robustness in the attack environment as recited in claim 1, wherein a rating prediction matrix is obtained based on a rating-based user representation and a rating-based item representation; the specific implementation mode is as follows:
obtaining the rating prediction matrix from the rating-based user representation and the rating-based item representation using the latent semantic model LFM.
6. The method of recommending items with strong robustness in an attack environment according to claim 1, wherein the user is detected according to an error between the rating prediction matrix and the true rating, a user representation based on the rating, and a user embedded ID, to obtain a probability that the user is detected as an attacking user; the specific implementation mode is as follows:
and detecting the user based on the trained deep neural decision forest NDF according to the user representation based on the rating, the user embedded ID and the error between the rating prediction matrix and the real rating to obtain the probability that the user is detected as an attack user.
7. The strong robustness item recommendation system under the attack environment is characterized by comprising:
an acquisition module configured to: acquiring a one-dimensional rating vector of a user for an item, and converting the one-dimensional rating vector into a two-dimensional rating matrix;
a fill module configured to: carrying out prediction filling on null values in the two-dimensional rating matrix to obtain a non-sparse rating matrix;
a user representation and item representation learning module configured to: describing the non-sparse rating matrix as a user rating matrix and an item rating matrix, respectively; learning a corresponding rating-based user representation and a corresponding rating-based item representation from the user rating matrix and the item rating matrix, respectively; obtaining a rating prediction matrix according to the rating-based user representation and the rating-based item representation;
a detection module configured to: detecting the user according to the error between the rating prediction matrix and the real rating, the user representation based on the rating and the user embedded ID to obtain the probability that the user is detected as an attack user;
a prediction module configured to: adding the probability of the user detected as the attacking user into loss training of rating prediction to control the contribution proportion of the attacking user to the rating prediction so as to obtain a final user rating prediction result; the specific implementation mode is as follows:
taking the probability of the user being detected as an attacking user as a part of a loss function of rating prediction, wherein if the probability of the user being detected as the attacking user is higher, the numerical value of the user in the loss function is smaller;
using the sum of the square differences between the true value and the predicted value as a main body of a loss function, and adding an attack user detection probability result into the loss function to control the contribution of the attack user to the rating prediction so as to improve the rating prediction accuracy;
or,
the method is implemented using a user rating prediction model; the user rating prediction model comprises: an input layer and a method layer connected in sequence;
the input layer includes: a matrix generation component and an AVE-SVD component;
the method layer comprises: multi-layer perceptrons MLP for performing feature representation learning on the user rating matrix and the item rating matrix respectively, a latent semantic model LFM, and a deep neural decision forest NDF;
wherein the AVE-SVD component is configured to predictively populate a sparse user-item rating matrix to become a non-sparse user-item rating matrix;
the output end of the AVE-SVD component is connected to the input ends of the two perceptrons MLP in the method layer respectively;
the output end of the perceptron MLP that performs feature representation learning on the user rating matrix is connected to the input end of the latent semantic model LFM;
the output end of the perceptron MLP that performs feature representation learning on the item rating matrix is connected to the input end of the latent semantic model LFM;
the output end of the latent semantic model LFM and the output end of the perceptron MLP that performs feature representation learning on the user rating matrix are both connected to the input end of the deep neural decision forest NDF;
the output end of the deep neural decision forest NDF is connected to the input end of the latent semantic model LFM;
or,
the AVE-SVD layer denotes singular value decomposition augmented with additional information and the average item rating, and its working principle is as follows: decomposing the matrix into a user hidden vector matrix and an item hidden vector matrix, multiplying a row of the user hidden vector matrix by a column of the item hidden vector matrix to obtain the predicted rating of a user for an item, adding the attribute information of the user and the item and the average ratings of the different items to the prediction formula to predict the rating more comprehensively, and minimizing the objective function, namely the error between the predicted rating and the real rating, by gradient descent;
or,
the training of the user rating prediction model comprises the following steps:
constructing the user rating prediction model;
constructing a training set, which comprises one-dimensional item rating vectors of users with known user rating results;
inputting the training set into the user rating prediction model and training the model, stopping when the overall loss function of the model reaches its minimum, to obtain the trained user rating prediction model;
a generation module configured to: generating an item recommendation list according to the user rating prediction result.
8. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs being stored in the memory, the processor executing the one or more computer programs stored in the memory when the electronic device is running, to cause the electronic device to perform the method of any of the preceding claims 1-6.
9. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of any one of claims 1 to 6.
CN202011299425.9A 2020-11-19 2020-11-19 Strong robustness item recommendation method, system, device and medium in attack environment Active CN112508351B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011299425.9A CN112508351B (en) 2020-11-19 2020-11-19 Strong robustness item recommendation method, system, device and medium in attack environment

Publications (2)

Publication Number Publication Date
CN112508351A CN112508351A (en) 2021-03-16
CN112508351B CN112508351B (en) 2022-12-30

Family

ID=74958117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011299425.9A Active CN112508351B (en) 2020-11-19 2020-11-19 Strong robustness item recommendation method, system, device and medium in attack environment

Country Status (1)

Country Link
CN (1) CN112508351B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807154A (en) * 2019-11-08 2020-02-18 内蒙古工业大学 Recommendation method and system based on hybrid deep learning model
CN111460316A (en) * 2020-03-20 2020-07-28 南京邮电大学 Knowledge system-oriented personalized recommendation method and computer storage medium

Also Published As

Publication number Publication date
CN112508351A (en) 2021-03-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant