CN106874427B - Item association-based trust attack detection method - Google Patents

Publication number
CN106874427B
Authority
CN
China
Prior art keywords
matrix
user
users
item
scoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710057846.2A
Other languages
Chinese (zh)
Other versions
CN106874427A (en)
Inventor
李巧巧
陈百基
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201710057846.2A
Publication of CN106874427A
Application granted
Publication of CN106874427B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for detecting shilling attacks (trust attacks) based on item association, which comprises the following steps: performing an item correlation calculation on the mixed user scoring matrix R to obtain the item correlation matrix A of R; finding the neighbor users of a target user in R and removing the target user and its neighbor users to obtain a new mixed user scoring matrix r; repeating the search for every target user; performing the item correlation calculation on each new mixed user scoring matrix r to obtain its item correlation matrix a; calculating the Euclidean distance between the item correlation matrix A and each item correlation matrix a; and accumulating the Euclidean distances onto each user and its neighbor users in R, and finally filtering out the attacking users. The invention obtains difference values by calculating correlation matrices and sorts those values in order to filter out attack users, which improves detection accuracy and compensates for the shortcomings of existing detection methods.

Description

Item association-based trust attack detection method
Technical Field
The invention relates to the field of machine learning, and in particular to a method for detecting shilling attacks (trust attacks) based on item association.
Background
With the development of the Internet, the amount of information online has grown dramatically. It is difficult for people to quickly locate target content in this mass of information, and the utilization rate of the information falls as a result. How to deliver high-quality recommendations to users under "information overload" has therefore become a research focus. Collaborative filtering is one of the most widely applied recommendation algorithms because it offers efficient and convenient personalized recommendation. It analyzes the existing information about users and, starting from individual characteristics such as a user's preferences and needs, finds users similar to the target user so that items closer to the target user's taste can be recommended.
However, the convenience and openness of collaborative filtering recommender systems also put their security at risk. To advance their own interests, dishonest merchants can add to the system fake users who rate certain items highly or poorly, so that the recommendation results include abnormal items. An attack that pursues an illegitimate goal by adding fake user profiles in this way is called a shilling attack (also translated as a "trust attack"); common examples are the random attack and the average attack.
In practical applications, collaborative filtering recommender systems are highly vulnerable to such attacks, so the security of these systems has become a popular research topic. At present, shilling attack detection mainly removes attacking users before recommendations are generated, by analyzing characteristics such as the rating values of real and fake users. From the perspective of machine learning, detection methods fall into three categories according to how they detect: supervised, unsupervised, and semi-supervised. Supervised methods extract features from each user profile, label the profiles, and detect attackers with a classifier such as a support vector machine. Feature-extraction methods perform well only when the filler size of the attack is large; they perform poorly when the attack profiles have a small filler size, and they need a large amount of prior knowledge for training. In addition to profile features such as rating length and rating variation, researchers have combined signal recognition techniques with detection, but that approach is effective only for noisy data and its detection accuracy is not ideal.
Researchers then proposed unsupervised methods, for example detection based on principal component analysis (PCA). Because attacking users share similar rating patterns, the similarity among attacking users is high while the similarity among real users is low. Expressed as covariance, the covariance among fake users is low, the covariance between fake and real users is low, and the covariance among real users is high; the fake users are filtered out after the principal components are extracted. This method, however, requires the number of attacking users to be known in order to set the number of users to filter, and that setting strongly affects the detection result. Its detection performance also drops as the filler size of the attack profiles grows.
When only a small number of labeled users are available, supervised learning performs poorly, which led to semi-supervised methods. These consist of two parts: a classifier is trained on a small amount of labeled data, and unlabeled data are then added to the classifier iteratively. The method needs continuous iteration for adjustment; besides being time-consuming, choices such as the number of iterations make detection harder.
Moreover, existing detection methods are all based on similarity between users, user rating values, and similar perspectives. They do not directly consider the relationships between rated items, so if an attacker designs an attack from that perspective, existing methods lose some of their effectiveness.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides an item association-based shilling attack detection method, which obtains difference values by calculating correlation matrices and sorts those values in order to filter out attack users, thereby improving detection accuracy and compensating for the shortcomings of existing detection methods.
To solve the above technical problem, the invention provides the following technical solution: an item association-based shilling attack detection method comprising the following steps:
S1, performing an item correlation calculation on the mixed user scoring matrix R to obtain the item correlation matrix A of R;
S2, finding the neighbor users of a target user in the mixed user scoring matrix R, and removing the target user and its neighbor users to obtain a new mixed user scoring matrix r; repeating the search with each user in R as the target user until all users have been processed, which yields a plurality of new mixed user scoring matrices r;
S3, performing the item correlation calculation on each new mixed user scoring matrix r to obtain the item correlation matrix a of that matrix r;
S4, calculating the Euclidean distance between the item correlation matrix A and each of the different item correlation matrices a;
S5, accumulating the Euclidean distances onto each user and its neighbor users in the mixed user scoring matrix R, and filtering out the attacking users.
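Step S4 compares two item correlation matrices of the same size. The text does not spell out how a distance between matrices is formed, so the following is only a minimal sketch that assumes the entry-wise Euclidean distance (square every element-wise difference, sum, take the square root). The function name `matrix_euclidean_distance` and the toy matrices are illustrative and do not come from the patent.

```python
import math


def matrix_euclidean_distance(a_full, a_reduced):
    """Entry-wise Euclidean distance between two equally sized matrices.

    Assumption: the matrices are compared element by element, which is
    one plausible reading of step S4, not a detail stated in the source.
    """
    if len(a_full) != len(a_reduced):
        raise ValueError("matrices must have the same shape")
    total = 0.0
    for row_full, row_reduced in zip(a_full, a_reduced):
        if len(row_full) != len(row_reduced):
            raise ValueError("matrices must have the same shape")
        for p, q in zip(row_full, row_reduced):
            total += (p - q) ** 2
    return math.sqrt(total)


# Toy 2 x 2 correlation matrices (hypothetical values).
A = [[1.0, 0.5], [0.5, 1.0]]   # correlation matrix of the full matrix R
a = [[1.0, 0.1], [0.1, 1.0]]   # correlation matrix after removing one user group
distance = matrix_euclidean_distance(A, a)
print(distance)
```

One distance of this kind would be produced for every reduced matrix r, that is, one per target user.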
Further, step S1 specifically comprises:
S11, converting the mixed user scoring matrix R into a new matrix R', in which rated entries are 1 and unrated entries are 0;
S12, expanding each item in the new matrix R' and adding the expanded item to R' to obtain the co-rated entries of that item and the other items;
S13, calculating the similarity over the co-rated entries using the Pearson similarity, with the formula:
$$\mathrm{sim}(X, Y) = \frac{\sum_{u}(X_u - \bar{X})(Y_u - \bar{Y})}{\sqrt{\sum_{u}(X_u - \bar{X})^2}\,\sqrt{\sum_{u}(Y_u - \bar{Y})^2}}$$
where X denotes item X and $\bar{X}$ is the mean of item X; Y denotes item Y and $\bar{Y}$ is the mean of item Y; $X_u$ and $Y_u$ are the ratings of user u for the two items, and the sums run over the users who rated both items;
S14, repeating steps S12 to S13 until the item correlation matrix A is generated.
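Steps S11 to S14 can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the patented implementation: a rating of 0 is taken to mean "unrated", the item means are computed over the co-rated entries only, and a pair of items with fewer than two co-ratings (or with constant ratings) is assigned a similarity of 0. The helper names are hypothetical.

```python
import math


def pearson_on_corated(col_x, col_y):
    """Pearson similarity of two item columns over users who rated both."""
    # Assumption: 0 marks an unrated entry.
    pairs = [(x, y) for x, y in zip(col_x, col_y) if x != 0 and y != 0]
    if len(pairs) < 2:
        return 0.0  # assumption: too few co-ratings to measure correlation
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    den_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    den_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    if den_x == 0 or den_y == 0:
        return 0.0  # assumption: constant ratings are treated as uncorrelated
    return num / (den_x * den_y)


def item_correlation_matrix(ratings):
    """Build the n x n item correlation matrix from an m x n rating matrix."""
    n_items = len(ratings[0])
    columns = [[row[j] for row in ratings] for j in range(n_items)]
    return [
        [pearson_on_corated(columns[i], columns[j]) for j in range(n_items)]
        for i in range(n_items)
    ]


# Toy mixed rating matrix: 4 users (rows) x 3 items (columns), 0 = unrated.
R = [
    [5, 4, 0],
    [3, 2, 1],
    [1, 0, 5],
    [4, 3, 2],
]
A = item_correlation_matrix(R)
for row in A:
    print([round(value, 3) for value in row])
```

In this toy matrix, items 0 and 1 rise and fall together for the users who rated both, while items 0 and 2 move in opposite directions, so the corresponding entries of A have opposite signs.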
Further, step S2 specifically comprises:
S21, setting the number k of neighbor users in the mixed user scoring matrix R;
S22, calculating the k neighbor users of each user using the KNN method, as follows:
S221, calculating the distance between each of the other users and the target user according to the formula:
$$d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$$
where $x_i$ is a rating of the target user, $y_i$ is the corresponding rating of the neighbor user, and m is the number of columns of the original mixed scoring matrix, i.e. the number of ratings;
S222, sorting the calculated distances and finding the k users nearest to the target user;
S23, removing the target user and its neighbor users to obtain a new mixed user scoring matrix r.
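Steps S21 to S23 amount to a nearest-neighbor search followed by row removal. The sketch below assumes a plain Euclidean distance over full rating rows with unrated entries stored as 0, and it breaks distance ties by user index (Python's sort is stable). These choices and the function names are assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two users' rating rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def k_nearest_neighbors(ratings, target, k):
    """Indices of the k users closest to `target`, excluding the target."""
    others = [i for i in range(len(ratings)) if i != target]
    # Stable sort: users at equal distance keep their index order.
    others.sort(key=lambda i: euclidean(ratings[target], ratings[i]))
    return others[:k]


def remove_target_and_neighbors(ratings, target, k):
    """Return the reduced matrix r and the neighbor indices of `target`."""
    neighbors = k_nearest_neighbors(ratings, target, k)
    dropped = {target, *neighbors}
    reduced = [row for i, row in enumerate(ratings) if i not in dropped]
    return reduced, neighbors


# Toy mixed rating matrix: 5 users x 3 items, 0 = unrated (assumption).
R = [
    [5, 5, 0],
    [5, 4, 0],
    [1, 0, 5],
    [0, 1, 5],
    [5, 5, 1],
]
r, neighbors = remove_target_and_neighbors(R, target=0, k=2)
print(neighbors)
print(r)
```

Running the same call once per user yields the plurality of reduced matrices r that step S2 describes.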
Further, step S3 specifically comprises:
S31, converting the new mixed user scoring matrix r into a new matrix r', in which rated entries are 1 and unrated entries are 0;
S32, expanding each item in the new matrix r' and adding the expanded item to r' to obtain the co-rated entries of that item and the other items;
S33, calculating the similarity over the co-rated entries using the Pearson similarity, with the formula:
$$\mathrm{sim}(X, Y) = \frac{\sum_{u}(X_u - \bar{X})(Y_u - \bar{Y})}{\sqrt{\sum_{u}(X_u - \bar{X})^2}\,\sqrt{\sum_{u}(Y_u - \bar{Y})^2}}$$
where X denotes item X and $\bar{X}$ is the mean of item X; Y denotes item Y and $\bar{Y}$ is the mean of item Y; $X_u$ and $Y_u$ are the ratings of user u for the two items, and the sums run over the users who rated both items;
S34, repeating steps S32 to S33 until the item correlation matrix a is generated.
Further, filtering out the attacking users in step S5 specifically comprises: sorting the users by their accumulated distances in ascending order and filtering out the first N users with the shortest distances, these N users being regarded as the attacking users.
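The accumulation and filtering of step S5 can be sketched as follows. The sketch assumes that `distances[t]` holds the matrix distance obtained when user t was the target, and that `neighbor_lists[t]` holds the neighbors removed together with t; both names, and the toy numbers, are hypothetical.

```python
def accumulate_scores(n_users, distances, neighbor_lists):
    """Add each target's distance to the target and to each of its neighbors."""
    scores = [0.0] * n_users
    for target in range(n_users):
        for user in (target, *neighbor_lists[target]):
            scores[user] += distances[target]
    return scores


def flag_attackers(scores, top_n):
    """Indices of the top_n users with the smallest accumulated distance."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ascending
    return order[:top_n]


# Toy inputs for 4 users (hypothetical values).
distances = [0.1, 0.2, 0.9, 0.8]        # one matrix distance per target user
neighbor_lists = [[1], [0], [3], [2]]   # k = 1 neighbor per target user
scores = accumulate_scores(4, distances, neighbor_lists)
suspects = flag_attackers(scores, top_n=2)
print(scores)
print(suspects)
```

The value of N is a parameter that has to be chosen by the operator; the text does not prescribe how.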
With the above technical solution, the invention has at least the following beneficial effects:
(1) the invention provides a new detection method from the perspective of item correlation, compensating for the shortcomings of existing detection methods;
(2) because most existing attack methods select items to rate at random, the method has wide applicability;
(3) the method detects common attacks well and also improves the detection of AOP attacks, which existing detection methods find difficult to defend against;
(4) compared with supervised and semi-supervised learning methods, the method does not need a large amount of prior knowledge and can learn directly from the mixed matrix to filter out attack users.
Drawings
FIG. 1 is a flowchart of the steps of the item association-based shilling attack detection method of the present invention;
FIG. 2 is a schematic diagram of the detection results for the random attack obtained by the present invention and two other detection methods;
FIG. 3 is a schematic diagram of the detection results for the average (mean value) attack, showing the accuracy of each detection algorithm as the scale of the attacking users' ratings increases;
FIG. 4 shows the detection results of the present invention for the AOP attack.
Detailed Description
It should be noted that the embodiments of the present application, and the features of those embodiments, may be combined with each other provided they do not conflict. The present application is described in further detail below with reference to the drawings and specific embodiments.
The invention relates to an item association-based shilling attack detection method which, as shown in FIG. 1, comprises the following steps:
1. Calculate the correlation between the items in the mixed matrix using the items' co-rated entries. The collected original matrix is a data set containing only real users; the mixed scoring matrix is obtained by adding fake users, generated by an existing attack method, to that data set. For example, attack profiles are generated with the random attack method and added to the original matrix, and the resulting data set is called the mixed matrix;
1) convert the mixed user scoring matrix R (containing real users and attack users) into a new matrix R', in which rated entries are 1 and unrated entries are 0;
2) expand each item in R' and add the expanded item to R' to obtain the co-rated entries of that item. In MATLAB, "expansion" means that the single column of an item is repeated to generate a matrix of the same size as R'; when the two matrices are added, an entry equal to 2 marks a user who rated both this item and the other item;
3) calculate the similarity over the co-rated entries using the Pearson similarity:
$$\mathrm{sim}(X, Y) = \frac{\sum_{u}(X_u - \bar{X})(Y_u - \bar{Y})}{\sqrt{\sum_{u}(X_u - \bar{X})^2}\,\sqrt{\sum_{u}(Y_u - \bar{Y})^2}}$$
4) repeat steps 2) to 3) until the item correlation matrix is generated; if the original mixed scoring matrix has size m x n (m users, n items), the item correlation matrix has size n x n;
2. For each target user, find its neighbor users and remove the target user together with those neighbors:
1) set the number k of neighbor users;
2) in the mixed matrix, obtain the k neighbor users of each user with KNN:
A. calculate the distance between each of the other users and the target user according to the formula:
$$d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$$
where $x_i$ is a rating of the target user, $y_i$ is the corresponding rating of the neighbor user, and m is the number of columns of the original mixed scoring matrix, i.e. the number of ratings (this usage of m differs from the m x n size given in step 1, where the number of columns is n);
B. sort the distances obtained in step A and find the k users nearest to the target user;
3) remove the target user and its neighbor users to obtain a new mixed matrix r;
3. Calculate the correlation matrix differences:
1) calculate the item correlation matrix of each result r from step 2 3), using the same calculation method as in step 1;
2) calculate the Euclidean distance between the item correlation matrix of R and the item correlation matrix of each r;
4. Filter the first N attack users:
1) accumulate the result obtained in step 3 2) onto the corresponding target user and its neighbor users;
2) sort the users' accumulated distances in ascending order and filter out the first N users with the shortest distances.
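The "expansion" described in step 1 2) can be mimicked without MATLAB. The sketch below assumes that 0 means "unrated"; it builds the 0/1 mask R', repeats one item's column across every column, adds the result to R', and marks the entries equal to 2. The helper names are illustrative only.

```python
def rated_mask(ratings):
    """R': 1 where a rating exists, 0 where it does not (0 = unrated)."""
    return [[1 if value != 0 else 0 for value in row] for row in ratings]


def corated_mask(mask, item):
    """Repeat column `item` across all columns, add it to the mask,
    and mark the entries whose sum equals 2 (both items were rated)."""
    expanded = [[row[item]] * len(row) for row in mask]
    summed = [
        [e + m for e, m in zip(e_row, m_row)]
        for e_row, m_row in zip(expanded, mask)
    ]
    return [[cell == 2 for cell in row] for row in summed]


# Toy rating matrix: 3 users x 3 items.
R = [
    [5, 4, 0],
    [3, 0, 1],
    [0, 2, 5],
]
mask = rated_mask(R)
corated_with_item0 = corated_mask(mask, item=0)
for row in corated_with_item0:
    print(row)
```

Column j of the result tells which users rated both item 0 and item j; those are the rows over which the Pearson similarity of the two items would be computed.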
The detection results of the proposed method for the random attack are shown in FIG. 2. Compared with the PCA and SVM detection methods, the new method is not affected by the attack size or the filler size, and it achieves the best performance on the same data set.
The detection results of the new method for the popular attack are shown in FIG. 3, where the new method has the best performance.
The detection results of the proposed method for the AOP (average-over-popular) attack are shown in FIG. 4. As the attack size increases, the accuracy of PCA detection drops markedly. Because this attack also takes the correlation between items into account, the new method is suitable for it as well and performs better.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various equivalent changes, modifications, substitutions and alterations can be made herein without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims (3)

1. An item association-based shilling attack detection method, characterized by comprising the following steps:
S1, performing an item correlation calculation on the mixed user scoring matrix R to obtain the item correlation matrix A of R, specifically:
S11, converting the mixed user scoring matrix R into a new matrix R', in which rated entries are 1 and unrated entries are 0;
S12, expanding each item in the new matrix R' and adding the expanded item to R' to obtain the co-rated entries of that item and the other items;
S13, calculating the similarity over the co-rated entries using the Pearson similarity, with the formula:
$$\mathrm{sim}(X, Y) = \frac{\sum_{u}(X_u - \bar{X})(Y_u - \bar{Y})}{\sqrt{\sum_{u}(X_u - \bar{X})^2}\,\sqrt{\sum_{u}(Y_u - \bar{Y})^2}}$$
where X denotes item X and $\bar{X}$ is the mean of item X; Y denotes item Y and $\bar{Y}$ is the mean of item Y; $X_u$ and $Y_u$ are the ratings of user u for the two items, and the sums run over the users who rated both items;
S14, repeating steps S12 to S13 until the item correlation matrix A is generated;
S2, finding the neighbor users of a target user in the mixed user scoring matrix R, and removing the target user and its neighbor users to obtain a new mixed user scoring matrix r; repeating the search with each user in R as the target user until all users have been processed, which yields a plurality of new mixed user scoring matrices r;
S3, performing the item correlation calculation on each new mixed user scoring matrix r to obtain the item correlation matrix a of that matrix r, specifically:
S31, converting the new mixed user scoring matrix r into a new matrix r', in which rated entries are 1 and unrated entries are 0;
S32, expanding each item in the new matrix r' and adding the expanded item to r' to obtain the co-rated entries of that item and the other items;
S33, calculating the similarity over the co-rated entries using the Pearson similarity, with the formula:
$$\mathrm{sim}(X, Y) = \frac{\sum_{u}(X_u - \bar{X})(Y_u - \bar{Y})}{\sqrt{\sum_{u}(X_u - \bar{X})^2}\,\sqrt{\sum_{u}(Y_u - \bar{Y})^2}}$$
where X denotes item X and $\bar{X}$ is the mean of item X; Y denotes item Y and $\bar{Y}$ is the mean of item Y; $X_u$ and $Y_u$ are the ratings of user u for the two items, and the sums run over the users who rated both items;
S34, repeating steps S32 to S33 until the item correlation matrix a is generated;
S4, calculating the Euclidean distance between the item correlation matrix A and each of the different item correlation matrices a;
S5, accumulating the Euclidean distances onto each user and its neighbor users in the mixed user scoring matrix R, and filtering out the attacking users.
2. The item association-based shilling attack detection method according to claim 1, wherein step S2 specifically comprises:
S21, setting the number k of neighbor users in the mixed user scoring matrix R;
S22, calculating the k neighbor users of each user using the KNN method, as follows:
S221, calculating the distance between each of the other users and the target user according to the formula:
$$d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$$
where $x_i$ is a rating of the target user, $y_i$ is the corresponding rating of the neighbor user, and m is the number of columns of the original mixed scoring matrix, i.e. the number of ratings;
S222, sorting the calculated distances and finding the k users nearest to the target user;
S23, removing the target user and its neighbor users to obtain a new mixed user scoring matrix r.
3. The item association-based shilling attack detection method according to claim 1, wherein filtering out the attacking users in step S5 specifically comprises: sorting the users by their accumulated distances in ascending order and filtering out the first N users with the shortest distances, these N users being regarded as the attacking users.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710057846.2A CN106874427B (en) 2017-01-23 2017-01-23 Item association-based trust attack detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710057846.2A CN106874427B (en) 2017-01-23 2017-01-23 Item association-based trust attack detection method

Publications (2)

Publication Number Publication Date
CN106874427A CN106874427A (en) 2017-06-20
CN106874427B true CN106874427B (en) 2020-01-14

Family

ID=59158390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710057846.2A Expired - Fee Related CN106874427B (en) 2017-01-23 2017-01-23 Item association-based trust attack detection method

Country Status (1)

Country Link
CN (1) CN106874427B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108197215A (en) * 2017-12-28 2018-06-22 努比亚技术有限公司 A kind of recommendation method, server and computer readable storage medium
CN108470052B (en) * 2018-03-12 2021-03-19 南京邮电大学 Anti-trust attack recommendation algorithm based on matrix completion
CN109948677B (en) * 2019-03-06 2022-12-02 长安大学 Touchi attack detection method based on mixed characteristic values
CN110417765B (en) * 2019-07-22 2021-10-26 南京邮电大学 Trust-based method and system for detecting trust attack user

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102184364A (en) * 2011-05-26 2011-09-14 南京财经大学 Semi-supervised learning-based recommendation system shilling attack detection method
CN104809393A (en) * 2015-05-11 2015-07-29 重庆大学 Shilling attack detection algorithm based on popularity classification features
CN105677900A (en) * 2016-02-04 2016-06-15 南京理工大学 Malicious user detection method and device
CN105389505B (en) * 2015-10-19 2018-06-12 西安电子科技大学 Support attack detection method based on the sparse self-encoding encoder of stack

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102184364A (en) * 2011-05-26 2011-09-14 南京财经大学 Semi-supervised learning-based recommendation system shilling attack detection method
CN104809393A (en) * 2015-05-11 2015-07-29 重庆大学 Shilling attack detection algorithm based on popularity classification features
CN105389505B (en) * 2015-10-19 2018-06-12 西安电子科技大学 Support attack detection method based on the sparse self-encoding encoder of stack
CN105677900A (en) * 2016-02-04 2016-06-15 南京理工大学 Malicious user detection method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《SVM-TIA a shilling attack detection method based on SVM and target item analysis in recommender systems》;Zhou, Wei 等;《Neurocomputing》;20160611;第197-204页 *
《推荐系统中信息相似度的研究及其应用》;高鹏;《中国优秀硕士学位论文全文数据库 信息科技辑》;20130715(第7期);I138-1513 *

Also Published As

Publication number Publication date
CN106874427A (en) 2017-06-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200114