CN107689960B - Attack detection method for unorganized malicious attack - Google Patents


Info

Publication number
CN107689960B
Authority
CN
China
Prior art keywords
matrix
malicious
score
attack
scoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710811240.3A
Other languages
Chinese (zh)
Other versions
CN107689960A (en)
Inventor
周志华
庞明
高尉
陶敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201710811240.3A
Publication of CN107689960A
Application granted
Publication of CN107689960B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Abstract

The invention discloses a learning algorithm that detects unorganized malicious attacks in a recommendation system, so as to better ensure the quality of the recommendation system. The invention is the first to pose and solve the attack detection task in the unorganized small-scale attack scenario. It formalizes the task as a matrix completion learning problem and, using the matrix completion algorithm provided by the invention, obtains the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y corresponding to the scoring matrix M. Malicious attackers among the users are then detected according to the information in the malicious attack deviation matrix Y.

Description

Attack detection method for unorganized malicious attack
Technical Field
The invention relates to machine learning and its applications, in particular to collaborative filtering, recommendation systems and attack detection, and discloses a learning method capable of detecting unorganized malicious attacks in a recommendation system so as to better guarantee the quality of the recommendation system.
Background
Recommendation systems are widely used in daily life, and as more and more of people's lives move online, they exert an increasingly important influence. For example, more and more people buy goods on online retail platforms such as Taobao and Amazon, and watch videos on video websites such as Youku and iQiyi. With the number of users and the number of items both growing, recommending suitable items to each user is a great challenge. To address this challenge, many recommendation methods based on collaborative filtering have been proposed.
The basic assumption of collaborative filtering is that users who showed similar preferences in the past will have similar preferences in the future. There are two main categories of collaborative filtering: memory-based methods and model-based methods. Memory-based collaborative filtering directly uses the scores that users have given to items to predict the items a user is interested in. Such methods fall into two broad classes, user-based and item-based. A user-based method first finds users similar to the target user and then recommends the items those similar users like; an item-based method recommends to a user items similar to the items that the user likes. Model-based collaborative filtering first trains a prediction model on the scores users have given to items, and then uses the model to generate recommendations for each user.
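As an illustration of the user-based variant described above, the following is a minimal sketch, not the method of the invention. The function name `predict_user_based`, the use of cosine similarity, and the toy data are all assumptions made for this example.

```python
import numpy as np

def predict_user_based(R, mask, user, item):
    """Predict R[user, item] as a similarity-weighted mean of other users' scores."""
    Rz = np.where(mask, R, 0.0)                     # unscored entries count as 0
    norms = np.linalg.norm(Rz, axis=1)
    sims = Rz @ Rz[user] / (norms * norms[user] + 1e-12)  # cosine similarity
    sims[user] = 0.0                                # exclude the target user
    raters = mask[:, item]                          # users who scored the item
    weight = np.abs(sims[raters]).sum()
    if weight == 0.0:
        return float("nan")                         # no similar rater available
    return float(sims[raters] @ R[raters, item] / weight)

# Toy data (hypothetical): rows are users, columns are items, 0 means "not scored".
R = np.array([[5, 1, 0],
              [5, 1, 4],
              [1, 5, 2]], dtype=float)
mask = R > 0
pred = predict_user_based(R, mask, user=0, item=2)
```

User 1 scores like user 0 and user 2 does not, so the prediction for user 0 lands closer to user 1's score for the third item than to user 2's.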
Both categories of collaborative filtering generally assume that a user's score for an item faithfully reflects the user's preferences. In practice, however, the owner of an item can manipulate the recommendation system by forging fake users who give false scores, thereby increasing the owner's own profit. For example, an attacker may forge a fake user whose scoring behavior resembles that of a normal user but who gives a high score to the attacker's own item or a low score to a competitor's item. Existing research shows that recommendation methods based on collaborative filtering are vulnerable to such malicious attacks.
In response to this problem, many malicious attack detection methods have been proposed. Existing methods mainly comprise statistical methods, clustering methods and classification methods. Statistical methods find malicious attackers by detecting suspicious scores. Clustering methods group users into several clusters of similar behavior according to their scoring information, and the users in the smallest cluster are regarded as malicious attackers. Classification methods first extract features for each user from that user's scores for items, and then train a classification model for detecting malicious attackers from these features and the users' labels (namely, whether each user is a malicious attacker).
Existing malicious attack detection methods all target organized large-scale attacks rather than unorganized small-scale attacks. In an organized large-scale attack, for example, the owner of an item forges hundreds of users according to the same strategy, and each forged user rates multiple items in imitation of normal users while giving the owner's item a high score. However, current recommendation systems raise the cost of attacks through multiple mechanisms in order to reduce malicious attacks: CAPTCHAs are widely used, more and more online platforms require SMS or email verification to complete registration, and some platforms allow a user to leave a review only after purchasing the item. Under these conditions, the high cost makes organized large-scale attacks difficult to launch. Unorganized small-scale attacks, however, are still widespread. For example, competing merchants on an online retail platform may attack each other, each forging several users to give high scores to its own items and low scores to competitors' items, and different merchants adopt different attack strategies. In this case, existing attack detection methods, which work by detecting a shared attack strategy, cannot effectively detect the malicious attackers, so a large number of attack scores remain in the recommendation system and the recommendation method performs poorly. Therefore, recommendation systems need a learning algorithm that can detect unorganized malicious attacks.
Disclosure of Invention
Purpose of the invention: existing attack detection methods can only detect organized large-scale attacks. In an unorganized small-scale attack scenario, existing algorithms, which work by detecting a shared malicious attack strategy, cannot effectively detect attacks of this type. To address this problem, the invention is the first to pose and solve the attack detection task in the unorganized small-scale attack scenario; it formalizes the task as a matrix completion learning problem and provides a corresponding attack detection learning algorithm. Specifically, the score a user gives to an item consists of three parts. The first part is the user's real score for the item, the second part is the system noise that may arise when the user scores the item, and the third part is the malicious attack the user applies to the item. For example, in a scoring system with a scoring interval of 1 to 5 points, suppose the user's real score for the item is 4.8 points. If the user is a normal user, the final score may be 5 points, i.e., there is 0.2 points of system noise; if the user is a malicious user, the final score may be 1 point, i.e., there is a malicious attack with a deviation of -4 points. Since the preference of each user is assumed to be determined by a small number of factors, we assume that the real scoring matrix formed by the first part is a low-rank matrix. The second part is ubiquitous in scoring systems, but its values are small. The third part is the malicious attack part; malicious scores are few in proportion to normal scores, so the third part is sparse, and it runs contrary to the real scores. The invention aims to recover these three parts of the scoring information as accurately as possible from the scores users give to items and from the properties of the three parts. A user whose scores contain a malicious attack component is considered a malicious attacker.
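The 1-to-5 example above can be written out entry by entry. The subscripted symbols X_ij, Z_ij and Y_ij (real score, system noise and attack deviation for user i and item j) are notation introduced here for illustration. Carrying the 0.2 points of noise over to the malicious case is an assumption made so that the arithmetic closes.

```latex
\begin{aligned}
M_{ij} &= X_{ij} + Z_{ij} + Y_{ij} \\
\text{normal user:}\quad 5 &= 4.8 + 0.2 + 0 \\
\text{malicious user:}\quad 1 &= 4.8 + 0.2 + (-4)
\end{aligned}
```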
Technical scheme: an attack detection method for unorganized malicious attacks comprises the following steps:
Step 1.1, converting all the scores given by users to items into an incomplete scoring matrix M, and determining parameters of the algorithm according to the number of users, the number of items, the number of scores and the scoring interval of the scoring matrix M.
Step 1.2, obtaining the corresponding real scoring matrix X, system noise matrix Z and malicious attack deviation matrix Y from the scoring matrix M by using the matrix completion algorithm provided by the invention.
Step 1.3, detecting malicious attackers among the users according to the information in the malicious attack deviation matrix Y.
Determining the parameters of the algorithm according to the information of the scoring matrix M specifically includes: determining the upper bound of the system noise matrix Z allowed by the algorithm according to the number of users, the number of items and the number of scores of the scoring matrix M, where this upper bound increases as the number of users, the number of items and the number of scores increase; and determining a lower bound on the deviation of a malicious score from the real score according to the scoring interval of the scoring matrix M, where the larger the scoring interval is, the larger the lower bound is.
Obtaining the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y corresponding to the scoring matrix M by using the matrix completion algorithm specifically comprises the following. The scoring matrix M is composed of the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y, so that M = X + Z + Y. According to a common assumption in recommendation systems, the preference of each user is determined by a small number of factors, so the real scoring matrix X is a low-rank matrix. System noise is ubiquitous in the scoring system, so Z has many non-zero elements, but their values are generally small. For example, in a scoring system with a scoring interval of 1 to 5 points, suppose the user's real score for an item is 4.8 points; a normal user's final score may be 5 points, i.e., there is 0.2 points of system noise. Malicious scores are few in proportion to normal scores, so the malicious attack deviation matrix Y is a sparse matrix, and its non-zero entries are generally large. Another feature of Y is that its non-zero entries run contrary to the real score: items with a high real score are maliciously demoted and items with a low real score are maliciously promoted, so the product of corresponding entries of X and Y is not greater than 0. For example, in a scoring system with a scoring interval of 1 to 5 points, suppose the user's real score for an item is 5 points; a malicious user's final score may be 1 point, a deviation of -4 points from the real score, i.e., the high-scoring item is deliberately scored low.
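The three properties (low-rank X, dense but small Z, sparse Y with sign opposite to X) can be made concrete with synthetic data. This is a sketch for illustration only: the sizes, the noise range and the attack magnitude are hypothetical, and the scores are centred on 0 rather than on the 1-to-5 scale, an assumption made here so that "high" and "low" real scores have opposite signs.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 30, 20, 2            # users, items, latent factors (illustrative values)

# Real scores X: product of two thin factor matrices, hence rank r.
X = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))

# System noise Z: present on every entry, but small.
Z = rng.uniform(-0.2, 0.2, size=(m, n))

# Attack deviations Y: a handful of large entries whose sign opposes X.
Y = np.zeros((m, n))
attack_rows = rng.choice(m, size=3, replace=False)
for i in attack_rows:
    j = rng.integers(n)
    Y[i, j] = -4.0 * np.sign(X[i, j])

M = X + Z + Y                  # what the scoring system would record
```

Every entry of `X * Y` is non-positive by construction, which matches the property that the product of corresponding entries of X and Y is not greater than 0.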
An optimization objective is obtained from M = X + Z + Y and the properties of X, Z and Y,
Figure GDA0002502356700000041
where $\|\cdot\|_*$ denotes the nuclear norm of a matrix, $\|\cdot\|_1$ denotes the $L_1$ norm of a matrix, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, and $\langle X, Y \rangle$ denotes the sum of the products of the corresponding elements of the matrices X and Y. The hyper-parameters τ, α and δ are used to trade off the degree of emphasis on each term in the optimization objective. Ω is the set of subscripts of all scored entries; for example, (i, j) ∈ Ω indicates that user i has a scoring record for item j. $P_\Omega$ is the orthogonal projection, defined as follows,
$$[P_\Omega(M)]_{ij} = \begin{cases} M_{ij}, & (i, j) \in \Omega \\ 0, & \text{otherwise} \end{cases}$$
where $M_{ij}$ denotes the score given by user i to item j.
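The equation image for the objective is not reproduced in this text. As a reading aid only, one objective consistent with the stated ingredients (a nuclear norm for the low-rank X, an L1 norm for the sparse Y, the element-wise product sum of X and Y for the sign property, and a Frobenius-norm bound on the noise Z) could be written as below. The exact form, the signs, and the roles given to τ, α and δ are assumptions and are not taken from the patent figures.

```latex
\begin{aligned}
\min_{X,\,Y,\,Z}\quad & \|X\|_* + \tau\,\|Y\|_1 + \alpha\,\langle X, Y\rangle \\
\text{s.t.}\quad & P_\Omega(X + Z + Y) = P_\Omega(M), \\
& \|P_\Omega(Z)\|_F \le \delta .
\end{aligned}
```

Under this reading, a positive α rewards entries of Y whose sign is opposite to the corresponding entries of X, and δ caps the total energy attributed to system noise.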
The optimization objective is solved by an alternating optimization method; in theory, under certain conditions, the method can accurately recover the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y.
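To give a feel for what alternating optimization looks like in this setting, here is a minimal sketch for a simplified surrogate problem, not the algorithm of the invention. It minimizes 0.5·‖P_Ω(M − X − Y)‖_F² + lam·‖X‖_* + tau·‖Y‖_1, which omits the ⟨X, Y⟩ term and treats the noise Z implicitly as the residual. The function names and the parameters `lam`, `tau` and `iters` are hypothetical.

```python
import numpy as np

def svt(A, lam):
    """Singular value thresholding: proximal operator of lam * nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

def soft(A, tau):
    """Entrywise soft thresholding: proximal operator of tau * L1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def objective(M, mask, X, Y, lam, tau):
    """Value of the surrogate objective described in the lead-in."""
    resid = np.where(mask, M - X - Y, 0.0)
    return (0.5 * np.sum(resid ** 2)
            + lam * np.linalg.svd(X, compute_uv=False).sum()
            + tau * np.abs(Y).sum())

def alternate(M, mask, lam=1.0, tau=1.0, iters=50):
    """Alternate between a low-rank update of X and a sparse update of Y."""
    X = np.zeros_like(M, dtype=float)
    Y = np.zeros_like(M, dtype=float)
    for _ in range(iters):
        # X step: one proximal-gradient step on the masked least-squares term.
        X = svt(X + np.where(mask, M - X - Y, 0.0), lam)
        # Y step: exact entrywise minimizer on observed entries, zero elsewhere.
        Y = np.where(mask, soft(M - X, tau), 0.0)
    return X, Y

# Toy data (hypothetical): rank-2 scores, 70% observed, one injected attack.
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))
mask = rng.random((20, 15)) < 0.7
mask[3, 4] = True
M = np.where(mask, truth, 0.0)
M[3, 4] = truth[3, 4] + 6.0

X, Y = alternate(M, mask)
Z = np.where(mask, M - X - Y, 0.0)   # residual attributed to noise
```

Each step can only lower the surrogate objective, so the loop is stable, but how well it separates attacks from noise depends on `lam` and `tau`, which would need tuning on real data.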
Detecting malicious attackers among the users according to the information in the malicious attack deviation matrix Y specifically comprises: each row of Y corresponds to the scoring deviation information of one user, and if a row of Y contains a non-zero element, the user corresponding to that row is judged to be a malicious attacker.
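The row-wise rule can be sketched in a few lines. The function name `flag_attackers` and the tolerance `tol` are assumptions for this example; a small tolerance stands in for "non-zero" because a numerically recovered Y rarely contains exact zeros.

```python
import numpy as np

def flag_attackers(Y, tol=1e-8):
    """Return indices of rows of Y with at least one entry larger than tol in magnitude."""
    return np.flatnonzero((np.abs(Y) > tol).any(axis=1))

# Toy deviation matrix (hypothetical): users 1 and 3 each have one attack entry.
Y = np.array([[0.0,  0.0, 0.0],
              [0.0, -4.0, 0.0],
              [0.0,  0.0, 0.0],
              [3.0,  0.0, 0.0]])
suspects = flag_attackers(Y)
```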
Beneficial effects: compared with existing attack detection techniques, the method obtains the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y corresponding to the scoring matrix M by a variant of the matrix completion algorithm. In doing so it fully takes into account the characteristic that malicious attack scores run contrary to the real scores, is suitable for unorganized small-scale attack scenarios, and can effectively detect unorganized small-scale attacks generated by different strategies.
Drawings
Fig. 1 is a flowchart of an attack detection method for an unorganized attack according to an embodiment of the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.
As shown in Fig. 1, the attack detection method for unorganized malicious attacks specifically includes the following steps:
Step 1.1, converting all the scores given by users to items into an incomplete scoring matrix M, and determining parameters of the algorithm according to the number of users, the number of items, the number of scores and the scoring interval of the scoring matrix M.
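The conversion in step 1.1 can be sketched as follows, assuming the raw data arrives as (user index, item index, score) triples. The function name `build_rating_matrix` and the triple layout are hypothetical. The boolean `mask` records which entries of M were actually scored.

```python
import numpy as np

def build_rating_matrix(ratings, n_users, n_items):
    """Turn (user, item, score) triples into an incomplete matrix M and its observed-entry mask."""
    M = np.zeros((n_users, n_items))
    mask = np.zeros((n_users, n_items), dtype=bool)
    for user, item, score in ratings:
        M[user, item] = score
        mask[user, item] = True
    return M, mask

# Toy triples (hypothetical data).
ratings = [(0, 0, 5), (0, 2, 3), (1, 1, 4), (2, 0, 1)]
M, mask = build_rating_matrix(ratings, n_users=3, n_items=3)

# Quantities that step 1.1 uses to set the algorithm's parameters.
n_users, n_items = M.shape
d = int(mask.sum())                          # number of scores
lo, hi = M[mask].min(), M[mask].max()        # observed scoring interval
```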
Determining the parameters of the algorithm according to the information of the scoring matrix M specifically includes: determining the upper bound of the system noise matrix Z allowed by the algorithm according to the number m of users, the number n of items and the number d of scores of the scoring matrix M, the value being expressed as
Figure GDA0002502356700000051
As the number of users, the number of items and the number of scores increase, the upper bound of the system noise matrix Z allowed by the algorithm also increases; and determining a lower bound on the deviation of a malicious score from the real score according to the scoring interval of the scoring matrix M, where the larger the scoring interval is, the larger the lower bound is.
Step 1.2, obtaining the corresponding real scoring matrix X, system noise matrix Z and malicious attack deviation matrix Y from the scoring matrix M by using the matrix completion algorithm provided by the invention.
Obtaining the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y corresponding to the scoring matrix M by using the matrix completion algorithm specifically comprises the following. The scoring matrix M is composed of the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y, so that M = X + Z + Y. We assume that the preference of each user is determined by a small number of factors, and thus the real scoring matrix X is a low-rank matrix. System noise is ubiquitous in the scoring system, so Z has many non-zero elements, but their values are generally small. For example, in a scoring system with a scoring interval of 1 to 5 points, suppose the user's real score for an item is 4.8 points; a normal user's final score may be 5 points, i.e., there is 0.2 points of system noise. Malicious scores are few in proportion to normal scores, so the malicious attack deviation matrix Y is a sparse matrix, and its non-zero entries are generally large. Another feature of Y is that its non-zero entries run contrary to the real score. For example, in a scoring system with a scoring interval of 1 to 5 points, suppose the user's real score for an item is 5 points; a malicious user's final score may be 1 point, a deviation of -4 points from the real score, i.e., the high-scoring item is deliberately scored low.
An optimization objective is obtained from M = X + Z + Y and the properties of X, Z and Y,
Figure GDA0002502356700000052
where $\|\cdot\|_*$ denotes the nuclear norm of a matrix, $\|\cdot\|_1$ denotes the $L_1$ norm of a matrix, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, and $\langle X, Y \rangle$ denotes the sum of the products of the corresponding elements of the matrices X and Y. Ω is the set of subscripts of all scored entries, and (i, j) ∈ Ω indicates that user i has a scoring record for item j. $P_\Omega$ is the orthogonal projection, defined as follows,
$$[P_\Omega(M)]_{ij} = \begin{cases} M_{ij}, & (i, j) \in \Omega \\ 0, & \text{otherwise} \end{cases}$$
where $M_{ij}$ denotes the score given by user i to item j.
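In code, the projection onto the observed entries amounts to masking. The sketch below assumes Ω is stored as a boolean array `mask` of the same shape as the matrix; the function name `project` is hypothetical.

```python
import numpy as np

def project(A, mask):
    """P_Omega(A): keep entries where mask is True, set all others to 0."""
    return np.where(mask, A, 0.0)

A = np.array([[5.0, 3.0],
              [1.0, 4.0]])
mask = np.array([[True, False],
                 [False, True]])
P = project(A, mask)
```

Applying `project` twice gives the same result as applying it once, as expected of a projection.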
The optimization objective is solved by an alternating optimization method; in theory, under certain conditions, the method can accurately recover the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y.
Step 1.3, detecting malicious attackers among the users according to the information in the malicious attack deviation matrix Y.
Detecting malicious attackers among the users according to the information in the malicious attack deviation matrix Y specifically comprises: each row of Y corresponds to the scoring deviation information of one user, and if a row of Y contains a non-zero element, the user corresponding to that row is judged to be a malicious attacker.

Claims (4)

1. An attack detection method for unorganized malicious attacks, characterized by comprising the following steps:
step 1.1, converting all the scores given by users to items into an incomplete scoring matrix M, and determining parameters of the algorithm according to the number of users, the number of items, the number of scores and the scoring interval of the scoring matrix M, which specifically comprises: determining the upper bound of the system noise matrix Z allowed by the algorithm according to the number of users, the number of items and the number of scores of the scoring matrix M; and determining the lower bound of the deviation of a malicious score from the real score according to the scoring interval of the scoring matrix M;
step 1.2, obtaining the corresponding real scoring matrix X, system noise matrix Z and malicious attack deviation matrix Y from the scoring matrix M by using a matrix completion algorithm;
step 1.3, detecting malicious attackers among the users according to the information in the malicious attack deviation matrix Y.
2. The attack detection method for unorganized malicious attacks according to claim 1, characterized in that the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y corresponding to the scoring matrix M are obtained from the scoring matrix M by a matrix completion algorithm, specifically: the scoring matrix M is composed of the real scoring matrix X, the system noise matrix Z and the malicious attack deviation matrix Y, so that M = X + Z + Y.
3. The attack detection method for unorganized malicious attacks according to claim 2, characterized in that an optimization objective is obtained according to M = X + Z + Y and the properties of X, Z and Y,
Figure FDA0002786609370000011
s.t. $P_\Omega(X + Z + Y) = P_\Omega(M)$,
$\|P_\Omega(Z)\|_F$
Figure FDA0002786609370000012
wherein $\|\cdot\|_*$ denotes the nuclear norm of a matrix, $\|\cdot\|_1$ denotes the $L_1$ norm of a matrix, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, and $\langle X, Y \rangle$ denotes the sum of the products of the corresponding elements of the matrices X and Y; the hyper-parameters τ, α and δ are used to trade off the degree of emphasis on each term in the optimization objective; Ω is the set of subscripts of all scored entries, $M_{ij}$ denotes the score given by user i to item j, and $P_\Omega$ is the orthogonal projection.
4. The attack detection method for unorganized malicious attacks according to claim 1, characterized in that malicious attackers are detected among the users according to the information in the malicious attack deviation matrix Y, specifically: each row of the malicious attack deviation matrix Y corresponds to the scoring deviation information of one user, and if a row of Y contains a non-zero element, the user corresponding to that row is judged to be a malicious attacker.
CN201710811240.3A 2017-09-11 2017-09-11 Attack detection method for unorganized malicious attack Active CN107689960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710811240.3A CN107689960B (en) 2017-09-11 2017-09-11 Attack detection method for unorganized malicious attack

Publications (2)

Publication Number Publication Date
CN107689960A CN107689960A (en) 2018-02-13
CN107689960B CN107689960B (en) 2021-01-01

Family

ID=61155222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710811240.3A Active CN107689960B (en) 2017-09-11 2017-09-11 Attack detection method for unorganized malicious attack

Country Status (1)

Country Link
CN (1) CN107689960B (en)

Also Published As

Publication number Publication date
CN107689960A (en) 2018-02-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB02 Change of applicant information

Address after: 210008 No. 22, Hankou Road, Gulou District, Jiangsu, Nanjing

Applicant after: NANJING University

Address before: 210046 Xianlin Avenue 163, Qixia District, Nanjing City, Jiangsu Province

Applicant before: NANJING University
