CN116702136A - Manipulation attack method and device for personalized recommendation system - Google Patents

Info

Publication number
CN116702136A
Authority
CN
China
Prior art keywords
arm
round
learner
environment
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310973637.8A
Other languages
Chinese (zh)
Inventor
周潘
罗志
孙裕华
徐子川
袁增辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202310973637.8A
Publication of CN116702136A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a manipulation attack method and device for an X-armed bandits-based personalized recommendation system in a big data environment. The method comprises the following steps: discretizing the arm space; intercepting a recommendation result of the system, wherein the recommendation result is one arm selected from the arm space corresponding to a node, after the learner has used the HCT algorithm to determine the node of the HCT coverage tree selected in the current round; and judging whether that arm space contains the target arm: if so, no attack is made in the current round; otherwise, another arm is selected to replace the arm selected by the learner and is submitted to the environment, and the environment generates feedback that is received by the learner and the attacker. The invention makes it possible to mount a manipulation attack on a personalized recommendation system from the attacker's perspective and, from the attack result, to evaluate the system's vulnerability to manipulation attacks.

Description

Manipulation attack method and device for personalized recommendation system
Technical Field
The invention relates to the technical field of data security, and in particular to a manipulation attack method and device for an X-armed bandits-based personalized recommendation system in a big data environment.
Background
The core of the X-armed bandits problem is how to make personalized recommendations for specific users in a continuous data space. It plays a vital role in personalized recommendation applications in a big data environment, in fields such as video, Internet of Things services, and advertising.
X-armed bandits differ from the traditional multi-armed bandit (MAB) problem. The MAB problem asks how to choose arms so as to maximize reward when the number of arms is finite, where the feedback obtained after pulling each arm follows an unknown probability distribution. In each round the algorithm pulls one arm and receives feedback (a reward), gradually learning that arm's distribution; to maximize reward, it must therefore trade off, each time it chooses an arm, between collecting reward and gaining more information about the feedback distributions. X-armed bandits instead address reward maximization under the assumption that the number of arms is infinite. Because of this feature, X-armed bandits are applied to personalized recommendation systems (each candidate item can be regarded as an arm, and the feedback from pulling an arm as the feedback obtained after recommending that item to the user) in settings such as big data where the number of candidate items is extremely large or even nearly infinite, for example multimedia big data recommendation systems, job and job-seeker recommendation systems in big data environments, and service recommendation systems for the rapidly growing Internet of Things services in today's networks. The main idea of this line of work is to partition the continuous, infinite data space repeatedly using the tree-based HCT algorithm (the High-Confidence Tree algorithm) and then apply a Monte Carlo method, which greatly improves the efficiency of big data analysis.
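To make the trade-off between collecting reward and gathering information concrete, the following is a minimal sketch of UCB1, one common strategy for the finite-armed MAB problem. It is an illustration only and not part of the method described here; the function name `ucb1`, the `pull` callback, and the assumption that rewards lie in [0, 1] are all hypothetical choices made for this sketch.

```python
import math

def ucb1(pull, n_arms, n_rounds):
    """Run UCB1 for n_rounds; pull(i) returns a reward in [0, 1] for arm i."""
    counts = [0] * n_arms      # number of times each arm was pulled
    means = [0.0] * n_arms     # running mean reward of each arm
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1        # pull every arm once before using the index
        else:
            # Mean reward plus an exploration bonus that shrinks with pulls.
            arm = max(
                range(n_arms),
                key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means
```

Calling `ucb1(lambda i: [0.2, 0.5, 0.8][i], 3, 500)` with these made-up deterministic rewards concentrates most pulls on the third arm while still occasionally revisiting the others.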
Attacks on bandit algorithms mainly take two forms: the poisoning attack (data-poisoning attack) and the manipulation attack (action-manipulation attack). An attack scenario involves three interacting parties: the learner (e.g., a recommender system), the attacker, and the environment (e.g., a user group). The attacker sits between the learner and the environment: it receives the arm selected by the learner and returns the feedback generated by the environment. In a poisoning attack, the attacker tampers with the environment's feedback, and the learner, receiving feedback that differs from the original, is misled toward the attacker's goal. Compared with poisoning attacks, manipulation attacks are more practical to mount but harder to make succeed, because the attacker cannot directly rewrite the environment's feedback to an arbitrary value; it can only replace the arm chosen by the learner with another arm before it is submitted to the environment. A further challenge is that the attacker does not know the average reward of each arm. In addition, objective conditions limit how often the attacker can attack; in other words, for the same attack goal, the lower the attack cost, the better for the attacker (e.g., the less likely the attack is to be detected by the system).
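The difference between the two attack modes can be sketched as one round of interaction. This is a minimal illustration under stated assumptions: the `environment` function (Bernoulli feedback whose mean equals the arm value in [0, 1]) and the names `poisoning_round` and `manipulation_round` are hypothetical and do not come from the source.

```python
import random

def environment(arm, rng):
    """Hypothetical environment: Bernoulli feedback with mean equal to the arm."""
    return 1.0 if rng.random() < arm else 0.0

def poisoning_round(learner_arm, rng, fake_feedback):
    """Poisoning attack: the arm passes through unchanged, the feedback is overwritten."""
    _ = environment(learner_arm, rng)   # the genuine feedback is discarded
    return learner_arm, fake_feedback

def manipulation_round(learner_arm, rng, substitute_arm):
    """Manipulation attack: the arm is swapped, the feedback stays genuine."""
    feedback = environment(substitute_arm, rng)
    return substitute_arm, feedback
```

In the manipulation case the attacker controls only which arm is pulled, so it can push the learner's estimates down only as far as some real arm's feedback allows.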
In current research in this field, whether on manipulation attacks or poisoning attacks, most attacks target MAB algorithms, and attacks on X-armed bandits algorithms remain unstudied. Those skilled in the art therefore cannot effectively study the vulnerability of X-armed bandits-based personalized recommendation systems to manipulation attacks in a big data environment.
Disclosure of Invention
In view of the technical problems in the prior art, the invention provides a manipulation attack method and device for an X-armed bandits-based personalized recommendation system in a big data environment.
The technical scheme for solving the technical problems is as follows:
In a first aspect, the present invention provides a manipulation attack method for an X-armed bandits-based personalized recommendation system, comprising:
discretizing the arm space;
intercepting a recommendation result of the system, wherein the recommendation result is one arm selected from the arm space corresponding to a node, after the learner has used the HCT algorithm to determine the node of the HCT coverage tree selected in the current round;
judging whether that arm space contains the target arm; if so, making no attack in the current round; otherwise, selecting another arm to replace the arm selected by the learner and submitting it to the environment, the environment generating feedback that is received by the learner and the attacker; wherein the learner is the personalized recommendation system, and the environment is the users served by the personalized recommendation system.
Further, discretizing the arm space comprises:
dividing the arm space into M subspaces, wherein M = 2^X and the value of X is given by:
[formula not preserved]
wherein T is the total number of rounds for which the HCT algorithm runs.
Further, judging whether the arm space contains the target arm, making no attack in the current round if so, and otherwise selecting another arm to replace the arm selected by the learner and submitting it to the environment, comprises:
S10, at the t-th round, intercepting the arm x_t selected by the learner; if the arm space corresponding to the node selected by the learner contains the target arm, entering the next round, namely t = t + 1; otherwise, executing step S20;
S20, for each arm x(i), i ∈ [1, M], calculating an L value, namely:
[formula not preserved]
wherein the formula involves: the L value of the i-th arm at the t-th round; the average environment feedback over the set of rounds, up to the t-th round, in which arm x(i) was selected; T_i(t), the number of times arm x(i) was selected up to the t-th round; and a parameter whose value range is (0, 1);
S30, selecting the arm x(s) whose L value is smallest, tampering x_t into x(s), submitting x(s) to the environment, and obtaining the environment's feedback r_t;
S40, updating T_i(t) = T_i(t) + 1 and [formula not preserved];
S50, setting t = t + 1 and jumping to step S10.
In a second aspect, the present invention provides a manipulation attack device for an X-armed bandits-based personalized recommendation system, comprising:
an arm space discretization module, configured to discretize the arm space;
a result interception module, configured to intercept a recommendation result of the system, wherein the recommendation result is one arm selected from the arm space corresponding to a node, after the learner has used the HCT algorithm to determine the node of the HCT coverage tree selected in the current round;
a replacing module, configured to judge whether that arm space contains the target arm; if so, no attack is made in the current round; otherwise, another arm is selected to replace the arm selected by the learner and is submitted to the environment, and the environment generates feedback that is received by the learner and the attacker; wherein the learner is the personalized recommendation system, and the environment is the users served by the personalized recommendation system.
Further, the arm space discretization module is specifically configured to:
divide the arm space into M subspaces, wherein M = 2^X and the value of X is given by:
[formula not preserved]
wherein T is the total number of rounds for which the HCT algorithm runs.
Further, the replacing module is specifically configured to execute the following steps:
S10, at the t-th round, intercepting the arm x_t selected by the learner; if the arm space corresponding to the node selected by the learner contains the target arm, entering the next round, namely t = t + 1; otherwise, executing step S20;
S20, for each arm x(i), i ∈ [1, M], calculating an L value, namely:
[formula not preserved]
wherein the formula involves: the L value of the i-th arm at the t-th round; the average environment feedback over the set of rounds, up to the t-th round, in which arm x(i) was selected; T_i(t), the number of times arm x(i) was selected up to the t-th round; and a parameter whose value range is (0, 1);
S30, selecting the arm x(s) whose L value is smallest, tampering x_t into x(s), submitting x(s) to the environment, and obtaining the environment's feedback r_t;
S40, updating T_i(t) = T_i(t) + 1 and [formula not preserved];
S50, setting t = t + 1 and jumping to step S10.
In a third aspect, the present invention provides an electronic device comprising:
a memory for storing a computer software program;
a processor for reading and executing the computer software program, so as to implement the manipulation attack method for an X-armed bandits-based personalized recommendation system according to the first aspect of the present invention.
In a fourth aspect, the present invention provides a non-transitory computer readable storage medium storing a computer software program for implementing the manipulation attack method for an X-armed bandits-based personalized recommendation system according to the first aspect of the present invention.
The beneficial effects of the invention are as follows. In current research in this field, whether on manipulation attacks or poisoning attacks, most attacks target MAB algorithms, and attacks on X-armed bandits algorithms remain unstudied. Moreover, the HCT algorithm, a typical X-armed bandits algorithm, differs substantially from UCB (the Upper Confidence Bound algorithm), a typical MAB algorithm: the former must maintain a binary tree at run time to discretize the huge arm space, while the latter need not. When attacking HCT, the attacker's goal is to force the algorithm to select the node of the binary tree that contains the target arm, rather than, as when attacking UCB, to force the algorithm to select one particular arm; as a result, existing poisoning and manipulation attacks against the UCB algorithm cannot be applied directly to the HCT algorithm.
Drawings
FIG. 1 is a schematic diagram of the workflow of an Internet of Things service recommendation system and of the attacker's attack mode;
FIG. 2 is a schematic diagram of the manipulation attack model;
FIG. 3 is a flowchart of a manipulation attack method for an X-armed bandits-based personalized recommendation system in a big data environment according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a manipulation attack device for an X-armed bandits-based personalized recommendation system in a big data environment according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an embodiment of a computer readable storage medium according to an embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings. The examples serve only to illustrate the invention and are not to be construed as limiting its scope.
Parameter definitions:
The arm space, i.e., the set of all arms
T, the total number of rounds for which the HCT algorithm runs
x_t, the arm selected by the learner at round t
The arm submitted to the environment by the attacker at round t
r_t, the feedback generated by the environment at round t
K, the target arm specified by the attacker
M, the number of subspaces into which the attacker divides the arm space
The i-th subspace, i ∈ [1, M]; the subspaces satisfy [conditions not preserved]
x(i), the arm that the attacker randomly selects from the i-th subspace as its sample; once selected, it no longer changes
The set of rounds, up to round t, in which the attacker selected arm x(i)
T_i(t), the number of times the attacker selected arm x(i) up to round t
The average environment feedback over the rounds in that set
In this embodiment, a vulnerability study under manipulation attacks is proposed for HCT (the High-Confidence Tree algorithm), a typical X-armed bandits algorithm.
An Internet of Things recommendation system is taken as an example. As shown in FIG. 1, in this system all Internet of Things services are mapped to a measurement space X (each Internet of Things service can be regarded as an arm), which is discretized through a hierarchical structure, namely a binary coverage tree. The X-armed bandits algorithm uses the coverage tree to estimate the optimal Internet of Things service and outputs the recommendation result to the Internet of Things service provider; the provider pushes the specific service to a user according to the algorithm's result, and the user then submits feedback to the algorithm. Based on this feedback, the algorithm improves its recommendations from round to round, aiming to recommend services that better meet the needs of the particular user. Meanwhile, an attacker can force the system to recommend certain specific content by hijacking either the recommendation results or the feedback.
To cope with the huge arm space and thereby optimize the reward, the HCT algorithm uses a hierarchical structure to discretize the arm space. The hierarchical structure is a binary coverage tree. A node at layer h with index i is denoted (h, i), so the root node is (0, 1). The two child nodes of node (h, i) are (h+1, 2i-1) and (h+1, 2i). Each node (h, i) covers a subset of the arm space, and these subsets satisfy the following three conditions:
(1)
(2)
(3)
Meanwhile, each node (h, i) randomly selects a representative arm x_{h,i}; whenever node (h, i) is selected, arm x_{h,i} is submitted to the environment.
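The node indexing above can be sketched as follows. The child rule is taken from the text; the `region` function is an assumption made only for illustration, namely that the arm space is the interval [0, 1) and that each node's region is split at its midpoint. Both function names are hypothetical.

```python
def children(h, i):
    """Indices of the two child nodes of node (h, i) in the binary coverage tree."""
    return (h + 1, 2 * i - 1), (h + 1, 2 * i)

def region(h, i):
    """Interval covered by node (h, i).

    ASSUMPTION: the arm space is [0, 1) and every node is halved at its midpoint.
    """
    width = 1.0 / (2 ** h)
    return ((i - 1) * width, i * width)
```

Under this assumption the root (0, 1) covers the whole interval, and the regions of the two children of any node tile that node's region exactly.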
At each round t, to decide which node to select, the HCT algorithm calculates an upper confidence bound (the U value) for each node (h, i), namely:
(1)
where two of the quantities in the formula are hyperparameters used by the learner. The set of rounds, up to the t-th round, in which the HCT algorithm selected node (h, i) is defined with (h_s, i_s) denoting the node selected by the HCT algorithm at round s. In addition, a dissimilarity function is defined on the arm space, the diameter of a subset of the arm space is defined in terms of it, and each node (h, i) is then required to satisfy a further condition. [The formulas for these definitions are not preserved.]
To obtain a tighter upper confidence bound, in addition to the U value the HCT algorithm also calculates a B value for each node, namely:
(2)
It can be seen that the larger the B value of a node, the more likely the node is to contain the optimal arm, which is the basis on which HCT selects nodes. At each round t, the HCT algorithm starts from the root node (0, 1) and repeatedly moves to whichever of the two child nodes has the larger B value, until it reaches a leaf node or a node for which the following formula holds (the node selected by the HCT algorithm in this round is denoted (h_t, i_t)):
(3)
The representative arm of node (h_t, i_t) is then submitted to the environment and feedback is obtained, after which the statistics of the node are updated [formulas not preserved], and all nodes on the path from the root node to (h_t, i_t) are updated according to equation (2). Finally, the algorithm checks whether the following formula holds for the current node (h_t, i_t):
(4)
If the above formula holds, node (h_t, i_t) is expanded into its two child nodes, (h_t+1, 2i_t-1) and (h_t+1, 2i_t), and the U values of the two child nodes are set to [value not preserved].
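The descent by B value described above can be sketched as follows. This is a minimal illustration, not the HCT algorithm itself: the B values are supplied ready-made in a dictionary because their formula is not available here, the stopping condition of formula (3) is replaced by a caller-supplied predicate, and ties are broken toward the left child. The names `select_node`, `b_values`, and `should_stop` are hypothetical.

```python
def select_node(b_values, should_stop=lambda node: False):
    """Descend from the root (0, 1), always following the child with the larger B value.

    b_values maps each node (h, i) currently in the tree to its B value.
    should_stop stands in for the stopping formula, which is not reproduced here.
    """
    node = (0, 1)
    while True:
        h, i = node
        left, right = (h + 1, 2 * i - 1), (h + 1, 2 * i)
        # A node whose children are not in the tree yet is treated as a leaf.
        is_leaf = left not in b_values or right not in b_values
        if is_leaf or should_stop(node):
            return node
        # ASSUMPTION: ties go to the left child.
        node = left if b_values[left] >= b_values[right] else right
```

For example, with `b_values = {(0, 1): 1.0, (1, 1): 0.4, (1, 2): 0.9, (2, 3): 0.7, (2, 4): 0.2}` the descent goes right at the root and then left, ending at the leaf (2, 3).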
The manipulation attack model is shown in FIG. 2. At round t: in the first step, the learner selects arm x_t, which is intercepted by the attacker; in the second step, the attacker uses the attack algorithm to tamper x_t into the arm it actually submits to the environment; in the third step, the attacker and the learner both receive the environment's feedback r_t on the submitted arm.
Based on the above, an embodiment of the present invention provides a manipulation attack method for an X-armed bandits-based personalized recommendation system, as shown in FIG. 3, including:
an attacker designates a target arm, constructs an attack algorithm LBT based on a binary coverage tree, and performs the following steps in each round of HCT algorithm operation:
step 1, a learner determines nodes in an HCT coverage tree selected in the round by utilizing an HCT algorithm, then selects one arm in an arm space corresponding to the nodes, recommends the arm to the environment, and an attacker intercepts a recommendation result;
and 2, the attacker judges whether the arm space corresponding to the node selected by the round of HCT algorithm contains a target arm, if so, the round of attack is not performed, otherwise, other arms are selected to replace the arm selected by the learner through the LBT algorithm and submitted to the environment, and the environment generates feedback and is received by the learner and the attacker.
Through the above steps, the attacker manipulates the HCT algorithm, that is, makes it select the node containing the target arm as often as possible.
Before the attack, the attacker discretizes the arm space, namely: the arm space is divided into M subspaces, wherein M = 2^X and the value of X is given by:
[formula not preserved]
wherein T is the total number of rounds for which the HCT algorithm runs.
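A minimal sketch of this discretization step follows, under stated assumptions: the arm space is taken to be a one-dimensional interval cut into equal-width cells, and the exponent X is passed in by the caller as `x_exp`, because its formula in terms of T is not available here. Sampling one fixed representative arm per cell follows the parameter definition of x(i). The function name `discretize` is hypothetical.

```python
import random

def discretize(low, high, x_exp, rng):
    """Split [low, high) into M = 2**x_exp equal cells and sample one arm per cell.

    ASSUMPTION: a one-dimensional arm space with equal-width cells.
    """
    m = 2 ** x_exp
    width = (high - low) / m
    cells = [(low + i * width, low + (i + 1) * width) for i in range(m)]
    # One representative arm x(i) per cell, drawn once and then kept fixed.
    arms = [rng.uniform(a, b) for a, b in cells]
    return cells, arms
```

For instance, `discretize(0.0, 1.0, 3, random.Random(1))` yields 8 cells of width 0.125 and one sampled arm inside each.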
Specifically, the procedure performed by the attacker is as follows:
S10, at the t-th round, intercepting the arm x_t selected by the learner; if the arm space corresponding to the node selected by the learner contains the target arm, entering the next round, namely t = t + 1; otherwise, executing step S20;
S20, for each arm x(i), i ∈ [1, M], calculating an L value, namely:
[formula not preserved]
wherein the formula involves: the L value of the i-th arm at the t-th round; the average environment feedback over the set of rounds, up to the t-th round, in which arm x(i) was selected; T_i(t), the number of times arm x(i) was selected up to the t-th round; and a parameter whose value range is (0, 1);
S30, selecting the arm x(s) whose L value is smallest, tampering x_t into x(s), submitting x(s) to the environment, and obtaining the environment's feedback r_t;
S40, updating T_i(t) = T_i(t) + 1 and [formula not preserved];
S50, setting t = t + 1 and jumping to step S10.
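Steps S10 to S50 can be sketched as a small attacker object. This is an illustration under explicit assumptions, not the patented formula: the L value is taken to be a Hoeffding-style lower confidence bound (mean minus a radius that shrinks with the pull count), arms that have never been pulled are given an L value of negative infinity so that they are tried first, and the second update in S40 is assumed to be the running mean. The class name `LBTAttacker`, the method names, the `region` argument (the interval covered by the learner's selected node), and the default `delta` are all hypothetical.

```python
import math

class LBTAttacker:
    """Sketch of steps S10-S50 with an assumed lower-confidence-bound L value."""

    def __init__(self, arms, delta=0.1):
        self.arms = list(arms)               # representative arms x(1..M)
        self.delta = delta                   # parameter with value range (0, 1)
        self.counts = [0] * len(self.arms)   # T_i(t)
        self.means = [0.0] * len(self.arms)  # average feedback per arm

    def l_value(self, i):
        # ASSUMPTION: Hoeffding-style bound; the source formula is not available.
        if self.counts[i] == 0:
            return -math.inf                 # untried arms are tried first
        radius = math.sqrt(math.log(1.0 / self.delta) / (2 * self.counts[i]))
        return self.means[i] - radius

    def step(self, learner_arm, region, target_arm, environment):
        low, high = region
        if low <= target_arm < high:         # S10: region holds the target, no attack
            return learner_arm, environment(learner_arm)
        s = min(range(len(self.arms)), key=self.l_value)   # S20 and S30
        feedback = environment(self.arms[s])
        self.counts[s] += 1                  # S40
        self.means[s] += (feedback - self.means[s]) / self.counts[s]
        return self.arms[s], feedback
```

With a toy environment whose feedback equals the arm value, repeated attacks settle on the arm with the lowest feedback, which is the intended effect of substituting a low-feedback arm whenever the learner strays from the target region.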
In this embodiment, an attack algorithm named LBT (Lower Bound Tree) is proposed for HCT, an X-armed bandits algorithm implemented on a coverage tree. When the learner does not select an arm in the way the attacker desires, the attacker uses the LBT algorithm to tamper the arm selected by the learner into an arm with lower average feedback. This misleads the learner and forces it to pull arms in the way the attacker desires, so that the attacker controls the learner's selections (for example, in a multimedia recommendation system, an attacker who wants to increase the click rate of videos it authored can use this attack to force the recommendation system to keep recommending the videos it designates, or videos similar to them).
In this embodiment, assuming the HCT algorithm runs for T rounds in total, the attacker can carry out the attack through the LBT algorithm at an attack cost of [bound not preserved]; that is, the number of rounds in which the attacker attacks is [bound not preserved], while in at least [bound not preserved] rounds the HCT algorithm runs as the attacker desires.
As shown in FIG. 4, an embodiment of the present invention further provides a manipulation attack device for an X-armed bandits-based personalized recommendation system, including:
the arm space discretizing module is used for discretizing the arm space;
the result interception module is used for intercepting a recommendation result of the system, wherein the recommendation result is one arm selected from an arm space corresponding to a node after a learner determines the node in the HCT coverage tree selected by the learner by utilizing an HCT algorithm;
and the attack execution module is used for judging whether the arm space contains a target arm, if so, the round does not attack, otherwise, other arms are selected to replace the arm selected by the learner and are submitted to the environment, and the environment generates feedback and is received by the learner and the attacker.
Referring to fig. 5, fig. 5 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the invention. As shown in fig. 5, an embodiment of the present invention provides an electronic device 500, including a memory 510, a processor 520, and a computer program 511 stored on the memory 510 and executable on the processor 520, wherein the processor 520 executes the computer program 511 to implement the following steps:
step 1, discretizing an arm space;
step 2, the learner determines the node in the HCT coverage tree selected in the round by utilizing the HCT algorithm, then selects one arm in the arm space corresponding to the node, recommends the arm to the environment, and an attacker intercepts the recommended result;
and 3, the attacker judges whether the arm space corresponding to the node selected by the round of HCT algorithm contains a target arm, if so, the round of attack is not performed, otherwise, other arms are selected to replace the arm selected by the learner through the LBT algorithm and submitted to the environment, and the environment generates feedback and is received by the learner and the attacker.
Referring to fig. 6, fig. 6 is a schematic diagram of an embodiment of a computer readable storage medium according to an embodiment of the invention. As shown in fig. 6, the present embodiment provides a computer-readable storage medium 600 having stored thereon a computer program 611, which computer program 611 when executed by a processor implements the steps of:
step 1, discretizing an arm space;
step 2, the learner determines the node in the HCT coverage tree selected in the round by utilizing the HCT algorithm, then selects one arm in the arm space corresponding to the node, recommends the arm to the environment, and an attacker intercepts the recommended result;
and 3, the attacker judges whether the arm space corresponding to the node selected by the round of HCT algorithm contains a target arm, if so, the round of attack is not performed, otherwise, other arms are selected to replace the arm selected by the learner through the LBT algorithm and submitted to the environment, and the environment generates feedback and is received by the learner and the attacker.
In the foregoing embodiments, each embodiment is described with its own emphasis; for portions not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (8)

1. A manipulation attack method for an X-armed bandits-based personalized recommendation system, comprising:
discretizing the arm space;
intercepting a recommendation result of the system, wherein the recommendation result is one arm selected from the arm space corresponding to a node, after the learner has used the HCT algorithm to determine the node of the HCT coverage tree selected in the current round;
judging whether that arm space contains the target arm; if so, making no attack in the current round; otherwise, selecting another arm to replace the arm selected by the learner and submitting it to the environment, the environment generating feedback that is received by the learner and the attacker; wherein the learner is the personalized recommendation system, and the environment is the users served by the personalized recommendation system.
2. The method of claim 1, wherein discretizing the arm space comprises:
dividing the arm space into M subspaces, wherein M = 2^X, and the value of X is as follows:
where T is the total number of rounds for which the HCT algorithm runs.
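The discretization in claim 2 can be sketched as follows. This is a minimal illustration, assuming a one-dimensional arm space [0, 1) split into equal-width cells. The formula for X is not reproduced in this text, so X is simply passed in as an argument. The function names `discretize` and `cell_index`, and the use of each cell's midpoint as its representative arm x(i), are assumptions of this sketch and are not taken from the patent.

```python
def discretize(X, low=0.0, high=1.0):
    """Split [low, high) into M = 2**X equal-width cells.

    Returns a list of (left, right, midpoint) tuples. The midpoint stands in
    for the representative arm x(i) of cell i (an assumption of this sketch).
    """
    if X < 0:
        raise ValueError("X must be non-negative")
    M = 2 ** X
    width = (high - low) / M
    cells = []
    for i in range(M):
        left = low + i * width
        right = left + width
        cells.append((left, right, (left + right) / 2))
    return cells


def cell_index(x, X, low=0.0, high=1.0):
    """Return the 0-based index of the cell that contains arm x."""
    M = 2 ** X
    i = int((x - low) / (high - low) * M)
    # Clamp so that x == high (or slightly out-of-range input) maps to a valid cell.
    return min(max(i, 0), M - 1)
```

With X = 3, for example, `discretize(3)` yields 8 cells of width 0.125, and `cell_index(0.3, 3)` identifies the cell covering [0.25, 0.375).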
3. The method of claim 2, wherein judging whether the arm space contains the target arm, performing no attack in the round if so, and otherwise selecting another arm to replace the arm selected by the learner and submitting it to the environment, comprises:
S10, in the t-th round, capturing the arm selected by the learner, denoted x_t; if the arm space contains the target arm, entering the next round, namely t = t + 1; otherwise, executing step S20;
S20, for each arm x(i), i ∈ [1, M], calculating an L value, namely:
wherein: the L value is the value corresponding to the i-th arm in the t-th round, and is computed from the average of the environmental feedback over the set of rounds, up to the t-th round, in which arm x(i) was selected; from T_i(t), the number of times arm x(i) was selected up to the t-th round; and from a parameter whose value range is (0, 1);
S30, selecting the arm x(s) with the smallest L value, tampering x_t to x(s), and submitting x(s) to the environment to obtain the feedback r_t of the environment;
S40, updating T_i(t) = T_i(t) + 1 and the average environmental feedback of the selected arm;
S50, letting t = t + 1, and jumping to step S10.
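Steps S10 to S50 can be sketched as a per-round routine. This is a minimal illustration under stated assumptions, not the patented implementation: the exact L-value formula is not reproduced in this text, so `l_value` below substitutes a generic lower confidence bound as a placeholder. The names `Attacker`, `step`, `l_value`, and `environment`, and the representation of a node's arm space as a closed interval `(left, right)`, are likewise hypothetical.

```python
import math


def l_value(mean, count, delta):
    """Placeholder L value: a generic lower confidence bound (assumption,
    not the formula of the claim). Unselected arms get -inf."""
    if count == 0:
        return -math.inf
    return mean - math.sqrt(math.log(1.0 / delta) / (2.0 * count))


class Attacker:
    def __init__(self, arms, target, delta=0.1):
        self.arms = list(arms)               # discretized arms x(1), ..., x(M)
        self.target = target                 # target arm promoted by the attacker
        self.delta = delta                   # parameter in (0, 1)
        self.counts = [0] * len(self.arms)   # T_i(t): times arm x(i) was submitted
        self.means = [0.0] * len(self.arms)  # average feedback of arm x(i)

    def step(self, region, x_t, environment):
        """Run one round; `region` is the interval of the node chosen by the learner.

        Returns (arm actually submitted, feedback r_t).
        """
        left, right = region
        # S10: if the node's arm space contains the target arm, do not attack.
        if left <= self.target <= right:
            return x_t, environment(x_t)
        # S20: compute an L value for every discretized arm.
        values = [l_value(m, c, self.delta)
                  for m, c in zip(self.means, self.counts)]
        # S30: replace x_t with the arm x(s) that has the smallest L value.
        s = min(range(len(self.arms)), key=values.__getitem__)
        r_t = environment(self.arms[s])
        # S40: update the selection count and the running average for x(s).
        self.counts[s] += 1
        self.means[s] += (r_t - self.means[s]) / self.counts[s]
        return self.arms[s], r_t
```

The caller advances the round counter (S50) by invoking `step` once per round with the learner's choice. In this sketch the feedback returned by `step` is what both the learner and the attacker would observe, which matches the claim that the environment's feedback is received by both parties.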
4. A manipulation attack apparatus for a personalized recommendation system based on X-armed bandits, comprising:
an arm space discretizing module, configured to discretize the arm space;
a result interception module, configured to intercept a recommendation result of the system, wherein the recommendation result is an arm selected from the arm space corresponding to a node, after the learner has used the HCT algorithm to determine the node of the HCT cover tree selected in the current round;
a replacing module, configured to judge whether the arm space contains a target arm; if so, no attack is performed in this round; otherwise, another arm is selected to replace the arm selected by the learner and is submitted to the environment, so that the environment generates feedback that is received by the learner and the attacker; wherein the learner refers to the personalized recommendation system, and the environment refers to a user facing the personalized recommendation system.
5. The apparatus of claim 4, wherein the arm space discretizing module is specifically configured to:
divide the arm space into M subspaces, wherein M = 2^X, and the value of X is as follows:
where T is the total number of rounds for which the HCT algorithm runs.
6. The apparatus of claim 5, wherein the replacing module is specifically configured to perform the following steps:
S10, in the t-th round, capturing the arm selected by the learner, denoted x_t; if the arm space contains the target arm, entering the next round, namely t = t + 1; otherwise, executing step S20;
S20, for each arm x(i), i ∈ [1, M], calculating an L value, namely:
wherein: the L value is the value corresponding to the i-th arm in the t-th round, and is computed from the average of the environmental feedback over the set of rounds, up to the t-th round, in which arm x(i) was selected; from T_i(t), the number of times arm x(i) was selected up to the t-th round; and from a parameter whose value range is (0, 1);
S30, selecting the arm x(s) with the smallest L value, tampering x_t to x(s), and submitting x(s) to the environment to obtain the feedback r_t of the environment;
S40, updating T_i(t) = T_i(t) + 1 and the average environmental feedback of the selected arm;
S50, letting t = t + 1, and jumping to step S10.
7. An electronic device, comprising:
a memory for storing a computer software program;
a processor for reading and executing the computer software program to implement the manipulation attack method for a personalized recommendation system based on X-armed bandits according to any one of claims 1-3.
8. A non-transitory computer-readable storage medium, characterized in that the storage medium stores a computer software program for implementing the manipulation attack method for a personalized recommendation system based on X-armed bandits according to any one of claims 1-3.
CN202310973637.8A 2023-08-04 2023-08-04 Manipulation attack method and device for personalized recommendation system Pending CN116702136A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310973637.8A CN116702136A (en) 2023-08-04 2023-08-04 Manipulation attack method and device for personalized recommendation system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310973637.8A CN116702136A (en) 2023-08-04 2023-08-04 Manipulation attack method and device for personalized recommendation system

Publications (1)

Publication Number Publication Date
CN116702136A true CN116702136A (en) 2023-09-05

Family

ID=87837796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310973637.8A Pending CN116702136A (en) 2023-08-04 2023-08-04 Manipulation attack method and device for personalized recommendation system

Country Status (1)

Country Link
CN (1) CN116702136A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US20060253274A1 (en) * 2005-05-05 2006-11-09 Bbn Technologies Corp. Methods and systems relating to information extraction
WO2014001908A1 (en) * 2012-06-29 2014-01-03 Thomson Licensing A system and method for recommending items in a social network
CN107066446A (en) * 2017-04-13 2017-08-18 广东工业大学 A kind of Recognition with Recurrent Neural Network text emotion analysis method of embedded logic rules
US20190108579A1 (en) * 2017-10-05 2019-04-11 International Business Machines Corporation Product configuration recommendation and optimization

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHI LUO: "Action-Manipulation Attack and Defense to X-Armed Bandits", 《21ST IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (IEEE TRUSTCOM)》, pages 1116 - 1122 *
ZHI LUO: "Data_Poisoning_Attack_to_X-armed_Bandits", 《21ST IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (IEEE TRUSTCOM)》, pages 345 - 351 *
王志梅;杨帆;: "根据兴趣导向的移动学习者社区构建", 机电工程, no. 12 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117172303A (en) * 2023-10-23 2023-12-05 华中科技大学 Black box attack method and device for deep reinforcement learning under continuous action space
CN117172303B (en) * 2023-10-23 2024-03-08 华中科技大学 Black box attack method and device for deep reinforcement learning under continuous action space

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination