CN116595528A - Method and device for poisoning attack on personalized recommendation system - Google Patents

Publication number
CN116595528A
Authority
CN
China
Prior art keywords
node
hct
learner
tree
arm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310880108.3A
Other languages
Chinese (zh)
Inventor
周潘
罗志
孙裕华
徐子川
袁增辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202310880108.3A
Publication of CN116595528A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a poisoning attack method and device for a personalized recommendation system based on X-armed bandits in a big data environment. An attacker acquires the recommendation result that a learner recommends to the environment and reproduces the learner's HCT tree from that result; the attacker then acquires the feedback result of the environment, tampers with it based on the reproduced HCT tree, and returns it to the learner. The invention makes it possible to carry out a poisoning attack on the personalized recommendation system from the attacker's perspective and, from the attack result, to evaluate the vulnerability of the personalized recommendation system to poisoning attacks.

Description

Method and device for poisoning attack on personalized recommendation system
Technical Field
The invention relates to the technical field of data security, and in particular to a method and a device for a poisoning attack on a personalized recommendation system based on X-armed bandits in a big data environment.
Background
The core of the X-armed bandits problem is how to make personalized recommendations for specific users in a continuous data space; it plays a vital role in personalized recommendation applications such as video, Internet of Things services and advertising in a big data environment.
The X-armed bandits problem differs from the traditional multi-armed bandit (MAB) problem. An MAB algorithm addresses how to choose arms so as to maximize the reward when the number of arms is finite and the feedback obtained from pulling each arm follows an unknown probability distribution. In each round the algorithm pulls one arm and receives feedback (a reward), gradually learning the arm's probability distribution; to maximize the reward, it must therefore trade off, each time it chooses an arm, between collecting reward and gaining more information about the feedback distributions. The X-armed bandits problem differs in that it maximizes the reward under the assumption that the number of arms is infinite. Because of this feature, X-armed bandits are applied to personalized recommendation systems (each candidate item can be regarded as an arm, and the feedback from pulling an arm as the feedback obtained after recommending the item to the user) in settings such as big data where the number of candidate items is extremely large or even nearly infinite, for example multimedia big data recommendation systems, job and job-seeker recommendation systems in big data environments, and service recommendation systems for the rapidly growing Internet of Things services in the current network environment. The main idea of this line of work is to repeatedly partition the continuous space of infinite data with a tree-based HCT algorithm (High-Confidence Tree algorithm) and then apply a Monte Carlo method, which greatly improves the efficiency of big data analysis.
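The trade-off described above for the finite-armed case can be illustrated with a minimal UCB1-style sketch. It is not taken from the patent: the function names `ucb1_choose` and `run_ucb1`, the Bernoulli reward model and the bonus term `sqrt(2 ln t / n)` are assumptions chosen for illustration.

```python
import math
import random


def ucb1_choose(counts, means, t):
    """Return the index of the arm to pull in round t (UCB1-style rule)."""
    # Pull every arm once before the confidence bonus is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Otherwise pick the arm with the largest mean plus exploration bonus.
    return max(
        range(len(counts)),
        key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run_ucb1(probs, rounds, seed=0):
    """Simulate Bernoulli arms (hypothetical reward model) and return pull counts."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)
    for t in range(1, rounds + 1):
        arm = ucb1_choose(counts, means, t)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean of the pulled arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts
```

The bonus shrinks as an arm is pulled more often, so rarely pulled arms are still revisited; with infinitely many arms this per-arm bookkeeping is no longer possible, which is what motivates the tree-based discretization.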
Attacks on bandit algorithms mainly take two forms: poisoning attacks (data-poisoning attacks) and manipulation attacks (action-manipulation attacks). An attack scenario involves three interacting parties: the learner (e.g., a recommender system), the attacker (e.g., a group of users), and the environment. The attacker sits between the learner and the environment, receiving the arm selected by the learner and returning the feedback generated by the environment. In a manipulation attack, the attacker tampers with the recommendation result and submits the tampered arm to the environment; the learner is misled because it receives feedback generated by an arm that differs from the one it recommended, and the attacker thereby achieves its goal. In a poisoning attack, the attacker tampers with the feedback of the environment; the learner is misled because it receives feedback that differs from the original, and the attacker thereby achieves its goal. Because a poisoning attack acts directly on the feedback received by the learner, it is more direct and efficient than a manipulation attack, which tampers with the recommendation result. Furthermore, objective conditions limit the attacks an attacker can launch; in other words, for the same attack goal, a lower attack cost is more advantageous to the attacker (e.g., the attack is less likely to be detected by the system).
In current research in this field, whether on manipulation attacks or poisoning attacks, most attacks target MAB algorithms, and attacks on X-armed bandits algorithms have not been studied. Those skilled in the art are therefore unable to study effectively the vulnerability of X-armed-bandits-based personalized recommendation systems to poisoning attacks in a big data environment.
Disclosure of Invention
The invention addresses the technical problems in the prior art and provides a method and a device for a poisoning attack on an X-armed-bandits-based personalized recommendation system in a big data environment.
The technical scheme for solving the technical problems is as follows:
In a first aspect, the present invention provides a method for a poisoning attack on an X-armed-bandits-based personalized recommendation system, comprising:
S100, acquiring the recommendation result that a learner recommends to the environment, and reproducing the learner's HCT tree according to the recommendation result; the learner refers to the personalized recommendation system, and the environment refers to the users that the personalized recommendation system faces;
S200, obtaining the feedback result of the environment, tampering with the feedback result based on the reproduced HCT tree, and returning it to the learner.
Further, reproducing the HCT tree of the learner according to the recommendation result includes:
S110, defining two flag variables and a node variable, the node variable being used to point to a node in the reproduced HCT tree; and recording the node selected by the learner from the HCT tree in round t and the arm selected in round t, where t is a positive integer;
S120, in round t, judging whether the current value of the flag variable checked first is 1:
if it is, the node pointed to by the node variable is considered to correspond to the node selected by the learner in round t, the arm selected in round t is taken as the arm of that node, the flag variable is reassigned, and step S200 is then performed;
if it is not, step S130 is performed;
S130, judging whether the value of the flag variable checked in this step is 0:
if it is, all nodes in the reproduced HCT tree are traversed to find whether there is a node whose corresponding arm is the arm selected in round t; if such a node exists, it is considered to correspond to the node selected by the learner in round t, the corresponding variable is updated, and step S200 is performed; if no such node exists, a leaf node of the reproduced HCT tree is expanded;
if it is not, step S200 is performed.
Further, expanding a leaf node of the reproduced HCT tree includes:
S140, finding the leaf node whose covered arm space contains the arm selected in round t, expanding its two child nodes, and judging which of the two child nodes covers an arm space containing that arm; the child node whose covered arm space contains the arm is considered to correspond to the node selected by the learner in round t, and the arm corresponding to that child node is the arm selected in round t; the node variable is made to point to the other child node, the flag variable is then set, and step S200 is performed.
Further, obtaining the feedback result of the environment, tampering with the feedback result based on the reproduced HCT tree, and returning it to the learner includes:
S210, obtaining the feedback result of the environment, recording it, and updating the corresponding statistics;
S220, depending on which of two conditions holds, either calculating the quantity used for tampering from the corresponding formula, in which M represents the number of rounds needed to ensure that the attack takes effect, T represents the total number of rounds for which the HCT algorithm runs, and N is one of two counts, or assigning that quantity directly;
S230, with a certain probability the tampering succeeds, in which case the tampered feedback is returned to the learner; if the tampering fails, the feedback is returned to the learner untampered;
S240, advancing to the next round and jumping to step S120;
where the remaining symbols denote, respectively: the number of times the learner has selected the node; the average of the original feedback of the environment over the rounds concerned; the set of rounds in which the learner selected the node; K, the target arm specified by the attacker; the arm space covered by a node; the HCT tree reproduced up to round t; and δ, a confidence parameter in the range (0, 1).
In a second aspect, the present invention provides a poisoning attack device for an X-armed-bandits-based personalized recommendation system, comprising:
an HCT tree reproduction module, configured to obtain the recommendation result that a learner recommends to the environment and to reproduce the learner's HCT tree according to the recommendation result; the learner refers to the personalized recommendation system, and the environment refers to the users that the personalized recommendation system faces;
a feedback tampering module, configured to obtain the feedback result of the environment, tamper with the feedback result based on the reproduced HCT tree, and return it to the learner.
Further, the HCT tree reproduction module is specifically configured to:
define two flag variables and a node variable, the node variable being used to point to a node in the reproduced HCT tree; and record the node selected by the learner from the HCT tree in round t and the arm selected in round t, where t is a positive integer;
in round t, judge whether the current value of the flag variable checked first is 1:
if it is, consider the node pointed to by the node variable to correspond to the node selected by the learner in round t, take the arm selected in round t as the arm of that node, and reassign the flag variable;
if it is not, judge whether the value of the flag variable checked next is 0:
if it is, traverse all nodes in the reproduced HCT tree to find whether there is a node whose corresponding arm is the arm selected in round t; if such a node exists, consider it to correspond to the node selected by the learner in round t and update the corresponding variable; if no such node exists, expand a leaf node of the reproduced HCT tree.
Further, expanding a leaf node of the reproduced HCT tree includes:
finding the leaf node whose covered arm space contains the arm selected in round t, expanding its two child nodes, and judging which of the two child nodes covers an arm space containing that arm; the child node whose covered arm space contains the arm is considered to correspond to the node selected by the learner in round t, and the arm corresponding to that child node is the arm selected in round t; the node variable is made to point to the other child node, and the flag variable is then set.
Further, the feedback tampering module is specifically configured to:
obtain the feedback result of the environment, record it, and update the corresponding statistics;
depending on which of two conditions holds, either calculate the quantity used for tampering from the corresponding formula, in which M represents the number of rounds needed to ensure that the attack takes effect, T represents the total number of rounds for which the HCT algorithm runs, and N is one of two counts, or assign that quantity directly;
with a certain probability the tampering succeeds, in which case the tampered feedback is returned to the learner; if the tampering fails, the feedback is returned to the learner untampered; the round is then advanced;
where the remaining symbols denote, respectively: the number of times the learner has selected the node; the average of the original feedback of the environment over the rounds concerned; the set of rounds in which the learner selected the node; K, the target arm specified by the attacker; the arm space covered by a node; the HCT tree reproduced up to round t; and δ, a confidence parameter in the range (0, 1).
In a third aspect, the present invention provides an electronic device comprising:
a memory for storing a computer software program;
a processor for reading and executing the computer software program so as to implement the method for a poisoning attack on an X-armed-bandits-based personalized recommendation system.
In a fourth aspect, the present invention provides a non-transitory computer readable storage medium storing a computer software program for implementing the method for a poisoning attack on an X-armed-bandits-based personalized recommendation system according to the first aspect of the present invention.
The beneficial effects of the invention are as follows: in current research in this field, whether on manipulation attacks or poisoning attacks, most attacks target MAB algorithms, and attacks on X-armed bandits algorithms have not been studied. Moreover, the HCT algorithm, a typical X-armed bandits algorithm, differs substantially from the typical MAB algorithm UCB (Upper Confidence Bound algorithm), because the former must maintain a binary tree at run time to discretize the huge arm space, while the latter need not. The attacker's goal is to force the HCT algorithm to choose the node of the binary tree that contains the target arm, rather than to force the algorithm to choose a particular arm as when attacking UCB; as a result, existing poisoning and manipulation attacks against the UCB algorithm cannot be applied directly to the HCT algorithm.
Drawings
FIG. 1 is a schematic diagram of a workflow of an Internet of things service recommendation system and an attack mode of an attacker;
FIG. 2 is a schematic diagram of a poisoning attack model;
FIG. 3 is a flowchart of a method for a poisoning attack on an X-armed-bandits-based personalized recommendation system according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a poisoning attack device for an X-armed-bandits-based personalized recommendation system in a big data environment according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an embodiment of a computer readable storage medium according to an embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings; the examples given serve only to illustrate the invention and are not to be construed as limiting its scope.
This embodiment studies the vulnerability of HCT (High-Confidence Tree algorithm), a typical X-armed bandits algorithm, under a poisoning attack.
Take a video big data recommendation system as an example. As shown in fig. 1, all videos in the system are mapped to a metric space X (each video can be regarded as an arm), which is discretized by a hierarchical structure, namely a binary covering tree. After the system receives a user request, the X-armed bandits algorithm uses the covering tree to estimate the optimal video and pushes the recommendation to the user, who then returns feedback to the algorithm. Based on the feedback, the algorithm improves its recommendations round by round, so as to recommend video content more relevant to the particular user's interests. Meanwhile, an attacker can force the system to recommend specific video content by hijacking the recommendation results or the feedback.
To handle the huge arm space X and thereby optimize the reward, the HCT algorithm uses a hierarchical structure to discretize the arm space. The hierarchical structure is a binary covering tree in which the node at depth h with index i is denoted (h, i), so that the root node is (0, 1), and each node (h, i) has two child nodes, (h+1, 2i-1) and (h+1, 2i). Every node in the tree covers a subset of the arm space X, and these subsets satisfy the following three conditions:
At the same time, each node randomly selects a representative arm; whenever the node is selected, that arm is submitted to the environment.
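The covering-tree structure just described can be sketched as follows. This is only an illustration under assumptions not stated in the text: the arm space is taken to be the interval [0, 1], every node is split at its midpoint, and the names `Node`, `make_node` and `expand` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    h: int                           # depth of the node
    i: int                           # index within depth h, from 1 to 2**h
    lo: float                        # left end of the covered interval
    hi: float                        # right end of the covered interval
    arm: float                       # representative arm drawn inside the cell
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def contains(self, x: float) -> bool:
        # Half-open cells [lo, hi); the right-most cell also includes 1.0.
        return self.lo <= x < self.hi or (self.hi == 1.0 and x == 1.0)


def make_node(h: int, i: int, rng: random.Random) -> Node:
    """Create node (h, i) covering the i-th of 2**h equal cells of [0, 1]."""
    width = 1.0 / (2 ** h)
    lo = (i - 1) * width
    return Node(h, i, lo, i * width, lo + rng.random() * width)


def expand(node: Node, rng: random.Random) -> None:
    """Attach the two child nodes (h+1, 2i-1) and (h+1, 2i) to a node."""
    node.left = make_node(node.h + 1, 2 * node.i - 1, rng)
    node.right = make_node(node.h + 1, 2 * node.i, rng)
```

With this layout the root (0, 1) covers the whole interval and the two children of any node split its cell between them, so every arm belongs to exactly one node per depth.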
In each round t, in order to decide which node to choose, the HCT algorithm calculates an upper confidence bound (the U value) for each node, namely:
(1)
where two of the quantities in formula (1) are hyper-parameters used by the learner. The node chosen by the HCT algorithm in round t, the set of rounds up to round t in which the HCT algorithm selected a given node, the size of that set, and the corresponding average feedback are defined accordingly. In addition, a dissimilarity function is defined on the arm space; the diameter of a subset of the arm space is defined with respect to this function, and the subset covered by each node must satisfy a bound on its diameter.
To obtain a tighter upper confidence bound, in addition to the U value the HCT algorithm also calculates a B value for each node, namely:
(2)
It can be seen that the larger a node's B value, the more likely the node is to contain the optimal arm, which is the basis on which HCT selects nodes. In each round t, the HCT algorithm starts from the root node (0, 1) and, at each level, descends to whichever of the two child nodes has the larger B value, until it reaches a leaf node or a node for which the following formula holds (the node reached is the node selected by the HCT algorithm in round t):
(3)
The arm of the selected node is then submitted to the environment, feedback is obtained, the statistics of the selected node are updated, and all nodes on the path from the root node to the selected node are updated. Finally, it is checked whether the following formula holds for the current node:
(4)
If the above formula holds, the two child nodes of the node are expanded, and the U values of the two child nodes are initialized.
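The selection procedure described above can be sketched as follows. Because formulas (1) to (4) are not reproduced in this text, the sketch relies on assumptions taken from the published HCT algorithm rather than from the patent itself: the B value of a leaf equals its U value, the B value of an internal node is the minimum of its own U value and the larger B value of its children, newly created nodes start with an infinite U value, and the traversal stops at a node whose pull count is below a depth-dependent threshold. The names `Node`, `b_value`, `select` and the threshold callback `tau` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    h: int
    i: int
    u: float = math.inf              # U value; assumed infinite until the node is pulled
    pulls: int = 0                   # number of times the node has been selected
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def b_value(node: Node) -> float:
    """Assumed B-value recursion: tighten U with the best child B value."""
    if node.left is None:
        return node.u
    return min(node.u, max(b_value(node.left), b_value(node.right)))


def select(root: Node, tau: Callable[[int], float]) -> Node:
    """Descend from the root towards the child with the larger B value.

    Stops at a leaf or at a node pulled fewer than tau(depth) times
    (assumed stand-in for the stopping condition of formula (3)).
    """
    node = root
    while node.left is not None and node.pulls >= tau(node.h):
        if b_value(node.left) >= b_value(node.right):
            node = node.left
        else:
            node = node.right
    return node
```

Recomputing B values recursively on every step keeps the sketch short; an implementation meant for long runs would more likely cache them and refresh only the nodes on the selected path.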
The model of the poisoning attack is shown in fig. 2. In round t: first, the learner selects an arm, which is at the same time intercepted by the attacker; second, the environment receives the arm and gives feedback, which is intercepted by the attacker; third, the attacker tampers with the feedback by adding an attack amount to it and submits the tampered feedback to the learner. It should be noted that, because of environmental restrictions (such as limits on the attacker's capabilities or defensive measures adopted by the learner), the attacker succeeds in tampering with the feedback only with a certain probability.
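The interception model above can be sketched as a single function that stands between the environment and the learner. The function name `deliver_feedback`, the signed `offset` argument and the symbol `p` for the success probability are assumptions made for illustration; how the offset is computed is a separate question.

```python
import random


def deliver_feedback(reward: float, offset: float, p: float, rng: random.Random):
    """Return (feedback passed to the learner, whether tampering succeeded).

    reward -- original feedback produced by the environment
    offset -- signed amount the attacker tries to add to the feedback
    p      -- assumed probability that a tampering attempt succeeds
    """
    if rng.random() < p:
        return reward + offset, True
    # Tampering failed: the learner sees the original feedback.
    return reward, False
```

For example, `deliver_feedback(0.8, -0.5, 1.0, random.Random(0))` always tampers, while a success probability of `0.0` always passes the original feedback through.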
Based on the above, an embodiment of the present invention provides a method for a poisoning attack on a personalized recommendation system based on X-armed bandits, as shown in fig. 3. Before this embodiment is described, the parameters it uses and their meanings are explained.
the arm space X, i.e. the set of all arms;
the target arm K specified by the attacker;
the HCT tree reproduced by the attacker up to round t;
the set of leaf nodes of the reproduced HCT tree;
the number of nodes in the covering tree;
the node (h, i) of the covering tree at depth h with index i, whose two child nodes are (h+1, 2i-1) and (h+1, 2i); each node covers a part of the arm space, and the root node (0, 1) covers the whole arm space;
the node selected by the learner in round t;
the leaf node of the reproduced HCT tree that contains the target arm K in round t;
the subspace of the arm space covered by a node;
the arm of a node: when a node is selected by the HCT algorithm, an arm is drawn at random from the subspace it covers; thereafter, whenever the node is selected, this arm is submitted to the environment. In addition, because the data volume in a big data environment is extremely large, and in order to keep the recommended content diverse, the arms of different nodes are generally different from one another;
the set of rounds, up to round t, in which the learner selected a given node;
the number of times the learner has selected a given node up to round t, i.e. the size of that set;
the average of the original feedback of the environment over those rounds, computed from the original feedback of the environment in each round s of the set;
the average of the tampered feedback over those rounds, computed from the feedback after tampering by the attacker in each round s of the set;
a function B(N, δ, t) with N, δ and t as arguments, where δ is a parameter with value range (0, 1); the function B(N, δ, t) is used to calculate a confidence interval for the mean of a group of independent, identically distributed random variables, and the confidence level of the interval is 1 - δ;
two flag variables, each with initial value 0, and a node variable.
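The per-node bookkeeping listed above (selection count, average of the original feedback, average of the tampered feedback) can be sketched as follows. The class name `NodeStats` is hypothetical, and `hoeffding_radius` is only a stand-in: the patent's own function B(N, δ, t) is not reproduced in this text, so a standard Hoeffding radius for rewards in [0, 1] is assumed instead.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeStats:
    pulls: int = 0               # number of rounds in which the node was selected
    mean_original: float = 0.0   # average of the environment's original feedback
    mean_tampered: float = 0.0   # average of the feedback the learner actually saw

    def update(self, original: float, tampered: float) -> None:
        """Record one more selection of the node with both feedback values."""
        self.pulls += 1
        self.mean_original += (original - self.mean_original) / self.pulls
        self.mean_tampered += (tampered - self.mean_tampered) / self.pulls


def hoeffding_radius(n: int, delta: float) -> float:
    """Assumed stand-in for B(N, delta, t): half-width of a 1 - delta
    confidence interval for the mean of n i.i.d. samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))
```

Keeping both averages side by side lets the attacker compare what the environment really returned with what the learner believes, which is the comparison the tampering step needs.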
The poisoning attack method for a personalized recommendation system based on X-armed bandits provided by the embodiment of the invention comprises the following:
S100, acquiring the recommendation result that a learner recommends to the environment, and reproducing the learner's HCT tree according to the recommendation result;
For the attacker, each attack it launches is known to succeed in tampering only with a certain probability; to guarantee the effectiveness of the attack, a variable M is therefore introduced,
where M represents the number of rounds needed to ensure that the attack takes effect and T represents the total number of rounds for which the HCT algorithm runs; the attacker can choose the value as needed, provided that the required constraint is satisfied.
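Specifically, step S100 includes the following sub-steps:
S110, in round t, judging whether the current value of the flag variable checked first is 1:
if it is, the node pointed to by the node variable is considered to correspond to the node selected by the learner in round t, the arm selected in round t is taken as the arm of that node, the flag variable is reassigned, and step S200 is then performed;
if it is not, step S120 is performed;
S120, judging whether the value of the flag variable checked in this step is 0:
if it is, all nodes in the reproduced HCT tree are traversed to find whether there is a node whose corresponding arm is the arm selected in round t; if such a node exists, it is considered to correspond to the node selected by the learner in round t, the corresponding variable is updated, and step S200 is performed; if no such node exists, step S130 is performed;
if it is not, step S200 is performed.
S130, finding the leaf node whose covered arm space contains the arm selected in round t, and expanding the two child nodes of that leaf node; judging which of the two child nodes covers an arm space containing the arm: the child node whose covered arm space contains the arm is considered to correspond to the node selected by the learner in round t, the arm corresponding to that child node is the arm selected in round t, and the node variable is made to point to the other child node. For example: if the arm lies in the space covered by the first child node, the first child node is considered to correspond to the node selected by the learner, and the node variable points to the second child node; conversely, if the arm lies in the space covered by the second child node, the second child node is considered to correspond to the node selected by the learner, and the node variable points to the first child node.
The flag variable is then set, and step S200 is performed.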
S200, obtaining the feedback result of the environment, tampering with the feedback result based on the reproduced HCT tree, and returning it to the learner.
Specifically, step S200 includes the following sub-steps:
S210, obtaining the feedback result of the environment, recording it, and updating the corresponding statistics;
S220, depending on which of two conditions holds, either calculating the quantity used for tampering from the corresponding formula or assigning that quantity directly;
S230, with a certain probability the tampering succeeds, in which case the tampered feedback is returned to the learner; if the tampering fails, the feedback is returned to the learner untampered;
S240, advancing to the next round while retaining the reproduced HCT tree, and then jumping to step S110.
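The sub-steps of S200 can be sketched as below. The patent's formula for the tampering quantity is not reproduced in this text, so the rule used here is an assumption borrowed from the general style of reward-poisoning attacks on UCB: when the selected node does not contain the target arm, the feedback is lowered just enough that the node's average stays below the target leaf's average minus a margin. The function names, the `margin` parameter and the condition `contains_target` are all hypothetical.

```python
import random


def attack_offset(reward, node_mean, node_pulls, target_mean, margin):
    """Signed offset to add to `reward` (assumed rule, not the patent's formula).

    After the offset, the node's average over node_pulls + 1 samples is at
    most target_mean - margin. The offset is never positive.
    """
    n = node_pulls + 1
    desired_mean = target_mean - margin
    needed_reward = n * desired_mean - node_pulls * node_mean
    return min(0.0, needed_reward - reward)


def poison_round(reward, contains_target, node_mean, node_pulls,
                 target_mean, margin, p, rng):
    """Return the feedback handed to the learner for one round."""
    if contains_target:
        offset = 0.0   # nodes containing the target arm are left alone
    else:
        offset = attack_offset(reward, node_mean, node_pulls, target_mean, margin)
    # The tampering attempt succeeds only with probability p (assumed symbol).
    if rng.random() < p:
        return reward + offset
    return reward
```

Note that under this assumed rule the tampered feedback may fall outside the original reward range; whether that is acceptable depends on how the learner validates its inputs.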
The attacker has no direct access to the specific structure of the HCT tree maintained by the learner, but can reproduce it from the arm selected by the learner and the feedback of the environment obtained in each round.
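The way the attacker rebuilds the tree from observed arms alone can be sketched as follows, following the three cases of sub-steps S110 to S130. Several details are assumptions: the arm space is taken to be [0, 1] with midpoint splits, the single attribute `pending` plays the role of the node variable together with its flag, and the names `Node`, `TreeReplica` and `observe` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    h: int
    i: int
    lo: float
    hi: float
    arm: Optional[float] = None      # representative arm, once observed
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def contains(self, x: float) -> bool:
        return self.lo <= x < self.hi or (self.hi == 1.0 and x == 1.0)


class TreeReplica:
    """Attacker-side copy of the learner's tree, built from observed arms."""

    def __init__(self) -> None:
        self.root = Node(0, 1, 0.0, 1.0)
        self.pending: Optional[Node] = None   # sibling expected to be selected next

    def nodes(self) -> Iterator[Node]:
        stack = [self.root]
        while stack:
            node = stack.pop()
            yield node
            if node.left is not None:
                stack.extend((node.left, node.right))

    def observe(self, arm: float) -> Node:
        """Return the replica node matched to the learner's selection."""
        # Case 1: a sibling was flagged in the previous round.
        if self.pending is not None:
            node, self.pending = self.pending, None
            node.arm = arm
            return node
        # Case 2: the arm already belongs to a known node.
        for node in self.nodes():
            if node.arm == arm:
                return node
        # Case 3: unseen arm, so expand the leaf whose cell contains it.
        leaf = self.root
        while leaf.left is not None:
            leaf = leaf.left if leaf.left.contains(arm) else leaf.right
        mid = (leaf.lo + leaf.hi) / 2.0
        leaf.left = Node(leaf.h + 1, 2 * leaf.i - 1, leaf.lo, mid)
        leaf.right = Node(leaf.h + 1, 2 * leaf.i, mid, leaf.hi)
        if leaf.left.contains(arm):
            child, sibling = leaf.left, leaf.right
        else:
            child, sibling = leaf.right, leaf.left
        child.arm = arm
        self.pending = sibling
        return child
```

Matching nodes by exact arm value relies on the remark in the text that the arms of different nodes are generally distinct; with repeated arms the lookup in case 2 would be ambiguous.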
Through the above steps, the attacker manipulates the HCT algorithm, i.e. makes it select the node containing the target arm as often as possible. More specifically, assuming that the total number of rounds for which the HCT algorithm runs is T, the attacker can carry out the attack with an attack cost whose upper bound is sublinear in T, and the nodes selected by the HCT algorithm then contain the target arm specified by the attacker in at least a guaranteed number of rounds.
As shown in fig. 4, an embodiment of the present invention further provides a poisoning attack device for a personalized recommendation system based on X-armed bandits, including:
an HCT tree reproduction module, configured to obtain the recommendation result that a learner recommends to the environment and to reproduce the learner's HCT tree according to the recommendation result;
a feedback tampering module, configured to obtain the feedback result of the environment, tamper with the feedback result based on the reproduced HCT tree, and return it to the learner.
Further, the HCT tree reproduction module is specifically configured to:
define two flag variables and a node variable, the node variable being used to point to a node in the reproduced HCT tree; and record the node selected by the learner from the HCT tree in round t and the arm selected in round t, where t is a positive integer;
in round t, judge whether the current value of the flag variable checked first is 1:
if it is, consider the node pointed to by the node variable to correspond to the node selected by the learner in round t, take the arm selected in round t as the arm of that node, and reassign the flag variable;
if it is not, judge whether the value of the flag variable checked next is 0:
if it is, traverse all nodes in the reproduced HCT tree to find whether there is a node whose corresponding arm is the arm selected in round t; if such a node exists, consider it to correspond to the node selected by the learner in round t and update the corresponding variable; if no such node exists, expand a leaf node of the reproduced HCT tree.
Further, expanding a leaf node of the reproduced HCT tree includes:
finding the leaf node whose covered arm space contains the arm selected in round t, expanding its two child nodes, and judging which of the two child nodes covers an arm space containing that arm; the child node whose covered arm space contains the arm is considered to correspond to the node selected by the learner in round t, and the arm corresponding to that child node is the arm selected in round t; the node variable is made to point to the other child node, and the flag variable is then set.
Further, the feedback tampering module is specifically configured to:
S210, obtain the feedback result of the environment, record it, and update the corresponding statistics;
S220, depending on which of two conditions holds, either calculate the quantity used for tampering from the corresponding formula, in which N is one of two counts, or assign that quantity directly;
with a certain probability the tampering succeeds, in which case the tampered feedback is returned to the learner; if the tampering fails, the feedback is returned to the learner untampered; the round is then advanced;
where the remaining symbols denote, respectively: the number of times the learner has selected the node; the average of the original feedback of the environment over the rounds concerned; the set of rounds in which the learner selected the node; K, the target arm specified by the attacker; the arm space covered by a node; the HCT tree reproduced up to round t; and δ, a confidence parameter in the range (0, 1).
Referring to fig. 5, fig. 5 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the invention. As shown in fig. 5, an embodiment of the present invention provides an electronic device 500, including a memory 510, a processor 520, and a computer program 511 stored on the memory 510 and executable on the processor 520, wherein the processor 520 executes the computer program 511 to implement the following steps:
S100, acquiring the recommendation result that a learner recommends to the environment, and reproducing the learner's HCT tree according to the recommendation result;
S200, obtaining the feedback result of the environment, tampering with the feedback result based on the reproduced HCT tree, and returning it to the learner.
Referring to fig. 6, fig. 6 is a schematic diagram of an embodiment of a computer readable storage medium according to an embodiment of the invention. As shown in fig. 6, the present embodiment provides a computer-readable storage medium 600 having stored thereon a computer program 611, which computer program 611 when executed by a processor implements the steps of:
S100, acquiring the recommendation result that a learner recommends to the environment, and reproducing the learner's HCT tree according to the recommendation result;
S200, obtaining the feedback result of the environment, tampering with the feedback result based on the reproduced HCT tree, and returning it to the learner.
The foregoing embodiments are each described with a different emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A poisoning attack method for a personalized recommendation system based on X-armed bandits, comprising:
S100, acquiring the recommendation result that the learner recommends to the environment, and reproducing the learner's HCT tree according to the recommendation result; the learner refers to the personalized recommendation system, and the environment refers to the users served by the personalized recommendation system;
S200, acquiring the feedback result of the environment, tampering with the feedback result based on the reproduced HCT tree, and returning the tampered feedback result to the learner.
2. The method of claim 1, wherein the reproducing the HCT tree of the learner based on the recommendation result comprises:
S110, defining flag variables [...] and [...] and a node variable [...], the node variable [...] being used to point to a node in the reproduced HCT tree; and denoting the node selected by the learner from the HCT tree in round t as [...] and the selected arm as [...], t being a positive integer;
S120, in round t, setting [...], and judging whether the current value of [...] is 1:
if [...], considering node [...] to correspond to [...], setting [...] to [...], and then performing step S200;
if [...], performing step S130;
S130, judging whether the value of [...] is 0:
if [...], traversing all nodes in the reproduced HCT tree to find whether there is a node [...] whose corresponding arm is [...]; if such a node exists, considering node [...] to correspond to [...], setting [...], and performing step S200; if no such node exists, expanding a leaf node of the reproduced HCT tree;
if [...], performing step S200.
3. The method of claim 2, wherein the expanding a leaf node of the reproduced HCT tree comprises:
S140, finding the leaf node [...] whose covered arm space contains the arm [...], expanding it into two child nodes, and determining which of the two child nodes covers an arm space containing the arm [...]; if the arm space covered by one of the child nodes contains the arm [...], considering that child node to be [...], the arm corresponding to that child node being the same as [...]; making the node variable [...] point to the other child node, then setting [...], and performing step S200.
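The node lookup of claim 2 and the leaf expansion of claim 3 can be illustrated with a minimal sketch. It rests on assumptions that the claims do not state: the arm space is the interval [0, 1], each node covers a sub-interval and is represented by its midpoint, and a leaf is split into two equal halves. The names `Node`, `find_node` and `expand_leaf` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    lo: float
    hi: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def arm(self) -> float:
        # Assumption: a node is represented by the midpoint of its interval.
        return (self.lo + self.hi) / 2

    def is_leaf(self) -> bool:
        return self.left is None


def find_node(root: Node, arm: float) -> Optional[Node]:
    """Traverse all nodes and return the one whose arm equals `arm`, if any."""
    stack = [root]
    while stack:
        node = stack.pop()
        # Exact comparison is safe here: dyadic midpoints are exact in floats.
        if node.arm == arm:
            return node
        if not node.is_leaf():
            stack.extend([node.left, node.right])
    return None


def expand_leaf(root: Node, arm: float) -> Tuple[Node, Node]:
    """Split the leaf covering `arm`; return (child covering arm, other child)."""
    node = root
    while not node.is_leaf():
        node = node.left if arm < node.left.hi else node.right
    mid = (node.lo + node.hi) / 2
    node.left, node.right = Node(node.lo, mid), Node(mid, node.hi)
    return (node.left, node.right) if arm < mid else (node.right, node.left)
```

Under these assumptions, the attacker first calls `find_node`; when it returns `None`, `expand_leaf` yields the child matching the observed arm together with its sibling, which is the node that the node variable of claim 3 would then point to.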
4. The method of claim 3, wherein the acquiring the feedback result of the environment, tampering with the feedback result based on the reproduced HCT tree, and returning the tampered feedback result to the learner comprises:
S210, acquiring the feedback result of the environment, letting [...], and updating [...];
S220, if [...], calculating [...],
wherein [...]; m represents the round that ensures the attack can take effect, T represents the total number of rounds for which the HCT algorithm runs, and [...]; N is [...] or [...];
if [...], setting [...];
S230, the probability of successful tampering being [...]: if the tampering succeeds, setting [...] and returning it to the learner; if the tampering fails, then [...];
S240, letting [...], and jumping to step S120;
[...] represents [...] of the learner selecting node [...]; [...] represents the average value of the original feedback of the environment over [...] rounds; [...] represents [...] of the learner selecting node [...]; K represents the target arm specified by the attacker; [...] represents the arm space covered by node [...]; [...] represents the HCT tree reproduced up to round t; and [...] is a confidence parameter in the range (0, 1).
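The bookkeeping of step S210 and the probabilistic branch of step S230 can be illustrated with a minimal sketch. The expressions for the success probability and for the tampered value are those defined in this claim and are not restated here; in the sketch they are plain arguments `p` and `tampered`. The names `NodeStats` and `tamper_with_probability`, and the use of an incremental mean for the original feedback, are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Per-node record kept by the attacker (assumed layout)."""
    count: int = 0     # number of times the learner selected this node
    mean: float = 0.0  # average of the ORIGINAL environment feedback

    def record(self, feedback: float) -> None:
        # Incremental mean update, avoids storing the feedback history.
        self.count += 1
        self.mean += (feedback - self.mean) / self.count


def tamper_with_probability(original: float, tampered: float, p: float,
                            rng: random.Random) -> float:
    """Return `tampered` with probability p, otherwise the original feedback."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so the test succeeds with probability p.
    return tampered if rng.random() < p else original
```

A caller would record the original feedback in `NodeStats` before tampering, so that the stored average is not contaminated by the attacker's own modifications.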
5. A poisoning attack device for a personalized recommendation system based on X-armed bandits, comprising:
an HCT tree reproduction module, configured to acquire the recommendation result that the learner recommends to the environment and to reproduce the learner's HCT tree according to the recommendation result; the learner refers to the personalized recommendation system, and the environment refers to the users served by the personalized recommendation system;
and a feedback tampering module, configured to acquire the feedback result of the environment, tamper with the feedback result based on the reproduced HCT tree, and return the tampered feedback result to the learner.
6. The apparatus of claim 5, wherein the HCT tree reproduction module is specifically configured to:
define flag variables [...] and [...] and a node variable [...], the node variable [...] being used to point to a node in the reproduced HCT tree; and denote the node selected by the learner from the HCT tree in round t as [...] and the selected arm as [...], t being a positive integer;
in round t, set [...], and judge whether the current value of [...] is 1:
if [...], consider node [...] to correspond to [...], and set [...] to [...];
if [...], judge whether the value of [...] is 0:
if [...], traverse all nodes in the reproduced HCT tree to find whether there is a node [...] whose corresponding arm is [...]; if such a node exists, consider node [...] to correspond to [...], and set [...]; if no such node exists, expand a leaf node of the reproduced HCT tree.
7. The apparatus of claim 6, wherein the expanding a leaf node of the reproduced HCT tree comprises:
finding the leaf node [...] whose covered arm space contains the arm [...], expanding it into two child nodes, and determining which of the two child nodes covers an arm space containing the arm [...]; if the arm space covered by one of the child nodes contains the arm [...], considering that child node to be [...], the arm corresponding to that child node being the same as [...]; making the node variable [...] point to the other child node, and then setting [...].
8. The apparatus of claim 7, wherein the feedback tampering module is specifically configured to:
acquire the feedback result of the environment, let [...], and update [...];
if [...], calculate [...],
wherein [...]; m represents the round that ensures the attack can take effect, T represents the total number of rounds for which the HCT algorithm runs, and [...]; N is [...] or [...];
if [...], set [...];
the probability of successful tampering being [...]: if the tampering succeeds, set [...] and return it to the learner; if the tampering fails, then [...]; then let [...];
[...] represents [...] of the learner selecting node [...]; [...] represents the average value of the original feedback of the environment over [...] rounds; [...] represents [...] of the learner selecting node [...]; K represents the target arm specified by the attacker; [...] represents the arm space covered by node [...]; [...] represents the HCT tree reproduced up to round t; and [...] is a confidence parameter in the range (0, 1).
9. An electronic device, comprising:
a memory for storing a computer software program;
a processor for reading and executing the computer software program to implement the poisoning attack method for a personalized recommendation system based on X-armed bandits according to any one of claims 1-4.
10. A non-transitory computer-readable storage medium, characterized in that the storage medium stores a computer software program for implementing the poisoning attack method for a personalized recommendation system based on X-armed bandits according to any one of claims 1-4.
CN202310880108.3A 2023-07-18 2023-07-18 Method and device for poisoning attack on personalized recommendation system Pending CN116595528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310880108.3A CN116595528A (en) 2023-07-18 2023-07-18 Method and device for poisoning attack on personalized recommendation system

Publications (1)

Publication Number Publication Date
CN116595528A true CN116595528A (en) 2023-08-15

Family

ID=87606656

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310880108.3A Pending CN116595528A (en) 2023-07-18 2023-07-18 Method and device for poisoning attack on personalized recommendation system

Country Status (1)

Country Link
CN (1) CN116595528A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination