CN111309975A - Method and system for enhancing attack resistance of graph model - Google Patents

Method and system for enhancing attack resistance of graph model

Info

Publication number
CN111309975A
Authority
CN
China
Prior art keywords
point
graph
edge
target
points
Prior art date
Legal status
Pending
Application number
CN202010105695.5A
Other languages
Chinese (zh)
Inventor
皇甫志刚
林建滨
任彦昆
梁琛
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority claimed from CN202010105695.5A
Publication of CN111309975A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a method and a system for enhancing the attack resistance of a graph model. The method comprises the following steps: acquiring target graph data comprising a first point set V1 and an edge set E; randomly generating a perturbation edge set ΔE of n edges, whose endpoints form a second point set V2; arbitrarily selecting a correction point from the second point set V2; randomly selecting a number of points from the first point set V1 to form a candidate target point set V3 for the correction point; selecting a target point from the candidate target point set V3 and replacing the correction point with the selected target point, so as to update the perturbation edge set ΔE and obtain a perturbation edge set ΔE'; adjusting the edge set E based on the perturbation edge set ΔE' to obtain adjusted target graph data; repeatedly selecting correction points from the second point set V2 and iteratively adjusting the edge set E to obtain adversarial graph data; and adjusting the graph model based on the adversarial graph data. The target graph data may include personal information, and the graph model may be a machine learning model.

Description

Method and system for enhancing attack resistance of graph model
Technical Field
The application relates to the technical field of computers, in particular to a method and a system for enhancing attack resistance of a graph model.
Background
Graph models have a wide range of application scenarios, covering natural science, engineering, social economics, management, and other fields. However, adding only a small number of perturbations to the graph structure can cause a graph model to output unexpected results. In order to anticipate potential threats to a model in advance, research on adversarial attack algorithms for graph models helps to discover model vulnerabilities, and is of great significance for robustness testing and secure application of such models.
Based on this, the application provides a method and a system for enhancing the attack resistance of a graph model.
Disclosure of Invention
One embodiment of the application provides a method for enhancing the attack resistance of a graph model. The method comprises the following steps: acquiring target graph data, wherein the target graph data comprises a first point set V1 and an edge set E; randomly generating a perturbation edge set ΔE of n edges, wherein the endpoints of the n edges form a second point set V2 and all exist in the first point set V1; arbitrarily selecting a correction point from the second point set V2; randomly selecting a number of points from the first point set V1 to form a candidate target point set V3 for the correction point; selecting a target point from the candidate target point set V3 and replacing the correction point with the selected target point, so as to update the perturbation edge set ΔE and obtain a perturbation edge set ΔE'; adjusting the edge set E based on the perturbation edge set ΔE' to obtain adjusted target graph data, and accepting the replacement when the absolute value of the difference between the output produced by the graph model for the adjusted target graph data and the corresponding true result increases; repeatedly selecting correction points from the second point set V2 and iteratively adjusting the edge set E to obtain adversarial graph data; and adjusting the graph model based on the adversarial graph data so as to enhance its attack resistance.
One of the embodiments of the present application provides a system for enhancing the attack resistance of a graph model. The system may include: an obtaining module, configured to acquire target graph data comprising a first point set V1 and an edge set E; a generating module, configured to arbitrarily generate a perturbation edge set ΔE of n edges, wherein the endpoints of the n edges form a second point set V2 and all exist in the first point set V1; a first selection module, configured to arbitrarily select a correction point from the second point set V2; a second selection module, configured to arbitrarily select a number of points from the first point set V1 to form a candidate target point set V3 for the correction point; a replacing module, configured to select a target point from the candidate target point set V3 and replace the correction point with the selected target point, so as to update the perturbation edge set ΔE and obtain a perturbation edge set ΔE'; an adjusting module, configured to adjust the edge set E based on the perturbation edge set ΔE' to obtain adjusted target graph data, and to accept the replacement when the absolute value of the difference between the output produced by the graph model for the adjusted target graph data and the corresponding true result increases; an adversarial graph data determination module, configured to obtain adversarial graph data as the first selection module repeatedly selects correction points from the second point set V2 and the adjusting module iteratively adjusts the edge set E; and an enhancing module, configured to adjust the graph model based on the adversarial graph data so as to enhance its attack resistance.
An aspect of the embodiments of the present specification provides an apparatus for enhancing the attack resistance of a graph model, comprising a processor configured to execute any one of the methods described above.
An aspect of the embodiments of the present specification provides a computer-readable storage medium storing computer instructions; when the instructions in the storage medium are read by a computer, the computer executes any one of the methods described above.
Drawings
The present application will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a block diagram of a system for enhancing the attack resistance of a graph model according to some embodiments of the present application;
FIG. 2 is an exemplary flow diagram of a method for enhancing the attack resistance of a graph model according to some embodiments of the present application;
FIG. 3 is an exemplary flow diagram of adjusting a graph model to enhance its attack resistance according to some embodiments of the present application;
FIG. 4 is an exemplary graph of target graph data shown in accordance with some embodiments of the present application;
FIG. 5 is an example graph of adjusted target graph data according to some embodiments of the present application;
and FIG. 6 is an exemplary graph of adversarial graph data according to some embodiments of the present application.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only examples or embodiments of the application; for a person skilled in the art, the application can also be applied to other similar scenarios according to these drawings without inventive effort. Unless otherwise apparent from the context or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used herein is a method for distinguishing different components, elements, parts, portions or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In general, the terms "comprise" and "include" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flow charts are used herein to illustrate operations performed by systems according to embodiments of the present application. It should be understood that the operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to the processes, or one or more steps may be removed from them.
FIG. 1 is a block diagram of a system for enhancing the attack resistance of a graph model according to some embodiments of the present application.
As shown in FIG. 1, the system 100 for enhancing the attack resistance of a graph model may include an obtaining module 110, a generating module 120, a first selection module 130, a second selection module 140, a replacing module 150, an adjusting module 160, an adversarial graph data determination module 170, and an enhancing module 180.
The acquisition module 110 may be used to acquire target graph data, which may include a first set of points V1 and an edge set E.
The generating module 120 may be configured to arbitrarily generate a perturbed edge set Δ E of n edges, where nodes of the n edges constitute a second point set V2, and the nodes of the n edges all exist in the first point set V1.
The first selection module 130 may be configured to arbitrarily select a correction point from the second set of points V2.
The second selection module 140 may be configured to arbitrarily select a number of points from the first point set V1 to form a candidate target point set V3 for the correction point.
The replacing module 150 may be configured to select a target point from the candidate target point set V3, and replace the correction point with the selected target point, so as to update the perturbed edge set Δ E to obtain a perturbed edge set Δ E'.
The adjusting module 160 may be configured to adjust the edge set E based on the perturbation edge set ΔE' to obtain adjusted target graph data, and to accept the replacement when the absolute value of the difference between the output produced by the graph model for the adjusted target graph data and the corresponding true result increases.
In some embodiments, the adjusting module 160 may be further configured to, for each edge in the perturbed edge set Δ E, delete the edge from the edge set E if the edge exists in the edge set E; and if the edge does not exist in the edge set E, adding the edge in the edge set E.
The adversarial graph data determination module 170 may be configured to repeatedly select correction points from the second point set V2 and iteratively adjust the edge set E to obtain adversarial graph data.
In some embodiments, the adversarial graph data determination module 170 may be further configured to, while the first selection module repeatedly selects correction points from the second point set V2 so that the adjusting module repeatedly adjusts the edge set E based on the perturbation edge set ΔE', determine whether the number of repetitions is greater than or equal to a threshold, and, when it is, use the currently obtained target graph data as the adversarial graph data.
The enhancing module 180 may be configured to adjust the graph model based on the adversarial graph data so as to enhance the attack resistance of the graph model.
In some embodiments, the enhancing module 180 may be further configured to input the adversarial graph data into the graph model to obtain outputs corresponding to each point in the adversarial graph data, each output characterizing the probability that the point belongs to the class corresponding to that output. For a given point in the adversarial graph data, if the class corresponding to the maximum value among all outputs of the point differs from the true class of the point, the attack corresponding to that point is judged to be successful. For a plurality of points in the adversarial graph data, the proportion of successful attacks among the total number of attacks is counted, and the graph model is adjusted based on this proportion so as to enhance its attack resistance.
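The success-rate evaluation described in the preceding paragraph can be sketched as follows. This is an illustrative reconstruction in Python; the function name and the list-based input format are assumptions, not part of the patent:

```python
def attack_success_rate(predictions, true_labels):
    """For each point, the attack succeeds when the class with the maximum
    model output differs from the point's true class; return the fraction
    of successful attacks over all attacked points."""
    successes = sum(
        1 for probs, true_cls in zip(predictions, true_labels)
        if max(range(len(probs)), key=probs.__getitem__) != true_cls
    )
    return successes / len(true_labels)

# Two attacked points: the first is misclassified (attack succeeds),
# the second is still classified correctly.
rate = attack_success_rate([[0.2, 0.8], [0.9, 0.1]], [0, 0])
```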
In some embodiments, the graph model is a model for classifying points in graph data, nodes in the graph data being used to characterize entity objects.
It should be understood that the system and its modules shown in FIG. 1 may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of the two. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer-executable instructions and/or processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD- or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules may be implemented not only by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices, but also by software executed by various types of processors, or by a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above description of the system for enhancing the attack resistance of a graph model and its modules is only for convenience of description and is not intended to limit the present application to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given an understanding of the principle of the system, any combination of the modules may be made, or a subsystem may be constructed and connected to other modules, without departing from this principle. For example, the obtaining module 110, the generating module 120, the first selection module 130, the second selection module 140, the replacing module 150, the adjusting module 160, the adversarial graph data determination module 170, and the enhancing module 180 disclosed in FIG. 1 may be different modules in one system, or a single module may implement the functions of two or more of the modules described above. As another example, in the system 100, the modules may share one storage module, or each module may have its own storage module. Such variations are within the scope of the present application.
FIG. 2 is an exemplary flow diagram of a method for enhancing the attack resistance of a graph model according to some embodiments of the present application. As shown in FIG. 2, the method 200 for enhancing the attack resistance of a graph model includes:
in step 202, target graph data is obtained, wherein the target graph data comprises a first point set V1 and an edge set E.
Specifically, step 202 may be performed by the obtaining module 110.
In some embodiments, graph data is data in the form of graph objects. The graph data may be a graph structure containing nodes and edges. The nodes may be used to characterize physical objects, such as individuals, businesses, places, things, categories, or other data. The edge may be a connection between two nodes representing the way the two nodes are associated. In some embodiments, the edges may have directions, i.e., the corresponding graph is referred to as a directed graph. In some embodiments, the edges may not have a direction, i.e., the corresponding graph is referred to as an undirected graph. In some embodiments, the nodes and/or the edges may have attributes. The attributes may include symbolic labels or numeric attributes such as cost, capacity, length, weight, etc. In some embodiments, the nodes and/or the edges may not have attributes.
In this specification, the set of nodes in graph data is denoted by V, the set of edges is denoted by E, and graph data is written in the form G(V, E); if A and B are nodes in V, the edge between A and B is denoted (A-B).
In some embodiments, the target graph data refers to graph data applicable to the graph model whose attack resistance is to be enhanced. The target graph data may include a first point set V1 and an edge set E. The first point set V1 is the set of nodes in the target graph data; for example, in the example graphs of FIG. 4 (target graph data), FIG. 5 (adjusted target graph data), and FIG. 6 (adversarial graph data), V1 = {A, B, C, D, E, F, H, I}. The edge set E is the set of edges in the target graph data; for example, the edge set of the target graph data in FIG. 4 is E = {(A-B), (B-C), (C-D), (C-E), (B-F), (F-I), (F-H)}, the edge set of the adjusted target graph data in FIG. 5 is E' = {(B-C), (C-D), (C-E), (B-F), (F-I), (F-H), (B-I)}, and the edge set of the adversarial graph data in FIG. 6 is {(A-B), (B-C), (C-D), (B-E), (C-E), (B-F), (B-I), (F-H)}.
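As an illustrative sketch (not part of the patent), the target graph data of FIG. 4 can be encoded in Python with undirected edges represented as two-element frozensets; this encoding is an assumption made for the example:

```python
# Hypothetical encoding of the FIG. 4 target graph data G(V1, E):
# nodes are strings, undirected edges are two-element frozensets.
V1 = {"A", "B", "C", "D", "E", "F", "H", "I"}
E = {frozenset(edge) for edge in [
    ("A", "B"), ("B", "C"), ("C", "D"), ("C", "E"),
    ("B", "F"), ("F", "I"), ("F", "H"),
]}
# Every endpoint of every edge lies in V1, as required of target graph data.
```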
In some embodiments, the target graph data may be constructed manually; for example, it may be constructed from obtained social network relationship data, in which a node may represent a specific user and an edge may represent an address-book friend relationship between users. In some embodiments, the target graph data may also come from an existing graph database. This specification is not particularly limited in this respect.
In some embodiments, the target graph data may be obtained by a terminal or a server through data communication, for example, the terminal sends the target graph data in the terminal to the server through wireless transmission, so as to obtain the target graph data by the server.
Step 204: arbitrarily generate a perturbation edge set ΔE of n edges, wherein the endpoints of the n edges form a second point set V2 and all exist in the first point set V1.
In particular, step 204 may be performed by the generation module 120.
In some embodiments, the endpoints of the n edges may all be nodes in the target graph data, i.e., the second point set V2 is a subset of the first point set V1. For example, for the target graph data in FIG. 4, the corresponding perturbation edge set may be ΔE = {(A-B), (B-I)}, so that the second point set V2 = {A, B, I} is a subset of the first point set V1 = {A, B, C, D, E, F, H, I} of the target graph data in FIG. 4.
In some embodiments, the perturbation edge set ΔE may be used to adjust the edge set E in the target graph data.
The number n of edges may be proportional to the number of edges in the edge set E, and the magnitude of the change to the target graph data can be limited by setting the magnitude of n. For example, n may be 1% to 5% of the number of edges in the edge set E.
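A minimal sketch of this generation step, assuming nodes are strings and undirected edges are frozensets; the function name, signature, and the 5% default are illustrative, not the patent's API:

```python
import random

def generate_perturbation_edges(V1, E, ratio=0.05, seed=None):
    # n is set to a small fraction (here up to 5%) of |E|, which bounds
    # the magnitude of the change to the target graph data.
    rng = random.Random(seed)
    n = max(1, int(ratio * len(E)))
    nodes = sorted(V1)
    delta_E = set()
    while len(delta_E) < n:
        u, v = rng.sample(nodes, 2)      # two distinct endpoints in V1
        delta_E.add(frozenset((u, v)))   # undirected edge
    return delta_E

# `set(range(40))` merely stands in for an edge set of size 40.
delta_E = generate_perturbation_edges(
    V1={"A", "B", "C", "I"}, E=set(range(40)), ratio=0.05, seed=0)
```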
At step 206, a correction point is arbitrarily selected from the second set of points V2.
In particular, step 206 may be performed by the first selection module 130.
In some embodiments, a correction point may be randomly selected from the second point set V2. For example, node A may be selected as the correction point from the second point set V2 = {A, B, I} generated in step 204.
In some embodiments, the probability that all points in the second set of points V2 are selected may be the same. In some embodiments, the probability distribution of all the points in the second point set V2 being selected may also be subject to a normal distribution, a binomial distribution, or a poisson distribution, which may be determined according to practical situations and is not limited in this specification.
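A uniform selection, the simplest of the distributions mentioned above, might look like the following sketch (the function name is an assumption):

```python
import random

def pick_correction_point(V2, rng=None):
    # Uniform choice: every point in V2 has the same selection probability.
    rng = rng or random
    return rng.choice(sorted(V2))

correction = pick_correction_point({"A", "B", "I"}, random.Random(1))
```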
Step 208: arbitrarily select a number of points from the first point set V1 to form a candidate target point set V3 for the correction point.
In particular, step 208 may be performed by the second selection module 140.
In some embodiments, a number of points may be randomly selected from the first point set V1. These may be some or all of the points in V1, and together they constitute a set called the candidate target point set of the correction point, denoted V3 in this specification. For example, 3 points may be arbitrarily selected from the target graph data in FIG. 4 to form a candidate target point set V3 = {C, E, F}. In some embodiments, the candidate target point set V3 may differ from the second point set V2; for example, V3 = {C, E, F} differs from V2 = {A, B, I} for the target graph data in FIG. 4. In some embodiments, the candidate target point set V3 may also be the same as the second point set V2.
In some embodiments, the probability that all points in the first set of points V1 are selected may be the same. In some embodiments, the probability distribution that all the points in the first point set V1 are selected may also be subject to a normal distribution, a binomial distribution, or a poisson distribution, which may be determined according to practical situations and is not limited in this specification.
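Uniform sampling of the candidate target point set can be sketched as follows (names are illustrative assumptions):

```python
import random

def pick_candidate_targets(V1, k, rng=None):
    # Uniformly sample k distinct points of V1 as the candidate target
    # point set V3 of the correction point.
    rng = rng or random
    return set(rng.sample(sorted(V1), k))

V1 = {"A", "B", "C", "D", "E", "F", "H", "I"}
V3 = pick_candidate_targets(V1, 3, random.Random(0))
```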
Step 210: select a target point from the candidate target point set V3 and replace the correction point with the selected target point, so as to update the perturbation edge set ΔE and obtain a perturbation edge set ΔE'.
In particular, step 210 may be performed by replacement module 150.
In some embodiments, target points may be selected sequentially from the candidate target point set V3. For example, nodes may be selected as target points in the order C, E, F from the candidate target point set V3 = {C, E, F} corresponding to the target graph data in FIG. 4. In some embodiments, the target point may instead be a randomly chosen, not-yet-selected node from V3. For example, the order in which the nodes of V3 = {C, E, F} are selected may be any one of: C, E, F; C, F, E; E, C, F; E, F, C; F, C, E; or F, E, C. In some embodiments, all nodes in the candidate target point set V3 may have the same probability of being selected.
In some embodiments, the target point is used to replace the correction point determined in step 206 and thereby update the perturbation edge set ΔE. For example, target point C may be selected from the candidate target point set V3 = {C, E, F} corresponding to the target graph data in FIG. 4 to replace the correction point A, updating the perturbation edge set ΔE = {(A-B), (B-I)} to yield the updated perturbation edge set ΔE' = {(C-B), (B-I)}.
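The replacement that turns ΔE into ΔE' can be sketched as follows, under the assumed frozenset encoding of undirected edges:

```python
def replace_point(delta_E, correction, target):
    # Substitute `target` for every occurrence of `correction` among the
    # endpoints of the perturbation edges, yielding the updated set.
    return {
        frozenset(target if v == correction else v for v in edge)
        for edge in delta_E
    }

# FIG. 4 example: replacing correction point A with target point C turns
# {(A-B), (B-I)} into {(C-B), (B-I)}.
delta_E = {frozenset(("A", "B")), frozenset(("B", "I"))}
delta_E_new = replace_point(delta_E, "A", "C")
```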
Step 212: adjust the edge set E based on the perturbation edge set ΔE' to obtain adjusted target graph data, and accept the replacement when the absolute value of the difference between the output produced by the graph model for the adjusted target graph data and the corresponding true result increases.
In particular, step 212 may be performed by adjustment module 160.
In some embodiments, the edge set E may be adjusted based on the perturbation edge set ΔE'. Specifically, each edge in ΔE' is compared with the edges in the edge set E: if the edge exists in E, it is deleted from E; if it does not exist in E, it is added to E. For example, suppose ΔE' = {(A-B), (B-I)} and E = {(A-B), (B-C), (C-D), (C-E), (B-F), (F-I), (F-H)}. Since the edge (A-B) in ΔE' exists in E, it is deleted; since the edge (B-I) in ΔE' does not exist in E, it is added. This yields the adjusted target graph data shown in FIG. 5, whose edge set is E' = {(B-C), (B-I), (C-D), (C-E), (B-F), (F-I), (F-H)}. In this way, the target graph data G(V1, E) is adjusted to G(V1, E').
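The delete-if-present / add-if-absent rule is exactly a symmetric difference of edge sets; a minimal sketch under the assumed frozenset encoding:

```python
def adjust_edges(E, delta_E):
    # For each edge of the perturbation set: delete it from E if it is
    # already there, otherwise add it. As sets, this is E ^ delta_E.
    return set(E) ^ set(delta_E)

E = {frozenset(edge) for edge in [
    ("A", "B"), ("B", "C"), ("C", "D"), ("C", "E"),
    ("B", "F"), ("F", "I"), ("F", "H"),
]}
delta_E = {frozenset(("A", "B")), frozenset(("B", "I"))}
E_adjusted = adjust_edges(E, delta_E)   # (A-B) removed, (B-I) added
```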
In some embodiments, when the loss function value of the adjusted target graph data increases, it may be determined that the absolute value of the difference between the output result and the corresponding true result has increased. The true result is the correct result that should be obtained when the adjusted target graph data is input into the graph model. The loss function represents the difference between the predicted value and the true value, and "the replacement" refers to replacing the edge set E before adjustment with the adjusted edge set E'. In some embodiments, the loss function value may be the cross entropy, which measures the difference between the graph model's output and the true labels. When the loss function value of the target graph data after replacement is larger than that before replacement, the difference between the graph model's outputs on the adjusted (adversarial) graph data and on the original target graph data has increased, and the probability that the graph model outputs an erroneous result on the adversarial graph data increases. The cross entropy can be written as follows:
Lf = -Σ (c = 1 to M) yc · log(pc)

where Lf is the cross entropy; G is the target graph data; vi is a node in the target graph data, e.g., node A in FIG. 4; c denotes a prediction class of the point vi (for example, for a graph model that determines whether a user account is abnormal, the prediction classes may be "abnormal account" and "normal account"); yc is an indicator variable that is 1 when the prediction class of the point vi is the same as its true class, and 0 otherwise (e.g., if the prediction class of node A is "normal account" and its true class is also "normal account", then yc is 1; if the prediction class of node A is "abnormal account" while its true class is "normal account", then yc is 0); pc is the predicted probability that the point vi belongs to the prediction class c, which may be any real number in [0, 1], e.g., 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1; and M is the number of prediction classes of the point vi (for example, M may be 1 for single-class classification, and the corresponding number of classes for multi-class classification).
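For a single node, the cross entropy above reduces to minus the log of the probability assigned to the true class; a small numeric sketch (the function name and the epsilon guard are assumptions):

```python
import math

def node_cross_entropy(p, y):
    # p: predicted class probabilities p_c; y: one-hot true class y_c.
    # L_f = -sum_c y_c * log(p_c); eps guards against log(0).
    eps = 1e-12
    return -sum(yc * math.log(pc + eps) for yc, pc in zip(y, p))

# A node predicted [normal, abnormal] = [0.9, 0.1] whose true class is
# "normal" contributes -log(0.9) to the loss.
loss = node_cross_entropy([0.9, 0.1], [1, 0])
```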
In some embodiments, whether the loss function value has increased may be determined from the difference between the loss function values of the original and the adjusted target graph data. For example, when the loss function value of the original target graph data minus that of the adjusted target graph data decreases, the loss function value of the adjusted target graph data may be considered to have increased, and the replacement may be accepted.
When the loss function value of the target graph data G(V1, E') after the edge set is adjusted is larger than the loss function value of the target graph data G(V1, E) before adjustment, the replacement may be accepted, i.e., the edge set E before adjustment is replaced with the adjusted edge set E'. When the loss function value of G(V1, E') is unchanged or smaller than that of G(V1, E), the replacement may be rejected, i.e., the edge set E before adjustment is kept unchanged.
In some embodiments, when the loss function value of the adjusted target graph data decreases or is unchanged, the replacement may be rejected, and the process returns to step 210 to select a not-yet-selected target point from the candidate target point set V3, until either a target point is selected that increases the loss function value or all points in the candidate target point set V3 have been tried.
For example, suppose the candidate target point set V3 corresponding to the target graph data in FIG. 4 is {C, E, F}. If, after the correction point A is replaced by node C, the obtained loss function value decreases or stays unchanged, the replacement is rejected and step 210 is executed to select node E for replacement. If the loss function value of the adjusted edge set E' obtained after replacing with node E increases, the edge set E is replaced with that adjusted edge set E'; if it still decreases or stays unchanged, the replacement is rejected and step 210 is executed to select node F for replacement. If the loss function value of the adjusted edge set E' obtained after replacing with node F increases, the edge set E is replaced with that adjusted edge set E'; if it still decreases or stays unchanged, the replacement is rejected, and since the candidate target point set V3 has now been fully traversed, the edge set E is kept unchanged and step 214 is executed.
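The accept/reject loop over the candidate target point set described above may be sketched as follows; the loss function and edge-set representation here are simplified stand-ins for illustration, not the actual graph model:

```python
def greedy_replace(loss_fn, edge_set, candidate_edge_sets):
    """Try adjusted edge sets E' in order; accept the first one whose loss
    exceeds the loss of the current edge set E, otherwise keep E unchanged."""
    base_loss = loss_fn(edge_set)
    for adjusted in candidate_edge_sets:
        if loss_fn(adjusted) > base_loss:  # loss increased: accept replacement
            return adjusted
    return edge_set                        # all candidates rejected: keep E

# Toy loss function: just the edge count (purely illustrative).
candidates = [{("A", "B")}, {("A", "B"), ("C", "D")}]
chosen = greedy_replace(len, {("A", "B")}, candidates)
```

The first candidate does not raise the toy loss and is rejected; the second does and is accepted, mirroring the C-then-E-then-F traversal in the example.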
And step 214, repeatedly selecting correction points from the second point set V2 and iteratively adjusting the edge set E to obtain the confrontation graph data.
In particular, step 214 may be performed by the confrontation graph data determination module 170.
In some embodiments, it may be determined whether the number of repetitions reaches a number threshold. The number of repetitions may be the number of repetitions of performing step 206. The number threshold may be a preset number threshold of repetitions. In some embodiments, the number threshold may be set manually. For example, 500 times, 1000 times, or any other suitable value.
When it is determined that the number of repetitions has not reached the number threshold, steps 206 through 212 may be repeatedly performed by repeatedly selecting a correction point from the second point set V2 and iteratively adjusting the edge set E. The correction point may be a not-yet-selected node in the second point set V2.
When it is determined that the number of repetitions reaches the number threshold, the edge set E generated in step 212 may be determined as the final edge set E, and the graph data comprising the first point set V1 and the final edge set E may be determined as the confrontation graph data. For example, the graph data shown in FIG. 6 is determined as the confrontation graph data.
In some embodiments, whether the iteration is complete may be determined in other ways. For example, when the loss function value is less than a certain threshold, the iteration may be determined to be complete. For another example, when the variation of the loss function stays within a predetermined range for several consecutive iterations, the iteration may be determined to be complete.
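The two alternative stopping rules mentioned here may be sketched as follows; the threshold, patience, and tolerance values are illustrative assumptions, not values from this specification:

```python
def iteration_complete(loss_history, loss_threshold=None, patience=3, tol=1e-4):
    """Stop when the latest loss is below a fixed threshold, or when the
    loss change stays below tol for `patience` consecutive iterations."""
    if loss_threshold is not None and loss_history and loss_history[-1] < loss_threshold:
        return True
    if len(loss_history) > patience:
        tail = loss_history[-(patience + 1):]
        return all(abs(b - a) < tol for a, b in zip(tail, tail[1:]))
    return False

# Loss has been flat for the last three steps -> iteration complete.
done = iteration_complete([2.0, 2.5, 2.5, 2.5, 2.5])
```

Either rule can substitute for the plain repetition-count threshold when convergence of the loss is a more natural termination signal.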
And step 216, adjusting the graph model based on the confrontation graph data to enhance the attack resistance of the graph model.
In particular, step 216 may be performed by the enhancement module 180.
In some embodiments, the graph model can be adjusted based on the operation result of the confrontation graph data in the graph model, so as to enhance the attack resistance of the graph model. A specific process is shown in FIG. 3.
The key to this approach is that the confrontation graph data is obtained by adding only a few perturbations to the graph data; it is simple to implement and can be conveniently applied to various graph models. Compared with existing attack algorithms, the method provided in this specification simplifies the process while ensuring the attack capability of the confrontation graph data, thereby improving the robustness and security of the model and enhancing the attack resistance of the graph model.
FIG. 3 is an exemplary flowchart of step 216 of enhancing the attack resistance of a graph model according to some embodiments of the present application, and as shown in FIG. 3, the flow 300 of step 216 includes:
Step 302, inputting the confrontation graph data into the graph model, and obtaining an output corresponding to each point in the confrontation graph data, wherein the output is used for representing the probability that the point belongs to the category corresponding to that output.
For example, the graph model may be a graph model for determining whether a user account is an abnormal account, the target graph data may be graph data suitable for that graph model, the nodes in the target graph data may represent user accounts, and the confrontation graph data may be the graph data obtained by processing the target graph data through steps 202 to 214. After the confrontation graph data is input into the graph model, the output corresponding to each node in the confrontation graph data can be obtained, i.e., the probability that the node is an abnormal account and the probability that it is not an abnormal account. For example, for a node in the confrontation graph data, the probability of being an abnormal account and the probability of not being an abnormal account may be 0.4 and 0.6, respectively.
Step 304, for the same point in the confrontation graph data, if the category corresponding to the maximum value in all the outputs of the point is different from the real category of the point, it is determined that the attack corresponding to the point is successful.
In some embodiments, the real category may be the real content reflected by a node itself in the confrontation graph data; it may be determined manually, or may be tag data stored in a graph database. For example, the real category of the node of the confrontation graph data corresponding to a zombie account is abnormal account; the real category of the node corresponding to an account frequently operated by a normal user is normal account.
In some embodiments, for the same node of the confrontation graph data, if the category corresponding to the maximum value among all the outputs differs from the node's real category, the attack is considered successful. Continuing the example in step 302: since the category corresponding to the maximum value 0.6 is normal account, the graph model judges that the node most probably belongs to a normal account, which differs from its real category of abnormal account, so the attack is considered successful.
Step 306, for a plurality of points in the confrontation graph data, counting the proportion of the successful times of the attacks corresponding to the plurality of points in the total times of the attacks, and adjusting the graph model based on the proportion to enhance the attack resistance of the graph model.
In some embodiments, a plurality of nodes of the confrontation graph data are input into the graph model to perform multiple attacks, and the number of successful attacks and the total number of attacks are counted. The ratio of the number of successful attacks to the total number of attacks can then be taken as the success rate of attacks by the confrontation graph data: the higher the ratio, the less secure the graph model. If the ratio exceeds a certain value (e.g., 20%), the graph model may be adjusted to enhance its attack resistance. For example, if 40 of 100 attacks succeed, the ratio is 40% and exceeds the preset value of 20%, so the graph model is adjusted using the confrontation graph data. For instance, the graph model may be trained with the confrontation graph data so that it correctly identifies the nodes in the confrontation graph data; those nodes then lose their ability to attack the graph model, thereby enhancing the attack resistance of the graph model.
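The success-rate statistic in this step may be sketched as follows; the toy outputs and the 20% bar are illustrative assumptions taken from the example above, not the actual implementation:

```python
def attack_success_rate(all_outputs, true_classes):
    """Share of attacked nodes whose argmax category differs from the
    true category, i.e. nodes on which the attack succeeded."""
    successes = sum(
        1 for probs, truth in zip(all_outputs, true_classes)
        if probs.index(max(probs)) != truth
    )
    return successes / len(true_classes)

# 100 attacked nodes, all with true category 0 ("abnormal account"):
# 40 are now misclassified as category 1, 60 are still classified correctly.
outputs = [[0.4, 0.6]] * 40 + [[0.9, 0.1]] * 60
rate = attack_success_rate(outputs, [0] * 100)
retrain = rate > 0.20  # illustrative 20% bar from the text
```

Here the rate is 0.4, exceeding the illustrative 20% bar, so the model would be retrained on the confrontation graph data.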
In some embodiments, the graph model may be a model for classifying graph data; for example, a Graph Neural Network (GNN), Graph Convolutional Network (GCN), Graph Autoencoder (GAE), Autoencoder (AE), Sparse Autoencoder (SAE), Variational Autoencoder (VAE), Graph Recurrent Neural Network (GRNN), Graph Reinforcement Learning (GRL), DeepWalk, and the like may be used as models for classifying graph data. In some embodiments, the nodes in the graph data are used to characterize entity objects, such as individuals, businesses, places, things, categories, or other data. For example, if the nodes of the target graph data are accounts of an application program and the graph model can determine whether an account of the application program is an abnormal account, abnormal accounts in the application program can be identified by the graph model. However, when some information in an account is modified, i.e., when some perturbations are added to the graph data (e.g., some edges are modified), the graph model may fail to identify the abnormal account. Therefore, the confrontation graph data of the target graph data needs to be acquired and the graph model tested, so that the graph model can be adjusted according to the test result, enhancing the attack resistance and robustness of the graph model.
The beneficial effects that may be brought by the embodiments of the present application include, but are not limited to: (1) the confrontation graph data is obtained by increasing the loss function value, with few limiting conditions, an easy-to-implement process, and reduced running time, and the method can be applied to various graph models; (2) adding a number-of-repetitions threshold in the process of obtaining the confrontation graph data better matches the actual situation, and ensures the attack capability of the confrontation graph data while simplifying the process; (3) by adding only a few perturbations to the graph data to obtain output results beyond the expectation of the graph model, potential threats to the model can be predicted in advance and model vulnerabilities can be found, improving the robustness and security of the model and enhancing the attack resistance of the graph model. It is to be noted that different embodiments may produce different advantages; in different embodiments, any one or a combination of the above advantages may be produced, or any other advantages may be obtained.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely illustrative and not restrictive of the broad application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereon. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as "data block," module, "" engine, "" unit, "" component, "or" system. Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, or Python, a conventional procedural programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, or ABAP, a dynamic programming language such as Python, Ruby, or Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to require more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having less than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
The entire contents of each patent, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, and documents, are hereby incorporated by reference into this application, except for application history documents that are inconsistent with or conflict with the contents of this application, and documents that limit the broadest scope of the claims of this application (whether currently appended or later appended). It is noted that if the descriptions, definitions, and/or use of terms in the materials attached to this application are inconsistent with or conflict with those stated in this application, the descriptions, definitions, and/or use of terms in this application shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present application. Other variations are also possible within the scope of the present application. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the present application can be viewed as being consistent with the teachings of the present application. Accordingly, the embodiments of the present application are not limited to only those embodiments explicitly described and depicted herein.

Claims (15)

1. A method of enhancing a graph model's resistance to attacks, wherein the method comprises:
acquiring target graph data, wherein the target graph data comprises a first point set V1 and an edge set E;
randomly generating a perturbed edge set Δ E of n edges, wherein nodes of the n edges form a second point set V2, and the nodes of the n edges all exist in the first point set V1;
arbitrarily selecting a correction point from the second point set V2;
randomly selecting a plurality of points from the first point set V1 to form a candidate target point set V3 of the correction points;
selecting a target point from the candidate target point set V3, and replacing the correction point with the selected target point to update the disturbance edge set delta E to obtain a disturbance edge set delta E';
adjusting the edge set E based on the disturbance edge set delta E' to obtain adjusted target graph data, and accepting the replacement when the absolute value of the difference between the output result of the adjusted target graph data input into the graph model and the real result corresponding to the output result is increased;
repeatedly selecting correction points from the second point set V2 and iteratively adjusting the edge set E to obtain confrontation graph data;
based on the confrontation graph data, the graph model is adjusted to enhance the attack resistance of the graph model.
2. The method of claim 1, wherein n is 1% to 5% of the number of edges in the edge set E.
3. The method of claim 1, wherein said repeating of selecting correction points from said second set of points V2 and iteratively adjusting said set of edges E, resulting in confrontation map data, comprises:
and repeatedly executing the steps from arbitrarily selecting one correction point from the second point set V2 through adjusting the edge set E based on the disturbance edge set delta E', and when the number of repetitions of the steps is greater than or equal to a number threshold, taking the currently obtained target graph data as the confrontation graph data.
4. The method of claim 1, wherein said adjusting said set of edges E based on said perturbed set of edges Δ E' comprises:
for each edge in the disturbance edge set delta E, if the edge exists in the edge set E, deleting the edge from the edge set E; and if the edge does not exist in the edge set E, adding the edge in the edge set E.
5. The method according to claim 1, wherein when the loss function value of the adjusted target graph data increases, it is determined that the absolute value of the difference between the output result and the true result corresponding to the output result increases, and the loss function value is the following cross entropy:

$L_f(G) = -\sum_{v_i \in G} \sum_{c=1}^{M} y_c \log(p_c)$

wherein L_f is the cross entropy, G is the target graph data, v_i is a point in the target graph data, c is used to characterize the prediction category of the point v_i, y_c is an indicator variable that is 1 when the prediction category of the point v_i is the same as the true category and 0 otherwise, p_c is the prediction probability that the point v_i belongs to the prediction category c, and M is the number of prediction categories c of the point v_i.
6. The method of claim 1, wherein said adapting said graph model based on said confrontation graph data to enhance said graph model's resistance to attacks comprises:
inputting the confrontation graph data into the graph model to obtain an output corresponding to each point in the confrontation graph data, wherein the output is used for representing the probability that the point belongs to the class corresponding to the output;
aiming at the same point in the countermeasure map data, if the category corresponding to the maximum value in all the outputs of the point is different from the real category of the point, judging that the attack corresponding to the point is successful;
and aiming at a plurality of points in the confrontation graph data, counting the proportion of the successful times of the attacks corresponding to the points in the total times of the attacks, and adjusting the graph model based on the proportion so as to enhance the capability of resisting the attacks of the graph model.
7. The method of claim 1, wherein the graph model is a model for classifying points in graph data, nodes in the graph data being used to characterize entity objects.
8. A system for enhancing the attack resistance of a graph model, wherein the system comprises:
the acquisition module is used for acquiring target graph data, and the target graph data comprises a first point set V1 and an edge set E;
a generating module, configured to generate a perturbed edge set Δ E of n edges arbitrarily, where nodes of the n edges constitute a second point set V2, and the nodes of the n edges all exist in the first point set V1;
a first selection module, configured to arbitrarily select a correction point from the second point set V2;
a second selecting module, configured to arbitrarily select, from the first point set V1, a plurality of points to form a candidate target point set V3 of the correction points;
a replacing module, configured to select a target point from the candidate target point set V3, and replace the correction point with the selected target point, so as to update the perturbed edge set Δ E to obtain a perturbed edge set Δ E';
an adjusting module, configured to adjust the edge set E based on the disturbance edge set Δ E' to obtain adjusted target map data, and accept the replacement when an absolute value of a difference between an output result of the adjusted target map data input to the map model and a real result corresponding to the output result increases;
a confrontation graph data determination module for obtaining confrontation graph data when the first selection module repeatedly selects correction points from the second point set V2 so that the adjustment module iteratively adjusts the edge set E;
and the enhancement module is used for adjusting the graph model based on the confrontation graph data so as to enhance the attack resistance of the graph model.
9. The system of claim 8, wherein n is 1% to 5% of the number of edges in the edge set E.
10. The system of claim 8, wherein the confrontation graph data determination module is to:
when the first selection module repeatedly selects any correction point from the second point set V2 so that the adjustment module repeatedly adjusts the edge set E based on the disturbance edge set Δ E', it is determined whether the number of repetitions is greater than or equal to a number threshold, and when the number of repetitions is greater than or equal to the number threshold, the currently obtained target map data is used as the confrontation map data.
11. The system of claim 8, wherein the adjustment module is to:
for each edge in the disturbance edge set delta E, if the edge exists in the edge set E, deleting the edge from the edge set E; and if the edge does not exist in the edge set E, adding the edge in the edge set E.
12. The system of claim 8, wherein when the loss function value of the adjusted target graph data increases, it is determined that the absolute value of the difference between the output result and the true result corresponding to the output result increases, and the loss function value is the following cross entropy:

$L_f(G) = -\sum_{v_i \in G} \sum_{c=1}^{M} y_c \log(p_c)$

wherein L_f is the cross entropy, G is the target graph data, v_i is a point in the target graph data, c is used to characterize the prediction category of the point v_i, y_c is an indicator variable that is 1 when the prediction category of the point v_i is the same as the true category and 0 otherwise, p_c is the prediction probability that the point v_i belongs to the prediction category c, and M is the number of prediction categories c of the point v_i.
13. The system of claim 8, wherein the augmentation module is to:
inputting the confrontation graph data into the graph model to obtain an output corresponding to each point in the confrontation graph data, wherein the output is used for representing the probability that the point belongs to the class corresponding to the output;
aiming at the same point in the countermeasure map data, if the category corresponding to the maximum value in all the outputs of the point is different from the real category of the point, judging that the attack corresponding to the point is successful;
and aiming at a plurality of points in the confrontation graph data, counting the proportion of the successful times of the attacks corresponding to the points in the total times of the attacks, and adjusting the graph model based on the proportion so as to enhance the capability of resisting the attacks of the graph model.
14. The system of claim 8, wherein the graph model is a model for classifying points in graph data, nodes in the graph data being used to characterize entity objects.
15. An apparatus for enhancing attack resistance of a graph model, comprising a processor, wherein the processor is configured to execute the method for enhancing attack resistance of the graph model according to any one of claims 1 to 7.
CN202010105695.5A 2020-02-20 2020-02-20 Method and system for enhancing attack resistance of graph model Pending CN111309975A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010105695.5A CN111309975A (en) 2020-02-20 2020-02-20 Method and system for enhancing attack resistance of graph model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010105695.5A CN111309975A (en) 2020-02-20 2020-02-20 Method and system for enhancing attack resistance of graph model

Publications (1)

Publication Number Publication Date
CN111309975A true CN111309975A (en) 2020-06-19

Family

ID=71158558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010105695.5A Pending CN111309975A (en) 2020-02-20 2020-02-20 Method and system for enhancing attack resistance of graph model

Country Status (1)

Country Link
CN (1) CN111309975A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112637178A (en) * 2020-12-18 2021-04-09 成都知道创宇信息技术有限公司 Attack similarity calculation method and device, electronic equipment and readable storage medium
CN112637178B (en) * 2020-12-18 2022-09-20 成都知道创宇信息技术有限公司 Attack similarity calculation method and device, electronic equipment and readable storage medium
WO2022141625A1 (en) * 2021-01-04 2022-07-07 Robert Bosch Gmbh Method and apparatus for generating training data for graph neural network
CN112966165A (en) * 2021-02-03 2021-06-15 北京大学 Interactive community searching method and device based on graph neural network
CN113378899A (en) * 2021-05-28 2021-09-10 百果园技术(新加坡)有限公司 Abnormal account identification method, device, equipment and storage medium
CN113378899B (en) * 2021-05-28 2024-05-28 百果园技术(新加坡)有限公司 Abnormal account identification method, device, equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40031261

Country of ref document: HK