CN115512133A - Exception detection method and system for import-export behavior dynamic graph data - Google Patents

Info

Publication number
CN115512133A
CN115512133A CN202211251195.8A CN202211251195A CN115512133A CN 115512133 A CN115512133 A CN 115512133A CN 202211251195 A CN202211251195 A CN 202211251195A CN 115512133 A CN115512133 A CN 115512133A
Authority
CN
China
Prior art keywords
graph
import
sequence
dynamic graph
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211251195.8A
Other languages
Chinese (zh)
Inventor
包先雨
蔡伊娜
高祖康
李俊杰
程烨
蒋涛
黄智强
黄哲学
郑文丽
程立勋
方凯彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Qianhai Quantum Cloud Code Technology Co ltd
Shenzhen University
Shenzhen Academy of Inspection and Quarantine
Original Assignee
Shenzhen Qianhai Quantum Cloud Code Technology Co ltd
Shenzhen University
Shenzhen Academy of Inspection and Quarantine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Qianhai Quantum Cloud Code Technology Co ltd, Shenzhen University, and Shenzhen Academy of Inspection and Quarantine
Priority to CN202211251195.8A
Publication of CN115512133A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The invention relates to the technical field of import and export behavior dynamic graph detection and discloses a method for detecting data anomalies in an import and export behavior dynamic graph, comprising the following steps: S1, defining the import and export behavior dynamic graph; S2, extracting features of the import and export behavior dynamic graph; S3, detecting anomalies in the import and export behavior dynamic graph: in an anomaly detection module, the node representations from S2 are used to detect abnormal edges in the import and export behavior dynamic graph. For time-series feature extraction on the dynamic graph, the method does not use a sliding-window recurrent neural network structure; instead it uses a graph embedding method based on long-term and short-term time-series attention. A block-wise attention structure efficiently extracts the short-term time-series attention of the nodes in the snapshots, and each time-series block extracts and passes on the long-term time-series features of the nodes through a long-term memory state vector. This ensures that the model extracts complete node time-series features and thereby improves anomaly detection performance.

Description

Import-export behavior dynamic graph data anomaly detection method and system
Technical Field
The invention relates to the technical field of import and export behavior dynamic graph detection, in particular to an import and export behavior dynamic graph data anomaly detection method and system.
Background
Customs clearance systems hold a large amount of key data describing import and export behavior, and this data must be supervised. Monitoring abnormal, sudden changes in import and export behavior poses several challenges for customs. First, anomaly detection must consider the relations between commodities and ports and between commodities themselves. The relation between a commodity and a port is an important feature for detecting anomalies, and because identical or similar commodities tend to have similar relations with ports, the relations between commodities also help detect anomalies. With deepening economic globalization, customs has accumulated a great variety of commodities, and the relations between commodities and ports and between commodities grow complex rapidly as the number of commodity types increases, which makes anomaly detection harder. Second, anomaly detection must consider how the relation between a commodity and a port changes. Such change is an important basis for detection: for example, if a commodity that has mainly been imported and exported through one fixed port switches to another fixed port, an anomaly is likely. However, as the number of commodity categories grows, these relations become complex, so accounting for how they change over time is also a challenge.
Ports, commodities, the relations between them, and how these relations change can be represented as a dynamic graph. A dynamic graph represents objects and the relations between objects, so it can effectively represent the relation between commodities and ports; because it changes continuously, it can also show how that relation changes over time.
A dynamic graph is a graph data structure with time-series characteristics. Anomaly detection on dynamic graphs is one of the most challenging tasks in anomaly detection; its goal is to find anomalous objects, points, and edges in dynamic graph data. Detection of anomalous points and edges is especially widely applied, for example to detect accidents in a traffic network or network anomalies and network attacks in a computer network.
Existing dynamic graph anomaly detection methods fall mainly into two types: anomaly detection algorithms based on heuristic rules, and anomaly detection algorithms based on graph representation learning.
When extracting time-series features, most existing methods based on graph representation learning use a sliding-window recurrent neural network structure to extract the time-series features of the nodes of the snapshots in a window. The extracted features cover only the short-term time-series features of nodes within the window and ignore node states in past snapshots, so the long-term time-series features of the nodes are lost. This limits the effectiveness of time-series feature extraction to some extent and leads to incomplete anomaly detection. Therefore, those skilled in the art provide a method and a system for detecting data anomalies in an import-export behavior dynamic graph to solve the problems mentioned in the background art.
Disclosure of Invention
The invention aims to provide a method and a system for detecting data abnormality of an import/export behavior dynamic graph, so as to solve the problems in the background technology.
In order to achieve the purpose, the invention provides the following technical scheme:
A method for detecting data anomalies in an import-export behavior dynamic graph comprises the following steps:
S1, defining the import and export behavior dynamic graph: a graph sequence is constructed to define the import-export behavior dynamic graph;
S2, extracting features of the import and export behavior dynamic graph: a structural and contextual feature extraction module and a dynamic time-series feature extraction module extract the structural features and time features of the import and export behavior dynamic graph into the node representations of the respective modules;
S3, detecting anomalies in the import and export behavior dynamic graph: the anomaly detection module uses the node representations from S2 to detect abnormal edges in the import and export behavior dynamic graph.
As a still further scheme of the invention: in S1, the import and export behavior dynamic graph is defined as a graph sequence $G = \{G^t\}$, where $G^t$ is the graph at timestamp $t$. Each graph is $G^t = (V^t, E^t)$, where $V^t$ and $E^t$ are the point set and edge set of $G^t$. An edge $(v_i, v_j, w) \in E^t$ connects nodes $v_i$ and $v_j$ with weight $w$. $V$ and $E$ denote the total point set and total edge set of the import-export behavior dynamic graph $G$; then $n = |V|$, and $A^t \in \mathbb{R}^{n \times n}$ is the adjacency matrix of each graph.
As a still further scheme of the invention: the structural and contextual feature extraction module in S2 applies a multi-layer graph convolutional neural network to each snapshot to extract the structural features of the import and export behavior dynamic graph and to aggregate content features among nodes.
As a still further scheme of the invention: the dynamic time-series feature extraction module in S2 introduces a multi-head attention mechanism for the import and export behavior dynamic graph and extends it into a block-recurrent structure, so as to extract the two main kinds of time features of the dynamic graph: long-term and short-term time-series features.
An import and export behavior dynamic graph data anomaly detection system comprises a structure and context feature extraction module, a dynamic time sequence feature extraction module and an anomaly detection module.
As a still further scheme of the invention: the structure and context feature extraction module extracts a structure feature of the snapshot in the dynamic graph by using a multilayer graph convolution neural network and maps the nodes in the dynamic graph into a high-dimensional space vector;
the dynamic time sequence feature extraction module divides a dynamic graph sequence into a plurality of time sequence blocks according to the size of a fixed window, a multi-head attention mechanism is introduced into each time sequence block to better extract time sequence features, vector expression of the time sequence features is updated, and each time sequence block transmits long-term time sequence information of each node to the next block through a long-term memory state vector, so that the long-term time sequence features are stored and extracted, and abnormal edge detection performance is improved;
the anomaly detection module uses the node vectors in the dynamic graph to construct vector expressions of edges, each edge vector is put into a nonlinear activation function to carry out anomaly scoring, and abnormal edge data with the anomaly score larger than a threshold value is found out.
As a still further scheme of the invention: the structural and contextual feature extraction module performs one multi-layer graph convolution operation on each graph in the graph sequence to extract the structural features of each graph and obtain its vertex vectors, specifically described as:
[Formula (1) appears as an image in the original publication.]
In formula (1), GCN is a multi-layer graph convolutional neural network applied to snapshots, $L$ is the number of GCN layers, $H^t$ holds the vector representation of each vertex of the graph at timestamp $t$, and $d_h$ is the vector dimension. The GCN is computed as follows:
$Z^{(0)} = X^t$ (2)
[Formulas (3) and (4) appear as images in the original publication.]
In formulas (3) and (4), $\sigma(\cdot)$ denotes an activation function such as ReLU. [Two further symbol definitions appear as images in the original publication.]
The final output of this module is the vertex vector matrix $H^t$ of each graph.
As a still further scheme of the invention: the dynamic time-series feature extraction module uses a dynamic graph multi-head attention network to extract long-term and short-term time features into the representation of each node in parallel.
The graph sequence $G$ is divided into blocks by the window size $k$. After the $i$-th block passes through the structural and contextual feature extraction module, the vertex vector sequence $\{H^{t-k+1}, \dots, H^{t}\}$ of the block is obtained.
Before this sequence is fed into the dynamic graph multi-head attention network, a memory vector $M$ is added: the vector sequence of the current block contains only local, short-term time-series features, and the earlier time-series features must also be retained. The input sequence for the block therefore combines the memory vector $M$ with the block's vector sequence. [The exact input sequence and the symbol for the vector sequence of point $v$ in the block appear as images in the original publication.] Because the multi-head attention mechanism is symmetric, it carries no information about the order of the sequence, so the sequence is position-encoded once with PE(·) to obtain the input of the multi-head attention. The input process is as follows:
[Formulas (5) and (6) appear as images in the original publication.]
Formulas (5) and (6) define the encoded information of the $i$-th position and the input sequence of vertex $v$ in the block.
After the input sequence is prepared, the multi-head attention mechanism extracts the time-series information as follows:
[Formulas (7) to (10) appear as images in the original publication.]
Formulas (7) to (10) produce the output vector sequence of vertex $v$ in the block and introduce three learnable parameters. Because the computation is expressed as matrix operations, it can be carried out for all points in parallel.
As a still further scheme of the invention: after processing by the structural and contextual feature extraction module and the dynamic time-series feature extraction module, a vector representation of each point in the graph at each timestamp is obtained. To detect abnormal edges in each graph, a scoring function is defined to evaluate the degree of abnormality of each edge:
[Formula (11) appears as an image in the original publication.]
In formula (11), $W_a$ and $b$ are learnable parameters, and the two vectors in the formula are the vector expressions of the two vertices of edge $e$. The value range of $f(e)$ is {0,1}. To obtain the optimal parameters, the loss function used during training is defined as:
[Formula (12) appears as an image in the original publication.]
In formula (12), $y_e$ is the label of edge $e$, the norm term is an L2 norm, and $\lambda$ is an adjustable hyperparameter.
After the model is trained, abnormal edges in the dynamic graph can be detected according to formula (11).
Compared with the prior art, the invention has the beneficial effects that:
The invention applies a dynamic graph abnormal edge detection algorithm based on long-term and short-term time-series attention. For time-series feature extraction on the dynamic graph, it does not use a sliding-window recurrent neural network structure; instead it uses a graph embedding method based on long-term and short-term time-series attention. A block-wise attention structure efficiently extracts the short-term time-series attention of the nodes in the snapshots, and each time-series block extracts and passes on the long-term time-series features of the nodes through a long-term memory state vector. This ensures that the model extracts complete node time-series features and thereby improves anomaly detection performance.
Drawings
FIG. 1 is a block diagram of an algorithm framework in a method and system for detecting data anomalies in a dynamic graph of import-export behavior;
FIG. 2 is a diagram of AUC of snapshots of a data set after injection of 1% abnormal data in a method and system for detecting data abnormality in a dynamic graph of import and export behaviors;
fig. 3 is a flowchart of an algorithm in a method and system for detecting data anomaly of an import-export behavior dynamic graph.
Detailed Description
Referring to FIG. 1 to FIG. 3, a method for detecting data anomalies in an import-export behavior dynamic graph includes the following steps:
S1, defining the import and export behavior dynamic graph: a graph sequence is constructed to define the import-export behavior dynamic graph;
S2, extracting features of the import and export behavior dynamic graph: a structural and contextual feature extraction module and a dynamic time-series feature extraction module extract the structural features and time features of the import and export behavior dynamic graph into the node representations of the respective modules;
S3, detecting anomalies in the import and export behavior dynamic graph: the anomaly detection module uses the node representations from S2 to detect abnormal edges in the import and export behavior dynamic graph.
Preferably, in S1 the import-export behavior dynamic graph is defined as a graph sequence $G = \{G^t\}$, where $G^t$ is the graph at timestamp $t$. Each graph is $G^t = (V^t, E^t)$, where $V^t$ and $E^t$ are the point set and edge set of $G^t$. An edge $(v_i, v_j, w) \in E^t$ connects nodes $v_i$ and $v_j$ with weight $w$. $V$ and $E$ denote the total point set and total edge set of the import-export behavior dynamic graph $G$; then $n = |V|$, and $A^t \in \mathbb{R}^{n \times n}$ is the adjacency matrix of each graph.
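The graph-sequence definition can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the `Edge` record, the function name `build_snapshots`, the commodity and port names, and the choice to treat edges as undirected are all hypothetical and do not come from the patent.

```python
from collections import namedtuple

# Hypothetical record type: one weighted edge observed at timestamp t.
Edge = namedtuple("Edge", ["t", "src", "dst", "w"])

def build_snapshots(edges):
    """Group edges by timestamp; return (nodes, {t: n x n adjacency matrix})."""
    # Total point set V over all snapshots, in a fixed order, so n = |V|.
    nodes = sorted({e.src for e in edges} | {e.dst for e in edges})
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    adjacency = {}
    for e in edges:
        # One n x n matrix per timestamp, created on first use.
        a_t = adjacency.setdefault(e.t, [[0.0] * n for _ in range(n)])
        i, j = index[e.src], index[e.dst]
        a_t[i][j] += e.w
        if i != j:
            # Assumption: edges are undirected, so the matrix is symmetric.
            a_t[j][i] += e.w
    return nodes, adjacency

# Illustrative data only: commodities linked to ports at two timestamps.
edges = [
    Edge(1, "commodity_A", "port_X", 3.0),
    Edge(1, "commodity_B", "port_X", 1.0),
    Edge(2, "commodity_A", "port_Y", 2.0),
]
nodes, adjacency = build_snapshots(edges)
```

Every snapshot shares the same node ordering, which is what lets each adjacency matrix have the common size n × n stated in the definition.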
Preferably, the structural and contextual feature extraction module in S2 applies a multi-layer graph convolutional neural network to each snapshot to extract the structural features of the import-export behavior dynamic graph and to aggregate content features between nodes.
Preferably, the dynamic time-series feature extraction module in S2 introduces a multi-head attention mechanism for the import-export behavior dynamic graph and extends it into a block-recurrent structure, so as to extract the two main kinds of time features of the dynamic graph, namely long-term and short-term time-series features.
An import-export behavior dynamic graph data anomaly detection system comprises a structure and context feature extraction module, a dynamic time sequence feature extraction module and an anomaly detection module.
Preferably, the structure and context feature extraction module extracts a structure feature of the snapshot in the dynamic graph by using a multilayer graph convolution neural network, and maps the nodes in the dynamic graph into a high-dimensional space vector;
the dynamic time sequence feature extraction module divides a dynamic graph sequence into a plurality of time sequence blocks according to the size of a fixed window, a multi-head attention mechanism is introduced into each time sequence block to better extract time sequence features, vector expression of the time sequence features is updated, and each time sequence block transmits long-term time sequence information of each node to the next block through a long-term memory state vector, so that the long-term time sequence features are stored and extracted, and the abnormal edge detection performance is improved;
and the anomaly detection module uses the node vectors in the dynamic graph to construct vector expression of edges, each edge vector is put into a nonlinear activation function to carry out anomaly scoring, and the abnormal edge data with the anomaly score larger than a threshold value is found out.
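The block-wise processing with a memory state handed from one block to the next can be sketched as follows. This is a minimal sketch, assuming non-overlapping blocks of at most k snapshots; the function names and the toy `toy_process` step (which stands in for the attention computation) are hypothetical.

```python
def split_into_blocks(snapshots, k):
    """Partition a snapshot sequence into consecutive blocks of at most k items."""
    if k <= 0:
        raise ValueError("window size k must be positive")
    return [snapshots[i:i + k] for i in range(0, len(snapshots), k)]

def run_blocks(snapshots, k, process_block, initial_memory):
    """Process blocks in order; each block receives the memory left by the previous one."""
    memory = initial_memory
    outputs = []
    for block in split_into_blocks(snapshots, k):
        block_output, memory = process_block(block, memory)
        outputs.append(block_output)
    return outputs, memory

def toy_process(block, memory):
    """Toy stand-in for the attention step: the memory is a running total."""
    total = memory + sum(block)
    return total, total

outputs, final_memory = run_blocks(
    [1, 2, 3, 4, 5], k=2, process_block=toy_process, initial_memory=0
)
```

The point of the structure is that the last block still sees information from the first one through `memory`, even though each block only holds k snapshots.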
Preferably, the structural and contextual feature extraction module performs one multi-layer graph convolution operation on each graph in the graph sequence to extract the structural features of each graph and obtain its vertex vectors, specifically described as:
[Formula (1) appears as an image in the original publication.]
In formula (1), GCN is a multi-layer graph convolutional neural network applied to snapshots, $L$ is the number of GCN layers, $H^t$ holds the vector representation of each vertex of the graph at timestamp $t$, and $d_h$ is the vector dimension. The GCN is computed as follows:
$Z^{(0)} = X^t$ (2)
[Formulas (3) and (4) appear as images in the original publication.]
In formulas (3) and (4), $\sigma(\cdot)$ denotes an activation function such as ReLU. [Two further symbol definitions appear as images in the original publication.]
The final output of this module is the vertex vector matrix $H^t$ of each graph.
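A multi-layer graph convolution over one snapshot can be sketched in plain Python. Because formulas (3) and (4) are not available as text, the propagation rule here is an assumption: it uses the widely used symmetric normalization D^(-1/2)(A + I)D^(-1/2) with a ReLU after every layer, which may differ from the patent's exact formulas. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 (assumed propagation matrix)."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # positive for non-negative weights
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def relu(m):
    return [[max(0.0, x) for x in row] for row in m]

def gcn(adj, features, weights):
    """Apply len(weights) graph-convolution layers, starting from Z(0) = X."""
    a_hat = normalize_adjacency(adj)
    z = features
    for w in weights:
        z = relu(matmul(matmul(a_hat, z), w))
    return z

# Two connected nodes, one-hot features, a single layer with identity weights.
adj = [[0.0, 1.0], [1.0, 0.0]]
x = [[1.0, 0.0], [0.0, 1.0]]
weights = [[[1.0, 0.0], [0.0, 1.0]]]
h = gcn(adj, x, weights)
```

With identity weights the layer simply averages each node's features with its neighbour's, which shows how a convolution layer aggregates content features among nodes.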
Preferably, the dynamic time-series feature extraction module uses a dynamic graph multi-head attention network to extract long-term and short-term time features into the representation of each node in parallel.
The graph sequence $G$ is divided into blocks by the window size $k$. After the $i$-th block passes through the structural and contextual feature extraction module, the vertex vector sequence $\{H^{t-k+1}, \dots, H^{t}\}$ of the block is obtained.
Before this sequence is fed into the dynamic graph multi-head attention network, a memory vector $M$ is added: the vector sequence of the current block contains only local, short-term time-series features, and the earlier time-series features must also be retained. The input sequence for the block therefore combines the memory vector $M$ with the block's vector sequence. [The exact input sequence and the symbol for the vector sequence of point $v$ in the block appear as images in the original publication.] Because the multi-head attention mechanism is symmetric, it carries no information about the order of the sequence, so the sequence is position-encoded once with PE(·) to obtain the input of the multi-head attention. The input process is as follows:
[Formulas (5) and (6) appear as images in the original publication.]
Formulas (5) and (6) define the encoded information of the $i$-th position and the input sequence of vertex $v$ in the block.
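The preparation of the attention input for one vertex can be sketched as follows. Two assumptions are made because formulas (5) and (6) are not available as text: the memory vector is placed at the front of the block's vector sequence, and PE(·) is the sinusoidal encoding commonly used with Transformer models, added element-wise. The function names are hypothetical.

```python
import math

def positional_encoding(position, dim):
    """Sinusoidal encoding (assumed): sin on even indices, cos on odd indices."""
    return [
        math.sin(position / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]

def build_attention_input(memory, block_vectors):
    """Prepend the memory vector, then add a positional code to every element."""
    sequence = [memory] + block_vectors
    dim = len(memory)
    return [
        [x + p for x, p in zip(vec, positional_encoding(pos, dim))]
        for pos, vec in enumerate(sequence)
    ]

# One vertex: a zero memory vector followed by its vectors in a block of k = 2.
memory = [0.0, 0.0, 0.0, 0.0]
block = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
inputs = build_attention_input(memory, block)
```

The resulting sequence has k + 1 entries, one for the memory vector and one for each snapshot in the block, and identical vectors at different positions no longer look the same to the attention step.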
After the input sequence is prepared, the multi-head attention mechanism extracts the time-series information as follows:
[Formulas (7) to (10) appear as images in the original publication.]
Formulas (7) to (10) produce the output vector sequence of vertex $v$ in the block and introduce three learnable parameters. Because the computation is expressed as matrix operations, it can be carried out for all points in parallel.
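For readers unfamiliar with multi-head attention, the following sketch shows the standard scaled dot-product form over one vertex's input sequence. It is an assumption that formulas (7) to (10) follow this standard form, and that the three learnable parameters correspond to query, key, and value projections; the final output projection is omitted, and all names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(seq, w_q, w_k, w_v):
    """Scaled dot-product attention for one head over a sequence of vectors."""
    q, k, v = matmul(seq, w_q), matmul(seq, w_k), matmul(seq, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)

def multi_head_attention(seq, heads):
    """Run every head and concatenate the results position by position."""
    outputs = [attention_head(seq, *head) for head in heads]
    return [sum((out[i] for out in outputs), []) for i in range(len(seq))]

identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# With zero query weights every score is 0, so each position attends uniformly.
out = multi_head_attention(seq, [(zeros, identity, identity)])
```

In the toy call each output row is the plain average of the value vectors, which makes the behaviour easy to verify by hand; trained projections would instead weight some time steps more than others.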
Preferably, after processing by the structural and contextual feature extraction module and the dynamic time-series feature extraction module, a vector representation of each point in the graph at each timestamp is obtained. To detect abnormal edges in each graph, a scoring function is defined to evaluate the degree of abnormality of each edge:
[Formula (11) appears as an image in the original publication.]
In formula (11), $W_a$ and $b$ are learnable parameters, and the two vectors in the formula are the vector expressions of the two vertices of edge $e$. The value range of $f(e)$ is {0,1}. To obtain the optimal parameters, the loss function used during training is defined as:
[Formula (12) appears as an image in the original publication.]
In formula (12), $y_e$ is the label of edge $e$, the norm term is an L2 norm, and $\lambda$ is an adjustable hyperparameter.
After the model is trained, abnormal edges in the dynamic graph can be detected according to formula (11).
To better illustrate the technical effects of the present invention, the following tests were carried out:
The detection method is denoted LASTAN, defined as a dynamic graph abnormal edge detection algorithm based on long-term and short-term time-series attention.
Experiments were carried out on six real-world data sets: UCI (a machine-learning database), Digg (a database from the U.S. company Digg), Email-DNC (an email database), Bitcoin-Alpha (an algorithmic quantitative system for automated cryptocurrency trading), Bitcoin-OTC (a Bitcoin trading platform), and AS-Topology (an AS-level topology generated from Route Views project data). Together these data sets cover dynamic graph data of types such as social networks and computer network attacks. Four methods were selected as baselines for comparison: CM-Sketch (a frequency-counting algorithm that sacrifices counting accuracy to raise processing speed and reduce processing time), SedanSpot (a method for detecting anomalies in edge streams), NetWalk (a method for detecting anomalies in dynamic networks), and StrGNN (a neural-network-based detection method). In this study, 50% of each data set is used as the training set.
Because the data sets contain no abnormal data, this study follows the NetWalk approach and injects abnormal data into the data sets at proportions of 1%, 5%, and 10%.
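One simple way to inject labelled abnormal edges is sketched below. The text does not spell out the injection procedure, so this is only an assumption: anomalies are node pairs that do not occur as edges, sampled uniformly, and the proportion is taken relative to the number of existing edges. The function name and the ring-graph example are hypothetical.

```python
import random
from itertools import combinations

def inject_anomalies(edges, nodes, ratio, seed=0):
    """Return (u, v, label) triples: label 0 for original edges, 1 for injected ones."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    existing = {frozenset(e) for e in edges}
    # Candidate anomalies: every unordered node pair that is not already an edge.
    # This enumeration is quadratic in the node count, fine for a small sketch.
    candidates = [
        pair for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) not in existing
    ]
    target = max(1, round(ratio * len(edges)))
    if target > len(candidates):
        raise ValueError("not enough unused node pairs to inject")
    injected = rng.sample(candidates, target)
    return [(u, v, 0) for u, v in edges] + [(u, v, 1) for u, v in injected]

# Illustrative data: a ring of 20 nodes, with 10% abnormal edges injected.
nodes = list(range(20))
edges = [(i, (i + 1) % 20) for i in range(20)]
labelled = inject_anomalies(edges, nodes, ratio=0.10)
```

The labels produced here play the role of the edge labels used during training and evaluation.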
Table 1. Comparison of AUC results of each method
[Table 1 appears as images in the original publication.]
The overall experimental results are shown in Table 1. They show that the method outperforms the other baseline methods on all six data sets. FIG. 2 shows the AUC scores at different snapshots in each data set. These results show that the method performs better than the other methods at every point in time. In particular, on the UCI, Bitcoin-Alpha, and Bitcoin-OTC data sets, the AUC scores of the other baseline methods show a marked downward trend as time increases, while the method maintains good performance.
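AUC, the metric reported in Table 1 and FIG. 2, is the probability that a randomly chosen abnormal edge receives a higher score than a randomly chosen normal edge. A minimal sketch of that pairwise definition follows; the function name and the example scores are illustrative and are not results from the patent.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one abnormal and one normal edge")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Illustrative scores only: two abnormal edges (label 1), two normal edges.
example = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Computing this value separately for the edges of each snapshot gives a per-snapshot curve of the kind plotted in FIG. 2.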
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims (9)

1. A method for detecting data anomalies in an import-export behavior dynamic graph, characterized by comprising the following steps:
S1, defining the import and export behavior dynamic graph: a graph sequence is constructed to define the import-export behavior dynamic graph;
S2, extracting features of the import and export behavior dynamic graph: a structural and contextual feature extraction module and a dynamic time-series feature extraction module extract the structural features and time features of the import and export behavior dynamic graph into the node representations of the respective modules;
S3, detecting anomalies in the import and export behavior dynamic graph: the anomaly detection module uses the node representations from S2 to detect abnormal edges in the import and export behavior dynamic graph.
2. The method according to claim 1, wherein in S1 the import/export behavior dynamic graph is defined as a graph sequence $G = \{G^t\}$, where $G^t$ is the graph at timestamp $t$; each graph is $G^t = (V^t, E^t)$, where $V^t$ and $E^t$ are the point set and edge set of $G^t$; an edge $(v_i, v_j, w) \in E^t$ connects nodes $v_i$ and $v_j$ with weight $w$; $V$ and $E$ denote the total point set and total edge set of the import-export behavior dynamic graph $G$; then $n = |V|$, and $A^t \in \mathbb{R}^{n \times n}$ is the adjacency matrix of each graph.
3. The method for detecting the data abnormality of the import/export behavior dynamic graph according to claim 1, wherein the structural and contextual feature extraction module in S2 uses a snapshot-based multi-layer graph convolutional neural network to extract structural features in the import/export behavior dynamic graph and aggregate content features between nodes.
4. The method according to claim 1, wherein the dynamic time sequence feature extraction module in S2 applies a multi-head attention mechanism to the import-export behavior dynamic graph and extends that mechanism into a block-recurrent structure, thereby extracting the two main temporal features of the dynamic graph, namely the long-term and the short-term time sequence features.
5. A system for detecting anomalies in import-export behavior dynamic graph data, characterized by comprising a structural and contextual feature extraction module, a dynamic time sequence feature extraction module, and an anomaly detection module.
6. The system according to claim 5, wherein the structural and contextual feature extraction module uses a multi-layer graph convolutional neural network to extract structural features from each snapshot in the dynamic graph once, and maps the nodes of the dynamic graph to high-dimensional vectors;
the dynamic time sequence feature extraction module divides the dynamic graph sequence into a plurality of time sequence blocks according to a fixed window size; a multi-head attention mechanism is applied within each time sequence block to extract the time sequence features and update their vector representations, and each time sequence block passes the long-term time sequence information of each node to the next block through a long-term memory state vector, so that long-term time sequence features are stored and extracted and abnormal edge detection performance is improved;
the anomaly detection module uses the node vectors of the dynamic graph to construct a vector representation of each edge, feeds each edge vector into a nonlinear activation function to compute an anomaly score, and identifies as abnormal the edges whose anomaly score is greater than a threshold.
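The block-wise processing described in claim 6 can be sketched as a simple control loop. This is only an illustration under stated assumptions: the windows are taken to be non-overlapping, and `process_block` is a hypothetical callback standing in for the attention computation, which the claim does not spell out at this point.

```python
from typing import Any, Callable, List, Sequence, Tuple


def split_into_blocks(sequence: Sequence[Any], k: int) -> List[List[Any]]:
    """Split a snapshot sequence into consecutive blocks of at most k snapshots."""
    if k <= 0:
        raise ValueError("window size k must be positive")
    return [list(sequence[i:i + k]) for i in range(0, len(sequence), k)]


def run_blocks(
    sequence: Sequence[Any],
    k: int,
    process_block: Callable[[List[Any], Any], Tuple[List[Any], Any]],
    initial_memory: Any,
) -> Tuple[List[Any], Any]:
    """Process blocks in order, handing the memory state from each block to the next."""
    memory = initial_memory
    outputs: List[Any] = []
    for block in split_into_blocks(sequence, k):
        block_out, memory = process_block(block, memory)
        outputs.extend(block_out)
    return outputs, memory


def toy_process_block(block, memory):
    # Toy stand-in: tag each snapshot with the memory it saw, then
    # update the memory to the number of snapshots processed so far.
    return [(snap, memory) for snap in block], memory + len(block)


outputs, final_memory = run_blocks(["G1", "G2", "G3", "G4", "G5"], 2, toy_process_block, 0)
```

The toy callback only makes the data flow visible: snapshots in a later block see a memory value produced by the earlier blocks.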
7. The system according to claim 5 or 6, wherein the structural and contextual feature extraction module performs one multi-layer graph convolution operation on each graph $G^t$ in the graph sequence to extract the structural features of that graph and obtain its vertex vectors, specifically described as:
[formula (1): image FDA0003888040310000022]
In formula (1), GCN is the multi-layer graph convolutional neural network applied to a snapshot, $L$ is the number of GCN layers, $H^t \in \mathbb{R}^{n \times d_h}$ is the vector representation of each vertex of graph $G^t$, and $d_h$ is the vector dimension. The GCN is computed as follows:
$Z^{(0)} = X^t$ (2)
[formula (3): image FDA0003888040310000025]
[formula (4): image FDA0003888040310000026]
In formulas (3) and (4), $\sigma(\cdot)$ denotes an activation function, such as the ReLU activation function,
[formula image FDA0003888040310000027]
[formula image FDA0003888040310000028]
The final output of this module is the vertex vector matrix $H^t$ of each graph $G^t$.
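The multi-layer graph convolution of claim 7 might be sketched as below. Formulas (3) and (4) are not reproduced in this text, so the sketch assumes the widely used propagation rule $Z^{(l)} = \sigma(\hat{A} Z^{(l-1)} W^{(l-1)})$ with $\hat{A} = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$ and a ReLU after every layer; the patent's actual normalization may differ. The helper names are hypothetical.

```python
import math


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def normalize_adjacency(A):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2} with non-negative weights."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def relu(Z):
    return [[max(0.0, z) for z in row] for row in Z]


def gcn(A, X, weights):
    """Apply one graph convolution per weight matrix, starting from Z^(0) = X (formula (2))."""
    A_hat = normalize_adjacency(A)
    Z = X
    for W in weights:
        Z = relu(matmul(matmul(A_hat, Z), W))
    return Z  # vertex vector matrix of the snapshot


# Two connected nodes, identity features, a single identity weight layer.
A = [[0.0, 1.0], [1.0, 0.0]]
X = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
H = gcn(A, X, [W])
```

With these toy inputs each node ends up with the average of its own and its neighbor's features, which is the smoothing behavior expected from a graph convolution.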
8. The system according to claim 5 or 6, wherein the dynamic time sequence feature extraction module uses a dynamic-graph multi-head attention network to extract long-term and short-term temporal features into the representation of each node in parallel;
the graph sequence $G$ is divided into a plurality of blocks with window size $k$, where [symbol image FDA00038880403100000210] denotes the $i$-th block; after a block passes through the structural and contextual feature extraction module, the vertex vector sequence $\{H^{t-k+1}, \ldots, H^{t}\}$ of that block is obtained;
before the sequence is fed into the dynamic-graph multi-head attention network, a memory vector $M$ is added to it, because the vector sequence of the current block [symbol image FDA00038880403100000211] contains only local, short-term temporal features and the earlier temporal features also need to be preserved; so for block [symbol image FDA0003888040310000031], the input sequence is [formula image FDA0003888040310000032]. Let [symbol image FDA0003888040310000033] denote the vector sequence of vertex $v$ in block [symbol image FDA0003888040310000034]. Because the multi-head attention mechanism treats the elements of the sequence [symbol image FDA0003888040310000035] symmetrically, the sequence carries no information about the order of its elements, so a position encoding PE(·) must be applied to it once to obtain the input of the multi-head attention. The input process is as follows:
[formula (5): image FDA0003888040310000036]
[formula (6): image FDA0003888040310000037]
In formulas (5) and (6), [symbol image FDA0003888040310000038] is the encoding of the $i$-th position, and [symbol image FDA0003888040310000039] is the input sequence of vertex $v$ in block [symbol image FDA00038880403100000310];
after the input sequence has been processed, the multi-head attention mechanism extracts the time sequence information as follows:
[formula (7): image FDA00038880403100000311]
[formula (8): image FDA00038880403100000312]
[formula (9): image FDA00038880403100000313]
[formula (10): image FDA00038880403100000314]
In formulas (7) to (10), [symbol image FDA00038880403100000315] is the output vector sequence of vertex $v$ in block [symbol image FDA00038880403100000316]; [symbol image FDA00038880403100000317] and [symbol image FDA00038880403100000318] are three learnable parameters; owing to the nature of matrix operations, the above process allows the computation for every vertex to be carried out in parallel.
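The per-vertex computation of claim 8 could look roughly like the sketch below. Formulas (5) to (10) are not reproduced in this text, so everything specific here is an assumption: the sinusoidal position encoding, the scaled dot-product attention, the concatenation of heads without an output projection, and the rule that the output at the memory position becomes the next block's memory. The function names are hypothetical.

```python
import math


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def positional_encoding(length, dim):
    """Assumed sinusoidal encoding: sine on even indices, cosine on odd indices."""
    pe = []
    for pos in range(length):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(X, Wq, Wk, Wv):
    """One scaled dot-product attention head over the rows of X."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d = len(K[0])
    scores = [[sum(q * k for q, k in zip(qr, kr)) / math.sqrt(d) for kr in K] for qr in Q]
    return matmul([softmax(row) for row in scores], V)


def block_step(memory, block_vectors, heads):
    """Prepend the memory vector, add position codes, run all heads, concatenate."""
    seq = [memory] + block_vectors
    pe = positional_encoding(len(seq), len(memory))
    X = [[x + p for x, p in zip(row, prow)] for row, prow in zip(seq, pe)]
    outs = [attention_head(X, Wq, Wk, Wv) for (Wq, Wk, Wv) in heads]
    Y = [sum((o[i] for o in outs), []) for i in range(len(seq))]
    # Assumption: position 0 of the output is carried forward as the new memory.
    return Y[0], Y[1:]


# Toy weights: head 1 reads coordinates 0-1, head 2 reads coordinates 2-3.
W1 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
W2 = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
heads = [(W1, W1, W1), (W2, W2, W2)]
vertex_vectors = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5], [1.0, 1.0, 0.5, 0.5]]
new_memory, outputs = block_step([0.0] * 4, vertex_vectors, heads)
```

Here `vertex_vectors` plays the role of the vectors of a single vertex across the $k = 3$ snapshots of one block. In a real implementation the same matrix operations would be batched over all vertices, which is what makes the computation parallel.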
9. The system according to claim 5 or 6, wherein, after processing by the structural and contextual feature extraction module and the dynamic time sequence feature extraction module, the vector representation [symbol image FDA00038880403100000320] of each vertex in the graph $G^t$ at each time is obtained; in order to detect abnormal edges in each graph, a scoring function is defined to evaluate the degree of abnormality of each edge, as follows:
[formula (11): image FDA00038880403100000321]
In formula (11), $W_a$ and $b$ are learnable parameters, and [symbol image FDA0003888040310000041] and [symbol image FDA0003888040310000042] are the vector representations of the two vertices of edge $e$; the value range of $f(e)$ is {0,1}. In order to obtain the optimal parameters, the loss function used during training is defined as follows:
[formula (12): image FDA0003888040310000043]
In formula (12), $y_e$ is the label of edge $e$, [symbol image FDA0003888040310000044] is the L2 norm, and $\lambda$ is an adjustable hyper-parameter.
After the model is trained, abnormal edges in the dynamic graph can be detected according to formula (11).
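The scoring and thresholding steps of claim 9 can be sketched as follows. Formulas (11) and (12) are not reproduced in this text, so the forms used here are assumptions: the score is taken to be a sigmoid of a linear function of the two concatenated vertex vectors, and the loss is taken to be binary cross-entropy plus an L2 penalty weighted by $\lambda$. The function names and the toy vectors are hypothetical.

```python
import math


def score_edge(h_i, h_j, W_a, b):
    """Assumed scoring form: sigmoid(W_a . [h_i ; h_j] + b)."""
    z = sum(w * x for w, x in zip(W_a, h_i + h_j)) + b
    return 1.0 / (1.0 + math.exp(-z))


def anomaly_loss(scores, labels, params, lam):
    """Assumed loss: mean binary cross-entropy plus lam times the squared L2 norm."""
    eps = 1e-12  # guards against log(0)
    bce = -sum(
        y * math.log(s + eps) + (1 - y) * math.log(1 - s + eps)
        for s, y in zip(scores, labels)
    ) / len(scores)
    return bce + lam * sum(p * p for p in params)


def detect_abnormal_edges(edges, vectors, W_a, b, threshold):
    """Return the edges whose anomaly score exceeds the threshold."""
    return [
        (i, j) for (i, j) in edges
        if score_edge(vectors[i], vectors[j], W_a, b) > threshold
    ]


# Toy vertex vectors and parameters.
vectors = {"a": [0.0, 0.0], "b": [0.0, 0.0], "c": [2.0, 2.0]}
W_a = [1.0, 1.0, 1.0, 1.0]
flagged = detect_abnormal_edges([("a", "b"), ("a", "c")], vectors, W_a, 0.0, 0.9)
```

Only the threshold comparison is taken directly from the claims; the parameters would be learned by minimizing the training loss rather than set by hand as in this toy example.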
CN202211251195.8A 2022-10-13 2022-10-13 Exception detection method and system for import-export behavior dynamic graph data Pending CN115512133A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211251195.8A CN115512133A (en) 2022-10-13 2022-10-13 Exception detection method and system for import-export behavior dynamic graph data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211251195.8A CN115512133A (en) 2022-10-13 2022-10-13 Exception detection method and system for import-export behavior dynamic graph data

Publications (1)

Publication Number Publication Date
CN115512133A true CN115512133A (en) 2022-12-23

Family

ID=84509924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211251195.8A Pending CN115512133A (en) 2022-10-13 2022-10-13 Exception detection method and system for import-export behavior dynamic graph data

Country Status (1)

Country Link
CN (1) CN115512133A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561688A (en) * 2023-05-09 2023-08-08 浙江大学 Emerging technology identification method based on dynamic graph anomaly detection

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561688A (en) * 2023-05-09 2023-08-08 浙江大学 Emerging technology identification method based on dynamic graph anomaly detection
CN116561688B (en) * 2023-05-09 2024-03-22 浙江大学 Emerging technology identification method based on dynamic graph anomaly detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination