CN116561688B - Emerging technology identification method based on dynamic graph anomaly detection - Google Patents

Emerging technology identification method based on dynamic graph anomaly detection

Info

Publication number
CN116561688B
Authority
CN
China
Prior art keywords
node
time
technical
space
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310517066.7A
Other languages
Chinese (zh)
Other versions
CN116561688A (en)
Inventor
庄越挺
宗畅
邵健
鲁伟明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202310517066.7A
Publication of CN116561688A
Application granted
Publication of CN116561688B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an emerging technology identification method based on dynamic graph anomaly detection. The invention starts from the assumption that an emerging technology is a novel combination of existing technologies. It constructs dynamic graph data oriented to technical fields, and uses multiple space-time coupled features together with a self-attention deep neural network to represent the relations between technical-field nodes as feature vectors that fuse structural information and time-sequence information. From these vectors it calculates an anomaly score for each technology combination, treats the high-scoring combinations as a candidate set of emerging technical fields, and obtains the final emerging technical fields through manual judgment. The method makes full use of the coupled spatial and temporal information of the dynamic graph in both the feature input and the neural network, outperforms comparable recent methods on conventional anomaly detection tasks, and is applied in a novel way to the emerging technology identification task, where it screens candidate fields and significantly reduces the cost of solving the task.

Description

Emerging technology identification method based on dynamic graph anomaly detection
Technical Field
The invention relates to the fields of artificial intelligence, data mining, deep learning, anomaly detection, emerging technology identification and the like, in particular to an emerging technology identification method based on dynamic graph anomaly detection.
Background
An emerging technical field is often formed by an innovative combination of existing technologies. Accurately identifying emerging technical fields helps enterprises and technicians quickly find new directions for investment and research, and therefore has significant social value. The identification problem can be modeled as an anomaly detection task on technology combination relations: a dynamic graph is constructed from science and technology big data such as patents and projects, with technical fields as nodes and co-occurrence relations between technologies as edges, and possible emerging technical fields are found by mining the degree of anomaly of the technology combination relations in the graph, thereby supporting various downstream business scenarios.
In the prior art, the invention patent with application number CN202210014102.3 discloses a dynamic graph anomaly detection method based on dual self-attention. It applies structural self-attention to vertex sequences obtained by random-walk sampling of the graph, and extracts structural features and time-sequence features of the dynamic graph to detect anomalous edges. The self-attention mechanism strengthens structural feature extraction by focusing on the more important nodes, the evolution pattern of each vertex is then learned to extract time-sequence features, and the dual attention achieves good results on anomaly detection tasks. The invention patent with application number CN202210019006.8 discloses a dynamic graph anomaly detection method based on community structure. It detects the evolving communities of a dynamic graph and reconstructs the distances between nodes within and across communities, so that nodes in the same community have similar features while different communities lie far apart, which effectively addresses the anomaly detection task. The invention patent with application number CN202210530965.6 provides a method and a device for identifying emerging technologies based on a large-scale corpus. It extracts keywords from documents, obtains an emergence score from the number of candidate documents and keyword-related information, and then applies a dynamic backtracking method to the candidate emerging technology keywords to obtain the target emerging technical fields.
The invention patent with application number CN201710356745.5 provides an emerging technology identification method based on patent citations. It calculates the co-citation coupling degree of patent citations over the last two years to obtain the main classification number with the highest coupling degree, marks newly created classification numbers as emerging technologies, repeats this for all patents to obtain labeled data for training a classification model, and then predicts the technology of subsequent patents, achieving good results on the emerging technology identification task.
However, existing dynamic graph anomaly detection methods fuse time-sequence features and spatial structure features only shallowly, so their detection performance is limited, and anomaly detection has not yet been applied to the emerging technology identification task. The detection method therefore needs to be further improved and verified.
Disclosure of Invention
The invention aims to solve the problem of low emerging technology identification and detection performance in the prior art, and provides an emerging technology identification method based on dynamic graph anomaly detection.
The specific technical scheme adopted by the invention is as follows:
an emerging technology identification method based on dynamic graph anomaly detection, comprising:
S1, constructing technical text data into a technical dynamic graph, wherein the graph nodes are technical fields, the edges are co-occurrence relations among the technical fields, and the timestamps are the dates on which the technical texts were disclosed; taking each edge in the technical dynamic graph as a center edge, and extracting the neighbor subgraph corresponding to each edge through subgraph sampling; the node set of the neighbor subgraph comprises the two nodes forming the center edge and all first-order neighbor nodes of these two nodes, and the edges of the neighbor subgraph are the edges among all nodes in the subgraph;
S2, for the neighbor subgraph corresponding to each edge in the technical dynamic graph, calculating for each node in the subgraph the multi-level node features of a time-space independent feature set and a time-space coupled feature set, projecting the multi-level node features into a feature space using weight parameters, and obtaining the space-time feature vector corresponding to each node by aggregation;
S3, splicing the node sets of each neighbor subgraph in the technical dynamic graph in time order to form a dynamic graph node sequence, and fusing the space-time feature vector of each node in the sequence with space-time two-dimensional position encoding information to obtain the fusion feature of each node; inputting the fusion features of the nodes in the dynamic graph node sequence into a self-attention network deep learning model for depth representation calculation, and aggregating the depth representation vectors of all nodes in the sequence to obtain the depth representation vector corresponding to the center edge of each neighbor subgraph;
S4, inputting the depth representation vector of each edge in the latest snapshot of the technical dynamic graph into a multi-layer perceptron deep learning model, converting it into a corresponding anomaly score, and, using the anomaly score as the screening criterion, selecting the several edges whose anomaly scores rank highest in the latest snapshot; the combinations of two technical fields whose co-occurrence relations correspond to these edges are the emerging technology candidate fields.
Preferably, the technical text is a patent document; in the constructed patent technology dynamic graph, the nodes are patent CPC classification codes, the edges are the combination relations among the first three CPC codes associated with the patent document, and the timestamp is the patent publication date.
Preferably, the technical text is a project text; in the constructed project technology dynamic graph, the nodes are project technical keywords, the edges are the combination relations among the first five keywords associated with the project document, and the timestamp is the project publication date.
Preferably, in step S1, when subgraph sampling is performed, the two constituent nodes $v_i^{t}$ and $v_j^{t}$ of each edge $e_{ij}^{t}$ in the technical dynamic graph and all of their neighbor nodes are selected; these nodes and the edges among them form the neighbor subgraph corresponding to the edge, and any node in the neighbor subgraph is expressed as follows:
$v_k^{t} \in \{v_i^{t}, v_j^{t}\} \cup N(v_i^{t}) \cup N(v_j^{t})$
wherein $v_k^{t}$ is the k-th node at time t, and $N(v_i^{t})$ and $N(v_j^{t})$ are the neighbor node sets of nodes $v_i^{t}$ and $v_j^{t}$, respectively.
Preferably, in the step S2, for the neighbor subgraphs corresponding to each edge in the technical dynamic graph, the method for calculating the space-time feature vector corresponding to each node is as follows:
S21, calculating a time-space independent feature set consisting of a global spatial feature, a local spatial feature and an existence time feature, wherein:
the global spatial feature is represented by the PageRank value of the node in the global graph, and the calculation formula is as follows:
$f_g(v_k^{t}) = \mathrm{PageRank}(v_k^{t}, S_t)$
wherein: $S_t$ is the snapshot of the global technical dynamic graph at time t, and PageRank(·) is the PageRank value calculation function;
the local spatial feature is represented by the minimum distance from the node to the constituent nodes of the center edge, and the calculation formula is as follows:
$f_l(v_k^{t}) = \min(\mathrm{dist}(v_k^{t}, v_i^{t}), \mathrm{dist}(v_k^{t}, v_j^{t}))$
wherein: dist(·) is the shortest path distance calculation function, and min(·) is the minimum function;
the existence time feature is represented by the time span for which the center edge of the subgraph where the node is located has existed, and the calculation formula is as follows:
$f_t(v_k^{t}) = t - t_{start}$
wherein: $t_{start}$ is the first time point at which the center edge $e_{ij}^{t}$ of the subgraph appeared;
S22, calculating a time-space coupled feature set consisting of a distance change feature, an interaction change feature and a common-neighbor change feature, wherein:
the distance change feature is represented by the change over time of the distance between the constituent nodes of the center edge of the subgraph where the node is located, and the calculation formula is as follows:
wherein: dist(·) is the shortest path distance calculation function, used here to calculate the shortest distance between the two constituent nodes of the edge at time point $t - \Delta t$; $\Delta t$ is the time step over which the feature change is observed;
the interaction change feature is represented by the change over time of the degrees of the constituent nodes of the center edge of the subgraph where the node is located, and the calculation formula is as follows:
wherein: deg(·) is the degree calculation function, used to calculate the degrees of the constituent nodes of the center edge on the snapshots of the technical dynamic graph at different moments;
the common-neighbor change feature is represented by the change over time of the number of common neighbors of the constituent nodes of the center edge of the subgraph where the node is located, and the calculation formula is as follows:
wherein: v is a node in the intersection of the neighbor node sets of the edge's constituent nodes $v_i^{t}$ and $v_j^{t}$;
S23, for any node $v_k^{t}$ in the neighbor subgraph, projecting each feature in the time-space independent feature set and the time-space coupled feature set into the feature vector space through learnable weight parameters, and then aggregating them to obtain the space-time feature vector corresponding to node $v_k^{t}$; the calculation formula is as follows:
wherein: $W_g$, $W_l$, $W_t$ are the learnable weight parameters for projecting the time-space independent features, and $W_d$, $W_i$, $W_n$ are the learnable weight parameters for projecting the time-space coupled features.
Preferably, in the step S3, the method for obtaining the depth representation vector corresponding to the center edge of each neighbor subgraph is as follows:
S31, splicing the node sets of each neighbor subgraph in the different snapshots of the technical dynamic graph in time order to form a dynamic graph node sequence with total length $(C+2) \times T$:
wherein: $\cup$ is the splicing operator, C is the number of all neighbor nodes of the two constituent nodes of the center edge, and T is the number of timestamps contained in the technical dynamic graph, namely the total number of snapshots;
S32, for each node in the dynamic graph node sequence, summing its absolute spatial position projection and relative spatial position projection, and splicing the resulting spatial position projection with the temporal position projection to obtain the space-time two-dimensional position encoding information $PE$; the calculation formula is as follows:
$PE = (W_{abs} PE_{abs} + W_{rel} PE_{rel}) \,\|\, W_{tmp} PE_{tmp}$
wherein: $\|$ is the vector concatenation operator, and $W_{abs}$, $W_{rel}$, $W_{tmp}$ are three learnable projection matrices;
$PE_{abs}$ is the absolute spatial position encoding of the node, calculated as follows:
wherein: $RW = AD^{-1}$ is the random walk operation result matrix, A is the adjacency matrix of the technical dynamic graph, and $D^{-1}$ is the inverse of the degree matrix of the technical dynamic graph; $RW_{kk}$ is the value in the k-th row and k-th column of the random walk operation result matrix, and a superscript on $RW_{kk}$ denotes a power;
$PE_{rel}$ is the relative spatial position encoding of the node, calculated as follows:
$PE_{tmp}$ is the temporal position encoding of the node, calculated as follows:
wherein: t is the current timestamp of the subgraph where the node is located;
S33, fusing the space-time feature vector of each node in the dynamic graph node sequence with its space-time two-dimensional position encoding information, and splicing the results into the input feature sequence of the model:
wherein: $(\cdot)^{\top}$ is the matrix transpose operator;
S34, inputting the input feature sequence into a multi-layer self-attention network with P layers in total, and performing depth representation of the input feature sequence through the multi-layer self-attention mechanism, wherein the depth representation in any l-th self-attention layer is as follows:
first, calculating the attention weight $A^{(l)}$; the formula is as follows:
wherein: Softmax(·) is the Softmax function, l is the index of the current network layer, $l \in [1, P]$, and the input of the first layer is the initial input feature sequence;
then performing the layer normalization operation and the feed-forward network calculation on the obtained result to obtain the depth representation vector output by the current network layer; the calculation formula is as follows:
$H^{(l)} = \mathrm{LN}(A^{(l)} + Q^{(l)})$,
wherein: LN(·) is the layer normalization operation, and FFN(·) is the feed-forward network calculation;
S35, performing a mean aggregation operation on the depth representation vector $H^{(P)}$ output by the last self-attention layer to obtain the dynamic graph feature $z$ of the center edge $e_{ij}^{t}$, expressed as follows:
$z = \frac{1}{L} \sum_{n=1}^{L} H^{(P)}_{n}$
wherein: $H^{(P)}_{n}$ is the representation vector in $H^{(P)}$ that corresponds to the n-th node of the dynamic graph node sequence, and $L = (C+2) \times T$.
Preferably, in step S4, the depth representation vector $z$ of each edge $e_{ij}^{T}$ in the latest snapshot of the technical dynamic graph is converted into the corresponding anomaly score by the following expression:
$\mathrm{score}(e_{ij}^{T}) = \mathrm{Sigmoid}(\mathrm{MLP}(z))$
wherein Sigmoid(·) is the Sigmoid function and MLP(·) is the multi-layer perceptron model.
Preferably, in step S4, the screened emerging technology candidate fields are sent to a manual review terminal for review, and the final emerging technical fields are generated in combination with the manual review results.
Preferably, before the emerging technology identification framework formed by S1 to S4 is used for actual inference, the learnable parameters of each network layer are optimized in a training stage using pre-constructed positive and negative samples.
Preferably, the error loss used to train the emerging technology identification framework is expressed as follows:
where N is the total number of samples, and the loss is computed from the anomaly scores of the positive samples and the negative samples.
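A minimal sketch of one plausible training objective, assuming a mean binary cross-entropy over the anomaly scores. The function name `bce_loss`, the `eps` guard, and the label convention (observed "positive" edges treated as normal with target 0, constructed "negative" edges treated as anomalous with target 1) are assumptions made for illustration and are not taken from the patent text.

```python
import math

def bce_loss(pos_scores, neg_scores, eps=1e-12):
    """Mean binary cross-entropy over N = len(pos_scores) + len(neg_scores) samples.

    Assumed convention: positive (observed) edges should score near 0,
    negative (constructed) edges should score near 1.
    """
    terms = [-math.log(1.0 - s + eps) for s in pos_scores]
    terms += [-math.log(s + eps) for s in neg_scores]
    return sum(terms) / len(terms)

# A well-separated scorer yields a lower loss than an uninformative one.
good = bce_loss([0.1, 0.2], [0.9, 0.8])
uninformative = bce_loss([0.5, 0.5], [0.5, 0.5])
```

With this convention, a model that cannot separate the two sample types sits at a loss of about ln 2, and the loss approaches 0 as the scores separate.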
Compared with the prior art, the invention has the following beneficial effects:
The invention starts from the assumption that an emerging technology is a novel combination of existing technologies. It constructs dynamic graph data oriented to technical fields, and uses multiple space-time coupled features together with a self-attention deep neural network to represent the relations between technical-field nodes as feature vectors that fuse structural information and time-sequence information. From these vectors it calculates an anomaly score for each technology combination, treats the high-scoring combinations as a candidate set of emerging technical fields, and obtains the final emerging technical fields through manual judgment. The method makes full use of the coupled spatial and temporal information of the dynamic graph in both the feature input and the neural network, outperforms comparable recent methods on conventional anomaly detection tasks, and is applied in a novel way to the emerging technology identification task, where it screens candidate fields and significantly reduces the cost of solving the task. The method can further support business scenarios such as selecting technical research directions, investing in technical fields, and analyzing technical development.
Drawings
FIG. 1 is a flowchart of an emerging technology identification method based on dynamic graph anomaly detection in an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to the appended drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be embodied in many other forms than described herein and similarly modified by those skilled in the art without departing from the spirit of the invention, whereby the invention is not limited to the specific embodiments disclosed below. The technical features of the embodiments of the invention can be combined correspondingly on the premise of no mutual conflict.
In a preferred embodiment of the present invention, there is provided an emerging technology identification method based on dynamic graph anomaly detection, comprising the steps of:
S1, constructing technical text data into a technical dynamic graph, wherein the graph nodes are technical fields, the edges are co-occurrence relations among the technical fields, and the timestamps are the dates on which the technical texts were disclosed; taking each edge in the technical dynamic graph as a center edge, and extracting the neighbor subgraph corresponding to each edge through subgraph sampling; the node set of the neighbor subgraph comprises the two nodes forming the center edge and all first-order neighbor nodes of these two nodes, and the edges of the neighbor subgraph are the edges among all nodes in the subgraph.
In the embodiment of the invention, the technical text can be a patent document; in the constructed patent technology dynamic graph, the nodes are patent CPC classification codes, the edges are the combination relations among the first three CPC codes associated with the patent document, and the timestamps are patent publication dates. The technical text can also be a project text; in the constructed project technology dynamic graph, the nodes are project technical keywords, the edges are the combination relations among the first five keywords associated with the project document, and the timestamps are project publication dates.
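The graph construction described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the toy `records` list, the helper name `build_snapshots`, the yearly snapshot granularity, and the choice to make each snapshot cumulative are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (publication date, CPC codes in listed order).
records = [
    ("2021-03-01", ["G06F18/24", "G06N5/01", "G06F18/25", "H04L9/00"]),
    ("2022-06-15", ["G06N5/01", "H04L9/00"]),
    ("2022-09-30", ["G06F18/24", "H04L9/00", "G06N3/08"]),
]

def build_snapshots(records, top_k=3):
    """Map each year to the set of co-occurrence edges seen up to that year.

    Only the first `top_k` codes of a document are combined pairwise,
    mirroring the "first three CPC codes" rule for patent documents.
    """
    edges_by_year = defaultdict(set)
    for date, codes in records:
        year = int(date[:4])
        for a, b in combinations(sorted(set(codes[:top_k])), 2):
            edges_by_year[year].add((a, b))
    snapshots, seen = {}, set()
    for year in sorted(edges_by_year):
        seen |= edges_by_year[year]   # assumed: snapshots accumulate edges
        snapshots[year] = set(seen)
    return snapshots

snapshots = build_snapshots(records)
```

For project texts the same sketch applies with keywords in place of CPC codes and `top_k=5`.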
In the embodiment of the invention, when subgraph sampling is performed, the two constituent nodes $v_i^{t}$ and $v_j^{t}$ of each edge $e_{ij}^{t}$ in the technical dynamic graph and all of their neighbor nodes are selected; these nodes and the edges among them form the neighbor subgraph corresponding to the edge, and any node in the neighbor subgraph is expressed as follows:
$v_k^{t} \in \{v_i^{t}, v_j^{t}\} \cup N(v_i^{t}) \cup N(v_j^{t})$
wherein $v_k^{t}$ is the k-th node at time t, and $N(v_i^{t})$ and $N(v_j^{t})$ are the neighbor node sets of nodes $v_i^{t}$ and $v_j^{t}$, respectively.
It should be noted that, since technical texts are continually being disclosed, the technical dynamic graph has a different snapshot at each moment. The invention assumes that an emerging technology is a novel combination of existing technologies, so each edge in the technical dynamic graph corresponds to one technology combination.
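The neighbor subgraph sampling of S1 can be illustrated with a short sketch. The helper names `adjacency` and `neighbor_subgraph` and the toy edge set are hypothetical; the sketch works on a single snapshot given as a set of undirected edges.

```python
from collections import defaultdict

def adjacency(edges):
    """Build an undirected adjacency map from a set of (a, b) edges."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def neighbor_subgraph(edges, center):
    """Return (nodes, sub_edges) of the neighbor subgraph of a center edge.

    The node set holds the two endpoints and all of their first-order
    neighbors; the edge set holds every edge between those nodes.
    """
    i, j = center
    adj = adjacency(edges)
    nodes = {i, j} | adj[i] | adj[j]
    sub_edges = {(a, b) for a, b in edges if a in nodes and b in nodes}
    return nodes, sub_edges

edges = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")}
nodes, sub_edges = neighbor_subgraph(edges, ("A", "B"))
```

Here `E` and `F` are more than one hop away from both endpoints, so they and their edges are left out of the subgraph.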
S2, for the neighbor subgraph corresponding to each edge in the technical dynamic graph, calculating for each node in the subgraph the multi-level node features of a time-space independent feature set and a time-space coupled feature set, projecting the multi-level node features into a feature space using weight parameters, and obtaining the space-time feature vector corresponding to each node by aggregation.
In the embodiment of the invention, for the neighbor subgraph corresponding to each edge in the technical dynamic graph, the space-time feature vector corresponding to each node is calculated as follows:
S21, calculating a time-space independent feature set consisting of a global spatial feature, a local spatial feature and an existence time feature, wherein:
the global spatial feature is represented by the PageRank value of the node in the global graph, and the calculation formula is as follows:
$f_g(v_k^{t}) = \mathrm{PageRank}(v_k^{t}, S_t)$
wherein: $S_t$ is the snapshot of the global technical dynamic graph at time t, and PageRank(·) is the PageRank value calculation function;
the local spatial feature is represented by the minimum distance from the node to the constituent nodes of the center edge, and the calculation formula is as follows:
$f_l(v_k^{t}) = \min(\mathrm{dist}(v_k^{t}, v_i^{t}), \mathrm{dist}(v_k^{t}, v_j^{t}))$
wherein: dist(·) is the shortest path distance calculation function, and min(·) is the minimum function;
the existence time feature is represented by the time span for which the center edge of the subgraph where the node is located has existed, and the calculation formula is as follows:
$f_t(v_k^{t}) = t - t_{start}$
wherein: $t_{start}$ is the first time point at which the center edge $e_{ij}^{t}$ of the subgraph appeared;
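The three time-space independent features of S21 can be sketched in plain Python. This is an illustrative sketch only: the function names, the damping factor 0.85, the fixed iteration count, and the use of integer years as timestamps are assumptions, and the PageRank routine is a basic power iteration that expects a graph without isolated nodes.

```python
from collections import defaultdict, deque

def adjacency(edges):
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected graph (no isolated nodes)."""
    nodes = list(adj)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        rank = {
            v: (1 - damping) / len(nodes)
            + damping * sum(rank[u] / len(adj[u]) for u in adj[v])
            for v in nodes
        }
    return rank

def bfs_dist(adj, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def independent_features(edges, center, node, t, t_start):
    """Global (PageRank), local (min distance to endpoints), existence time."""
    adj = adjacency(edges)
    i, j = center
    d = bfs_dist(adj, node)
    return {
        "global": pagerank(adj)[node],
        "local": min(d.get(i, float("inf")), d.get(j, float("inf"))),
        "time": t - t_start,
    }

edges = {("A", "B"), ("B", "C"), ("C", "D")}
feats = independent_features(edges, ("A", "B"), "D", t=2023, t_start=2020)
ranks = pagerank(adjacency(edges))
```

In the path graph A-B-C-D, node `D` is two hops from `B` and three from `A`, so its local spatial feature is 2.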
S22, calculating a time-space coupled feature set consisting of a distance change feature, an interaction change feature and a common-neighbor change feature, wherein:
the distance change feature is represented by the change over time of the distance between the constituent nodes of the center edge of the subgraph where the node is located, and the calculation formula is as follows:
wherein: dist(·) is the shortest path distance calculation function, used here to calculate the shortest distance between the two constituent nodes of the edge at time point $t - \Delta t$; $\Delta t$ is the time step over which the feature change is observed;
the interaction change feature is represented by the change over time of the degrees of the constituent nodes of the center edge of the subgraph where the node is located, and the calculation formula is as follows:
wherein: deg(·) is the degree calculation function, used to calculate the degrees of the constituent nodes of the center edge on the snapshots of the technical dynamic graph at different moments;
the common-neighbor change feature is represented by the change over time of the number of common neighbors of the constituent nodes of the center edge of the subgraph where the node is located, and the calculation formula is as follows:
wherein: v is a node in the intersection of the neighbor node sets of the edge's constituent nodes $v_i^{t}$ and $v_j^{t}$;
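The three time-space coupled features of S22 compare the same center edge on two snapshots. The sketch below is illustrative: the sign convention (value at time t minus value at time t - Δt), the summing of the two endpoint degrees, and the `cap` stand-in distance for unreachable pairs are assumptions, as are all function names.

```python
from collections import defaultdict, deque

def adjacency(edges):
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def shortest_dist(adj, a, b, unreachable):
    """Breadth-first shortest path length from a to b, or `unreachable`."""
    if a not in adj or b not in adj:
        return unreachable
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            return dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return unreachable

def endpoint_degree(adj, i, j):
    return len(adj.get(i, set())) + len(adj.get(j, set()))

def common_neighbors(adj, i, j):
    return len(adj.get(i, set()) & adj.get(j, set()))

def coupling_features(edges_prev, edges_now, center):
    """Changes between snapshot t - dt (edges_prev) and snapshot t (edges_now)."""
    i, j = center
    prev, now = adjacency(edges_prev), adjacency(edges_now)
    cap = len(now) + 1  # assumed stand-in distance when no path exists
    return {
        "distance": shortest_dist(now, i, j, cap) - shortest_dist(prev, i, j, cap),
        "interaction": endpoint_degree(now, i, j) - endpoint_degree(prev, i, j),
        "co_neighbor": common_neighbors(now, i, j) - common_neighbors(prev, i, j),
    }

edges_prev = {("A", "C"), ("C", "B")}
edges_now = edges_prev | {("A", "B"), ("A", "D"), ("B", "D")}
change = coupling_features(edges_prev, edges_now, ("A", "B"))
```

In this toy case the endpoints move from two hops apart to directly connected, gain four units of combined degree, and gain one common neighbor.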
S23, for any node $v_k^{t}$ in the neighbor subgraph, projecting each feature in the time-space independent feature set and the time-space coupled feature set into the feature vector space through learnable weight parameters, and then aggregating them to obtain the space-time feature vector corresponding to node $v_k^{t}$; the calculation formula is as follows:
wherein: $W_g$, $W_l$, $W_t$ are the learnable weight parameters for projecting the time-space independent features, and $W_d$, $W_i$, $W_n$ are the learnable weight parameters for projecting the time-space coupled features.
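The projection and aggregation of S23 can be sketched as follows, assuming each scalar feature is projected by its own weight vector and the six projections are summed. Summation as the aggregation, the feature key names, and the fixed toy weights are assumptions made for illustration; in training the weights would be learned.

```python
FEATURES = ["global", "local", "time", "distance", "interaction", "co_neighbor"]

def spacetime_vector(features, weights):
    """Project each scalar feature with its weight vector and sum the results.

    `weights[name]` plays the role of W_g, W_l, W_t, W_d, W_i, W_n.
    """
    dim = len(weights[FEATURES[0]])
    vec = [0.0] * dim
    for name in FEATURES:
        for d in range(dim):
            vec[d] += weights[name][d] * features[name]
    return vec

# Toy weights: every projection maps a scalar x to [x, 2x].
weights = {name: [1.0, 2.0] for name in FEATURES}
features = {
    "global": 0.5, "local": 2, "time": 3,
    "distance": -1, "interaction": 4, "co_neighbor": 1,
}
x = spacetime_vector(features, weights)
```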
S3, splicing the node sets of each neighbor subgraph in the technical dynamic graph in time order to form a dynamic graph node sequence, and fusing the space-time feature vector of each node in the sequence with space-time two-dimensional position encoding information to obtain the fusion feature of each node; inputting the fusion features of the nodes in the dynamic graph node sequence into a self-attention network deep learning model for depth representation calculation, and aggregating the depth representation vectors of all nodes in the sequence to obtain the depth representation vector corresponding to the center edge of each neighbor subgraph.
In the embodiment of the invention, the depth representation vector corresponding to the center edge of each neighbor subgraph is obtained as follows:
S31, splicing the node sets of each neighbor subgraph in the different snapshots of the technical dynamic graph in time order to form a dynamic graph node sequence with total length $(C+2) \times T$:
wherein: $\cup$ is the splicing operator, C is the number of all neighbor nodes of the two constituent nodes of the center edge, and T is the number of timestamps contained in the technical dynamic graph, namely the total number of snapshots;
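A small sketch of the sequence construction in S31, assuming the same C neighbor nodes are used in every snapshot and that each sequence element is a (node, timestamp) pair. The function name `node_sequence` and the ordering within a snapshot (the two endpoints first, then neighbors in sorted order) are hypothetical choices.

```python
def node_sequence(center, neighbors, timestamps):
    """Concatenate the (C + 2) subgraph nodes of every snapshot in time order."""
    i, j = center
    per_snapshot = [i, j] + sorted(neighbors)
    return [(v, t) for t in sorted(timestamps) for v in per_snapshot]

seq = node_sequence(("A", "B"), {"C", "D", "E"}, [2021, 2022, 2023])
```

With C = 3 neighbors and T = 3 snapshots the sequence has (3 + 2) × 3 = 15 elements.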
S32, for each node in the dynamic graph node sequence, summing its absolute spatial position projection and relative spatial position projection, and splicing the resulting spatial position projection with the temporal position projection to obtain the space-time two-dimensional position encoding information $PE$; the calculation formula is as follows:
$PE = (W_{abs} PE_{abs} + W_{rel} PE_{rel}) \,\|\, W_{tmp} PE_{tmp}$
wherein: $\|$ is the vector concatenation operator, and $W_{abs}$, $W_{rel}$, $W_{tmp}$ are three learnable projection matrices;
$PE_{abs}$ is the absolute spatial position encoding of the node, calculated as follows:
wherein: $RW = AD^{-1}$ is the random walk operation result matrix, A is the adjacency matrix of the technical dynamic graph, and $D^{-1}$ is the inverse of the degree matrix of the technical dynamic graph; $RW_{kk}$ is the value in the k-th row and k-th column of the random walk operation result matrix, and a superscript on $RW_{kk}$ denotes a power;
$PE_{rel}$ is the relative spatial position encoding of the node, calculated as follows:
$PE_{tmp}$ is the temporal position encoding of the node, calculated as follows:
wherein: t is the current timestamp of the subgraph where the node is located;
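The random walk quantities mentioned for the absolute spatial position encoding can be illustrated as follows. This sketch assumes that the encoding collects the diagonal entry of successive powers of $RW = AD^{-1}$ (the return probabilities of a random walk), which is one common form of random-walk positional encoding; the number of powers `steps` and all function names are hypothetical, and the graph is assumed to have no isolated nodes.

```python
def random_walk_matrix(adj_matrix):
    """RW = A D^{-1}: each column c of A is divided by the degree of node c."""
    n = len(adj_matrix)
    deg = [sum(adj_matrix[r][c] for r in range(n)) for c in range(n)]
    return [[adj_matrix[r][c] / deg[c] for c in range(n)] for r in range(n)]

def matmul(x, y):
    """Dense matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def return_probabilities(adj_matrix, k, steps):
    """Entry (k, k) of RW^1, RW^2, ..., RW^steps."""
    rw = random_walk_matrix(adj_matrix)
    power, out = rw, []
    for _ in range(steps):
        out.append(power[k][k])
        power = matmul(power, rw)
    return out

triangle = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
pe_abs = return_probabilities(triangle, k=0, steps=3)
```

On a triangle a walker cannot return in one step, returns with probability 0.5 after two steps, and with probability 0.25 after three.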
S33, fusing the space-time feature vector of each node in the dynamic graph node sequence with its space-time two-dimensional position encoding information, and splicing the results into the input feature sequence of the model:
wherein: $(\cdot)^{\top}$ is the matrix transpose operator;
S34, inputting the input feature sequence into a multi-layer self-attention network with P layers in total, and performing depth representation of the input feature sequence through the multi-layer self-attention mechanism, wherein the depth representation in any l-th self-attention layer is as follows:
first, calculating the attention weight $A^{(l)}$; the formula is as follows:
wherein: Softmax(·) is the Softmax function, l is the index of the current network layer, $l \in [1, P]$, and the input of the first layer is the initial input feature sequence;
then performing the layer normalization operation and the feed-forward network calculation on the obtained result to obtain the depth representation vector output by the current network layer; the calculation formula is as follows:
$H^{(l)} = \mathrm{LN}(A^{(l)} + Q^{(l)})$,
wherein: LN(·) is the layer normalization operation, and FFN(·) is the feed-forward network calculation;
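One self-attention layer of the kind outlined in S34 can be sketched in plain Python. The sketch assumes standard scaled dot-product self-attention with separate query, key and value projections, followed by the residual layer normalization $\mathrm{LN}(A^{(l)} + Q^{(l)})$; the feed-forward step is left out for brevity, the layer normalization has no learnable scale or shift, and the projection matrices `w_q`, `w_k`, `w_v` are hypothetical names.

```python
import math

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(row, eps=1e-5):
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]

def attention_layer(h, w_q, w_k, w_v):
    """Return LN(A + Q), where A = softmax(Q K^T / sqrt(d)) V."""
    q, k, v = matmul(h, w_q), matmul(h, w_k), matmul(h, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]           # transpose of K
    scores = matmul(q, k_t)                        # Q K^T, one row per query
    weights = [softmax([s / scale for s in row]) for row in scores]
    a = matmul(weights, v)                         # attention output
    return [layer_norm([x + y for x, y in zip(ar, qr)]) for ar, qr in zip(a, q)]

identity = [[1.0, 0.0], [0.0, 1.0]]
h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]          # toy sequence of 3 nodes
h1 = attention_layer(h0, identity, identity, identity)
```

Stacking P such layers, each taking the previous layer's output as input, gives the multi-layer network; the output keeps one vector per sequence element.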
S35, performing a mean aggregation operation on the depth representation vector output by the last self-attention layer to obtain the dynamic graph feature of the center edge corresponding to the input feature sequence, expressed as follows:
wherein: the n-th term of the mean is the characterization vector, within the depth representation vector, of the n-th node of the dynamic graph node sequence, and $L = (C+2) \times T$.
S4, inputting the depth representation vector of each edge in the latest snapshot of the technical dynamic graph into a multi-layer perceptron deep learning model, converting it into a corresponding anomaly score, and using the anomaly score as the screening criterion: the edges in the latest snapshot are ranked by anomaly score from high to low and a number of top-ranked edges are selected, the combination of the two co-occurring technical fields corresponding to each selected edge being an emerging technology candidate field.
It should be noted that, although the technical dynamic graph contains different snapshots, screening for emerging technologies is mainly concerned with the current latest moment, so only the edges in the latest snapshot (i.e. t = T) of the technical dynamic graph need to be identified.
In the embodiment of the present invention, in S4, the depth representation vector of each edge in the latest snapshot of the technical dynamic graph is converted into the corresponding anomaly score by the following expression:
wherein Sigmoid(·) is the Sigmoid function, and MLP(·) is the multi-layer perceptron model.
In addition, the screened emerging technology candidate fields may contain misjudgments, so the screening results are preferably sent to a manual review end for review, and the final emerging technical fields are generated in combination with the manual review results.
It should be noted that steps S1 to S4 together constitute a model framework for emerging technology identification, and before the framework is used for actual inference, the learnable parameters of each network layer must be optimized in a training stage using pre-constructed positive and negative samples. In embodiments of the present invention, the error loss used to train the emerging technology identification framework is expressed as follows:
where N is the total number of samples, and the two score terms are the anomaly scores of the positive and negative samples, respectively.
The emerging technology identification method based on dynamic graph anomaly detection described in S1 to S4 above is applied below to a specific example to show its training procedure and technical effects.
Examples
As shown in fig. 1, in this embodiment, the method for identifying an emerging technology based on the anomaly detection of a dynamic graph includes the following steps:
1. Technical dynamic graph data construction and sampling stage
The emerging technology types of interest in this embodiment are patents in the artificial intelligence field and projects in the cancer medical field, so these two types of technical text are collected in advance to construct the datasets.
From the collected artificial intelligence patent data and cancer medical project data, a technical dynamic graph dataset is constructed for each of the two technical subjects using data processing methods, and a training set and a test set comprising neighbor subgraphs and time sequences are constructed using graph computation and random sampling.
This stage consists of two sub-stages: technical dynamic graph construction, and training and test sample set construction.
1.1 Technical dynamic graph construction
This sub-stage comprises three data processing steps: acquisition of recent technical text, generation of technical field co-occurrence relations, and assignment of technical field IDs; the data processing methods all come from open-source toolkits.
The nodes of the technical dynamic graph constructed here are technical fields, the edges are co-occurrence relations between technical fields, and the timestamps are the publication dates of the technical texts. For the patent technical dynamic graph, nodes are patent CPC classification codes, edges are the combination relations among the first three CPC codes of a patent document, and timestamps are patent publication dates; for the project technical dynamic graph, nodes are project technical keywords, edges are the combination relations among the first five keywords of a project document, and timestamps are project publication dates. The technical dynamic graphs constructed from the two datasets are as follows:
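The edge construction described in this sub-stage can be sketched as below. This is only an illustrative sketch: the function name `cooccurrence_edges`, the record layout, the de-duplication of codes and the sample CPC codes and dates are all assumptions, not details given in the patent.

```python
from itertools import combinations

def cooccurrence_edges(documents, top_n=3):
    """Turn documents into timestamped co-occurrence edges.

    Each document is (publication_date, [field codes in listed order]).
    Only the first `top_n` codes are paired, as described for CPC codes.
    """
    edges = []
    for date, fields in documents:
        # Drop repeated codes while keeping order (an assumption), then truncate.
        head = list(dict.fromkeys(fields))[:top_n]
        for u, v in combinations(sorted(head), 2):
            edges.append((u, v, date))
    # Sort chronologically so later stages can cut the list into snapshots.
    edges.sort(key=lambda e: e[2])
    return edges

# Hypothetical records; the codes and dates are made up for illustration.
docs = [
    ("2022-03-01", ["G06N3/08", "G06F18/24", "G06V10/82", "G06T7/00"]),
    ("2021-11-15", ["G06N3/08", "G06F40/30"]),
]
edges = cooccurrence_edges(docs)
```

For project documents the same sketch would be called with `top_n=5` on keyword lists.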
1.2 Training and test sample set construction
The training and test sets are obtained by sorting the N edges of a dynamic graph by timestamp and then cutting them into units of M edges, giving a time sequence of length N/M. The first half of the time sequence is the training set and the second half is the test set, each with a time sequence length of N/M/2. The specific values of N and M can be tuned to the actual dataset and the model input requirements.
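A minimal sketch of this cutting step follows. Dropping a trailing chunk shorter than M and treating each chunk as one snapshot are assumptions of the sketch; the helper names are hypothetical.

```python
def split_into_snapshots(edges, m):
    """Sort (u, v, timestamp) edges by time and cut them into chunks of m edges."""
    ordered = sorted(edges, key=lambda e: e[2])
    # A trailing chunk shorter than m is dropped (an assumption of this sketch).
    n_full = len(ordered) // m
    return [ordered[i * m:(i + 1) * m] for i in range(n_full)]

def train_test_split(snapshots):
    """First half of the time sequence for training, second half for testing."""
    half = len(snapshots) // 2
    return snapshots[:half], snapshots[half:]

# N = 12 toy edges with integer timestamps, given in reverse order on purpose.
toy_edges = [(f"a{i}", f"b{i}", i) for i in range(12, 0, -1)]
snapshots = split_into_snapshots(toy_edges, m=3)   # N / M = 4 snapshots
train, test = train_test_split(snapshots)          # N / M / 2 = 2 each
```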
For positive and negative sample label construction, all edges present on the graph at the last time point of the training set are treated as negative samples, and an equal number of edges that do not exist on the graph are randomly generated as positive samples, giving the final training dataset. Likewise, all edges present on the graph at the last time point of the test set are treated as negative samples, and edges that do not exist on the graph, numbering 10% of the negative samples, are randomly generated as positive samples, giving the final test dataset.
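The labeling scheme can be sketched as follows, keeping the patent's convention that existing edges are negative samples (label 0) and randomly generated absent edges are positive samples (label 1). The rejection-sampling strategy, the fixed seed and the function names are assumptions made for illustration.

```python
import random

def sample_absent_edges(nodes, existing_edges, count, seed=0):
    """Randomly draw `count` distinct node pairs that are not edges of the graph."""
    rng = random.Random(seed)
    present = {frozenset(e) for e in existing_edges}
    nodes = sorted(nodes)
    max_absent = len(nodes) * (len(nodes) - 1) // 2 - len(present)
    if count > max_absent:
        raise ValueError("not enough absent node pairs to sample from")
    sampled = set()
    while len(sampled) < count:
        pair = frozenset(rng.sample(nodes, 2))
        if pair not in present:
            sampled.add(pair)
    return [tuple(sorted(p)) for p in sampled]

def build_labeled_set(nodes, existing_edges, positive_ratio, seed=0):
    """Label existing edges 0 (negative) and generated absent edges 1 (positive)."""
    n_pos = int(len(existing_edges) * positive_ratio)
    positives = sample_absent_edges(nodes, existing_edges, n_pos, seed)
    return [(e, 0) for e in existing_edges] + [(e, 1) for e in positives]

nodes = list("ABCDEF")
existing = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
train_set = build_labeled_set(nodes, existing, positive_ratio=1.0)  # equal numbers
```

A test set in the same spirit would use `positive_ratio=0.1` on the edges of the last test time point.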
2. Multi-level space-time coupled feature calculation stage
Using prior knowledge of the dynamic graph structure and time sequence together with general knowledge of anomalous behavior, multi-level node features covering time-space independent features and time-space coupled features are constructed according to the degree of space-time coupling; the feature values are calculated, projected into a feature space by weight parameters, and aggregated to obtain the node features.
This stage consists of four sub-stages: subgraph sampling, the time-space independent feature set, the time-space coupled feature set, and node feature projection and aggregation.
2.1 Subgraph sampling
For the two nodes forming each edge, all their neighbor nodes are selected, and the subgraph formed by these nodes and the edges among them is taken as the sample for feature extraction; any node in the subgraph is represented as follows:
wherein the node in question is the k-th node at time t, and the two sets in the expression are the first-order neighbor node sets of the two nodes forming the edge, respectively.
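The sampling rule above (the two center nodes plus all their first-order neighbors, with every edge among those nodes) can be sketched as below. The adjacency-dictionary representation and the helper names are assumptions chosen for brevity.

```python
def to_adjacency(edge_list):
    """Build an undirected adjacency mapping {node: set of neighbours}."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def neighbor_subgraph(adj, center_edge):
    """Return (nodes, edges) of the neighbor subgraph around center_edge = (v1, v2)."""
    v1, v2 = center_edge
    nodes = {v1, v2} | adj.get(v1, set()) | adj.get(v2, set())
    # Keep every snapshot edge whose two endpoints are both in the node set.
    edges = {
        tuple(sorted((u, w)))
        for u in nodes
        for w in adj.get(u, set())
        if w in nodes
    }
    return nodes, edges

snapshot = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]
sub_nodes, sub_edges = neighbor_subgraph(to_adjacency(snapshot), ("A", "B"))
```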
2.2 Time-space independent feature set
The time-space independent feature set consists of three node-oriented spatial or temporal features: the global spatial feature, the local spatial feature, and the existence time feature. The global spatial feature is represented by the PageRank value of the node in the global graph, as follows:
wherein $S_t$ is the snapshot of the global dynamic graph at time t, the node argument is any node in the subgraph, and PageRank(·) is the PageRank value calculation function. The local spatial feature is represented by the minimum distance from the node to the nodes forming the center edge, as follows:
wherein Dist(·) is the shortest path distance calculation function, and min(·) is the minimum function. The existence time feature is represented by the time span for which the center edge of the subgraph containing the node has existed, as follows:
wherein $t_{start}$ is the first time point at which the center edge of the subgraph appeared, and t is the current time point at which the feature is calculated.
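The three time-space independent features can be sketched as follows. The formulas themselves are not reproduced in the text, so several details here are assumptions: PageRank is computed by plain power iteration with damping 0.85, the existence time is taken as t minus t_start, and all function names are hypothetical.

```python
from collections import deque

def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected graph given as {node: set}."""
    nodes = sorted(adj)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / len(nodes) for v in nodes}
        for u in nodes:
            if adj[u]:
                share = damping * rank[u] / len(adj[u])
                for w in adj[u]:
                    new[w] += share
            else:  # isolated node: spread its rank uniformly
                for w in nodes:
                    new[w] += damping * rank[u] / len(nodes)
        rank = new
    return rank

def shortest_distance(adj, source, target):
    """Breadth-first shortest path length; None if target is unreachable."""
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def independent_features(adj, node, center_edge, t, t_start):
    """Global spatial, local spatial and existence time features of one node."""
    v1, v2 = center_edge
    dists = [d for d in (shortest_distance(adj, node, v1),
                         shortest_distance(adj, node, v2)) if d is not None]
    return {
        "global": pagerank(adj)[node],   # PageRank value in the snapshot
        "local": min(dists),             # min distance to the center-edge nodes
        "existence": t - t_start,        # assumed form of the time span
    }

path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
feats = independent_features(path, "C", ("A", "B"), t=5, t_start=3)
```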
2.3 Time-space coupled feature set
The time-space coupled feature set consists of three node-oriented space-time evolution features: the distance change feature, the interaction change feature, and the co-neighbor change feature. The distance change feature is represented by the change over time of the distance between the two nodes forming the center edge of the subgraph containing the node, as follows:
wherein Dist(·) is the shortest path distance calculation function, used to calculate the shortest distance between the two nodes forming the edge at time point t-Δt, whereas at the current time point t the distance is 1 because the edge exists. The interaction change feature is represented by the change over time of the degrees of the nodes forming the center edge of the subgraph containing the node, as follows:
wherein Deg(·) is the degree calculation function, used to calculate the degrees of the nodes forming the edge at the different time points. Finally, the co-neighbor change feature is represented by the change over time of the number of common neighbors of the nodes forming the center edge of the subgraph containing the node, as follows:
wherein v is a node in the intersection of the neighbor node sets of the two nodes forming the edge. In the above feature calculations the value of Δt may be varied; in this patent Δt is set to 1, i.e. the change between the previous time point and the current one is considered.
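A hedged sketch of the three coupled features is given below. Because the formulas are not reproduced in the text, the exact forms are assumptions: each change is taken as a simple difference between the two snapshots, the degrees of the two center nodes are summed, and an unreachable pair at t-Δt is given a hypothetical fallback distance `far`.

```python
from collections import deque

def bfs_distance(adj, source, target):
    """Breadth-first shortest path length; None if target is unreachable."""
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def coupled_features(adj_prev, adj_now, center_edge, far=10):
    """Distance, interaction and co-neighbor changes between t - dt and t."""
    v1, v2 = center_edge

    def degree_sum(adj):
        return len(adj.get(v1, ())) + len(adj.get(v2, ()))

    def common_count(adj):
        return len(adj.get(v1, set()) & adj.get(v2, set()))

    d_prev = bfs_distance(adj_prev, v1, v2)
    if d_prev is None:
        d_prev = far  # fallback for a disconnected pair (assumption)
    return {
        # The distance at time t is 1 because the center edge exists.
        "distance_change": d_prev - 1,
        "interaction_change": degree_sum(adj_now) - degree_sum(adj_prev),
        "co_neighbor_change": common_count(adj_now) - common_count(adj_prev),
    }

prev = {"A": {"C"}, "B": {"D"}, "C": {"A", "D"}, "D": {"B", "C"}}
now = {"A": {"B", "C", "D"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"A", "B", "C"}}
change = coupled_features(prev, now, ("A", "B"))
```

In the toy data the pair A, B was three hops apart at the previous time point and gains a direct edge and a shared neighbor D at the current one.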
2.4 Node feature projection and aggregation
Each feature in the above time-space independent feature set and time-space coupled feature set is projected into a feature vector space by learnable parameters and then aggregated to obtain the final node feature vector, expressed as follows:
wherein $W_g$, $W_l$, $W_t$ are the weight parameters projecting the time-space independent features, and $W_d$, $W_i$, $W_n$ are the weight parameters projecting the time-space coupled features.
3. Self-attention network space-time characterization calculation stage
A dynamic graph node sequence is constructed for each edge; a self-attention network deep learning model incorporating space-time two-dimensional position coding information performs depth characterization of the input node feature sequence, and the characterization results are aggregated over the sequence to obtain the feature vector of the edge.
This stage consists of three sub-stages: input sequence construction, two-dimensional position coding, and edge feature coding.
3.1 Input sequence construction
For an input dynamic graph sample and the subgraph containing it, the nodes are arranged in time order to form the input sequence, expressed as follows:
wherein ∪ is the concatenation operator, C is the number of sampled neighbor nodes, T is the number of timestamps, and the total length of the input sequence is (C+2) × T.
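The sequence layout can be sketched as below: for each of the T timestamps, the two center nodes are followed by C neighbor slots. How neighbors are chosen when a snapshot has more or fewer than C of them is not stated, so the truncation, the `None` padding and the alphabetical ordering are assumptions of this sketch.

```python
def node_sequence(snapshot_neighbours, center_edge, c):
    """Concatenate per-snapshot node lists into one sequence of length (c + 2) * T.

    snapshot_neighbours: one set per timestamp holding the neighbours of the
    center edge's nodes in that snapshot.
    """
    v1, v2 = center_edge
    sequence = []
    for t, neighbours in enumerate(snapshot_neighbours):
        picked = sorted(neighbours - {v1, v2})[:c]   # truncate to c (assumption)
        picked += [None] * (c - len(picked))         # pad short snapshots (assumption)
        sequence.extend((node, t) for node in [v1, v2] + picked)
    return sequence

per_snapshot = [{"C"}, {"C", "D"}, {"D", "E", "F"}]    # T = 3 toy snapshots
seq = node_sequence(per_snapshot, ("A", "B"), c=2)     # length (2 + 2) * 3 = 12
```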
3.2 Two-dimensional position coding
Each node in the dynamic graph input sequence represents its position in the dynamic graph by a combined encoding of its positions in the two dimensions of space and time.
Spatial position information: the spatial position of a node is obtained by combining its absolute position in the graph with its relative position in the subgraph. The absolute position of a node is represented by a vector obtained from higher-order random walk operations, as follows:
wherein $RW = AD^{-1}$ is the random walk matrix, $A$ is the adjacency matrix of the graph, $D^{-1}$ is the inverse of the degree matrix of the graph, $RW_{kk}$ is the entry in the k-th row and k-th column of the random walk matrix, and its superscript denotes a power.
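As an illustration, the sketch below builds $RW = AD^{-1}$ and collects the diagonal entries of its successive powers for each node. Reading the superscript as a matrix power, and the number of powers `steps`, are assumptions; the formula itself is not reproduced in the text.

```python
def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def random_walk_encoding(adjacency, steps):
    """Return, for each node k, the vector [(RW^1)_kk, ..., (RW^steps)_kk]."""
    n = len(adjacency)
    degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    # RW = A D^{-1}: column j of A is divided by the degree of node j.
    rw = [[adjacency[i][j] / degree[j] if degree[j] else 0.0 for j in range(n)]
          for i in range(n)]
    power = rw
    encodings = [[] for _ in range(n)]
    for _ in range(steps):
        for k in range(n):
            encodings[k].append(power[k][k])
        power = matmul(power, rw)
    return encodings

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
pe_abs = random_walk_encoding(triangle, steps=3)
```

Each entry is the probability that a random walk starting at node k is back at k after the given number of steps, so structurally different nodes receive different vectors.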
In addition, the relative position of a node is determined by its positional relation to the nodes of the center edge in the subgraph, as follows:
if the node is one of the nodes forming the center edge, its position is 0; if the node is a common neighbor of the two nodes forming the center edge, its position is 1; otherwise, its position is 2.
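The three-way rule translates directly into code; the adjacency-dictionary input and the function name are assumptions of this sketch.

```python
def relative_position(node, center_edge, adj):
    """0 for a center-edge node, 1 for a common neighbour of both, 2 otherwise."""
    v1, v2 = center_edge
    if node in (v1, v2):
        return 0
    if node in adj.get(v1, set()) and node in adj.get(v2, set()):
        return 1
    return 2

adj = {"A": {"B", "C", "D"}, "B": {"A", "D"}, "C": {"A"}, "D": {"A", "B"}}
positions = {n: relative_position(n, ("A", "B"), adj) for n in sorted(adj)}
```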
Time position information: the time position of a node is the current timestamp number of the graph containing it, as follows:
wherein t is the current timestamp number of the graph containing the node.
Two-dimensional position combined encoding: the space-time two-dimensional position coding information of a node is obtained by concatenating the sum of its absolute and relative spatial position projections with its time position projection, as follows:
wherein the concatenation symbol is the vector concatenation operator, $W_{abs}$ and $W_{rel}$ are the projection matrices of the absolute and relative positions, with dimensions 1×d/2 and m×d/2 respectively, $W_{tmp}$ is the projection matrix of the time position, with dimension 1×d/2, and d is the vector dimension of the whole position code.
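The combination rule (sum of the two spatial projections, concatenated with the time projection) can be sketched as follows. The matrices here are fixed toy values standing in for learnable parameters, and the sketch gives the m×d/2 shape to whichever projection receives the m-dimensional random-walk vector; that shape assignment, like the function names, is an assumption.

```python
def project(vector, matrix):
    """Row vector (length r) times a matrix with r rows."""
    return [sum(vector[i] * matrix[i][j] for i in range(len(vector)))
            for j in range(len(matrix[0]))]

def position_encoding(abs_vec, rel_pos, timestamp, w_abs, w_rel, w_tmp):
    """Concatenate (abs projection + rel projection) with the time projection."""
    spatial = [a + r for a, r in zip(project(abs_vec, w_abs),
                                     project([rel_pos], w_rel))]
    temporal = project([timestamp], w_tmp)
    return spatial + temporal  # list concatenation: d/2 + d/2 = d entries

# Toy parameters for d = 4 and a 2-step random-walk vector (m = 2).
w_abs = [[1.0, 0.0], [0.0, 1.0]]   # m x d/2
w_rel = [[0.5, 0.5]]               # 1 x d/2
w_tmp = [[0.1, 0.2]]               # 1 x d/2
pe = position_encoding([0.0, 0.5], rel_pos=2, timestamp=3,
                       w_abs=w_abs, w_rel=w_rel, w_tmp=w_tmp)
```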
3.3 Edge feature coding
This sub-stage consists of three parts: input sequence feature calculation, self-attention network characterization calculation, and edge feature aggregation calculation.
3.3.1 Input sequence feature calculation
The input sequence feature is obtained by summing, for each node in the input sequence, its node input feature and its position coding feature, expressed as follows:
wherein: $(\cdot)^{T}$ is the matrix transpose operator.
3.3.2 Self-attention network characterization calculation
First, following the self-attention layer of the Transformer model, a multi-layer self-attention network performs depth characterization of the input sequence feature, expressed as follows:
wherein Softmax(·) is the Softmax function, l is the index of the current network layer, $l \in [1, P]$, and P is the total number of layers of the multi-layer self-attention network. When l = 1, the layer input is the initial input feature sequence.
Next, a layer normalization operation and a feed-forward network calculation are applied to the result, expressed as follows:
$H^{(l)} = \mathrm{LN}(A^{(l)} + Q^{(l)})$,
where LN(·) is the layer normalization operation and FFN(·) is the feed-forward network calculation.
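One layer of this computation can be sketched in plain Python as below. Only $H^{(l)} = \mathrm{LN}(A^{(l)} + Q^{(l)})$ is taken from the text; the scaled dot-product form of the attention, the single head, the ReLU feed-forward network, the final residual with a second layer normalization, the absence of learnable normalization gains, and all weight names are assumptions borrowed from the standard Transformer layer.

```python
import math

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(x, eps=1e-5):
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def attention_layer(x, w_q, w_k, w_v, w_1, w_2):
    """One self-attention layer over a sequence x (one row per node)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)                                   # Q K^T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    a = matmul(weights, v)                                    # attention output A
    h = layer_norm(add(a, q))                                 # H = LN(A + Q), from the text
    hidden = [[max(0.0, u) for u in row] for row in matmul(h, w_1)]  # ReLU (assumed)
    return layer_norm(add(matmul(hidden, w_2), h))            # assumed FFN residual

eye = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy sequence of 3 nodes, dimension 2
out = attention_layer(x, eye, eye, eye, eye, eye)
```

Stacking P such layers, each fed with the previous layer's output, gives the multi-layer network described above.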
3.3.3 Edge feature aggregation calculation
A mean aggregation operation over the depth characterization result of the input sequence gives the dynamic graph feature of the center edge corresponding to the input sequence, expressed as follows:
wherein: the n-th term of the mean is the characterization vector, within the depth representation vector, of the n-th node of the dynamic graph node sequence.
4. Combination relation anomaly score and error calculation stage
A multi-layer perceptron deep learning model converts the feature vector of each edge into an anomaly score; the error loss between the anomaly scores and the labels is calculated using the positive and negative sample labels obtained by sampling in stage 1, and the learnable parameters in the above stages are updated by back-propagation.
This stage consists of three sub-stages: edge anomaly score calculation, error loss calculation, and model parameter optimization.
4.1 Edge anomaly score calculation
The feature vector of the edge is converted by a multi-layer perceptron model into an anomaly score, which represents the degree of anomaly of the combination relation represented by the edge, as follows:
wherein Sigmoid(·) is the Sigmoid function, and MLP(·) is the multi-layer perceptron model.
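The scoring step, Sigmoid applied to the MLP output for an edge vector, can be sketched as follows. The mean pooling that produces the edge vector, the single hidden layer with ReLU, and the toy weights are assumptions made for illustration.

```python
import math

def mean_pool(rows):
    """Average the per-node vectors of a sequence into one edge vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def anomaly_score(edge_vector, w1, b1, w2, b2):
    """Sigmoid(MLP(edge_vector)) with one hidden ReLU layer (assumed depth)."""
    hidden = [max(0.0, sum(x * w for x, w in zip(edge_vector, col)) + b)
              for col, b in zip(zip(*w1), b1)]
    logit = sum(h * w for h, w in zip(hidden, w2)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

node_vectors = [[1.0, 3.0], [3.0, 5.0]]     # toy depth representation, 2 nodes
edge_vector = mean_pool(node_vectors)       # [2.0, 4.0]
score = anomaly_score(edge_vector,
                      w1=[[1.0, 0.0], [0.0, -1.0]], b1=[0.0, 0.0],
                      w2=[0.5, 1.0], b2=-1.0)
```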
4.2 Error loss calculation
The error loss between the anomaly scores and the labels is calculated using the pre-sampled positive and negative sample labels, expressed as follows:
where N is the total number of samples, and the two score terms are the anomaly scores of the positive and negative samples, respectively.
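Since the loss formula is not reproduced in the text, the sketch below assumes a binary cross-entropy averaged over all N samples, with positive samples (generated absent edges) pushed toward a score of 1 and negative samples (existing edges) pushed toward 0. This is one plausible reading, not a confirmed reconstruction of the patent's expression.

```python
import math

def error_loss(pos_scores, neg_scores, eps=1e-12):
    """Assumed binary cross-entropy over positive and negative anomaly scores."""
    n = len(pos_scores) + len(neg_scores)
    total = sum(math.log(max(s, eps)) for s in pos_scores)
    total += sum(math.log(max(1.0 - s, eps)) for s in neg_scores)
    return -total / n

uninformed = error_loss([0.5, 0.5], [0.5, 0.5])   # equals ln 2
confident = error_loss([0.9, 0.8], [0.1, 0.2])    # lower loss for better scores
```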
4.3 Model parameter optimization
Based on the error loss obtained, the learnable parameters in the above stages are updated by back-propagation, and an optimized model is obtained over multiple iterations; this model is used to identify technical point combinations in emerging technical fields.
In this embodiment, AUC (area under the ROC curve) and AP (average precision) are used to evaluate the model, and tests are carried out on the two self-constructed datasets of artificial intelligence patents and medical projects. The generated positive samples number 10% of the negative samples, training runs for 300 epochs, the feature vector dimension is set to 32, the number of self-attention network layers is set to 2, and the learning rate is set to 0.001. The test results obtained are as follows:
The models compared with the present method are state-of-the-art deep learning models for the dynamic graph anomaly detection task, including AddGraph (https://www.ijcai.org/proceedings/2019/614), StrGNN (https://dl.acm.org/doi/10.1145/3459637.3481955) and TADDY (https://ieeexplore.ieee.org/document/9599560/). It can be seen that the present method improves noticeably on the existing models.
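For reference, the two evaluation metrics named above can be computed as in the sketch below. The pairwise AUC and the rank-based AP are standard definitions; the toy labels and scores are made up and are not results from the patent, and ties in AP are broken by input order, which may differ from library implementations.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits

toy_labels = [1, 0, 1, 0, 0]              # 1 = positive (anomalous) sample
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2]    # hypothetical anomaly scores
auc = roc_auc(toy_labels, toy_scores)
ap = average_precision(toy_labels, toy_scores)
```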
5. Candidate emerging technical field combination output stage
The optimized model parameters are obtained through multi-step iterative learning; the anomaly scores that the model outputs for the technical combination relations are sorted in descending order, the top K highest-scoring technical combinations are taken, and the emerging technical fields are then obtained through manual verification for use in downstream tasks.
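The descending sort and top-K selection can be sketched as follows; the field names and scores are placeholders, not outputs of the patented model.

```python
import heapq

def top_k_candidates(edge_scores, k):
    """Return the k (edge, score) pairs with the highest anomaly scores, best first."""
    return heapq.nlargest(k, edge_scores.items(), key=lambda item: item[1])

# Hypothetical scores for edges of the latest snapshot.
edge_scores = {
    ("field_a", "field_b"): 0.91,
    ("field_a", "field_c"): 0.15,
    ("field_b", "field_d"): 0.78,
    ("field_c", "field_d"): 0.42,
}
candidates = top_k_candidates(edge_scores, k=2)
```

The returned pairs would then be passed to manual verification rather than accepted directly.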
With the optimized model, anomaly detection can be performed in the application stage on the artificial intelligence patent classification dynamic graph and the cancer medical project keyword dynamic graph respectively; the K technical combinations with the highest anomaly scores are output as emerging technology candidate fields, and the final emerging technical fields are then obtained through manual verification for use in downstream application scenarios. Taking K = 10, the resulting candidate emerging technology combinations are as follows:
From the analysis of the results, some of the technical fields represented by the identified technical combinations have already emerged and are popular, such as Knowledge Reasoning + Natural Language Query, Vehicle Adapting Control + Visual or Acoustic Aids, and Cancer Intervention and Surveillance + Automation; these results indicate that the method helps identify hot research fields. Other technical combinations lie in entirely new technical fields, such as Brain Neoplasms + Cell Linear and Androgen Receptor + Biological Markers; these results indicate that the method helps suggest technical fields that may lead future research directions.
The above embodiment is only a preferred embodiment of the present invention and is not intended to limit the present invention. Various changes and modifications may be made by one of ordinary skill in the relevant art without departing from the spirit and scope of the present invention. Therefore, all technical solutions obtained by equivalent substitution or equivalent transformation fall within the protection scope of the present invention.

Claims (8)

1. An emerging technology identification method based on dynamic graph anomaly detection is characterized by comprising the following steps:
s1, constructing technical text data into a technical dynamic graph, wherein graph nodes are technical fields, edges are co-occurrence relations among the technical fields, and time stamps are dates of technical text disclosure; taking each edge in the technical dynamic graph as a center edge, and extracting neighbor subgraphs corresponding to each edge through subgraph sampling; the node set of the neighbor subgraph comprises two nodes forming a center edge and all first-order neighbor nodes of the two nodes, and the edges in the neighbor subgraph are edges among all nodes in the subgraph;
S2, for the neighbor subgraph corresponding to each edge in the technical dynamic graph, calculating multi-level node features of each node in the neighbor subgraph, the multi-level node features comprising a time-space independent feature set and a time-space coupled feature set, projecting the multi-level node features into a feature space using weight parameters, and obtaining the space-time feature vector corresponding to each node through aggregation;
S3, splicing the node sets of each neighbor subgraph in the technical dynamic graph in time order to form a dynamic graph node sequence, and fusing the space-time feature vector of each node in the dynamic graph node sequence with space-time two-dimensional position coding information to obtain the fusion feature of each node; inputting the fusion features of the nodes in the dynamic graph node sequence into a self-attention network deep learning model for depth characterization calculation, and aggregating the depth representation vectors of all nodes in the dynamic graph node sequence to obtain the depth representation vector corresponding to the center edge of each neighbor subgraph;
S4, inputting the depth representation vector of each edge in the latest snapshot of the technical dynamic graph into a multi-layer perceptron deep learning model, converting the depth representation vector of each edge into a corresponding anomaly score, using the anomaly score as the screening criterion, ranking the edges in the latest snapshot of the technical dynamic graph by anomaly score from high to low and selecting a number of top-ranked edges, wherein the combination of the two co-occurring technical fields corresponding to each selected edge is an emerging technology candidate field;
in S1, during subgraph sampling, for the two nodes forming each edge in the technical dynamic graph, all their neighbor nodes are selected, and these nodes and the edges among them form the neighbor subgraph corresponding to the edge; the k-th node of the neighbor subgraph at time t is expressed as follows:
wherein the two sets in the expression are the first-order neighbor node sets of the two nodes forming the edge, respectively;
in S2, for the neighbor subgraph corresponding to each edge in the technical dynamic graph, the space-time feature vector corresponding to each node is calculated as follows:
S21, calculating a time-space independent feature set consisting of a global spatial feature, a local spatial feature and an existence time feature, wherein:
the global spatial feature is represented by the PageRank value of the node in the global graph, calculated as follows:
wherein: $S_t$ is the snapshot of the global technical dynamic graph at time t, and PageRank(·) is the PageRank value calculation function;
the local spatial feature is represented by the minimum distance from the node to the nodes forming the center edge, calculated as follows:
wherein: Dist(·) is the shortest path distance calculation function, and min(·) is the minimum function;
the existence time feature is represented by the time span for which the center edge of the subgraph containing the node has existed, calculated as follows:
wherein: $t_{start}$ is the first time point at which the center edge of the subgraph was generated;
S22, calculating a time-space coupled feature set consisting of a distance change feature, an interaction change feature and a co-neighbor change feature, wherein:
the distance change feature is represented by the change over time of the distance between the two nodes forming the center edge of the subgraph containing the node, calculated as follows:
wherein: Dist(·) is the shortest path distance calculation function, used to calculate the shortest distance between the two nodes forming the edge at time point t-Δt; Δt is the time step over which the feature change is considered;
the interaction change feature is represented by the change over time of the degrees of the nodes forming the center edge of the subgraph containing the node, calculated as follows:
wherein: Deg(·) is the degree calculation function, used to calculate the degrees of the nodes forming the center edge on the snapshots of the technical dynamic graph at the different time points;
the co-neighbor change feature is represented by the change over time of the number of common neighbors of the nodes forming the center edge of the subgraph containing the node, calculated as follows:
wherein: v is a node in the intersection of the neighbor node sets of the two nodes forming the edge;
S23, for any node in the neighbor subgraph, projecting each feature in the time-space independent feature set and the time-space coupled feature set into a feature vector space by learnable weight parameters, and then aggregating to obtain the space-time feature vector corresponding to the node, calculated as follows:
wherein: $W_g$, $W_l$, $W_t$ are the learnable weight parameters projecting the time-space independent features, and $W_d$, $W_i$, $W_n$ are the learnable weight parameters projecting the time-space coupled features.
2. The emerging technology identification method based on dynamic graph anomaly detection according to claim 1, wherein the technical text is a patent document, and in the constructed patent technical dynamic graph, nodes are patent CPC classification codes, edges are combination relations among the first three CPC codes of the patent document, and the timestamp is the patent publication date.
3. The method for identifying emerging technologies based on anomaly detection of dynamic graph according to claim 1, wherein the technical text is a project text, in the constructed project technical dynamic graph, nodes are project technical keywords, edges are combination relations among the first five keywords related to the project document, and a timestamp is a project publication date.
4. The method for identifying an emerging technology based on dynamic graph anomaly detection according to claim 1, wherein in S3, the method for obtaining the depth representation vector corresponding to the center edge of each neighbor subgraph is as follows:
S31, splicing the node sets of each neighbor subgraph in the different snapshots of the technical dynamic graph in time order to form a dynamic graph node sequence with a total length of (C+2) × T:
wherein: ∪ is the concatenation operator, C is the number of all neighbor nodes of the two nodes forming the center edge, the associated set contains all of these neighbor nodes, and T is the number of timestamps contained in the technical dynamic graph, i.e. the total number of snapshots;
S32, for each node in the dynamic graph node sequence, summing its absolute spatial position projection and its relative spatial position projection, and concatenating the sum with its time position projection to obtain the space-time two-dimensional position coding information of the node, calculated as follows:
wherein: the concatenation symbol denotes the vector concatenation operator, and $W_{abs}$, $W_{rel}$ and $W_{tmp}$ are three learnable projection matrices;
the absolute spatial position encoding of the node is calculated as follows:
wherein: $RW = AD^{-1}$ is the random walk matrix, $A$ is the adjacency matrix of the technical dynamic graph, and $D^{-1}$ is the inverse of the degree matrix of the technical dynamic graph; $RW_{kk}$ is the entry in the k-th row and k-th column of the random walk matrix, and the superscript of $RW_{kk}$ denotes a power;
the relative spatial position encoding of the node is calculated as follows:
the time position encoding of the node is calculated as follows:
S33, fusing the space-time feature vector of each node in the dynamic graph node sequence with its space-time two-dimensional position coding information, and assembling the fused results into the input feature sequence of the model:
wherein: $(\cdot)^{T}$ is the matrix transpose operator;
S34, inputting the input feature sequence into a multi-layer self-attention network with a total of P layers, and performing depth characterization of the input feature sequence through the multi-layer self-attention mechanism, wherein the depth characterization in any l-th layer of the self-attention network is as follows:
first, the attention weight $A^{(l)}$ is calculated as follows:
wherein: Softmax(·) is the Softmax function, l is the index of the current network layer, and $l \in [1, P]$; when l = 1, the layer input is the initial input feature sequence;
then, a layer normalization operation and a feed-forward network calculation are applied to the result to obtain the depth representation vector output by the current network layer, calculated as follows:
$H^{(l)} = \mathrm{LN}(A^{(l)} + Q^{(l)})$,
wherein: LN(·) is the layer normalization operation, and FFN(·) is the feed-forward network calculation;
S35, performing a mean aggregation operation on the depth representation vector output by the last self-attention layer to obtain the dynamic graph feature of the center edge, expressed as follows:
wherein: the n-th term of the mean is the characterization vector, within the depth representation vector, of the n-th node of the dynamic graph node sequence, and $L = (C+2) \times T$.
5. The method for identifying emerging technologies based on dynamic graph anomaly detection according to claim 4, wherein in S4, the depth representation vector of each edge in the latest snapshot of the technical dynamic graph is converted into the corresponding anomaly score by the following expression:
wherein Sigmoid(·) is the Sigmoid function, and MLP(·) is the multi-layer perceptron model.
6. The method for identifying emerging technologies based on dynamic graph anomaly detection according to claim 4, wherein in S4, the screened emerging technology candidate fields are sent to a manual review end for review, and the final emerging technical fields are generated in combination with the manual review results.
7. The method for identifying emerging technologies based on dynamic graph anomaly detection according to claim 1, wherein, before the emerging technology identification framework formed by S1 to S4 is used for actual inference, the learnable parameters of each network layer are optimized in a training stage using pre-constructed positive and negative samples.
8. The method for identifying emerging technologies based on dynamic graph anomaly detection according to claim 7, wherein the error loss employed in training the emerging technology identification framework is expressed as follows:
where N is the total number of samples, and the two score terms are the anomaly scores of the positive and negative samples, respectively.
CN202310517066.7A 2023-05-09 2023-05-09 Emerging technology identification method based on dynamic graph anomaly detection Active CN116561688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310517066.7A CN116561688B (en) 2023-05-09 2023-05-09 Emerging technology identification method based on dynamic graph anomaly detection

Publications (2)

Publication Number Publication Date
CN116561688A (en) 2023-08-08
CN116561688B (en) 2024-03-22

Family

ID=87501304

Country Status (1)

Country Link
CN (1) CN116561688B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020033404A1 (en) * 2018-08-07 2020-02-13 Triad National Security, Llc Modeling anomalousness of new subgraphs observed locally in a dynamic graph based on subgraph attributes
WO2021179838A1 (en) * 2020-03-10 2021-09-16 支付宝(杭州)信息技术有限公司 Prediction method and system based on heterogeneous graph neural network model
CN113868474A (en) * 2021-09-02 2021-12-31 子亥科技(成都)有限公司 Information cascade prediction method based on self-attention mechanism and dynamic graph
CN114120652A (en) * 2021-12-21 2022-03-01 重庆邮电大学 Traffic flow prediction method based on dynamic graph neural network
CN114118375A (en) * 2021-11-29 2022-03-01 吉林大学 Continuous dynamic network characterization learning method based on time sequence diagram Transformer
CN114817571A (en) * 2022-05-16 2022-07-29 浙江大学 Method, medium, and apparatus for predicting achievement quoted amount based on dynamic knowledge graph
CN115017368A (en) * 2022-04-28 2022-09-06 北京交通大学 Dynamic graph representation learning method based on self-supervision learning
CN115208680A (en) * 2022-07-21 2022-10-18 中国科学院大学 Dynamic network risk prediction method based on graph neural network
CN115293247A (en) * 2022-07-21 2022-11-04 支付宝(杭州)信息技术有限公司 Method for establishing risk identification model, risk identification method and corresponding device
CN115512133A (en) * 2022-10-13 2022-12-23 深圳市检验检疫科学研究院 Exception detection method and system for import-export behavior dynamic graph data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10361926B2 (en) * 2017-03-03 2019-07-23 Nec Corporation Link prediction with spatial and temporal consistency in dynamic networks
US11522881B2 (en) * 2019-08-28 2022-12-06 Nec Corporation Structural graph neural networks for suspicious event detection
US20220019887A1 (en) * 2020-07-14 2022-01-20 International Business Machines Corporation Anomaly detection in network topology

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AddGraph: Anomaly Detection in Dynamic Graph Using Attention-based Temporal GCN; Zheng Li; Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence; 4419-4425 *
Structural Temporal Graph Neural Networks for Anomaly Detection in Dynamic Graphs; Lei Cai; CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management; 2021, 3747-3756 *
Motif-Level Anomaly Detection in Dynamic Graphs; Zirui Yuan; IEEE Transactions on Information Forensics and Security; 2870-2882 *
Multi-parameter time series anomaly detection model based on dynamic graph structure learning; 汪增辉; 信息与电脑(理论版); 112-116 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant