CN112398819A - Method and device for recognizing abnormality - Google Patents
- Publication number
- CN112398819A (application CN202011206145.9A)
- Authority
- CN
- China
- Prior art keywords
- node
- abnormal
- relationship
- community
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to a method and a device for anomaly identification and belongs to the field of data processing. The method comprises: generating a node relationship graph according to each piece of relationship point pair data in a time period, wherein each piece of relationship point pair data comprises a first entity and a second entity between which an action relationship exists, attribute information of the action relationship, and an occurrence time, the first entity and the second entity are two different nodes in the node relationship graph, and the two nodes are connected by an edge representing the action relationship; acquiring at least one abnormal description feature of each node in the node relationship graph according to the node relationship graph and each piece of relationship point pair data, wherein the at least one abnormal description feature of each node is used to measure the degree to which the node is an abnormal node; and identifying abnormal nodes among the nodes according to the at least one abnormal description feature of each node. Data security can thereby be improved.
Description
Technical Field
The present application relates to the field of data processing, and in particular, to a method and an apparatus for identifying an anomaly.
Background
With the development of information technology, data volume is growing exponentially, and data security is increasingly under threat. In the field of risk and anomaly control, organized abuse is particularly damaging to data security, for example fraud group networks and bonus-hunting ("wool pulling") groups.
To improve data security, the data generated in a network needs to be analyzed to identify abnormal data nodes, so that security processing can be performed based on those nodes. Identifying abnormal data nodes is therefore a current need.
Disclosure of Invention
In order to improve data security, the embodiments of the application provide a method and a device for anomaly identification. The technical solution is as follows:
In one aspect, the present application provides a method for anomaly identification, including:
generating a node relationship graph according to each piece of relationship point pair data in a time period, wherein each piece of relationship point pair data comprises a first entity and a second entity between which an action relationship exists, attribute information of the action relationship, and an occurrence time, the first entity and the second entity are two different nodes in the node relationship graph, and the two nodes are connected by an edge representing the action relationship;
acquiring at least one abnormal description feature of each node in the node relationship graph according to the node relationship graph, wherein the at least one abnormal description feature of each node is used to measure the degree to which the node is an abnormal node;
and identifying abnormal nodes among the nodes according to the at least one abnormal description feature of each node.
Optionally, the at least one abnormal description feature comprises one or more of a basic feature, a structural feature, an aggregation feature, a community feature, and an unsupervised abnormal feature;
the basic feature of a node describes the behavior of the node; the structural feature of the node describes the position of the node in the node relationship graph and/or the relationship between the node and its neighbor nodes; the aggregation feature of the node is an aggregated representation of the neighbor nodes around the node; the community feature of the node describes the attributes of the community to which the node belongs; and the unsupervised abnormal feature of the node describes the abnormal information carried by the node.
Optionally, obtaining the structural feature of each node in the node relationship graph according to the node relationship graph includes:
identifying, in the node relationship graph, preset triple structures including a first node and preset quadruple structures including the first node, wherein the first node is any node in the node relationship graph, and the structural feature of the first node includes the number of the preset triple structures and the number of the preset quadruple structures.
Optionally, obtaining the aggregation characteristic of each node in the node relationship graph according to the node relationship graph includes:
identifying each first-order neighbor node of a first node from the node relationship graph, wherein the first node is any one node in the node relationship graph, and the first node is connected with each first-order neighbor node through one edge;
and acquiring the aggregation characteristics of the first node through an aggregation function according to the information of the first node and the information of each first-order neighbor node.
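As an illustration only, the aggregation step above can be sketched with a mean aggregator. The text does not say which aggregation function is used, so the mean is an assumption, as are the function name `aggregate_mean`, the `features` mapping, and the adjacency mapping `adj`.

```python
def aggregate_mean(features, adj, v):
    """Average the feature vectors of node v and its first-order neighbors.

    features: node -> list of numbers (hypothetical per-node information)
    adj:      node -> set of first-order neighbor nodes
    The mean is an assumed aggregation function; the source does not fix one.
    """
    members = [v] + sorted(adj[v])
    dim = len(features[v])
    return [sum(features[m][i] for m in members) / len(members) for i in range(dim)]


features = {"ID1": [1.0, 0.0], "ID2": [3.0, 2.0], "ID3": [2.0, 4.0]}
adj = {"ID1": {"ID2", "ID3"}, "ID2": {"ID1"}, "ID3": {"ID1"}}
aggregated = aggregate_mean(features, adj, "ID1")
```

Other aggregators (sum, maximum, or a learned function) would fit the same interface.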
Optionally, obtaining the community characteristics of each node in the node relationship graph according to the node relationship graph includes:
dividing the node relationship graph into a plurality of communities;
counting the number of nodes, the number of edges, and the proportion of known abnormal nodes included in a first community, and/or counting the number of second communities, a second community being a community associated with the first community, to obtain the community feature of every node in the first community.
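A minimal sketch of the community statistics follows. It assumes the graph has already been divided into communities by some method the text does not name, and it treats two communities as "associated" when at least one edge connects them, which is also an assumption. The names `community_features`, `community_of`, and `known_abnormal` are hypothetical.

```python
from collections import defaultdict


def community_features(edges, community_of, known_abnormal):
    """Compute per-community statistics shared by every node in the community.

    edges:          iterable of (u, v) node pairs
    community_of:   node -> community id (the partition is taken as given)
    known_abnormal: set of nodes already known to be abnormal
    """
    members = defaultdict(set)
    internal_edges = defaultdict(int)
    linked = defaultdict(set)  # assumed meaning of "associated" communities
    for node, community in community_of.items():
        members[community].add(node)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            internal_edges[cu] += 1
        else:
            linked[cu].add(cv)
            linked[cv].add(cu)
    return {
        c: {
            "num_nodes": len(nodes),
            "num_edges": internal_edges[c],
            "abnormal_ratio": len(nodes & known_abnormal) / len(nodes),
            "num_associated": len(linked[c]),
        }
        for c, nodes in members.items()
    }


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
community_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
stats = community_features(edges, community_of, known_abnormal={"a"})
```

Each node would then take the statistics of its own community, for example `stats[community_of["b"]]`, as its community feature.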
Optionally, identifying an abnormal node from each node according to the at least one abnormal description feature of each node, includes:
forming an abnormal feature vector for each node from the at least one abnormal description feature of that node;
replacing each node in the node relationship graph with the abnormal feature vector of that node to obtain an abnormal recognition frame graph;
and acquiring the abnormal score of each node according to the abnormal recognition frame graph, and determining the nodes whose abnormal scores meet specified conditions as abnormal nodes.
In another aspect, the present application provides an apparatus for anomaly identification, the apparatus comprising:
a generating module, configured to generate a node relationship graph according to each piece of relationship point pair data in a time period, where each piece of relationship point pair data includes a first entity and a second entity between which an action relationship exists, attribute information of the action relationship, and an occurrence time, the first entity and the second entity are two different nodes in the node relationship graph, and the two nodes are connected by an edge representing the action relationship;
an obtaining module, configured to obtain at least one abnormal description feature of each node in the node relationship graph according to the node relationship graph, where the at least one abnormal description feature of each node is used to measure the degree to which the node is an abnormal node;
and an identification module, configured to identify abnormal nodes among the nodes according to the at least one abnormal description feature of each node.
Optionally, the at least one abnormal description feature comprises one or more of a basic feature, a structural feature, an aggregation feature, a community feature, and an unsupervised abnormal feature;
the basic feature of a node describes the behavior of the node; the structural feature of the node describes the position of the node in the node relationship graph and/or the relationship between the node and its neighbor nodes; the aggregation feature of the node is an aggregated representation of the neighbor nodes around the node; the community feature of the node describes the attributes of the community to which the node belongs; and the unsupervised abnormal feature of the node describes the abnormal information carried by the node.
Optionally, the obtaining module is configured to:
identifying, in the node relationship graph, preset triple structures including a first node and preset quadruple structures including the first node, wherein the first node is any node in the node relationship graph, and the structural feature of the first node includes the number of the preset triple structures and the number of the preset quadruple structures.
Optionally, the obtaining module is configured to:
identifying each first-order neighbor node of a first node from the node relationship graph, wherein the first node is any one node in the node relationship graph, and the first node is connected with each first-order neighbor node through one edge;
and acquiring the aggregation characteristics of the first node through an aggregation function according to the information of the first node and the information of each first-order neighbor node.
Optionally, the obtaining module is configured to:
dividing the node relationship graph into a plurality of communities;
counting the number of nodes, the number of edges, and the proportion of known abnormal nodes included in a first community, and/or counting the number of second communities, a second community being a community associated with the first community, to obtain the community feature of every node in the first community.
Optionally, the identification module is configured to:
forming an abnormal feature vector for each node from the at least one abnormal description feature of that node;
replacing each node in the node relationship graph with the abnormal feature vector of that node to obtain an abnormal recognition frame graph;
and acquiring the abnormal score of each node according to the abnormal recognition frame graph, and determining the nodes whose abnormal scores meet specified conditions as abnormal nodes.
In another aspect, the present application provides an electronic device, comprising: a processor and a memory. The processor and the memory can be connected through a bus system. The memory is used for storing programs, instructions or codes, and the processor is used for executing the programs, the instructions or the codes in the memory to realize the method.
In another aspect, the present application provides a computer program product comprising a computer program stored in a computer readable storage medium and loaded by a processor to implement the above method.
In another aspect, the present application provides a non-transitory computer-readable storage medium for storing a computer program, which is loaded by a processor to execute the instructions of the method.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
A node relationship graph is generated according to each piece of relationship point pair data in a time period. Because the node relationship graph represents the relationships between each node and its surrounding nodes, at least one abnormal description feature of each node is acquired according to the node relationship graph, so that the abnormal description features of each node can be used to measure the degree to which that node is an abnormal node, and the at least one abnormal description feature can express the characteristics of the node from multiple angles. Abnormal nodes can therefore be accurately identified among the nodes according to the at least one abnormal description feature of each node, and performing data security analysis based on the abnormal nodes can improve data security.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
FIG. 1 is a system architecture diagram according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for anomaly identification provided by an embodiment of the present application;
FIG. 3 is a flow chart of another method for anomaly identification provided by embodiments of the present application;
FIG. 4 is a schematic diagram of a node relationship graph provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a triple structure and a quadruple structure provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an anomaly identification apparatus provided in the present application;
FIG. 7 is a schematic structural diagram of another anomaly identification apparatus according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
A piece of relationship point pair data is a record of a relationship between two entities. It includes a first entity and a second entity between which an action relationship exists, together with the occurrence time and attribute information of that action relationship. The attribute information may include the relationship strength and the type of the action relationship.
When an action relationship occurs between the first entity and the second entity, the server may generate a record of that action relationship, that is, one piece of relationship point pair data.
For example, the first entity and the second entity may be two different financial accounts, and the action relationship between them may be a transfer. The relationship strength of the action relationship may be the transfer amount. The type of the action relationship may be roll-out (money transferred out) or roll-in (money transferred in). The occurrence time of the action relationship is the transfer time. When a transfer takes place between the two financial accounts, the financial server may generate one piece of relationship point pair data recording it.
As another example, the first entity and the second entity may be two different social accounts, and the action relationship between them may be a communication, such as a voice, video, or text communication. The relationship strength of the action relationship may be the communication duration, the amount of data communicated, the number of messages, and the like. The type of the action relationship may be communication initiator or communication receiver. The occurrence time of the action relationship is the time at which the communication was initiated. When the two social accounts communicate, the social server may generate one piece of relationship point pair data recording the communication.
Referring to FIG. 1, the present application provides a system architecture including an electronic device and a server device, and a network connection may be established between the electronic device and the server device.
The server device may store the relationship point pair data generated at different times. The electronic device may acquire that data from the server device and process it to perform anomaly identification.
The server device may be a server or a server cluster, and may be, for example, the above-listed financial server or social server. The electronic device may be a server or a terminal, etc.
Referring to FIG. 2, an embodiment of the present application provides a method for anomaly identification, where the method is applied to the system architecture shown in FIG. 1, and the method may be executed by the electronic device in the system architecture. The method comprises the following steps:
Step 201: Generate a node relationship graph according to each piece of relationship point pair data in a time period.
Each piece of relationship point pair data comprises a first entity and a second entity between which an action relationship exists, attribute information of the action relationship, and an occurrence time. The first entity and the second entity are two different nodes in the node relationship graph, and the two nodes are connected by an edge representing the action relationship.
Step 202: Acquire at least one abnormal description feature of each node in the node relationship graph according to the node relationship graph and each piece of relationship point pair data.
The at least one abnormal description feature of each node is used to measure the degree to which the node is an abnormal node.
Step 203: Identify abnormal nodes among the nodes according to the at least one abnormal description feature of each node.
The method may acquire relationship point pair data in a plurality of consecutive time periods, perform steps 201 and 202 on the relationship point pair data of each time period to obtain the abnormal description features of each node in each time period, and then identify abnormal nodes among the nodes according to the at least one abnormal description feature of each node in each time period. Because relationship point pair data from several consecutive time periods is processed, the way the network changes over time in the actual service is taken into account, which can improve the accuracy of abnormal node identification.
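One possible way to split records into consecutive time periods before running steps 201 and 202 on each period is sketched below. The record layout `(first_entity, second_entity, occurrence_time)`, the numeric timestamps, and the names `split_into_periods`, `start`, and `period_length` are assumptions made for the illustration.

```python
from collections import defaultdict


def split_into_periods(records, start, period_length):
    """Map period index -> records whose occurrence time falls in that period.

    Period k covers [start + k * period_length, start + (k + 1) * period_length).
    """
    periods = defaultdict(list)
    for record in records:
        occurred_at = record[2]
        if occurred_at < start:
            continue  # before the first period; ignored in this sketch
        periods[(occurred_at - start) // period_length].append(record)
    return dict(periods)


records = [("ID1", "ID2", 0), ("ID1", "ID3", 5), ("ID2", "ID4", 12), ("ID1", "ID2", 25)]
by_period = split_into_periods(records, start=0, period_length=10)
```

Each value of `by_period` can then be turned into its own node relationship graph and feature set.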
In the embodiment of the application, a node relationship graph is generated according to each piece of relationship point pair data in a time period. Because the node relationship graph represents the relationships between each node and its surrounding nodes, at least one abnormal description feature of each node is acquired according to the node relationship graph, so that the abnormal description features of each node can be used to measure the degree to which that node is an abnormal node. Abnormal nodes can therefore be accurately identified among the nodes according to the at least one abnormal description feature of each node, and data security can be improved by performing data security analysis based on the identified abnormal nodes.
Referring to FIG. 3, the present application provides a method for anomaly identification, where the method is applied to the system architecture shown in FIG. 1, and the method may be executed by the electronic device in the system architecture. The method comprises the following steps:
Step 301: Acquire each piece of relationship point pair data in a time period, the length of the time period being a specified duration.
Each piece of relationship point pair data comprises a first entity, a second entity, and the occurrence time and attribute information of the action relationship. The occurrence time included in each piece of relationship point pair data lies within the time period.
In step 301, the pieces of relationship point pair data in the time period may be received as input. That is, a technician may obtain each piece of relationship point pair data in the time period and input it into the electronic device, and the electronic device accordingly obtains the data input by the technician. Alternatively, the electronic device establishes a network connection with the server device and acquires each piece of relationship point pair data in the time period from the server device.
For example, assume that five pieces of relationship point pair data are acquired. The first piece includes the first entity "ID1" and the second entity "ID2"; the type of the action relationship is roll-out, the amount that the first entity "ID1" rolls out to the second entity "ID2" is "100", and the roll-out time is t1. The second piece includes the first entity "ID1" and the second entity "ID2"; the type of the action relationship is roll-in, the amount that the second entity "ID2" transfers to the first entity "ID1" is "150", and the roll-in time is t2. The third piece includes the first entity "ID1" and the second entity "ID3"; the type of the action relationship is roll-out, the amount that the first entity "ID1" rolls out to the second entity "ID3" is "100", and the roll-out time is t3. The fourth piece includes the first entity "ID1" and the second entity "ID2"; the type of the action relationship is roll-out, the amount that the first entity "ID1" rolls out to the second entity "ID2" is "20", and the roll-out time is t4. The fifth piece includes the first entity "ID2" and the second entity "ID4"; the type of the action relationship is roll-out, the amount that the first entity "ID2" rolls out to the second entity "ID4" is "100", and the roll-out time is t5.
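The five example records could be held in a small record type like the one sketched here. The class name `RelationRecord` and its field names are hypothetical, the integers 1 to 5 merely stand in for the symbolic times t1 to t5, and the second record's type is written as "roll-in" because the money moves from ID2 to ID1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationRecord:
    first: str       # first entity
    second: str      # second entity
    kind: str        # type of the action relationship: "roll-out" or "roll-in"
    strength: float  # relationship strength, here the transferred amount
    time: int        # occurrence time; 1..5 are placeholders for t1..t5


records = [
    RelationRecord("ID1", "ID2", "roll-out", 100, 1),
    RelationRecord("ID1", "ID2", "roll-in", 150, 2),
    RelationRecord("ID1", "ID3", "roll-out", 100, 3),
    RelationRecord("ID1", "ID2", "roll-out", 20, 4),
    RelationRecord("ID2", "ID4", "roll-out", 100, 5),
]
```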
Step 302: Establish a node relationship graph according to each piece of relationship point pair data in the time period.
In the node relationship graph, the first entity and the second entity included in each piece of relationship point pair data are two different nodes, and the two nodes are connected by an edge representing the action relationship between the first entity and the second entity. The edge has a direction that corresponds to the type of the action relationship.
For example, take the entities to be financial accounts, that is, the first entity and the second entity are two different financial accounts. If the type of the action relationship between the first entity and the second entity is roll-out, that is, the first entity transfers money out to the second entity, the edge between them points from the first entity to the second entity. If the type of the action relationship is roll-in, that is, the second entity transfers money to the first entity, the edge between them points from the second entity to the first entity.
In step 302, for each piece of relationship point pair data in the time period, if the node relationship graph does not contain a node corresponding to the first entity or a node corresponding to the second entity included in that piece of data, the missing node or nodes are established in the node relationship graph, and an edge corresponding to the type of the action relationship between the first entity and the second entity is added between the two nodes. If the node relationship graph already contains both nodes, it is determined whether an edge corresponding to the type exists between the two nodes; if no such edge exists, an edge corresponding to the type is added between the two nodes, and if such an edge exists, no operation needs to be performed for that piece of relationship point pair data.
As can be seen from the above description of establishing the node relationship graph, each node in the node relationship graph corresponds to one entity.
For example, consider the five pieces of relationship point pair data acquired above and refer to FIG. 4. For the first piece, which includes the first entity "ID1" and the second entity "ID2" with an action relationship of type roll-out, the node "ID1" and the node "ID2" are established in the node relationship graph, where the node "ID1" corresponds to the first entity "ID1" and the node "ID2" corresponds to the second entity "ID2", and an edge pointing from the node "ID1" to the node "ID2" is added between them.
For the second piece, the node relationship graph already includes the node "ID1" corresponding to the first entity "ID1" and the node "ID2" corresponding to the second entity "ID2", so an edge is added between the node "ID1" and the node "ID2" according to the type of the action relationship included in the second piece, and the edge points from the node "ID2" to the node "ID1".
For the third piece, which includes the second entity "ID3" with an action relationship of type roll-out, the node "ID3" corresponding to the second entity "ID3" is established in the node relationship graph, and an edge pointing from the node "ID1" to the node "ID3" is added between them.
For the fourth piece, the node relationship graph already includes the node "ID1" corresponding to the first entity "ID1" and the node "ID2" corresponding to the second entity "ID2", and an edge corresponding to the type "roll-out" of the action relationship included in the fourth piece already exists between the node "ID1" and the node "ID2", so no operation is performed for the fourth piece.
For the fifth piece, which includes the second entity "ID4" with an action relationship of type roll-out, the node "ID4" corresponding to the second entity "ID4" is established in the node relationship graph, and an edge pointing from the node "ID2" to the node "ID4" is added between them.
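The walkthrough above can be reproduced with a short sketch. It assumes each record is reduced to `(first, second, kind)`, that "roll-out" gives an edge from the first entity to the second and "roll-in" the reverse, and that edges are kept in a set so a repeated record adds nothing. The function name `build_node_relationship_graph` is hypothetical.

```python
def build_node_relationship_graph(records):
    """Return (nodes, edges) built from (first, second, kind) records.

    Edges are directed (source, target) pairs stored in a set, so a second
    record of the same type between the same two nodes changes nothing.
    """
    nodes, edges = set(), set()
    for first, second, kind in records:
        nodes.update((first, second))   # create any node that is still missing
        if kind == "roll-out":
            edges.add((first, second))  # first entity pays the second entity
        elif kind == "roll-in":
            edges.add((second, first))  # second entity pays the first entity
        else:
            raise ValueError(f"unknown type of action relationship: {kind}")
    return nodes, edges


records = [
    ("ID1", "ID2", "roll-out"),
    ("ID1", "ID2", "roll-in"),
    ("ID1", "ID3", "roll-out"),
    ("ID1", "ID2", "roll-out"),  # fourth record: edge already present
    ("ID2", "ID4", "roll-out"),
]
nodes, edges = build_node_relationship_graph(records)
```

With the five example records this yields four nodes and four directed edges, matching the description of FIG. 4 given in the text.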
Step 303: Acquire the basic feature of each node in the node relationship graph according to each piece of relationship point pair data in the time period and the node relationship graph.
The basic feature of a node may be a vector that describes the behavior of the node. It reflects information such as the frequency, time, and intensity of the communication or interaction between the node and other nodes, and the types of objects with which the node communicates or interacts. The basic feature of the node may include one or more of the following: the number of pieces of relationship point pair data of the node, the time interval statistic, the number of pieces of relationship point pair data of the node in a sensitive time period, the number of pieces of relationship point pair data between the node and known abnormal nodes, the number of first-order neighbor nodes of the node, the number of second-order neighbor nodes of the node, the relationship strength statistic between the node and its first-order neighbor nodes, the type statistic of the action relationships occurring between the node and its first-order neighbor nodes, and the basic network index of the node. That is, the basic feature of the node is a vector comprising the one or more features.
Any node in the node relationship graph is referred to below as a first node, and the process of acquiring each element of the basic feature of the first node is described one by one.
For the number of pieces of relationship point pair data of the first node, the pieces of relationship point pair data that include the first node are counted among the pieces of relationship point pair data in the time period.
For the time interval statistic of the first node, each piece of relationship point pair data including the first node is acquired from the relationship point pair data in the time period; the time intervals between adjacent occurrence times are computed from the occurrence times of the acquired pieces; and the average value, variance, maximum value, and/or minimum value of these time intervals is computed to obtain the time interval statistic of the first node. That is, the time interval statistic of the first node includes one or more of the average value, the variance, the maximum value, and the minimum value.
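The time interval statistic described above can be sketched as follows. The function name `time_interval_stats`, the use of numeric timestamps, and the choice of the population variance are assumptions of this sketch; the embodiment does not specify which variance estimator is used.

```python
from statistics import mean, pvariance


def time_interval_stats(occurrence_times):
    """Return mean, variance, max and min of gaps between adjacent occurrence times.

    occurrence_times: numeric timestamps (for example, seconds) of the pieces of
    relationship point pair data that include the first node.
    """
    times = sorted(occurrence_times)
    # Gap between each pair of adjacent occurrence times.
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if not intervals:
        return None  # fewer than two records: no interval can be formed
    return {
        "mean": mean(intervals),
        "variance": pvariance(intervals),  # population variance (assumption)
        "max": max(intervals),
        "min": min(intervals),
    }


stats = time_interval_stats([0, 10, 30, 60])  # intervals are 10, 20, 30
```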
For the number of pieces of relationship point pair data of the first node in the sensitive time period, each piece of relationship point pair data that includes the first node and whose occurrence time falls within the sensitive time period is acquired from the relationship point pair data in the time period, and the acquired pieces are counted. The sensitive time period is a specified time period.
For the number of pieces of relationship point pair data between the first node and known abnormal nodes, each piece of relationship point pair data that includes both the first node and a known abnormal node is acquired from the relationship point pair data in the time period, and the acquired pieces are counted.
For the number of first-order neighbor nodes of the first node, each first-order neighbor node of the first node is determined from the node relationship graph, where the first node reaches each first-order neighbor node through one edge, and the first-order neighbor nodes are counted to obtain the number of first-order neighbor nodes of the first node.
For the number of second-order neighbor nodes of the first node, each second-order neighbor node of the first node is determined from the node relationship graph, where the first node reaches each second-order neighbor node through two edges, and the second-order neighbor nodes are counted to obtain the number of second-order neighbor nodes of the first node.
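The first-order and second-order neighbor counts can be sketched as follows. Two points are assumptions of this sketch rather than statements of the embodiment: edges are treated as undirected when deciding reachability, and the first node itself and its first-order neighbor nodes are excluded from the second-order set.

```python
from collections import defaultdict


def neighbor_counts(edges, node):
    """Return (number of first-order neighbors, number of second-order neighbors)."""
    adjacency = defaultdict(set)
    for u, v in edges:
        # Assumption: direction is ignored when deciding who is a neighbor.
        adjacency[u].add(v)
        adjacency[v].add(u)

    first_order = set(adjacency[node])
    second_order = set()
    for neighbor in first_order:
        second_order |= adjacency[neighbor]
    # Assumption: keep only nodes that need exactly two edges to be reached.
    second_order -= first_order
    second_order.discard(node)
    return len(first_order), len(second_order)


example_edges = [("ID1", "ID2"), ("ID2", "ID1"), ("ID1", "ID3"), ("ID2", "ID4")]
counts_id1 = neighbor_counts(example_edges, "ID1")
```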
For the relationship strength statistic between the first node and its first-order neighbor nodes, each piece of relationship point pair data that includes both the first node and a first-order neighbor node of the first node is acquired, and the accumulated value, average value, variance, maximum value, and/or minimum value of the relationship strength is obtained from the relationship strengths included in the acquired pieces.
For the type statistic of the action relationships between the first node and its first-order neighbor nodes, each piece of relationship point pair data that includes both the first node and a first-order neighbor node of the first node is acquired, and the number of occurrences of each type is counted according to the type of the action relationship included in each acquired piece, to obtain the type statistic.
For the basic network index of the first node, the node relationship graph is input into the PageRank (web page ranking) algorithm, the PageRank score of each node in the node relationship graph is obtained through the algorithm, and the PageRank score of each node is used as the basic network index of that node. The basic network indexes of the nodes include the basic network index of the first node.
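A PageRank score for each node can be computed, for example, by power iteration. The sketch below is a generic textbook formulation and not necessarily the variant used in the embodiment. The damping factor of 0.85, the fixed iteration count, and the uniform redistribution of the score of nodes without outgoing edges are assumptions.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over directed (from_node, to_node) edges."""
    nodes = sorted(nodes)
    out_links = {node: set() for node in nodes}
    for u, v in edges:
        out_links[u].add(v)

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing edge is spread over all nodes.
        dangling = sum(score[u] for u in nodes if not out_links[u])
        new_score = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for u in nodes:
            for v in out_links[u]:
                new_score[v] += damping * score[u] / len(out_links[u])
        score = new_score
    return score


scores = pagerank({"a", "b", "c"}, [("a", "b"), ("c", "b")])
```

In this small example the node "b" receives edges from both other nodes and therefore obtains the highest score.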
Step 304: Acquire the structural feature of each node in the node relationship graph according to the node relationship graph.
The structural feature of a node describes the position of the node in the node relationship graph and/or the relationship between the node and its neighbor nodes, and in particular reflects the communication and interaction between the node and its neighbor nodes. The structural feature of the node is also a vector.
Before step 304 is performed, referring to fig. 4, a plurality of triple structures and quadruple structures are established in advance. Each triple structure includes three nodes, namely a node and two first-order neighbor nodes of the node, and the triple structures differ from one another. Each quadruple structure includes four nodes, namely a node, two first-order neighbor nodes of the node, and one second-order neighbor node of the node, and the quadruple structures differ from one another.
In step 304, each triple structure including the first node and each quadruple structure including the first node are identified in the node relationship graph, and the number of occurrences of each triple structure and of each quadruple structure is counted. The structural feature of the first node includes the number of occurrences of each triple structure and the number of occurrences of each quadruple structure.
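Counting triple structures around a node can be sketched as follows. The preset structures themselves are defined in the figure and are not reproduced in this text, so the two patterns used here are hypothetical stand-ins: an "open" triple, where the two neighbor nodes are not connected to each other, and a "closed" triple, where they are. Edge direction is ignored, and quadruple structures would be counted analogously.

```python
from collections import defaultdict
from itertools import combinations


def triple_structure_counts(edges, node):
    """Count open and closed triples formed by `node` and two of its neighbors."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    open_count = 0
    closed_count = 0
    # Every unordered pair of first-order neighbors forms one triple with `node`.
    for a, b in combinations(sorted(adjacency[node]), 2):
        if b in adjacency[a]:
            closed_count += 1  # the two neighbors are also connected
        else:
            open_count += 1
    return {"open_triple": open_count, "closed_triple": closed_count}


triples = triple_structure_counts(
    [("n", "a"), ("n", "b"), ("n", "c"), ("a", "b")], "n"
)
```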
Step 305: Acquire the aggregation feature of each node in the node relationship graph.
The first node is any one of the nodes in the node relationship graph. The aggregation feature of the first node is obtained by aggregating the information of the neighbor nodes around the first node, and is an aggregated representation of that information.
In step 305, the first-order neighbor nodes of the first node are identified from the node relationship graph, and the aggregation feature of the first node is obtained through an aggregation function according to the information of the first node and the information of each identified first-order neighbor node. In implementation, the aggregation feature of the first node may be obtained in any of the following modes.
In the first mode, a normalized weight between the first node and each first-order neighbor node is obtained, and the aggregation feature of the first node is obtained through an aggregation function according to the normalized weight between the first node and each first-order neighbor node and the basic feature of each first-order neighbor node.
In the first mode, the normalized weights are obtained by normalizing the relationship strength, or the PageRank score, between the first node and each first-order neighbor node. For example, assume that the first node has five first-order neighbor nodes, and that the relationship strengths between the first node and the five first-order neighbor nodes are 20, 25, 30, 40, and 50, respectively. Dividing each relationship strength by the maximum relationship strength, 50, gives normalized weights of 0.4 (20/50), 0.5 (25/50), 0.6 (30/50), 0.8 (40/50), and 1 (50/50), respectively.
For the aggregation feature of the first node, the normalized weight between the first node and each first-order neighbor node is multiplied by the basic feature of that first-order neighbor node to obtain a product for each first-order neighbor node, and the aggregation feature of the first node is obtained through the aggregation function according to these products.
Optionally, the aggregation function may include calculating an average value, a variance, an accumulated value, and/or a maximum value. Accordingly, the average value, variance, accumulated value, and/or maximum value is calculated over the products of the first-order neighbor nodes to obtain the aggregation feature of the first node; that is, the aggregation feature of the first node includes the calculated average value, variance, accumulated value, and/or maximum value.
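The first mode can be sketched as follows, using the relationship strengths 20, 25, 30, 40, and 50 from the example. The function names, the element-wise treatment of the basic feature vectors, and the use of the population variance are assumptions of this sketch.

```python
from statistics import mean, pvariance


def normalized_weights(strengths):
    """Divide each relationship strength by the largest one, as in the example."""
    largest = max(strengths)
    return [s / largest for s in strengths]


def aggregate_neighbors(strengths, neighbor_features):
    """Weight each neighbor's basic feature vector and aggregate element-wise."""
    weights = normalized_weights(strengths)
    # One weighted feature vector (the "product") per first-order neighbor.
    products = [
        [w * x for x in features]
        for w, features in zip(weights, neighbor_features)
    ]
    columns = list(zip(*products))  # regroup by feature dimension
    return {
        "mean": [mean(c) for c in columns],
        "variance": [pvariance(c) for c in columns],  # population variance (assumption)
        "sum": [sum(c) for c in columns],
        "max": [max(c) for c in columns],
    }


weights = normalized_weights([20, 25, 30, 40, 50])
# Hypothetical one-dimensional basic features, all equal to 1.0.
aggregated = aggregate_neighbors([20, 25, 30, 40, 50], [[1.0]] * 5)
```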
In the second mode, the first-order neighbor nodes of each second node are obtained, where a second node is a first-order neighbor node of the first node, and the aggregation feature of the first node is obtained through the following first formula according to the feature vector of each second node, the first-order neighbor nodes of each second node, and the first-order neighbor nodes of the first node.
In the first formula, $i$ is the first node, $h_i^{(t+1)}$ is the aggregation feature of the first node, $\sigma$ is a standard nonlinear transformation function, $w^{(t)}$ is a standard linear transformation matrix, $h_j^{(t)}$ is the feature vector of the second node $j$, $N(i)$ denotes the first-order neighbor nodes of the first node, and $N(j)$ denotes the first-order neighbor nodes of the second node $j$.
Here, $N(i) \cup \{i\}$ denotes the union of the first-order neighbor nodes of the first node and the first node itself, and $|\cdot|$ denotes the operation of counting nodes; that is, $|N(i)|$ denotes the number of first-order neighbor nodes of the first node, and $|N(j)|$ denotes the number of first-order neighbor nodes of the second node $j$. The feature vector $h_j^{(t)}$ of the second node $j$ includes one or more of the basic feature, the structural feature, the node attribute information, and the like of the second node $j$. When the second node is a financial account or a social account, the node attribute information of the second node is account information, and includes one or more of the account type, the account opening time, the account login frequency, and the like.
In the third mode, the aggregation feature of the first node is obtained through the following second formula or third formula according to the feature vector of the first node and the feature vector of each first-order neighbor node of the first node.
In the second formula, $h_i^{(t+1)}$ is the aggregation feature of the first node, $\sigma$ is a standard nonlinear transformation function, $w^{(t)}$ is a standard linear transformation matrix, MEAN is a mean operation, $h_i^{(t)}$ is the feature vector of the first node, $h_j^{(t)}$ is the feature vector of the first-order neighbor node $j$ of the first node, $\cup$ is the union operation, and $N(i)$ denotes the first-order neighbor nodes of the first node.
In the third formula, $h_i^{(t+1)}$ is the aggregation feature of the first node, $\sigma$ is a standard nonlinear transformation function, $W_{pool}$ is a linear transformation matrix for processing neighbor information, $w^{(t)}$ is a standard linear transformation matrix, CONCAT is a splicing operation, $h_i^{(t)}$ is the feature vector of the first node, $h_j^{(t)}$ is the feature vector of the first-order neighbor node $j$ of the first node, $N(i)$ denotes the first-order neighbor nodes of the first node, and $b$ is a constant.
The splicing operation splices two feature vectors into one vector. For example, splicing the feature vectors [1, 2, 3] and [4, 5, 6] yields the vector [1, 2, 3, 4, 5, 6].
In the fourth mode, the aggregation feature of the first node is obtained through the following fourth formula according to the feature vector of each first-order neighbor node of the first node and the weight coefficient of each first-order neighbor node.
In the fourth formula, $h_i^{(t+1)}$ is the aggregation feature of the first node, $\sigma$ is a standard nonlinear transformation function, $\alpha_{ij}$ is the weight coefficient of the first-order neighbor node $j$, $w^{(t)}$ is a standard linear transformation matrix, $h_j^{(t)}$ is the feature vector of the first-order neighbor node $j$ of the first node, and $N(i)$ denotes the first-order neighbor nodes of the first node.
The first formula, the second formula, the third formula and the fourth formula are different aggregation functions respectively.
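The formula images themselves are not reproduced in this text. For orientation only, the following are commonly used forms that agree with the symbol definitions given above: a graph-convolution style rule for the first formula, mean and pooling aggregators for the second and third formulas, and an attention-weighted sum for the fourth formula. The symbols $h_i^{(t+1)}$ and $h_j^{(t)}$, the exact normalization in the first formula, and the element-wise maximum in the third formula are assumptions and may differ from the formulas of the application.

```latex
% First formula (assumed form): normalized sum over N(i) and i itself
\[
h_i^{(t+1)} = \sigma\!\left( \sum_{j \in N(i) \cup \{i\}}
  \frac{1}{\sqrt{|N(i)|\,|N(j)|}} \, w^{(t)} h_j^{(t)} \right)
\]

% Second formula (assumed form): mean over the first node and its neighbors
\[
h_i^{(t+1)} = \sigma\!\left( w^{(t)} \cdot \mathrm{MEAN}\!\left(
  \{ h_i^{(t)} \} \cup \{ h_j^{(t)},\ \forall j \in N(i) \} \right) \right)
\]

% Third formula (assumed form): pooled neighbor information spliced with h_i
\[
h_i^{(t+1)} = \sigma\!\left( w^{(t)} \cdot \mathrm{CONCAT}\!\left( h_i^{(t)},\
  \max_{j \in N(i)} \sigma\!\left( W_{pool}\, h_j^{(t)} + b \right) \right) \right)
\]

% Fourth formula (assumed form): sum weighted by the coefficients alpha_ij
\[
h_i^{(t+1)} = \sigma\!\left( \sum_{j \in N(i)} \alpha_{ij}\, w^{(t)} h_j^{(t)} \right)
\]
```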
Step 306: Acquire the community feature of each node in the node relationship graph according to the node relationship graph.
The community feature of a node describes the attributes of the community to which the node belongs.
In step 306, the node relationship graph is divided into a plurality of communities by using a community discovery algorithm, each community including at least one node. For a first community, which is any one of the communities, the number of nodes, the number of edges, and the proportion of known abnormal nodes included in the first community are counted, and/or the number of second communities is counted, to obtain the community feature of each node in the first community.
That is, the community feature of each node in the first community includes one or more of the number of nodes, the number of edges, the proportion of known abnormal nodes, and the number of second communities.
The proportion of known abnormal nodes is the ratio of the number of known abnormal nodes in the first community to the number of nodes included in the first community.
A second community is a community associated with the first community. The first community being associated with the second community means that at least one node in the first community is connected by an edge to at least one node in the second community.
When the node relationship graph is divided into a plurality of communities, the community discovery algorithm groups closely connected nodes in the node relationship graph into the same community.
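Once a partition into communities is available, the community feature can be computed as sketched below. The community discovery step itself is not shown; the mapping `community_of` from node to community identifier is assumed to be its output. Counting only the edges whose two endpoints lie inside the community as "the number of edges" is also an assumption of this sketch.

```python
from collections import defaultdict


def community_features(edges, community_of, known_abnormal):
    """Return, for every node, the feature dictionary of its community."""
    members = defaultdict(set)
    for node, community_id in community_of.items():
        members[community_id].add(node)

    internal_edges = defaultdict(int)  # edges with both endpoints inside
    associated = defaultdict(set)      # other communities reached by an edge
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            internal_edges[cu] += 1
        else:
            associated[cu].add(cv)
            associated[cv].add(cu)

    per_community = {}
    for community_id, nodes in members.items():
        per_community[community_id] = {
            "node_count": len(nodes),
            "edge_count": internal_edges[community_id],
            "abnormal_ratio": len(nodes & known_abnormal) / len(nodes),
            "associated_communities": len(associated[community_id]),
        }
    # Every node in a community shares that community's feature.
    return {node: per_community[cid] for node, cid in community_of.items()}


features = community_features(
    edges=[("a", "b"), ("b", "c"), ("c", "d")],
    community_of={"a": 0, "b": 0, "c": 1, "d": 1},
    known_abnormal={"a"},
)
```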
Step 307: Acquire the unsupervised abnormal feature of each node in the node relationship graph through an unsupervised anomaly detection algorithm.
Optionally, the unsupervised anomaly detection algorithm may be an isolation forest algorithm or a Local Outlier Factor (LOF) algorithm.
The unsupervised abnormal feature of a node describes the abnormal information carried by the node.
Step 308: Identify abnormal nodes among the nodes according to the abnormal feature vector of each node, where the abnormal feature vector of a node includes one or more of the basic feature, the structural feature, the aggregation feature, the community feature, and the unsupervised abnormal feature of the node.
In step 308, the basic feature, the structural feature, the aggregation feature, the community feature, and/or the unsupervised abnormal feature of each node are combined into the abnormal feature vector of that node. Each node in the node relationship graph is replaced with its abnormal feature vector to obtain an abnormal recognition frame graph. According to the abnormal recognition frame graph, the abnormal score of each node is obtained through an abnormal recognition model, and each node whose abnormal score meets a specified condition is determined to be an abnormal node.
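Assembling the abnormal feature vector and applying the specified condition can be sketched as follows. The abnormal recognition model that produces the scores is not shown, and treating the specified condition as "score greater than or equal to a threshold", as well as the threshold value of 0.5, are assumptions of this sketch.

```python
def build_abnormal_feature_vector(basic, structural, aggregation, community, unsupervised):
    """Splice the available features of one node into a single vector."""
    return (
        list(basic)
        + list(structural)
        + list(aggregation)
        + list(community)
        + list(unsupervised)
    )


def select_abnormal_nodes(scores, threshold=0.5):
    """Return the nodes whose abnormal score meets the (assumed) threshold."""
    return sorted(node for node, score in scores.items() if score >= threshold)


vector = build_abnormal_feature_vector([1, 2], [3], [4, 5], [6], [7])
# Hypothetical scores, standing in for the output of the recognition model.
abnormal_nodes = select_abnormal_nodes({"ID1": 0.9, "ID2": 0.2, "ID3": 0.5})
```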
The relationship point pair data in each of a plurality of time periods may be obtained continuously, and the operations in steps 302 to 307 may be performed on the relationship point pair data in each time period to obtain the abnormal recognition frame graph corresponding to each time period. The abnormal recognition frame graph corresponding to each time period is input into the abnormal recognition model, and the abnormal score of each node is obtained through the abnormal recognition model.
The abnormal score of a node represents the degree to which the node is an abnormal node.
The abnormal recognition model includes a gated recurrent unit (GRU) network layer, an attention network layer, and a softmax nonlinear activation function, where the attention network layer is located between the GRU network layer and the softmax nonlinear activation function. The abnormal recognition frame graph corresponding to each time period is input into the GRU network layer of the abnormal recognition model, and the abnormal score of each node is obtained from the output of the softmax nonlinear activation function of the abnormal recognition model.
The relationship point pair data in a plurality of consecutive time periods may be acquired, the operations in steps 301 to 307 may be performed on the relationship point pair data in each time period to obtain the abnormal description features of each node in each time period, and abnormal nodes may be identified among the nodes according to at least one abnormal description feature of each node in each time period. Because relationship point pair data from a plurality of consecutive time periods are processed, the way the network changes over time in the actual service is taken into account, which can improve the accuracy of identifying abnormal nodes.
In the embodiment of the application, the node relationship graph is generated according to each piece of relationship point pair data in a time period, and the node relationship graph represents the relationship between a node and its surrounding nodes. One or more abnormal description features of each node, among the basic feature, the aggregation feature, the structural feature, the community feature, and the unsupervised abnormal feature, are therefore obtained according to the node relationship graph, and these features express the node from multiple angles. As a result, abnormal nodes can be accurately identified among the nodes according to at least one abnormal description feature of each node, and data security analysis can be performed based on the identified abnormal nodes, which improves data security.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 6, an embodiment of the present application provides an apparatus 600 for anomaly identification, where the apparatus 600 includes:
a generating module 601, configured to generate a node relationship graph according to each relationship point pair data in a time period, where each relationship point pair data includes a first entity and a second entity that have an action relationship, and attribute information and occurrence time of the action relationship, where the first entity and the second entity in the node relationship graph are two different nodes, and the two nodes are connected by using an edge used for representing the action relationship;
an obtaining module 602, configured to obtain at least one abnormal description feature of each node in the node relationship graph according to the node relationship graph, where the at least one abnormal description feature of each node is used to measure a degree that each node is an abnormal node;
an identifying module 603, configured to identify an abnormal node from each node according to the at least one abnormal description feature of each node.
Optionally, the at least one anomaly description feature comprises one or more of a base feature, a structural feature, an aggregate feature, a community feature, and an unsupervised anomaly feature;
the basic characteristics of the nodes are used for describing the behaviors of the nodes, the structural characteristics of the nodes are used for describing the positions of the nodes in a node relation graph and/or the relation between the nodes and neighbor nodes, the aggregation characteristics of the nodes are aggregation representation of the neighbor nodes around the nodes, the community characteristics of the nodes are used for describing the attributes of communities to which the nodes belong, and the unsupervised abnormal characteristics of the nodes are used for describing abnormal information carried by the nodes.
Optionally, the obtaining module 602 is configured to:
and identifying a preset triple structure comprising a first node and a preset quadruple structure comprising the first node in the node relation graph, wherein the first node is any one node in the node relation graph, and the structural characteristics of the first node comprise the number of the preset triple structures and the number of the preset quadruple structures.
Optionally, the obtaining module 602 is configured to:
identifying each first-order neighbor node of a first node from a node relation graph, wherein the first node is any one node in the node relation graph, and the first node is connected with each first-order neighbor node through one edge;
and acquiring the aggregation characteristics of the first node through an aggregation function according to the information of the first node and the information of each first-order neighbor node.
Optionally, the obtaining module 602 is configured to:
dividing the node relation graph into a plurality of communities;
and counting the number of nodes, the number of edges and the proportion of known abnormal nodes included in the first community, and/or counting the number of a second community, wherein the second community is a community associated with the first community, and the community characteristics of each node in the first community are obtained.
Optionally, the identifying module 603 is configured to:
respectively forming an abnormal feature vector of each node by using at least one abnormal description feature of each node;
replacing each node in the node relation graph with the abnormal characteristic vector of each node respectively to obtain an abnormal recognition frame graph;
and acquiring the abnormal score of each node according to the abnormal recognition frame graph, and determining the node with the abnormal score meeting the specified conditions as the abnormal node.
In the embodiment of the application, the generation module generates a node relationship graph according to data of each relationship point in a time period, and because the node relationship graph embodies the relationship between a node and nodes around the node, the acquisition module acquires at least one abnormal description feature of each node in the node relationship graph according to the node relationship graph, the abnormal description feature of each node can be used for measuring the degree of each node as an abnormal node, and the at least one abnormal description feature can express the feature of the node from multiple angles; therefore, the identification module can accurately identify the abnormal nodes from each node according to at least one abnormal description characteristic of each node, and data security analysis is performed based on the abnormal nodes, so that data security can be improved.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 7 shows a block diagram of an electronic device 700 according to an exemplary embodiment of the present application. The electronic device 700 may be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. The electronic device 700 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so forth.
In general, the electronic device 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit) which is responsible for rendering and drawing the content required to be displayed by the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
In some embodiments, the electronic device 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Various peripheral devices may be connected to peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, a positioning component 708, and a power source 709.
The peripheral interface 703 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 701 and the memory 702. In some embodiments, processor 701, memory 702, and peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 704 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 704 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, the display screen 705 also has the ability to capture touch signals on or over the surface of the display screen 705. The touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the electronic device 700; in other embodiments, there may be at least two display screens 705, respectively disposed on different surfaces of the electronic device 700 or in a folding design; in still other embodiments, the display screen 705 may be a flexible display screen disposed on a curved surface or a folded surface of the electronic device 700. Furthermore, the display screen 705 may be arranged in a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 705 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 706 is used to capture images or video. Optionally, the camera assembly 706 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuitry 707 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 701 for processing or inputting the electric signals to the radio frequency circuit 704 to realize voice communication. For stereo capture or noise reduction purposes, the microphones may be multiple and disposed at different locations of the electronic device 700. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuitry 707 may also include a headphone jack.
The positioning component 708 is used to determine the current geographic location of the electronic device 700 to implement navigation or LBS (Location Based Service). The positioning component 708 may be based on the Global Positioning System (GPS) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 709 is used to supply power to the various components in the electronic device 700. The power supply 709 may be an alternating-current source, a direct-current source, a disposable battery, or a rechargeable battery. When the power supply 709 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast-charging technology.
In some embodiments, the electronic device 700 also includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: an acceleration sensor 711, a gyro sensor 712, a pressure sensor 713, a fingerprint sensor 714, an optical sensor 715, and a proximity sensor 716.
The acceleration sensor 711 may detect the magnitude of acceleration on the three coordinate axes of a coordinate system established with the electronic device 700. For example, the acceleration sensor 711 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 701 may control the display screen 705 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 may also be used to collect motion data of a game or a user.
The gyro sensor 712 may detect the body direction and rotation angle of the electronic device 700, and may cooperate with the acceleration sensor 711 to capture the user's 3D motion with respect to the electronic device 700. From the data collected by the gyro sensor 712, the processor 701 may implement the following functions: motion sensing (such as changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 713 may be disposed on a side frame of the electronic device 700 and/or beneath the display screen 705. When the pressure sensor 713 is disposed on the side frame of the electronic device 700, it can detect the user's grip signal on the electronic device 700, and the processor 701 performs left/right-hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed beneath the display screen 705, the processor 701 controls an operable control on the UI according to the user's pressure operation on the display screen 705. The operable control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 714 is used to collect the user's fingerprint, and either the processor 701 identifies the user according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 itself identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 714 may be disposed on the front, back, or side of the electronic device 700. When a physical button or a vendor logo is provided on the electronic device 700, the fingerprint sensor 714 may be integrated with the physical button or the vendor logo.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the ambient light intensity is high, the display brightness of the display screen 705 is increased; when the ambient light intensity is low, the display brightness of the display screen 705 is decreased. In another embodiment, the processor 701 may also dynamically adjust the shooting parameters of the camera assembly 706 based on the ambient light intensity collected by the optical sensor 715.
The proximity sensor 716, also referred to as a distance sensor, is typically disposed on the front panel of the electronic device 700. The proximity sensor 716 is used to measure the distance between the user and the front of the electronic device 700. In one embodiment, when the proximity sensor 716 detects that the distance between the user and the front of the electronic device 700 is gradually decreasing, the processor 701 controls the display screen 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 716 detects that the distance between the user and the front of the electronic device 700 is gradually increasing, the processor 701 controls the display screen 705 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 7 does not limit the electronic device 700, which may include more or fewer components than those shown, combine certain components, or employ a different arrangement of components.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Claims (12)
1. A method of anomaly identification, the method comprising:
generating a node relationship graph according to each relationship point pair data in a time period, wherein each relationship point pair data comprises a first entity and a second entity which have an action relationship, attribute information of the action relationship and occurrence time, the first entity and the second entity are two different nodes in the node relationship graph, and the two nodes are connected by using an edge for representing the action relationship;
acquiring at least one abnormal description feature of each node in the node relation graph according to the node relation graph, wherein the at least one abnormal description feature of each node is used for measuring the degree of each node as an abnormal node;
and identifying abnormal nodes from among the nodes according to the at least one abnormal description feature of each node.
2. The method of claim 1, wherein the at least one abnormal description feature comprises one or more of a basic feature, a structural feature, an aggregate feature, a community feature, and an unsupervised abnormal feature;
the basic feature of a node is used for describing the behavior of the node, the structural feature of the node is used for describing the position of the node in the node relationship graph and/or the relationship between the node and its neighbor nodes, the aggregate feature of the node is an aggregated representation of the neighbor nodes around the node, the community feature of the node is used for describing the attributes of the community to which the node belongs, and the unsupervised abnormal feature of the node is used for describing abnormal information carried by the node.
3. The method of claim 2, wherein obtaining the structural feature of each node in the node relationship graph according to the node relationship graph comprises:
identifying, in the node relationship graph, preset triple structures that include a first node and preset quadruple structures that include the first node, wherein the first node is any one node in the node relationship graph, and the structural feature of the first node includes the number of the preset triple structures and the number of the preset quadruple structures.
4. The method of claim 2, wherein obtaining the aggregate feature of each node in the node relationship graph according to the node relationship graph comprises:
identifying each first-order neighbor node of a first node from the node relationship graph, wherein the first node is any one node in the node relationship graph, and the first node is connected to each first-order neighbor node by one edge;
and obtaining the aggregate feature of the first node through an aggregation function according to the information of the first node and the information of each first-order neighbor node.
5. The method of claim 2, wherein obtaining the community feature of each node in the node relationship graph according to the node relationship graph comprises:
dividing the node relationship graph into a plurality of communities;
counting the number of nodes, the number of edges, and the proportion of known abnormal nodes included in a first community, and/or counting the number of second communities, to obtain the community feature of every node in the first community, wherein a second community is a community associated with the first community.
6. The method of any one of claims 2-5, wherein identifying abnormal nodes from among the nodes according to the at least one abnormal description feature of each node comprises:
forming an abnormal feature vector for each node from the at least one abnormal description feature of that node;
replacing each node in the node relationship graph with the abnormal feature vector of that node to obtain an abnormal recognition frame graph;
and acquiring the abnormal score of each node according to the abnormal recognition frame graph, and determining the node with the abnormal score meeting the specified conditions as an abnormal node.
7. An apparatus for anomaly identification, the apparatus comprising:
a generating module, configured to generate a node relationship graph according to each relationship point pair data in a time period, wherein each relationship point pair data comprises a first entity and a second entity which have an action relationship, attribute information of the action relationship, and occurrence time, the first entity and the second entity are two different nodes in the node relationship graph, and the two nodes are connected by an edge representing the action relationship;
an obtaining module, configured to obtain at least one abnormal description feature of each node in the node relationship graph according to the node relationship graph, where the at least one abnormal description feature of each node is used to measure a degree that each node is an abnormal node;
and an identification module, configured to identify abnormal nodes from among the nodes according to the at least one abnormal description feature of each node.
8. The apparatus of claim 7, wherein the at least one abnormal description feature comprises one or more of a basic feature, a structural feature, an aggregate feature, a community feature, and an unsupervised abnormal feature;
the basic feature of a node is used for describing the behavior of the node, the structural feature of the node is used for describing the position of the node in the node relationship graph and/or the relationship between the node and its neighbor nodes, the aggregate feature of the node is an aggregated representation of the neighbor nodes around the node, the community feature of the node is used for describing the attributes of the community to which the node belongs, and the unsupervised abnormal feature of the node is used for describing abnormal information carried by the node.
9. The apparatus of claim 8, wherein the obtaining module is configured to:
identify, in the node relationship graph, preset triple structures that include a first node and preset quadruple structures that include the first node, wherein the first node is any one node in the node relationship graph, and the structural feature of the first node includes the number of the preset triple structures and the number of the preset quadruple structures.
10. The apparatus of claim 8, wherein the obtaining module is configured to:
identify each first-order neighbor node of a first node from the node relationship graph, wherein the first node is any one node in the node relationship graph, and the first node is connected to each first-order neighbor node by one edge;
and obtain the aggregate feature of the first node through an aggregation function according to the information of the first node and the information of each first-order neighbor node.
11. The apparatus of claim 8, wherein the obtaining module is configured to:
divide the node relationship graph into a plurality of communities;
count the number of nodes, the number of edges, and the proportion of known abnormal nodes included in a first community, and/or count the number of second communities, to obtain the community feature of every node in the first community, wherein a second community is a community associated with the first community.
12. The apparatus of any one of claims 8-11, wherein the identification module is configured to:
form an abnormal feature vector for each node from the at least one abnormal description feature of that node;
replace each node in the node relationship graph with the abnormal feature vector of that node to obtain an abnormal recognition frame graph;
and acquiring the abnormal score of each node according to the abnormal recognition frame graph, and determining the node with the abnormal score meeting the specified conditions as an abnormal node.
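The feature-extraction steps recited in claims 1 and 3 to 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation, and several choices are assumptions made only for illustration: the sample `PAIRS` data is invented, the preset triple structure is taken to be a triangle, the aggregation function is a plain mean over node degree, and connected components stand in for the community division. Preset quadruple structures, the unsupervised abnormal feature, and the scoring step of claim 6 are left out because the claims do not fix them.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical relationship point pair data:
# (first entity, second entity, attribute of the action relationship, occurrence time)
PAIRS = [
    ("a", "b", "transfer", 1), ("b", "c", "transfer", 2), ("a", "c", "transfer", 3),
    ("c", "d", "login", 4), ("e", "f", "transfer", 5),
]

def build_graph(pairs):
    """Claim 1: every entity is a node; an edge joins two entities with an action relationship."""
    adj = defaultdict(set)
    for first, second, _attr, _time in pairs:
        if first != second:  # the two entities must be different nodes
            adj[first].add(second)
            adj[second].add(first)
    return dict(adj)

def triangle_count(adj, node):
    """Claim 3, assuming the preset triple structure is a triangle containing `node`."""
    return sum(1 for u, v in combinations(sorted(adj[node]), 2) if v in adj[u])

def aggregate_feature(adj, node, info):
    """Claim 4, assuming a mean over the node and its first-order neighbors."""
    return mean([info[node]] + [info[n] for n in sorted(adj[node])])

def communities(adj):
    """Claim 5, assuming connected components as a stand-in for community division."""
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            cur = stack.pop()
            if cur in members:
                continue
            members.add(cur)
            stack.extend(adj[cur] - members)
        seen |= members
        result.append(members)
    return result

def community_feature(adj, members, known_abnormal):
    """Node count, edge count, and proportion of known abnormal nodes in one community."""
    edges = sum(len(adj[n] & members) for n in members) // 2  # each edge is seen twice
    return len(members), edges, len(members & known_abnormal) / len(members)

def feature_vectors(pairs, known_abnormal):
    """Build one abnormal feature vector (a tuple) per node."""
    adj = build_graph(pairs)
    degree = {n: len(adj[n]) for n in adj}  # degree used as a stand-in basic feature
    vectors = {}
    for members in communities(adj):
        comm = community_feature(adj, members, known_abnormal)
        for n in members:
            vectors[n] = (degree[n], triangle_count(adj, n),
                          aggregate_feature(adj, n, degree), *comm)
    return vectors
```

Calling `feature_vectors(PAIRS, {"d"})` yields, for each node, a tuple of (degree, triangle count, aggregate feature, community node count, community edge count, proportion of known abnormal nodes). Under the scheme of claim 6, such vectors would replace the nodes in the graph before a scoring model assigns each node an abnormal score.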
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011206145.9A CN112398819A (en) | 2020-11-02 | 2020-11-02 | Method and device for recognizing abnormality |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011206145.9A CN112398819A (en) | 2020-11-02 | 2020-11-02 | Method and device for recognizing abnormality |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112398819A (en) | 2021-02-23 |
Family
ID=74598975
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011206145.9A Pending CN112398819A (en) | 2020-11-02 | 2020-11-02 | Method and device for recognizing abnormality |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112398819A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112968906A (en) * | 2021-03-25 | 2021-06-15 | 湖南大学 | Modbus TCP abnormal communication detection method and system based on multi-tuple |
CN113010896A (en) * | 2021-03-17 | 2021-06-22 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium and program product for determining an abnormal object |
CN113610521A (en) * | 2021-07-27 | 2021-11-05 | 胜斗士(上海)科技技术发展有限公司 | Method and apparatus for detecting anomalies in behavioral data |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180255084A1 (en) * | 2017-03-02 | 2018-09-06 | Crypteia Networks S.A. | Systems and methods for behavioral cluster-based network threat detection |
US10121000B1 (en) * | 2016-06-28 | 2018-11-06 | Fireeye, Inc. | System and method to detect premium attacks on electronic networks and electronic devices |
CN108933793A (en) * | 2018-07-24 | 2018-12-04 | 中国人民解放军战略支援部队信息工程大学 | The attack drawing generating method and its device of knowledge based map |
GB201904276D0 (en) * | 2019-03-27 | 2019-05-08 | British Telecomm | Pre-emptive computer security |
CN110784470A (en) * | 2019-10-30 | 2020-02-11 | 上海观安信息技术股份有限公司 | Method and device for determining abnormal login of user |
CN110837538A (en) * | 2019-10-24 | 2020-02-25 | 北京中科捷信信息技术有限公司 | Financial knowledge map visual query and multidimensional analysis system |
CN110866190A (en) * | 2019-11-18 | 2020-03-06 | 支付宝(杭州)信息技术有限公司 | Method and device for training neural network model for representing knowledge graph |
CN110929047A (en) * | 2019-12-11 | 2020-03-27 | 中国人民解放军国防科技大学 | Knowledge graph reasoning method and device concerning neighbor entities |
CN110955834A (en) * | 2019-11-27 | 2020-04-03 | 西北工业大学 | Knowledge graph driven personalized accurate recommendation method |
CN111325347A (en) * | 2020-02-19 | 2020-06-23 | 山东大学 | Automatic danger early warning description generation method based on interpretable visual reasoning model |
CN111507470A (en) * | 2020-03-02 | 2020-08-07 | 上海金仕达软件科技有限公司 | Abnormal account identification method and device |
- 2020-11-02 CN CN202011206145.9A patent/CN112398819A/en active Pending
Non-Patent Citations (5)
Title |
---|
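Jun Ma et al.: "GraphRAD: A Graph-based Risky Account Detection System", ACM * |
Jun Ma et al.: "GraphRAD: A Graph-based Risky Account Detection System", ACM, 18 August 2018 (2018-08-18), pages 2 - 5 * |
Cai Qiong; Chen Penghui; Li Yuansong: "Video human body recognition method using fuzzy aggregation feature vectors", Control Engineering of China, no. 05 * |
Chen Jia: "Research on DDoS attack source detection based on knowledge graph", Journal of Information Security Research * |
Chen Jia: "Research on DDoS attack source detection based on knowledge graph", Journal of Information Security Research, no. 01, 5 January 2020 (2020-01-05) * |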
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113010896A (en) * | 2021-03-17 | 2021-06-22 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium and program product for determining an abnormal object |
CN113010896B (en) * | 2021-03-17 | 2023-10-03 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium and program product for determining abnormal object |
CN112968906A (en) * | 2021-03-25 | 2021-06-15 | 湖南大学 | Modbus TCP abnormal communication detection method and system based on multi-tuple |
CN113610521A (en) * | 2021-07-27 | 2021-11-05 | 胜斗士(上海)科技技术发展有限公司 | Method and apparatus for detecting anomalies in behavioral data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||