CN116150429A - Abnormal object identification method, device, computing equipment and storage medium - Google Patents

Abnormal object identification method, device, computing equipment and storage medium Download PDF

Info

Publication number
CN116150429A
CN116150429A (application CN202111362003.6A)
Authority
CN
China
Prior art keywords
node
graph
network
objects
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111362003.6A
Other languages
Chinese (zh)
Inventor
殷丽秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111362003.6A priority Critical patent/CN116150429A/en
Publication of CN116150429A publication Critical patent/CN116150429A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided are an abnormal object identification method, an apparatus, a computing device, and a storage medium. The method may include: determining interaction-related objects related to pre-labeled abnormal objects based on interaction data within a first preset time period; constructing an object network graph from the abnormal objects and the interaction-related objects based on interaction data within a second preset time period, wherein the duration of the second preset time period is the same as or different from that of the first; performing community division on the object network graph to obtain at least one network sub-graph, wherein each network sub-graph corresponds to one divided object community; and predicting, with a graph neural network model, the object category corresponding to each node in the at least one network sub-graph, so as to identify unlabeled abnormal objects. Embodiments of the present disclosure may be used in fields such as intelligent transportation, network security, and third-party payments.

Description

Abnormal object identification method, device, computing equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technology, and more particularly, to a method, an apparatus, a computing device, and a computer-readable storage medium for identifying an abnormal object.
Background
With the development of computer technology in recent years, the third-party payment market has maintained a state of rapid growth. At the same time, a large number of illegal transactions aimed at unlawful profit have emerged and have gradually become complex and large-scale, greatly affecting the stability of the third-party payment environment. In addition, in other scenarios involving object interaction, abnormal objects may also exist, so that interaction between objects may fail to achieve its intended purpose.
How to identify risky abnormal objects in a multi-object interaction scenario is an important difficulty facing enterprises. A deep learning or machine learning model can be trained on interaction data, and the trained model can then detect abnormal interaction data. However, since the interaction data between objects is non-Euclidean data without a fixed topology, the structure around the node corresponding to each object is unique, which greatly limits the applicability of existing deep learning/machine learning models.
Accordingly, it is desirable to provide an efficient and accurate method of identifying abnormal objects.
Disclosure of Invention
According to an aspect of the present disclosure, a method of identifying an abnormal object is provided. The method may include: determining interaction-related objects related to abnormal objects based on interaction data within a first preset time period, wherein the abnormal objects are pre-labeled; constructing an object network graph from the abnormal objects and the interaction-related objects based on interaction data within a second preset time period, wherein each node in the object network graph represents an object, each edge represents a correlation between the two objects corresponding to the nodes at its two ends, the weight of each edge represents the degree of correlation between those two objects, and the duration of the second preset time period is the same as or different from that of the first preset time period; performing community division on the object network graph to obtain at least one network sub-graph, wherein each network sub-graph corresponds to one divided object community; and predicting, with a graph neural network model, the object category corresponding to each node in the at least one network sub-graph, so as to identify unlabeled abnormal objects.
According to another aspect of the present disclosure, an apparatus for identifying an abnormal object is also provided. The apparatus comprises: a determining module configured to determine interaction-related objects related to abnormal objects based on interaction data within a first preset time period, wherein the abnormal objects are labeled in advance; a network construction module configured to construct an object network graph from the abnormal objects and the interaction-related objects based on interaction data within a second preset time period, wherein each node in the object network graph represents an object, each edge represents a correlation between the two objects corresponding to the nodes at its two ends, the weight of each edge represents the degree of correlation between those two objects, and the duration of the second preset time period is the same as or different from that of the first preset time period; a division module configured to perform community division on the object network graph to obtain at least one network sub-graph, wherein each network sub-graph corresponds to one divided object community; and an identification module configured to predict, with a graph neural network model, the object category corresponding to each node in the at least one network sub-graph, so as to identify unlabeled abnormal objects.
According to another aspect of the present disclosure, a computing device is also provided, comprising: a processor; and a memory storing a computer program which, when executed by the processor, causes the processor to perform the method of: determining interaction-related objects related to abnormal objects based on interaction data within a first preset time period, wherein the abnormal objects are pre-labeled; constructing an object network graph from the abnormal objects and the interaction-related objects based on interaction data within a second preset time period, wherein each node in the object network graph represents an object, each edge represents a correlation between the two objects corresponding to the nodes at its two ends, the weight of each edge represents the degree of correlation between those two objects, and the duration of the second preset time period is the same as or different from that of the first preset time period; performing community division on the object network graph to obtain at least one network sub-graph, wherein each network sub-graph corresponds to one divided object community; and predicting, with a graph neural network model, the object category corresponding to each node in the at least one network sub-graph, so as to identify the abnormal objects.
According to another aspect of the present disclosure, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of identifying an abnormal object as described above.
According to yet another aspect of the present disclosure, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of identifying an abnormal object as described hereinbefore.
In the method for identifying abnormal objects, providing each divided object community to the graph neural network model to identify the categories of the nodes corresponding to the objects can reduce labor cost; moreover, when the graph neural network evaluates the category of an object, it uses both the object's own attributes and the information of the object's neighboring nodes, so the accuracy of identification can be improved.
Drawings
Fig. 1A illustrates an exemplary application scenario in which a technical solution according to an embodiment of the present disclosure may be implemented.
FIG. 1B illustrates a flow diagram for identifying abnormal objects based on a community discovery method.
Fig. 2 shows a flow diagram of a method of identifying an abnormal object according to an embodiment of the present disclosure.
Fig. 3 shows a flow diagram of sub-steps of the step of constructing the object network graph in fig. 2.
FIG. 4 shows a process diagram of community partitioning based on modularity.
Fig. 5A-5B illustrate the identification process of identifying an abnormal object in fig. 2.
Fig. 6 shows a flow diagram of a method of training the graph neural network model.
Fig. 7A-7B show block diagrams of a device for identifying an abnormal object according to an embodiment of the present disclosure.
Fig. 8 illustrates a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, exemplary embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present disclosure and not all of the embodiments of the present disclosure, and that the present disclosure is not limited by the example embodiments described herein.
In the present specification and drawings, steps and elements that are substantially the same or similar are denoted by the same or similar reference numerals, and repeated descriptions thereof are omitted. In the description of the present disclosure, the terms "first," "second," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance or order.
Machine learning is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specifically studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills, and how it can reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of AI and is the way to endow computing devices with intelligence. Deep learning is a technique for performing machine learning with deep neural network systems. Machine learning/deep learning may generally include a variety of techniques, such as artificial neural networks, reinforcement learning (RL), supervised learning, and unsupervised learning.
Before describing the present disclosure in detail, some terms that may be used herein are first explained and illustrated as follows.
Graph structure: a data structure composed of a number of nodes and connecting edges between nodes. Each node can be regarded as, or corresponds to, a sample, and the graph can be represented by an adjacency matrix A, where the element a_ij of A represents the connection relationship between node i and node j. For an unweighted graph, a nonzero a_ij is 1; for a weighted graph, a_ij is typically a weight value between 0 and 1, with different values indicating the degree of correlation between the samples corresponding to the two nodes, and a_ij = 0 indicating that there is no edge between the two nodes. Adjacency matrices are divided into directed-graph and undirected-graph adjacency matrices (an undirected graph is taken as an example herein). For an undirected simple graph, the adjacency matrix must be symmetric and its diagonal must be zero. The graph structure may also be referred to herein as graph structure data, graph structure information, a graph, a network, a graph network, and the like.
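As a minimal sketch of the weighted undirected adjacency matrix described above (the four-node graph and its weights are invented for illustration, not taken from the patent):

```python
# Hypothetical weighted undirected graph with 4 nodes (0..3).
# A[i][j] is the edge weight between node i and node j;
# 0 means no edge, and the diagonal is zero (no self-loops).
A = [
    [0.0, 0.8, 0.3, 0.0],
    [0.8, 0.0, 0.0, 0.5],
    [0.3, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
]

n = len(A)
# An undirected simple graph's adjacency matrix is symmetric with a zero diagonal.
assert all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
assert all(A[i][i] == 0.0 for i in range(n))

def neighbors(i):
    """Indices of nodes sharing an edge with node i."""
    return [j for j in range(n) if A[i][j] > 0]
```

Here node 1 is connected to nodes 0 and 3, with weights 0.8 and 0.5 expressing different degrees of correlation.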
Graph neural network (GNN): a model built on the correlations between the nodes of a graph structure. Its input can be understood as the graph structure itself (reflected by the features of each node and the node topology information); through the operations of multiple graph neural network layers (e.g., convolution and updating), a feature representation of each node (e.g., a high-dimensional feature vector, also called an output feature or embedding) is finally obtained, to facilitate tasks such as node classification and the generation of graphs and sub-graphs. Specific examples may include graph convolutional networks (GCN), graph attention networks (GAT), GraphSAGE (graph sample and aggregate) networks, node2vec, DeepWalk networks, and so forth.
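The neighborhood-aggregation idea shared by these models can be sketched in plain Python (toy graph and features invented here; a real GCN or GAT layer would additionally apply learned weight matrices and nonlinearities):

```python
# One simplified message-passing step: each node's new feature is the
# weighted average of its own feature and its neighbors' features,
# using edge weights from the adjacency matrix.
A = [
    [0.0, 0.8, 0.3],
    [0.8, 0.0, 0.0],
    [0.3, 0.0, 0.0],
]
X = [1.0, 2.0, 3.0]  # one scalar feature per node, for brevity

def propagate(A, X):
    out = []
    for i in range(len(X)):
        total, norm = X[i], 1.0          # self-connection with weight 1
        for j, w in enumerate(A[i]):
            if w > 0:
                total += w * X[j]
                norm += w
        out.append(total / norm)
    return out

H = propagate(A, X)  # aggregated node features after one layer
```

Stacking several such layers lets each node's final embedding reflect information from progressively larger neighborhoods, which is what makes the node classification task described later possible.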
Fast Unfolding: a community discovery algorithm. Modularity is an important criterion for measuring the quality of a community division: the greater the modularity of the divided network, the better the division. Fast Unfolding divides communities based on modularity; its main goal is to keep dividing communities so that the modularity of the whole divided network keeps increasing until it no longer changes, at which point an optimal division is obtained.
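To make the modularity criterion concrete, here is a direct computation of Q for a fixed partition of a small unweighted graph (the toy graph is invented for this sketch; Fast Unfolding itself is the greedy procedure that moves nodes between communities to increase this quantity):

```python
# Modularity of a partition of an unweighted undirected graph:
#   Q = (1/2m) * sum_ij [ A_ij - k_i * k_j / (2m) ] * delta(c_i, c_j)
# where m is the edge count, k_i the degree of node i, and c_i its community.

def modularity(A, communities):
    n = len(A)
    degree = [sum(row) for row in A]
    two_m = sum(degree)  # 2m: each edge is counted twice in an undirected graph
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += A[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

# Two triangles joined by a single bridge edge -- a clear two-community structure.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
good = modularity(A, [0, 0, 0, 1, 1, 1])  # split at the bridge
bad = modularity(A, [0, 0, 0, 0, 0, 0])   # everything in one community
```

Splitting at the bridge yields Q = 5/14 ≈ 0.357, while the single-community partition yields Q = 0, so modularity correctly prefers the natural division.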
Fig. 1A illustrates an exemplary application scenario 100 in which a technical solution according to an embodiment of the present disclosure may be implemented. As shown in fig. 1A, the application scenario 100 includes a server 110, terminals 120 and 130, and a network 140. Terminals 120 and 130 are communicatively coupled to server 110 via network 140. As an example, the respective objects may interact with the server 110 via the network 140 by means of an application or client on their terminals (interactions between objects are not shown in the figure). The objects may be, for example, buyer objects and merchant objects in a payment scenario, or ordinary user objects in other application scenarios such as video interaction, audio interaction, graphic interaction, and transaction payment. Fig. 1A shows a plurality of objects in a payment scenario, including a plurality of merchant objects and buyer objects.
As an example, the server 110 may collect interaction data between various objects (e.g., a plurality of merchant objects and buyer objects) on various terminals. Server 110 may then construct an object network graph from some of the target objects (e.g., merchant objects) involved in the interaction data according to a particular manner. The object network graph includes a plurality of nodes corresponding to the target objects of the category and edges connecting between the nodes, the node category of the nodes corresponds to the object category (for example, each target object is a normal object or an abnormal object), each edge has a weight to represent the degree of correlation between the target objects corresponding to the two nodes connected by each edge, and the degree of correlation can be determined according to the number of other interaction partners shared by the two target objects, for example.
The server 110 may perform recognition of the categories of the objects in the object network graph by analyzing the constructed object network graph. Embodiments of the present disclosure relate to identifying anomalous objects in the same class of objects (e.g., merchant objects only), and thus, hereinafter, the term "object" will refer to only a particular class of objects (e.g., merchant objects).
The object categories may be identified by first determining some abnormal objects with an existing model on the server 110 side, then performing community division on the object network graph with a community discovery method, and pushing the divided communities that contain many abnormal objects to human reviewers for auditing, as described in detail below with reference to fig. 1B. Alternatively, each divided community may be provided to a graph neural network model, through which a recognition result is obtained, as described in detail below with reference to figs. 2-6.
Alternatively, the server 110 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and basic cloud computing services such as big data and artificial intelligence platforms. The terminals 120, 130 may include, but are not limited to, at least one of: terminals capable of presenting content, such as mobile phones, tablet computers, notebook computers, desktop PCs, digital televisions, and the like. The network 140 may be, for example, a Wide Area Network (WAN), a Local Area Network (LAN), a wireless network, a public telephone network, an intranet, and any other type of network known to those skilled in the art.
It should also be noted that in one or more embodiments of the present disclosure, for ease of understanding, a merchant object is described as the object, and transaction data between a merchant and a transaction partner (buyer) is described in detail as interaction data. It should be appreciated that other scenarios involving interactions, such as social networking, intelligent transportation, etc., may also employ abnormal object recognition methods employing the present disclosure.
It is to be appreciated that the present disclosure involves data related to user information (e.g., interaction data such as merchant or buyer transaction data). When embodiments of the present disclosure are applied to a specific product or technology, acquisition of such data requires the user's approval or consent, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
As described above, how to efficiently and accurately identify abnormal objects in a multi-object interaction scene is an important basis for ensuring normal interaction. One approach may identify outliers based on a community discovery method, such as performing community discovery based on Fast Unfolding algorithm or an infomap algorithm, or the like.
FIG. 1B illustrates a flow diagram for identifying outliers based on a community discovery method. At least a portion of this process is performed by server 110 in fig. 1A.
First, information about known abnormal objects may be acquired. For example, such information may be derived from existing anomaly identification models (e.g., XGBoost models, logistic regression models, decision trees, gambling-detection models, fraud-detection models, etc.) applied to interaction data over a period of time, from user reports that passed auditing, from notifications by relevant authorities, and so on.
Then, an object network is constructed according to the mutual interaction opposite parties existing between all the objects, wherein the object network comprises abnormal objects and normal objects.
Then, community division is carried out on the object network by utilizing a community discovery algorithm so as to obtain a plurality of object communities.
Finally, because the information about the abnormal objects is known, objects highly correlated with the abnormal objects may themselves be abnormal but unlabeled. For each object community, the number of labeled abnormal objects, or their proportion within the community, can be determined. When the number or proportion of labeled abnormal objects in one or more communities is excessive, those communities are highly suspicious and can be pushed to reviewers, who audit the objects in them to determine whether other unlabeled abnormal objects exist, thereby identifying the abnormal objects.
By the method for identifying the abnormal users based on the community discovery algorithm, part of abnormal objects can be identified, and the accuracy of identifying the abnormal objects can be improved by further combining manual verification.
However, identifying abnormal users based on a community discovery algorithm requires manual participation. Considering labor cost, object communities with a small number or low proportion of labeled abnormal objects are not manually audited, so some abnormal objects that may exist in them can be missed. In addition, during community division this method relies only on interaction data (e.g., transaction data such as the identities of both parties, transaction amount, number of transactions, and transaction time in a payment scenario) to exclude some communities, and does not consider the attributes of the objects themselves (e.g., whether a merchant is authenticated, its class, registration duration, transaction area, or transaction scenario), which also leads to inaccurate recognition results. Therefore, the accuracy of identifying abnormal objects needs to be further improved.
Based on this, the present disclosure proposes a method of identifying abnormal objects based on a graph neural network model. After the communities are divided, node topology information between the nodes corresponding to the objects in each community is obtained, and the node features (representing the objects' attributes) together with this topology information are provided to the graph neural network model, so that both an object's own attributes and the information of other nodes are considered when identifying each object's category, making the recognition result more accurate.
A method of identifying an abnormal object according to an embodiment of the present disclosure is described below in conjunction with fig. 2-6.
Fig. 2 shows a flow diagram of a method of identifying an abnormal object according to an embodiment of the present disclosure. At least a portion of this process is performed by server 110 in fig. 1A.
In the context of the present disclosure, an abnormal object to be identified may refer to an object, among all the objects on which identification is to be performed, that is not labeled as abnormal but is in fact abnormal; for example, an unlabeled merchant, among a plurality of merchants, that is actually an abnormal merchant.
As shown in fig. 2, in step S210, an interaction-related object related to an abnormal object is determined based on the interaction data within the first preset time period, wherein the abnormal object is pre-labeled.
Alternatively, the interaction-related objects associated with each labeled abnormal object may be objects that have a common interaction partner with the abnormal object. For example, the interaction-related objects associated with each labeled abnormal merchant may be merchants that share a transaction buyer with the abnormal merchant. In the context of the present disclosure, a correlation between two objects may mean that the two objects have a common interaction partner.
For example, for each labeled abnormal object, the interaction partners that interacted with it within the first preset time period may be determined from the interaction data for that period; then, for each determined interaction partner, the other objects that interacted with that partner may be determined as interaction-related objects related to the abnormal object. Once each abnormal object's interaction-related objects are determined, the abnormal objects and their associated interaction-related objects can be used for the subsequent construction of the object network graph.
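The lookup described in this step can be sketched in a few lines (merchant and buyer identifiers are invented for illustration):

```python
# Hypothetical interaction records within the first preset time period:
# (merchant, buyer) pairs. Identifiers are invented for this sketch.
interactions = [
    ("m1", "b1"), ("m1", "b2"),
    ("m2", "b2"), ("m2", "b3"),
    ("m3", "b4"),
]
abnormal = {"m1"}  # pre-labeled abnormal merchants

# Index each interaction partner to the set of merchants it dealt with.
partner_to_merchants = {}
for merchant, buyer in interactions:
    partner_to_merchants.setdefault(buyer, set()).add(merchant)

# For each abnormal object, any other merchant sharing a partner is related.
related = set()
for merchant, buyer in interactions:
    if merchant in abnormal:
        related |= partner_to_merchants[buyer] - abnormal
```

Here "m2" is picked up as interaction-related because it shares buyer "b2" with the abnormal merchant "m1", while "m3" (no shared buyer) is excluded from the subsequent graph construction.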
Alternatively, the first preset time period may be a period of one month, one week, or any preset duration.
In addition, the labeled abnormal objects may be obtained from an existing anomaly identification model (e.g., an XGBoost model, logistic regression model, decision tree, gambling-detection model, fraud-detection model, etc.) applied to the interaction data within the first preset time period, from user reports that passed auditing, from notifications by relevant authorities, and so on.
Alternatively, the interaction data may include an identification of both parties to the interaction, the number of interactions, the time of the interaction, etc., without involving too many attribute features of the object. For example, the interaction data may be transaction data including identification of the merchant and the interaction partner (buyer), the amount of interaction, the number of interactions, the time of interaction, and the like.
In step S220, an object network graph is constructed based on the interaction data within the second preset time period, using all the abnormal objects and all the interaction-related objects. Each node in the object network graph represents an object, each edge represents a correlation between the two objects corresponding to the nodes at its two ends, and the weight of each edge represents the degree of that correlation.
The duration of the second preset time period may be the same as or different from that of the first. For example, the first preset time period may be one month and is used to determine which objects will be used to construct the object network graph, as described in step S210; the second preset time period may be half a year or one week, and the interaction data within it determines the interaction relationships among the selected objects, i.e., whether edges exist between the nodes in the object network graph and what their weights are.
The constructed object network graph is a topological structure comprising a plurality of nodes, each corresponding to one object (e.g., one merchant). If any two objects have a common interaction partner within the second preset time period, a directly connecting edge exists between their two nodes, and this edge may be weighted according to the number of common interaction partners. The correlations between the nodes (also referred to as node topology information) may be represented by an adjacency matrix.
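A minimal sketch of this construction, using the number of common interaction partners as the raw edge weight (merchant names and partner sets are invented; a real system could then normalize the counts into (0, 1] weights):

```python
from itertools import combinations

# Hypothetical interaction data within the second preset time period:
# merchant -> set of interaction partners (buyers). Data invented here.
partners = {
    "m1": {"b1", "b2", "b3"},
    "m2": {"b2", "b3"},
    "m3": {"b3"},
    "m4": {"b9"},
}

# An edge exists between two merchants iff they share a partner;
# the raw weight is the number of common partners.
edges = {}
for u, v in combinations(sorted(partners), 2):
    common = len(partners[u] & partners[v])
    if common > 0:
        edges[(u, v)] = common
```

In this toy data "m4" shares no partner with anyone, so it ends up as an isolated node with no edges, while ("m1", "m2") get the heaviest edge from their two shared buyers.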
In addition, the node feature of each node is a feature vector obtained by encoding the attributes of the corresponding object; for example, it may be obtained by encoding whether the merchant is authenticated, its interaction amount, level, registration duration, and the like.
The specific steps of constructing the object network graph will be described in detail with reference to fig. 3.
In step S230, the object network graph is subjected to community division to obtain at least one network sub-graph, where each network sub-graph corresponds to one object community obtained by division.
Alternatively, a community discovery algorithm may be used to divide the constructed object network graph into communities. For example, a modularity-based community division method may be used; the Fast Unfolding method will be described below with reference to fig. 4.
In step S240, the class of the object corresponding to the node in the at least one network sub-graph is predicted by using the graph neural network model, so as to identify an unlabeled abnormal object.
Instead of screening out only a portion of the communities for manual review, as described above with reference to fig. 1B, the solution proposed by the embodiment of the present disclosure provides the partitioned network subgraph corresponding to each object community to the graph neural network model to identify the classes of the nodes corresponding to at least a portion of the objects in the object community (the classes of the nodes corresponding to already-labeled abnormal objects need not be identified again).
The inputs of the graph neural network model may include node characteristics of individual nodes in each network sub-graph and node topology information for the network sub-graph, where the node topology information indicates whether an edge exists between any two nodes in the network sub-graph and a weight of the existing edge. The graph neural network model is a trained model, for example, as a classification model, that can be operated on input data to obtain a corresponding object class.
Optionally, for each network sub-graph, node characteristics of all nodes in the network sub-graph form a characteristic matrix as input, and the node topology information may be a sum of an adjacency matrix and an identity matrix of the network sub-graph, where the adjacency matrix may represent whether an edge exists between any two nodes in the network sub-graph and a weight of the edge, and the identity matrix may represent that the nodes in the network sub-graph have self-connected edges (a starting point and an end point of the edge are the same node).
The specific identification process will be described later with reference to fig. 5A-5B.
In the method for identifying abnormal objects described with reference to fig. 2, providing each divided object community to the graph neural network model to identify the classes of the nodes corresponding to its objects reduces labor cost; moreover, when the class of an object is evaluated using the graph neural network, both the attributes of the object itself and the information of its neighboring object nodes are utilized, which improves the recognition accuracy.
In order to more clearly describe the aspects of the embodiments of the present disclosure, the process of object network graph construction (S220) is described below with reference to fig. 3, the process of community division (step S230) is described with reference to fig. 4, and the process of object class identification (step S240) is described with reference to fig. 5A-5B.
As shown in fig. 3, in sub-step S220-1, any two objects of all abnormal objects and all interactive related objects are matched, resulting in a plurality of object matching pairs.
For example, assuming that the total number of objects, including all the abnormal objects and their associated interaction related objects, is S, then matching any two of them yields S·(S−1)/2 object matching pairs.
In sub-step S220-2, for each object matching pair, the respective interaction partners and the common interaction partner of the two objects of the object matching pair are determined based on the interaction data within the second preset period.
For example, for an object matching pair (A1, B1) among the S·(S−1)/2 object matching pairs, it can be determined, based on the interaction data of each object within the second preset period, that object A1 has s1 interaction partners, object B1 has s2 interaction partners, and objects A1 and B1 have s common interaction partners.
In sub-step S220-3, for each object matching pair whose number of common interaction partners is not zero, the weight of the edge between the nodes corresponding to the two objects is determined based on the numbers of interaction partners of each of the two objects of the object matching pair and the number of their common interaction partners.
For example, still for the object matching pair (A1, B1), the common interaction partner ratio = (2 × number of common interaction partners)/(number of interaction partners of object A1 + number of interaction partners of object B1) = 2s/(s1+s2); this ratio is the weight on the edge between the two nodes corresponding to objects A1 and B1. For example, if object A1 has 2 interaction partners, object B1 has 3 interaction partners, and the two objects have 2 common interaction partners, the common interaction partner ratio of the two objects is (2×2)/(2+3) = 4/5.
Of course, if two objects have no common interaction partner, i.e., s = 0, there is no directly connected edge between the nodes corresponding to the two objects; equivalently, the weight of such an edge may be regarded as zero.
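The pair matching and weighting of sub-steps S220-1 to S220-3 can be sketched as follows. This is an illustrative Python sketch; the `partners` mapping (object → set of its interaction partners within the second preset period) is an assumed input format, not part of the disclosure:

```python
from itertools import combinations

def build_edge_weights(partners):
    """partners: dict mapping each object to the set of its interaction
    partners within the second preset period.

    Returns a dict mapping each object pair that has at least one common
    interaction partner to the edge weight 2*s / (s1 + s2).
    """
    weights = {}
    # Matching any two of S objects yields S*(S-1)/2 candidate pairs.
    for a, b in combinations(sorted(partners), 2):
        common = partners[a] & partners[b]   # common interaction partners
        if not common:                       # s = 0 -> no directly connected edge
            continue
        weights[(a, b)] = 2 * len(common) / (len(partners[a]) + len(partners[b]))
    return weights
```

Usage: with A1 having partners {u1, u2}, B1 having partners {u1, u2, u3}, and 2 common partners between them, the weight of edge (A1, B1) is (2×2)/(2+3) = 0.8.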
In sub-step S220-4, the object network graph is constructed based on all the nodes corresponding to all the abnormal objects and all the interaction related objects and the determined weight of each edge.
For example, the topology of the object network graph includes a plurality of nodes, and edges having the determined weights are connected between the plurality of node pairs. The node topology information of the object network graph may indicate whether an edge exists between any two nodes and the weight of the existing edge.
With the method for constructing the object network graph of fig. 3, whether any two objects have a common interaction partner, and how many they have, can be determined based on the interaction data within the second preset period, so that the object network graph is constructed with simple operations and a small amount of computation.
Next, fig. 4 shows a process diagram of community division based on modularity, in which a Fast unfolding method is taken as an example for illustration.
The goal of community division is to make the connections inside each divided community tighter and the connections among communities sparser. The quality of a community division can be indicated by the modularity: the greater the modularity, the better the effect of the community division.
Modularity can be characterized by the following formula:

Q = (1/(2m)) · Σ_{i,j} [ A_{i,j} − (k_i · k_j)/(2m) ] · δ(c_i, c_j)

where Q is the modularity value; A_{i,j} represents the weight of the edge between any two nodes (node i and node j) in the object network graph constructed in the previous step; k_i represents the sum of the weights of all the edges connected to node i, and k_j represents the sum of the weights of all the edges connected to node j; 2m = Σ_{i,j} A_{i,j} is the total edge weight of the graph; c_i represents the community to which node i is assigned; and δ(c_i, c_j) is used to judge whether node i and node j are divided into the same community, returning 1 if so and 0 otherwise.
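The modularity can be evaluated directly from the weighted adjacency matrix; below is a minimal sketch, assuming a dense symmetric numpy adjacency matrix and one community label per node (a sketch for illustration, not an optimized implementation):

```python
import numpy as np

def modularity(A, communities):
    """Compute Q = (1/2m) * sum_{i,j} [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j).

    A: symmetric weighted adjacency matrix.
    communities: community label of each node, in row order of A.
    """
    k = A.sum(axis=1)        # k_i: sum of edge weights incident to node i
    two_m = A.sum()          # 2m: total edge weight, counted in both directions
    Q = 0.0
    n = len(communities)
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:   # delta(c_i, c_j) = 1
                Q += A[i, j] - k[i] * k[j] / two_m
    return Q / two_m
```

Usage: for a graph made of two disconnected node pairs, placing each pair in its own community gives Q = 0.5, while lumping all nodes into one community gives Q = 0.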
More specifically, the process of community division based on modularity may be implemented by the following process.
The community partitioning algorithm may include two phases.
First stage (modularity optimization stage): each node traverses all of its neighbor nodes and tries to place itself in the community of each neighbor. Since a node may have several neighbor nodes, there may be several candidate communities, and the community with the largest modularity increment among them is selected as the node's updated community. This process is performed for each node until the modularity can no longer be increased by changing the community in which any node is placed.
Second stage (community aggregation stage): each community is merged into a new super node, and the edge weight between two super nodes is the sum of the weights of all edges between the corresponding two original communities (one endpoint of each such edge is a node of one original community and the other endpoint is a node of the other original community), forming a new network.
The two phases are iterated until the modularity is no longer increased.
Fig. 4 shows one iteration of these two phases. Assuming there are currently 16 nodes (possibly after several earlier iterations), these 16 nodes are divided into 4 communities in the first phase (modularity optimization phase); in the second phase (community aggregation phase), the 4 communities are aggregated into 4 super nodes and the edge weights are updated. The next iteration is then entered, until communities whose modularity value no longer changes are finally obtained. Each node in the finally obtained communities is in fact a set of multiple original nodes, thereby realizing modularity-based community division.
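The second phase (community aggregation) can be sketched as follows; this minimal sketch assumes edges are stored as a dict from node pairs to weights and that `community` maps each node to its community label:

```python
def aggregate_communities(edge_weights, community):
    """Community aggregation phase: merge each community into a super node.

    The weight between two super nodes is the sum of the weights of all
    edges whose two endpoints lie in the two different original communities.
    """
    super_weights = {}
    for (i, j), w in edge_weights.items():
        ci, cj = community[i], community[j]
        if ci == cj:
            continue                      # intra-community edges are absorbed
        key = (min(ci, cj), max(ci, cj))  # undirected super-node pair
        super_weights[key] = super_weights.get(key, 0.0) + w
    return super_weights
```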
That is, the object network graph may be divided into a plurality of network sub-graphs by means of modularity, such that the connections between the network sub-graphs are sparse while the nodes within each network sub-graph are tightly connected; each network sub-graph corresponds to one object community and may include nodes corresponding to at least one object.
The method for dividing communities described with reference to fig. 4 can quickly and simply divide the object network graph constructed in step S220 into at least one network sub-graph (each corresponding to one object community) for the graph neural network model to identify the object class corresponding to the node therein.
Furthermore, fig. 5A-5B illustrate more details of how the graph neural network model identifies the object classes corresponding to the nodes of each resulting network subgraph.
As shown in fig. 5A, by way of example, it is assumed that with the four communities obtained in fig. 4, the modularity no longer increases, i.e., the division of fig. 4 is the optimal community division. Thus, four network subgraphs (1-4) are obtained and are respectively provided to the graph neural network model; for each network subgraph, the object classes corresponding to its nodes can be obtained (only unlabeled nodes may be identified, or all nodes may be identified).
Further, as shown in fig. 5B, the step S240 for identifying a category may specifically include the following sub-steps.
In sub-step S240-1, an initial node characteristic for each node is generated based on the attributes of the object corresponding to each node in the network sub-graph.
For example, the network subgraph includes N1 nodes (some of them may be labeled nodes corresponding to abnormal objects, and the others may be unlabeled nodes), and an initial node feature is generated for each of them. Here, since the node features of nodes connected by edges need to be aggregated when generating the output feature of each node at each layer, as described later, the node features of all nodes need to be obtained.
In sub-step S240-2, using the graph neural network model, an object class corresponding to each unlabeled node is obtained based on the initial node characteristics of each node and the node topology information of the network subgraph.
Alternatively, only the object categories corresponding to the unlabeled nodes of the network sub-graph may be predicted, or the object categories corresponding to all the nodes of the network sub-graph may be predicted.
Optionally, the graph neural network model includes at least one graph neural network layer, wherein in each graph neural network layer, the output characteristic of the current layer for each node is updated based on the initial node characteristic of each node or the output characteristic of the previous layer for each node, the node topology information of the network subgraph, and the parameters of the current layer.
By way of example and not limitation, each graph neural network layer may include an aggregation sub-layer and an update sub-layer. For example, the aggregation sub-layer may aggregate, based on the node topology information of the network sub-graph, the previous layer's output features of the neighboring nodes (directly connected nodes) of each node, and the update sub-layer may obtain the current layer's output feature of each node based on the previous layer's output feature of the node, the aggregated feature of the output features of its neighboring nodes, and the parameters of the current layer.
For example, the manner of data transfer for each layer may be described as:

H^(l+1) = σ( D̃^(−1/2) · Ã · D̃^(−1/2) · H^(l) · W^(l) )

where σ(·) is an activation function; Ã = A + I_N is the node topology information, A being the adjacency matrix of the network sub-graph (an undirected graph) and I_N being an identity matrix indicating that each node has a self-connection; D̃ is the degree matrix of Ã, a diagonal matrix whose element D̃_{i,i} = Σ_j Ã_{i,j} represents the sum of the weights on the edges between node i and the other nodes j in the network subgraph (including the self-connection); H^(l) is the output matrix of the l-th layer (including the output features of all nodes); and W^(l) is the model parameter matrix of the l-th layer.
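This layer-wise propagation rule can be written almost verbatim in numpy; the following minimal sketch uses ReLU as the activation σ (the choice of activation is an assumption for illustration):

```python
import numpy as np

def gcn_layer(H, A, W):
    """One propagation step: H_next = relu(D^{-1/2} (A + I) D^{-1/2} H W).

    H: node feature/output matrix of the previous layer (one row per node).
    A: weighted adjacency matrix of the (undirected) network subgraph.
    W: model parameter matrix of the current layer.
    """
    A_tilde = A + np.eye(A.shape[0])        # add self-connections (A + I_N)
    d = A_tilde.sum(axis=1)                 # D_ii = sum_j A_tilde_ij
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
    return np.maximum(0.0, D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W)
```

Stacking several such calls propagates multi-order neighborhood information, one hop per layer.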
For example, suppose node A is directly connected to nodes B and C, and B and C are not connected (which may be expressed in the node topology information). In the node characterization process for node A (other nodes are similar): at the first layer, the output feature of node A, h_A^(1), aggregates the initial node feature of node A (labeled), h_A^(0), and the initial node features of the neighboring nodes B and C, h_B^(0) and h_C^(0); at the second layer, the output feature of node A, h_A^(2), aggregates the first-layer output feature of node A (labeled), h_A^(1), and the first-layer output features of the neighboring nodes B and C (unlabeled), h_B^(1) and h_C^(1). The initial feature vectors of the three nodes, h_A^(0), h_B^(0) and h_C^(0), may be combined into a feature matrix H^(0); the output matrix of the first layer, H^(1), is composed of h_A^(1), h_B^(1) and h_C^(1); and the output matrix of the second layer, H^(2), is composed of h_A^(2), h_B^(2) and h_C^(2).
Optionally, each layer of the graph neural network model only processes first-order neighborhood information (the features of directly connected nodes are aggregated), but multi-order neighborhood information transfer can be achieved by stacking several layers. In the embodiment of the disclosure, even a single graph neural network layer can achieve a good recognition effect while improving computational efficiency.
Optionally, the process of predicting the class of the object corresponding to each unlabeled node (and, optionally, of each labeled node as well) may include: inputting the last layer's output feature for each unlabeled node into an output control function, determining the prediction probabilities that the node is predicted to be of different object classes, and determining the object class corresponding to the node based on those prediction probabilities. Optionally, the object classes may include an abnormal object class and a normal object class. Of course, more finely divided classes are also possible, depending on the actual labels used at training time.
Alternatively, as described above, the object categories corresponding to all nodes may be predicted.
For example, the output control function may be a softmax function; the output representation (output feature, output embedding) of each node is input into the softmax function, which yields, for each class, a value in (0, 1) as the prediction probability of that class (e.g., normal object or abnormal object).
The object class with the highest prediction probability can be determined as the class of the object corresponding to the node.
Alternatively, different subsequent processing may be performed depending on the prediction probability with which each node is predicted to be an abnormal object, for example, hierarchical control according to the prediction probability value. For each node, if the prediction probability that the corresponding object is an abnormal object falls within a first threshold range (a higher value range), the object may be determined to be an abnormal object and its payment authority may be closed directly; if it falls within a second threshold range (an intermediate value range), the node may be pushed for manual verification; and if it falls within a third threshold range (a lower value range), the object is determined to be a normal object and no processing is performed. For example, the first, second and third threshold ranges are preset and correspond to three consecutive, non-overlapping ranges in [0, 1], from high to low.
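The softmax scoring and threshold tiering described above can be sketched as follows. The threshold values and the class ordering (index 1 = abnormal class) are illustrative assumptions, not values from the disclosure:

```python
import numpy as np

def triage(output_feature, review_low=0.3, review_high=0.8):
    """Map a node's last-layer output feature to a follow-up action.

    Softmax turns the feature into class probabilities; the probability of
    the abnormal class (assumed to sit at index 1) is then compared against
    three consecutive threshold ranges. Thresholds are hypothetical.
    """
    exp = np.exp(output_feature - output_feature.max())  # numerically stable softmax
    p_abnormal = (exp / exp.sum())[1]
    if p_abnormal >= review_high:    # first (higher) threshold range
        return "close_payment_permission"
    if p_abnormal >= review_low:     # second (intermediate) threshold range
        return "manual_review"
    return "no_action"               # third (lower) threshold range
```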
Alternatively, the graph neural network model may be a graph convolutional network (GCN) model, a GraphSAGE model, a Node2vec model, or the like.
By the method for identifying abnormal objects described with reference to figs. 2-5, providing each divided object community to the graph neural network model to identify the classes of the nodes corresponding to its objects reduces labor cost; moreover, when the class of an object is evaluated with the graph neural network, both the attributes of the object itself and the information of its neighboring object nodes are utilized, which improves the recognition accuracy. In addition, performing different subsequent processing according to the prediction probability that each node corresponds to an abnormal object combines manual verification with the prediction of the graph neural network model, improving both efficiency and recognition accuracy.
Furthermore, the neural network model needs to be trained to perform the recognition process described above. By way of example and not limitation, in embodiments of the present disclosure, the training of the graph neural network may be performed in the manner described with reference to FIG. 6.
As shown in fig. 6, in step S610, information of a sample graph structure is acquired, wherein the information of the sample graph structure includes: node characteristics of all sample nodes of the sample graph structure and object labels of at least a portion of the sample nodes, and sample node topology information of the sample graph structure.
Similarly, the sample node topology information indicates whether there is an edge between any two nodes in the sample graph structure and the weight of the edge that is present, and the sample node topology information may be the sum of the adjacency matrix and the identity matrix, which is also a matrix.
The object tags may also be abnormal objects and normal objects.
In step S620, supervised training or semi-supervised training is performed on the graph neural network model based on the information of the sample graph structure and the loss function.
Optionally, if the labels of all sample nodes in the sample graph structure are known, or the labels of a portion of the sample nodes are known (including both abnormal and normal object labels, i.e., positive and negative samples), supervised training may be performed using the sample nodes with known labels. In supervised training, the graph neural network model is used to obtain, based on the node features of each labeled sample node and the sample node topology information, the prediction probability that each sample node is a normal object or an abnormal object; a loss value is then computed with the loss function based on the prediction probabilities of each labeled sample node for the respective object classes, and the parameters of the model are adjusted by minimizing the loss value until the model parameters converge.
Optionally, the loss function may be a cross entropy function, as shown in the following formula:

L = −(1/N) · Σ_{k=1}^{N} Σ_{i=1}^{C} p_{k,i} · log(q_{k,i})

where N is the number of sample nodes; C is the number of classification categories (for example, C = 2 if the abnormal object and normal object categories are included); p_{k,i} is the true value (1 or 0) of whether node k belongs to category i; and q_{k,i} is the probability with which the model predicts that node k belongs to category i.
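The cross entropy loss is straightforward to evaluate; here is a minimal numpy sketch over one-hot true labels `p` and predicted probabilities `q` (both N×C matrices):

```python
import numpy as np

def cross_entropy(p, q):
    """L = -(1/N) * sum_k sum_i p_ki * log(q_ki).

    p: one-hot true labels, shape (N, C).
    q: predicted class probabilities, shape (N, C), entries in (0, 1].
    """
    return float(-np.mean(np.sum(p * np.log(q), axis=1)))
```

Usage: for nodes whose predicted distribution is uniform over two classes, the loss is log 2 ≈ 0.693 regardless of the true labels.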
Alternatively, if only the labels of some of the sample nodes in the sample graph structure are known, i.e., the sample node corpus of the sample graph structure may include a first sample node set with labels and a second sample node set without labels, and semi-supervised training with other unlabeled sample nodes is desired, then a self-training approach may be used.
First, a first model is obtained by performing supervised training on a graph neural network model by using a first sample node set. The manner of supervised training is similar to that described above.
Then, the following operations are iteratively performed until the number of nodes in the training sample node set for training the first model reaches a preset number and the model parameters of the first model converge.
Operation i), predicting unmarked sample nodes in the second sample node set by using the first model, selecting an extended sample node set from the unmarked sample nodes in the second sample node set based on the prediction result, and updating the second sample node set (i.e. the second sample node set no longer comprises the sample nodes in the extended sample node set selected at this time).
Optionally, predicting the unlabeled nodes in the second sample node set by using the first model to obtain a prediction label and a corresponding confidence coefficient of each unlabeled node in the second sample node set; and selecting sample nodes with the confidence meeting the preset condition from the second sample node set based on the confidence corresponding to the prediction label of each unlabeled sample node in the second sample node set, so as to obtain an expanded sample node set.
For example, for each class of labels, M sample nodes with highest confidence are selected from the sample nodes with the class of predictive labels, and an extended node set is derived based on the M sample nodes and the class of labels.
Operation ii), taking the first sample node set and the extended sample node set together as the training sample node set, and training the first model with these training sample nodes to update the model parameters of the first model.
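The node-selection rule of operation i) — for each class of label, keep the M most confident pseudo-labeled nodes — can be sketched as follows; the prediction format (node → (label, confidence)) is an assumed representation for illustration:

```python
def select_expansion(predictions, m):
    """predictions: dict node -> (predicted_label, confidence).

    Returns, for each label, the m nodes with the highest confidence;
    these form the extended sample node set for the next training round.
    """
    by_label = {}
    for node, (label, conf) in predictions.items():
        by_label.setdefault(label, []).append((conf, node))
    expansion = {}
    for label, scored in by_label.items():
        scored.sort(reverse=True)  # most confident first
        expansion[label] = [node for _, node in scored[:m]]
    return expansion
```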
By the training method of the graph neural network model described with reference to fig. 6, a model for identifying object classes can be obtained, and supervised training or semi-supervised training can be selected according to whether or not the unlabeled sample nodes in the sample graph structure are required to be utilized.
According to another aspect of the present disclosure, there is also provided an apparatus for identifying an abnormal object.
Fig. 7A-7B show block diagrams of a device for identifying an abnormal object according to an embodiment of the present disclosure.
As shown in fig. 7A, the apparatus 700 includes a determination module 710, a network construction module 720, a partitioning module 730, and an identification module 740.
The determining module 710 is configured to determine an interaction related object related to an abnormal object based on the interaction data within the first preset time period, wherein the abnormal object is pre-annotated.
Optionally, the determination module 710, when determining the interaction related object, is configured to: and determining the interaction opposite party of each abnormal object and other interaction objects of each interaction opposite party in the first preset time period based on the interaction data in the first preset time period, wherein the other interaction objects of the interaction opposite party of each abnormal object are used as interaction related objects related to the abnormal object.
The network construction module 720 is configured to construct an object network graph based on the interaction data in the second preset period and using the abnormal object and the interaction related object, where each node in the object network graph represents an object, each edge represents that two objects corresponding to two end nodes of the object are related, and the weight of each edge represents the degree of correlation between two objects corresponding to two end nodes of the object, and the duration of the second preset period is the same as or different from the duration of the first preset period.
Optionally, the object network graph constructed by the network construction module 720 is a topology structure including a plurality of nodes, each node corresponds to an object, and if any two objects have a common interaction partner within a second preset period, there is an edge directly connected between two nodes corresponding to the two objects, and the edge may have a weight according to the number of the common interaction partners. The correlation between individual nodes (also referred to as node topology information) may be represented by an adjacency matrix.
The partitioning module 730 is configured to perform community partitioning on the object network graph to obtain at least one network sub-graph, where each network sub-graph corresponds to one object community obtained by partitioning.
Alternatively, community discovery algorithms may be utilized to community the constructed object network graph, for example, methods based on modularity may be utilized such as Fast unfolding algorithms.
The identifying module 740 is configured to predict, using the graph neural network model, an object class corresponding to a node in the at least one network sub-graph, so as to identify an abnormal object.
The inputs of the graph neural network model may include node characteristics of individual nodes in each network sub-graph and node topology information for the network sub-graph, where the node topology information indicates whether an edge exists between any two nodes in the network sub-graph and a weight of the existing edge. The graph neural network model is a trained model, for example, as a classification model, that can be operated on input data to obtain a corresponding object class.
Optionally, the apparatus 700 may further include a training module that may be used to train the graph neural network model. However, in other embodiments, the training module may be external to the apparatus 700, and the apparatus 700 obtains a trained neural network model from the training module for use in the abnormal object recognition process.
For example, the training module may be configured to obtain information of a sample graph structure, wherein the information of the sample graph structure comprises: node characteristics of all sample nodes of a sample graph structure, object labels of at least a portion of the sample nodes, and sample node topology information in the sample graph structure; and performing supervised training or semi-supervised training on the graph neural network model based on the information of the sample graph structure and the loss function.
Specific details of supervised or semi-supervised training have been described above and thus are not repeated here.
It should be noted that the apparatus 700 is divided into a plurality of modules according to the operations performed and sub-modules are further divided as described below, but those skilled in the art will appreciate that the apparatus 700 may include more or less modules, each module may include more or less sub-modules according to different manners, and the present disclosure is not limited thereto as long as the various functions described can be implemented.
Further, as shown in FIG. 7B, the network construction module 720 may include a matching sub-module 720-1, a statistics sub-module 720-2, a weight determination sub-module 720-3, and a construction sub-module 720-4.
The matching sub-module 720-1 is configured to match any two objects of all the abnormal objects and all the interactive related objects, so as to obtain a plurality of object matching pairs.
The statistics sub-module 720-2 is configured to determine, for each object matching pair, an interaction counterpart and a common interaction counterpart of each of the two objects of the object matching pair based on the interaction data within the second preset period.
The weight determining sub-module 720-3 is configured to determine, for each object matching pair whose number of common interaction partners is not zero, the weight of the edge between the nodes corresponding to the two objects based on the numbers of interaction partners of each of the two objects of the object matching pair and the number of their common interaction partners.
The construction sub-module 720-4 is configured to construct an object network graph based on all nodes corresponding to all abnormal objects and all interactive related objects and the determined weight of each edge.
Further details of the operation of the various sub-modules in the network construction module 720 have been described in detail above with reference to the portion of fig. 3 and are therefore not repeated here.
In addition, as shown in FIG. 7B, the identification module 740 for predicting the object class corresponding to the node in at least one network sub-graph may include a feature generation sub-module 740-1 and an identification sub-module 740-2.
The feature generation sub-module 740-1 is configured to generate, for each network sub-graph, an initial node feature of each node based on an attribute of an object corresponding to each node in the network sub-graph.
For example, the network subgraph includes N1 nodes (some of them may be labeled nodes corresponding to abnormal objects, and the others may be unlabeled nodes), and an initial node feature is generated for each of them. Here, since the node features of nodes connected by edges need to be aggregated when generating the output feature of each node at each layer, as described later, the node features of all nodes need to be obtained.
The identifying sub-module 740-2 is configured to obtain, for each network sub-graph, an object class corresponding to each unlabeled node based on the initial node characteristic of each node and node topology information of the network sub-graph by using the graph neural network model, where the node topology information indicates whether an edge exists between any two nodes in the network sub-graph and a weight of the existing edge.
Alternatively, only the object categories corresponding to the unlabeled nodes of the network sub-graph may be predicted, or the object categories corresponding to all the nodes of the network sub-graph may be predicted.
Optionally, the graph neural network model includes at least one graph neural network layer, wherein in each graph neural network layer, the output characteristic of the current layer for each node is updated based on the initial node characteristic of each node or the output characteristic of the previous layer for each node, the node topology information of the network subgraph, and the parameters of the current layer.
By way of example and not limitation, each graph neural network layer may include an aggregation sub-layer and an update sub-layer. For example, the aggregation sub-layer may aggregate, based on the node topology information of the network sub-graph, the previous layer's output features of the neighboring nodes (directly connected nodes) of each node, and the update sub-layer may obtain the current layer's output feature of each node based on the previous layer's output feature of the node, the aggregated feature of the output features of its neighboring nodes, and the parameters of the current layer.
In addition, the operation of the recognition sub-module 740-2 in predicting the object class corresponding to each unlabeled node may similarly include: inputting the output characteristics of the last layer for each unlabeled node into an output control function, determining the prediction probability that each unlabeled node is predicted to be of different object categories, and determining the object category corresponding to each unlabeled node based on those prediction probabilities.
Alternatively, the object categories may include an abnormal object category and a normal object category. Of course, more finely divided categories are also possible, depending on the actual labels used at training time.
Alternatively, as described above, the object categories corresponding to all nodes may be predicted.
For example, the output control function may be a softmax function, into which the output representation (output feature, output embedding) of each node is input; it yields a value in (0, 1) for each class as the prediction probability of that class (e.g., normal object or abnormal object).
The object class with the highest prediction probability can be determined as the class of the object corresponding to the node.
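The softmax output control function and the highest-probability class selection can be sketched as follows; the logit values are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability; each row sums to 1.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Last-layer output features for two unlabeled nodes over two classes
# (index 0: normal object, index 1: abnormal object) -- illustrative values.
logits = np.array([[2.0, -1.0],
                   [0.5,  3.0]])
probs = softmax(logits)          # every entry lies in (0, 1)
classes = probs.argmax(axis=1)   # class with the highest prediction probability
```

Here the first node would be classified as a normal object and the second as an abnormal object.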
Alternatively, different subsequent processing may be performed depending on the prediction probability that each node is an abnormal object, for example, hierarchical control according to the probability value. For each node, if the prediction probability that its object is abnormal falls within a first threshold range (a higher range of values), the object corresponding to the node may be determined to be an abnormal object and its payment authority may be closed directly; if it falls within a second threshold range (an intermediate range of values), the node may be pushed for manual verification; and if it falls within a third threshold range (a lower range of values), the object corresponding to the node is determined to be a normal object and no processing is performed. For example, the first threshold range, the second threshold range, and the third threshold range are preset, consecutive, non-overlapping sub-ranges of [0, 1].
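The hierarchical control described above can be sketched as follows. The cut-off values are illustrative assumptions, not values given in the disclosure, and the returned action names are placeholders.

```python
def handle_prediction(p_abnormal,
                      high=(0.9, 1.0),   # illustrative first threshold range
                      mid=(0.6, 0.9),    # illustrative second threshold range
                      low=(0.0, 0.6)):   # illustrative third threshold range
    """Route a node according to its predicted abnormal-object probability.

    The three ranges are assumed to be consecutive, non-overlapping
    sub-ranges of [0, 1].
    """
    if high[0] <= p_abnormal <= high[1]:
        return "close_payment_authority"       # treat as abnormal object
    if mid[0] <= p_abnormal < mid[1]:
        return "push_for_manual_verification"  # intermediate confidence
    return "no_action"                          # treat as normal object
```

Routing only the intermediate-confidence nodes to manual verification is what allows the manual effort to be concentrated where the model is least certain.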
Further details of the operation of the various sub-modules in the network construction module 720 have been described in detail above with reference to the portion of fig. 3 and are therefore not repeated here.
With the apparatus for identifying abnormal objects described with reference to figs. 7A-7B, each object community obtained by the dividing module is provided to the identifying module, which uses the graph neural network model to identify the class of the node corresponding to each object; this reduces labor cost. Moreover, when the class of an object is evaluated with the graph neural network, both the attributes of the object and the information of neighboring object nodes are utilized, so the accuracy of identification can be improved. In addition, by performing different subsequent processing according to the prediction probability that each node is an abnormal object, manual verification can be combined with the prediction of the graph neural network model to improve both efficiency and recognition accuracy.
According to yet another aspect of the disclosure, a computing device is also disclosed.
Fig. 8 shows a schematic block diagram of a computing device 800 according to an embodiment of the disclosure.
As shown in fig. 8, computing device 800 includes a processor, memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computing device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to carry out the various operations described in the steps of the method of identifying an abnormal object as previously described. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the various operations described in the steps of the same method of identifying an abnormal object.
For example, operations of a method of identifying an abnormal object may include: determining interaction related objects related to the abnormal objects based on the interaction data in the first preset period, wherein the abnormal objects are pre-marked; constructing an object network diagram by using abnormal objects and interactive related objects based on interactive data in a second preset period, wherein each node in the object network diagram represents an object, each side represents the correlation of two objects corresponding to two end nodes of the object network diagram, the weight of each side represents the correlation degree between the two objects corresponding to the two end nodes of the object network diagram, and the duration of the second preset period is the same as or different from the duration of the first preset period; performing community division on the object network graph to obtain at least one network sub-graph, wherein each network sub-graph corresponds to one object community obtained by division; and predicting the object category corresponding to the node in at least one network sub-graph by utilizing the graph neural network model so as to identify the abnormal object. Further details of each step have been described in detail above and are therefore not repeated here.
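The object-network construction step summarized above can be sketched as follows. The edge weight here uses a Jaccard-style ratio of common interaction partners, which is one plausible reading of "based on the number of the interaction partners of the two objects and the number of common interaction partners"; it is not the disclosure's exact formula, and all names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

def build_object_network(interactions, objects):
    """Sketch of constructing the weighted object network graph.

    interactions : iterable of (object, partner) pairs observed within
                   the second preset period
    objects      : the abnormal objects plus the interaction-related objects
    Returns a dict mapping an object pair to its edge weight.
    """
    partners = defaultdict(set)
    for obj, partner in interactions:
        partners[obj].add(partner)

    edges = {}
    for a, b in combinations(objects, 2):   # match any two objects
        common = partners[a] & partners[b]
        if common:                           # only pairs with common partners
            edges[(a, b)] = len(common) / len(partners[a] | partners[b])
    return edges
```

The resulting weighted graph is then community-divided (e.g., by optimizing modularity), and each sub-graph is fed to the graph neural network model for node classification.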
The processor may be an integrated circuit chip with signal processing capabilities. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present disclosure. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like, and may be of the X86 architecture or ARM architecture.
The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. It should be noted that the memory of the systems and methods described in this disclosure is intended to comprise, without being limited to, these and any other suitable types of memory.
The display screen of the computing device may be a liquid crystal display or an electronic ink display. The input device of the computing device may be a touch layer covering the display screen, a key, trackball, or touchpad provided on the housing, or an external keyboard, touchpad, mouse, or the like.
The computing device may be a terminal or a server. The terminal may include, but is not limited to: smart phones, tablet computers, notebook computers, desktop computers, smart televisions, etc.; a wide variety of clients (apps) may run within the terminal, such as multimedia playback clients, social clients, browser clients, information flow clients, educational clients, and so forth. The server may be the server described with reference to fig. 1A, which may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and big data and artificial intelligence platforms.
According to another aspect of the present disclosure, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of identifying an abnormal object as described above.
According to yet another aspect of the present disclosure, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of identifying an abnormal object as described hereinbefore.
It should be noted that the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and apparatus according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises at least one executable instruction for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The exemplary embodiments of the present disclosure described in detail above are illustrative only and are not limiting. Those skilled in the art will understand that various modifications and combinations of these embodiments or features thereof may be made without departing from the principles and spirit of the disclosure, and such modifications should fall within the scope of the disclosure.

Claims (16)

1. A method of identifying an abnormal object, comprising:
determining an interaction related object related to the abnormal object based on the interaction data in the first preset period, wherein the abnormal object is pre-marked;
constructing an object network diagram by using the abnormal objects and the interactive related objects based on the interactive data in a second preset period, wherein each node in the object network diagram represents an object, each side represents the correlation between two objects corresponding to the nodes at the two ends of the object network diagram, the weight of each side represents the correlation degree between the two objects corresponding to the nodes at the two ends of the object network diagram, and the duration of the second preset period is the same as or different from the duration of the first preset period;
performing community division on the object network graph to obtain at least one network sub-graph, wherein each network sub-graph corresponds to one object community obtained by division; and
and predicting the object category corresponding to the node in the at least one network subgraph by using the graph neural network model so as to identify the unlabeled abnormal object.
2. The method of claim 1, wherein determining an interaction related object related to the abnormal object based on the interaction data within the first preset time period comprises:
and determining the interaction opposite party of each abnormal object and other interaction objects of each interaction opposite party based on the interaction data in the first preset period, wherein the other interaction objects of the interaction opposite party of each abnormal object are used as interaction related objects related to the abnormal object.
3. The method of claim 2, wherein constructing an object network graph based on interaction data within a second preset period and using the abnormal object and the interaction related object comprises:
matching any two objects in all abnormal objects and all interactive related objects to obtain a plurality of object matching pairs;
for each object matching pair, determining respective interaction partners and common interaction partners of two objects of the object matching pair based on the interaction data in the second preset period;
for each object matching pair with the number of common interaction partners being different from zero, determining the weight of the edge between the nodes corresponding to the two objects based on the number of the interaction partners of the two objects of the object matching pair and the number of the common interaction partners; and
and constructing the object network graph based on all nodes corresponding to all abnormal objects and all interactive related objects and the determined weight of each edge.
4. The method of claim 3, wherein community partitioning the object network graph to obtain at least one network sub-graph comprises:
performing community division on the object network graph based on modularity to obtain the at least one network sub-graph,
wherein the value of the modularity indicates the degree to which the object network graph is divided into communities, and is related to the weight of each edge in the object network graph, the weight of all edges connected to each node, and whether each pair of nodes belongs to the same divided object community.
5. The method of claim 1, wherein predicting, using a graph neural network model, an object class corresponding to a node in the at least one network subgraph comprises: for each network sub-graph,
generating initial node characteristics of each node based on the attribute of the object corresponding to each node in the network subgraph; and
predicting the object category corresponding to each unlabeled node based on the initial node characteristics of each node and the node topology information of the network subgraph by utilizing the graph neural network model,
Wherein the node topology information indicates whether an edge exists between any two nodes in the network subgraph and the weight of the existing edge.
6. The method of claim 5, wherein the graph neural network model includes at least one graph neural network layer,
in each graph neural network layer, based on the initial node characteristics of each node or the output characteristics of the previous layer aiming at each node, the node topology information of the network subgraph and the parameters of the current layer, the output characteristics of the current layer aiming at each node are updated.
7. The method of claim 6, wherein predicting, with the graph neural network model, an object class corresponding to a node in the at least one network subgraph, further comprises:
providing the output characteristics of the last layer aiming at each unlabeled node to an output control function, and determining the prediction probability of each unlabeled node predicted to be of different object categories; and
and determining the object category corresponding to each unlabeled node based on the prediction probability that each unlabeled node is predicted to be of different object categories.
8. The method of claim 7, wherein determining the object class to which each unlabeled node corresponds based on a predictive probability that the unlabeled node is predicted to be a different object class comprises:
determining that the object corresponding to the unlabeled node is an abnormal object under the condition that the prediction probability of the unlabeled node being the abnormal object class is in a first threshold range;
pushing the unlabeled node to perform anomaly checking when the prediction probability of the unlabeled node being the abnormal object class is in a second threshold range; and
under the condition that the prediction probability of the unlabeled node as the abnormal object class is in a third threshold range, determining that the object corresponding to the unlabeled node is a normal object,
wherein the first threshold range, the second threshold range, and the third threshold range do not overlap and increase in sequence.
9. The method according to any of claims 6-8, wherein the node topology information of the network sub-graph is a sum of an adjacency matrix and an identity matrix of the network sub-graph.
10. The method of claim 1, wherein the graph neural network model is derived by:
obtaining information of a sample graph structure, wherein the information of the sample graph structure comprises: node characteristics of all sample nodes of a sample graph structure, object labels of at least a portion of the sample nodes, and sample node topology information in the sample graph structure; and
and performing supervised training or semi-supervised training on the graph neural network model based on the information of the sample graph structure and the loss function.
11. An apparatus for identifying an abnormal object, comprising:
the determining module is used for determining interaction related objects related to the abnormal objects based on the interaction data in the first preset period, wherein the abnormal objects are marked in advance;
the network construction module is used for constructing an object network diagram based on interaction data in a second preset period and by utilizing the abnormal objects and the interaction related objects, wherein each node in the object network diagram represents an object, each side represents that two objects corresponding to two end nodes of the object are related, the weight of each side represents the degree of correlation between the two objects corresponding to the two end nodes of the object, and the duration of the second preset period is the same as or different from the duration of the first preset period;
the division module is used for carrying out community division on the object network graph to obtain at least one network sub-graph, wherein each network sub-graph corresponds to one object community obtained by division; and
and the identification module is used for predicting the object category corresponding to the node in the at least one network subgraph by utilizing the graph neural network model so as to identify the abnormal object.
12. The apparatus of claim 11, wherein the determination module, when determining the interactively related object, is configured to:
and determining the interaction opposite party of each abnormal object and other interaction objects of each interaction opposite party in the first preset time period based on the interaction data in the first preset time period, wherein the other interaction objects of the interaction opposite party of each abnormal object are used as interaction related objects related to the abnormal object.
13. The apparatus of claim 12, wherein the network construction module comprises:
the matching sub-module is used for matching any two objects in all abnormal objects and all interactive related objects to obtain a plurality of object matching pairs;
the statistics sub-module is used for determining respective interaction partners and common interaction partners of the two objects of the object matching pair according to the interaction data in the second preset period of time for each object matching pair;
the weight determining sub-module is used for determining the weight of the edge between the nodes corresponding to the two objects according to the number of the interaction partners of the two objects of the object matching pair and the number of the common interaction partners aiming at each object matching pair with the number of the common interaction partners being different from zero; and
and the construction sub-module is used for constructing the object network graph based on all the abnormal objects, all the nodes corresponding to all the interactive related objects and the determined weight of each edge.
14. The apparatus of claim 11, wherein the identification module comprises:
the characteristic generation sub-module is used for generating initial node characteristics of each node based on the attribute of the object corresponding to each node in each network sub-graph; and
an identification sub-module, configured to obtain, for each network sub-graph, an object class corresponding to each unlabeled node based on the initial node characteristic of each node and node topology information of the network sub-graph by using the graph neural network model,
wherein the node topology information indicates whether an edge exists between any two nodes in the network subgraph and the weight of the existing edge.
15. A computing device, comprising:
a processor; and
a memory having stored thereon a computer program which, when executed by the processor, causes the processor to perform the respective steps of the method of any of claims 1-10.
16. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of any of claims 1-10.
CN202111362003.6A 2021-11-17 2021-11-17 Abnormal object identification method, device, computing equipment and storage medium Pending CN116150429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111362003.6A CN116150429A (en) 2021-11-17 2021-11-17 Abnormal object identification method, device, computing equipment and storage medium


Publications (1)

Publication Number Publication Date
CN116150429A true CN116150429A (en) 2023-05-23

Family

ID=86349339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111362003.6A Pending CN116150429A (en) 2021-11-17 2021-11-17 Abnormal object identification method, device, computing equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116150429A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117648588A (en) * 2024-01-29 2024-03-05 和尘自仪(嘉兴)科技有限公司 Meteorological radar parameter anomaly identification method based on correlation network graph cluster analysis
CN117648588B (en) * 2024-01-29 2024-04-26 和尘自仪(嘉兴)科技有限公司 Meteorological radar parameter anomaly identification method based on correlation network graph cluster analysis


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40087221

Country of ref document: HK