CN113609345B - Target object association method and device, computing equipment and storage medium - Google Patents

Publication number
CN113609345B
CN113609345B CN202111157861.7A CN202111157861A CN113609345B CN 113609345 B CN113609345 B CN 113609345B CN 202111157861 A CN202111157861 A CN 202111157861A CN 113609345 B CN113609345 B CN 113609345B
Authority
CN
China
Prior art keywords
nodes
node
graph
subgraph
objects
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111157861.7A
Other languages
Chinese (zh)
Other versions
CN113609345A (en
Inventor
吴成龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111157861.7A priority Critical patent/CN113609345B/en
Publication of CN113609345A publication Critical patent/CN113609345A/en
Application granted granted Critical
Publication of CN113609345B publication Critical patent/CN113609345B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The application describes a target object association method, comprising: acquiring a network graph; initializing a first subgraph and a second subgraph and iteratively updating them until the second subgraph contains no key nodes; and associating the target objects corresponding to the nodes in the first subgraph. The updating step comprises: determining a core node, which is a node in the network graph whose association degree with at least one key node in the first subgraph is greater than zero and whose sum of association degrees with all key nodes in the second subgraph is the largest; determining a node set comprising the nodes in the second subgraph whose association degree with the core node is greater than zero and the node in the first subgraph having the largest association degree with the core node; determining the node set, the core node, and the edges of the network graph connecting the core node to the nodes in the node set as a third subgraph; and adding the third subgraph to the first subgraph and removing it from the second subgraph. Embodiments of the invention can be applied to scenarios such as cloud technology, artificial intelligence, intelligent transportation, network security, and object recommendation.

Description

Target object association method and device, computing equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a target object association method and apparatus, a computing device, and a storage medium.
Background
With the development and application of internet technology, technologies such as network security and object recommendation are widely used. Usage scenarios of network security technology include, for example, associating a malicious user with that user's bank account, or associating a fraudulent user with a merchant suspected of laundering money for that user. Usage scenarios of object recommendation technology include, for example, associating an article author with the author's high-frequency readers, or associating a consumer with the products that consumer repurchases. The key to these technologies is how to efficiently and accurately determine, from a wide variety of objects, the objects associated with a target object and the primary association relationships between those objects and the target object. In the related art, however, this is very challenging because of the enormous number of objects and the volume of their related data.
Disclosure of Invention
In view of the above, the present disclosure provides a target object association method and apparatus that desirably overcomes some or all of the above-referenced deficiencies and possibly others.
According to a first aspect of the present disclosure, there is provided a target object association method, including: acquiring a network graph constructed based on a plurality of objects of different types, wherein the network graph comprises a plurality of nodes corresponding to the plurality of objects and edges connecting the nodes, the types of the nodes correspond to the types of the objects, each edge represents an association relationship between the objects corresponding to the two nodes it connects and has a weight representing the association degree between the two nodes, and the association degree between two nodes indicates the degree of association between the objects corresponding to the two nodes; initializing a first subgraph and a second subgraph, wherein the first subgraph contains one or more key nodes in the network graph, the second subgraph comprises the nodes and edges that remain in the network graph after the nodes in the first subgraph are removed, and each key node is a node corresponding to an object of a key type; iteratively performing the following updating step to update the first subgraph and the second subgraph until the second subgraph contains no key node, the updating step including: determining a core node from the nodes of the network graph, the core node being the node in the network graph whose association degree with at least one of the one or more key nodes in the first subgraph is greater than zero and whose sum of association degrees with all key nodes in the second subgraph is the largest; determining a node set such that the node set comprises the nodes in the second subgraph whose association degree with the core node is greater than zero and the node in the first subgraph having the largest association degree with the core node, and determining the nodes in the node set, the core node, and the edges of the network graph connecting the core node to the nodes in the node set as a third subgraph; and adding the nodes and edges of the third subgraph to the first subgraph and removing the nodes and edges of the third subgraph from the second subgraph; and associating the target objects corresponding to the nodes in the updated first subgraph based on the edges included in the updated first subgraph.
According to a second aspect of the present disclosure, there is provided a target object association apparatus, including an acquisition module, an initialization module, an update module, and an association module. The acquisition module is configured to acquire a network graph constructed based on a plurality of objects of different types, wherein the network graph comprises a plurality of nodes corresponding to the plurality of objects and edges connecting the nodes, the types of the nodes correspond to the types of the objects, each edge represents an association relationship between the objects corresponding to the two nodes it connects and has a weight representing the association degree between the two nodes, and the association degree between two nodes indicates the degree of association between the objects corresponding to the two nodes. The initialization module is configured to initialize a first subgraph and a second subgraph such that the first subgraph contains one or more key nodes in the network graph and the second subgraph comprises the nodes and edges that remain in the network graph after the nodes in the first subgraph are removed, wherein each key node is a node corresponding to an object of a key type. The update module is configured to iteratively perform the following updating step to update the first subgraph and the second subgraph until the second subgraph contains no key node.
The updating step includes: determining a core node from the nodes of the network graph, the core node being a node in the network graph having an association with at least one of the one or more key nodes in the first subgraph greater than zero and having a maximum sum of associations with all key nodes in the second subgraph; determining a node set so that the node set comprises nodes with the association degree with a core node larger than zero in a second subgraph and nodes with the maximum association degree with the core node in a first subgraph, and determining the nodes in the node set, the core node and edges connecting the core node and the nodes in the node set in a network graph as a third subgraph; nodes and edges of a third subgraph are added to the first subgraph, and nodes and edges of the third subgraph are removed from the second subgraph. The association module is configured to associate target objects corresponding to nodes in the updated first sub-graph based on edges included in the updated first sub-graph.
In some embodiments, the update module is configured to: traverse all nodes of the network graph and, in response to the association degree between the traversed current node and at least one key node in the first subgraph being greater than zero, determine the sum of the association degrees between the current node and all key nodes in the second subgraph; and determine the node corresponding to the largest of all the determined sums as the core node.
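The traversal described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the network graph is assumed to be an undirected weighted graph stored as a nested dict, the association degree of two unconnected nodes is taken to be zero, and the names `association`, `select_core_node`, and the sample node identifiers are hypothetical.

```python
def association(graph, u, v):
    """Association degree of u and v: the edge weight, or 0 when no edge exists (assumption)."""
    return graph.get(u, {}).get(v, 0)


def select_core_node(graph, key_nodes, first_nodes, second_nodes):
    """Return (core_node, best_sum), or (None, None) when no node qualifies."""
    first_keys = [n for n in first_nodes if n in key_nodes]
    second_keys = [n for n in second_nodes if n in key_nodes]
    core, best = None, None
    for node in graph:  # traverse all nodes of the network graph
        # Skip nodes with no positive association to a key node in the first subgraph.
        if not any(association(graph, node, k) > 0 for k in first_keys):
            continue
        # Sum of association degrees with all key nodes still in the second subgraph.
        total = sum(association(graph, node, k) for k in second_keys)
        if best is None or total > best:
            core, best = node, total
    return core, best


# Hypothetical sample: users u1..u3, account a1, merchant m1 (symmetric adjacency).
edges = [("u1", "a1", 1), ("u2", "a1", 2), ("u1", "m1", 3), ("u2", "m1", 1), ("u3", "m1", 4)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

key_nodes = {"u1", "u2", "u3"}
first_nodes = {"u1"}
second_nodes = set(graph) - first_nodes
core, total = select_core_node(graph, key_nodes, first_nodes, second_nodes)
print(core, total)
```

In this sample, the account node a1 and the merchant node m1 are both connected to the key node u1 in the first subgraph, and m1 is selected because its summed association with the remaining user nodes is larger. The text does not say how ties between equal sums are resolved; the sketch simply keeps the first node encountered.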
In some embodiments, each node in the network graph has a weight to represent the importance of the node, and the update module is configured to: in response to the first sub-graph comprising a plurality of nodes with the maximum association degree with the core node, acquiring the node with the maximum node weight in the plurality of nodes in the first sub-graph; determining a set of nodes such that the set of nodes includes the nodes in the second subgraph having an association degree with the core node greater than zero and the obtained nodes having the largest node weights.
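The node-weight tie-break can be illustrated with a short Python sketch. It assumes the same nested-dict graph representation (an assumption, not something the text prescribes), a separate `node_weight` mapping, and that a missing node weight counts as zero; `build_third_subgraph` and all sample identifiers are hypothetical names.

```python
def association(graph, u, v):
    """Association degree of u and v: the edge weight, or 0 when no edge exists (assumption)."""
    return graph.get(u, {}).get(v, 0)


def build_third_subgraph(graph, node_weight, core, first_nodes, second_nodes):
    """Return (node_set, edges) for the third subgraph centered on `core`."""
    # Nodes of the second subgraph with a positive association degree to the core node.
    node_set = {n for n in second_nodes if n != core and association(graph, core, n) > 0}
    # Nodes of the first subgraph that are connected to the core node.
    candidates = [n for n in first_nodes if n != core and association(graph, core, n) > 0]
    if candidates:
        top = max(association(graph, core, n) for n in candidates)
        tied = [n for n in candidates if association(graph, core, n) == top]
        # Tie-break: among equally associated nodes, keep the one with the largest node weight.
        node_set.add(max(tied, key=lambda n: node_weight.get(n, 0)))
    # Edges of the network graph that connect the core node to the node set.
    edges = {(core, n): association(graph, core, n) for n in node_set}
    return node_set, edges


# Hypothetical sample: u1 and u4 are tied (weight 3) in their association with m1.
sample_edges = [("u1", "m1", 3), ("u4", "m1", 3), ("u2", "m1", 1), ("u3", "m1", 4), ("u2", "a1", 2)]
graph = {}
for u, v, w in sample_edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

node_weight = {"u1": 0.2, "u4": 0.9}
node_set, edges = build_third_subgraph(graph, node_weight, "m1", {"u1", "u4"}, {"u2", "u3", "a1", "m1"})
print(sorted(node_set), edges)
```

Here u4 is chosen over u1 because its node weight is larger, while a1 is left out because it has no edge to the core node.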
In some embodiments, the acquisition module is configured to: acquire network object data, wherein the network object data comprises objects of multiple types and the association relationships among the objects; determine a heterogeneous graph based on the types of the objects in the network object data and the association relationships among the objects, wherein the heterogeneous graph comprises nodes representing the objects and edges representing the association relationships among the objects, each node has a node weight representing the importance of the object corresponding to the node, and each edge has an edge weight representing the association degree between the objects corresponding to the two nodes connected by the edge; and mine a subgraph of the heterogeneous graph from the heterogeneous graph to obtain the network graph.
In some embodiments, the acquisition module is configured to: acquiring relation network data of various objects; performing data cleaning on the acquired relational network data to obtain the network object data; wherein the data cleansing includes one or more of the following operations: removing incomplete data in the relational network data, removing repeated data in the relational network data and removing error data in the relational network data.
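The three cleaning operations might look like the following Python sketch. The record layout (`src`, `dst`, `relation`, `weight`) and the rule for what counts as erroneous data (a non-numeric or non-positive weight, or a self-relation) are assumptions made for illustration; the text does not define them.

```python
REQUIRED_FIELDS = ("src", "dst", "relation", "weight")  # hypothetical record schema


def clean_records(records):
    """Drop incomplete, erroneous, and repeated relation records, preserving order."""
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete data: a required field is missing or empty.
        if any(rec.get(field) is None for field in REQUIRED_FIELDS):
            continue
        # Erroneous data (assumed rule): weight must be a positive number, no self-relations.
        weight = rec["weight"]
        if not isinstance(weight, (int, float)) or weight <= 0 or rec["src"] == rec["dst"]:
            continue
        # Repeated data: the same record has already been kept.
        key = tuple(rec[field] for field in REQUIRED_FIELDS)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


records = [
    {"src": "u1", "dst": "a1", "relation": "binding", "weight": 1.0},
    {"src": "u1", "dst": "a1", "relation": "binding", "weight": 1.0},   # repeated
    {"src": "u2", "dst": None, "relation": "payment", "weight": 2.0},   # incomplete
    {"src": "u3", "dst": "m1", "relation": "payment", "weight": -5.0},  # erroneous
    {"src": "u2", "dst": "m1", "relation": "payment", "weight": 3.0},
]
cleaned = clean_records(records)
print(len(cleaned))
```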
In some embodiments, the acquisition module is configured to: establishing an initial set based on each node in the heterogeneous graph, so as to divide a plurality of nodes in the heterogeneous graph into a plurality of initial sets, and determining the modularity of each initial set, wherein the modularity of the sets characterizes the relevance of the connection among the nodes in the sets; taking each of the plurality of initial sets as a current set; iteratively performing a network update step until the modularity of the resulting update set does not increase any more, the update step comprising: forming an update set by assigning each node in the heterogeneous graph to a current set in which neighboring nodes are located, such that a modularity of the update set is greater than a modularity of the current set in which the neighboring nodes are located, and taking the update set as the current set; and taking one of the obtained current sets as the network graph.
In some embodiments, the acquisition module is configured to: determining a modularity for each of the obtained plurality of current sets; and taking the set corresponding to the maximum modularity in the obtained multiple current sets as the network graph.
In some embodiments, the acquisition module is configured to: mine the maximal connected subgraph of the heterogeneous graph from the heterogeneous graph to serve as the network graph.
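One way to mine such a subgraph is a breadth-first search over connected components, sketched below in Python. The sketch assumes an undirected graph stored as a symmetric nested dict and, when several components exist, assumes that the one with the most nodes is wanted; `largest_connected_subgraph` is a hypothetical name.

```python
from collections import deque


def largest_connected_subgraph(graph):
    """Return the induced subgraph of the connected component with the most nodes."""
    visited = set()
    best = set()
    for start in graph:
        if start in visited:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in component:
                    component.add(v)
                    queue.append(v)
        visited |= component
        if len(component) > len(best):
            best = component
    # Edges never leave a component, so copying the adjacency of its nodes is enough.
    return {u: dict(graph[u]) for u in best}


# Hypothetical sample with two components: {a, b, c} and {d, e}.
graph = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1},
    "d": {"e": 1},
    "e": {"d": 1},
}
sub = largest_connected_subgraph(graph)
print(sorted(sub))
```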
According to a third aspect of the present disclosure, there is provided a computing device comprising a processor; and a memory configured to have computer-executable instructions stored thereon that, when executed by the processor, perform any of the methods described above.
According to a fourth aspect of the present disclosure, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed, perform any of the methods described above.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising computer executable instructions, wherein the computer executable instructions, when executed by a processor, perform any of the methods as described above.
In the target object association method and apparatus claimed in the present disclosure, a network graph constructed based on a plurality of objects of different types is first obtained, and then a graph calculation method designed in the present application is applied to a network, that is: initializing a first sub-graph and a second sub-graph to enable the first sub-graph to contain one or more key nodes (namely nodes corresponding to objects of key types) in the network graph and enable the second sub-graph to comprise the remaining nodes and edges in the network graph, then iteratively updating the first sub-graph and the second sub-graph until the updated first sub-graph contains all key nodes and high-relevance nodes partially associated with the key nodes, and finally associating target objects corresponding to the nodes in the updated first sub-graph based on the edges in the updated first sub-graph. In this way, in the target object association process of the application, interference caused by objects with less importance and redundant association information is greatly removed when the subgraph is updated, so that the association of the target object is more accurate and efficient, and the method is more suitable for being used in fields such as network security and object recommendation.
These and other advantages of the present disclosure will become apparent from and elucidated with reference to the embodiments described hereinafter.
Drawings
Embodiments of the present disclosure will now be described in more detail and with reference to the accompanying drawings, in which:
FIG. 1 illustrates an exemplary application scenario in which a technical solution according to an embodiment of the present disclosure may be implemented;
FIG. 2 illustrates a schematic flow chart diagram of a target object association method in accordance with one embodiment of the present disclosure;
FIG. 3 illustrates a schematic flow diagram of obtaining a network graph constructed based on multiple objects of different types according to one embodiment of the present disclosure;
FIG. 4 illustrates a schematic flow diagram of mining subgraphs of a heterogeneous graph from the heterogeneous graph to derive a network graph in accordance with one embodiment of the present disclosure;
FIG. 5 illustrates a schematic diagram of an association relationship between multiple objects of different types depicted in FIG. 2, according to one embodiment of the present disclosure;
FIG. 6 illustrates a schematic diagram of determining a third subgraph according to one embodiment of the present disclosure;
FIG. 7 illustrates an effect diagram of a target object association method according to one embodiment of the present disclosure;
FIG. 8 illustrates an exemplary block diagram of a target object association apparatus according to one embodiment of the present disclosure;
FIG. 9 illustrates an example system that includes an example computing device that represents one or more systems and/or devices that may implement the various techniques described herein.
Detailed Description
The following description provides specific details of various embodiments of the disclosure so that those skilled in the art can fully understand and practice the various embodiments of the disclosure. It is understood that aspects of the disclosure may be practiced without some of these details. In some instances, well-known structures or functions are not shown or described in detail in this disclosure to avoid obscuring the description of the embodiments of the disclosure by these unnecessary descriptions. The terminology used in the present disclosure should be interpreted in its broadest reasonable manner, even though it is being used in conjunction with a particular embodiment of the present disclosure.
First, some terms referred to in the embodiments of the present application are explained to facilitate understanding by those skilled in the art.
Network graph: a graphical model shaped like a network. The network graph in the present application is a network-shaped graph formed of nodes representing objects of various types and edges representing the association relationships between the objects.
Heterogeneous graph: a kind of network graph. A graph is generally classified as heterogeneous when the number of node types plus the number of edge types is greater than 2. For example, a paper citation network has author nodes and paper nodes, and the edges representing relationships between nodes are of three types: edges representing co-authoring relationships between authors, edges representing belonging relationships between authors and papers, and edges representing citation relationships between papers. The number of node types plus the number of edge types is therefore 2 + 3 > 2, so the paper citation network graph is a heterogeneous graph.
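The counting criterion can be checked mechanically. The Python sketch below applies it to a toy version of the paper citation example (two node types, three edge types); the function name, the node identifiers, and the edge type labels are hypothetical.

```python
def is_heterogeneous(node_types, edge_types):
    """True when (number of node types) + (number of edge types) is greater than 2."""
    return len(set(node_types.values())) + len(set(edge_types.values())) > 2


# Toy paper citation network: author and paper nodes, three kinds of edges.
node_types = {"alice": "author", "bob": "author", "p1": "paper", "p2": "paper"}
edge_types = {
    ("alice", "bob"): "co_authoring",
    ("alice", "p1"): "belonging",
    ("p1", "p2"): "citation",
}
print(is_heterogeneous(node_types, edge_types))

# A graph with a single node type and a single edge type gives 1 + 1 = 2, which is not greater than 2.
print(is_heterogeneous({"a": "user", "b": "user"}, {("a", "b"): "complaint"}))
```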
Directed graph: a graph whose edges have directions. Mathematically, a directed graph D is an ordered triple (V(D), A(D), φ(D)), where φ(D) is the incidence function that maps each element of A(D), called a directed edge or arc, to an ordered pair of elements of V(D), called vertices or points.
Undirected graph: a graph whose edges have no direction. Mathematically, an undirected graph is G = <V, E>, where (1) V is a non-empty set, called the vertex set; and (2) E is a set of unordered pairs of elements of V, called the edge set.
Maximal connected subgraph: a connected subgraph of a network graph such that adding any vertex not already in its vertex set makes it no longer connected.
Modularity: also called modularization metric value, is a commonly used method for measuring the structural strength of the network community at present. The magnitude of the modularity value mainly depends on the community distribution of nodes in the network, namely the community division condition of the network, and can be used for quantitatively measuring the network community division quality, and the closer the value is to 1, the stronger the strength of the community structure divided by the network is, namely the better the division quality is. Therefore, the optimal network community division can be obtained by maximizing the modularity.
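The text does not give a formula for modularity. As an illustration only, the Python sketch below uses the widely used Newman definition, Q = (1/2m) Σ_ij [A_ij − k_i·k_j/(2m)]·δ(c_i, c_j), where A_ij is the edge weight, k_i is the weighted degree of node i, m is the total edge weight, and δ is 1 when nodes i and j are in the same community; whether the patent uses exactly this definition is an assumption. The graph is assumed to be undirected and stored as a symmetric nested dict, and `modularity` is a hypothetical helper name.

```python
def modularity(graph, community):
    """Newman modularity of a partition; `community` maps each node to a community label."""
    # In a symmetric adjacency every undirected edge is stored twice, so this sum equals 2m.
    two_m = sum(w for nbrs in graph.values() for w in nbrs.values())
    if two_m == 0:
        return 0.0
    degree = {u: sum(nbrs.values()) for u, nbrs in graph.items()}
    q = 0.0
    for u in graph:
        for v in graph:
            if community[u] != community[v]:
                continue
            q += graph[u].get(v, 0) - degree[u] * degree[v] / two_m
    return q / two_m


# Hypothetical sample: two triangles (a, b, c) and (d, e, f) joined by the single edge c-d.
pairs = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e"), ("e", "f"), ("f", "d"), ("c", "d")]
graph = {}
for u, v in pairs:
    graph.setdefault(u, {})[v] = 1
    graph.setdefault(v, {})[u] = 1

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {n: 0 for n in graph}
print(modularity(graph, split), modularity(graph, merged))
```

Splitting the sample along the bridge edge gives a higher modularity than placing every node in one community, which matches the statement that a larger value indicates a better division.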
Machine learning: machine learning (ML) is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout the fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration. As artificial intelligence technology is researched and advanced, it is being studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, unmanned aerial vehicles, robots, smart medical services, smart customer service, the internet of vehicles, and intelligent transportation.
In the related art, target objects in a network can be associated by constructing a network graph. For example, objects of various types that have association relationships with one another are collected in advance, and the nodes of a network graph and the edges connecting the nodes are constructed from these objects; the network graph is then used to determine the objects associated with a target object. However, if the association relationships between the objects are complex, the network graph constructed from them is dense and cluttered, making it difficult to efficiently and accurately determine the associated objects of the target object from the network graph.
FIG. 1 illustrates an exemplary application scenario 100 in which a technical solution according to an embodiment of the present disclosure may be implemented. As shown in FIG. 1, the application scenario 100 includes a server 110, terminals 120 and 130, and a network 140. The terminals 120 and 130 are communicatively coupled to the server 110 via the network 140. By way of example, users A and B, such as personal users, bank users, or merchant users, may view content such as video, audio, and text through an application or client on the terminals 120 and 130. Only two terminals are shown here; in practice there may be three or more. As an example, the server 110 may collect each user's viewing history or click history for various types of content on each terminal. The server 110 may then construct a network graph with the users and content involved in the viewing history or click history as objects. The network graph comprises a plurality of nodes corresponding to the plurality of objects and edges connecting the nodes, wherein the types of the nodes correspond to the types of the objects (such as users or content), and each edge has a weight representing the association degree between the objects corresponding to the two nodes it connects; the association degree can be determined according to, for example, how long the user viewed the content or the user's click rate for the content. As another example, when a fund payment operation is performed over the network between an individual user A and a merchant user B, and an account binding operation is performed over the network between the individual user A and a bank user C, these associations made between users through the terminals and the network can be used by the server 110 to construct the network graph.
As an example, the server 110 may obtain a network graph constructed based on a plurality of objects of different types, where the network graph includes a plurality of nodes corresponding to the plurality of objects and edges connecting between the nodes, where the types of the nodes correspond to types of the objects (e.g., users, merchants, accounts, etc.), each edge indicates an association relationship between the objects corresponding to two nodes connected by the edge and has a weight to represent an association degree between the two nodes, and the association degree between the two nodes indicates an association degree between the corresponding objects between the two nodes, and the association degree may be determined according to, for example, how much amount the user pays the merchant, complaints between the user and the user, and the like. The server 110 may initialize a first subgraph and a second subgraph such that the first subgraph contains one or more key nodes in the network graph, the second subgraph including nodes and edges in the network graph that remain after the nodes in the first subgraph are removed, wherein each key node represents a node corresponding to an object having a key type, e.g., a node corresponding to an object of a user type in an anti-fraud scenario. 
The server 110 may then iteratively perform the following updating step to update the first subgraph and the second subgraph until the second subgraph contains no key node, the updating step including: determining a core node from the nodes of the network graph, the core node being the node in the network graph whose association degree with at least one of the one or more key nodes in the first subgraph is greater than zero and whose sum of association degrees with all key nodes in the second subgraph is the largest; determining a node set such that the node set comprises the nodes in the second subgraph whose association degree with the core node is greater than zero and the node in the first subgraph having the largest association degree with the core node, and determining the nodes in the node set, the core node, and the edges of the network graph connecting the core node to the nodes in the node set as a third subgraph; and adding the nodes and edges of the third subgraph to the first subgraph and removing the nodes and edges of the third subgraph from the second subgraph.
Based on the edges included in the updated first sub-graph, the server 110 associates the target objects corresponding to the nodes in the updated first sub-graph. As an example, in an anti-fraud scenario, to associate the user-type object and other related objects, the server 110 associates target objects corresponding to nodes in the updated first sub-graph, including all user nodes and nodes partially associated with the user nodes (e.g., account-type nodes, merchant-type nodes, etc.).
Optionally, the server 110 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminals 120 and 130 may include, but are not limited to, at least one of the following terminals capable of presenting content: a mobile phone, a tablet computer, a notebook computer, a desktop PC, a digital television, and the like. The network 140 may be, for example, a wide area network (WAN), a local area network (LAN), a wireless network, a public telephone network, an intranet, or any other type of network known to those skilled in the art.
It should be noted that the scenario described above is only one example in which the embodiments of the present disclosure may be implemented, and is not limiting. For example, in some embodiment scenarios, target object association may also be implemented on a particular terminal.
FIG. 2 illustrates a schematic flow diagram of a target object association method 200 according to one embodiment of the present disclosure. The method 200 may be implemented, for example, on a server such as the server 110 in FIG. 1, although this is not limiting. As shown in FIG. 2, the method 200 includes the following steps.
In step 210, a network graph constructed based on a plurality of objects of different types is obtained, where the network graph includes a plurality of nodes corresponding to the plurality of objects and edges connecting the nodes, the types of the nodes correspond to the types of the objects, each edge represents an association relationship between the objects corresponding to the two nodes it connects and has a weight representing the association degree between the two nodes, and the association degree between two nodes indicates the degree of association between the objects corresponding to the two nodes. In some embodiments, the types of the objects corresponding to the nodes in the network graph (i.e., the types of the nodes) include at least one of: a user type, an account type, a merchant type, a certificate type, and the like, and the key type may be, for example, the user type, although this is not limiting. As an example, the association relationship includes at least one of: a binding relationship between an object of the user type and an object of the account type, a complaint relationship between two objects of the user type, a payment relationship between an object of the user type and an object of the merchant type, a real-name relationship between an object of the user type and an object of the certificate type, and the like. The network graph in the present application may be a directed graph or an undirected graph, which is not limited herein.
In one example application scenario of associating malicious users, the network graph includes nodes representing objects of types such as the user type, the merchant type, the account type, and the certificate type, and edges representing the association relationships between these objects, such as binding relationships between objects of the user type and objects of the account type, complaint relationships between objects of the user type, payment relationships between objects of the user type and objects of the merchant type, and real-name relationships between objects of the user type and objects of the certificate type. For simplicity, a user-type node is hereinafter referred to as a user node, an account-type node as an account node, a merchant-type node as a merchant node, and a certificate-type node as a certificate node.
In step 220, a first subgraph and a second subgraph are initialized, such that the first subgraph contains one or more key nodes in the network graph, and the second subgraph comprises nodes and edges remaining after the nodes in the first subgraph are removed from the network graph, wherein each key node represents a node corresponding to an object with a key type. As an example, in a scenario where a rogue user is associated, the object of the key type may include an object of the user type, and the key node may include a user node corresponding to the user object in the network graph. For example, a first subgraph is initialized such that the first subgraph contains one or more user nodes, and a second subgraph includes nodes and edges in the network graph that remain after the nodes contained in the first subgraph are removed.
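The initialization of step 220 can be sketched as below. The sketch assumes that edges incident to a removed node are dropped from the second subgraph together with that node; the function name `init_subgraphs` and the dictionary layout are hypothetical.

```python
def init_subgraphs(node_type, edges, seeds, key_type="user"):
    """Split the network graph into a first and a second subgraph.

    seeds: key nodes placed in the first subgraph at initialization.
    edges: iterable of (u, v) pairs of the network graph.
    """
    for s in seeds:
        if node_type[s] != key_type:
            raise ValueError(f"{s} is not a key node")
    first = {"nodes": set(seeds), "edges": set()}
    rest = set(node_type) - first["nodes"]
    second = {
        "nodes": rest,
        # Assumption: only edges whose two endpoints both remain are kept.
        "edges": {(u, v) for u, v in edges if u in rest and v in rest},
    }
    return first, second

node_type = {"u1": "user", "u2": "user", "m1": "merchant", "a1": "account"}
edges = [("u1", "m1"), ("u2", "m1"), ("u2", "a1")]
first, second = init_subgraphs(node_type, edges, seeds=["u1"])
```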
The updating step is iteratively performed to update the first subgraph and the second subgraph until it is determined in step 234 that the second subgraph does not include the key node. The updating step includes: step 231, step 232 and step 233. As an example, the key node may be a user node, and when the updating step is finished, the second sub-graph will not contain the user node, i.e. the updating results of the first sub-graph and the second sub-graph are obtained.
At step 231, a core node is determined from the nodes of the network graph, the core node being the node in the network graph whose association degree with at least one of the one or more key nodes in the first subgraph is greater than zero and whose sum of association degrees with all key nodes in the second subgraph is the largest. In some embodiments, the nodes in the network graph include user nodes, account nodes, merchant nodes, credential nodes, and the like, representing objects of the user type, the account type, the merchant type, the credential type, and the like, respectively. If the key node is determined to be a user node, the core node is the node in the network graph whose association degree with at least one of the one or more user nodes in the first subgraph is greater than zero and whose sum of association degrees with all user nodes in the second subgraph is the largest. The determined core node can be any one of a user node, an account node, a merchant node and a credential node.
In some embodiments, determining the core node from the nodes of the network graph comprises: traversing all nodes of the network graph and, in response to the association degree between the traversed current node and at least one key node in the first subgraph being greater than zero, determining the sum of the association degrees between the current node and all key nodes in the second subgraph; and determining the node corresponding to the maximum of all the determined sums as the core node. As an example, when the nodes in the network graph include user nodes, account nodes, merchant nodes and credential nodes, and the key node is determined to be a user node, all nodes of the network graph (including all user nodes, account nodes, merchant nodes, credential nodes, etc.) are traversed; in response to the association degree between the traversed current node and at least one user node in the first subgraph being greater than zero, the sum of the association degrees between the current node and all user nodes in the second subgraph is determined; and the node corresponding to the maximum of all the determined sums is determined as the core node. The determined core node may be a user node, an account node, a merchant node, a credential node, etc. that meets the above requirements.
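The traversal described above might be sketched as follows. The sketch assumes an undirected graph stored as a symmetric adjacency dictionary `adj[u][v]` holding the association degree (a missing entry meaning zero); the name `find_core_node` and the alphabetical tie-breaking order are hypothetical.

```python
def find_core_node(adj, node_type, first_nodes, second_nodes, key_type="user"):
    """Return (core_node, sum) per the traversal of step 231."""
    first_keys = {n for n in first_nodes if node_type[n] == key_type}
    second_keys = {n for n in second_nodes if node_type[n] == key_type}
    best, best_sum = None, float("-inf")
    for node in sorted(adj):  # traverse all nodes of the network graph
        nbrs = adj[node]
        # Skip nodes with no positive association to a key node of the first subgraph.
        if not any(nbrs.get(k, 0) > 0 for k in first_keys):
            continue
        total = sum(nbrs.get(k, 0) for k in second_keys)
        if total > best_sum:
            best, best_sum = node, total
    return best, best_sum

# Example data (invented): u1 is in the first subgraph, everything else in the second.
node_type = {"u1": "user", "u2": "user", "u3": "user",
             "m1": "merchant", "a1": "account"}
adj = {
    "u1": {"m1": 1.0, "a1": 1.0},
    "u2": {"m1": 2.0, "a1": 1.0},
    "u3": {"m1": 1.0},
    "m1": {"u1": 1.0, "u2": 2.0, "u3": 1.0},
    "a1": {"u1": 1.0, "u2": 1.0},
}
core, total = find_core_node(adj, node_type, {"u1"}, {"u2", "u3", "m1", "a1"})
```

In this example both m1 and a1 are associated with u1, but m1 has the larger summed association degree with the user nodes u2 and u3 that remain in the second subgraph.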
In step 232, a node set is determined such that it includes the nodes in the second subgraph whose association degree with the core node is greater than zero and the node in the first subgraph whose association degree with the core node is the largest; the nodes in the node set, the core node, and the edges in the network graph connecting the core node with the nodes in the node set are determined as a third subgraph. As an example, the nodes in the second subgraph that are connected to the core node by an edge, and the node in the first subgraph whose edge to the core node has the highest weight, are determined, and the node set is established from the determined nodes.
In some embodiments, each node in the network graph has a weight to represent the importance of the node. As an example, when the node set is determined as described above, and in response to the first subgraph including a plurality of nodes that share the maximum association degree with the core node (i.e., more than one node in the first subgraph has the largest association degree with the core node), the node having the largest node weight among the plurality of nodes may be obtained; the node set is then determined such that it includes the nodes in the second subgraph whose association degree with the core node is greater than zero and the obtained node having the largest node weight.
By way of example, the nodes in the network graph may include user nodes, account nodes, merchant nodes, credential nodes, and the like. The nodes may have weights to represent the importance of the node. For example, a node corresponding to a merchant with a large sales amount has a node weight that is greater than a node corresponding to a merchant with a small sales amount, and a node corresponding to an account with a larger account balance has a node weight that is greater than a node corresponding to an account with a small account balance. At this time, when the node set is determined such that the node set includes a node having a degree of association with the core node greater than zero in the second subgraph and a node having a maximum degree of association with the core node in the first subgraph, if the first subgraph includes a plurality of nodes having a maximum degree of association with the core node, a node having a maximum node weight among the plurality of nodes in the first subgraph may be acquired. Then, the node set is determined so that the node set includes the node in the second subgraph having an association degree with the core node greater than zero (i.e. the node in the second subgraph having an edge connected with the core node, which represents the object having an association relation with the object corresponding to the core node), and the obtained node having the largest node weight.
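The construction of the third subgraph, including the node-weight tie-break described above, may be sketched as follows. The function name `build_third_subgraph`, the representation of an edge as a `frozenset` of its two endpoints, and the example values are assumptions made for illustration.

```python
def build_third_subgraph(adj, node_weight, core, first_nodes, second_nodes):
    """Return (nodes, edges) of the third subgraph around the core node."""
    nbrs = adj[core]
    # Nodes of the second subgraph whose association degree with the core is > 0.
    from_second = {n for n in second_nodes if n != core and nbrs.get(n, 0) > 0}
    # Nodes of the first subgraph associated with the core node.
    linked_first = [n for n in first_nodes if n != core and nbrs.get(n, 0) > 0]
    chosen = set()
    if linked_first:
        top = max(nbrs[n] for n in linked_first)
        tied = sorted(n for n in linked_first if nbrs[n] == top)
        # Tie-break: among equally associated nodes keep the most important one.
        chosen = {max(tied, key=lambda n: node_weight.get(n, 0))}
    node_set = from_second | chosen
    third_nodes = node_set | {core}
    third_edges = {frozenset((core, n)) for n in node_set}
    return third_nodes, third_edges

# Example data (invented): u1 and u2 tie on association degree, u2 is more important.
adj = {"m1": {"u1": 2.0, "u2": 2.0, "u3": 1.0, "u4": 3.0}}
node_weight = {"u1": 1.0, "u2": 5.0}
third_nodes, third_edges = build_third_subgraph(
    adj, node_weight, core="m1",
    first_nodes={"u1", "u2", "a1"}, second_nodes={"u3", "u4", "c1"},
)
```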
In step 233, the nodes and edges of the third subgraph are added to the first subgraph and removed from the second subgraph. As an example, if the third subgraph contains some user nodes, merchant nodes and account nodes and some of the edges connecting them, then the user nodes, merchant nodes, account nodes and edges contained in the third subgraph are added to the first subgraph and removed from the second subgraph.
In step 240, target objects corresponding to nodes in the updated first sub-graph are associated based on edges included in the updated first sub-graph. The updated first subgraph includes all user nodes as well as some other types of nodes and some edges connecting these nodes. As an example, all the user nodes and part of the merchant nodes and some edges therebetween are included in the updated first sub-graph, and then all the user nodes and the part of the merchant nodes are associated by using the edges included in the updated first sub-graph. It can be seen that in the target object association process of the present application, in the steps 231, 232, and 233 of updating the first sub-graph, the less important objects and redundant association information are largely removed, so that the target object can be associated in step 240 based on the edges included in the updated first sub-graph, which largely avoids the interference of the less important objects and redundant association information in the network graph on the target object association, so that the association of the target object is more accurate and efficient, and is more suitable for use in fields such as network security and object recommendation.
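Taken together, steps 220 to 240 can be sketched end to end as below. This is an illustrative sketch only: the name `associate`, the symmetric adjacency dictionary, and the early exit taken when no node links the first subgraph to a remaining key node (which the description does not discuss) are assumptions. For brevity only the nodes of the second subgraph are tracked, since the stopping condition depends on them alone.

```python
def associate(adj, node_type, seeds, key_type="user"):
    """Grow the first subgraph until no key node is left in the second one."""
    first_nodes, first_edges = set(seeds), set()
    second_nodes = set(adj) - first_nodes

    def is_key(n):
        return node_type[n] == key_type

    while any(is_key(n) for n in second_nodes):
        first_keys = [n for n in first_nodes if is_key(n)]
        second_keys = [n for n in second_nodes if is_key(n)]

        def key_sum(n):
            return sum(adj[n].get(k, 0) for k in second_keys)

        # Step 231: candidates are associated with a key node of the first subgraph.
        candidates = [n for n in sorted(adj)
                      if any(adj[n].get(k, 0) > 0 for k in first_keys)]
        if not candidates:
            break
        core = max(candidates, key=key_sum)
        if key_sum(core) <= 0:
            break  # assumption: stop when no remaining key node is reachable

        # Step 232: node set and third subgraph.
        picked = {n for n in second_nodes if adj[core].get(n, 0) > 0}
        linked = [n for n in sorted(first_nodes) if adj[core].get(n, 0) > 0]
        picked.add(max(linked, key=lambda n: adj[core][n]))
        third_nodes = picked | {core}
        third_edges = {frozenset((core, n)) for n in picked}

        # Step 233: move the third subgraph into the first subgraph.
        first_nodes |= third_nodes
        first_edges |= third_edges
        second_nodes -= third_nodes
    return first_nodes, first_edges

# Example data (invented): m2 and c1 each touch a single user and carry no
# information linking users to one another.
node_type = {"u1": "user", "u2": "user", "u3": "user", "m1": "merchant",
             "m2": "merchant", "a1": "account", "c1": "credential"}
adj = {
    "u1": {"m1": 2.0, "c1": 1.0},
    "u2": {"m1": 1.0, "a1": 1.0},
    "u3": {"a1": 1.0, "m2": 1.0},
    "m1": {"u1": 2.0, "u2": 1.0},
    "a1": {"u2": 1.0, "u3": 1.0},
    "m2": {"u3": 1.0},
    "c1": {"u1": 1.0},
}
first_nodes, first_edges = associate(adj, node_type, seeds=["u1"])
```

In this example the updated first subgraph retains all three users together with m1 and a1, while m2 and c1 are left out.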
FIG. 3 illustrates a schematic flow chart diagram of a method 300 of obtaining a network graph constructed based on multiple objects of different types according to one embodiment of the present disclosure. The method 300 may be used to implement step 210 in fig. 2. As shown in fig. 3, the illustrated method 300 includes the following steps.
At step 310, network object data is obtained, the network object data including multiple types of objects and associations between the objects. For example, the network object data includes user, certificate, account, merchant, etc. type objects and associations between these objects.
In some embodiments, obtaining the network object data further comprises: acquiring relation network data of various objects; performing data cleaning on the acquired relational network data to obtain the network object data; wherein the data cleansing includes one or more of the following operations: removing incomplete data in the relational network data, removing repeated data in the relational network data and removing error data in the relational network data. For example, if the relationship network data includes incomplete data of the merchant object, duplicate account objects, and obviously incorrect user object data, the data will be removed during the cleaning process of the relationship network data, and network object data is obtained.
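The three cleaning operations could be sketched as follows. The record fields (`source`, `target`, `relation`, `weight`) and the rule that a non-numeric or non-positive weight counts as error data are assumptions introduced for the example; the description does not specify a record layout or error criteria.

```python
def clean_records(records):
    """Drop incomplete, erroneous and duplicate relation records."""
    required = ("source", "target", "relation", "weight")
    seen, cleaned = set(), []
    for rec in records:
        # Incomplete data: a required field is missing or empty.
        if any(rec.get(k) is None for k in required):
            continue
        # Error data (assumed rule): weight must be a positive number.
        if not isinstance(rec["weight"], (int, float)) or rec["weight"] <= 0:
            continue
        # Repeated data: keep only the first occurrence of identical records.
        key = tuple(rec[k] for k in required)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned

records = [
    {"source": "u1", "target": "m1", "relation": "payment", "weight": 30.0},
    {"source": "u1", "target": "m1", "relation": "payment", "weight": 30.0},
    {"source": "u2", "target": None, "relation": "binding", "weight": 1.0},
    {"source": "u3", "target": "m1", "relation": "payment", "weight": -5.0},
]
cleaned = clean_records(records)
```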
In step 320, determining a heterogeneous graph based on the type of the object in the network object data and the association relationship between the objects, wherein the heterogeneous graph comprises nodes for representing the objects and edges for representing the association relationship between the objects, the nodes for representing the objects have node weights for representing the importance of the object corresponding to the nodes, and the edges for representing the association relationship between the objects have edge weights for representing the association degree between the objects corresponding to the two nodes connected by the edges. By way of example, the types of the objects in the network object data include an account type, a user type and a merchant type, and the association relationship between the objects includes a binding relationship between the user object and the account object, a payment relationship between the user object and the merchant object, and a real-name relationship between the user object and the user object. The heterogeneous graph determined based on the type of the object in the network object data and the association relationship between the objects comprises nodes (such as a user node and a merchant node) for representing the objects and edges for representing the association relationship between the objects (such as a payment relationship between the user object and the merchant object and a real name relationship between the user object and the user object), and the nodes for representing the objects have node weights to represent the importance of the objects corresponding to the nodes. For example, if the sales amount of the merchant object a is larger than that of the merchant object B, the node weight of the node corresponding to the merchant object a may be larger than that of the node corresponding to the merchant object B. 
The edge used for representing the association relationship between the objects has a weight to represent the association degree between the objects corresponding to the two nodes connected by the edge, for example, the payment amount between the user object C and the merchant object a is greater than the payment amount between the user object C and the merchant object B, and then the weight of the edge between the node corresponding to the user object C and the node corresponding to the merchant object a is greater than the weight of the edge between the node corresponding to the user object C and the node corresponding to the merchant object B.
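The weighting in the example of merchant objects A and B and user object C can be sketched as below. It is assumed, for illustration only, that the node weight is supplied directly as an importance value (for merchants, the sales amount) and that the edge weight is the accumulated payment amount; the function name `build_heterogeneous_graph` and the figures are hypothetical.

```python
from collections import defaultdict

def build_heterogeneous_graph(objects, payments):
    """objects: (id, type, importance); payments: (user, merchant, amount)."""
    node_type, node_weight = {}, {}
    for obj_id, obj_type, importance in objects:
        node_type[obj_id] = obj_type
        node_weight[obj_id] = importance
    edge_weight = defaultdict(float)
    for user, merchant, amount in payments:
        # Accumulate the payment amount on the undirected user-merchant edge.
        edge_weight[frozenset((user, merchant))] += amount
    return node_type, node_weight, dict(edge_weight)

objects = [("C", "user", 1.0), ("A", "merchant", 500.0), ("B", "merchant", 80.0)]
payments = [("C", "A", 120.0), ("C", "A", 60.0), ("C", "B", 25.0)]
node_type, node_weight, edge_weight = build_heterogeneous_graph(objects, payments)
```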
In step 330, a subgraph of the heterogeneous graph is mined from the heterogeneous graph to obtain the network graph. By way of example, the subgraph may be mined by various methods, such as a label propagation method, a method of acquiring a connected graph, a community division method, and the like, which are not limited herein.
In the method 300, the heterogeneous graph is determined according to the acquired network object data, the complex information in the network object data is included in the heterogeneous graph, then a sub-graph of the heterogeneous graph is mined from the heterogeneous graph, and nodes and edges corresponding to objects which are relatively isolated and less related in the heterogeneous graph are removed to obtain the network graph. Therefore, when the subsequent target object correlation operation is carried out on the basis of the obtained network diagram, the efficiency and the accuracy can be improved.
FIG. 4 illustrates a schematic flow chart of a method 400 of mining a subgraph of a heterogeneous graph from the heterogeneous graph to obtain a network graph, in accordance with one embodiment of the present disclosure. The method 400 may be used to implement step 330 of fig. 3. As shown in fig. 4, the illustrated method 400 includes the following steps.
At step 410, an initial set is established based on each node in the heterogeneous graph, so as to divide the plurality of nodes in the heterogeneous graph into a plurality of initial sets, and the modularity of each initial set is determined. As an example, the modularity Q of the initial set may be defined as:

$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$

wherein $A$ is the adjacency matrix; $A_{ij}$ represents the weight of the edge between node $i$ and node $j$; $k_i = \sum_j A_{ij}$ is the sum of the weights of all edges connected to node $i$; $k_j$ is the sum of the weights of all edges connected to node $j$; $m = \frac{1}{2} \sum_{i,j} A_{ij}$ represents the sum of the weights of all edges; $c_i$ is the set in which node $i$ is located, and $c_j$ is the set in which node $j$ is located; and $\delta(c_i, c_j)$ returns 1 when node $i$ and node $j$ are in the same set and returns 0 otherwise.
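As a worked illustration, the modularity can be computed directly from its definition. The sketch below assumes the standard weighted form $Q = \frac{1}{2m}\sum_{i,j}[A_{ij} - k_i k_j / 2m]\,\delta(c_i, c_j)$ and a symmetric adjacency dictionary; the function name `modularity` and the six-node example graph are hypothetical.

```python
def modularity(adj, community):
    """adj[i][j] = A_ij (symmetric); community[i] = label of the set of node i."""
    # Each undirected edge appears twice in adj, so this sum equals 2m.
    two_m = sum(w for nbrs in adj.values() for w in nbrs.values())
    if two_m == 0:
        return 0.0
    k = {i: sum(nbrs.values()) for i, nbrs in adj.items()}
    total = 0.0
    for i in adj:
        for j in adj:
            if community[i] == community[j]:  # delta(c_i, c_j) = 1
                total += adj[i].get(j, 0.0) - k[i] * k[j] / two_m
    return total / two_m

# Example graph (invented): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {n: {} for n in range(6)}
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

q_split = modularity(adj, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
q_single = modularity(adj, {n: "a" for n in adj})
```

For this graph, placing each triangle in its own set gives Q = 5/14, whereas placing all nodes in a single set gives Q = 0.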
At step 420, each of the plurality of initial sets is treated as a current set.
Then, step 431 is performed iteratively: an update set is formed by assigning each node in the heterogeneous graph to the current set in which one of its adjacent nodes is located, such that the modularity of the update set is greater than that of the current set in which the adjacent node is located, and the update set is then regarded as the current set. The iteration continues until the condition in step 432 is satisfied, namely that the modularity of the resulting update set no longer increases. As an example, steps 431 and 432 may be iteratively executed to update the current set as follows: (1) each node is placed in its own set; (2) each node is assigned to the set in which one of its adjacent nodes is located, the modularity after the assignment is calculated, and it is judged whether the difference ΔQ between the modularity after and before the assignment is positive; if ΔQ is positive, the assignment is accepted, and otherwise it is abandoned; (3) operations (1) and (2) are repeated until the modularity can no longer be increased.
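The iterative reassignment can be sketched as a greedy local-moving loop. This is a deliberately simple sketch that recomputes the modularity from scratch for every trial move and keeps the best positive ΔQ for each node; it assumes the standard weighted modularity, and the names `local_moving` and `modularity`, the tolerance `1e-12` and the example graph are hypothetical rather than taken from the present application.

```python
def modularity(adj, community):
    """Standard weighted modularity for a symmetric adjacency dictionary."""
    two_m = sum(w for nbrs in adj.values() for w in nbrs.values())
    if two_m == 0:
        return 0.0
    k = {i: sum(nbrs.values()) for i, nbrs in adj.items()}
    total = 0.0
    for i in adj:
        for j in adj:
            if community[i] == community[j]:
                total += adj[i].get(j, 0.0) - k[i] * k[j] / two_m
    return total / two_m

def local_moving(adj):
    community = {n: n for n in adj}  # (1) every node starts in its own set
    improved = True
    while improved:  # repeat until no move increases the modularity
        improved = False
        for node in sorted(adj):
            best_label = community[node]
            best_q = modularity(adj, community)
            for nbr in sorted(adj[node]):
                # (2) try assigning the node to the set of an adjacent node
                trial = dict(community)
                trial[node] = community[nbr]
                q = modularity(adj, trial)
                if q - best_q > 1e-12:  # accept only a positive delta Q
                    best_label, best_q = community[nbr], q
            if best_label != community[node]:
                community[node] = best_label
                improved = True
    return community

# Example graph (invented): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {n: {} for n in range(6)}
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0
community = local_moving(adj)
```

Because every accepted move strictly increases the modularity, the loop terminates; an incremental ΔQ formula would be preferable on large graphs.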
At step 440, one of the resulting plurality of current sets is taken as the network graph. In some embodiments, after the current sets have been updated and a plurality of current sets obtained, the modularity of each of the obtained current sets is determined, and the set corresponding to the maximum modularity is used as the network graph. The method 400 mines a subgraph of the heterogeneous graph by dividing the heterogeneous graph into sets, which allows a complex heterogeneous graph to be finely classified and further improves the accuracy of mining the heterogeneous graph.
In some embodiments, treating one of the resulting plurality of current sets as the network graph comprises: determining a modularity for each of the obtained plurality of current sets; and taking the set corresponding to the maximum modularity in the obtained multiple current sets as the network graph. As an example, assuming that there are 100 obtained current sets, since the association degree of the connection between nodes in the set with the maximum modularity among the 100 current sets is higher, the set corresponding to the maximum modularity among the 100 obtained current sets is taken as the network graph.
In other embodiments, mining a subgraph of the heterogeneous graph from the heterogeneous graph in step 330 to obtain the network graph includes: mining the maximum connected subgraph of the heterogeneous graph from the heterogeneous graph to serve as the network graph. As an example, the heterogeneous graph includes nodes representing various types of objects and edges representing relationships between those objects, and the maximum connected subgraph mined from the heterogeneous graph may indicate the main association relationships between the objects corresponding to some nodes in the heterogeneous graph, so that the obtained maximum connected subgraph may be used as the network graph.
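Mining the maximum connected subgraph may be sketched with a breadth-first search. The sketch assumes that "maximum" means the connected component with the most nodes and that the graph is undirected; the function name `largest_connected_subgraph` and the example data are hypothetical.

```python
from collections import deque

def largest_connected_subgraph(adj):
    """Return the adjacency dictionary of the largest connected component."""
    seen, best = set(), set()
    for start in sorted(adj):
        if start in seen:
            continue
        # Breadth-first search collects the component containing `start`.
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in comp:
                    comp.add(nbr)
                    queue.append(nbr)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    # Every neighbour of a node in the component is also in the component.
    return {n: dict(adj[n]) for n in best}

# Example data (invented): {u1, m1, u2} is larger than {u3, a1}.
adj = {
    "u1": {"m1": 1.0},
    "m1": {"u1": 1.0, "u2": 2.0},
    "u2": {"m1": 2.0},
    "u3": {"a1": 1.0},
    "a1": {"u3": 1.0},
}
network_graph = largest_connected_subgraph(adj)
```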
In some embodiments, for example in the anti-fraud field, in order to identify fraudulent users as far as possible, the objects associated with a user are mined. For example, the objects associated with the user are mined through the associations that exist between the user and merchants, accounts, certificates and the like; in particular, the objects associated with a user who exhibits fraudulent behavior, and their associations with that user, are mined.
As an example, fig. 5 illustrates a schematic diagram of an association relationship according to some embodiments. As shown in fig. 5, the association includes at least one of: a binding relationship between the object of the user type and the object of the account type, a complaint relationship between the object of the user type and the object of the user type, a payment relationship between the object of the user type and the object of the merchant type, and a real name relationship between the object of the user type and the object of the certificate type. For example, specifically, when a payment transaction is performed between the user a and the merchant B, the user a and the merchant B have a payment relationship; if the user A and the certificate C are subjected to real-name operation, the user A and the certificate C have a real-name relationship; if the user A and the account D are bound, the user A and the account D have a binding relationship; and if the user A complains about the behavior of the user E, such as fraud and the like, the user A and the user E have a complaint relation.
In the process of associating user objects, the weights of the nodes and of the corresponding association relationships are determined first. For example, for a payment relationship, the weight may be determined from the payment amount between the objects. If the payment amount reaches or exceeds a given payment amount, the weight of the payment relationship may be set to 2, and the others to 1. The weight of each node can be used to indicate the importance of the object corresponding to the node. For example, if the sales amount of merchant A is higher than that of merchant B, the node weight of the node corresponding to merchant A may be greater than that of the node corresponding to merchant B. An association network is then constructed according to the nodes and the weights of the corresponding association relationships.
Then, based on the constructed association network, malicious aggregation mining and analysis are performed using graph mining algorithms such as label propagation, connected graph, and community discovery. A network graph $G = (V, E)$ is obtained from the mined aggregation result. Here, $V$ denotes the node set; each node $v \in V$ has a type and a weight, where the type indicates the type of the node and the weight represents the importance of the node; and $V_U \subseteq V$ denotes the set of user nodes in the node set. $E$ denotes the set of association relationships between the nodes; each relationship $e \in E$ is an edge from node $v$ to node $w$ and has a type and a weight. The weight of the edge between nodes $v$ and $w$ can be expressed as $a(v, w)$. Thus, $a(v, w) > 0$ indicates that the objects corresponding to nodes $v$ and $w$ have an association relationship, and $a(v, w) = 0$ indicates that the objects corresponding to nodes $v$ and $w$ have no association relationship.
As an example, nodes of the user type (i.e., the nodes in the user node set $V_U$) are set as key nodes, and nodes of other types are set as optionally retained nodes. The first subgraph is denoted $G_1 = (V_1, E_1)$, where $V_1$ and $E_1$ are subsets of the node set and edge set of the network graph, and $V_{U1} = V_1 \cap V_U$ denotes the set of user nodes in the first subgraph. The second subgraph is denoted $G_2 = (V_2, E_2)$, where $V_2$ and $E_2$ are likewise subsets of the node set and edge set of the network graph, and $V_{U2} = V_2 \cap V_U$ denotes the set of user nodes in the second subgraph. Optionally, the first subgraph and the second subgraph may be initialized such that the first subgraph contains one or more user nodes and the second subgraph contains the nodes and edges of the network graph that remain after the nodes and edges in the first subgraph are removed.
Then, the updating steps (1), (2) and (3) are executed iteratively to update the first subgraph and the second subgraph until the second subgraph does not contain any user node, that is, until the set $V_{U2}$ is empty. The updating steps (1), (2), and (3) are as follows.
Step (1): a core node is determined from the nodes of the network graph $G$, the core node being the node in the network graph $G$ whose association degree with at least one user node in the first subgraph $G_1$ is greater than zero and whose sum of association degrees with all user nodes in the second subgraph $G_2$ is the largest. Specifically, with $a(w, u)$ denoting the weight of the edge between nodes $w$ and $u$ (zero if there is no edge), $V_U$ the set of user nodes in the network graph, and $V_{U1}$ the set of user nodes in the first subgraph:

For a node $w$, the association degree with the user nodes that belong to $V_U$ but not to $V_{U1}$ is computed as:

$$s_2(w) = \sum_{u \in V_U \setminus V_{U1}} a(w, u)$$

For the node $w$, the association degree with the user nodes that belong to both $V_U$ and $V_{U1}$ is computed as:

$$s_1(w) = \sum_{u \in V_{U1}} a(w, u)$$

All nodes in the network graph are traversed, and the node that satisfies $s_1(w) > 0$ and has the largest $s_2(w)$ is determined as the core node $w$:

$$w = \arg\max_{w' \in V,\; s_1(w') > 0} s_2(w')$$
Step (2): a node set is constructed, the node set including the nodes in the second subgraph $G_2$ whose association degree with the core node $w$ is greater than zero and the node in the first subgraph $G_1$ whose association degree with the core node $w$ is the largest. Then the nodes in the node set, the core node $w$, and the edges in the network graph connecting the core node with the nodes in the node set are determined as a third subgraph $G_3 = (V_3, E_3)$, where $V_3$ denotes the set of nodes in the third subgraph and $E_3$ denotes the set of edges in the third subgraph, specifically as follows, with $V_1$ and $V_2$ denoting the node sets of the first and second subgraphs, $E$ the edge set of the network graph, and $a(w, v)$ the weight of the edge between $w$ and $v$:

$$V_3 = \{w\} \cup \{v \in V_2 : a(w, v) > 0\} \cup \{\arg\max_{v \in V_1} a(w, v)\}$$

$$E_3 = \{e \in E : e \text{ connects } w \text{ and some } v \in V_3 \setminus \{w\}\}$$
FIG. 6 illustrates an exemplary process of determining the third subgraph. As shown in fig. 6, G1 represents the first subgraph, containing nodes N1, N2, N3; G2 represents the second subgraph, containing nodes K1, K2, K3, K4. The node in G1 having the maximum association degree with the core node w is N3, and the nodes in G2 whose association degree with the core node w is greater than zero are K1, K2, K3 and K4, so the node set comprises nodes N3, K1, K2, K3 and K4. The node set of the third subgraph will therefore include nodes N3, K1, K2, K3, K4 and the core node w, and the edge set of the third subgraph will include the edges connecting the core node w with nodes N3, K1, K2, K3, K4.
Step (3): the nodes and edges of the third subgraph $G_3 = (V_3, E_3)$ are added to the first subgraph $G_1 = (V_1, E_1)$, specifically:

$$V_1 \leftarrow V_1 \cup V_3$$

$$E_1 \leftarrow E_1 \cup E_3$$

The nodes and edges of the third subgraph $G_3 = (V_3, E_3)$ are removed from the second subgraph $G_2 = (V_2, E_2)$, specifically:

$$V_2 \leftarrow V_2 \setminus V_3$$

$$E_2 \leftarrow E_2 \setminus E_3$$
After the updating step is completed, the target objects corresponding to the nodes in the updated first subgraph are associated based on the edges included in the updated first subgraph. Fig. 7 illustrates an effect diagram of a target object association method according to one embodiment of the present disclosure. As shown in fig. 7, the left side is a network graph on which target object association is to be performed, and the right side is the result of the target object association. As can be seen from the figure, most of the interference information and redundant association information is removed by the method of the present disclosure, and the target objects corresponding to the nodes in the updated first subgraph are associated based on the edges included in the updated first subgraph, so that the association of the target objects is more accurate and efficient.
Fig. 8 shows an exemplary structural block diagram of a target object associating apparatus 800 according to an embodiment of the present disclosure. As shown in fig. 8, the target object associating apparatus 800 includes an obtaining module 810, an initializing module 820, an updating module 830, and an associating module 840.
The obtaining module 810 is configured to obtain a network graph constructed based on a plurality of objects of different types, where the network graph includes a plurality of nodes corresponding to the plurality of objects and edges connecting the nodes, the types of the nodes correspond to the types of the objects, and each edge represents an association relationship between the objects corresponding to the two nodes it connects and has a weight to represent the association degree between the two nodes, the association degree between the two nodes indicating the association degree between the corresponding objects. In some embodiments, the type of the object to which a node in the network graph corresponds comprises at least one of: user type, account type, merchant type, credential type, and the key types include the user type. For example, the association includes at least one of: a binding relationship between the object of the user type and the object of the account type, a complaint relationship between the object of the user type and the object of the user type, a payment relationship between the object of the user type and the object of the merchant type, and a real name relationship between the object of the user type and the object of the certificate type. For example, in an application scenario in which rogue users are associated, a node in the network graph may carry information such as the type of the object to which the node corresponds (for example, user node, account node, merchant node, credential node, etc.), and an edge in the network graph may carry information about the association relationship between the objects to which its two nodes correspond.
For example, the edge connecting a user node and an account node may represent a binding relationship between the user-type object and the account-type object, the edge connecting a user node and another user node may represent a complaint relationship or a payment relationship between the two user-type objects, the edge connecting a user node and a merchant node may represent a payment relationship between the user-type object and the merchant-type object, and the edge connecting a user node and a credential node may represent a real-name relationship between the user-type object and the credential-type object. In some embodiments, the obtaining module 810 may obtain the network graph constructed based on a plurality of objects of different types in the following manner: first, network object data is obtained, which may comprise multiple types of objects and the association relationships among the objects; then, a heterogeneous graph is determined based on the types of the objects and the association relationships among the objects in the network object data, where the heterogeneous graph comprises nodes representing the objects and edges representing the association relationships among the objects, the nodes have node weights to represent the importance of the objects corresponding to the nodes, and the edges have edge weights to represent the association degrees between the objects corresponding to the two nodes connected by each edge; finally, a subgraph of the heterogeneous graph is mined from the heterogeneous graph to serve as the network graph.
The initialization module 820 is configured to initialize a first subgraph and a second subgraph such that the first subgraph contains one or more key nodes in the network graph and the second subgraph includes the nodes and edges in the network graph that remain after the nodes in the first subgraph are removed, wherein each key node represents a node corresponding to an object having a key type. For example, in a scenario in which rogue users are associated, the objects of a key type may include user objects, and the key nodes may include the user nodes corresponding to the user objects in the network graph. For example, a first subgraph is initialized such that the first subgraph contains one or more user nodes, and a second subgraph includes the nodes and edges in the network graph that remain after the nodes contained in the first subgraph are removed.
The update module 830 is configured to iteratively perform update steps to update the first subgraph and the second subgraph until no key nodes are included in the second subgraph, the update steps including: determining a core node from the nodes of the network graph, the core node being a node in the network graph whose association degree with at least one of the one or more key nodes in the first subgraph is greater than zero and whose sum of association degrees with all key nodes in the second subgraph is the maximum; determining a node set such that the node set comprises the nodes in the second subgraph whose association degree with the core node is greater than zero and the node in the first subgraph whose association degree with the core node is the maximum, and determining the nodes in the node set, the core node, and the edges of the network graph connecting the core node and the nodes in the node set as a third subgraph; and adding the nodes and edges of the third subgraph to the first subgraph and removing the nodes and edges of the third subgraph from the second subgraph. For example, if the third subgraph includes some of the user nodes, merchant nodes, and account nodes and some of the edges connecting them, those user nodes, merchant nodes, account nodes, and edges are added to the first subgraph and removed from the second subgraph.
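The update steps can be sketched in Python as follows. This is an illustrative reading of the text, not the authoritative algorithm: the names `assoc` and `grow_first_subgraph` are hypothetical, ties are broken by sorted node order, and the early exit when no candidate core node has a positive sum is a guard added here so that the loop always terminates on disconnected inputs.

```python
def assoc(weights, u, v):
    """Association degree between two nodes; 0.0 when no edge joins them."""
    return weights.get(frozenset((u, v)), 0.0)


def grow_first_subgraph(nodes, weights, key_nodes, seeds):
    """Iterate the update step until the second subgraph has no key node.

    weights: dict mapping frozenset({u, v}) -> association degree (> 0)
    Returns the node set and edge set of the updated first subgraph.
    """
    first_nodes, first_edges = set(seeds), set()
    second_nodes = set(nodes) - first_nodes
    while second_nodes & key_nodes:
        first_keys = first_nodes & key_nodes
        second_keys = second_nodes & key_nodes
        # Core node: linked to at least one key node of the first subgraph,
        # with the largest summed association to key nodes of the second.
        core, best_sum = None, 0.0
        for n in sorted(nodes):
            if not any(assoc(weights, n, k) > 0 for k in first_keys):
                continue
            total = sum(assoc(weights, n, k) for k in second_keys)
            if total > best_sum:
                core, best_sum = n, total
        if core is None:
            break  # added guard: remaining key nodes cannot be reached
        # Node set: second-subgraph nodes linked to the core, plus the
        # first-subgraph node most strongly associated with the core.
        node_set = {n for n in second_nodes if assoc(weights, core, n) > 0}
        anchor = max(sorted(first_nodes - {core}),
                     key=lambda n: assoc(weights, core, n))
        node_set.add(anchor)
        # Third subgraph: the core, the node set, and the edges joining them.
        third_nodes = node_set | {core}
        third_edges = {frozenset((core, n)) for n in node_set}
        first_nodes |= third_nodes
        first_edges |= third_edges
        second_nodes -= third_nodes
    return first_nodes, first_edges
```

Because a chosen core node always has a positive association with at least one key node still in the second subgraph, each iteration moves at least one node out of the second subgraph, so the loop ends after finitely many steps.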
The associating module 840 is configured to associate target objects corresponding to nodes in the updated first sub-graph based on edges included in the updated first sub-graph. For example, the updated first sub-graph includes all user nodes and some merchant nodes, and some edges between them. And associating all the user nodes with the part of the merchant nodes according to the edges included in the updated first sub-graph.
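One simple way to read this association step is sketched below in Python. The function name `associate_objects` and the `node_to_object` mapping are hypothetical; the sketch assumes that two target objects are treated as associated whenever their nodes share an edge in the updated first subgraph.

```python
from collections import defaultdict


def associate_objects(first_edges, node_to_object):
    """Map each object to the objects it is associated with.

    first_edges:    iterable of frozenset({u, v}) edges of the first subgraph
    node_to_object: dict mapping node id -> target object identifier
    """
    associated = defaultdict(set)
    for edge in first_edges:
        u, v = tuple(edge)
        # Each edge associates the two objects behind its endpoints.
        associated[node_to_object[u]].add(node_to_object[v])
        associated[node_to_object[v]].add(node_to_object[u])
    return dict(associated)
```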
Fig. 9 illustrates an example system 900 that includes an example computing device 910 that represents one or more systems and/or devices that can implement the various techniques described herein. The computing device 910 may be, for example, a server of a service provider, a device associated with a server, a system on a chip, and/or any other suitable computing device or computing system. The target object associating means 800 described above with reference to fig. 8 may take the form of a computing device 910. Alternatively, the target object associating means 800 may be implemented as a computer program in the form of an application 916.
The example computing device 910 as illustrated includes a processing system 911, one or more computer-readable media 912, and one or more I/O interfaces 913 communicatively coupled to each other. Although not shown, the computing device 910 may also include a system bus or other data and command transfer system that couples the various components to one another. A system bus can include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures. Various other examples are also contemplated, such as control and data lines.
The processing system 911 represents functionality to perform one or more operations using hardware. Accordingly, the processing system 911 is illustrated as including hardware elements 914 that may be configured as processors, functional blocks, and the like. This may include implementation in hardware as an application specific integrated circuit or other logic device formed using one or more semiconductors. Hardware element 914 is not limited by the material from which it is formed or the processing mechanisms employed therein. For example, a processor may be comprised of semiconductor(s) and/or transistors (e.g., electronic Integrated Circuits (ICs)). In such a context, processor-executable instructions may be electronically-executable instructions.
The computer-readable medium 912 is illustrated as including a memory/storage 915. Memory/storage 915 represents memory/storage capacity associated with one or more computer-readable media. The memory/storage 915 may include volatile media (such as Random Access Memory (RAM)) and/or nonvolatile media (such as Read Only Memory (ROM), flash memory, optical disks, magnetic disks, and so forth). The memory/storage 915 may include fixed media (e.g., RAM, ROM, a fixed hard drive, etc.) as well as removable media (e.g., flash memory, a removable hard drive, an optical disk, and so forth). The computer-readable medium 912 may be configured in various other ways as further described below.
One or more I/O interfaces 913 represent functionality that allows a user to enter commands and information to computing device 910 using various input devices and optionally also allows information to be presented to the user and/or other components or devices using various output devices. Examples of input devices include a keyboard, a cursor control device (e.g., a mouse), a microphone (e.g., for voice input), a scanner, touch functionality (e.g., capacitive or other sensors configured to detect physical touch), a camera (e.g., motion that may not involve touch may be detected as gestures using visible or invisible wavelengths such as infrared frequencies), and so forth. Examples of output devices include a display device (e.g., a monitor or projector), speakers, a printer, a network card, a haptic response device, and so forth. Thus, the computing device 910 may be configured in various ways to support user interaction, as described further below.
The computing device 910 also includes an application 916. The application 916 may be, for example, a software instance of the target object association apparatus 800 and implement the techniques described herein in combination with other elements in the computing device 910.
Various techniques may be described herein in the general context of software, hardware elements, or program modules. Generally, these modules include routines, programs, objects, elements, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The terms "module," "functionality," and "component" as used herein generally represent software, firmware, hardware, or a combination thereof. The features of the techniques described herein are platform-independent, meaning that the techniques may be implemented on a variety of computing platforms having a variety of processors.
An implementation of the described modules and techniques may be stored on or transmitted across some form of computer readable media. Computer readable media can include a variety of media that can be accessed by computing device 910. By way of example, and not limitation, computer-readable media may comprise "computer-readable storage media" and "computer-readable signal media".
"computer-readable storage medium" refers to a medium and/or device, and/or a tangible storage apparatus, capable of persistently storing information, as opposed to mere signal transmission, carrier wave, or signal per se. Accordingly, computer-readable storage media refers to non-signal bearing media. Computer-readable storage media include hardware such as volatile and nonvolatile, removable and non-removable media and/or storage devices implemented in a method or technology suitable for storage of information such as computer-readable instructions, data structures, program modules, logic elements/circuits or other data. Examples of computer readable storage media may include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, hard disks, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or other storage devices, tangible media, or an article of manufacture suitable for storing the desired information and accessible by a computer.
"computer-readable signal medium" refers to a signal-bearing medium configured to transmit instructions to hardware of computing device 910, such as via a network. Signal media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave, data signal or other transport mechanism. Signal media also includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.
As previously described, hardware element 914 and computer-readable medium 912 represent instructions, modules, programmable device logic, and/or fixed device logic implemented in hardware that, in some embodiments, may be used to implement at least some aspects of the techniques described herein. The hardware elements may include integrated circuits or systems-on-chips, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs), and other implementations in silicon or components of other hardware devices. In this context, a hardware element may serve as a processing device that performs program tasks defined by instructions, modules, and/or logic embodied by the hardware element, as well as a hardware device for storing instructions for execution, such as the computer-readable storage medium described previously.
Combinations of the foregoing may also be used to implement the various techniques and modules described herein. Thus, software, hardware, or program modules and other program modules may be implemented as one or more instructions and/or logic embodied on some form of computer-readable storage medium and/or by one or more hardware elements 914. The computing device 910 may be configured to implement particular instructions and/or functions corresponding to software and/or hardware modules. Thus, a module that is executable by the computing device 910 as software may be implemented at least partially in hardware, for example, using the computer-readable storage media and/or the hardware elements 914 of the processing system. The instructions and/or functions may be executable/operable by one or more articles of manufacture (e.g., one or more computing devices 910 and/or processing systems 911) to implement the techniques, modules, and examples described herein.
In various implementations, the computing device 910 may assume a variety of different configurations. For example, the computing device 910 may be implemented as a computer-like device including a personal computer, a desktop computer, a multi-screen computer, a laptop computer, a netbook, and so forth. The computing device 910 may also be implemented as a mobile device-like device including mobile devices such as mobile telephones, portable music players, portable gaming devices, tablet computers, multi-screen computers, and the like. The computing device 910 may also be implemented as a television-like device that includes or is connected to a device having a generally larger screen in a casual viewing environment. These devices include televisions, set-top boxes, game consoles, and the like.
The techniques described herein may be supported by these various configurations of the computing device 910 and are not limited to specific examples of the techniques described herein. Functionality may also be implemented in whole or in part on "cloud" 920 through the use of a distributed system, such as through platform 922 as described below.
Cloud 920 includes and/or is representative of a platform 922 for resources 924. The platform 922 abstracts underlying functionality of hardware (e.g., servers) and software resources of the cloud 920. The resources 924 may include applications and/or data that may be used when executing computer processes on servers remote from the computing device 910. The resources 924 may also include services provided over the internet and/or over a subscriber network such as a cellular or Wi-Fi network.
The platform 922 may abstract resources and functionality to connect the computing device 910 with other computing devices. The platform 922 may also be used to abstract the scaling of resources so as to provide a level of scale corresponding to the demand encountered for the resources 924 implemented via the platform 922. Thus, in interconnected device embodiments, implementation of the functions described herein may be distributed throughout the system 900. For example, the functionality may be implemented in part on the computing device 910 and in part by the platform 922 that abstracts the functionality of the cloud 920.
It will be appreciated that embodiments of the disclosure have been described with reference to different functional units for clarity. However, it will be apparent that the functionality of each functional unit may be implemented in a single unit, in a plurality of units or as part of other functional units without departing from the disclosure. For example, functionality illustrated to be performed by a single unit may be performed by a plurality of different units. Thus, references to specific functional units are only to be seen as references to suitable units for providing the described functionality rather than indicative of a strict logical or physical structure or organization. Thus, the present disclosure may be implemented in a single unit or may be physically and functionally distributed between different units and circuits.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various devices, elements, components or sections, these devices, elements, components or sections should not be limited by these terms. These terms are only used to distinguish one device, element, component or section from another device, element, component or section.
Although the present disclosure has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present disclosure is limited only by the accompanying claims. Additionally, although individual features may be included in different claims, these may possibly advantageously be combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. The order of features in the claims does not imply any specific order in which the features must be worked. Furthermore, in the claims, the word "comprising" does not exclude other elements, and the terms "a" or "an" do not exclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.

Claims (15)

1. A target object association method, comprising:
acquiring a network graph constructed based on a plurality of objects of different types, wherein the network graph comprises a plurality of nodes corresponding to the plurality of objects and edges connecting between the nodes, the types of the nodes correspond to the types of the objects, each edge represents an association relationship between the objects corresponding to two nodes connected by each edge and has a weight to represent the association degree between the two nodes, and the association degree between the two nodes indicates the association degree between the objects corresponding to the two nodes;
initializing a first subgraph and a second subgraph, wherein the first subgraph comprises one or more key nodes in the network graph, the second subgraph comprises nodes and edges which are left in the network graph after the nodes in the first subgraph are removed, and each key node represents a node corresponding to an object with a key type;
iteratively performing the following updating steps to update the first sub-graph and the second sub-graph until the second sub-graph does not include a key node, the updating steps including:
determining a core node from the nodes of the network graph, the core node being a node in the network graph having an association with at least one of the one or more key nodes in the first subgraph greater than zero and having a maximum sum of associations with all key nodes in the second subgraph;
determining a node set so that the node set comprises nodes with the association degree with a core node larger than zero in a second subgraph and nodes with the maximum association degree with the core node in a first subgraph, and determining the nodes in the node set, the core node and edges connecting the core node and the nodes in the node set in a network graph as a third subgraph;
adding nodes and edges of a third subgraph in the first subgraph and removing nodes and edges of the third subgraph in the second subgraph;
and associating the target objects corresponding to the nodes in the updated first sub-graph based on the edges included in the updated first sub-graph.
2. The method of claim 1, wherein the determining a core node from the nodes of the network graph comprises:
traversing all nodes of the network graph, and, in response to the association degree between the traversed current node and at least one key node in the first subgraph being greater than zero, determining the sum of the association degrees between the current node and all key nodes in the second subgraph;
and determining the node corresponding to the maximum sum value in all the determined sum values as the core node.
3. The method of claim 1, wherein each node in the network graph has a weight to represent an importance of the node, and,
wherein determining the set of nodes such that the set of nodes includes nodes in the second subgraph having a degree of association with the core node greater than zero and nodes in the first subgraph having a maximum degree of association with the core node comprises:
in response to the first sub-graph comprising a plurality of nodes with the maximum association degree with the core node, acquiring the node with the maximum node weight in the plurality of nodes in the first sub-graph;
determining a node set so that the node set comprises nodes with the association degree with the core node being greater than zero in the second subgraph and the obtained nodes with the maximum node weight.
4. The method of claim 1, wherein the obtaining a network graph constructed based on a plurality of objects of different types comprises:
acquiring network object data, wherein the network object data comprises multiple types of objects and association relationships among the objects;
determining a heterogeneous graph based on the type of the objects in the network object data and the association relationship between the objects, wherein the heterogeneous graph comprises nodes for representing the objects and edges for representing the association relationship between the objects, the nodes for representing the objects have node weights for representing the importance of the objects corresponding to the nodes, and the edges for representing the association relationship between the objects have edge weights for representing the association degree between the objects corresponding to two nodes connected by the edges;
and mining a subgraph of the heterogeneous graph from the heterogeneous graph to obtain the network graph.
5. The method of claim 4, wherein said obtaining network object data further comprises:
acquiring relation network data of various objects;
performing data cleaning on the acquired relational network data to obtain the network object data; wherein the data cleansing includes one or more of the following operations: removing incomplete data in the relational network data, removing repeated data in the relational network data and removing error data in the relational network data.
6. The method of claim 4, wherein the mining a subgraph of the heterogeneous graph from the heterogeneous graph to obtain the network graph comprises:
establishing an initial set based on each node in the heterogeneous graph, so as to divide a plurality of nodes in the heterogeneous graph into a plurality of initial sets, and determining the modularity of each initial set, wherein the modularity of the sets characterizes the relevance of the connection among the nodes in the sets;
taking each of the plurality of initial sets as a current set;
iteratively performing a network update step until the modularity of the resulting update set does not increase any more, the update step comprising: forming an update set by assigning each node in the heterogeneous graph to a current set in which neighboring nodes are located, such that a modularity of the update set is greater than a modularity of the current set in which the neighboring nodes are located, and taking the update set as the current set;
and taking one of the obtained current sets as the network graph.
7. The method of claim 6, wherein said treating one of the resulting plurality of current sets as the network graph comprises:
determining a modularity for each of the obtained plurality of current sets;
and taking the set corresponding to the maximum modularity in the obtained multiple current sets as the network graph.
8. The method of claim 4, wherein the mining a subgraph of the heterogeneous graph from the heterogeneous graph to obtain the network graph comprises:
and mining the maximum connected subgraph of the heterogeneous graph from the heterogeneous graph to serve as the network graph.
9. The method of claim 1, wherein the types of node-corresponding objects in the network graph comprise at least one of: user type, account type, merchant type, credential type, and the key types include user type.
10. The method of claim 1, wherein the association comprises at least one of: a binding relationship between the object of the user type and the object of the account type, a complaint relationship between the object of the user type and the object of the user type, a payment relationship between the object of the user type and the object of the merchant type, and a real name relationship between the object of the user type and the object of the certificate type.
11. A target object association apparatus, comprising:
an obtaining module configured to obtain a network graph constructed based on a plurality of objects of different types, wherein the network graph includes a plurality of nodes corresponding to the plurality of objects and edges connecting between the nodes, the types of the nodes correspond to the types of the objects, each edge represents an association relationship between the objects corresponding to two nodes connected by each edge and has a weight to represent an association degree between the two nodes, and the association degree between the two nodes indicates an association degree between the corresponding objects between the two nodes;
an initialization module configured to initialize a first subgraph and a second subgraph such that the first subgraph contains one or more key nodes in the network graph, the second subgraph including nodes and edges in the network graph that remain after the nodes in the first subgraph are removed, wherein each key node represents a node corresponding to an object having a key type;
an update module configured to iteratively perform update steps to update the first subgraph and the second subgraph until no key nodes are included in the second subgraph, the update steps comprising:
determining a core node from the nodes of the network graph, the core node being a node in the network graph having an association with at least one of the one or more key nodes in the first subgraph greater than zero and having a maximum sum of associations with all key nodes in the second subgraph;
determining a node set so that the node set comprises nodes with the association degree with a core node larger than zero in a second subgraph and nodes with the maximum association degree with the core node in a first subgraph, and determining the nodes in the node set, the core node and edges connecting the core node and the nodes in the node set in a network graph as a third subgraph;
adding nodes and edges of a third subgraph in the first subgraph and removing nodes and edges of the third subgraph in the second subgraph;
an association module configured to associate target objects corresponding to nodes in the updated first sub-graph based on edges included in the updated first sub-graph.
12. The apparatus of claim 11, wherein the acquisition module is configured to:
acquiring network object data, wherein the network object data comprises multiple types of objects and association relationships among the objects;
determining a heterogeneous graph based on the type of the objects in the network object data and the association relationship between the objects, wherein the heterogeneous graph comprises nodes for representing the objects and edges for representing the association relationship between the objects, the nodes for representing the objects have node weights for representing the importance of the objects corresponding to the nodes, and the edges for representing the association relationship between the objects have edge weights for representing the association degree between the objects corresponding to two nodes connected by the edges;
and mining a subgraph of the heterogeneous graph from the heterogeneous graph to obtain the network graph.
13. A computing device, comprising:
a memory configured to store computer-executable instructions; and
a processor configured to perform the method of any one of claims 1-10 when the computer-executable instructions are executed by the processor.
14. A computer-readable storage medium storing computer-executable instructions that, when executed, perform the method of any one of claims 1-10.
15. A computer program product comprising computer executable instructions, wherein the computer executable instructions, when executed by a processor, perform the method according to any one of claims 1-10.
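The set-merging procedure recited in claims 6 and 7 resembles the local-moving phase of modularity-based community detection, and can be sketched in Python as below. This is an assumption-laden illustration: the claims do not give a modularity formula, so the standard weighted modularity is swapped in, and the names `community_scores`, `modularity`, and `merge_sets` as well as the tolerance `eps` are hypothetical.

```python
from collections import defaultdict


def community_scores(weights, community):
    """Per-set modularity contribution; total modularity is their sum."""
    two_m = 2.0 * sum(weights.values())
    internal, total = defaultdict(float), defaultdict(float)
    for edge, w in weights.items():
        u, v = tuple(edge)
        total[community[u]] += w
        total[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += 2.0 * w
    if two_m == 0:
        return {}
    return {c: internal[c] / two_m - (total[c] / two_m) ** 2 for c in total}


def modularity(weights, community):
    return sum(community_scores(weights, community).values())


def merge_sets(nodes, weights, eps=1e-12):
    """Start with one set per node and move nodes into neighbouring sets
    while doing so increases the modularity."""
    community = {n: n for n in nodes}        # initial sets: one per node
    neighbours = defaultdict(set)
    for edge in weights:
        u, v = tuple(edge)
        neighbours[u].add(v)
        neighbours[v].add(u)
    best_q = modularity(weights, community)
    improved = True
    while improved:                          # stop when no move helps
        improved = False
        for n in sorted(nodes):
            original = best_c = community[n]
            candidates = {community[x] for x in neighbours[n]} - {original}
            for c in sorted(candidates):
                community[n] = c             # tentatively move n into set c
                q = modularity(weights, community)
                if q > best_q + eps:
                    best_q, best_c = q, c
            community[n] = best_c            # keep the best move, if any
            improved = improved or best_c != original
    return community, best_q
```

Recomputing the full modularity for every tentative move keeps the sketch short at the cost of speed; a practical implementation would typically update the score incrementally. The set with the largest value in `community_scores` could then be taken as the network graph, in the spirit of claim 7.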
CN202111157861.7A 2021-09-30 2021-09-30 Target object association method and device, computing equipment and storage medium Active CN113609345B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111157861.7A CN113609345B (en) 2021-09-30 2021-09-30 Target object association method and device, computing equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113609345A (en) 2021-11-05
CN113609345B (en) 2021-12-10

Family

ID=78343290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111157861.7A Active CN113609345B (en) 2021-09-30 2021-09-30 Target object association method and device, computing equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113609345B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114443783B (en) * 2022-04-11 2022-06-24 浙江大学 Supply chain data analysis and enhancement processing method and device
CN115268282A (en) * 2022-06-29 2022-11-01 青岛海尔科技有限公司 Control method and device of household appliance, storage medium and electronic device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014000435A1 (en) * 2012-06-25 2014-01-03 华为技术有限公司 Method and system for excavating topic core circle in social network
KR20180137387A (en) * 2017-06-15 2018-12-27 한양대학교 산학협력단 Apparatus and method for detecting overlapping community
CN110032665A (en) * 2019-03-25 2019-07-19 阿里巴巴集团控股有限公司 Determine the method and device of node of graph vector in relational network figure
US10795895B1 (en) * 2017-10-26 2020-10-06 EMC IP Holding Company LLC Business data lake search engine
CN112214499A (en) * 2020-12-03 2021-01-12 腾讯科技(深圳)有限公司 Graph data processing method and device, computer equipment and storage medium
CN112926990A (en) * 2021-03-25 2021-06-08 支付宝(杭州)信息技术有限公司 Method and device for fraud identification
CN113111193A (en) * 2021-05-25 2021-07-13 合肥工业大学 Data processing method and device of knowledge graph
WO2021184367A1 (en) * 2020-03-20 2021-09-23 清华大学 Social network graph generation method based on degree distribution generation model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9195941B2 (en) * 2013-04-23 2015-11-24 International Business Machines Corporation Predictive and descriptive analysis on relations graphs with heterogeneous entities

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
重叠社区发现的两段策略;陈端兵等;《计算机科学》;20130115(第01期);第225-228页 *

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40055291; Country of ref document: HK)