CN111563187A - Relationship determination method, device and system and electronic equipment

Publication number
CN111563187A
Authority
CN
China
Prior art keywords
relationship
graph
relation
association
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010411078.8A
Other languages
Chinese (zh)
Inventor
李瑾瑜
孙月梅
张志磊
张天颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202010411078.8A priority Critical patent/CN111563187A/en
Publication of CN111563187A publication Critical patent/CN111563187A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Abstract

The disclosure provides a relationship determination method, a relationship determination device, a relationship determination system and electronic equipment. The method comprises the following steps: determining a first relation graph, wherein the first relation graph is used for reflecting a first incidence relation among a plurality of objects; determining a second incidence relation among the plurality of objects based on the first relation graph, wherein the second incidence relation represents a potential incidence relation among the plurality of objects; and supplementing the first relationship graph based on the second incidence relationship to determine a second relationship graph.

Description

Relationship determination method, device and system and electronic equipment
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a system, and an electronic device for determining a relationship.
Background
With the continuous development of information technology, it is a research hotspot to mine the association relationship between different things so as to improve the business processing effect. For example, the risk of a client has a certain propagation property, that is, when a risk event occurs to a single client, the risk is gradually propagated to other clients which are in an associated relationship with the single client. The types of the association relationship between the clients are various (such as the relationship types of capital, trade, investment, guarantee, relatives, operation, social contact and the like), and the mining of the association relationship between the clients has important significance for risk prevention.
In the course of implementing the disclosed concept, the inventors found that the related art has at least the following problem: because information acquisition capability is limited, the association relationships between clients determined by collecting public client information are not comprehensive, so that risk prevention requirements cannot be met.
Disclosure of Invention
In view of this, the present disclosure provides a method, an apparatus, a system, and an electronic device for determining a relationship that can effectively mine an association relationship between objects.
One aspect of the present disclosure provides a relationship determination method, including: first, a first relation graph is determined, wherein the first relation graph is used for reflecting a first association relation among a plurality of objects. Then, a second association relationship between the plurality of objects is determined based on the first relationship graph. The first relationship graph is then supplemented based on the second associative relationship to determine a second relationship graph.
The relationship determination method provided by the embodiments of the present disclosure determines a first relationship graph reflecting a first association relationship among a plurality of objects, where the first association relationship may be determined based on collected public information about the objects. A potential association relationship among the objects is then determined based on the first relationship graph, so that the first relationship graph can be supplemented based on the potential association relationship to obtain a second relationship graph. The second relationship graph represents the relationships among the objects more comprehensively, which helps improve risk prevention and control capability.
One aspect of the present disclosure provides a relationship determination apparatus including: the device comprises a first determination module, a second determination module and a third determination module. The first determining module is used for determining a first relation graph, and the first relation graph is used for reflecting a first incidence relation among the plurality of objects. The second determination module is used for determining a second incidence relation among the plurality of objects based on the first relation graph, and the second incidence relation characterizes potential incidence relations among the plurality of objects. The third determination module is configured to supplement the first relationship graph based on the second incidence relation to determine a second relationship graph.
One aspect of the present disclosure provides a relationship determination system, comprising: the system comprises a potential incidence relation probing module and an incidence relation structure reconstruction module. The potential association relation probing module is used for determining potential association relations among the plurality of objects based on a first relation graph, and the first relation graph is used for reflecting the first association relations among the plurality of objects. The incidence relation structure reconstructing module is used for reconstructing a first relation graph based on potential incidence relations among a plurality of objects.
Another aspect of the present disclosure provides an electronic device comprising one or more processors and a storage for storing executable instructions that, when executed by the processors, implement the method as described above.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the method as described above when executed.
Another aspect of the disclosure provides a computer program comprising computer executable instructions for implementing the method as described above when executed.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario of a relationship determination method, apparatus, system and electronic device according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates an exemplary system architecture to which the relationship determination methods, apparatuses, systems and electronic devices may be applied, according to embodiments of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a relationship determination method according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a diagram of processing values of initial incidence relation features according to an embodiment of the present disclosure;
FIG. 5 schematically shows a structural diagram of an association determination model according to an embodiment of the present disclosure;
FIG. 6 schematically shows a schematic diagram of a first relationship diagram according to an embodiment of the disclosure;
FIG. 7 is a schematic diagram illustrating a process for processing the first relationship diagram shown in FIG. 6 using an association determination model;
FIG. 8 schematically illustrates a schematic diagram of nodes to be detected, node pairs and nodes associated with the nodes to be detected according to an embodiment of the disclosure;
FIG. 9 schematically shows a schematic diagram of a first incidence relation and a second incidence relation according to an embodiment of the present disclosure;
FIG. 10 schematically shows a structural diagram of a relationship determination apparatus according to an embodiment of the present disclosure;
FIG. 11 schematically illustrates a structural schematic of a relationship determination system according to an embodiment of the disclosure; and
FIG. 12 schematically shows a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. One or more embodiments may be practiced without these specific details. In the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art and are to be interpreted as having a meaning that is consistent with the context of this specification and not in an idealized or overly formal sense expressly so defined herein.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). The terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more features.
In order to facilitate understanding of the technical solution of the present disclosure, the relationship graph and the graph neural network suitable for the association relationship determination model are first explained. A relationship graph is a data structure described by entities (which may be represented by nodes) and relationships (which may be represented by connecting lines between nodes, or edges for short). Here the relationship graph is formed from points and edges, where the clients are the points and the association relationships between clients are the edges. This allows clients and their relationships to be analyzed using graph neural networks (e.g., deep learning models that can analyze graphs).
The potential association relationship of the client is the association relationship which is not controlled or influenced by investment, operation decision, fund scheduling, production operation and the like between the client and the client. These relationships actually exist but are not held by the users (e.g., enterprises and institutions such as banks, companies, associations, etc., administrative agencies, individuals, etc.) who need the association information. According to the embodiment of the disclosure, the disclosed customer information is collected, the relationship graph is determined based on the disclosed customer information, then the relationship graph is analyzed, the potential association relationship between the customer and the customer is found, the relationship between the customers is more comprehensively displayed, and the method has important significance for risk prevention.
The embodiments of the present disclosure provide a relationship determination method, a relationship determination device, a relationship determination system and electronic equipment. The relationship determination method comprises a potential association relationship determination process and a relationship reconstruction process. In the potential association relationship determination process, a first relationship graph is first determined, the first relationship graph reflecting a first association relationship among a plurality of objects; a second association relationship among the plurality of objects is then determined based on the first relationship graph, the second association relationship representing a potential association relationship among the plurality of objects. After the potential association relationship determination process is completed, the first relationship graph is supplemented based on the second association relationship to determine a second relationship graph. The embodiments of the present disclosure can use a graph neural network to help the user explore the potential association relationships of clients, so that the user can grasp the relationship structure of clients more comprehensively, gain better insight into the relationships between clients, and obtain a basis for decisions such as risk management and risk prevention and control.
Fig. 1 schematically illustrates an application scenario of a relationship determination method, apparatus, system and electronic device according to an embodiment of the present disclosure.
As shown in fig. 1, in one scenario, a user may wish to understand the risk of cooperating with client 1 more fully (e.g., the risk points, whether the risk is controllable, etc.) before deciding to cooperate with client 1. Based on the known information, it can be seen that an association relationship exists (e.g., between client 1 and client 3, and between client 1 and client 4). From this information, the user can determine that, if client 3 and/or client 4 carries a large risk (e.g., a credit investigation problem, abnormal debt, etc.), cooperation with client 1 may be affected by that risk, so the user can decide whether to cooperate with client 1 based on this analysis. In addition, the user may find that client 3 and client 4 each have an association relationship with client 2, but may not know that an association relationship also exists between client 1 and client 2. If the user then considers cooperating with client 1 without considering whether the risk of client 2 may affect that cooperation, the controllability of the risk may not meet the user's requirements. The relationship determination method, device, system and electronic equipment provided by the embodiments of the present disclosure can mine the association relationship between client 1 and client 2 based on the relationship graph, so that the user can assess the risk of cooperating with client 1 based on a more comprehensive view of client relationships, which supports risk management and control.
The above-mentioned scenarios are only exemplary and should not be construed as limiting the present disclosure, and the relationship determination method, apparatus, system and electronic device according to the embodiments of the present disclosure may also be applied to other scenarios that require probing potential association relationships between objects.
Fig. 2 schematically illustrates an exemplary system architecture to which the relationship determination method, apparatus, system, and electronic device may be applied, according to an embodiment of the present disclosure. It should be noted that fig. 2 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 2, the system architecture 200 according to this embodiment may include terminal devices 201, 202, 203, a network 204, and servers 205, 206. The network 204 may include a plurality of gateways, hubs, network wires, etc. to provide a medium for communication links between the terminal devices 201, 202, 203 and the servers 205, 206. Network 204 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user can use the terminal devices 201, 202 and 203 to interact with other terminal devices and servers 205 and 206 through the network 204 to receive or send information and the like, such as association relation request sending, information request sending, processing result receiving and the like. The terminal devices 201, 202, 203 may be installed with various communication client applications, such as applications installed on a website terminal, such as risk management and control applications, bank applications, operation and maintenance applications, web browser applications, search applications, office applications, instant messaging tools, mailbox clients, social platform software, and the like (for example only).
The terminal devices 201, 202, 203 include, but are not limited to, self-service terminals, smart phones, virtual reality devices, augmented reality devices, tablets, laptop portable computers, and the like.
The server 205 may receive a request, for example, a potential association request (e.g., for a specific client, for a group of clients with a certain commonality, for a group of clients with a certain time period, etc.), and the server 205 may obtain required information (e.g., obtain object information to generate a first relationship diagram, or directly read an existing relationship diagram, etc.) from the server 206 (e.g., an information platform, a transaction platform, a cloud database, etc.) or itself, and then determine a potential association of the object based on the obtained information. For example, the servers 205, 206 may be background management servers, server clusters, and the like. The background management server may analyze and process the received service request, information request, database update instruction, and the like, and feed back a processing result (such as requested information, a processing result, and the like) to the terminal device.
It should be noted that the relationship determination method provided by the embodiment of the present disclosure may be generally executed by the server 205. The relationship determination method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 205 and is capable of communicating with the terminal devices 201, 202, 203 and/or the servers 205, 206. It should be understood that the number of terminal devices, networks, and servers are merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 3 schematically shows a flow chart of a relationship determination method according to an embodiment of the present disclosure.
As shown in fig. 3, the relationship determination method includes operations S301 to S305.
In operation S301, a first relationship diagram reflecting a first association relationship between a plurality of objects is determined.
In this embodiment, the objects include, but are not limited to: entities (customers such as organizations or individuals), events (such as interactions, transactions, operations, items, etc.). The first relational graph may be generated based on the acquired object information, may be directly read (such as a predetermined relational graph or an initial relational graph), or may be a sub-relational graph obtained by processing an existing relational graph. The relationship graph may be stored in the form of an intuitive node graph, or may be stored in a specific data structure, which is not limited herein.
In one embodiment, determining the first relationship graph may include the following operations. First, based on the collected information of the plurality of objects, a first association relationship between the plurality of objects is determined. Then, a first relation graph is constructed based on the first incidence relation.
For example, when the relationship graph is constructed for the first time, the full graph may be stored in graph form according to the collected clients and the different association relationships between them. The mathematical expression is $G = (V, E)$, where $G$ is the full graph representing all clients and their association relationships; $V$ is the set of nodes, with each node $v_i \in V$ representing a client; and $E$ is the set of edges, with each edge $e_{i,j} = (v_i, v_j) \in E$ representing an association relationship, where $i$ and $j$ are integers.
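As an illustration of the definition of the node set V and edge set E above, the following is a minimal sketch of one possible in-memory representation. It assumes undirected edges and uses hypothetical client identifiers and relationship types modeled on the scenario of fig. 1; the function name `build_relationship_graph` and the record layout are assumptions for illustration and are not taken from the disclosure.

```python
from collections import defaultdict


def build_relationship_graph(relations):
    """Build G = (V, E) from (client_i, client_j, relation_type) records.

    Assumption: edges are treated as undirected, so each edge is keyed
    by the unordered pair of its two endpoints.
    """
    nodes = set()                 # V: one node per client
    edges = {}                    # E: unordered pair -> set of relation types
    adjacency = defaultdict(set)  # neighbor lookup derived from E
    for vi, vj, rel_type in relations:
        nodes.update((vi, vj))
        edges.setdefault(frozenset((vi, vj)), set()).add(rel_type)
        adjacency[vi].add(vj)
        adjacency[vj].add(vi)
    return nodes, edges, dict(adjacency)


# Hypothetical records mirroring the known relationships in fig. 1.
relations = [
    ("client1", "client3", "guarantee"),
    ("client1", "client4", "investment"),
    ("client2", "client3", "trade"),
    ("client2", "client4", "capital"),
]
V, E, adj = build_relationship_graph(relations)
```

In this sketch the pair (client1, client2) is absent from E, which corresponds to the association that is not yet known in the first relationship graph.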
In addition, if the number of clients is large, the number of nodes in the relationship graph is large, which may make the analysis of potential association relationships based on the relationship graph too computationally expensive and too slow. To address this, the first relationship graph may be divided into a plurality of first sub-relationship graphs for analysis, and after the potential association relationships are determined based on the sub-relationship graphs, the results may be aggregated. This can effectively reduce computational complexity, analysis time and resource consumption.
Specifically, the method may further include the following operations: after determining the first relationship graph, a plurality of first sub-relationship graphs are extracted from the first relationship graph.
For example, the full graph (the first relationship graph) may be segmented, and the connected subgraphs (the first sub-relationship graphs) may be extracted from it. In addition, an upper limit on the number of clients in each subgraph may be specified, and subgraphs exceeding that limit are further split. For example, the splitting may use a community discovery algorithm such as the Louvain algorithm (also called the Fast-Unfolding algorithm), which is not limited herein.
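The extraction step can be sketched as follows. This is a simplified illustration that finds connected subgraphs with a breadth-first search and separates those that exceed a client-count limit; the further splitting of oversized subgraphs by a community discovery algorithm such as Louvain is not implemented here. The function names and the sample adjacency data are hypothetical.

```python
from collections import deque


def connected_subgraphs(adjacency):
    """Return the node sets of the connected components of the graph."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def partition_by_size(components, max_clients):
    """Separate subgraphs within the limit from those needing a further split."""
    within = [c for c in components if len(c) <= max_clients]
    oversized = [c for c in components if len(c) > max_clients]
    return within, oversized


# Hypothetical adjacency data: three components of sizes 3, 2 and 1.
adjacency = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"},
    "f": set(),
}
components = connected_subgraphs(adjacency)
within, oversized = partition_by_size(components, max_clients=2)
```

The subgraphs returned in `oversized` are the ones that would be handed to the community discovery step.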
In operation S303, a second association relationship between the plurality of objects is determined based on the first relationship graph, the second association relationship characterizing a potential association relationship between the plurality of objects.
If the first relationship graph is split into a plurality of first sub-relationship graphs, then determining a second association relationship between the plurality of objects based on the first relationship graph accordingly comprises: for each first sub-relationship graph, determining a second sub-association relationship between the plurality of objects based on that first sub-relationship graph.
In one embodiment, for one relationship graph (e.g., a first relationship graph or a first sub-relationship graph), determining a second associative relationship between a plurality of objects based on the first relationship graph may include the following operations.
First, the incidence relation characteristics of each of the plurality of objects in the first relation graph are extracted.
Wherein the association relationship characteristic may include at least one of: node attribute features, relationship interaction attribute features, edge attribute features, node pair features, and the like.
For example, the node attribute features include at least one of: object attribute features and point graph features. An object attribute feature may be basic information of the object corresponding to a node in the first relationship graph or the first sub-relationship graph. Taking a client as the object, the object attribute features include, but are not limited to, at least one of: basic information, financial information, industrial and commercial information, fund information, credit investigation information, and the like. A point graph feature may be node-based graph index information in the first relationship graph or the first sub-relationship graph, including but not limited to at least one of: PageRank (PR), in-degree and out-degree, centrality measures, and the like.
The relationship interaction attribute features include at least one of: object interaction behavior features and edge graph features. Taking a client as the object, an object interaction behavior feature is a feature of the interaction behavior between clients (represented as nodes in the relationship graph), including but not limited to at least one of: capital exchange, guarantee, mortgage, investment, common work address, common card holding, and the like. An edge graph feature may be edge-based graph index information in the first relationship graph or the first sub-relationship graph, including but not limited to at least one of: the number of common neighbors, the Jaccard similarity index, the Adamic-Adar index, and the like.
The node pair characteristics include edge characteristics of the node pair, which are used to characterize edge information of the node pair, including but not limited to at least one of: the type of the association relationship, the direction of the relationship, the strength and importance of the relationship, etc.
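The edge graph features named above can be computed from neighbor sets alone. The sketch below is illustrative only: it assumes an undirected graph stored as a dictionary of neighbor sets, and it assumes that the "Adar index" in the text denotes the Adamic-Adar index. The function name and sample data are hypothetical.

```python
import math


def edge_graph_features(adjacency, u, v):
    """Return [common neighbor count, Jaccard index, Adamic-Adar index] for (u, v)."""
    neighbors_u, neighbors_v = adjacency[u], adjacency[v]
    common = neighbors_u & neighbors_v
    union = neighbors_u | neighbors_v
    jaccard = len(common) / len(union) if union else 0.0
    # Adamic-Adar: common neighbors with few connections contribute more.
    adamic_adar = sum(
        1.0 / math.log(len(adjacency[w]))
        for w in common
        if len(adjacency[w]) > 1
    )
    return [len(common), jaccard, adamic_adar]


# Hypothetical graph modeled on fig. 1: clients 1 and 2 share clients 3 and 4.
adjacency = {
    "client1": {"client3", "client4"},
    "client2": {"client3", "client4"},
    "client3": {"client1", "client2"},
    "client4": {"client1", "client2"},
}
features = edge_graph_features(adjacency, "client1", "client2")
```

For this sample graph, client 1 and client 2 have identical neighbor sets, so the Jaccard index is at its maximum even though the two clients are not directly connected.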
Then, the associated object of each of the plurality of objects is determined based on the similarity between the association relationship features of the objects, where an associated object is one that is not directly connected to the corresponding object in the first relationship graph. The similarity may be determined based on the cosine distance, the Minkowski distance, or the like.
Next, a second association relationship is determined based on the plurality of objects and their respective associated objects. Referring to fig. 1, there is no association relationship between client 1 and client 2 in the first relationship graph, but the similarity between the association relationship features of client 1 and client 2 is high. If the similarity exceeds a set similarity threshold, the association between client 1 and client 2 may be determined as a second association relationship.
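The similarity-based step can be sketched as follows, using cosine similarity and a fixed threshold. The feature vectors, the threshold value of 0.9 and the function names are hypothetical and chosen only for illustration; the disclosure does not fix these details, and it also allows other measures such as the Minkowski distance.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def potential_associations(features, edges, threshold):
    """Return (u, v, similarity) for pairs that are not directly connected
    in the first relationship graph and whose similarity exceeds the threshold."""
    found = []
    for u, v in combinations(sorted(features), 2):
        if frozenset((u, v)) in edges:
            continue  # already a first (known) association relationship
        similarity = cosine_similarity(features[u], features[v])
        if similarity > threshold:
            found.append((u, v, similarity))
    return found


# Hypothetical association relationship feature vectors per client.
features = {
    "client1": [1.0, 0.0, 1.0],
    "client2": [1.0, 0.1, 0.9],
    "client3": [0.0, 1.0, 0.0],
    "client4": [0.9, 1.0, 0.0],
}
# Known edges of the first relationship graph, as in fig. 1.
edges = {
    frozenset(pair)
    for pair in [
        ("client1", "client3"), ("client1", "client4"),
        ("client2", "client3"), ("client2", "client4"),
    ]
}
second = potential_associations(features, edges, threshold=0.9)
```

Comparing every unconnected pair is quadratic in the number of nodes, which is one reason the earlier splitting into sub-relationship graphs matters in practice.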
In operation S305, the first relationship diagram is supplemented based on the second association relationship to determine a second relationship diagram.
If the first relationship graph is split into a plurality of first sub-relationship graphs, then supplementing the first relationship graph based on the second association relationship to determine the second relationship graph accordingly comprises: supplementing each first sub-relationship graph based on the corresponding second sub-association relationship to obtain a second sub-relationship graph, and determining the second relationship graph based on the second sub-relationship graphs.
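A minimal sketch of the supplementing step is given below. It assumes that edges are stored as unordered pairs and that each edge carries a label distinguishing known relationships from potential ones; the label values, the function names and the merge rule (a known edge takes precedence over a potential one) are assumptions made for illustration.

```python
def supplement_graph(first_edges, second_relations):
    """Add potential (second) associations to the known (first) edges."""
    second_graph = {pair: {"source": "known"} for pair in first_edges}
    for u, v in second_relations:
        # Do not overwrite an edge that is already known.
        second_graph.setdefault(frozenset((u, v)), {"source": "potential"})
    return second_graph


def merge_subgraphs(subgraphs):
    """Combine second sub-relationship graphs into one second relationship graph."""
    merged = {}
    for subgraph in subgraphs:
        for pair, attributes in subgraph.items():
            if pair not in merged or attributes["source"] == "known":
                merged[pair] = attributes
    return merged


# Hypothetical first sub-relationship graphs and their potential associations.
sub1 = supplement_graph(
    {frozenset(("client1", "client3")), frozenset(("client2", "client3"))},
    [("client1", "client2")],
)
sub2 = supplement_graph({frozenset(("client5", "client6"))}, [])
second_graph = merge_subgraphs([sub1, sub2])
```

Keeping the label on each edge lets a downstream user tell which relationships were collected and which were inferred.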
The relationship determining method provided by the embodiment of the disclosure explores the potential association relationship between objects based on the relationship graph and the like, so that a user can more comprehensively master the relationship structure of the client, the capability of insights on the relationship between the clients is improved, and the capability of risk management, risk prevention and control and the like of the user is facilitated to be improved.
The following is an exemplary description of extracting the association relationship features.
In one embodiment, extracting the association relationship features of each of the plurality of objects in the first relationship graph may include the following operations.
First, values of initial association relationship features of a plurality of objects are determined. For example, matching is performed in the existing customer information, and/or a graph index of the relationship graph is calculated to determine a value of the initial association relationship characteristic of the object. The value of the initial association relationship characteristic of the object may be a concatenation value of the values of the initial association relationship characteristics.
The graph indexes of the relationship graph may be determined as follows. First, node pairs $\{(v_i, v_j, e_{i,j}) \mid v_i, v_j \in V,\ e_{i,j} \in E\}$ are extracted from the first relationship graph or the first sub-relationship graph; the requirement for a node pair is that node $v_i$ and node $v_j$ are directly connected, where $i$ and $j$ are integers. The graph indexes of the relationship graph, including edge-based graph indexes (edge graph features) and node-based graph indexes (point graph features), may then be computed based on the node pairs. Regarding the point graph features, the graph features of the node corresponding to a client in the first relationship graph or the first sub-relationship graph may include PageRank, in-degree and out-degree, and centrality, and may be represented as, for example, [0.01, 0.83, 0.50, 0.66] (by way of example only). Regarding the edge graph features, the graph features of the corresponding node pairs in the first relationship graph or the first sub-relationship graph may include the number of common neighbors, the Jaccard similarity index and the Adamic-Adar index, and may be represented as, for example, [3, 0.79] (by way of example only).
It should be noted that when some nodes have a large number of adjacent nodes, a certain amount of pruning is required. The adjacent nodes are defined as $N(v) = \{u \in V \mid (v, u) \in E\}$, where $N(v)$ denotes the adjacent nodes of node $v$. Specifically, for each node, a node-based graph index (such as the PageRank index) is selected, and all adjacent nodes are sorted by that index. A quantity threshold T is set, and the top T nodes are taken as the adjacent nodes.
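The pruning step can be sketched as follows. The helper name `prune_neighbors`, the tie-breaking rule, and the score values are assumptions made for illustration; the disclosure only states that neighbors are ranked by a node index and the top T are kept.

```python
def prune_neighbors(adj, score, t):
    """Keep, for every node, only its t highest-scoring adjacent nodes."""
    pruned = {}
    for node, nbrs in adj.items():
        # Sort by descending score; ties are broken by node name (an assumption).
        ranked = sorted(nbrs, key=lambda u: (-score[u], u))
        pruned[node] = ranked[:t]
    return pruned

adj = {"A": {"B", "D", "E"}, "B": {"A", "E"}}
page_rank = {"A": 0.30, "B": 0.10, "D": 0.05, "E": 0.40}  # illustrative values only
pruned = prune_neighbors(adj, page_rank, t=2)
```

With T = 2, node A keeps E and B and drops D, the lowest-ranked of its three neighbors.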
For ease of understanding, the object attribute feature, the object interaction behavior feature, the edge feature of the node pair, and the respective values are illustrated below.
Regarding the object attribute features, the values of the initial association relationship features of a client may include basic information of the client, such as the industry to which a corporate client belongs (e.g., finance, internet, automobiles), the loan balance, and the education level of an individual client (e.g., junior high school, undergraduate, postgraduate). As another example, they may include financial information, such as the financial statement indicators of a corporate client and the annual income of an individual client (e.g., less than 80,000, 80,000 to 150,000, 150,000 to 300,000, more than 300,000). As another example, they may include business registration features, such as the number of changes to the business registration information. As another example, they may include fund information, such as monthly inflows and outflows of funds. As another example, they may include credit information, such as default information in the credit record. Taking an individual client as an example, the initial association relationship features may include the loan balance, education level, annual income, and default information; accordingly, the value of the initial association relationship features of the client can be represented as [2,000,000, undergraduate, more than 300,000, no default information].
Regarding the object interaction behavior features, these may include, for example, fund transfers, guarantee/mortgage/investment, a shared work address, and a shared card. Accordingly, the value of the object interaction behavior features can be expressed as [the amount transferred from object A to object B, the guarantee/mortgage/investment relationship between object A and object B, the address at which object A and object B both work, the main card numbers of object A and object B being the same].
Regarding the edge features of a node pair, these may include, for example, the type of the association relationship, the direction of the relationship, and the strength and importance of the relationship. Accordingly, the value of the edge features of node pair (A, B) may be expressed as, for example, [investment relationship, node A invests in node B, strong relationship].
Next, the values of the initial association relationship features are processed to obtain an initial association relationship feature vector. In this embodiment, because the values of the initial association relationship features may take various forms (such as numbers, letters, and Chinese characters) that are inconvenient for a computer to process, the values may be vectorized. For example, a text feature extraction method such as One-Hot encoding may be used to process them.
Fig. 4 schematically shows a schematic diagram for processing values of initial association relationship features according to an embodiment of the present disclosure.
As shown in Fig. 4, assume that there are only two initial association relationship features: education level and loan balance. Education level is a discrete feature whose values include junior high school, high school, undergraduate, and postgraduate. The education level of client C is high school, so the One-Hot encoding of client C's education level, for the node pair formed with client A, is [0, 1, 0, 0].
As another example, the loan balance is a continuous feature ranging from 0 to 1,000,000 and divided into the bins 0-20, 20-40, 40-60, 60-80, and 80-100 (in units of ten thousand). The corporate loan balance of client D is 450,000 (45 in these units), so the One-Hot encoding of client D's loan balance, for the node pair formed with client B, is [0, 0, 0, 1, 0, 0].
In addition, the dimensions of the initial association relationship features can be kept consistent by zero padding or similar methods. After zero padding, the initial association relationship feature of client C for the education level is [0, 1, 0, 0, 0, 0]. For client D, the initial association relationship feature for the loan balance is [0, 0, 0, 0, 0, 1, 0, 0].
Next, the initial association relationship feature vector is processed by the association relationship determination model to obtain the association relationship feature vector.
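The encoding steps above can be sketched in a few lines. The sketch assumes that a zero balance has its own leading bin, which is one way to arrive at a six-dimensional loan-balance vector with the fourth position set for a balance of 45; the text does not spell this out. All names (`EDUCATION`, `BIN_EDGES`, `one_hot`, `zero_pad`, and so on) are hypothetical.

```python
from bisect import bisect_left

EDUCATION = ["junior high school", "high school", "undergraduate", "postgraduate"]
# Upper edges of the loan-balance bins, in units of ten thousand.
# The leading 0 gives a zero balance its own bin (an assumption).
BIN_EDGES = [0, 20, 40, 60, 80, 100]

def one_hot(index, size):
    """Return a list of zeros of the given size with a 1 at position index."""
    vec = [0] * size
    vec[index] = 1
    return vec

def encode_education(level):
    """One-Hot encode a discrete education level."""
    return one_hot(EDUCATION.index(level), len(EDUCATION))

def encode_loan_balance(balance):
    """Bin a continuous balance, then One-Hot encode the bin index."""
    # bisect_left finds the first bin whose upper edge is >= balance;
    # balances above the last edge are clamped into the last bin.
    index = min(bisect_left(BIN_EDGES, balance), len(BIN_EDGES) - 1)
    return one_hot(index, len(BIN_EDGES))

def zero_pad(vec, size):
    """Right-pad a vector with zeros up to the given size."""
    return vec + [0] * (size - len(vec))

education_c = encode_education("high school")
loan_d = encode_loan_balance(45)
padded_c = zero_pad(education_c, 6)
```

Binning before encoding is what lets a continuous feature share the same One-Hot representation as a discrete one.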
It should be noted that, the above-mentioned process of vectorizing the value of the initial association relationship characteristic of the object may also be implemented by using a vectorization model, and the vectorization model may be integrated in the association relationship determination model, which is not limited herein. In addition, the process of determining the association object based on the similarity of the association relationship features can also be implemented by using a model, and can be integrated in the association relationship determination model.
The association determination model is exemplarily described below.
In one embodiment, the association determination model includes a first convolutional layer and a second convolutional layer, the output of the first convolutional layer being at least part of the input of the second convolutional layer.
In one embodiment, the objective loss function of the association relationship determination model expresses the following: the similarity between the association relationship feature vectors of the two nodes of a node pair is greater than the similarity between the association relationship feature vectors of the two nodes of a non-node pair. In particular, the association relationship determination model may be trained with a back-propagation algorithm to determine the model parameters.
Fig. 5 schematically shows a structural diagram of an association determination model according to an embodiment of the present disclosure.
As shown in Fig. 5, the association relationship determination model includes a plurality of node layers, each of which includes a plurality of nodes. The node layers include convolutional layers.
In fig. 5, the input of the first convolution layer includes the initial association feature vectors of respective neighboring nodes of neighboring nodes connected to the target node in the first relationship diagram, and the output of the first convolution layer includes the intermediate association feature vectors of respective neighboring nodes connected to the target node in the first relationship diagram. In the initial association feature vector of fig. 5, each row of boxes corresponding to one node (circle) of the first convolution layer represents the initial association feature vector of the respective neighboring node of the neighboring node connected to the target node in the first relationship graph. Wherein the dimension of the intermediate incidence relation feature vector may be the same as the dimension of the initial incidence relation feature vector.
For another example, the input of the second convolutional layer includes respective intermediate association relationship feature vectors of each adjacent node connected to the target node in the first relational graph and an initial association relationship feature vector of the target node, and the output includes an association relationship feature vector of the target node, where the association relationship feature vector of the target node includes an aggregate association relationship feature vector of a plurality of adjacent nodes connected to the target node in the first relational graph and the initial association relationship feature vector of the target node.
In one embodiment, the association determination model is trained by: the output of the association determination model is converged by adjusting the model parameters of the association determination model.
Specifically, convergence of the output of the association relationship determination model includes at least one of the following. On the one hand, after the initial association relationship feature vectors of the two nodes of a node pair are processed by the association relationship determination model, the similarity between the resulting association relationship feature vectors of the two nodes is greater than or equal to a first similarity threshold. On the other hand, after the initial association relationship feature vectors of the two nodes of a non-node pair are processed by the association relationship determination model, the similarity between the resulting association relationship feature vectors of the two nodes is less than or equal to a second similarity threshold.
For example, a neural network may be trained in a supervised manner. The training data may include positive and negative samples. The positive samples may be node pairs extracted from the relationship graph. Negative samples can be obtained in two ways. The first way is to keep one node of a positive sample and draw the other node at random from the nodes that are not in the first sub-relationship graph in which the kept node is located (e.g., nodes of the first relationship graph outside that first sub-relationship graph). The second way is to keep one node of a positive sample and select another node within the relationship graph that is not directly connected to that node.
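The two negative-sampling strategies can be sketched as below. The adjacency map, the `subgraph_of` lookup (node to sub-relationship-graph id), and the function name `sample_negatives` are hypothetical; the sketch also assumes both candidate pools are non-empty.

```python
import random

def sample_negatives(adj, subgraph_of, anchor, rng):
    """Return one negative partner for `anchor` under each of the two strategies."""
    nodes = sorted(adj)
    # Strategy 1: a node outside the anchor's sub-relationship graph.
    outside = [n for n in nodes if subgraph_of[n] != subgraph_of[anchor]]
    # Strategy 2: a node of the relationship graph not directly connected to the anchor.
    unlinked = [n for n in nodes if n != anchor and n not in adj[anchor]]
    return rng.choice(outside), rng.choice(unlinked)

adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "X": {"Y"}, "Y": {"X"}}
subgraph_of = {"A": 0, "B": 0, "C": 0, "X": 1, "Y": 1}
neg_outside, neg_unlinked = sample_negatives(adj, subgraph_of, "A", random.Random(0))
```

The second strategy can also return nodes from the anchor's own sub-relationship graph (node C here), which tends to give harder negatives than the first.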
Fig. 6 schematically shows a schematic diagram of a first relationship diagram according to an embodiment of the disclosure.
As shown in Fig. 6, the first relationship graph includes a target node A and nodes B to G. The target node A is connected to node D, node B, and node E; the target node A and node B are each also connected to node E; node C is connected to node B, node F, and node G; and node F and node G are each also connected to node E. The processing of the association relationship feature vector for the target node A is shown in Fig. 7.
Fig. 7 schematically shows a process diagram for processing the first relationship diagram shown in fig. 6 by using the association relationship determination model.
As shown in Fig. 7, taking an association relationship determination model built as a graph neural network as an example, the input of the first convolutional layer includes the initial association relationship feature vectors of the respective adjacent nodes of the adjacent nodes connected to the target node in the first relationship graph, and the output is the feature vector (Embedding Vector) of each node. The objective loss function requires that the Embedding Vectors of the two nodes of a node pair be as close as possible and that the Embedding Vectors of the two nodes of a non-node pair be as far apart as possible.
A variety of neural network models can be used to output a node's Embedding Vector. The idea of the Attention Model is used here; other neural network model algorithms may also be suitable.
The neural network has two convolutional layers. The first convolutional layer computes the hidden-layer output of each adjacent node of the target node v, [formula image BDA0002493261280000131] (i.e., the intermediate association relationship feature vectors; for example, the hidden-layer output for target node A in Fig. 6 can be represented as [formula image BDA0002493261280000132]). The hidden-layer outputs of the adjacent nodes, [formula image BDA0002493261280000133], are merged with the hidden-layer output of the target node, [formula image BDA0002493261280000134]; the merged vector is used as the input of the second convolutional layer, which computes the Embedding Vector of the target node v, [formula image BDA0002493261280000135].
The input of the first convolutional layer is the feature variables of each adjacent node of the target node A (in Fig. 6 these are node B, node D, and node E, which may be written as B, D, E ∈ N(A)). Its output is the intermediate association relationship feature vectors of the three adjacent nodes (node B, node D, and node E), [formula image BDA0002493261280000141].
The input of the second convolutional layer is formed by aggregating the outputs of the first convolutional layer, [formula image BDA0002493261280000142], with an aggregation function γ(·) and concatenating the result with [formula image BDA0002493261280000143] to form a new input. Its output is the Embedding Vector of the target node A, [formula image BDA0002493261280000144], i.e., the result to be solved. The dimension of the aggregated feature vector is consistent with that of [formula image BDA0002493261280000145]; in Fig. 7, the three feature vectors [formula image BDA0002493261280000146] have the same dimension. For example, assume that the feature vector of node B is [1, 1, 1], the feature vector of node D is [0, 1, 0], and the feature vector of node E is [0, 0, 1]. Taking the mean as the aggregation method, the aggregated feature vector is [(1+0+0)/3, (1+1+0)/3, (1+0+1)/3] = [1/3, 2/3, 2/3]. The mean is merely an example; the maximum or another function may be used instead.
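The averaging example for nodes B, D, and E can be checked with a small sketch. The function name `aggregate` and its `how` parameter are hypothetical; exact fractions are used so that the result reads as 1/3 and 2/3 rather than as rounded decimals.

```python
from fractions import Fraction

def aggregate(vectors, how="mean"):
    """Aggregate equal-length feature vectors element-wise."""
    columns = list(zip(*vectors))  # one tuple per feature dimension
    if how == "mean":
        return [Fraction(sum(col), len(col)) for col in columns]
    if how == "max":
        return [max(col) for col in columns]
    raise ValueError("unsupported aggregation: " + how)

b, d, e = [1, 1, 1], [0, 1, 0], [0, 0, 1]
mean_vec = aggregate([b, d, e], how="mean")
max_vec = aggregate([b, d, e], how="max")
```

Either choice keeps the output dimension equal to the input dimension, which is what allows the aggregated vector to be concatenated with the target node's own vector.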
The above steps are applied to each node in the first relationship graph or the first sub-relationship graph to obtain its Embedding Vector. The second association relationship can then be conveniently determined based on the similarity between the Embedding Vectors of the nodes.
The algorithm of the second convolution layer is exemplarily described below.
The input of the second convolutional layer comprises the known Embedding expression $z_v$ of the target node v, the Embedding expressions $z_u$ of the adjacent nodes of the target node v, and the weights α of the adjacent nodes (model parameters, determined by training). The aggregation function γ(·) may take the form of a mean function, a maximum function, or the like. The output is [formula image BDA0002493261280000147] (i.e., the association relationship feature vector).
Specifically, in the first step, the adjacent nodes [formula image BDA0002493261280000148] are aggregated by the aggregation function γ to obtain $n_v$. The aggregation function γ can be represented by formula (1), which characterizes the feature vector of all the adjacent nodes of the target node v with respect to the target node v.
[formula (1): image BDA0002493261280000149]
In the second step, $z_v$ and $n_v$ are concatenated as a new input (e.g., $[z_v, n_v]$, whose dimension, the sum of the two dimensions, is the dimension of the middle layer), and the final Embedding Vector (i.e., the association relationship feature vector) of the target node v is computed. The calculation can be represented by formula (2) and formula (3).
[formula (2): image BDA00024932612800001410]
[formula (3): image BDA00024932612800001411]
Here, f(·) in formula (1) and g(·) in formula (2) are activation functions of the neural network; the rectified linear unit (ReLU) function can be used. Q, W, and α are parameters of the model and need to be determined by model training.
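Because the formula images are not reproduced in this text, the following is only a plausible reading of formulas (1) and (2), assembled from the symbols the description defines ($z_v$, $z_u$, $n_v$, γ, α, f, g, Q, W, N(v)). Bias terms, the exact role of α inside γ, and the content of formula (3) are not recoverable here and are deliberately left out; the superscript "new" is a label introduced only for this sketch.

```latex
\begin{align}
n_v &= \gamma\bigl(\{\, f(Q\, z_u) \mid u \in N(v) \,\},\ \alpha\bigr) \tag{1} \\
z_v^{\text{new}} &= g\bigl(W\,[\, z_v,\ n_v \,]\bigr) \tag{2}
\end{align}
```

Under this reading, Q transforms each neighbor's expression before aggregation, and W maps the concatenation $[z_v, n_v]$ to the final Embedding Vector.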
Let the output dimension of the middle layer of the neural network be m and the dimension of the final Embedding Vector of node v be d. That is, m is the dimension of the middle layer and d is the dimension of the final output: after the middle layer passes through the neural network, the dimension of the output becomes d. The dimension of the association relationship feature vector of v may be the same as the dimension of the initial association relationship feature vector.
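The dimension bookkeeping can be followed in a minimal forward pass. This is a sketch under stated assumptions, not the disclosed implementation: it uses a weighted mean for γ, omits bias terms and any normalization, and the names `relu`, `matvec`, and `convolve` as well as all numeric values are hypothetical.

```python
def relu(vec):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vec]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def convolve(z_v, neighbor_z, alpha, Q, W):
    """One aggregation-and-combination step for a target node."""
    # Transform each neighbor expression, then take the alpha-weighted mean.
    hidden = [relu(matvec(Q, z_u)) for z_u in neighbor_z]
    total = sum(alpha)
    n_v = [sum(a * h[i] for a, h in zip(alpha, hidden)) / total
           for i in range(len(hidden[0]))]
    # Concatenate the target expression with the aggregate and map it to d dimensions.
    return relu(matvec(W, z_v + n_v))

z_v = [1.0, 0.0]
neighbor_z = [[1.0, 1.0], [0.0, 2.0]]
alpha = [1.0, 1.0]
Q = [[1.0, 0.0], [0.0, 1.0]]                      # identity, for readability
W = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # maps 4 inputs to d = 2 outputs
embedding = convolve(z_v, neighbor_z, alpha, Q, W)
```

In this toy setting the concatenated input has four entries and W reduces it to d = 2, the same dimension as the initial vector.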
During model training, the loss (objective) function shown in formula (4) may be used.
[formula (4): image BDA0002493261280000151]
Here, (p, q) denotes a positive-sample node pair, $P_{neg}$ denotes the set of negative samples, and Δ is the margin hyperparameter of the loss function. $z_{neg_k}$ corresponds to an unassociated node pair, where $neg_k$ is a node that has no association with the given node. Formula (4) expresses that the distance between two unassociated nodes is larger than the distance between two associated nodes (equivalently, their similarity is smaller).
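Since the image of formula (4) is not reproduced here, the sketch below shows a common max-margin (hinge) form that matches the description of (p, q), the negative sample set, and the margin Δ. Treating the inner product as the similarity and summing over negatives are assumptions; the names `dot` and `max_margin_loss` are hypothetical.

```python
def dot(a, b):
    """Inner product, used here as the similarity between two Embedding Vectors."""
    return sum(x * y for x, y in zip(a, b))

def max_margin_loss(z_p, z_q, z_negs, delta):
    """Hinge loss: each negative's similarity to z_p should trail the
    positive pair's similarity by at least the margin delta."""
    pos = dot(z_p, z_q)
    return sum(max(0.0, dot(z_p, z_n) - pos + delta) for z_n in z_negs)

z_p, z_q = [1.0, 0.0], [1.0, 0.0]    # a positive pair with identical embeddings
z_negs = [[0.0, 1.0], [1.0, 0.0]]    # one well-separated negative, one too close
loss = max_margin_loss(z_p, z_q, z_negs, delta=0.5)
```

Only the second negative contributes to the loss, because it is as similar to p as the positive partner q is; the first already satisfies the margin.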
After the association relation determining model is trained, the initial association relation feature Vector of each node (client) in the first relation graph can be input into the association relation determining model, and the respective association relation feature vectors (Embedding vectors) of all clients are calculated.
The following describes an exemplary process of determining an associated object of each of the plurality of objects based on a similarity between the associated relationship features of each of the plurality of objects.
In one embodiment, the method may further include the following operations.
Firstly, the nodes to be detected are determined from the first relational graph based on the objects to be detected.
And then, determining the incidence relation characteristic vector of the node to be detected.
Then, the associated nodes are determined so that a risk object can be determined based on them, where the association relationship feature vector of an associated node satisfies a preset association condition with respect to the association relationship feature vector of the node to be detected.
In one embodiment, determining the associated node may include the following operations.
Firstly, determining a similarity interval between the association relation feature vectors of the node to be detected and the other node in the node pair to which the node to be detected belongs.
Then, among at least some of the nodes of the second relationship graph, the nodes whose association relationship feature vectors have a similarity to that of the node to be detected falling within the similarity interval are taken as the associated nodes.
For example, take the case where the nodes to be detected are customers to be probed. The range of customers to be probed is selected according to the business scenario, risk management requirements, and the like. The Embedding Vector I of each customer I to be probed is looked up among the Embedding Vectors calculated in the previous operation. The goal is to find the Embedding Vector II closest to the Embedding Vector I of the customer to be probed, and thereby the customer II corresponding to Embedding Vector II; customer I and customer II are customers that may have a potential association relationship.
Specifically, the calculation mainly concerns the similarity between Embedding Vectors. For example, the similarity can be determined based on the Minkowski distance or the cosine of the included angle. Of course, a variety of other similarity algorithms may also be used.
The basic idea of the calculation is as follows. Given a sample set X, where X is a set of points in the m-dimensional real vector space $\mathbb{R}^m$, let $x_i, x_j \in X$ with $x_i = (x_{1i}, x_{2i}, \ldots, x_{mi})^T$ and $x_j = (x_{1j}, x_{2j}, \ldots, x_{mj})^T$, where i and j are integers and T denotes transpose.
For the method using the Minkowski distance, the Minkowski distance $d_{ij}$ between sample $x_i$ and sample $x_j$ can be expressed as
$$d_{ij} = \left( \sum_{k=1}^{m} \left| x_{ki} - x_{kj} \right|^{p} \right)^{1/p}, \quad p \ge 1$$
The larger the distance, the lower the similarity; the smaller the distance, the higher the similarity.
For the method using the cosine of the included angle, the cosine $s_{ij}$ of the angle between sample $x_i$ and sample $x_j$ can be expressed as
$$s_{ij} = \frac{\sum_{k=1}^{m} x_{ki}\, x_{kj}}{\left( \sum_{k=1}^{m} x_{ki}^{2} \right)^{1/2} \left( \sum_{k=1}^{m} x_{kj}^{2} \right)^{1/2}}$$
The closer the cosine is to 1, the higher the similarity; the closer it is to 0, the lower the similarity.
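Both similarity measures are short enough to sketch directly. The function names `minkowski` and `cosine` and the sample vectors are illustrative; the order parameter `p` defaults to 2 (Euclidean distance), and the cosine sketch assumes neither vector is all zeros.

```python
import math

def minkowski(x, y, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def cosine(x, y):
    """Cosine of the angle between two non-zero vectors."""
    inner = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return inner / (norm_x * norm_y)

euclidean = minkowski([0, 0], [3, 4], p=2)   # p = 2 is the Euclidean distance
manhattan = minkowski([0, 0], [3, 4], p=1)   # p = 1 is the Manhattan distance
parallel = cosine([1, 2], [2, 4])            # same direction
orthogonal = cosine([1, 0], [0, 1])          # perpendicular
```

The distance depends on vector magnitude, whereas the cosine depends only on direction, so the two can rank the same candidates differently.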
For each customer to be probed, the similarity between its Embedding Vector and those of the other customers is calculated, and the other customers are sorted from high to low by similarity. The higher the similarity, the more likely it is that an association relationship exists.
Fig. 8 schematically illustrates a to-be-detected node, a node pair, and an associated node of the to-be-detected node according to an embodiment of the present disclosure.
As shown in Fig. 8, the following is performed for each node to be detected (e.g., a customer to be probed).
First, the similarity value interval [L, H] of the node pairs to which the current node to be detected belongs is calculated from the similarity values of the node pairs with real association relationships. For example, the interval may be based on the range of similarity values, i.e., the lower and upper bounds of the similarity between the node and the nodes with which it has a node-pair relationship. Then, a floating ratio l for the lower bound and a floating ratio h for the upper bound are set (for example, within plus or minus three standard deviations of the mean) to obtain a new interval [L(1−l), H(1+h)]. The floating region in Fig. 8 is determined based on the similarity value interval and the two floating ratios.
Next, from the customers in the first relationship graph that have no node-pair relationship with the customer to be probed, those whose similarity to the customer to be probed falls within [L(1−l), H(1+h)] are screened out.
The customers whose similarity to the customer to be probed falls within [L(1−l), H(1+h)] can then be entered into the list of potential associated customers (i.e., associated objects) of the customer to be probed.
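The screening step can be sketched as follows, writing the two floating ratios as lowercase `l` and `h` to keep them apart from the interval bounds L and H. The function name `screen_candidates`, the similarity values, and the assumption that similarities are non-negative (so that scaling the bounds widens the interval) are all illustrative.

```python
def screen_candidates(similarity, linked, low, high, l, h):
    """Return unlinked customers whose similarity lies in [low*(1-l), high*(1+h)],
    sorted from most to least similar."""
    lo, hi = low * (1 - l), high * (1 + h)
    hits = [(c, s) for c, s in similarity.items()
            if c not in linked and lo <= s <= hi]
    return sorted(hits, key=lambda item: -item[1])

# Similarity of each customer to the customer being probed (illustrative values).
similarity = {"B": 0.90, "C": 0.85, "D": 0.50, "E": 0.75}
linked = {"B"}  # B already forms a node pair with the probed customer
candidates = screen_candidates(similarity, linked, low=0.8, high=0.9, l=0.1, h=0.1)
```

Customer E falls just below the original interval [0.8, 0.9] but inside the floated one, which is the effect the floating ratios are meant to have.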
In another embodiment, the method may further include: the first relationship graph and the second relationship graph are compared to determine a risk list.
The following description will take a first sub-relational diagram extracted from the first relational diagram as an example.
Specifically, first, for the associated clients in the potential associated client list, their positions in the relationship graph are compared with those of the clients to be probed.
For example, it is checked whether an indirect association relationship exists between the associated client and the client to be probed in the first sub-relationship graph. If so, the potential association relationship is added to the first sub-relationship graph. If not, the associated client is added to the first sub-relationship graph together with the potential association relationship. After the associated clients and potential association relationships have been added, the node pairs are re-extracted and compared to find the newly added node pairs. The newly added node pairs are the potentially associated node pairs found by this exploration.
Then, the potential association relationships are put into the full-graph association relationships, and the full-graph association relationships are reconstructed.
Then, the first sub-relationship graph is reconstructed and re-extracted to determine a second sub-relationship graph. Because adding potential association relationships enlarges a client's association network, the upper limit on the number of nodes allowed in each second sub-relationship graph is raised, increasing the maximum number of clients that each second sub-relationship graph can contain. Therefore, the number of objects included in a second sub-relationship graph is greater than or equal to the number of objects in the corresponding first sub-relationship graph.
Fig. 9 schematically shows a schematic diagram of a first incidence relation and a second incidence relation according to an embodiment of the present disclosure.
As shown in fig. 9, the first association includes a node pair consisting of the client 1 and the client 3, a node pair consisting of the client 3 and the client 2, and a node pair consisting of the client 2 and the client 4. The second associative relationship further includes a pair of nodes consisting of customer 1 and customer 2, as compared to the first associative relationship. This customer 2 is a potential affiliate customer of customer 1.
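The comparison of the first and second association relationships in Fig. 9 amounts to a set difference over node pairs. In this sketch, node pairs are treated as undirected and stored as `frozenset` objects; that representation and the helper name `node_pairs` are assumptions.

```python
def node_pairs(edges):
    """Represent undirected node pairs as a set of frozensets."""
    return {frozenset(edge) for edge in edges}

first = node_pairs([(1, 3), (3, 2), (2, 4)])   # first association relationship
potential = node_pairs([(1, 2)])               # explored potential relationship
second = first | potential                     # second association relationship

# Newly added node pairs are those present only after supplementation.
new_pairs = sorted(tuple(sorted(pair)) for pair in second - first)
```

The single new pair links customer 1 and customer 2; pairs of this kind are what the newly added node pair list would contain.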
The reconstructed second sub-relationship graph generally falls into one of two scenarios. In the first scenario, the second sub-relationship graph is essentially unchanged compared with the first sub-relationship graph, apart from newly added associated clients; node pairs are re-extracted and a list of newly added node pairs is generated. In the second scenario, a brand-new second sub-relationship graph is generated compared with the first sub-relationship graph; the newly added graph is entered into the second sub-relationship graph list, node pairs are re-extracted from the second sub-relationship graph, and all resulting node pairs are entered into the node pair list.
Next, a risk list may also be generated. The risk list comprises the potential association relationship list, the second sub-relationship graph list, and the newly added node pairs in the node pair list. The customer relationships in the list can be regarded as potential risk events and used for subsequent risk checking and risk control.
In other embodiments, the method may further include visually displaying the relationship graph. The potential association relationships are marked on the second relationship graph or the second sub-relationship graph with a visualization tool, so that changes in the association relationships, their range of influence, and the like can be inspected visually, as illustrated in Fig. 9.
The relationship determination method provided by the embodiments of the present disclosure applies the vectorization technique of graph neural networks together with the similarity principle to explore potential relationships between objects and potential client risks. The operations are designed for low coupling and high cohesion. Through vectorization, latent common factors are abstracted from the attribute features of a client, the importance of the client in the relationship network, the interaction between clients, the degree of influence of a relationship between clients within the network, and the features of different association relationships. These abstract factors contain node and edge information, unify the comparison standard across different data, and reduce a higher-dimensional feature vector to a lower dimension, so that the information contained in the high-dimensional feature variables is expressed by a lower-dimensional vector. Potential relationships are then explored, a risk list is obtained based on them, and data support is provided for risk prevention.
Another aspect of the present disclosure provides a relationship determination apparatus.
Fig. 10 schematically shows a structural diagram of a relationship determination apparatus according to an embodiment of the present disclosure.
As shown in fig. 10, the relationship determination apparatus 1000 includes: a first determination module 1010, a second determination module 1020, and a third determination module 1030.
The first determining module 1010 is configured to determine a first relationship graph, where the first relationship graph is configured to reflect a first association relationship between a plurality of objects.
The second determination module 1020 is configured to determine a second association relationship between the plurality of objects based on the first relationship graph, the second association relationship characterizing a potential association relationship between the plurality of objects.
The third determining module 1030 is configured to supplement the first relationship graph based on the second association relationship to determine a second relationship graph.
Another aspect of the present disclosure provides a relationship determination system.
Fig. 11 schematically shows a structural schematic diagram of a relationship determination system according to an embodiment of the present disclosure.
As shown in fig. 11, the relationship determination system 1100 includes: a potential association relation probing module 1110 and an association relation structure reconstructing module 1120.
The potential association relation probing module 1110 is configured to determine a potential association relation between a plurality of objects based on a first relationship graph, where the first relationship graph is configured to reflect a first association relation between the plurality of objects.
The incidence relation structure reconstructing module 1120 is configured to reconstruct the first relation graph based on potential incidence relations among the plurality of objects.
In another embodiment, the relationship determination system 1100 may also include a model engineering implementation module.
Specifically, the model engineering implementation module designs and constructs a graph neural network model, and generates data which can be used for potential relationship exploration. The potential association relation exploration module 1110 searches for potential association relation parties of the client through an algorithm. The incidence relation structure reconstruction module 1120 supplements potential relations among clients to a client incidence relation map to generate a risk list.
The model engineering realization module mainly constructs a model, extracts a modeling sample, designs a model characteristic variable and trains the model.
The potential association relation exploration module 1110 mainly finds potential association relation parties of the customers through an algorithm strategy according to data obtained by model training.
The incidence relation structure reconstructing module 1120 is mainly used for supplementing the potential incidence relation party obtained through the model algorithm to the original customer relation map, and obtaining the risk list by comparing the first relation map with the second relation map.
It should be noted that the implementation, solved technical problems, implemented functions, and achieved technical effects of the modules and the like in the device part and system part embodiments are respectively the same as or similar to the implementation, solved technical problems, implemented functions, and achieved technical effects of the corresponding steps in the method part embodiments, and are not described in detail herein.
Any number of the modules according to embodiments of the present disclosure, or at least part of the functionality of any number of them, may be implemented in one module. Any one or more of the modules according to the embodiments of the present disclosure may be split into a plurality of modules for implementation. Any one or more of the modules according to the embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of the three implementation manners of software, hardware, and firmware, or in any suitable combination of them. Alternatively, one or more of the modules according to embodiments of the disclosure may be implemented at least partly as computer program modules which, when executed, may perform the corresponding functions. For example, any number of the first determination module 1010, the second determination module 1020, and the third determination module 1030 may be combined or implemented separately, in hardware or software, and so on.
Another aspect of the present disclosure provides an electronic device.
FIG. 12 schematically shows a block diagram of an electronic device according to an embodiment of the disclosure. The electronic device shown in fig. 12 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in FIG. 12, an electronic device 1200 according to an embodiment of the present disclosure includes a processor 1201, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1202 or a program loaded from a storage section 1208 into a Random Access Memory (RAM) 1203. The processor 1201 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 1201 may also include on-board memory for caching purposes. The processor 1201 may include a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
In the RAM 1203, various programs and data necessary for the operation of the electronic device 1200 are stored. The processor 1201, the ROM 1202, and the RAM 1203 are communicatively connected to each other by a bus 1204. The processor 1201 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1202 and/or the RAM 1203. Note that the programs may also be stored in one or more memories other than the ROM 1202 and the RAM 1203. The processor 1201 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 1200 may also include an input/output (I/O) interface 1205, which is also connected to the bus 1204. The electronic device 1200 may also include one or more of the following components connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output section 1207 including a display device such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 1208 including a hard disk and the like; and a communication section 1209 including a network interface card such as a LAN card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the Internet. A drive 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 1210 as necessary, so that a computer program read out therefrom is installed into the storage section 1208 as necessary.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1209, and/or installed from the removable medium 1211. The computer program, when executed by the processor 1201, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1202 and/or the RAM 1203 and/or one or more memories other than the ROM 1202 and the RAM 1203 described above.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or integrated in various ways, even if such combinations or integrations are not expressly recited in the present disclosure. The embodiments above are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (20)

1. A relationship determination method, the method comprising:
determining a first relationship graph, wherein the first relationship graph reflects a first association relationship among a plurality of objects;
determining a second association relationship among the plurality of objects based on the first relationship graph, wherein the second association relationship characterizes a potential association relationship among the plurality of objects; and
supplementing the first relationship graph based on the second association relationship to determine a second relationship graph.
2. The method of claim 1, wherein the determining a second association relationship among the plurality of objects based on the first relationship graph comprises:
extracting an association relationship feature of each of the plurality of objects in the first relationship graph;
determining an associated object of each of the plurality of objects based on a similarity between the association relationship features of the plurality of objects, the associated object not being directly connected to the corresponding object in the first relationship graph; and
determining the second association relationship based on the plurality of objects and their respective associated objects.
3. The method of claim 2, wherein the association relationship feature comprises at least one of: a node attribute feature, a relationship interaction attribute feature, an edge attribute feature, and a node pair feature.
4. The method of claim 3, wherein:
the node attribute characteristics include at least one of: object attribute features and point map features;
the relationship interaction attribute features include at least one of: object interaction behavior characteristics and edge graph characteristics; and
the node pair characteristic includes an edge characteristic of a node pair.
5. The method of claim 2, wherein the extracting an association relationship feature of each of the plurality of objects in the first relationship graph comprises:
determining values of initial association relationship features of the plurality of objects;
processing the values of the initial association relationship features to obtain initial association relationship feature vectors; and
processing the initial association relationship feature vectors by using an association relationship determination model to obtain association relationship feature vectors.
6. The method of claim 5, wherein an objective loss function of the association relationship determination model characterizes that the similarity between the association relationship feature vectors of the nodes in a node pair is greater than the similarity between the association relationship feature vectors of the nodes in a non-node pair.
7. The method of claim 5, wherein the association relationship determination model comprises a first convolutional layer and a second convolutional layer, an output of the first convolutional layer being at least part of an input of the second convolutional layer.
8. The method of claim 7, wherein the input of the first convolutional layer comprises initial association relationship feature vectors of the respective neighboring nodes of each neighboring node connected to a target node in the first relationship graph, and the output of the first convolutional layer comprises intermediate association relationship feature vectors of the respective neighboring nodes connected to the target node in the first relationship graph.
9. The method of claim 7, wherein the input of the second convolutional layer comprises respective intermediate association relationship feature vectors of neighboring nodes connected to a target node in the first relationship graph and an initial association relationship feature vector of the target node, and the output of the second convolutional layer comprises the association relationship feature vector of the target node, wherein the association relationship feature vector of the target node comprises an aggregated association relationship feature vector of the plurality of neighboring nodes connected to the target node in the first relationship graph and the initial association relationship feature vector of the target node.
10. The method of claim 5, wherein the association relationship determination model is trained by: converging an output of the association relationship determination model by adjusting model parameters of the association relationship determination model;
the convergence of the output of the association relationship determination model includes at least one of:
after the initial association relationship feature vectors of the two nodes of a node pair are processed by using the association relationship determination model, the similarity between the obtained association relationship feature vectors of the two nodes is greater than or equal to a first similarity threshold; and
after the initial association relationship feature vectors of the two nodes of a non-node pair are processed by using the association relationship determination model, the similarity between the obtained association relationship feature vectors of the two nodes of the non-node pair is less than or equal to a second similarity threshold.
11. The method of claim 1, further comprising:
after determining the first relationship graph, extracting a plurality of first sub-relationship graphs from the first relationship graph; wherein
the determining a second association relationship among the plurality of objects based on the first relationship graph comprises: for each first sub-relationship graph, determining a second sub-association relationship among the plurality of objects based on the first sub-relationship graph; and
the supplementing the first relationship graph based on the second association relationship to determine a second relationship graph comprises: supplementing the first sub-relationship graph based on the second sub-association relationship to obtain a second sub-relationship graph, and determining the second relationship graph based on the second sub-relationship graph.
12. The method of claim 11, wherein the second sub-relationship graph comprises a number of objects greater than or equal to the number of objects of the first sub-relationship graph corresponding to the second sub-relationship graph.
13. The method of claim 1, further comprising: comparing the first relationship graph and the second relationship graph to determine a list of risks.
14. The method of claim 1, further comprising:
determining a node to be detected from the first relationship graph based on an object to be detected;
determining the association relationship feature vector of the node to be detected; and
determining an associated node so as to determine a risk object based on the associated node, wherein the association relationship feature vector of the associated node and the association relationship feature vector of the node to be detected satisfy a preset association condition.
15. The method of claim 14, wherein the determining an associated node comprises:
determining a similarity interval between the association relationship feature vector of the node to be detected and that of the other node in the node pair to which the node to be detected belongs; and
taking, as the associated node, a node in at least part of the second relationship graph whose association relationship feature vector has a similarity to that of the node to be detected falling within the similarity interval.
16. The method of claim 1, wherein the determining a first relationship graph comprises:
determining the first association relationship among the plurality of objects based on collected information of the plurality of objects; and
constructing the first relationship graph based on the first association relationship.
17. A relationship determination apparatus comprising:
a first determination module configured to determine a first relationship graph, the first relationship graph reflecting a first association relationship among a plurality of objects;
a second determination module configured to determine a second association relationship among the plurality of objects based on the first relationship graph, the second association relationship characterizing a potential association relationship among the plurality of objects; and
a third determination module configured to supplement the first relationship graph based on the second association relationship to determine a second relationship graph.
18. A relationship determination system, comprising:
a potential association relationship detection module configured to determine potential association relationships among a plurality of objects based on a first relationship graph, the first relationship graph reflecting a first association relationship among the plurality of objects; and
an association relationship structure reconstruction module configured to reconstruct the first relationship graph based on the potential association relationships among the plurality of objects.
19. An electronic device, comprising:
one or more processors; and
a storage device for storing executable instructions which, when executed by the one or more processors, implement the method according to any one of claims 1 to 16.
20. A computer readable storage medium having stored thereon instructions which, when executed, implement a method according to any one of claims 1 to 16.
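
The feature extraction of claims 7 to 9 and the similarity-based selection of claim 2 can be illustrated with a minimal Python sketch. It is not the claimed model: the two convolutional layers are replaced here by plain mean aggregation without trainable parameters, similarity is taken to be cosine similarity, and the threshold value is arbitrary. All function names, the adjacency-list representation, and the toy data are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]      # node -> directly connected neighbors
Vectors = Dict[str, List[float]]  # node -> initial feature vector


def mean(vectors: List[List[float]], dim: int) -> List[float]:
    """Element-wise mean; a zero vector if there is nothing to aggregate."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def first_layer(graph: Graph, initial: Vectors, node: str) -> List[float]:
    """Intermediate vector of `node`: mean of its neighbors' initial vectors."""
    dim = len(initial[node])
    return mean([initial[n] for n in graph[node]], dim)


def second_layer(graph: Graph, initial: Vectors, target: str) -> List[float]:
    """Final vector of `target`: aggregated intermediate vectors of its
    neighbors, concatenated with the target's own initial vector."""
    dim = len(initial[target])
    intermediate = [first_layer(graph, initial, n) for n in graph[target]]
    return mean(intermediate, dim) + initial[target]  # list concatenation


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def potential_pairs(graph: Graph, initial: Vectors,
                    threshold: float) -> List[Tuple[str, str]]:
    """Pairs that are not directly connected but have similar final vectors."""
    final = {n: second_layer(graph, initial, n) for n in graph}
    nodes = sorted(graph)
    pairs = []
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if b not in graph[a] and cosine(final[a], final[b]) >= threshold:
                pairs.append((a, b))
    return pairs


# Toy data (hypothetical): A and C share the neighbor B; D and E are separate.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": ["E"], "E": ["D"]}
initial = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 0.0],
           "D": [-1.0, 0.0], "E": [0.0, -1.0]}
print(potential_pairs(graph, initial, threshold=0.9))
```

In a trained model, the aggregation would carry learnable weights adjusted until node pairs and non-node pairs satisfy the similarity thresholds of claim 10; the sketch omits training entirely.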
CN202010411078.8A 2020-05-15 2020-05-15 Relationship determination method, device and system and electronic equipment Pending CN111563187A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010411078.8A CN111563187A (en) 2020-05-15 2020-05-15 Relationship determination method, device and system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010411078.8A CN111563187A (en) 2020-05-15 2020-05-15 Relationship determination method, device and system and electronic equipment

Publications (1)

Publication Number Publication Date
CN111563187A true CN111563187A (en) 2020-08-21

Family

ID=72071196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010411078.8A Pending CN111563187A (en) 2020-05-15 2020-05-15 Relationship determination method, device and system and electronic equipment

Country Status (1)

Country Link
CN (1) CN111563187A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112216396A (en) * 2020-10-14 2021-01-12 复旦大学 Method for predicting drug-side effect relationship based on graph neural network
CN112216396B (en) * 2020-10-14 2022-11-22 复旦大学 Method for predicting drug-side effect relationship based on graph neural network
CN113158929A (en) * 2021-04-27 2021-07-23 河南大学 Depth discrimination metric learning relationship verification framework based on distance and direction
CN113158929B (en) * 2021-04-27 2022-09-30 河南大学 Depth discrimination measurement learning relativity verification system based on distance and direction
CN113722489A (en) * 2021-09-02 2021-11-30 珠海市新德汇信息技术有限公司 NLP algorithm-based relation analysis method
CN113722489B (en) * 2021-09-02 2023-10-31 珠海市新德汇信息技术有限公司 Relationship analysis method based on NLP algorithm

Similar Documents

Publication Publication Date Title
Chen et al. Selecting critical features for data classification based on machine learning methods
US20230325724A1 (en) Updating attribute data structures to indicate trends in attribute data provided to automated modelling systems
CN110111198A (en) User's financial risks predictor method, device, electronic equipment and readable medium
CN103370722B (en) The system and method that actual volatility is predicted by small echo and nonlinear kinetics
CN111563187A (en) Relationship determination method, device and system and electronic equipment
US20150269669A1 (en) Loan risk assessment using cluster-based classification for diagnostics
Van Thiel et al. Artificial intelligence credit risk prediction: An empirical study of analytical artificial intelligence tools for credit risk prediction in a digital era
US11514369B2 (en) Systems and methods for machine learning model interpretation
CN110738527A (en) feature importance ranking method, device, equipment and storage medium
Basak et al. Causal ordering and inference on acyclic networks
Feng et al. The cross-shareholding network and risk contagion from stochastic shocks: an investigation based on China’s market
CN113112186A (en) Enterprise evaluation method, device and equipment
CN113674087A (en) Enterprise credit rating method, apparatus, electronic device and medium
Dong Application of Big Data Mining Technology in Blockchain Computing
CN115982654B (en) Node classification method and device based on self-supervision graph neural network
KR20210097204A (en) Methods and devices for outputting information
CN115062163A (en) Abnormal tissue identification method, abnormal tissue identification device, electronic device and medium
WO2022143431A1 (en) Method and apparatus for training anti-money laundering model
CN114493853A (en) Credit rating evaluation method, credit rating evaluation device, electronic device and storage medium
Wang et al. Intelligent weight generation algorithm based on binary isolation tree
CN114170000A (en) Credit card user risk category identification method, device, computer equipment and medium
CN114549174A (en) User behavior prediction method and device, computer equipment and storage medium
CN113052512A (en) Risk prediction method and device and electronic equipment
Meng et al. In-depth analysis of financial market based on iris recognition algorithm of MATLAB GUI
CN113515383B (en) System resource data distribution method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200821