CN111309824A - Entity relationship map display method and system

Info

Publication number: CN111309824A
Application number: CN202010103482.9A
Authority: CN
Inventors: 李瑾瑜; 张志磊; 陈君; 王天娇
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2020-02-18
Filing date: 2020-02-18
Publication date: 2020-06-19
Anticipated expiration: 2040-02-18
Also published as: CN111309824B

Abstract

The invention provides a method and a system for displaying an entity relationship map, wherein the system comprises a relationship construction device, a risk calculation device and an analysis device; the relation construction device is used for collecting all entities in a preset range and constructing a knowledge graph with the entities as nodes according to the entity attributes of the entities and the relation attributes among the entities; the risk calculation device is used for analyzing entity attributes of each entity analyzed in the knowledge graph according to preset rules to obtain blacklist entities; obtaining a corresponding probability distribution function in a pre-stored function library according to the relation attribute between the blacklist entity and the associated entity thereof; calculating and obtaining a risk probability value of an entity associated with the blacklist entity according to the probability distribution function; obtaining the risk probability of the corresponding entity according to the sum of one or more risk probability values of the entity; the analysis device is used for comparing the risk probability of each entity with a preset prompt threshold value and generating prompt information according to the comparison result and the corresponding entity.

Description

Entity relationship map display method and system

Technical Field

The invention relates to the field of relation data display, in particular to a method and a system for displaying an entity relation map.

Background

The knowledge graph is a technique within the field of artificial intelligence. It is essentially a semantic knowledge base built on a graph structure that describes concepts in the physical world and their mutual relations in symbolic form. Its basic units are entity-relation-entity triples, together with entities and their attribute-value pairs; entities are connected to one another through relations to form a networked knowledge structure. In a knowledge graph, each node represents a real-world "entity" and each edge represents a "relationship" between entities. In general terms, a knowledge graph is a relational network that links different types of information together.

Traditional entity risk prediction and decision-making mostly start from a single entity, analyzing its risk condition and degree of influence from its own attribute characteristics. However, the risk of a single entity is not confined to that entity; it also propagates to the entities associated with it. At present, a relational database cannot effectively and quickly generate multi-degree (multi-hop) associations, identify specific groups of associated entities, or identify specific paths. Risk entity identification in general still relies on judgment through business rules, so potential risk entities are difficult to identify where such rules are lacking, and in most cases algorithms are used only to predict the risk of a single entity.

Disclosure of Invention

The invention aims to provide an entity relationship map display method and system. Based on a relationship network constructed from a knowledge graph, and from the perspective of relationship communities, the method and system combine a Bayesian method with a machine-learning-based entity risk prediction model to compute risk propagation paths and the degree of influence of risk propagation, and display them on the relationship network as a reference for staff.

In order to achieve the above object, the entity relationship map display system provided by the present invention specifically comprises a relationship construction device, a risk calculation device and an analysis device; the relation construction device is used for collecting all entities in a preset range and constructing a knowledge graph with the entities as nodes according to the entity attributes of the entities and the relation attributes among the entities; the risk calculation device is used for analyzing entity attributes of each entity analyzed in the knowledge graph according to preset rules to obtain blacklist entities; obtaining a corresponding probability distribution function in a pre-stored function library according to the relation attribute between the blacklist entity and the associated entity thereof; calculating and obtaining a risk probability value of an entity associated with the blacklist entity according to the probability distribution function; obtaining the risk probability of the corresponding entity according to the sum of one or more risk probability values of the entity; the analysis device is used for comparing the risk probability of each entity with a preset prompt threshold value and generating prompt information according to the comparison result and the corresponding entity.

In the above entity relationship map display system, preferably, the risk calculation means includes a conduction probability calculation module, and the conduction probability calculation module is configured to obtain blacklist entities that meet a predetermined rule in the historical data; screening and obtaining a first-degree associated entity with the blacklist entity and one or more relationship attributes between the blacklist entity and the first-degree associated entity according to the blacklist entity in historical data; obtaining a first risk transfer function corresponding to different relationship attributes between the first degree associated entity and the blacklist entity, a second risk transfer function corresponding to a plurality of relationship attributes, and a continuous transfer function of the blacklist entity through statistical analysis; and obtaining a probability distribution function according to the first risk transfer function, the second risk transfer function and the continuous transfer function, and storing the continuous transfer function into a pre-stored function library.

In the entity relationship graph display system, preferably, the risk calculation device includes an entity subgraph extraction module, and the entity subgraph extraction module is configured to obtain one or more corresponding entities through screening in the knowledge graph according to a selected screening condition of preset screening conditions; and constructing an entity subgraph according to the entities and the relationship attributes between the entities.

In the entity relationship map display system, preferably, the preset screening conditions include attribute identification, entity identification and community identification; when attribute identification is selected, screening is performed in the knowledge graph according to the entity attribute of each entity or the relationship attribute between the entities to obtain one or more corresponding entities, and an entity subgraph is constructed according to the entities and the relationship attributes between the entities; when entity identification is selected, screening is performed in the knowledge graph to obtain a corresponding entity, and an entity subgraph is constructed according to the entities and the relationship attributes between the entities; and when community identification is selected, the connected components and community clusters in the knowledge graph are obtained through a graph community algorithm to generate an entity subgraph.
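As an illustration of the community identification option, the following minimal sketch groups entities into connected components. It assumes relations are given as plain pairs and are treated as undirected; the function name `connected_components` and the sample data are hypothetical, and the patent does not specify which graph community algorithm is used.

```python
from collections import deque

def connected_components(edges):
    """Group entities into connected components (relations treated as undirected)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first expansion collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

# Hypothetical relations: A-B-C form one group, D-E another.
example_edges = [("A", "B"), ("B", "C"), ("D", "E")]
components = connected_components(example_edges)
```

Each returned set could then serve as one candidate entity subgraph for later risk analysis.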

In the entity relationship map display system, preferably, the risk calculation device includes a risk entity identification module, and the risk entity identification module is configured to identify in the entity subgraph according to a selected identification condition among preset identification conditions to obtain a corresponding detection result; wherein the identification condition comprises blacklist identification, entity identification and node identification; when the blacklist identification is selected, comparing the entity attributes of each entity in the entity subgraph according to a preset rule, obtaining a blacklist entity which accords with the preset rule in the entity subgraph, and generating an entity list according to the blacklist entity; when entity identification is selected, a risk detection model is established through historical data and a learning algorithm; respectively calculating the risk probability of each entity in the entity subgraph through the risk detection model, and generating an entity list according to the entities with the risk probability higher than a preset probability threshold; when the node identification is selected, identifying a centrality entity in the entity subgraph through a point analysis algorithm; calculating the risk probability of the centrality entity through the risk retrieval model, and generating an entity list according to the centrality entity with the risk probability higher than a preset probability threshold.
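The node identification option can be pictured with a simple centrality measure. The sketch below uses degree centrality as a stand-in for the "point analysis algorithm", which the text does not name; the function names, the cut-off value and the sample graph are all assumptions for illustration.

```python
def degree_centrality(edges):
    """Fraction of the other entities that each entity is directly related to."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbours.items()}

def central_entities(edges, min_centrality):
    """Entities whose degree centrality reaches the (hypothetical) cut-off."""
    scores = degree_centrality(edges)
    return sorted(node for node, score in scores.items() if score >= min_centrality)

# Hypothetical subgraph in which "hub" is related to every other entity.
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
scores = degree_centrality(star)
selected = central_entities(star, 0.9)
```

The entities selected this way would then be passed to the risk model, as the paragraph above describes.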

In the entity relationship map display system, preferably, the risk calculation device further includes a conduction path analysis module, and the conduction path analysis module is configured to calculate a shortest path between each two entities in the entity list through a graph algorithm, and record an entity attribute of the path entity and a relationship attribute between the entities.

In the entity relationship map display system, preferably, the risk calculation device further includes an entity conduction prediction module, and the entity conduction prediction module is configured to calculate, according to the entity list and the path, a conduction probability of each entity through a probability distribution function corresponding to each entity, and obtain a risk probability of each entity according to a risk probability value and a corresponding conduction probability of each entity.

The invention also provides an entity relationship map display method, which comprises the following steps: step one: acquiring all entities in a preset range, and constructing a knowledge graph with the entities as nodes according to the entity attributes of the entities and the relationship attributes among the entities; step two: analyzing the entity attributes of each entity in the knowledge graph according to preset rules to obtain blacklist entities; obtaining a corresponding probability distribution function from a pre-stored function library according to the relationship attribute between each blacklist entity and its associated entities; calculating a risk probability value for each entity associated with a blacklist entity according to the probability distribution function; obtaining the risk probability of the corresponding entity according to the sum of the one or more risk probability values of the entity; step three: comparing the risk probability of each entity with a preset prompt threshold, and generating prompt information according to the comparison result and the corresponding entity.
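Steps two and three can be sketched as follows. This is only an illustration under stated assumptions: the function library is modelled as a dictionary from relationship type to a function of relationship strength, relations are directed tuples `(source, target, relation_type, strength)`, and all names and numeric values are hypothetical.

```python
def risk_probabilities(blacklist, relations, function_library):
    """Sum the risk probability values that blacklist entities conduct to each associated entity."""
    totals = {}
    for source, target, relation_type, strength in relations:
        if source in blacklist and target not in blacklist:
            # Look up the function for this relationship type and evaluate it.
            value = function_library[relation_type](strength)
            totals[target] = totals.get(target, 0.0) + value
    return totals

def prompt_messages(totals, threshold):
    """Generate a prompt for every entity whose risk probability exceeds the threshold."""
    return [
        f"Entity {entity}: risk probability {probability:.2f} exceeds threshold {threshold:.2f}"
        for entity, probability in sorted(totals.items())
        if probability > threshold
    ]

# Hypothetical function library and data.
library = {"guarantee": lambda s: 0.4 * s, "kinship": lambda s: 0.1 * s}
relations = [
    ("X", "A", "guarantee", 1.0),
    ("Y", "A", "kinship", 1.0),
    ("X", "B", "kinship", 0.5),
]
totals = risk_probabilities({"X", "Y"}, relations, library)
messages = prompt_messages(totals, threshold=0.3)
```

Here entity A receives values from two blacklist entities, which are summed as the method states, while entity B stays below the prompt threshold.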

In the above entity relationship map display method, preferably, the method further comprises: obtaining blacklist entities which accord with preset rules in historical data; screening and obtaining a first-degree associated entity with the blacklist entity and one or more relationship attributes between the blacklist entity and the first-degree associated entity according to the blacklist entity in historical data; obtaining a first risk transfer function corresponding to different relationship attributes between the first degree associated entity and the blacklist entity, a second risk transfer function corresponding to a plurality of relationship attributes, and a continuous transfer function of the blacklist entity through statistical analysis; and obtaining a probability distribution function according to the first risk transfer function, the second risk transfer function and the continuous transfer function, and storing the continuous transfer function into a pre-stored function library.

In the above entity relationship map display method, preferably, the second step further includes: screening in the knowledge graph according to a selected screening condition in preset screening conditions to obtain one or more corresponding entities; and constructing an entity subgraph according to the entities and the relationship attributes between the entities.

In the entity relationship map display method, preferably, the preset screening conditions include attribute identification, entity identification and community identification; when attribute identification is selected, screening is performed in the knowledge graph according to the entity attribute of each entity or the relationship attribute between the entities to obtain one or more corresponding entities, and an entity subgraph is constructed according to the entities and the relationship attributes between the entities; when entity identification is selected, screening is performed in the knowledge graph to obtain a corresponding entity, and an entity subgraph is constructed according to the entities and the relationship attributes between the entities; and when community identification is selected, the connected components and community clusters in the knowledge graph are obtained through a graph community algorithm to generate an entity subgraph.

In the above entity relationship map display method, preferably, the second step further includes: identifying in the entity subgraph according to a selected identification condition in preset identification conditions to obtain a corresponding detection result; wherein the identification condition comprises blacklist identification, entity identification and node identification; when the blacklist identification is selected, comparing the entity attributes of each entity in the entity subgraph according to a preset rule, obtaining a blacklist entity which accords with the preset rule in the entity subgraph, and generating an entity list according to the blacklist entity; when entity identification is selected, a risk detection model is established through historical data and a learning algorithm; respectively calculating the risk probability of each entity in the entity subgraph through the risk detection model, and generating an entity list according to the entities with the risk probability higher than a preset probability threshold; when the node identification is selected, identifying a centrality entity in the entity subgraph through a point analysis algorithm; calculating the risk probability of the centrality entity through the risk retrieval model, and generating an entity list according to the centrality entity with the risk probability higher than a preset probability threshold.

In the above entity relationship map display method, preferably, the second step further includes: and calculating the shortest path between every two entities in the entity list through a graph algorithm, and recording the entity attributes of the path entities and the relationship attributes between the entities.
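A minimal sketch of the pairwise shortest-path step is given below. It assumes an unweighted graph stored as an adjacency dictionary and uses breadth-first search; the text only says "a graph algorithm", so the choice of BFS, the function names and the sample data are assumptions.

```python
from collections import deque
from itertools import combinations

def shortest_path(adjacency, start, goal):
    """Return one shortest path (fewest relations) from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                if neighbour == goal:
                    # Walk the predecessor links back to the start.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None

def pairwise_paths(adjacency, entity_list):
    """Shortest path between every two entities in the entity list."""
    return {
        (a, b): shortest_path(adjacency, a, b)
        for a, b in combinations(entity_list, 2)
    }

# Hypothetical subgraph: a chain A-B-C-D and an isolated entity E.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"], "E": []}
paths = pairwise_paths(adjacency, ["A", "D", "E"])
```

The intermediate entities on each path (here B and C between A and D) are the ones whose entity attributes and relationship attributes would be recorded.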

In the above entity relationship map display method, preferably, the second step further includes: and calculating the conduction probability of each entity through the probability distribution function corresponding to each entity according to the entity list and the path, and obtaining the risk probability of each entity according to the risk probability value and the corresponding conduction probability of each entity.

The invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method when executing the computer program.

The present invention also provides a computer-readable storage medium storing a computer program for executing the above method.

The invention addresses the efficiency problem and technical limitations of generating multi-degree association relationships through multi-layer joins in a general relational database: by using knowledge graph technology, association relationships spanning many layers can be generated effectively and at high speed. First, by using graph algorithms, risk analysis objects are generated quickly and potential risk entities are identified automatically. Second, the ways of identifying associated groups are extended, for example with a community identification mode, which enriches the range of objects for risk analysis. Third, the graph path identification algorithm, the machine learning algorithm for predicting the risk probability of a single entity, and the principles of probability theory are combined flexibly, according to the specific risk management scenario of the business, to obtain the required risk conduction prediction result. The limitation of analyzing only the risk of a single entity is thereby overcome, and the influence of a single entity's risk on the whole group is examined from the group perspective.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:

FIG. 1A is a schematic structural diagram of an entity relationship map display system according to an embodiment of the present invention;

FIG. 1B is a schematic application structure diagram of an entity relationship map display system according to an embodiment of the present invention;

FIG. 2 is a flow diagram of a knowledge graph building module provided by an embodiment of the present invention;

FIG. 3 is a flow chart of a conducted risk probability function calculation module according to an embodiment of the present invention;

FIG. 4A is a flowchart of an entity subgraph extraction module according to an embodiment of the present invention;

FIG. 4B is a schematic diagram of breadth-first traversal according to an embodiment of the present invention;

FIG. 4C is a depth-first traversal diagram according to an embodiment of the present invention;

FIG. 4D is a schematic diagram of a strong link according to an embodiment of the present invention;

FIG. 5 is a flowchart of a risk entity identification module according to an embodiment of the present invention;

FIG. 6 is a flow chart of a risk conduction path analysis module according to an embodiment of the present invention;

FIG. 7 is a flowchart of an entity risk conductance prediction module according to an embodiment of the present invention;

FIG. 8A is a flow chart of a risk decision module according to an embodiment of the present invention;

FIG. 8B is a diagram of a risk matrix provided in accordance with an embodiment of the present invention;

FIG. 9 is a flowchart illustrating a method for displaying an entity relationship map according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented. It should be noted that, unless otherwise specified, the embodiments and features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.

Additionally, the steps illustrated in the flow charts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and, although a logical order is illustrated in the flow charts, in some cases the steps illustrated or described may be performed in a different order.

Referring to fig. 1A, the entity relationship map display system provided by the present invention specifically includes a relationship construction device, a risk calculation device and an analysis device. The relationship construction device is used for collecting all entities in a preset range and constructing a knowledge graph with the entities as nodes according to the entity attributes of the entities and the relationship attributes among the entities. The risk calculation device is used for analyzing the entity attributes of each entity in the knowledge graph according to preset rules to obtain blacklist entities; obtaining a corresponding probability distribution function from a pre-stored function library according to the relationship attribute between each blacklist entity and its associated entities; calculating a risk probability value for each entity associated with a blacklist entity according to the probability distribution function; and obtaining the risk probability of the corresponding entity according to the sum of the one or more risk probability values of the entity. The analysis device is used for comparing the risk probability of each entity with a preset prompt threshold and generating prompt information according to the comparison result and the corresponding entity. The relationship construction device mainly determines the definitions of entities and relationships in the knowledge graph and forms the full graph with the entity-relation-entity triple as the basic unit; this process can be implemented with existing technologies and is described in detail later. In actual operation, the analysis device may additionally provide decision recommendations or other applications that make the knowledge graph more convenient for the user to view; these are also described later and are not detailed here.

Referring to fig. 2, in an embodiment of the present invention, a process of building a knowledge graph is as follows:

Step S101: Build the knowledge graph database. When building the knowledge graph database, a traditional relational database is not adopted: a relational database is well suited to processing two-dimensional tables, but it is inefficient at processing large-scale, multiply associated entity-relation-entity data. A graph database is therefore used as the technical basis for data storage when the knowledge graph is constructed. The basic idea of a graph database is to store and query data in a "graph" data structure. Its data model is expressed mainly through entities and relations, can also handle key-value pairs, and has the advantage of resolving complex relationship problems quickly. A graph database is a non-relational database that supports operations such as querying, adding, deleting and updating graph structures. Compared with a traditional relational database, it offers higher query speed, simpler operation, and richer ways of presenting relationships.

Step S102: an entity scope is determined. And taking the collected entities as the entities of the nodes in the knowledge spectrogram. The entities generally include: legal, individual, organization, group, etc. have legally well-defined or socially well-known physical concepts.

Step S103: Label the entity attributes. An entity attribute is a specific description of a characteristic of an entity. For an enterprise, for example, the attributes include the enterprise name, its industry and its loan amount; for an individual, they include education, gender and the like. The entity attributes are used in subsequent steps.

Step S104: a relationship range is determined. A definition of a relationship between the connected entities and the entities is determined. The scope of the relationship is generally determined according to legal definitions or social communication relationship definitions. Including but not limited to, funding, guaranty, trade, investment, social, relatives, business, etc. types of relationships.

Step S105: Label the relationship attributes. A relationship attribute is a specific description of a characteristic of a relationship, and generally includes the relationship strength and the relationship directionality. The strength is determined according to the joint liability, guarantee liability and similar terms attached to the relationship in law or in social interaction. For example, when a loss occurs, the guarantor in a guarantee relationship bears greater responsibility than a party to a kinship relationship, so the guarantee relationship is stronger than the kinship relationship. The directionality is determined by the nature of the relationship itself. For example, a guarantee relationship is directed, whereas a kinship relationship may be treated as undirected. The relationship attributes are used in subsequent steps.

Step S106: an "entity-relationship-entity" triple is established. And establishing the minimum unit body in the knowledge spectrogram according to the entity and the relation.

Step S107: Form the knowledge graph. The knowledge graph is formed on the basis of the entity-relation-entity triples.
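Steps S102 to S107 can be sketched with an in-memory structure, shown below for illustration only. A real deployment would use a graph database as step S101 describes; the class `KnowledgeGraph`, its method names, and the sample entities and strength values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    entities: dict = field(default_factory=dict)  # entity id -> entity attributes
    triples: list = field(default_factory=list)   # (head, relation, tail, relation attributes)

    def add_entity(self, entity_id, **attributes):
        """S102/S103: register an entity and label its attributes."""
        self.entities[entity_id] = attributes

    def add_relation(self, head, relation, tail, strength, directed):
        """S104-S106: store one entity-relation-entity triple with its attributes."""
        if head not in self.entities or tail not in self.entities:
            raise KeyError("both entities must be defined before relating them")
        attributes = {"strength": strength, "directed": directed}
        self.triples.append((head, relation, tail, attributes))

    def neighbours(self, entity_id):
        """Entities reachable in one step, respecting relationship direction."""
        result = set()
        for head, _, tail, attributes in self.triples:
            if head == entity_id:
                result.add(tail)
            elif tail == entity_id and not attributes["directed"]:
                result.add(head)
        return result

graph = KnowledgeGraph()
graph.add_entity("company_a", kind="legal person", industry="retail", loan=5_000_000)
graph.add_entity("company_b", kind="legal person", industry="logistics", loan=200_000)
graph.add_entity("person_c", kind="individual")
graph.add_entity("person_d", kind="individual")
# A guarantee is directed and strong; kinship is treated as undirected and weaker.
graph.add_relation("company_a", "guarantee", "company_b", strength=0.9, directed=True)
graph.add_relation("person_c", "kinship", "person_d", strength=0.3, directed=False)
```

Keeping strength and directionality on each triple means later steps can read them without a second lookup.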

In an embodiment of the present invention, the risk calculation device may further include a conduction probability calculation module, i.e., the aforementioned conduction risk probability function calculation module, configured to obtain blacklist entities in the history data that meet a predetermined rule; screening and obtaining a first-degree associated entity with the blacklist entity and one or more relationship attributes between the blacklist entity and the first-degree associated entity according to the blacklist entity in historical data; obtaining a first risk transfer function corresponding to different relationship attributes between the first degree associated entity and the blacklist entity, a second risk transfer function corresponding to a plurality of relationship attributes, and a continuous transfer function of the blacklist entity through statistical analysis; and obtaining a probability distribution function according to the first risk transfer function, the second risk transfer function and the continuous transfer function, and storing the continuous transfer function into a pre-stored function library. Specifically, referring to fig. 3, in actual operation, the workflow of the conduction probability calculating module includes the following steps:

step S201: an entity blacklist is defined. Determining an entity range serving as a blacklist according to entities causing risk loss; blacklists include, but are not limited to, default, loss of trust, fraud, criminals, negative public opinion, and the like. In practice, the black list is generally determined according to legal definition or social recognition.

Step S202: Screen the relationships between blacklist entities and their first-degree associated entities. The first-degree associated entities of each blacklist entity are found in the knowledge graph and grouped by relationship type.

Step S203: Calculate the probability distribution function of risk propagation for each relationship type, namely the first risk conduction function. For each relationship type, the risk conduction probability under that type is calculated from historical big data using statistical learning methods. One common approach is a conditional probability distribution function, which computes the probability of risk propagating from one entity to another across a first-degree association; that is, given that one entity appears on the blacklist, the probability that the other entity also appears on the blacklist. Let the event that one entity appears on the blacklist be defined as $X_i$, and the event that the other entity appears on the blacklist be defined as $Y_j$. According to the definition of conditional probability:

$$P(Y_j \mid X_i) = \frac{P(X_i, Y_j)}{P(X_i)}$$

When the conduction probability of the entity blacklist is calculated, one method is to adopt the empirical distribution and compute the conduction probability $P(Y_j \mid X_i)$ under each relationship type directly from this definition.

Another method is to adopt a parametric probability distribution function and estimate the distribution function and its parameters from the existing data. In this way, the blacklist probability distribution function of "entity-relation-entity" is obtained for each relationship type.
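The empirical-distribution method of step S203 can be sketched as a counting exercise. The record layout `(relation_type, source_blacklisted, neighbour_blacklisted)` and the sample history below are assumptions made for illustration; they are not taken from the patent.

```python
from collections import defaultdict

def empirical_conduction(records):
    """Estimate P(neighbour blacklisted | source blacklisted) for each relationship type."""
    source_count = defaultdict(int)  # times the source entity was blacklisted
    joint_count = defaultdict(int)   # times both source and neighbour were blacklisted
    for relation_type, source_blacklisted, neighbour_blacklisted in records:
        if source_blacklisted:
            source_count[relation_type] += 1
            if neighbour_blacklisted:
                joint_count[relation_type] += 1
    # Conditional probability as the ratio of joint count to conditioning count.
    return {r: joint_count[r] / source_count[r] for r in source_count}

# Hypothetical first-degree observations from historical data.
history = [
    ("guarantee", True, True),
    ("guarantee", True, False),
    ("guarantee", True, True),
    ("guarantee", True, True),
    ("kinship", True, False),
    ("kinship", True, True),
    ("kinship", False, True),  # ignored: the source entity was not blacklisted
]
conduction = empirical_conduction(history)
```

Records whose source entity was never blacklisted do not enter the estimate, because the probability is conditioned on the source being on the blacklist.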

Step S204: Calculate the multivariate probability distribution function across different relationship categories, namely the second risk conduction function. According to the mathematical principles of probability theory, a multivariate probability distribution function is estimated for the interaction between different relationships. A multivariate normal distribution may be used to estimate the probability that the same entity becomes blacklisted under the influence of several relationships.

Suppose the vector $Z = [Z_1, Z_2, \ldots, Z_N]^T$ represents the probabilities that entity Z is blacklisted under the different relations $[Z_1, Z_2, \ldots, Z_N]$. The density function of the multivariate normal distribution is expressed as:

$$f(z) = \frac{1}{(2\pi)^{N/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(z-\mu)^T \Sigma^{-1} (z-\mu)\right)$$

where $\Sigma$ is the covariance matrix and $\mu$ is the vector of expected values.
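For a concrete feel of the multivariate normal density, the sketch below evaluates it for the two-relation case (N = 2) using only the standard library. The restriction to two dimensions, the function name and the parameter values are assumptions for illustration; in practice the mean vector and covariance matrix would be estimated from historical data.

```python
import math

def bivariate_normal_pdf(z, mu, sigma):
    """Density of a 2-dimensional normal distribution at point z."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dz = (z[0] - mu[0], z[1] - mu[1])
    # Quadratic form (z - mu)^T Sigma^{-1} (z - mu).
    quad = (
        dz[0] * (inv[0][0] * dz[0] + inv[0][1] * dz[1])
        + dz[1] * (inv[1][0] * dz[0] + inv[1][1] * dz[1])
    )
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

# Hypothetical parameters for two relationship types (e.g. guarantee and kinship).
mu = (0.3, 0.1)
sigma = ((0.04, 0.01), (0.01, 0.02))
density_at_mean = bivariate_normal_pdf(mu, mu, sigma)
density_off_mean = bivariate_normal_pdf((0.5, 0.2), mu, sigma)
```

The density is highest at the mean vector and falls off as the point moves away from it, at a rate governed by the covariance matrix.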

Step S205: Calculate the probability distribution function of continuous risk conduction. Using historical big data and statistical learning methods, the probability distribution function of continuous risk conduction over two or more degrees is calculated. What is mainly calculated is the range of influence over which a blacklist event is generally conducted.

One approach is to count the number of degrees over which a blacklist event is conducted before the conduction probability falls below a certain probability threshold. The probability threshold is derived statistically from historical data; that is, it is defined by comparing the blacklist probabilities predicted for entities in the past with whether those entities actually became blacklisted. For example, if entities predicted to have a blacklist probability of 10% did not in fact subsequently become blacklist entities, the threshold is set at 10%. In subsequent calculations, degrees beyond this point are not considered.
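The degree-counting approach can be illustrated as follows. The sketch assumes, purely for illustration, that the conduction probability after k degrees is the product of a constant per-degree probability; the patent does not state how the probability decays, and the function name and numbers are hypothetical.

```python
def conduction_depth(per_degree_probability, threshold):
    """Number of degrees reached before the conduction probability drops below the threshold."""
    if not 0.0 < per_degree_probability < 1.0:
        raise ValueError("per-degree probability must lie strictly between 0 and 1")
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    degrees = 0
    probability = 1.0
    # Extend by one degree while the resulting probability still meets the threshold.
    while probability * per_degree_probability >= threshold:
        probability *= per_degree_probability
        degrees += 1
    return degrees

# With a 50% conduction probability per degree and a 10% threshold:
# 0.5, 0.25 and 0.125 pass; 0.0625 does not.
depth = conduction_depth(0.5, 0.1)
```

Degrees beyond the returned depth would be left out of subsequent calculations.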

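Another approach uses a Markov process and Markov chain to estimate the transition probability of risk being conducted from one entity to the next after a blacklisting event has occurred. One-step transition probabilities and the one-step transition probability matrix are generally considered.

One-step transition probability, in its standard form, where $S_n$ denotes the state at step $n$ and $i$, $j$ index the $m$ states:

$$p_{ij} = P(S_{n+1} = j \mid S_n = i)$$

One-step transition probability matrix:

$$\mathbf{P} = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{pmatrix}$$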
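A one-step transition probability matrix can be estimated from observed conduction chains by counting transitions and normalizing each row. The sketch below assumes each chain lists, in order, the entities at which a blacklisting event occurred; this data layout, the function name and the sample chains are hypothetical.

```python
def estimate_transition_matrix(chains, states):
    """Row-normalized counts of observed one-step transitions between states."""
    index = {state: position for position, state in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for chain in chains:
        # Each consecutive pair in a chain is one observed transition.
        for current, following in zip(chain, chain[1:]):
            counts[index[current]][index[following]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States with no observed outgoing transition keep an all-zero row.
        matrix.append([count / total if total else 0.0 for count in row])
    return matrix

# Hypothetical chains of entities through which risk was conducted.
states = ["A", "B", "C"]
chains = [["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "C"]]
transition = estimate_transition_matrix(chains, states)
```

In this example risk moved from A to B in two of three observed transitions out of A, so the corresponding entry is 2/3. Multi-step conduction probabilities could then be obtained from powers of the matrix.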

step S206: forming a probability function of multiple relationships and continuous conduction. And integrating the probability distribution of the steps S202-S205 to obtain a probability distribution function of multiple relations and continuous conduction for risk conduction prediction of the subsequent steps.

In an embodiment of the present invention, the risk calculation device may further include an entity subgraph extraction module, which is configured to obtain one or more corresponding entities by screening the knowledge graph according to a screening condition selected from preset screening conditions, and to construct an entity subgraph according to the entities and the relationship attributes between the entities. The preset screening conditions include attribute identification, entity identification and community identification. When attribute identification is selected, screening is performed in the knowledge graph according to the entity attribute of each entity or the relationship attribute between the entities to obtain one or more corresponding entities, and an entity subgraph is constructed according to the entities and the relationship attributes between the entities. When entity identification is selected, screening is performed in the knowledge graph to obtain a corresponding entity, and an entity subgraph is constructed according to the entities and the relationship attributes between the entities. When community identification is selected, the connected components and community clusters in the knowledge graph are obtained through a graph community algorithm to generate an entity subgraph. Specifically, referring to fig. 4A, in actual operation the entity subgraph extraction module performs entity subgraph extraction through the following steps:

Step S301: select a subgraph identification mode. Risk decision making typically performs risk analysis on a particular population, while the whole knowledge graph is extremely large and its relational network structure is complex. To improve pertinence and efficiency, subgraphs are generally extracted for analysis; the module may comprise three subgraph extraction modes.

Step S3020: attribute identification mode.

Step S3021: extract a subgraph of the entities with the same attribute. Based on the entity attributes and relationship attributes marked in the preceding steps, the attributes of concern to the risk decision are screened and the entity subgraph is extracted. The screening conditions generally include single conditions and composite conditions. A single condition is generally directed to one specific aspect: for example, extracting a subgraph of entities whose corporate-customer loan amount is above a certain amount, or extracting a guarantee circle graph according to the guarantee relationship. A composite condition combines multiple aspects and is formed by combining various entity attributes and relationship attributes. The way conditions are combined is not limited, and any combination is suitable for the attribute identification mode of the invention.

The subgraph search method generally adopts Breadth-First Search (BFS), Depth-First Search (DFS), the Label Propagation Algorithm, and the like. The basic principle of breadth-first traversal is as follows: starting from a certain node, the graph is traversed level by level; after the nodes of one level have been traversed, the next level is traversed, as shown in fig. 4B. The basic principle of depth-first traversal is as follows: starting from a certain node, the nodes connected to it are visited in sequence along one path until that path is exhausted; then another path is traversed, as shown in fig. 4C.
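The two traversal orders can be sketched as follows. This is a minimal illustration on a small hypothetical adjacency list; the function names `bfs_order` and `dfs_order` and the sample graph are assumptions made for the example and do not come from the described system.

```python
from collections import deque


def bfs_order(graph, start):
    """Visit nodes level by level, starting from `start`."""
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
    return visited


def dfs_order(graph, start):
    """Follow one path as far as possible before trying another path."""
    visited = []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Push neighbors in reverse so they are explored in listed order.
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in seen:
                stack.append(neighbor)
    return visited


# Hypothetical entity graph: A is related to B and C, B to D, C to E.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
print(bfs_order(graph, "A"))  # ['A', 'B', 'C', 'D', 'E']
print(dfs_order(graph, "A"))  # ['A', 'B', 'D', 'C', 'E']
```

Either traversal collects the same set of reachable entities; only the visiting order differs, so the choice mainly affects how early a depth or size limit cuts off the extraction.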

Step S3022: an attribute subgraph is formed.

Step S3030: entity identification mode.

Step S3031: extract a subgraph for a particular set of entities. According to the risk decision requirement, a group of designated entities is selected and a subgraph is extracted for that group. The extraction method is similar to that of step S3021. The way entities are selected and combined is not limited, and any combination of entities is suitable for the entity identification mode of the invention.

Step S3032: forming an entity subgraph.

Step S3040: community identification mode.

Step S3041: form connected components, community clusters and the like using a graph community algorithm. The more closely connected components and community clusters in the knowledge graph are identified automatically by the graph community algorithm and extracted as subgraphs.

A connected component is a subgraph in which any entity can be reached from any other entity along the edges of the subgraph; that is, for any two entities x and y in the subgraph, the subgraph contains an x-y path. Particular attention can be paid to strongly connected components, in which every node can be reached from every other node: for any two nodes P and Q, there is a path from P to Q and also a path from Q to P, as shown in fig. 4D.

Community clustering adopts the Louvain algorithm. The Louvain algorithm is a community discovery algorithm based on modularity; it discovers hierarchical community structure well, and its optimization goal is to maximize the modularity of the whole community network. Its modularity is defined as:

$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{i,j} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$

$A_{i,j}$: the weight of the edge between i and j; $k_i$: the sum of the weights of the edges connected to point i; $m$: the sum of the weights of all edges, so that $2m = \sum_{i,j} A_{i,j}$; $\delta(c_i, c_j)$: equals 1 if point i and point j belong to the same community, and 0 otherwise.
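To make the modularity definition concrete, the following sketch evaluates Q for a given partition of an undirected weighted graph. It assumes an edge list without self-loops; the function name `modularity` and the sample graph (two triangles joined by one edge) are hypothetical. The per-community form used in the code is algebraically the same sum, grouped by community.

```python
def modularity(edges, community):
    """Modularity Q of a partition.

    edges: list of (i, j, weight) for an undirected graph without self-loops.
    community: dict mapping each node to its community label.
    """
    m = sum(w for _, _, w in edges)  # total edge weight
    degree = {}
    for i, j, w in edges:
        degree[i] = degree.get(i, 0.0) + w
        degree[j] = degree.get(j, 0.0) + w
    internal = {}  # sum of A_ij over ordered pairs inside each community
    total = {}     # sum of k_i over the members of each community
    for i, j, w in edges:
        if community[i] == community[j]:
            # An undirected edge appears as both A_ij and A_ji.
            internal[community[i]] = internal.get(community[i], 0.0) + 2 * w
    for node, k in degree.items():
        total[community[node]] = total.get(community[node], 0.0) + k
    return sum(
        internal.get(c, 0.0) / (2 * m) - (total[c] / (2 * m)) ** 2
        for c in total
    )


# Hypothetical graph: triangles {1, 2, 3} and {4, 5, 6} joined by edge 3-4.
edges = [(1, 2, 1.0), (2, 3, 1.0), (1, 3, 1.0),
         (4, 5, 1.0), (5, 6, 1.0), (4, 6, 1.0), (3, 4, 1.0)]
two_groups = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_group = {n: "a" for n in range(1, 7)}
print(modularity(edges, two_groups))  # 5/14, about 0.357
print(modularity(edges, one_group))   # 0.0
```

The partition that follows the two triangles scores higher than the trivial single community, which is the behavior the optimization goal relies on.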

The specific algorithm comprises two parts.

The first part: the entities in the relational network are traversed repeatedly. Initially, each entity is assigned to a community of its own; then, for each entity, the modularity change ΔQ obtained by removing the entity from its own community and placing it in the community of one of its neighbors is calculated.

$\Sigma_{in}$: the sum of the weights of the edges in the current community; $\Sigma_{tot}$: the sum of the weights of the edges connected to entities in the target community; $k_{i,in}$: the sum of the weights of the edges between entity i and the other entities in the target community. Entity i is placed in the target community for which ΔQ increases the most. If ΔQ does not increase for any move, the entity stays in its original community.

The above steps are repeated until ΔQ no longer increases.

The second part: the network is reconstructed. Each community formed in the first part is regarded as a single point, and a new graph is formed. The calculation of the first part is then repeated on the new graph, until the overall modularity no longer improves.

The principle of the Label Propagation Algorithm is as follows: in the initial state, each node carries a label indicating the community to which it belongs. The change of each node's label depends on the labels of its neighbors: the label value that occurs most frequently among the neighboring nodes is taken as the node's new label. Initially each node is assigned a unique label, and the labels are then propagated through the graph. Closely connected nodes eventually share the same label.

The first step: initialize the labels of all nodes in the graph. For a given node x, its label is defined as $C_x(0) = x$.

The second step: set the iteration counter t, starting with t = 1.

The third step: arrange the nodes in the graph in a random order, and denote the resulting ordered set of nodes as X.

The fourth step: for each x ∈ X, taken in that order, let

$$C_x(t) = f\left(C_{x_{i1}}(t), \ldots, C_{x_{im}}(t), C_{x_{i(m+1)}}(t-1), \ldots, C_{x_{ik}}(t-1)\right)$$

f returns the label value with the highest frequency among the neighbor nodes. If several labels share the highest frequency, one of them is selected at random.

$x_{i1}, \ldots, x_{im}$ are the neighbor nodes of x whose community labels have already been updated in iteration t. $x_{i(m+1)}, \ldots, x_{ik}$ are the neighbor nodes of x whose community labels still carry the values from iteration t-1.

The fifth step: when the label of every node is the label with the highest frequency among its neighbor nodes, the iteration stops. Otherwise, set t = t + 1 and repeat from the third step.
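The five steps above can be sketched as an asynchronous label propagation loop. This is a simplified illustration: the function name `label_propagation`, the `max_iter` cap, the fixed random seed and the sample graph are assumptions made for the example, and labels are assumed to be sortable so that ties are resolved reproducibly.

```python
import random
from collections import Counter


def label_propagation(graph, max_iter=100, seed=0):
    """Asynchronous label propagation on an undirected adjacency dict."""
    rng = random.Random(seed)
    labels = {node: node for node in graph}  # step 1: C_x(0) = x
    for _ in range(max_iter):                # step 2: iteration counter t
        order = list(graph)
        rng.shuffle(order)                   # step 3: random node order X
        changed = False
        for node in order:                   # step 4: update in that order
            if not graph[node]:
                continue                     # isolated node keeps its label
            counts = Counter(labels[n] for n in graph[node])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[node] in candidates:
                continue                     # already a most-frequent label
            labels[node] = rng.choice(candidates)  # random tie-break
            changed = True
        if not changed:                      # step 5: stop when stable
            break
    return labels


# Hypothetical graph made of two disconnected triangles.
graph = {1: [2, 3], 2: [1, 3], 3: [1, 2], 4: [5, 6], 5: [4, 6], 6: [4, 5]}
labels = label_propagation(graph)
print(labels)
```

Because updates within one pass immediately see the labels already changed in that pass, the loop mirrors the mixed use of iteration t and iteration t-1 labels in the update rule.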

Step S3042: forming a community subgraph.

In an embodiment of the present invention, the risk calculation device may include a risk entity identification module, which is configured to perform identification in the entity subgraph according to an identification condition selected from preset identification conditions and to obtain a corresponding detection result. The identification conditions comprise blacklist identification, entity identification and node identification. When blacklist identification is selected, the entity attributes of each entity in the entity subgraph are compared against a preset rule, the blacklist entities in the entity subgraph that satisfy the preset rule are obtained, and an entity list is generated from those blacklist entities. When entity identification is selected, a risk detection model is established from historical data and a learning algorithm; the risk probability of each entity in the entity subgraph is calculated with the risk detection model, and an entity list is generated from the entities whose risk probability is higher than a preset probability threshold. When node identification is selected, centrality entities in the entity subgraph are identified through a point analysis algorithm; the risk probability of each centrality entity is calculated with the risk detection model, and an entity list is generated from the centrality entities whose risk probability is higher than the preset probability threshold. Specifically, referring to fig. 5, the identification process of the risk entity identification module is as follows:

Step S401: select a risk entity identification mode. To predict risk propagation, the blacklist entities within the subgraph need to be identified. The module comprises three entity risk identification modes.

Step S4020: blacklist identification mode.

Step S4021: mark the blacklist entities in the subgraph according to the existing blacklist.

Step S4022: obtain the subgraph blacklist entity list. The entities in the subgraph that belong to the blacklist are placed in a list; the blacklist probability of a blacklisted customer can generally be set to 1.

Step S4030: entity prediction identification mode.

Step S4031: calculate the blacklist probability of the entities in the subgraph using a model algorithm. For each normal (i.e., non-blacklisted) entity within the subgraph, the probability that it becomes blacklisted is predicted. Methods of predicting the probability include, but are not limited to, supervised algorithms among machine learning algorithms.

The main steps of prediction include:

The first step: define the target variable of the model, namely blacklisted versus non-blacklisted customers, and select the modeling sample.

The second step: design the feature variables for modeling. The feature variables are generally selected from the entity attributes, and derived variables are generated by processing the entity attributes.

The third step: carry out model training, validation and testing using a supervised machine learning algorithm.

Supervised algorithms may generally use logistic regression, XGBoost, LightGBM, random forests, SVMs, and the like. Neural network models may also be used.

The fourth step: input the attribute feature variables of a normal entity into the model to obtain its default probability.
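As one possible illustration of the four steps, the sketch below trains a small logistic regression by batch gradient descent and then scores unseen entities. Everything specific in it is assumed for the example: the two feature variables (a scaled overdue ratio and a scaled guarantee count), the sample values, the learning rate and the function names are hypothetical, and a production model would normally rely on an established machine learning library rather than hand-written training code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for k in range(n_features):
                grad_w[k] += error * x[k]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_probability(weights, bias, x):
    """Predicted probability that an entity becomes blacklisted."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


# Hypothetical training sample: [overdue ratio, guarantee count (scaled)].
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]  # target variable: 1 = blacklisted, 0 = normal
weights, bias = train_logistic(X, y)
high = predict_probability(weights, bias, [0.85, 0.85])
low = predict_probability(weights, bias, [0.15, 0.15])
print(high, low)
```

The predicted value plays the role of the blacklist probability assigned to a normal entity; the validation and testing split of the third step is omitted here for brevity.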

Step S4032: obtain the subgraph high-risk entity list. Normal entities with a higher predicted probability are placed in the list, and the blacklist probability of a high-risk entity generally takes the predicted probability. The probability threshold is derived statistically from historical data; that is, it is defined from the previously predicted blacklist probabilities of entities and whether those entities were actually blacklisted afterwards. For example, if entities predicted to have a 90% blacklist probability did subsequently become blacklist entities, the threshold is set at 90%.

Step S4040: node identification mode.

Step S4041: identify high-centrality entities in the subgraph using a point analysis algorithm of the graph. The entities with high centrality in the subgraph are calculated with a graph algorithm. Point-level graph indexes, including but not limited to degree centrality, closeness centrality, betweenness centrality, eigenvector centrality and PageRank, are used to find the entities with higher index values. The entities are sorted by index value from high to low, quantiles from 1% to 99% are taken at intervals of 1%, and the entities within a chosen quantile range are selected.

Degree centrality measures how many relationships each entity has, emphasizing the value of the individual entity. It generally comprises the out-degree, which is the number of relationships that start from the entity and point to other entities, and the in-degree, which is the number of relationships that end at the entity. Closeness centrality measures the value of an entity within the subgraph. It is calculated as the inverse of the average distance between the entity and all other entities in the subgraph:

$$C(x) = \frac{n}{\sum_{y \in V} d(y, x)}$$

n: the number of entities reached within a certain step length starting from entity x; V: the set of entities of the subgraph; d(y, x): the length of the shortest path from entity y to entity x. The denominator can be understood as the sum of the shortest paths from the surrounding entities to the entity; the smaller the denominator, the larger the closeness centrality and the closer the entity is to the surrounding entities.
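A minimal sketch of the two indexes follows. It assumes unweighted relationships, counts n as the number of other entities reachable from x, and measures d(y, x) in hops by breadth-first search; the function names and sample graphs are hypothetical.

```python
from collections import deque


def degree_centrality(edges):
    """Return (out_degree, in_degree) dicts for a list of directed edges."""
    out_deg, in_deg = {}, {}
    for src, dst in edges:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
        out_deg.setdefault(dst, 0)
        in_deg.setdefault(src, 0)
    return out_deg, in_deg


def closeness_centrality(graph, x):
    """n divided by the summed hop distance to every reachable entity."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    n = len(dist) - 1          # reachable entities other than x
    total = sum(dist.values())
    return n / total if total else 0.0


# Hypothetical directed relationships for the degree counts.
out_deg, in_deg = degree_centrality([("A", "B"), ("A", "C"), ("B", "C")])
# Hypothetical undirected path A - B - C for closeness.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(out_deg, in_deg)
print(closeness_centrality(path, "B"), closeness_centrality(path, "A"))
```

The middle entity B of the path scores 1.0, higher than the end entity A at 2/3, reflecting that B is closer on average to the rest of the subgraph.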

Betweenness centrality: all shortest paths between any two entities in the subgraph are calculated (generally limited to a certain step length); if many of these shortest paths pass through an entity, that entity plays a strong intermediary bridging role.

$$B(x) = \sum_{y \neq x \neq z} \frac{\sigma_{y,z}(x)}{\sigma_{y,z}}$$

$\sigma_{y,z}$: the number of shortest paths from entity y to entity z; $\sigma_{y,z}(x)$: the number of shortest paths from entity y to entity z that pass through entity x.

Eigenvector centrality measures the influence of an entity on the subgraph. Among entities with the same number of connections, an entity whose neighboring entities score higher receives a higher score than one whose neighbors score lower, and all entities are assigned scores on this principle. A higher eigenvector centrality score means that the entity is connected to many entities that themselves score highly.

Solving the eigenvector centrality using the adjacency matrix:

Given a subgraph G = (V, E) with |V| entities, its adjacency matrix is defined as $A = (a_{v,t})$, where $a_{v,t} = 1$ when v is connected to t and $a_{v,t} = 0$ otherwise. The centrality score $x_v$ of entity v is then solved from:

$$x_v = \frac{1}{\lambda} \sum_{t \in M(v)} x_t = \frac{1}{\lambda} \sum_{t \in V} a_{v,t} x_t$$

The adjacency matrix records whether entities are connected using a numeric matrix, and the magnitude of each number can represent the weight of the relationship. The adjacency matrix A of a graph G of order n is an n × n matrix. Denote the entities of G as v₁, v₂, ..., v_n. Then:

$$a_{i,j} = \begin{cases} 1, & (v_i, v_j) \in E \\ 0, & \text{otherwise} \end{cases}$$

The weight of a relationship may also be represented by a number greater than 0.

M(v) is the set of entities adjacent to entity v, and λ is a constant. After rearrangement, the formula can be written as the eigenvector equation:

Ax = λx

λ denotes an eigenvalue. Only the eigenvector associated with the largest eigenvalue λ gives the centrality to be measured. The v-th component of that eigenvector gives the centrality score of entity v in the subgraph. So that the scores can be compared, the scores of the different entities are normalized to obtain the eigenvector centrality score of each node.
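One common way to approximate the eigenvector belonging to the largest eigenvalue is power iteration. The sketch below is an illustration under stated assumptions: it iterates with A + I instead of A (a choice made here so that the iteration also settles on bipartite graphs; it leaves the eigenvectors unchanged), normalizes by the largest component, and uses a hypothetical function name and sample graph.

```python
def eigenvector_centrality(adjacency, iterations=100, tol=1e-10):
    """Approximate the dominant eigenvector of a symmetric adjacency matrix."""
    n = len(adjacency)
    x = [1.0] * n
    for _ in range(iterations):
        # Multiply by (A + I): same eigenvectors as A, eigenvalues shifted by 1.
        new_x = [
            x[v] + sum(adjacency[v][t] * x[t] for t in range(n))
            for v in range(n)
        ]
        norm = max(new_x)
        new_x = [value / norm for value in new_x]  # largest score becomes 1
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return x


# Hypothetical subgraph: entities 0, 1, 2 form a triangle, entity 3 hangs off 0.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores = eigenvector_centrality(A)
print(scores)
```

Entity 0 receives the highest score, and the pendant entity 3 scores below entities 1 and 2 even though it is attached to the best-connected entity, since it has only that one connection.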

Step S4042: calculate the blacklist probability of the high-centrality entities using a model algorithm. The blacklist probability of each entity is predicted in a manner similar to steps S4031 and S4032. This step is optional; the high-centrality entities may instead be added directly to the list.

Step S4043: obtain the subgraph high-centrality entity list. If step S4042 is performed, the blacklist probability of a high-centrality entity is generally the predicted probability. If step S4042 is skipped, the blacklist probability can be set according to scenario assumptions.

In an embodiment of the present invention, the risk calculation device further includes a conduction path analysis module, which is configured to calculate the shortest path between every two entities in the entity list through a graph algorithm and to record the entity attributes of the entities on the path and the relationship attributes between them. Specifically, referring to fig. 6, the analysis flow of the conduction path analysis module, i.e. the risk conduction path analysis module, is as follows:

Step S501: acquire the entity list from the risk entity identification module. A risk entity list is obtained from that module; the risk entity list covers at least one of the following: blacklist entities, high-risk entities, high-centrality entities.

Step S502: find the optimal path between entities using a graph path algorithm. Starting from each entity in the entity list, the optimal path between that entity and the other entities in the subgraph is found. Because there are many possible conduction routes between two entities in a subgraph and risk conduction is often very rapid, the optimal connection between the two entities is sought. The optimal path between every two entities is calculated through a graph algorithm. Available methods include Dijkstra's algorithm, the Floyd-Warshall algorithm, and the like.

The main principle of Dijkstra's algorithm is as follows:

The starting point is defined as the initial node. The distance of node Y denotes the distance from the initial node to node Y.

The first step: mark all nodes in the graph as unvisited, and collect all unvisited nodes into an unvisited set.

The second step: assign each node a tentative distance value. The distance of the initial node is set to 0, and that of every other node to positive infinity. Set the initial node as the current node.

The third step: for the current node, consider all of its neighbors that are marked as unvisited and calculate their tentative distances through the current node. Compare each newly calculated tentative distance with the distance currently assigned to that neighbor and keep the smaller value. For example, if the distance of the current node A is 6 and the edge between A and its neighbor B has length 2, the distance to B through A is 6 + 2 = 8. If the distance previously assigned to B is greater than 8, it is reassigned to 8; otherwise the original distance of B is retained.

The fourth step: once all unvisited neighbors of the current node have been considered, mark the current node as visited and remove it from the unvisited set; it is not examined again.

The fifth step: when the end point has been marked as visited (when the distance between two particular nodes is sought), or when the minimum tentative distance in the unvisited set is infinite, stop the calculation.

The sixth step: otherwise, set the unvisited node with the smallest tentative distance as the current node and repeat from the third step.
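The six steps can be sketched with a priority queue, which is a common way to pick the unvisited node with the smallest tentative distance. The function names `dijkstra` and `path_to`, the edge weights and the sample graph are hypothetical; the sketch assumes non-negative weights and that every node appears as a key of the adjacency dict.

```python
import heapq


def dijkstra(graph, source, target=None):
    """Shortest distances from `source`; stops early once `target` is visited."""
    dist = {node: float("inf") for node in graph}  # tentative distances
    dist[source] = 0.0
    prev = {}
    visited = set()
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)   # unvisited node with smallest distance
        if node in visited:
            continue
        visited.add(node)               # mark the current node as visited
        if node == target:
            break
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist[neighbor]:   # keep the smaller distance
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev


def path_to(prev, source, target):
    """Rebuild the path from `source` to a reachable `target`."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical weighted entity graph (undirected, so edges are listed twice).
graph = {
    "A": {"B": 2, "C": 6},
    "B": {"A": 2, "C": 3, "D": 8},
    "C": {"A": 6, "B": 3, "D": 1},
    "D": {"B": 8, "C": 1},
}
dist, prev = dijkstra(graph, "A")
print(dist["D"], path_to(prev, "A", "D"))  # 6.0 ['A', 'B', 'C', 'D']
```

Recording the predecessor of each node makes it possible to list the entities on the optimal path, which is what the subsequent attribute recording step needs.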

The main principle of the Floyd-Warshall algorithm is as follows:

Let V = {v₁, v₂, ..., v_N} be the node set of graph G. Let shortestPath(i, j, k) denote the shortest path from node i to node j that uses only nodes from the set {v₁, v₂, ..., v_k} as intermediate nodes. The goal is to find the shortest distance from each node i to each node j when any node in V may serve as an intermediate node.

For each node pair in the graph, shortestPath(i, j, k) is either a path that does not pass through node k or a path that passes through node k. In both cases, all other intermediate nodes belong to the set {v₁, v₂, ..., v_k-1}.

The shortest path from node i to node j passing only through {v₁, v₂, ..., v_k-1} is defined as shortestPath(i, j, k-1). Clearly, if there is a better path from i through k to j, that path is the concatenation of the shortest path from i to k and the shortest path from k to j, each passing only through {v₁, v₂, ..., v_k-1}.

Let ω(i, j) be the weight of the edge between node i and node j; shortestPath(i, j, k) can then be expressed recursively:

shortestPath(i, j, 0) = ω(i, j);

shortestPath(i, j, k) = min(shortestPath(i, j, k-1), shortestPath(i, k, k-1) + shortestPath(k, j, k-1))

First, shortestPath(i, j, k) is solved for all pairs (i, j) with k = 1, then with k = 2, and so on until k = N. This yields, for all pairs (i, j), the shortest path passing through {v₁, v₂, ..., v_N}.
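The recursion can be written as three nested loops, with the outer loop running over the intermediate node k. The sketch below is illustrative: the function name `floyd_warshall`, the directed sample edges and the use of infinity for missing edges are assumptions made for the example, and negative cycles are assumed not to occur.

```python
def floyd_warshall(nodes, weights):
    """All-pairs shortest distances.

    nodes: list of node names.
    weights: dict mapping (i, j) to the weight of the directed edge i -> j.
    """
    inf = float("inf")
    # k = 0: only direct edges, i.e. shortestPath(i, j, 0) = w(i, j).
    dist = {
        i: {j: (0.0 if i == j else weights.get((i, j), inf)) for j in nodes}
        for i in nodes
    }
    for k in nodes:              # allow one more intermediate node each round
        for i in nodes:
            for j in nodes:
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return dist


# Hypothetical directed, weighted relationships between four entities.
nodes = ["A", "B", "C", "D"]
weights = {("A", "B"): 2.0, ("B", "C"): 3.0, ("A", "C"): 6.0, ("C", "D"): 1.0}
dist = floyd_warshall(nodes, weights)
print(dist["A"]["C"], dist["A"]["D"])  # 5.0 6.0
```

Updating the table in place is safe here because row k and column k do not change during round k, so one table can serve as both shortestPath(·, ·, k-1) and shortestPath(·, ·, k).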

Step S503: record the entity attributes and relationship attributes on each path.

In an embodiment of the present invention, the risk calculation device further includes an entity conduction prediction module, which is configured to calculate, according to the entity list and the paths, the conduction probability of each entity through the probability distribution function corresponding to that entity, and to obtain the risk probability of each entity according to its risk probability value and the corresponding conduction probability. Specifically, referring to fig. 7, the process of entity risk conduction prediction is as follows:

Step S601: acquire the identified risk entity list from the risk entity identification module. The risk entity list covers at least one of the following: blacklist entities, high-risk entities, high-centrality entities. Risk conduction starts from these entities as the starting entities.

Step S602: acquire the entity conduction paths from the risk conduction path analysis module, that is, the risk conduction paths that start from the risk entities of the previous step.

Step S603: acquire the conduction probability functions from the conduction risk probability function calculation module. These include the entity-relationship-entity probability distribution function under each type of relationship, the multivariate probability distribution function under the interaction of different relationships, the probability distribution function of continuous risk conduction, and the probability function of multiple relationships and continuous conduction. They are used to calculate the specific risk conduction probability.

Step S604: calculate the conducted probability of the target entity. Starting from the blacklist probability of the initial node, the conducted blacklist probability of each associated entity is calculated according to the multivariate probability distribution function under the interaction of different relationships. The probability that the target entity is conducted into the blacklist is then obtained according to the probability function of multiple relationships and continuous conduction. Two modes are supported for the risk conduction probability of the target entity: mode one calculates the conduction probability of all entities in the subgraph, and mode two calculates the conduction probability of any specified target entities in the subgraph.

In an embodiment of the invention, the analysis device may include a risk decision module. As shown in fig. 8A, the decision process of the risk decision module is as follows:

Step S701: acquire the risk conduction probability of the target entity from the entity risk conduction prediction module.

Step S702: acquire the conduction path attributes from the risk conduction path analysis module. The path attributes include entity attributes, relationship attributes, and the like.

Step S703: establish a decision risk engine. The decision risk engine determines the risk degree of the target entity according to the target entity's risk conduction probability, its risk conduction path attributes and a risk matrix. The risk matrix is a four-quadrant matrix established according to the risk exposure and the risk conduction probability of the entity.

Referring to fig. 8B, the four quadrants are:

Quadrant I: risk conduction probability high, risk exposure high;

Quadrant II: risk conduction probability low, risk exposure high;

Quadrant III: risk conduction probability low, risk exposure low;

Quadrant IV: risk conduction probability high, risk exposure low.

In quadrant I, both the occurrence probability and the risk exposure are high, so it is high risk. In quadrant III, both the occurrence probability and the risk exposure are low, so it is low risk. In quadrant II, the occurrence probability is low and the risk exposure is high; once a risk event occurs, the loss is large. In quadrant IV, the occurrence probability is high and the risk exposure is low; the loss incurred when the event occurs is small. The risk level of these two quadrants is medium.

Step S704: distinguish according to the risk preference of the decision risk engine, which falls into three categories: risk seeking, risk neutrality and risk aversion. The risk severity of the four quadrants above is further ranked from high to low:

Risk-seeking decision: quadrant I > quadrant IV > quadrant II > quadrant III;

Risk-neutral decision: quadrant I > quadrant IV = quadrant II > quadrant III;

Risk-averse decision: quadrant I > quadrant II > quadrant IV > quadrant III.
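The four-quadrant matrix and the three preference orderings can be sketched as a lookup followed by a sort. The thresholds that separate high from low probability and exposure, the function names, the preference keys and the sample entities are all hypothetical values chosen for the example; the text does not fix them.

```python
def quadrant(probability, exposure, prob_threshold, exposure_threshold):
    """Map an entity to quadrant I, II, III or IV of the risk matrix."""
    high_probability = probability >= prob_threshold
    high_exposure = exposure >= exposure_threshold
    if high_probability and high_exposure:
        return "I"
    if high_exposure:
        return "II"    # low probability, high exposure
    if high_probability:
        return "IV"    # high probability, low exposure
    return "III"


# Lower rank means more severe; equal ranks express the neutral tie.
SEVERITY_RANK = {
    "risk_seeking": {"I": 0, "IV": 1, "II": 2, "III": 3},
    "risk_neutral": {"I": 0, "IV": 1, "II": 1, "III": 2},
    "risk_averse": {"I": 0, "II": 1, "IV": 2, "III": 3},
}


def rank_entities(entities, preference,
                  prob_threshold=0.5, exposure_threshold=1_000_000.0):
    """Sort (name, probability, exposure) tuples from most to least severe."""
    rank = SEVERITY_RANK[preference]
    tagged = [
        (name, quadrant(p, e, prob_threshold, exposure_threshold))
        for name, p, e in entities
    ]
    return sorted(tagged, key=lambda item: rank[item[1]])


# Hypothetical target entities: (name, conduction probability, risk exposure).
entities = [
    ("e1", 0.9, 5_000_000.0),
    ("e2", 0.2, 5_000_000.0),
    ("e3", 0.1, 10_000.0),
    ("e4", 0.8, 10_000.0),
]
seeking = rank_entities(entities, "risk_seeking")
averse = rank_entities(entities, "risk_averse")
print(seeking)
print(averse)
```

Keeping the orderings in a table rather than in branching code makes it straightforward to add or adjust a preference profile without touching the quadrant logic.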

Step S705: risk decision making. Quadrant I is the high-risk entity list, for which corresponding risk control measures are taken according to external regulatory requirements and the internal risk management system. Quadrant II is a medium-risk entity list with low occurrence probability and large risk exposure; risk verification is carried out on it and, if the risk is confirmed, corresponding risk control measures are taken according to external regulatory requirements and the internal risk management system. Quadrant III is the low-risk entity list, which generally only requires attention. Quadrant IV is a medium-risk entity list with high occurrence probability and low risk exposure; risk early warning is carried out on it, and corresponding risk control measures are taken according to external regulatory requirements and the internal risk management system.

In summary, in order to display the status of each entity more accurately and provide more diversified display manners, in actual work the entity relationship map display system provided by the present invention can be divided into the following 7 parts, as shown in fig. 1B. The knowledge graph building module 1 builds the underlying data of the knowledge graph, which is the data basis for everything that follows. The conduction risk probability function calculation module 2 calculates the risk probability distribution of entities and generates the risk probability distribution function of an entity being conducted to. The entity subgraph extraction module 3 extracts the customer groups for which the conduction risk is calculated. The risk entity identification module 4 finds the blacklist within the subgraph of module 3 as the starting point of risk conduction. Based on modules 3 and 4, the risk conduction path analysis module 5 calculates the risk propagation paths of the blacklisted customers from module 4 within the subgraph of module 3. The entity risk conduction prediction module 6 calculates the blacklist risk probability of the conducted target entity according to the paths of module 5 and the risk conduction probability function obtained by module 2. The risk decision module 7 takes corresponding measures according to the blacklist risk probability from module 6 to control risks.

In the above embodiment, the knowledge graph building module 1 mainly determines the definitions of the entities and relationships within the knowledge graph and forms the full graph with the entity-relationship-entity triple as the basic unit. The conduction risk probability function calculation module 2 mainly uses big data to calculate the risk conduction probability distributions under different relationships and obtains the risk probability function of an entity under multiple relationships and continuous conduction. The entity subgraph extraction module 3 extracts the entity group to be analyzed, so that the risk conduction paths and risk states within that group can be studied. The risk entity identification module 4 mainly uses a machine learning algorithm to find the blacklist entities in the entity subgraph. The risk conduction path analysis module 5 mainly uses a machine learning algorithm to generate the risk conduction paths from the blacklist entities to the target entity and records the attribute information of each entity-relationship-entity on the path. The entity risk conduction prediction module 6 applies the risk values obtained from the conduction risk probability function calculation module to the risk conduction paths to obtain the risk value of the target entity. The risk decision module 7 comprehensively considers the risk value and the risk conduction path of the target entity, establishes a risk decision engine, and determines the list of target entities that require early warning and the corresponding risk control measures.

Referring to fig. 9, the present invention further provides a method for displaying an entity relationship map, the method comprising: step one: acquiring all entities in a preset range, and constructing a knowledge graph with the entities as nodes according to the entity attributes of the entities and the relationship attributes among the entities; step two: analyzing the entity attributes of each entity in the knowledge graph according to preset rules to obtain blacklist entities; obtaining a corresponding probability distribution function from a pre-stored function library according to the relationship attribute between a blacklist entity and its associated entity; calculating a risk probability value of each entity associated with the blacklist entity according to the probability distribution function; and obtaining the risk probability of the corresponding entity according to the sum of the one or more risk probability values of that entity; step three: comparing the risk probability of each entity with a preset prompt threshold, and generating prompt information according to the comparison result and the corresponding entity.

In the above embodiment, the method further comprises: obtaining the blacklist entities in historical data that satisfy the preset rules; screening, according to the blacklist entities in the historical data, the first-degree associated entities of each blacklist entity and the one or more relationship attributes between the blacklist entity and each first-degree associated entity; obtaining, through statistical analysis, a first risk transfer function corresponding to the different relationship attributes between a first-degree associated entity and the blacklist entity, a second risk transfer function corresponding to multiple relationship attributes, and a continuous transfer function of the blacklist entity; and obtaining a probability distribution function according to the first risk transfer function, the second risk transfer function and the continuous transfer function, and storing the continuous transfer function in the pre-stored function library.

In an embodiment of the present invention, step two further includes: screening the knowledge graph according to a screening condition selected from preset screening conditions to obtain one or more corresponding entities, and constructing an entity subgraph according to those entities and the relationship attributes between them. The preset screening conditions comprise attribute identification, entity identification and community identification. When attribute identification is selected, the knowledge graph is screened according to the entity attributes of each entity or the relationship attributes between entities to obtain one or more corresponding entities, and an entity subgraph is constructed according to those entities and the relationship attributes between them. When entity identification is selected, the knowledge graph is screened to obtain the corresponding entities, and an entity subgraph is constructed according to those entities and the relationship attributes between them. When community identification is selected, the connected components and community clusters in the knowledge graph are obtained through a graph community algorithm to generate an entity subgraph.

In an embodiment of the present invention, the second step further includes: identifying in the entity subgraph according to an identification condition selected from preset identification conditions to obtain a corresponding detection result; wherein the identification conditions comprise blacklist identification, entity identification and node identification; when the blacklist identification is selected, comparing the entity attributes of each entity in the entity subgraph against a preset rule, obtaining the blacklist entities in the entity subgraph that conform to the preset rule, and generating an entity list according to the blacklist entities; when the entity identification is selected, establishing a risk detection model through historical data and a learning algorithm, calculating the risk probability of each entity in the entity subgraph through the risk detection model, and generating an entity list according to the entities whose risk probability is higher than a preset probability threshold; and when the node identification is selected, identifying centrality entities in the entity subgraph through a node analysis algorithm, calculating the risk probability of each centrality entity through the risk detection model, and generating an entity list according to the centrality entities whose risk probability is higher than a preset probability threshold.
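By way of illustration only, the blacklist identification and node identification branches might be sketched as follows. The attribute name overdue_days, the rule, the toy risk model, and the use of degree centrality as the node analysis algorithm are all assumptions introduced for this sketch; the source specifies none of them, and a trained model would replace the toy scoring function.

```python
def blacklist_entities(nodes, rule):
    """Blacklist identification: entities whose attributes satisfy the rule."""
    return [n for n, attrs in nodes.items() if rule(attrs)]

def degree_centrality(nodes, edges):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    denominator = max(len(nodes) - 1, 1)
    return {n: d / denominator for n, d in degree.items()}

def risky_central_entities(nodes, edges, risk_model, min_centrality, threshold):
    """Node identification: central entities whose modelled risk exceeds the threshold."""
    centrality = degree_centrality(nodes, edges)
    return [
        n for n in nodes
        if centrality[n] >= min_centrality and risk_model(nodes[n]) > threshold
    ]

# Hypothetical entity subgraph and scoring functions.
nodes = {
    "A": {"overdue_days": 120},
    "B": {"overdue_days": 40},
    "C": {"overdue_days": 10},
    "D": {"overdue_days": 45},
}
edges = [("A", "B"), ("B", "C"), ("B", "D")]
rule = lambda attrs: attrs["overdue_days"] > 90                     # preset rule (assumed)
risk_model = lambda attrs: min(attrs["overdue_days"] / 100, 1.0)    # toy model (assumed)

blacklist = blacklist_entities(nodes, rule)
central_risky = risky_central_entities(nodes, edges, risk_model, 0.5, 0.3)
```

In this sample, entity B is connected to every other entity and is therefore the only centrality entity passed to the risk model.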

In an embodiment of the present invention, the second step further includes: calculating the shortest path between every two entities in the entity list through a graph algorithm, and recording the entity attributes of the entities on the path and the relationship attributes between those entities.
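By way of illustration only, the pairwise shortest-path calculation might be sketched with a breadth-first search, which assumes that every relationship counts as one hop and that relationships are undirected. The source does not name the graph algorithm or any edge weighting; the function names and sample edges below are hypothetical.

```python
from collections import deque
from itertools import combinations

def shortest_path(adjacency, source, target):
    """Breadth-first search; returns the node sequence or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None

def pairwise_paths(edges, entity_list):
    """For every pair in entity_list, record the path and its relationship attributes."""
    adjacency, relation = {}, {}
    for u, v, r in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
        relation[frozenset((u, v))] = r
    result = {}
    for a, b in combinations(entity_list, 2):
        path = shortest_path(adjacency, a, b)
        if path is not None:
            relations = [relation[frozenset(pair)] for pair in zip(path, path[1:])]
            result[(a, b)] = (path, relations)
    return result

# Hypothetical edges: (entity, entity, relationship attribute).
edges = [
    ("A", "B", "guarantee"),
    ("B", "C", "transfer"),
    ("C", "D", "shareholding"),
    ("A", "D", "transfer"),
]
paths = pairwise_paths(edges, ["A", "C"])
```

When several shortest paths have the same length, as between A and C here, this sketch simply keeps the first one found.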

In an embodiment of the present invention, the second step further includes: calculating the conduction probability of each entity through the probability distribution function corresponding to that entity according to the entity list and the path, and obtaining the risk probability of each entity according to the risk probability value and the corresponding conduction probability of that entity.
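By way of illustration only, one possible reading of this step is sketched below. It assumes that the conduction probability along a path is the product of a per-relationship transfer probability for each hop, and that an entity's risk probability is the sum of its own risk probability value and the conduction probability, capped at 1. Both combination rules, the TRANSFER table, and the function names are assumptions; the source states only that the two quantities are combined.

```python
import math

# Hypothetical per-relationship transfer probabilities, standing in for the
# probability distribution functions held in the pre-stored function library.
TRANSFER = {"guarantee": 0.5, "transfer": 0.25}

def conduction_probability(relations, transfer=TRANSFER):
    """Probability that risk is conducted along every hop of the path (assumed product rule)."""
    return math.prod(transfer[r] for r in relations)

def risk_probability(risk_value, conduction):
    """Combine an entity's own risk value with the conducted risk (assumed capped sum)."""
    return min(1.0, risk_value + conduction)

# Path A -> B -> C recorded with relationship attributes guarantee, transfer.
conducted = conduction_probability(["guarantee", "transfer"])
total = risk_probability(0.2, conducted)
```

The resulting value can then be compared with the preset prompt threshold of step three to decide whether prompt information is generated for the entity.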

As shown in fig. 10, the computer device 600 may further include: a communication module 110, an input unit 120, an audio processor 130, a display 160, and a power supply 170. It is noted that the computer device 600 does not necessarily include all of the components shown in fig. 10; furthermore, the computer device 600 may also comprise components not shown in fig. 10, for which reference may be made to the prior art.

As shown in fig. 10, the central processor 100, sometimes referred to as a controller or operation control, may comprise a microprocessor or other processor device and/or logic device; the central processor 100 receives input and controls the operation of the various components of the computer device 600.

The memory 140 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. The memory 140 may store the information relating to the failure, as well as a program for processing that information. The central processor 100 may execute the program stored in the memory 140 to realize information storage, processing, and the like.

The input unit 120 provides input to the central processor 100. The input unit 120 is, for example, a key or a touch input device. The power supply 170 is used to provide power to the computer device 600. The display 160 is used to display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.

The memory 140 may be a solid-state memory such as a read-only memory (ROM), a random access memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off, can be selectively erased, and can be written with further data; an example of such a memory is sometimes called an EPROM. The memory 140 may also be some other type of device. The memory 140 includes a buffer memory 141 (sometimes referred to as a buffer). The memory 140 may include an application/function storage section 142, which stores application programs and function programs, or a flow by which the central processor 100 executes the operation of the computer device 600.

Memory 140 may also include a data store 143, the data store 143 for storing data, such as contacts, digital data, pictures, sounds, and/or any other data used by a computer device. The driver storage 144 of the memory 140 may include various drivers for the computer device for communication functions and/or for performing other functions of the computer device (e.g., messaging applications, directory applications, etc.).

The communication module 110 is a transmitter/receiver 110 that transmits and receives signals via an antenna 111. The communication module (transmitter/receiver) 110 is coupled to the central processor 100 to provide an input signal and receive an output signal, which may be the same as in the case of a conventional mobile communication terminal.

Based on different communication technologies, a plurality of communication modules 110, such as a cellular network module, a Bluetooth module, and/or a wireless local area network module, may be provided in the same computer device. The communication module (transmitter/receiver) 110 is also coupled to a speaker 131 and a microphone 132 via the audio processor 130 to provide audio output via the speaker 131 and receive audio input from the microphone 132, thereby implementing general telecommunications functions. The audio processor 130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 130 is also coupled to the central processor 100, so that sound can be recorded locally through the microphone 132 and locally stored sound can be played through the speaker 131.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. An entity relationship map display system is characterized by comprising a relationship construction device, a risk calculation device and an analysis device;

the relation construction device is used for collecting all entities in a preset range and constructing a knowledge graph with the entities as nodes according to the entity attributes of the entities and the relation attributes among the entities;

the risk calculation device is used for analyzing the entity attributes of each entity in the knowledge graph according to preset rules to obtain blacklist entities; obtaining a corresponding probability distribution function in a pre-stored function library according to the relation attribute between the blacklist entity and the associated entity thereof; calculating and obtaining a risk probability value of an entity associated with the blacklist entity according to the probability distribution function; obtaining the risk probability of the corresponding entity according to the sum of one or more risk probability values of the entity;

the analysis device is used for comparing the risk probability of each entity with a preset prompt threshold value and generating prompt information according to the comparison result and the corresponding entity.

2. The entity relationship graph display system of claim 1, wherein the risk calculation means comprises a conduction probability calculation module for obtaining blacklisted entities in historical data that meet a predetermined rule; screening and obtaining a first-degree associated entity with the blacklist entity and one or more relationship attributes between the blacklist entity and the first-degree associated entity according to the blacklist entity in historical data; obtaining a first risk transfer function corresponding to different relationship attributes between the first degree associated entity and the blacklist entity, a second risk transfer function corresponding to a plurality of relationship attributes, and a continuous transfer function of the blacklist entity through statistical analysis; and obtaining a probability distribution function according to the first risk transfer function, the second risk transfer function and the continuous transfer function, and storing the continuous transfer function into a pre-stored function library.

3. The entity relationship graph display system of claim 1, wherein the risk calculation device comprises an entity subgraph extraction module, the entity subgraph extraction module is used for obtaining one or more corresponding entities through screening in the knowledge graph according to selected screening conditions in preset screening conditions; and constructing an entity subgraph according to the entities and the relationship attributes between the entities.

4. The entity relationship map display system of claim 3, wherein the predetermined filtering conditions include attribute identification, entity identification and community identification;

when the attribute identification is selected, screening in the knowledge graph according to the entity attribute of each entity or the relationship attribute between the entities to obtain one or more corresponding entities; constructing entity subgraphs according to the entities and the relationship attributes between the entities;

when the entity identification is selected, screening in the knowledge graph to obtain a corresponding entity; constructing entity subgraphs according to the entities and the relationship attributes between the entities;

and when the community identification is selected, obtaining the connected components and the community clusters in the knowledge graph through a graph community algorithm to generate an entity subgraph.

5. The entity relationship graph display system of claim 3, wherein the risk calculation device comprises a risk entity identification module, the risk entity identification module is configured to identify in the entity subgraph according to a selected identification condition of preset identification conditions to obtain a corresponding detection result; wherein the identification condition comprises blacklist identification, entity identification and node identification;

when the blacklist identification is selected, comparing the entity attributes of each entity in the entity subgraph according to a preset rule, obtaining a blacklist entity which accords with the preset rule in the entity subgraph, and generating an entity list according to the blacklist entity;

when entity identification is selected, a risk detection model is established through historical data and a learning algorithm; respectively calculating the risk probability of each entity in the entity subgraph through the risk detection model, and generating an entity list according to the entities with the risk probability higher than a preset probability threshold;

when the node identification is selected, identifying a centrality entity in the entity subgraph through a node analysis algorithm; calculating the risk probability of the centrality entity through the risk detection model, and generating an entity list according to the centrality entity with the risk probability higher than a preset probability threshold.

6. The entity relationship map display system of claim 5, wherein the risk calculation device further comprises a conduction path analysis module, the conduction path analysis module is configured to calculate a shortest path between each two entities in the entity list through a graph algorithm, and record entity attributes and relationship attributes of the path entities.

7. The entity relationship graph display system of claim 6, wherein the risk calculation device further comprises an entity conduction prediction module, the entity conduction prediction module is configured to calculate a conduction probability of each entity according to the entity list and the path through a probability distribution function corresponding to each entity, and obtain a risk probability of each entity according to the risk probability value and the corresponding conduction probability of each entity.

8. An entity relationship map display method, comprising:

step one: acquiring all entities in a preset range, and constructing a knowledge graph with the entities as nodes according to the entity attributes of the entities and the relationship attributes among the entities;

step two: analyzing the entity attributes of each entity in the knowledge graph according to preset rules to obtain blacklist entities; obtaining a corresponding probability distribution function in a pre-stored function library according to the relation attribute between the blacklist entity and the associated entity thereof; calculating and obtaining a risk probability value of an entity associated with the blacklist entity according to the probability distribution function; obtaining the risk probability of the corresponding entity according to the sum of one or more risk probability values of the entity;

step three: comparing the risk probability of each entity with a preset prompt threshold value, and generating prompt information according to the comparison result and the corresponding entity.

9. The entity relationship graph display method of claim 8, further comprising:

obtaining blacklist entities which accord with preset rules in historical data;

screening and obtaining a first-degree associated entity with the blacklist entity and one or more relationship attributes between the blacklist entity and the first-degree associated entity according to the blacklist entity in historical data;

obtaining a first risk transfer function corresponding to different relationship attributes between the first degree associated entity and the blacklist entity, a second risk transfer function corresponding to a plurality of relationship attributes, and a continuous transfer function of the blacklist entity through statistical analysis;

and obtaining a probability distribution function according to the first risk transfer function, the second risk transfer function and the continuous transfer function, and storing the continuous transfer function into a pre-stored function library.

10. The entity relationship graph displaying method of claim 8, wherein the second step further comprises: screening in the knowledge graph according to a selected screening condition in preset screening conditions to obtain one or more corresponding entities; and constructing an entity subgraph according to the entities and the relationship attributes between the entities.

11. The entity relationship map display method according to claim 10, wherein the preset screening conditions include attribute identification, entity identification and community identification;

12. The entity relationship graph displaying method of claim 10, wherein the second step further comprises:

identifying in the entity subgraph according to a selected identification condition in preset identification conditions to obtain a corresponding detection result;

wherein the identification condition comprises blacklist identification, entity identification and node identification;

13. The entity relationship graph displaying method of claim 12, wherein the second step further comprises: calculating the shortest path between every two entities in the entity list through a graph algorithm, and recording the entity attributes of the path entities and the relationship attributes between the entities.

14. The entity relationship graph displaying method of claim 13, wherein said second step further comprises: calculating the conduction probability of each entity through the probability distribution function corresponding to each entity according to the entity list and the path, and obtaining the risk probability of each entity according to the risk probability value and the corresponding conduction probability of each entity.

15. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 8 to 14 when executing the computer program.

16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any of claims 8 to 14.