CN110717824A

CN110717824A - Method and device for calculating risk conduction of a public customer group by a bank based on a knowledge graph

Info

Publication number: CN110717824A
Application number: CN201910985850.4A
Authority: CN
Inventors: 刘鹏飞
Original assignee: Beijing Mininglamp Software System Co ltd
Current assignee: Beijing Mininglamp Software System Co ltd
Priority date: 2019-10-17
Filing date: 2019-10-17
Publication date: 2020-01-21

Abstract

The application provides a method and a device for calculating risk conduction of a public customer group by a bank based on a knowledge graph. The method comprises: constructing an association relation knowledge graph; for the unit data of each graph unit in the association relation knowledge graph, obtaining a risk conduction coefficient of the graph unit from a risk conduction coefficient prediction model; after a risk event occurs for any customer to be tested, obtaining, according to a preset algorithm model and based on the risk conduction coefficient of each graph unit, the risk conduction probability of each customer to be tested other than the one that had the risk event; determining at least one risk conduction path based on the risk conduction probability of each customer to be tested; and executing a business early warning operation for each risk conduction path. The bank can thereby determine how strongly the credit of other enterprises associated with the enterprise that had the risk event is affected, which improves the accuracy with which the bank determines enterprise risk and reduces unnecessary losses for the bank.

Description

Method and device for calculating risk conduction of a public customer group by a bank based on a knowledge graph

Technical Field

The application relates to the technical field of data analysis, and in particular to a method and a device for calculating risk conduction of a public customer group by a bank based on a knowledge graph.

Background

With the continuous development of the market economy, it is very common for enterprises to produce and operate across regions and industries. As a result, the association relationships among the enterprises recorded by a bank are complex, and when a credit risk event occurs at any one enterprise, the credit of other enterprises associated with that enterprise may be affected.

At present, after a risk event occurs at an enterprise, a bank cannot determine how strongly the credit of other enterprises associated with that enterprise is affected. In other words, the accuracy with which the bank determines enterprise risk is low, so the bank cannot perform business early warning operations on the other enterprises and suffers unnecessary losses.

Disclosure of Invention

In view of the above, an object of the present application is to provide a method and a device for calculating risk conduction of a public customer group by a bank based on a knowledge graph, so as to improve the accuracy with which the bank determines enterprise risk.

In a first aspect, an embodiment of the present application provides a method for calculating risk conduction of a public customer group by a bank based on a knowledge graph, where the method includes:

taking each customer to be tested as a node and the association relationship between the customer to be tested and other customers to be tested as an edge, constructing an association relation knowledge graph containing each customer to be tested, wherein the edge and the two nodes corresponding to the edge form a graph unit of the association relation knowledge graph, and each edge corresponds to one direction;

aiming at each map unit, acquiring unit data of the map unit, and inputting the unit data into a trained risk conduction coefficient prediction model to obtain a risk conduction coefficient of the map unit;

after any client to be tested has a risk event, obtaining the risk conduction probability of other clients to be tested except the client to be tested having the risk event according to a preset algorithm model based on the risk conduction coefficient of each map unit;

determining at least one risk conduction path based on the risk conduction probability of each customer to be tested, wherein the first node of each risk conduction path is the customer to be tested with the risk event;

and executing business early warning operation aiming at each risk conduction path.

In some embodiments of the present application, the training process of the risk conduction coefficient prediction model is:

obtaining sample unit data corresponding to a plurality of map units of the incidence relation knowledge map, wherein the sample unit data carries a label, the sample unit data comprises training unit data and test unit data, and the label comprises default and non-default;

inputting the training unit data into a deep learning model, and performing multi-round training on the deep learning model;

inputting the data of the test unit into the deep learning model after each round of training, evaluating the deep learning model after training, and determining that the training is finished when the accuracy value obtained by evaluation is greater than a preset accuracy threshold value;

and taking the deep learning model obtained after training as the risk conduction coefficient prediction model.

In some embodiments of the present application, after the association relation knowledge graph containing each customer to be tested is constructed and the unit data of the graph units is acquired, and before the risk conduction coefficient prediction model is trained, the method further includes:

analyzing the correlation between each data in the unit data of the map unit and the risk by using a preset correlation analysis algorithm to obtain a correlation value corresponding to each data in the unit data;

and selecting data with the correlation value larger than a correlation threshold value from the unit data to form screened unit data of the map unit.

In some embodiments of the present application, after obtaining the screened unit data for the map unit, the method further comprises:

analyzing the similarity degree between any one of the screened unit data of the map unit and other data except the data by using a preset data analysis algorithm to obtain a similarity value between any two data of the screened unit data;

and merging the two data with the similarity value larger than the similarity threshold value in the screened unit data to obtain merged unit data of each map unit.

In some embodiments of the present application, the formula of the preset algorithm model is:

wherein $N$ is the number of nodes in the association relation knowledge graph; $PR_i(k)$ is the risk conduction probability of the $i$-th customer to be tested; $PR_j(k-1)$ is the risk conduction probability of the $j$-th customer to be tested; $\alpha_{ji}$ is the risk conduction coefficient of the graph unit formed by the $j$-th customer to be tested, the $i$-th customer to be tested, and the edge pointing from the $j$-th customer to be tested to the $i$-th customer to be tested; $s$ is a scale constant with $0 \le s \le 1$; and $i$ and $j$ are positive integers with $1 \le i, j \le N$.

In a second aspect, the present application further provides a device for calculating risk conduction of a public customer group by a bank based on a knowledge graph, where the device includes:

a graph construction module, used for constructing an association relation knowledge graph containing each customer to be tested by taking each customer to be tested as a node and the association relationship between the customer to be tested and other customers to be tested as an edge, wherein the edge and the two nodes corresponding to the edge form a graph unit of the association relation knowledge graph;

the risk conduction coefficient determining module is used for acquiring unit data of each map unit, inputting the unit data into a trained risk conduction coefficient prediction model, and obtaining the risk conduction coefficient of the map unit;

the risk conduction probability determining module is used for obtaining, after a risk event occurs for any customer to be tested, the risk conduction probability of each customer to be tested other than the one that had the risk event, according to a preset algorithm model and based on the risk conduction coefficient of each graph unit;

the risk conduction path determination module is used for determining at least one risk conduction path based on the risk conduction probability of each customer to be tested, wherein the first node of each risk conduction path is the customer to be tested with the risk event;

and the operation module is used for executing business early warning operation aiming at each risk conduction path.

In some embodiments of the present application, the apparatus further comprises a model determination module, wherein the model determination module obtains the risk conductance prediction model by:

In some embodiments of the present application, after the association relation knowledge graph containing each customer to be tested is constructed and the unit data of the graph units is acquired, and before the risk conduction coefficient prediction model is trained, the apparatus further includes:

the correlation degree analysis module is used for analyzing the correlation degree between each data in the unit data of the map unit and the risk by using a preset correlation degree analysis algorithm to obtain a correlation degree value corresponding to each data in the unit data;

and the selecting module is used for selecting data with the correlation value larger than the correlation threshold value from the unit data to form screened unit data of the map unit.

In some embodiments of the present application, after obtaining the filtered unit data for the map unit, the apparatus further comprises:

the similarity analysis module is used for analyzing the similarity between any one of the screened unit data of the map unit and other data except the data by using a preset data analysis algorithm to obtain a similarity value between any two data of the screened unit data;

and the merging module is used for merging the two data with the similarity value larger than the similarity threshold value in the screened unit data to obtain merged unit data of each map unit.

In some embodiments of the present application, the predetermined algorithm model in the probability determination module has the formula:

In a third aspect, an embodiment of the present application further provides an electronic device, including a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, the processor and the memory communicate via the bus when the electronic device is running, and the machine-readable instructions, when executed by the processor, perform the steps of the method for calculating risk conduction of a public customer group by a bank based on a knowledge graph according to the first aspect or any possible embodiment of the first aspect.

In a fourth aspect, the present application further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the steps of the method for calculating risk conduction of a public customer group by a bank based on a knowledge graph according to the first aspect or any possible embodiment of the first aspect.

The embodiments of the application provide a method and a device for calculating risk conduction of a public customer group by a bank based on a knowledge graph. The method comprises: constructing an association relation knowledge graph containing each customer to be tested; determining the risk conduction coefficient of each graph unit of the association relation knowledge graph from the unit data of the graph unit according to a risk conduction coefficient prediction model; after a risk event occurs for any customer to be tested, obtaining the risk conduction probability of each customer to be tested based on the risk conduction coefficients and an algorithm model; determining at least one risk conduction path based on the risk conduction probabilities, wherein the first node of each risk conduction path is the customer to be tested that had the risk event; and executing a business early warning operation for each risk conduction path. With this method, after a risk event occurs for any customer to be tested, the risk conduction probability of the other customers associated with that customer can be determined, the risk conduction paths can be obtained from those probabilities, and business early warning operations can then be executed based on the paths. Consequently, after a risk event occurs at an enterprise, the bank can determine how strongly the credit of the associated enterprises is affected and perform business early warning operations on them, which improves the accuracy with which the bank determines enterprise risk and reduces unnecessary losses for the bank.

In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.

Fig. 1 is a flowchart of a method for calculating risk conduction of a public customer group by a bank based on a knowledge graph according to an embodiment of the present application;

Fig. 2 is a schematic diagram of an association relation knowledge graph in the method for calculating risk conduction of a public customer group by a bank based on a knowledge graph according to an embodiment of the present application;

Fig. 3 is a schematic diagram of a graph unit provided in an embodiment of the present application;

Fig. 4 is a schematic diagram of another graph unit provided in an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a device for calculating risk conduction of a public customer group by a bank based on a knowledge graph according to an embodiment of the present application;

Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.

At present, after a risk event occurs at an enterprise, a bank cannot determine how strongly the credit of other enterprises associated with that enterprise is affected; that is, the accuracy with which the bank determines enterprise risk is low, so the bank cannot perform business early warning operations on the other enterprises and suffers unnecessary losses. In view of this, the embodiments of the application provide a method and a device for calculating risk conduction of a public customer group by a bank based on a knowledge graph, which are described below by way of embodiments.

To facilitate understanding of the embodiments, the method for calculating risk conduction of a public customer group by a bank based on a knowledge graph disclosed in the embodiments of the present application is first described in detail.

Example one

The embodiment of the application provides a method for calculating risk conduction of a public customer group by a bank based on a knowledge graph. The method is applied to a bank detection system, so that the bank detection system can detect the state of a customer based on stored customer data, wherein the customer data comprises credit data, financial data, industrial and commercial data and the like. The method is described in detail through the following process.

Referring to a flowchart of a method for calculating risk conduction of a public customer group by a bank based on a knowledge graph shown in fig. 1, the method includes steps S101-S105, and the specific process is as follows:

S101, taking each customer to be tested as a node and the association relationship between the customer to be tested and other customers to be tested as an edge, constructing an association relation knowledge graph containing each customer to be tested, wherein the edge and the two nodes corresponding to the edge form a graph unit of the association relation knowledge graph, and each edge corresponds to one direction.

In the embodiment of the application, a plurality of customers to be tested and the association relationships among them can be extracted from the customer data stored in the bank detection system, wherein the association relationships include one or more of the following: a consistent actor relationship, a group relationship, an actual control relationship, an equity control relationship, a guarantee relationship, etc. The association relationships may be selected according to actual needs, which is not specifically limited in this embodiment of the present application.

In the embodiment of the application, the bank detection system may obtain a plurality of association relation knowledge graphs based on the stored customer data, and any node in each association relation knowledge graph has an association relationship with at least one other node in that graph. Each edge in the association relation knowledge graph corresponds to a direction: one end of the edge is the actively controlling customer to be tested, the other end is the controlled customer to be tested, and the edge points from the controlled customer to be tested to the actively controlling customer to be tested. Fig. 2 is a schematic diagram of an association relation knowledge graph in the method for calculating risk conduction of a public customer group by a bank based on a knowledge graph. The exemplary association relation knowledge graph in Fig. 2 includes a first node 21, a second node 22, a third node 23, a fourth node 24, a fifth node 25, and 5 directed edges, where a directed edge is an edge that has a direction and the number on each directed edge is the risk conduction coefficient of the corresponding graph unit; for example, the risk conduction coefficient of the graph unit composed of the first node, the second node, and the directed edge between them is 0.7. Illustratively, if an actual control relationship and an equity control relationship exist between the first node and the second node, and the stock control proportion data is 25%, then the second node actually controls the first node and holds 25% of its shares; that is, the first node is the controlled customer to be tested and the second node is the actively controlling customer to be tested.

In this embodiment of the present application, an edge and the two nodes corresponding to the edge form a graph unit of the association relation knowledge graph, and each edge corresponds to a direction. If two graph units contain the same two nodes but their edges point in different directions, the two graph units are different. Fig. 3 is a schematic diagram of one graph unit and Fig. 4 is a schematic diagram of another; both include a sixth node 31 and a seventh node 32. In Fig. 3 the edge points from the sixth node 31 to the seventh node 32, that is, the seventh node is the actively controlling customer to be tested and the sixth node is the controlled customer to be tested, and 0.7 is the risk conduction coefficient of the graph unit in Fig. 3. In Fig. 4 the edge points from the seventh node 32 to the sixth node 31, that is, the sixth node is the actively controlling customer to be tested and the seventh node is the controlled customer to be tested, and 0.5 is the risk conduction coefficient of the graph unit in Fig. 4. The graph unit in Fig. 3 and the graph unit in Fig. 4 are therefore different graph units, because their edges point in different directions.
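The distinction between graph units that share nodes but differ in direction can be sketched as follows. This is only an illustrative sketch: the names `GraphUnit` and `build_graph` and the node identifiers are hypothetical, and the coefficients are simply the values shown in Fig. 3 and Fig. 4 rather than model outputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphUnit:
    """One directed edge plus its two endpoint nodes (hypothetical structure)."""
    controlled: str   # tail of the edge: the controlled customer
    controller: str   # head of the edge: the actively controlling customer


def build_graph(edges):
    """Map each directed (controlled, controller) pair to its own graph unit."""
    units = {}
    for controlled, controller, coefficient in edges:
        units[GraphUnit(controlled, controller)] = coefficient
    return units


# Same two nodes, opposite directions: two distinct graph units.
graph = build_graph([
    ("node_31", "node_32", 0.7),  # as in Fig. 3: node 31 -> node 32
    ("node_32", "node_31", 0.5),  # as in Fig. 4: node 32 -> node 31
])
```

Because the dataclass is frozen and compares by field values, reversing the direction yields a different dictionary key, which mirrors the rule that the two units are different.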

S102, aiming at each map unit, acquiring unit data of the map unit, and inputting the unit data into the trained risk conduction coefficient prediction model to obtain the risk conduction coefficient of the map unit.

In the embodiment of the present application, a graph unit includes an edge and two nodes corresponding to the edge, so that unit data of the graph unit includes entity data of each node and association relation data of the edge.

In the embodiment of the application, the unit data comprises entity data of the nodes and association relation data of the edges, wherein the entity data represents the financial information of a node and the association relation data represents the relationship between different nodes. The entity data includes the asset-liability ratio and the asset liquidity ratio; the association relation data includes consistent actor relationship data, group relationship data, actual control relationship data, equity control relationship data, guarantee relationship data, stock control proportion data, and guarantee amount data. The types and number of data items in the entity data and in the association relation data can be selected according to actual needs.

In the embodiment of the present application, the asset-liability ratio equals total liabilities divided by total assets, and the asset liquidity ratio equals current assets divided by current liabilities. The consistent actor relationship data, group relationship data, actual control relationship data, equity control relationship data, and guarantee relationship data are binary data. Illustratively, the consistent actor relationship data is determined as follows: if a consistent actor relationship exists between the two nodes in the graph unit, the consistent actor relationship data is 1; otherwise it is 0. The group relationship data, actual control relationship data, equity control relationship data, and guarantee relationship data are determined in the same way, which is not described again in the embodiments of the present application.

Illustratively, if node A and node B in a graph unit have a stock control relationship and a guarantee relationship, the stock control proportion data is the corresponding shareholding proportion and the guarantee amount data is the corresponding guarantee amount; if node A and node B have neither relationship, the stock control proportion data is 0 and the guarantee amount data is 0. The guarantee amount is expressed in units of ten thousand yuan. For example, if node B holds 25% of the shares of node A, the stock control proportion data equals 25%. Illustratively, if the association relation data in the unit data is arranged in the order consistent actor relationship data, group relationship data, actual control relationship data, equity control relationship data, guarantee relationship data, stock control proportion data, guarantee amount data, the unit data of graph unit A may be {1, 0, 0, 0, 1, 0, 5}, which indicates that the two nodes in graph unit A have a consistent actor relationship and a guarantee relationship, and that the guarantee amount is 50,000 yuan (5 in units of ten thousand yuan).
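The ordering of the association relation data described above can be illustrated with a minimal encoding sketch. The function name `encode_relation_data` and the string keys are hypothetical and chosen only for this illustration; the field order follows the text.

```python
# Binary relationship flags, in the order given in the text (names are hypothetical).
RELATION_FLAGS = (
    "consistent_actor",
    "group",
    "actual_control",
    "equity_control",
    "guarantee",
)


def encode_relation_data(relations, stock_control_ratio=0, guarantee_amount=0):
    """Return the seven-element relation vector for one graph unit.

    relations: set of relationship names present between the two nodes.
    stock_control_ratio: shareholding fraction (0 when there is no stock control).
    guarantee_amount: amount in units of ten thousand yuan (0 when no guarantee).
    """
    unknown = set(relations) - set(RELATION_FLAGS)
    if unknown:
        raise ValueError(f"unknown relationship(s): {sorted(unknown)}")
    flags = [1 if name in relations else 0 for name in RELATION_FLAGS]
    return flags + [stock_control_ratio, guarantee_amount]


unit_a = encode_relation_data({"consistent_actor", "guarantee"}, guarantee_amount=5)
print(unit_a)  # [1, 0, 0, 0, 1, 0, 5]
```

The printed vector matches the example unit data {1, 0, 0, 0, 1, 0, 5} given for the graph unit above.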

As an alternative embodiment, after the association relation knowledge graph containing each customer to be tested is constructed and the unit data of the graph units is acquired, and before the risk conduction coefficient prediction model is trained, the method further includes:

firstly, analyzing the correlation between each data in the unit data of the map unit and the risk by using a preset correlation analysis algorithm to obtain a correlation value corresponding to each data in the unit data.

And secondly, selecting data with the correlation value larger than the correlation threshold value from the unit data to form screened unit data of the map unit.

In the embodiment of the present application, the correlation analysis algorithm includes one or more of the following: the Pearson correlation coefficient algorithm, a regularization method, a model-based feature selection method, etc. A model-based feature selection method applies a machine learning model to feature selection, where the machine learning model may be a regression model, a Support Vector Machine (SVM) model, a decision tree model, a random forest model, or the like. Taking as an example a correlation analysis algorithm that comprises all three, the Pearson correlation coefficient algorithm, the regularization method, and the model-based feature selection method are applied in that preset order, each analyzing the correlation between the data in the unit data and the risk and screening the data based on the resulting correlation values.

Continuing with the above example, the Pearson correlation coefficient algorithm analyzes the correlation between each data item in the unit data and the risk to obtain a correlation value for each data item, and the data items whose correlation value is larger than the correlation threshold are selected from the unit data to form first selected unit data. The regularization method then analyzes the correlation between each data item in the first selected unit data and the risk to obtain a correlation value for each of those data items, and the data items whose correlation value is larger than the correlation threshold are selected from the first selected unit data to form second selected unit data. Finally, the model-based feature selection method analyzes the correlation between each data item in the second selected unit data and the risk to obtain a correlation value for each of those data items, and the data items whose correlation value is larger than the correlation threshold are selected from the second selected unit data to form third selected unit data, which is the screened unit data of the graph unit. The correlation threshold can be set according to actual needs.

In the embodiment of the application, the unit data of the graph units is screened successively by different correlation analysis algorithms, so that the screened unit data is highly correlated with risk. Inputting the screened unit data into the trained risk conduction coefficient prediction model yields a more accurate risk conduction coefficient, which improves the accuracy of customer state detection.

In the embodiment of the application, the unit data of each graph unit in the association relation knowledge graph is obtained, and the correlation analysis algorithm analyzes the correlation between each data item in the unit data and the risk based on the unit data of a plurality of graph units to obtain the correlation value of each data item. The process of analyzing the correlation between data and risk with the Pearson correlation coefficient algorithm, a regularization method, or a model-based feature selection method to obtain a correlation value is prior art and is not described in detail in the embodiments of the present application.
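The first screening stage can be sketched as follows. This is a minimal sketch of the Pearson stage only; the later stages would apply the same keep-above-threshold filter with their own scores. The function names, the sample columns, and the use of the absolute value of the coefficient are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def screen_features(columns, labels, threshold):
    """Keep the data items whose |correlation| with the risk label exceeds threshold.

    Taking the absolute value is an assumption; the text only says "larger than".
    """
    return [
        name for name, values in columns.items()
        if abs(pearson(values, labels)) > threshold
    ]


# Hypothetical unit data collected over six graph units, one list per data item.
columns = {
    "guarantee": [1, 1, 0, 0, 1, 0],
    "group": [1, 0, 1, 0, 1, 0],
    "actual_control": [0, 0, 0, 0, 0, 0],
}
labels = [1, 1, 0, 0, 1, 0]  # 1 = default, 0 = non-default
selected = screen_features(columns, labels, threshold=0.5)
```

Each column holds one data item across many graph units, which matches the statement that the correlation is analyzed based on the unit data of a plurality of graph units.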

As an alternative embodiment, after obtaining the screened unit data of the map unit, the method further comprises:

firstly, analyzing the similarity degree between any one of the screened unit data of the map unit and other data except the data by using a preset data analysis algorithm to obtain a similarity value between any two data of the screened unit data.

And secondly, merging the two data with the similarity value larger than the similarity threshold value in the screened unit data to obtain merged unit data of each map unit.

In the embodiment of the application, the data analysis algorithm may be a Linear Discriminant Analysis (LDA) algorithm, and the similarity threshold may be set according to actual needs. The LDA algorithm analyzes the similarity between the data in the screened unit data based on the unit data of the plurality of map units in the association relation knowledge map, for example, if the screened unit data includes the asset liability ratio, the asset flowing ratio, the actor relationship data, and the stock control ratio data, the LDA algorithm analyzes the similarity between the asset liability ratio and the asset flowing ratio, the actor relationship data, and the stock control ratio data, analyzes the similarity between the asset flowing ratio and the actor relationship data, and the stock control ratio data, and analyzes the similarity between the actor relationship data and the stock control ratio data, and combines the two data with the similarity value larger than the similarity threshold value to obtain the combined unit data of the map units. The LDA algorithm analyzes the correlation between the data in the filtered unit data and the data to obtain the correlation value of the data is the prior art, and this is not described in detail in the embodiments of the present application.
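The pairwise compare-and-merge step can be illustrated with a rough sketch. Note the substitutions: the similarity measure below is the absolute Pearson correlation, used as a stand-in and not the LDA algorithm named in the text, and merging by element-wise mean is a hypothetical rule, since the text does not say how two data items are combined. All names and sample values are illustrative.

```python
import math
from itertools import combinations


def similarity(xs, ys):
    """Stand-in similarity: absolute Pearson correlation (NOT the LDA of the text)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / math.sqrt(vx * vy) if vx and vy else 0.0


def merge_similar(columns, threshold):
    """Single pass over all pairs; merge a pair whose similarity exceeds threshold.

    Hypothetical merge rule: replace both columns by their element-wise mean.
    """
    merged = dict(columns)
    for a, b in combinations(list(columns), 2):
        if a in merged and b in merged and similarity(merged[a], merged[b]) > threshold:
            mean = [(x + y) / 2 for x, y in zip(merged[a], merged[b])]
            del merged[a], merged[b]
            merged[f"{a}+{b}"] = mean
    return merged


# Hypothetical screened unit data over four graph units.
screened = {
    "equity_control": [1, 0, 1, 0],
    "stock_control_ratio": [0.5, 0.0, 0.25, 0.0],
    "guarantee": [0, 1, 1, 0],
}
merged_unit_data = merge_similar(screened, threshold=0.8)
```

In this toy data the equity control flag and the stock control ratio move together, so they are merged into one column while the guarantee flag is kept separately.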

As an alternative embodiment, the training process of the risk conduction coefficient prediction model is as follows:

First, sample unit data corresponding to a plurality of map units of the association relation knowledge graph is obtained. The sample unit data carries labels and comprises training unit data and test unit data; the labels comprise default and non-default.

Second, the training unit data is input into the deep learning model, and multiple rounds of training are performed on the deep learning model.

Third, after each round of training, the test unit data is input into the deep learning model to evaluate the trained model, and training is determined to be finished when the accuracy value obtained by the evaluation is greater than a preset accuracy threshold.

Fourth, the deep learning model obtained after training is taken as the risk conduction coefficient prediction model.

In the embodiment of the present application, the deep learning model may be an eXtreme Gradient Boosting (XGBoost) model. The default label means that, after node A in the graph unit has a risk event, node B in the graph unit has a risk event within a preset time period. Similarly, the non-default label means that, after node A in the graph unit has a risk event, node B in the graph unit has no risk event within the preset time period. The preset time period can be set according to actual conditions.
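The round-by-round training and evaluation described in the four steps above can be sketched as follows. To stay self-contained, the sketch swaps in a one-feature logistic model trained by gradient descent as a stand-in learner; it is not XGBoost and not the model of the application. The data, the learning rate, and the function names are hypothetical, with label 1 meaning default and 0 meaning non-default.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_one_round(w, b, xs, ys, lr):
    """One full-batch gradient step of a one-feature logistic model (stand-in learner)."""
    n = len(xs)
    errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = sum(errors) / n
    return w - lr * grad_w, b - lr * grad_b

def accuracy(w, b, xs, ys):
    """Share of samples whose predicted label (probability >= 0.5) matches the true label."""
    hits = sum((sigmoid(w * x + b) >= 0.5) == bool(y) for x, y in zip(xs, ys))
    return hits / len(xs)

def train_until_accurate(train, test, accuracy_threshold, max_rounds=1000, lr=1.0):
    """Train in rounds; stop once test accuracy exceeds the preset threshold."""
    w, b, acc = 0.0, 0.0, 0.0
    for round_no in range(1, max_rounds + 1):
        w, b = train_one_round(w, b, *train, lr)
        acc = accuracy(w, b, *test)  # evaluate on the test unit data after every round
        if acc > accuracy_threshold:
            return (w, b), round_no, acc
    return (w, b), max_rounds, acc

# Hypothetical labelled sample unit data, split into training and test parts.
train = ([-0.4, -0.3, -0.2, 0.2, 0.3, 0.4], [0, 0, 0, 1, 1, 1])
test = ([-0.35, -0.1, 0.15, 0.45], [0, 0, 1, 1])
model, rounds_used, acc = train_until_accurate(train, test, accuracy_threshold=0.9)
```

The stopping rule is the part that mirrors the text; any learner that can be trained incrementally could take the place of the stand-in.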

S103, after any client to be tested has a risk event, obtaining the risk conduction probability of the risk event of other clients to be tested except the client to be tested having the risk event according to the preset algorithm model based on the risk conduction coefficient of each map unit.

In the embodiment of the application, the preset algorithm model is an improved PageRank (web page ranking) algorithm model, and the risk conduction probability of each customer to be tested other than the one that had the risk event is obtained according to this algorithm model. For example, if the association relation knowledge graph includes a first node, a second node, a third node, and a fourth node, then after a risk event occurs at the first node, the risk conduction probabilities of a risk event occurring at the second node, the third node, and the fourth node, which have an association relation with the first node, can be obtained.

As an alternative embodiment, the formula of the preset algorithm model is as follows:

wherein N is the number of nodes in the association relation knowledge graph; PR_i(k) is the risk conduction probability of the i-th customer to be tested; PR_j(k-1) is the risk conduction probability of the j-th customer to be tested; α_ji is the risk conduction coefficient of the graph unit formed by the j-th customer to be tested, the i-th customer to be tested, and the edge pointing from the j-th customer to be tested to the i-th customer to be tested; s is a scale constant with 0 ≤ s ≤ 1; and i and j are positive integers with 1 ≤ i, j ≤ N.
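A form of the update that is consistent with these symbol definitions and with the arithmetic given for the fifth node of Fig. 2, where the term (1 - 0.85) × 0.2 appears with N = 5, would be the following. The uniform term (1 - s)/N is inferred from that arithmetic and should be read as an assumption, not as a quotation of the formula of the application.

```latex
PR_i(k) = s \sum_{j=1}^{N} \alpha_{ji}\, PR_j(k-1) + \frac{1 - s}{N}
```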

For example, taking the knowledge graph of Fig. 2, if a risk event occurs at the first node, the risk conduction probabilities of a risk event occurring at the second node, the third node, the fourth node, and the fifth node are obtained in turn. Taking the second node as an example, j = 1, 2, 3, 4, 5 and i = 2; the value of s is set to 0.85, and PR₁(k-1) of the first node is set to 1. As can be seen from Fig. 2, N = 5, α₁₂ = 0.7, α₂₂ = 0, α₃₂ = 0, α₄₂ = 0, and α₅₂ = 0, so PR₂(k) = 0.625; that is, after the risk event occurs at the first node, the risk conduction probability of the second node is 0.625. Continuing with the third node, Fig. 2 gives α₁₃ = 0, α₂₃ = 0.3, α₃₃ = 0, α₄₃ = 0, and α₅₃ = 0; with PR₂(k) = 0.625, PR₃(k) = 0.189, that is, after the risk event occurs at the first node, the risk conduction probability of the third node is 0.189. The value of s can be set according to actual needs. The risk conduction probability of the fourth node is calculated as PR₄(k) = 0.37. For the fifth node, PR₅(k) = 0.85 × (0.2 × 0.625 + 0.6 × 0.37) + (1 - 0.85) × 0.2 = 0.325; that is, after the risk event occurs at the first node, the risk conduction probability of the fifth node is 0.325.
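The arithmetic of this example can be reproduced with the short sketch below. Several values are assumptions inferred from the numbers in the text rather than read from Fig. 2: the coefficient from the first to the fourth node (0.4) is back-computed from PR₄(k) = 0.37, the coefficient from the second to the third node (0.3) from PR₃(k) = 0.189, and the coefficients into the fifth node (0.2 and 0.6) are taken from the PR₅(k) expression. The sketch also assumes a single pass in node order that reuses values already computed in that pass, which is what the example does when it uses PR₂(k) to obtain PR₃(k). The function name is hypothetical.

```python
def risk_conduction_probabilities(alpha, n_nodes, source, s=0.85):
    """Single pass over nodes 1..n_nodes, reusing values computed earlier in the pass."""
    pr = {source: 1.0}  # the node with the risk event starts at probability 1
    for i in range(1, n_nodes + 1):
        if i == source:
            continue
        # Weighted inflow from every node j that has an edge j -> i.
        inflow = sum(coeff * pr.get(j, 0.0)
                     for (j, target), coeff in alpha.items() if target == i)
        pr[i] = s * inflow + (1 - s) / n_nodes
    return pr

# Directed edges (j, i) with their risk conduction coefficients; missing pairs are 0.
alpha = {(1, 2): 0.7, (1, 4): 0.4, (2, 3): 0.3, (2, 5): 0.2, (4, 5): 0.6}
pr = risk_conduction_probabilities(alpha, n_nodes=5, source=1)
rounded = {node: round(p, 3) for node, p in pr.items()}
```

With these inputs the rounded values for nodes two to five are 0.625, 0.189, 0.37, and 0.325, matching the figures quoted in the example.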

And S104, determining at least one risk conduction path based on the risk conduction probability of each client to be tested, wherein the first node of each risk conduction path is the client to be tested with the risk event.

In the embodiment of the application, a risk probability threshold is set, and a risk conduction path is determined according to the risk probability threshold and the risk conduction probability of each node. Specifically, a node with a risk conduction probability greater than or equal to a risk probability threshold is used as a target node, and a customer to be tested with a risk event is used as a first node to generate a risk conduction path.

Continuing with the example in S103 and the risk conduction probability of each customer to be tested, that is, of each node in Fig. 2: after the first node has a risk event, if the preset risk probability threshold is 0.3, the risk conduction probabilities of the second node, the fourth node, and the fifth node are greater than the threshold, and the risk conduction probability of the third node is less than the threshold. The second node, the fourth node, and the fifth node are therefore target nodes, and the generated risk conduction paths are: first node → second node; first node → fourth node; first node → second node → fifth node; first node → fourth node → fifth node.
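One way to enumerate such paths is a depth-first walk from the node with the risk event that only steps onto target nodes. The sketch below is an illustration under assumptions: the edge list is inferred from the paths and coefficients mentioned in the example, not read from Fig. 2, and the function name is hypothetical.

```python
def risk_conduction_paths(edges, probabilities, source, threshold):
    """List every path from the source that passes only through target nodes."""
    targets = {n for n, p in probabilities.items() if n != source and p >= threshold}
    successors = {}
    for start, end in edges:
        successors.setdefault(start, []).append(end)

    paths = []

    def extend(path):
        for nxt in successors.get(path[-1], []):
            if nxt in targets and nxt not in path:  # skip non-targets and cycles
                new_path = path + [nxt]
                paths.append(new_path)
                extend(new_path)

    extend([source])
    return paths

# Assumed directed edges and the probabilities from the worked example.
edges = [(1, 2), (1, 4), (2, 3), (2, 5), (4, 5)]
probabilities = {1: 1.0, 2: 0.625, 3: 0.189, 4: 0.37, 5: 0.325}
paths = risk_conduction_paths(edges, probabilities, source=1, threshold=0.3)
```

The third node falls below the threshold, so no path ends at it or passes through it.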

And S105, executing business early warning operation aiming at each risk conduction path.

In the embodiment of the application, after at least one risk conduction path is determined, a business early warning operation is executed for each risk conduction path. The business early warning operation comprises the steps of generating corresponding early warning signals according to the risk conduction probability, and sending the generated early warning signals and the risk conduction path to a bank detection system, so that the bank detection system carries out business processing on customers on the basis of the early warning signals and the risk conduction path.

Specifically, the risk conduction probability is divided into stages based on the risk probability threshold, a different early warning signal is generated for each stage, and different business processing is performed on the customer according to the signal. For example, if the risk probability threshold is 0.3, the risk conduction probability may be divided into three stages: the first stage is 0.3 or more and less than 0.6, the second stage is 0.6 or more and less than 0.8, and the third stage is 0.8 or more and less than 1.0. The early warning signal may be set to green for the first stage, yellow for the second stage, and red for the third stage. The business processing corresponding to the green signal may be increasing the detection frequency for the customer, that corresponding to the yellow signal may be requiring additional guarantee funds from the customer, and that corresponding to the red signal may be recovering loans in advance. Continuing with the example in S104, if the risk conduction probability of the second node is 0.625, that of the fourth node is 0.37, and that of the fifth node is 0.325, then the second node corresponds to the yellow early warning signal, and the fourth node and the fifth node correspond to the green early warning signal; the corresponding business processing is then performed on each node according to its signal. The business processing may take various forms and may be set according to actual needs in a specific implementation; the embodiment of the present application gives only an exemplary description and is not specifically limited in this respect.
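The staging in this example amounts to a simple threshold lookup, sketched below. The function and dictionary names are hypothetical, the action strings paraphrase the example, and treating every probability of 0.8 or more (including exactly 1.0) as red is an assumption made to keep the mapping total.

```python
def warning_signal(probability, threshold=0.3):
    """Map a risk conduction probability to the example's early warning signal."""
    if probability < threshold:
        return None  # below the risk probability threshold: no warning
    if probability < 0.6:
        return "green"
    if probability < 0.8:
        return "yellow"
    return "red"

# Business processing per signal, paraphrasing the example in the text.
ACTIONS = {
    "green": "increase detection frequency",
    "yellow": "require additional guarantee funds",
    "red": "recover loans in advance",
}

node_probabilities = {2: 0.625, 4: 0.37, 5: 0.325}
signals = {node: warning_signal(p) for node, p in node_probabilities.items()}
actions = {node: ACTIONS[signal] for node, signal in signals.items()}
```

The stage boundaries and actions are configuration in practice, which is in line with the statement that business processing may be set according to actual needs.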

According to the method provided by the embodiment of the application for calculating risk conduction of a bank to the public customer group based on a knowledge graph, an association relation knowledge graph containing each customer to be tested is constructed; the risk conduction coefficient of each graph unit is obtained from the unit data of that graph unit according to the risk conduction coefficient prediction model; based on the risk conduction coefficients and the preset algorithm model, the risk conduction probability of each other customer to be tested is obtained after any customer to be tested has a risk event; at least one risk conduction path is determined based on the risk conduction probabilities, the first node of each risk conduction path being the customer to be tested that had the risk event; and a business early warning operation is executed for each risk conduction path. In this way, after any customer to be tested has a risk event, the risk conduction probability of the other customers that have an association relation with that customer can be determined, the risk conduction paths can be obtained from those probabilities, and the business early warning operation can be executed based on the paths. As a result, after a risk event occurs at a certain enterprise, the bank can determine the degree to which the credit of other associated enterprises is affected and perform business early warning operations on them, which improves the accuracy with which the bank determines enterprise risk and reduces unnecessary losses for the bank.

Based on the same inventive concept, the embodiment of the application also provides a device for calculating risk conduction of a bank to the public customer group based on a knowledge graph, which corresponds to the method for calculating risk conduction of a bank to the public customer group based on a knowledge graph.

Example two

The embodiment of the application provides a device for conducting and calculating the risk of a public customer group by a bank based on a knowledge graph, and referring to a schematic structural diagram of the device for conducting and calculating the risk of the public customer group by the bank based on the knowledge graph shown in fig. 5, the device comprises:

the map construction module 501 is configured to construct an incidence relation knowledge map including each customer to be tested by using each customer to be tested as a node and using an incidence relation between the customer to be tested and another customer to be tested as an edge, where the edge and two nodes corresponding to the edge form a map unit of the incidence relation knowledge map, and each edge corresponds to one direction;

a coefficient determining module 502, configured to, for each map unit, obtain unit data of the map unit, and input the unit data to the trained risk conduction coefficient prediction model to obtain a risk conduction coefficient of the map unit.

And a risk conduction probability determining module 503, configured to obtain, based on the risk conduction coefficient of each map unit and according to a preset algorithm model, the risk conduction probability of the risk event occurring in the other clients to be tested except the client to be tested who has the risk event after the risk event occurs in any client to be tested.

A risk conduction path determination module 504, configured to determine at least one risk conduction path based on the risk conduction probability of each customer to be tested, wherein the first node of each risk conduction path is the customer to be tested that had the risk event.

And an operation module 505, configured to execute a business warning operation for each risk conduction path.

As an optional embodiment, the apparatus further comprises a model determining module, wherein the model determining module obtains the risk conductance prediction model by using the following steps:

As an alternative embodiment, after constructing a knowledge graph containing the association relationship of each customer to be tested and acquiring unit data of graph units, before training the risk conductance prediction model, the apparatus further includes:

As an alternative embodiment, after obtaining the filtered unit data of the map unit, the apparatus further comprises:

and the similarity analysis module is used for analyzing the similarity between any one of the screened unit data of the map unit and other data except the data by using a preset data analysis algorithm to obtain a similarity value between any two data in the screened unit data.

As an optional embodiment, the formula of the preset algorithm model in the probability determination module is:

The device for calculating risk conduction of a bank to the public customer group based on a knowledge graph provided by the embodiment of the application has the same technical features as the method provided by embodiment one, so it can solve the same technical problems and achieve the same technical effects.

Example three

Based on the same technical concept, the embodiment of the application also provides an electronic device. Referring to Fig. 6, a schematic structural diagram of an electronic device 600 provided in the embodiment of the present application, the device includes a processor 601, a memory 602, and a bus 603. The memory 602 is used for storing execution instructions and includes an internal memory 6021 and an external memory 6022. The internal memory 6021 temporarily stores the operation data of the processor 601 and the data exchanged with the external memory 6022, such as a hard disk; the processor 601 exchanges data with the external memory 6022 through the internal memory 6021. When the electronic device 600 operates, the processor 601 communicates with the memory 602 through the bus 603, so that the processor 601 executes the following instructions:

and determining the state of each client to be tested based on the risk conduction probability of the client to be tested.

In one possible design, the instructions that may be executed by the processor 601 further include:

the training process of the risk conduction coefficient prediction model comprises the following steps:

the formula of the preset algorithm model is as follows:

Example four

The present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it performs the steps of the method for calculating risk conduction of a bank to the public customer group based on a knowledge graph as described in any of the above embodiments.

Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is executed, the steps of the method for calculating risk conduction of a bank to the public customer group based on a knowledge graph can be performed, so that the accuracy with which the bank determines enterprise risk is improved.

The computer program product for performing the method for calculating risk conduction of a bank to the public customer group based on a knowledge graph provided in the embodiment of the present application includes a computer-readable storage medium storing non-volatile program code executable by a processor. The instructions included in the program code may be used to execute the method described in the foregoing method embodiment; for the specific implementation, refer to the method embodiment, which is not repeated here.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, and are used for illustrating the technical solutions of the present application, but not limiting the same, and the scope of the present application is not limited thereto, and although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A bank risk conduction measurement and calculation method for a public customer group based on a knowledge graph is characterized by comprising the following steps:

2. The method of claim 1, wherein the risk conductance prediction model is trained by:

3. The method of claim 2, wherein after constructing the knowledge graph containing the association relationships of each customer to be tested, and obtaining the unit data of the graph unit, and before training the risk conductance prediction model, the method further comprises:

4. The method of claim 3, wherein after obtaining the screened unit data for the map unit, the method further comprises:

5. The method of claim 1, wherein the predetermined algorithm model has the formula:

wherein N is the number of nodes in the association relation knowledge graph; PR_i(k) is the risk conduction probability of the i-th customer to be tested; PR_j(k-1) is the risk conduction probability of the j-th customer to be tested; α_ji is the risk conduction coefficient of the graph unit formed by the j-th customer to be tested, the i-th customer to be tested, and the edge pointing from the j-th customer to be tested to the i-th customer to be tested; s is a scale constant with 0 ≤ s ≤ 1; and i and j are positive integers with 1 ≤ i, j ≤ N.

6. A device for conducting and calculating risk of public and guest groups by banks based on knowledge graph, the device comprising:

a map construction module, configured to construct an association relation knowledge graph containing each customer to be tested by taking each customer to be tested as a node and taking the association relation between the customer to be tested and other customers to be tested as an edge, wherein the edge and the two nodes corresponding to the edge form a map unit of the association relation knowledge graph, and each edge corresponds to one direction;

7. The apparatus of claim 6, further comprising a model determination module that derives the risk conductance prediction model by:

8. The apparatus of claim 7, wherein after constructing the knowledge graph containing the association relationships of each customer to be tested, and obtaining the unit data of the graph unit, and before training the risk conductance prediction model, the apparatus further comprises:

9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, and the machine-readable instructions, when executed by the processor, performing the steps of the method for calculating risk conduction of a bank to the public customer group based on a knowledge graph as claimed in any one of claims 1 to 5.

10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method for calculating risk conduction of a bank to the public customer group based on a knowledge graph as claimed in any one of claims 1 to 5.