CN117909926A

CN117909926A - Risk identification method and device, storage medium and electronic equipment

Info

Publication number: CN117909926A
Application number: CN202410135183.1A
Authority: CN
Inventors: 韩少帅; 魏政
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Current assignee: Alipay Hangzhou Information Technology Co Ltd
Priority date: 2024-01-31
Filing date: 2024-01-31
Publication date: 2024-04-19

Abstract

This specification discloses a risk identification method and device, a storage medium, and electronic equipment. According to the time information in a service directed graph, the nodes whose node features need to be input into the feature aggregation layer are determined and feature aggregation is performed, so that the aggregated features of the user to be identified include the node features of all nodes to be aggregated within a preset adjacency distance. Risk identification is then performed according to the aggregated features of the user to be identified, which improves the accuracy of the risk identification result.

Description

Risk identification method and device, storage medium and electronic equipment

Technical Field

The present disclosure relates to the field of machine learning, and in particular, to a risk identification method, apparatus, storage medium, and electronic device.

Background

With the development of machine learning technology, models trained by machine learning can be used for risk control of the various services that users execute, where the training samples used to train the model may contain personal privacy data. For example, when user B transacts with user A, the risk control model detects whether user A is abnormal according to user A's transaction information and outputs a risk identification result. If the result indicates that user A is abnormal, user B can be prompted, which reduces user B's risk of property loss. Here, the transaction information of user A consists of information on transactions that user A conducts directly with other users.

However, whether a user is abnormal also depends on the users who transact with that user indirectly. Judging whether the user is abnormal only from the user's own transaction information may therefore yield an inaccurate detection result.

Based on this, the present specification provides a risk identification method.

Disclosure of Invention

The present disclosure provides a risk identification method, apparatus, storage medium, and electronic device, so as to at least partially solve the foregoing problems in the prior art.

The technical scheme adopted in the specification is as follows:

The specification provides a risk identification method, wherein a risk identification model comprises a feature extraction layer, a feature aggregation layer and a risk identification layer, and the risk identification method comprises the following steps:

determining a user to be identified;

acquiring a pre-constructed service directed graph in which the user to be identified is located, wherein nodes in the service directed graph represent users, edges represent service relationships among the users, and the nodes contain time information of the users executing services;

inputting the service directed graph into the feature extraction layer to obtain the node features of each node in the service directed graph output by the feature extraction layer;

determining nodes whose adjacency distance to the node corresponding to the user to be identified is not greater than a preset adjacency distance as nodes to be aggregated, and inputting, according to the time information, the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified into the feature aggregation layer, so that the feature aggregation layer performs feature aggregation and outputs the aggregated features of the node corresponding to the user to be identified, wherein the aggregated features comprise the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified; and

inputting the aggregated features into the risk identification layer to obtain the risk identification result of the user to be identified output by the risk identification layer.
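The steps above can be read as a composition of three model layers plus a node selection step. The following Python sketch shows only that control flow under stated assumptions: the function `identify_risk`, its parameter names, and the toy stand-ins passed to it are hypothetical and are not taken from the specification, and a real model would use learned layers instead of the lambdas shown here.

```python
def identify_risk(graph, target, extract, select, aggregate, classify):
    """Run the four processing steps for one user to be identified."""
    node_features = extract(graph)            # feature extraction layer
    to_aggregate = select(graph, target)      # nodes within the preset adjacency distance
    aggregated = aggregate(node_features, to_aggregate, target)  # feature aggregation layer
    return classify(aggregated)               # risk identification layer


# Toy graph (hypothetical): user "A" transfers to the user to be identified, "T".
graph = {"A": ["T"], "T": []}

result = identify_risk(
    graph,
    "T",
    # Stand-in feature: the number of outgoing edges of each node.
    extract=lambda g: {node: [float(len(out))] for node, out in g.items()},
    # Stand-in selection: every node other than the target.
    select=lambda g, t: [node for node in g if node != t],
    # Stand-in aggregation: sum of the selected features plus the target's own feature.
    aggregate=lambda f, nodes, t: [sum(f[n][0] for n in nodes) + f[t][0]],
    # Stand-in classifier: a fixed threshold.
    classify=lambda feature: "abnormal" if feature[0] > 0.5 else "normal",
)
```

Passing the layers in as callables keeps the sketch independent of any particular model library.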

Optionally, inputting, according to the time information, the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified into the feature aggregation layer specifically includes:

determining, in the service directed graph and according to a preset graph traversal algorithm, a service subgraph formed by the nodes to be aggregated and the node corresponding to the user to be identified;

determining an initial aggregation node in the service subgraph according to the time information; and

determining, according to the initial aggregation node, a first node in the service subgraph that performs feature aggregation with the initial aggregation node; inputting the node features of the initial aggregation node and the node features of the first node into the feature aggregation layer to obtain an intermediate aggregation feature output by the feature aggregation layer; determining the current aggregation feature of the initial aggregation node and the current aggregation feature of the first node according to the intermediate aggregation feature; and re-determining the first node as the initial aggregation node, until the node features of the node corresponding to the user to be identified have been input into the feature aggregation layer.

Optionally, according to the initial aggregation node, determining a first node performing feature aggregation with the initial aggregation node in the service subgraph, which specifically includes:

in the service subgraph, determining a path taking the initial aggregation node as a starting point and the node corresponding to the user to be identified as an end point as a main path according to the preset graph traversal algorithm;

taking the direction from the starting point to the end point of the main path as an aggregation direction;

and according to the aggregation direction, determining a first node which performs feature aggregation with the initial aggregation node in the main path.
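The pairwise aggregation along the main path can be sketched as follows. This is a minimal illustration and not the specification's implementation: the element-wise mean stands in for the learned feature aggregation layer, the element-wise sum stands in for the fusion step, and the sketch assumes that a node carries its current aggregation feature forward when it becomes the next initial aggregation node. All function names are hypothetical.

```python
def aggregate_layer(a, b):
    # Stand-in for the feature aggregation layer: element-wise mean of two features.
    return [(x + y) / 2 for x, y in zip(a, b)]


def fuse(intermediate, feature):
    # Stand-in for fusion: element-wise sum of the intermediate feature and a node feature.
    return [x + y for x, y in zip(intermediate, feature)]


def aggregate_along_main_path(main_path, features):
    """Walk the main path from its starting point to the node of the user to be identified."""
    current = dict(features)  # leave the caller's feature table untouched
    for start, first in zip(main_path, main_path[1:]):
        intermediate = aggregate_layer(current[start], current[first])
        # Both right-hand sides are evaluated before either entry is overwritten.
        current[start], current[first] = (
            fuse(intermediate, current[start]),
            fuse(intermediate, current[first]),
        )
    return current


# Hypothetical main path A -> B -> T, where T is the user to be identified.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "T": [2.0, 2.0]}
result = aggregate_along_main_path(["A", "B", "T"], features)
```

After the walk, the entry for `T` depends on the features of every node on the path, which is the property the aggregation direction is meant to guarantee.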

Optionally, paths other than the main path are determined as branch paths;

when the initial aggregation node has a branch path, inputting the node features of the initial aggregation node and the node features of the first node into the feature aggregation layer specifically includes:

for each branch path, determining, according to the preset graph traversal algorithm, the node in the branch path at the maximum adjacency distance from the initial aggregation node, to obtain an intermediate node;

taking the initial aggregation node as a first branch aggregation starting point and the direction from the first branch aggregation starting point to the intermediate node as a first branch aggregation direction, determining, according to the first branch aggregation direction, the node that performs feature aggregation with the first branch aggregation starting point, performing feature aggregation, and re-determining that node as the first branch aggregation starting point, until feature aggregation has been performed on the intermediate node, to obtain the current aggregation feature of each node on the branch path; and

determining, according to a second branch aggregation direction, the node that performs feature aggregation with the intermediate node, performing feature aggregation, and re-determining that node as the intermediate node, until feature aggregation has been performed on the initial aggregation node, to obtain the current aggregation feature of the initial aggregation node, wherein the current aggregation feature of the initial aggregation node comprises the node features of all nodes in the branch path; and re-determining the current aggregation feature of the initial aggregation node as the node feature of the initial aggregation node, and inputting the re-determined node feature of the initial aggregation node and the node feature of the first node into the feature aggregation layer.
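The two passes over a branch path can be sketched as below. The sketch makes several assumptions that the text does not state: the second branch aggregation direction is taken to run from the intermediate node back to the initial aggregation node, the element-wise mean stands in for the feature aggregation layer, and the element-wise sum stands in for fusion. The names `aggregate_pair` and `aggregate_branch` are hypothetical.

```python
def aggregate_pair(current, a, b):
    """Aggregate two adjacent nodes and update both current aggregation features in place."""
    # Stand-in aggregation layer: element-wise mean of the two features.
    intermediate = [(x + y) / 2 for x, y in zip(current[a], current[b])]
    # Stand-in fusion: add the intermediate feature to each node's own feature.
    current[a] = [i + x for i, x in zip(intermediate, current[a])]
    current[b] = [i + y for i, y in zip(intermediate, current[b])]


def aggregate_branch(branch_path, features):
    """branch_path runs from the initial aggregation node to the intermediate node."""
    current = dict(features)
    # First pass: from the initial aggregation node out to the intermediate node.
    for a, b in zip(branch_path, branch_path[1:]):
        aggregate_pair(current, a, b)
    # Second pass (assumed direction): from the intermediate node back to the start.
    backward = branch_path[::-1]
    for a, b in zip(backward, backward[1:]):
        aggregate_pair(current, a, b)
    return current


# Hypothetical branch S -> X -> M, where S is the initial aggregation node
# and M is the intermediate node at the maximum adjacency distance.
features = {"S": [1.0], "X": [2.0], "M": [4.0]}
result = aggregate_branch(["S", "X", "M"], features)
```

The value stored for `S` after the second pass would then replace its node feature before `S` is aggregated with the first node on the main path.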

Optionally, a one-degree neighbor node of the initial aggregation node in the service subgraph is determined as the first node that performs feature aggregation with the initial aggregation node.

Optionally, determining the current aggregation characteristic of the initial aggregation node according to the intermediate aggregation characteristic, and determining the current aggregation characteristic of the first node specifically includes:

fusing the intermediate aggregation characteristics with the node characteristics of the initial aggregation nodes to obtain the current aggregation characteristics of the initial aggregation nodes;

and fusing the intermediate aggregation characteristic with the node characteristic of the first node to obtain the current aggregation characteristic of the first node.

Optionally, when there is more than one intermediate aggregation feature, determining the current aggregation feature of the initial aggregation node according to the intermediate aggregation feature specifically includes:

carrying out summation pooling on all the intermediate aggregation features to obtain intermediate aggregation features after summation pooling;

and fusing the sum-pooled intermediate aggregation feature with the node features of the initial aggregation node to obtain the current aggregation feature of the initial aggregation node.
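Sum pooling over several intermediate aggregation features can be illustrated with a short sketch. Features are plain lists of floats here, and fusion is assumed to be element-wise addition, which the text does not specify; the names `sum_pool` and `fuse_with_node` are hypothetical.

```python
def sum_pool(features):
    """Element-wise sum over a list of equally sized feature vectors."""
    return [sum(values) for values in zip(*features)]


def fuse_with_node(pooled, node_feature):
    # Assumed fusion: element-wise addition of the pooled feature and the node feature.
    return [p + n for p, n in zip(pooled, node_feature)]


# Three hypothetical intermediate aggregation features for one initial aggregation node.
intermediate_features = [[0.5, 1.0], [1.5, 2.0], [1.0, 0.0]]
pooled = sum_pool(intermediate_features)
current_feature = fuse_with_node(pooled, [1.0, 1.0])
```

Because summation does not depend on the order of its inputs, the pooled feature is the same regardless of the order in which the intermediate aggregation features were produced.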

Optionally, training the risk identification model specifically includes:

Obtaining a sample service directed graph, wherein nodes in the sample service directed graph represent users, edges represent service relations among the users, and the nodes contain time information for executing the service by the users; determining a risk result of a target node in the sample service directed graph as a label of the target node;

inputting the sample service directed graph into the feature extraction layer to obtain the node features of each node in the sample service directed graph output by the feature extraction layer;

Determining nodes except the target node in the sample service directed graph as sample nodes to be aggregated, and inputting node characteristics of the sample nodes to be aggregated and node characteristics of the target node into the characteristic aggregation layer according to time information in the sample service directed graph so as to enable the characteristic aggregation layer to perform characteristic aggregation and output aggregation characteristics of the target node;

Inputting the aggregation characteristics of the target nodes into the risk identification layer to obtain a predicted risk identification result of the target nodes, which is output by the risk identification layer;

and training the risk identification model according to the predicted risk identification result and the label.

The present specification provides a risk identification device, wherein a risk identification model comprises a feature extraction layer, a feature aggregation layer and a risk identification layer, and the device comprises:

a to-be-identified user determining module, configured to determine the user to be identified;

a service directed graph acquisition module, configured to acquire a pre-constructed service directed graph in which the user to be identified is located, wherein nodes in the service directed graph represent users, edges represent service relationships among the users, and the nodes contain time information of the users executing services;

a feature extraction module, configured to input the service directed graph into the feature extraction layer to obtain the node features of each node in the service directed graph output by the feature extraction layer;

an aggregation module, configured to determine nodes whose adjacency distance to the node corresponding to the user to be identified is not greater than a preset adjacency distance as nodes to be aggregated, and to input, according to the time information, the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified into the feature aggregation layer, so that the feature aggregation layer performs feature aggregation and outputs the aggregated features of the node corresponding to the user to be identified, wherein the aggregated features comprise the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified; and

a risk identification module, configured to input the aggregated features into the risk identification layer to obtain the risk identification result of the user to be identified output by the risk identification layer.

Optionally, the aggregation module is specifically configured to determine, in the service directed graph, a service subgraph formed by the node to be aggregated and a node corresponding to the user to be identified according to a preset graph traversal algorithm; according to the time information, determining an initial aggregation node in the business subgraph; according to the initial aggregation node, determining a first node which performs feature aggregation with the initial aggregation node in the business subgraph, inputting node features of the initial aggregation node and node features of the first node into the feature aggregation layer to obtain intermediate aggregation features output by the feature aggregation layer, determining the current aggregation features of the initial aggregation node according to the intermediate aggregation features, determining the current aggregation features of the first node, and re-determining the first node as the initial aggregation node until node features of nodes corresponding to the users to be identified are input into the feature aggregation layer.

Optionally, the aggregation module is specifically configured to: determine, in the service subgraph and according to the preset graph traversal algorithm, a path that starts at the initial aggregation node and ends at the node corresponding to the user to be identified as a main path; take the direction from the starting point to the end point of the main path as an aggregation direction; and determine, according to the aggregation direction, the first node in the main path that performs feature aggregation with the initial aggregation node.

Optionally, the aggregation module is specifically configured to: determine paths other than the main path as branch paths; when the initial aggregation node has branch paths, determine, for each branch path and according to the preset graph traversal algorithm, the node in the branch path at the maximum adjacency distance from the initial aggregation node, to obtain an intermediate node; take the initial aggregation node as a first branch aggregation starting point and the direction from the first branch aggregation starting point to the intermediate node as a first branch aggregation direction, determine, according to the first branch aggregation direction, the node that performs feature aggregation with the first branch aggregation starting point, perform feature aggregation, and re-determine that node as the first branch aggregation starting point, until feature aggregation has been performed on the intermediate node, to obtain the current aggregation feature of each node on the branch path; determine, according to a second branch aggregation direction, the node that performs feature aggregation with the intermediate node, perform feature aggregation, and re-determine that node as the intermediate node, until feature aggregation has been performed on the initial aggregation node, to obtain the current aggregation feature of the initial aggregation node, which comprises the node features of all nodes in the branch path; and re-determine the current aggregation feature of the initial aggregation node as the node feature of the initial aggregation node, and input the re-determined node feature of the initial aggregation node and the node feature of the first node into the feature aggregation layer.

Optionally, the aggregation module is specifically configured to determine, in the business subgraph, a one-degree neighboring node of the initial aggregation node as a first node that performs feature aggregation with the initial aggregation node.

Optionally, the aggregation module is specifically configured to fuse the intermediate aggregation feature with a node feature of the initial aggregation node to obtain a current aggregation feature of the initial aggregation node; and fusing the intermediate aggregation characteristic with the node characteristic of the first node to obtain the current aggregation characteristic of the first node.

Optionally, the aggregation module is specifically configured to: when there is more than one intermediate aggregation feature, perform sum pooling on all the intermediate aggregation features to obtain a sum-pooled intermediate aggregation feature; and fuse the sum-pooled intermediate aggregation feature with the node features of the initial aggregation node to obtain the current aggregation feature of the initial aggregation node.

Optionally, the apparatus further comprises:

a training module, configured to: acquire a sample service directed graph, wherein nodes in the sample service directed graph represent users, edges represent service relationships among the users, and the nodes contain time information of the users executing services; determine the risk result of a target node in the sample service directed graph as the label of the target node; input the sample service directed graph into the feature extraction layer to obtain the node features of each node in the sample service directed graph output by the feature extraction layer; determine the nodes other than the target node in the sample service directed graph as sample nodes to be aggregated, and input, according to the time information in the sample service directed graph, the node features of the sample nodes to be aggregated and the node features of the target node into the feature aggregation layer, so that the feature aggregation layer performs feature aggregation and outputs the aggregated features of the target node; input the aggregated features of the target node into the risk identification layer to obtain the predicted risk identification result of the target node output by the risk identification layer; and train the risk identification model according to the predicted risk identification result and the label.

The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the risk identification method described above.

The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the risk identification method described above when executing the program.

At least one of the above technical solutions adopted in this specification can achieve the following beneficial effects:

According to the risk identification method provided by this specification, the nodes whose node features need to be input into the feature aggregation layer are determined according to the time information and are aggregated, so that the aggregated features of the user to be identified include the node features of all nodes to be aggregated within a preset adjacency distance. Risk identification is then performed according to the aggregated features of the user to be identified, which improves the accuracy of the risk identification result.

Drawings

The accompanying drawings described here are provided for a further understanding of the specification and form a part of it. The exemplary embodiments of the specification and their descriptions are used to explain the specification and do not unduly limit it. In the drawings:

Fig. 1 is a schematic flow chart of a risk identification method provided in the present specification;

Fig. 2 is a schematic diagram of a first node provided in the present specification;

Fig. 3 is a schematic diagram of a branch path provided in the present specification;

Fig. 4 is a schematic diagram of a risk identification device provided in the present specification;

Fig. 5 is a schematic diagram of the electronic device corresponding to Fig. 1 provided in the present specification.

Detailed Description

To make the objects, technical solutions and advantages of the present specification clearer, the technical solutions of the present specification are described clearly and completely below with reference to specific embodiments and the corresponding drawings. The described embodiments are only some, not all, of the embodiments of the present specification. All other embodiments obtained by one of ordinary skill in the art on the basis of the embodiments herein without creative effort fall within the scope of protection of the present application.

The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.

Fig. 1 is a schematic flow chart of a risk identification method provided in the present specification, which specifically includes the following steps:

S100: Determine the user to be identified.

When multiple users conduct various services with one another, a user can have risk identification performed on the other users with whom the user has service dealings, so as to reduce the risk of executing the services. In particular, when a user conducts fund transactions with other users, a risk identification model first detects whether the counterparty to the transaction is abnormal, in order to protect the user's property and reduce the risk of property loss. However, such a risk identification model examines only the counterparty's own transaction information and does not consider the transaction information of users who transact with the counterparty indirectly, so the risk detection result can be inaccurate. The present specification therefore provides a risk identification method. The execution subject may be a client that performs risk detection on other users with whom it has service dealings, a server that communicates with the client to obtain service information for risk detection, or another electronic device with computing capability; this specification does not limit it. For convenience of explanation, the following description takes the server as the execution subject.

The server needs to determine the user to be identified so that it can acquire a service directed graph containing the node corresponding to the user to be identified and perform risk identification according to that graph. The user to be identified may be sent by a client, namely the client of a user who has service dealings with the user to be identified.

S102: Acquire a pre-constructed service directed graph in which the user to be identified is located, wherein nodes in the service directed graph represent users, edges represent service relationships among the users, and the nodes contain time information of the users executing services.

Because the server performs risk identification according to the service directed graph, after determining the user to be identified it also acquires the pre-constructed service directed graph in which the user to be identified is located. Nodes in the service directed graph represent users, edges represent service relationships among the users, and the nodes contain time information of the users executing services.

For example, if there is a fund transaction between user A and user B, the service directed graph may be a fund-flow directed graph whose edges represent the fund-flow relationship between user A and user B. If an edge points from the node corresponding to user A to the node corresponding to user B, the edge indicates that user A transfers funds to user B, and the time information of executing the service is the time of the transfer from user A to user B.

It should be noted that, when the service directed graph is constructed, the users who have service dealings with one another may be determined and their user information acquired, where the user information includes inherent attributes of the users, such as age and gender. The corresponding service information is also acquired, which includes the time information of executing the service, the specific service content, and so on; for example, user A purchases a commodity from user B and transfers 10 yuan to user B. The service directed graph is then constructed from the user information and the service information. This specification does not limit how the service directed graph is constructed, as long as it includes the service information and the user information.
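One possible in-memory layout for such a graph is sketched below. The specification leaves the construction open, so everything here is an assumption made for illustration: the class names `UserNode` and `ServiceGraph`, the choice to record the transfer time on both endpoint nodes, and the sample ages and date are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class UserNode:
    user_id: str
    attributes: dict                                    # inherent attributes, e.g. age, gender
    service_times: list = field(default_factory=list)   # times at which the user executed a service


@dataclass
class ServiceGraph:
    nodes: dict = field(default_factory=dict)   # user_id -> UserNode
    edges: dict = field(default_factory=dict)   # payer user_id -> list of payee user_ids

    def add_user(self, user_id, **attributes):
        self.nodes[user_id] = UserNode(user_id, attributes)

    def add_transfer(self, src, dst, when):
        # Directed edge src -> dst: src transfers funds to dst at time `when`.
        self.edges.setdefault(src, []).append(dst)
        # Assumption: the time information is stored on both participating nodes.
        self.nodes[src].service_times.append(when)
        self.nodes[dst].service_times.append(when)


graph = ServiceGraph()
graph.add_user("A", age=30)
graph.add_user("B", age=41)
graph.add_transfer("A", "B", datetime(2024, 1, 19, 13, 0))
```

Keeping the timestamps on the nodes matches the statement that the nodes contain the time information; storing them on edges instead would be an equally plausible layout.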

S104: Input the service directed graph into the feature extraction layer to obtain the node features of each node in the service directed graph output by the feature extraction layer.

In one or more embodiments of the present specification, the server may perform risk identification on the user to be identified according to the service directed graph through a pre-trained risk identification model, which includes a feature extraction layer, a feature aggregation layer and a risk identification layer. After acquiring the service directed graph, the server inputs it into the feature extraction layer of the risk identification model and obtains the node features of each node in the service directed graph output by the feature extraction layer. The feature extraction layer may include a graph convolution layer or the like; this specification does not limit its specific structure, as long as it can extract the node features of the nodes.

S106: Determine nodes whose adjacency distance to the node corresponding to the user to be identified is not greater than a preset adjacency distance as nodes to be aggregated, and input, according to the time information, the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified into the feature aggregation layer, so that the feature aggregation layer performs feature aggregation and outputs the aggregated features of the node corresponding to the user to be identified, wherein the aggregated features comprise the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified.

The server may, according to the time information, directly input the node features of every node in the service directed graph, together with the node features of the node corresponding to the user to be identified, into the feature aggregation layer of the risk identification model, so that the aggregated features of the node corresponding to the user to be identified include the node features of all nodes in the service directed graph.

However, feature aggregation requires considerable computing resources. When computing resources are limited, the node features of only some of the nodes in the service directed graph may be selected for feature aggregation; that is, the node features of those nodes and of the node corresponding to the user to be identified are input into the feature aggregation layer, so that the feature aggregation layer performs feature aggregation.

When selecting the node features of some of the nodes in the service directed graph, the server may, according to the time information, select the node features of the nodes that executed services within a preset time period for feature aggregation. For example, if the user to be identified last executed a service at 13:00 on January 19, the node features of all nodes that executed services between 13:00 on January 18 and 13:00 on January 19 are determined according to the time information, and feature aggregation is performed on them.
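The time-window selection in this example can be sketched in a few lines. The 24-hour window follows the example; the function name `nodes_in_window`, the dictionary layout, the node identifiers, and the year 2024 are assumptions made only so that the sketch runs.

```python
from datetime import datetime, timedelta


def nodes_in_window(service_times, target, window=timedelta(hours=24)):
    """Return the ids of nodes that executed a service within `window`
    before the last service executed by the user to be identified."""
    end = max(service_times[target])
    start = end - window
    return sorted(
        node
        for node, times in service_times.items()
        if any(start <= t <= end for t in times)
    )


# Hypothetical time information keyed by node id.
service_times = {
    "target": [datetime(2024, 1, 19, 13, 0)],
    "u1": [datetime(2024, 1, 18, 20, 30)],   # inside the window
    "u2": [datetime(2024, 1, 17, 9, 0)],     # before the window opens
}
selected = nodes_in_window(service_times, "target")
```

The node of the user to be identified always falls inside its own window, so its features are included in the aggregation input along with those of the selected nodes.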

Of course, the nodes for feature aggregation may also be determined according to their adjacency distance to the user to be identified. The node features of nodes at a larger adjacency distance from the user to be identified are likely to be less relevant to that user and to have less influence on the accuracy of the risk identification result, so the node features of nodes at a smaller adjacency distance from the node corresponding to the user to be identified can be selected for feature aggregation.

Specifically, the server determines nodes whose adjacency distance to the node corresponding to the user to be identified is not greater than the preset adjacency distance as nodes to be aggregated, and inputs, according to the time information, the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified into the feature aggregation layer, so that the feature aggregation layer performs feature aggregation and outputs the aggregated features of the node corresponding to the user to be identified. The aggregated features comprise the node features of the nodes to be aggregated and the node features of the node corresponding to the user to be identified.

The purpose of aggregating the node features of the nodes in the service directed graph is to include those node features in the aggregated features of the node corresponding to the user to be identified, so that the risk identification result obtained from the aggregated features is more accurate.

S108: and inputting the aggregation characteristics into the risk identification layer to obtain a risk identification result of the user to be identified, which is output by the risk identification layer.

Because the aggregate feature includes the node features of the nodes in the service directed graph, the risk identification layer in the risk identification model can judge, according to the aggregate feature, whether the users corresponding to the nodes in the service directed graph are abnormal, where an abnormality may be a risk that the user to be identified engages in abnormal transactions. For example, other users who have business dealings with the user to be identified always execute services at midnight, the number of such other users exceeds a preset number, or the executed services always involve cash transactions of small amounts. The server can then input the aggregate feature into the risk identification layer to obtain the risk identification result of the user to be identified output by the risk identification layer, display the abnormal situation to the user of the client through the risk identification result, and prompt that user that the user to be identified, with whom the user has business dealings, is abnormal, thereby reducing the risk the user faces when executing the service.

Based on the risk identification method shown in fig. 1, the method determines the nodes needing to be input into the feature aggregation layer for feature aggregation according to time information, and aggregates the nodes, so that the aggregated features of the users to be identified comprise node features of all the nodes to be aggregated within a preset adjacent distance, risk identification is performed according to the aggregated features of the users to be identified, and accuracy of a risk identification result is improved.

For step S106, in order to obtain, according to the preset adjacency distance, all the nodes to be aggregated whose node features are to be aggregated with the node feature of the node corresponding to the user to be identified, the server may determine, according to a preset graph traversal algorithm, a service subgraph in the service directed graph that is formed by the nodes to be aggregated and the node corresponding to the user to be identified, so that subsequent feature aggregation can be performed directly on the service subgraph. The preset graph traversal algorithm may be a depth-first search algorithm or any other graph traversal algorithm, as long as the service subgraph can be determined.
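
The subgraph determination described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of this specification: the function name `service_subgraph`, the edge list, and the node names are hypothetical; the adjacency distance is assumed to be a hop count that ignores edge direction; and breadth-first search is used in place of depth-first search, which the text permits as one of the "other graph traversal algorithms".

```python
from collections import deque

def service_subgraph(edges, target, max_distance):
    """Return the nodes within max_distance hops of target and the edges among them."""
    # Assumption: adjacency distance is the hop count with edge direction ignored.
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    distance = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand past the preset adjacency distance
        for nxt in neighbors.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    # Keep only the directed edges whose two endpoints were both selected.
    kept_edges = [(s, d) for s, d in edges if s in distance and d in distance]
    return set(distance), kept_edges

# Hypothetical service directed graph; B is the user to be identified.
edges = [("C", "D"), ("D", "A"), ("A", "B"), ("B", "X"), ("X", "Y"), ("Y", "Z")]
nodes, sub_edges = service_subgraph(edges, "B", 2)
```

With a preset adjacency distance of 2, nodes C and Z (three hops from B) fall outside the subgraph and the edges touching them are dropped.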

In order to capture information about deeper and longer service execution links, the server can acquire the topology information between the nodes and, according to the time information, determine an initial aggregation node in the service subgraph. The service execution time of the initial aggregation node is the earliest among the service execution times of all nodes in the service subgraph.
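
Selecting the initial aggregation node amounts to taking the node with the earliest service execution time. The sketch below assumes, purely for illustration, that the time information is available as a mapping from node to a single timestamp; the names `initial_aggregation_node` and `service_times` and the timestamps are hypothetical, and ties are broken by dictionary order, which the specification does not address.

```python
from datetime import datetime

def initial_aggregation_node(service_times):
    """Return the node whose service execution time is the earliest."""
    # min() over the keys, compared by their timestamps; on a tie the first key wins.
    return min(service_times, key=service_times.get)

# Hypothetical time information for the nodes of a service subgraph.
service_times = {
    "A": datetime(2024, 1, 31, 13, 0),
    "B": datetime(2024, 1, 31, 15, 0),
    "C": datetime(2024, 1, 31, 9, 30),
    "D": datetime(2024, 1, 31, 11, 0),
}
start = initial_aggregation_node(service_times)
```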

Then, according to the initial aggregation node, the server may determine, in the service subgraph, a first node that performs feature aggregation with the initial aggregation node, and input the node feature of the initial aggregation node and the node feature of the first node into the feature aggregation layer to obtain an intermediate aggregate feature output by the feature aggregation layer. According to the intermediate aggregate feature, the server determines the current aggregate feature of the initial aggregation node and the current aggregate feature of the first node, and then re-determines the first node as the initial aggregation node. This is repeated until the node feature of the node corresponding to the user to be identified has been input into the feature aggregation layer.

That is, feature aggregation starts from the initial aggregation node and aggregates the node features of all nodes in the service subgraph until the node feature of the node corresponding to the user to be identified has been aggregated, which yields the current aggregate feature of that node. Risk identification is then performed according to this current aggregate feature.

Further, in order to obtain topology information of the service subgraph and improve accuracy of risk identification results, the server may determine a main path in the service subgraph, and perform feature aggregation according to the main path.

Specifically, in the service subgraph, the server determines, according to the preset graph traversal algorithm, the path that starts at the initial aggregation node and ends at the node corresponding to the user to be identified as the main path. Taking the direction from the starting point to the end point of the main path as the aggregation direction, the server determines, according to the aggregation direction, the first node in the main path that performs feature aggregation with the initial aggregation node.

Fig. 2 is a schematic diagram of a first node provided in the present specification.

The service subgraph contains six nodes, A, B, C, D, E and F. B is the node corresponding to the user to be identified, C is the initial aggregation node, and C-D-A-B is the main path, so the aggregation direction is the direction from C to B. The candidates for the first node that performs feature aggregation with the initial aggregation node are therefore D, A and B.
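
A minimal sketch of determining the main path with depth-first search is given here. The function name `find_main_path` and the adjacency mapping are hypothetical. The mapping reproduces only what the text says about fig. 2 (C has the one-degree neighbors D, E and F, and the main path is C-D-A-B); treating E and F as leaves is an assumption, and how to choose among several candidate paths is not specified in the source.

```python
def find_main_path(neighbors, start, end):
    """Depth-first search for a simple path from start to end; None if there is none."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            return path
        # Push in reverse sorted order so that neighbors are explored alphabetically.
        for nxt in sorted(neighbors.get(node, ()), reverse=True):
            if nxt not in path:  # never revisit a node already on this path
                stack.append((nxt, path + [nxt]))
    return None

# Hypothetical adjacency for the service subgraph of fig. 2.
neighbors = {"C": ["D", "E", "F"], "D": ["A"], "A": ["B"], "B": [], "E": [], "F": []}
main_path = find_main_path(neighbors, "C", "B")
```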

To facilitate feature aggregation, a one-degree neighbor node of the initial aggregation node may be used as the first node, that is, the server may determine, in the service subgraph, a one-degree neighbor node of the initial aggregation node as the first node that performs feature aggregation with the initial aggregation node. Taking fig. 2 as an example, the one-degree neighbor nodes of C are E, F and D, but D is the only one that also lies on the main path, so D is the first node of C. After feature aggregation is performed on C and D, their respective current aggregate features are obtained and D is taken as the initial aggregation node. Then, according to the aggregation direction, the node that performs feature aggregation with D is determined to be A, and feature aggregation is performed on D and A, and so on until the node feature of B is input into the feature aggregation layer.
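
The walk along the main path can be illustrated with the sketch below. It rests on several assumptions that are not taken from this specification: the learned feature aggregation layer and the fusion step are both replaced by element-wise addition, node features are one-hot vectors so that each node's contribution stays visible, and the current aggregate feature of a node is carried forward as its node feature in the next step. The function names are hypothetical.

```python
def aggregate(x, y):
    # Stand-in for the learned feature aggregation layer: element-wise sum.
    return [a + b for a, b in zip(x, y)]

def fuse(intermediate, own):
    # Stand-in for fusion with the node's own feature: element-wise sum.
    return [a + b for a, b in zip(intermediate, own)]

def aggregate_along_main_path(main_path, features):
    """Aggregate consecutive node pairs from the start to the end of the main path."""
    current = dict(features)  # current aggregate feature of every node
    for initial, first in zip(main_path, main_path[1:]):
        intermediate = aggregate(current[initial], current[first])
        new_initial = fuse(intermediate, current[initial])
        new_first = fuse(intermediate, current[first])
        current[initial], current[first] = new_initial, new_first
        # `first` acts as the initial aggregation node in the next iteration.
    return current

# One-hot node features; the vector positions stand for C, D, A, B in that order.
features = {
    "C": [1.0, 0.0, 0.0, 0.0],
    "D": [0.0, 1.0, 0.0, 0.0],
    "A": [0.0, 0.0, 1.0, 0.0],
    "B": [0.0, 0.0, 0.0, 1.0],
}
result = aggregate_along_main_path(["C", "D", "A", "B"], features)
```

Under these assumptions the aggregate feature of B ends up with a non-zero entry for every node on the main path, which is the property the text relies on for risk identification.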

The service subgraph includes only one main path, and paths other than the main path are determined as branch paths. In order not to miss the node features of any node in the service subgraph, when the initial aggregation node has a branch path, the node features of the nodes on the branch path also need to be aggregated.

Specifically, for each branch path, the node at the maximum adjacency distance from the initial aggregation node is determined in the branch path according to the preset graph traversal algorithm and is taken as the intermediate node.

The initial aggregation node is taken as a first branch aggregation starting point, and the direction from the first branch aggregation starting point to the intermediate node is taken as a first branch aggregation direction. According to the first branch aggregation direction, the node that performs feature aggregation with the first branch aggregation starting point is determined and feature aggregation is performed; that node is then re-determined as the first branch aggregation starting point. This is repeated until feature aggregation has been performed on the intermediate node, which yields the current aggregate feature of each node on the branch path.

Fig. 3 is a schematic diagram of a branch path provided in the present specification.

The service subgraph contains 10 nodes, A to J. B is the node corresponding to the user to be identified, C is the initial aggregation node, C-D-A-B is the main path, and E to J are nodes on branch paths. C has a branch path, so when feature aggregation is to be performed on C and D, the intermediate node H is determined, the direction C-E-F-G-H is taken as the first branch aggregation direction, and feature aggregation is performed starting with C and E until H has been aggregated, which yields the current aggregate feature of each node on the branch path. Then, taking the direction H-G-F-E-C as the second branch aggregation direction, feature aggregation is performed starting with H and G until C has been aggregated, which yields the current aggregate feature of C. The current aggregate feature of C now includes the node features of all nodes on the branch path. It is re-determined as the node feature of C, and the re-determined node feature of C and the node feature of D are input into the feature aggregation layer.
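
The out-and-back pass over a branch path can be sketched as follows, again with element-wise addition standing in for both the feature aggregation layer and the fusion step (an assumption, not the method of this specification) and with one-hot node features. The function names are hypothetical, and the side branch through I and J in fig. 3 is left out for brevity.

```python
def pairwise_step(current, u, v):
    """Aggregate two neighboring nodes and update both current aggregate features."""
    intermediate = [a + b for a, b in zip(current[u], current[v])]
    new_u = [i + a for i, a in zip(intermediate, current[u])]
    new_v = [i + a for i, a in zip(intermediate, current[v])]
    current[u], current[v] = new_u, new_v

def aggregate_branch(branch_path, features):
    """Aggregate out to the intermediate node, then back to the initial aggregation node."""
    current = dict(features)
    # First branch aggregation direction: initial aggregation node -> intermediate node.
    for u, v in zip(branch_path, branch_path[1:]):
        pairwise_step(current, u, v)
    # Second branch aggregation direction: intermediate node -> initial aggregation node.
    back = branch_path[::-1]
    for u, v in zip(back, back[1:]):
        pairwise_step(current, u, v)
    return current

branch_path = ["C", "E", "F", "G", "H"]  # H is the intermediate node
features = {
    node: [1.0 if i == k else 0.0 for i in range(len(branch_path))]
    for k, node in enumerate(branch_path)
}
current = aggregate_branch(branch_path, features)
```

After the first pass only H has seen every node on the branch; the return pass is what carries the features of E, F, G and H back into C before C is aggregated with D.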

It should be noted that, in fig. 3, F has another one-degree neighbor node I in addition to E and G, and then F may perform feature aggregation with I before performing feature aggregation with E. The feature aggregation sequence can be determined according to the time information, and then feature aggregation is performed. For example, F and E perform the service first, and F and I perform the service later, then F may perform feature aggregation with E first and then with I, which is not limited in this specification.

When determining the current aggregate feature of the initial aggregation node and the current aggregate feature of the first node according to the intermediate aggregate feature, the server can fuse the intermediate aggregate feature with the node feature of the initial aggregation node to obtain the current aggregate feature of the initial aggregation node, and fuse the intermediate aggregate feature with the node feature of the first node to obtain the current aggregate feature of the first node. Because each current aggregate feature is fused with the node's own node feature, long-term memory is retained.

Continuing with the example of fig. 2, if the node feature of C is M_c, the node feature of D is M_d, and the intermediate aggregate feature is M_{c+d}, then the current aggregate feature of C is M_{2c+d} and the current aggregate feature of D is M_{c+2d}.

In addition, when there is more than one intermediate aggregate feature, sum pooling is performed on all the intermediate aggregate features when determining the current aggregate feature of the initial aggregation node, yielding a sum-pooled intermediate aggregate feature. The sum-pooled intermediate aggregate feature is then fused with the node feature of the initial aggregation node to obtain the current aggregate feature of the initial aggregation node.
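
Sum pooling followed by fusion can be illustrated with a short sketch. The fusion operator is not specified in the text, so element-wise addition is assumed here; the function names and the numeric vectors are hypothetical.

```python
def sum_pool(vectors):
    """Element-wise sum over a list of equally sized feature vectors."""
    return [sum(column) for column in zip(*vectors)]

def current_aggregate_feature(node_feature, intermediates):
    """Sum-pool the intermediate aggregate features, then fuse with the node feature."""
    pooled = sum_pool(intermediates)
    # Assumption: fusion is modeled as element-wise addition.
    return [p + x for p, x in zip(pooled, node_feature)]

m_c = [1.0, 0.0, 0.0]       # node feature of the initial aggregation node
inter_cd = [1.0, 1.0, 0.0]  # intermediate aggregate feature from one aggregation
inter_ce = [1.0, 0.0, 1.0]  # intermediate aggregate feature from another aggregation
current_c = current_aggregate_feature(m_c, [inter_cd, inter_ce])
```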

It should be noted that when two nodes execute services with each other at different times, feature aggregation needs to be performed once for each service. For example, if two nodes A and B execute services at 13:00 and 15:00 on the same day, feature aggregation is performed twice. Because the service execution times differ, aggregating twice captures the topology information between the nodes and improves the accuracy of risk identification.
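
The per-service aggregation can be sketched as one aggregation step per service record, processed in time order. The record layout `(node, node, time)`, the function name, and the use of element-wise addition for aggregation and fusion are all assumptions made for illustration.

```python
from datetime import datetime

def aggregation_steps(services):
    """Return one (u, v) aggregation step per service record, earliest service first."""
    ordered = sorted(services, key=lambda record: record[2])
    return [(u, v) for u, v, _ in ordered]

# A and B execute services with each other twice on the same day.
services = [
    ("A", "B", datetime(2024, 1, 31, 15, 0)),
    ("A", "B", datetime(2024, 1, 31, 13, 0)),
]
steps = aggregation_steps(services)

features = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
for u, v in steps:
    # Element-wise sum stands in for the aggregation layer and for fusion.
    intermediate = [a + b for a, b in zip(features[u], features[v])]
    new_u = [i + a for i, a in zip(intermediate, features[u])]
    new_v = [i + a for i, a in zip(intermediate, features[v])]
    features[u], features[v] = new_u, new_v
```

Two service records produce two aggregation steps, so the resulting features differ from those obtained when the pair is aggregated only once.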

The present specification further provides a training method for the risk identification model. Specifically, the server first obtains a sample service directed graph, in which nodes represent users, edges represent service relationships between users, and the nodes include time information of the users executing services. The server then determines the risk result of a target node in the sample service directed graph as the label of the target node. The label may be abnormality information about the target node, such as having carried out abnormal transactions, which is not limited in this specification.

And then, inputting the sample service directed graph into the feature extraction layer to obtain the node features of each node in the service directed graph output by the feature extraction layer. And then, determining nodes except the target node in the sample service directed graph as sample nodes to be aggregated, and inputting the node characteristics of the sample nodes to be aggregated and the node characteristics of the target node into the characteristic aggregation layer according to the time information in the sample service directed graph so as to enable the characteristic aggregation layer to perform characteristic aggregation and output the aggregation characteristics of the target node.

And inputting the aggregate characteristics of the target node into the risk identification layer to obtain a predicted risk identification result of the target node output by the risk identification layer. And finally, training the risk identification model according to the predicted risk identification result and the label.
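
The last training step, comparing the predicted risk identification result with the label, can be sketched as below. The specification does not name a loss function or an optimizer, so this sketch assumes a logistic-regression risk identification layer trained with binary cross-entropy and plain stochastic gradient descent, and it trains that layer only on fixed, made-up aggregate features rather than the whole model end to end. All names and numbers are hypothetical.

```python
import math

def predict(weights, bias, feature):
    """Risk probability from a logistic-regression stand-in for the risk identification layer."""
    z = sum(w * x for w, x in zip(weights, feature)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, lr=0.5, epochs=200):
    """Fit weights and bias with stochastic gradient descent on binary cross-entropy."""
    dim = len(samples[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for feature, label in samples:
            # Gradient of the cross-entropy loss with respect to the pre-activation.
            error = predict(weights, bias, feature) - label
            weights = [w - lr * error * x for w, x in zip(weights, feature)]
            bias -= lr * error
    return weights, bias

# Hypothetical aggregate features of target nodes with labels (1 = risky, 0 = normal).
samples = [
    ([2.0, 0.1], 1),
    ([1.8, 0.3], 1),
    ([0.2, 1.9], 0),
    ([0.1, 2.2], 0),
]
weights, bias = train(samples)
risky_score = predict(weights, bias, [2.0, 0.1])
normal_score = predict(weights, bias, [0.1, 2.2])
```

In a full implementation the gradient would also flow back through the feature aggregation layer and the feature extraction layer; that part is omitted here.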

Corresponding to the risk identification method provided above in one or more embodiments, the present specification further provides, based on the same concept, a corresponding risk identification device, as shown in fig. 4.

Fig. 4 is a schematic diagram of a risk identification device provided in the present specification, where a risk identification model includes a feature extraction layer, a feature aggregation layer, and a risk identification layer, and the device includes:

a to-be-identified user determining module 400, configured to determine a to-be-identified user;

a service directed graph obtaining module 402, configured to obtain a service directed graph where the user to be identified is located, where nodes in the service directed graph represent users, edges represent service relationships between users, and the nodes include time information of executing services by the users;

a feature extraction module 404, configured to input the service directed graph into the feature extraction layer, and obtain node features of each node in the service directed graph output by the feature extraction layer;

an aggregation module 406, configured to determine, as nodes to be aggregated, the nodes whose adjacency distance from the node corresponding to the user to be identified is not greater than a preset adjacency distance, and input, according to the time information, the node features of the nodes to be aggregated and the node feature of the node corresponding to the user to be identified into the feature aggregation layer, so that the feature aggregation layer performs feature aggregation and outputs an aggregate feature of the node corresponding to the user to be identified, where the aggregate feature includes the node features of the nodes to be aggregated and the node feature of the node corresponding to the user to be identified;

a risk identification module 408, configured to input the aggregate feature into the risk identification layer, and obtain a risk identification result of the user to be identified, which is output by the risk identification layer.

Optionally, the aggregation module 406 is specifically configured to determine, in the service directed graph, a service subgraph configured by the node to be aggregated and a node corresponding to the user to be identified according to a preset graph traversal algorithm; according to the time information, determining an initial aggregation node in the business subgraph; according to the initial aggregation node, determining a first node which performs feature aggregation with the initial aggregation node in the business subgraph, inputting node features of the initial aggregation node and node features of the first node into the feature aggregation layer to obtain intermediate aggregation features output by the feature aggregation layer, determining the current aggregation features of the initial aggregation node according to the intermediate aggregation features, determining the current aggregation features of the first node, and re-determining the first node as the initial aggregation node until node features of nodes corresponding to the users to be identified are input into the feature aggregation layer.

Optionally, the aggregation module 406 is specifically configured to determine, in the service subgraph, a path that starts with the initial aggregation node and ends with a node corresponding to the user to be identified as a main path according to the preset graph traversal algorithm; taking the direction from the starting point to the end point of the main path as an aggregation direction; and according to the aggregation direction, determining a first node which performs feature aggregation with the initial aggregation node in the main path.

Optionally, the aggregation module 406 is specifically configured to determine paths other than the main path as branch paths; when the initial aggregation node has branch paths, determine, for each branch path and according to the preset graph traversal algorithm, the node in the branch path at the maximum adjacency distance from the initial aggregation node as an intermediate node; take the initial aggregation node as a first branch aggregation starting point and the direction from the first branch aggregation starting point to the intermediate node as a first branch aggregation direction, determine, according to the first branch aggregation direction, a node that performs feature aggregation with the first branch aggregation starting point, perform feature aggregation, and re-determine that node as the first branch aggregation starting point, until feature aggregation has been performed on the intermediate node, to obtain the current aggregate feature of each node on the branch path; take the direction from the intermediate node to the initial aggregation node as a second branch aggregation direction, determine, according to the second branch aggregation direction, a node that performs feature aggregation with the intermediate node, perform feature aggregation, and re-determine that node as the intermediate node, until feature aggregation has been performed on the initial aggregation node, to obtain the current aggregate feature of the initial aggregation node, where the current aggregate feature of the initial aggregation node includes the node features of all nodes in the branch path; and re-determine the current aggregate feature of the initial aggregation node as the node feature of the initial aggregation node, and input the re-determined node feature of the initial aggregation node and the node feature of the first node into the feature aggregation layer.

Optionally, the aggregation module 406 is specifically configured to determine, in the business subgraph, a one-degree neighboring node of the initial aggregation node as a first node that performs feature aggregation with the initial aggregation node.

Optionally, the aggregation module 406 is specifically configured to fuse the intermediate aggregation feature with a node feature of the initial aggregation node to obtain a current aggregation feature of the initial aggregation node; and fusing the intermediate aggregation characteristic with the node characteristic of the first node to obtain the current aggregation characteristic of the first node.

Optionally, the aggregation module 406 is specifically configured to, when there is more than one intermediate aggregate feature, perform sum pooling on all the intermediate aggregate features to obtain a sum-pooled intermediate aggregate feature; and fuse the sum-pooled intermediate aggregate feature with the node feature of the initial aggregation node to obtain the current aggregate feature of the initial aggregation node.

Optionally, the apparatus further comprises:

A training module 410, configured to obtain a sample service directed graph, where nodes in the sample service directed graph represent users, edges represent service relationships between users, and the nodes include time information for the users to execute services; determining a risk result of a target node in the sample service directed graph as a label of the target node; inputting the sample service directed graph into the feature extraction layer to obtain node features of each node in the service directed graph output by the feature extraction layer; determining nodes except the target node in the sample service directed graph as sample nodes to be aggregated, and inputting node characteristics of the sample nodes to be aggregated and node characteristics of the target node into the characteristic aggregation layer according to time information in the sample service directed graph so as to enable the characteristic aggregation layer to perform characteristic aggregation and output aggregation characteristics of the target node; inputting the aggregation characteristics of the target nodes into the risk identification layer to obtain a predicted risk identification result of the target nodes, which is output by the risk identification layer; and training the risk identification model according to the predicted risk identification result and the label.

The present specification also provides a computer readable storage medium storing a computer program operable to perform the risk identification method provided in fig. 1 above.

The present specification also provides a schematic structural diagram of the electronic device, shown in fig. 5. As shown in fig. 5, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a nonvolatile memory, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the nonvolatile memory into the memory and then runs it to implement the risk identification method described above with respect to fig. 1. Of course, in addition to software implementations, this specification does not exclude other implementations, such as logic devices or combinations of hardware and software; that is, the execution subject of the processing flow is not limited to logic units, but may also be hardware or logic devices.

In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development, and the source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by slightly logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing various functions may also be regarded as structures within the hardware component. Or the means for performing various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.

The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being functionally divided into various units. Of course, when implementing the present specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

This specification may be described in the general context of computer-executable instructions, such as program modules, executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.

In this specification, the embodiments are described in a progressive manner: identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.

The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present application.

Claims

1. A risk identification method, wherein a risk identification model comprises a feature extraction layer, a feature aggregation layer and a risk identification layer, the method comprising:

determining a user to be identified;

2. The method of claim 1, wherein inputting, according to the time information, the node features of the node to be aggregated and the node features of the node corresponding to the user to be identified into the feature aggregation layer specifically comprises:

3. The method of claim 2, wherein determining, according to the initial aggregation node, a first node in the service subgraph that performs feature aggregation with the initial aggregation node specifically comprises:

4. The method of claim 3, determining paths other than the main path as branch paths;

5. The method of claim 3, wherein determining, according to the initial aggregation node, a first node in the service subgraph that performs feature aggregation with the initial aggregation node specifically comprises:

6. The method of claim 2, wherein determining, according to the intermediate aggregation feature, a current aggregation feature of the initial aggregation node and a current aggregation feature of the first node specifically comprises:

7. The method of claim 2, wherein, when there is more than one intermediate aggregation feature, determining a current aggregation feature of the initial aggregation node according to the intermediate aggregation features specifically comprises:

8. The method of claim 1, wherein training the risk identification model comprises:

9. A risk identification device, wherein a risk identification model comprises a feature extraction layer, a feature aggregation layer, and a risk identification layer, the device comprising:

10. The device of claim 9, wherein the aggregation module is specifically configured to: determine, in the service directed graph and according to a preset graph traversal algorithm, a service subgraph formed by the nodes to be aggregated and the node corresponding to the user to be identified; determine, according to the time information, an initial aggregation node in the service subgraph; determine, according to the initial aggregation node, a first node in the service subgraph that performs feature aggregation with the initial aggregation node; input the node features of the initial aggregation node and the node features of the first node into the feature aggregation layer to obtain an intermediate aggregation feature output by the feature aggregation layer; determine, according to the intermediate aggregation feature, the current aggregation feature of the initial aggregation node and the current aggregation feature of the first node; and re-determine the first node as the initial aggregation node, until the node features of the node corresponding to the user to be identified have been input into the feature aggregation layer.

11. The device of claim 10, wherein the aggregation module is specifically configured to: determine, in the service subgraph and according to the preset graph traversal algorithm, a path that starts at the initial aggregation node and ends at the node corresponding to the user to be identified as a main path; take the direction from the start point to the end point of the main path as an aggregation direction; and determine, according to the aggregation direction, a first node in the main path that performs feature aggregation with the initial aggregation node.
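The main-path aggregation recited in claims 10 and 11 can be illustrated with a minimal Python sketch. It is not the claimed implementation. Several things here are assumptions made only for illustration: breadth-first search stands in for the "preset graph traversal algorithm", the node with the earliest per-node timestamp is taken as the initial aggregation node, an element-wise mean stands in for the learned feature aggregation layer, and element-wise addition stands in for fusion. All function and variable names are hypothetical.

```python
from collections import deque


def find_main_path(adj, start, target):
    """Breadth-first search for a directed path from start to target."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def aggregation_layer(feat_a, feat_b):
    # Assumption: element-wise mean replaces the learned aggregation layer.
    return [(x + y) / 2 for x, y in zip(feat_a, feat_b)]


def fuse(intermediate, node_feat):
    # Assumption: fusion is element-wise addition.
    return [i + f for i, f in zip(intermediate, node_feat)]


def aggregate_along_main_path(adj, features, timestamps, target):
    # Assumption: the earliest-timestamp node is the initial aggregation node.
    start = min(timestamps, key=timestamps.get)
    path = find_main_path(adj, start, target)
    if path is None:
        raise ValueError("no path from the initial aggregation node to the target")
    current = {node: list(feat) for node, feat in features.items()}
    # Walk the main path in the aggregation direction; after each step the
    # first node becomes the initial aggregation node of the next step.
    for initial, first in zip(path, path[1:]):
        inter = aggregation_layer(current[initial], current[first])
        new_initial = fuse(inter, current[initial])
        new_first = fuse(inter, current[first])
        current[initial], current[first] = new_initial, new_first
    return path, current


if __name__ == "__main__":
    adj = {"c": ["b"], "b": ["a"]}
    features = {"a": [1.0], "b": [2.0], "c": [4.0]}
    timestamps = {"c": 1, "b": 2, "a": 3}
    print(aggregate_along_main_path(adj, features, timestamps, "a"))
```

In this sketch the loop stops once the features of the node corresponding to the user to be identified have passed through the aggregation step, which mirrors the termination condition in claim 10.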

12. The device of claim 11, wherein the aggregation module is specifically configured to: determine paths other than the main path as branch paths; when a branch path exists at the initial aggregation node, determine, according to the preset graph traversal algorithm, the node at the maximum adjacent distance from the initial aggregation node in each branch path to obtain an intermediate node; take the initial aggregation node as a first branch aggregation starting point, take the direction from the first branch aggregation starting point to the intermediate node as a first branch aggregation direction, determine, according to the first branch aggregation direction, a node that performs feature aggregation with the first branch aggregation starting point, perform feature aggregation, and re-determine the node that performs feature aggregation with the first branch aggregation starting point as the first branch aggregation starting point, until feature aggregation has been performed on the intermediate node, to obtain the current aggregation feature of each node on the branch path; determine, according to the second branch aggregation direction, a node that performs feature aggregation with the intermediate node, perform feature aggregation, and re-determine the node that performs feature aggregation with the intermediate node as the intermediate node, until feature aggregation has been performed on the initial aggregation node, to obtain the current aggregation feature of the initial aggregation node, wherein the current aggregation feature of the initial aggregation node comprises the node features of all nodes in the branch path; and re-determine the current aggregation feature of the initial aggregation node as the node feature of the initial aggregation node, and input the re-determined node feature of the initial aggregation node and the node feature of the first node into the feature aggregation layer.
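The out-and-back branch aggregation of claim 12 can be sketched as follows. This is an illustrative sketch only. It assumes that the second branch aggregation direction runs from the intermediate node back to the initial aggregation node, that the branch is already given as an ordered list of nodes, and that an element-wise mean followed by element-wise addition stands in for the feature aggregation layer and the fusion step. The names `pairwise_step` and `aggregate_branch` are hypothetical.

```python
def pairwise_step(current, u, v):
    """Aggregate nodes u and v and update both of their current features."""
    # Assumption: element-wise mean replaces the learned aggregation layer.
    inter = [(x + y) / 2 for x, y in zip(current[u], current[v])]
    # Assumption: fusion is element-wise addition.
    new_u = [i + x for i, x in zip(inter, current[u])]
    new_v = [i + x for i, x in zip(inter, current[v])]
    current[u], current[v] = new_u, new_v


def aggregate_branch(features, branch_path):
    """branch_path is ordered [initial aggregation node, ..., intermediate node]."""
    current = {node: list(features[node]) for node in branch_path}
    # First branch aggregation direction: initial node towards intermediate node.
    for u, v in zip(branch_path, branch_path[1:]):
        pairwise_step(current, u, v)
    # Second direction (assumed): intermediate node back to the initial node.
    back = branch_path[::-1]
    for u, v in zip(back, back[1:]):
        pairwise_step(current, u, v)
    return current


if __name__ == "__main__":
    feats = {"s": [1.0, 0.0, 0.0], "m": [0.0, 1.0, 0.0], "e": [0.0, 0.0, 1.0]}
    print(aggregate_branch(feats, ["s", "m", "e"])["s"])
```

With one-hot input features, every component of the initial aggregation node's result is non-zero after the return pass, which shows that information from every node on the branch has reached it. After the outward pass alone, the component belonging to the farthest node would still be zero.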

13. The device of claim 11, wherein the aggregation module is specifically configured to determine, in the service subgraph, a first-degree neighbor node of the initial aggregation node as the first node that performs feature aggregation with the initial aggregation node.

14. The device of claim 10, wherein the aggregation module is specifically configured to: fuse the intermediate aggregation feature with the node feature of the initial aggregation node to obtain the current aggregation feature of the initial aggregation node; and fuse the intermediate aggregation feature with the node feature of the first node to obtain the current aggregation feature of the first node.

15. The device of claim 10, wherein the aggregation module is specifically configured to: when there is more than one intermediate aggregation feature, perform sum pooling on all the intermediate aggregation features to obtain a sum-pooled intermediate aggregation feature; and fuse the sum-pooled intermediate aggregation feature with the node feature of the initial aggregation node to obtain the current aggregation feature of the initial aggregation node.
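The sum pooling and fusion recited in claims 14 and 15 can be sketched in a few lines. The claims do not specify the fusion operator, so element-wise addition is used here purely as an assumption. The function names are hypothetical.

```python
def sum_pool(intermediate_features):
    """Element-wise sum over a list of equal-length feature vectors."""
    return [sum(column) for column in zip(*intermediate_features)]


def fuse(pooled, node_feature):
    # Assumption: fusion is element-wise addition; the claims leave it open.
    return [p + f for p, f in zip(pooled, node_feature)]


def current_aggregation_feature(intermediate_features, node_feature):
    """Sum-pool when several intermediate features exist, then fuse."""
    if len(intermediate_features) > 1:
        pooled = sum_pool(intermediate_features)
    else:
        pooled = list(intermediate_features[0])
    return fuse(pooled, node_feature)


if __name__ == "__main__":
    intermediates = [[1.0, 2.0], [3.0, 4.0]]
    print(current_aggregation_feature(intermediates, [0.5, 0.5]))
```

Sum pooling keeps the output dimension fixed regardless of how many intermediate aggregation features a node receives, so the fused result can be passed on as an ordinary node feature.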

16. The device of claim 9, the device further comprising:

17. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of claims 1-8.

18. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1-8 when executing the program.