WO2019196546A1 - Method and apparatus for determining risk probability of service request event - Google Patents

Publication number
WO2019196546A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
relationship
event
feature
crowd
Application number
PCT/CN2019/073869
Other languages
French (fr)
Chinese (zh)
Inventor
王修坤
陈岑
杨新星
Original Assignee
阿里巴巴集团控股有限公司
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2019196546A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • G06Q40/08 Insurance
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Definitions

  • One or more embodiments of the present specification relate to the field of computer technology, and more particularly to a method and apparatus for determining a risk probability of a service request event by a computer.
  • risk auditing is often performed manually.
  • some simple rules are also set to assist with manual judgment.
  • such an approach is inefficient and struggles to keep pace with rapid business development. Moreover, the accuracy of identifying high-risk users and high-risk events depends on the experience of the staff performing the manual audit; differences in experience between auditors introduce operational risk, make audit accuracy hard to guarantee, and often lead to missed cases.
  • One or more embodiments of the present specification describe a method and apparatus for efficiently determining the risk probability of a service request event.
  • a method of determining a risk probability of a service request event comprising:
  • Determining a risk probability of the service request event according to the event characteristic, the user personal characteristic of the at least one user, and the relationship characteristic of the at least one user.
  • the event feature includes at least one of the following: a requested service amount, a service registration time, an event occurrence time, a time difference between the service registration time and the event occurrence time, and an event occurrence location.
  • the at least one user includes the requestor of the service request event, and the beneficiary of the service request.
  • the user personal characteristics described above include one or more of the following: a user basic attribute feature, a user behavior feature, and a user location feature.
  • determining the relationship feature of the at least one user specifically includes: acquiring the specific crowd that includes the at least one user; acquiring a crowd relationship map of the specific crowd; and determining the relationship feature of the at least one user based on the crowd relationship map.
  • obtaining the specific group includes: among a plurality of pre-divided subsets of users, determining the subset of users to which the at least one user belongs, and using that subset as the specific group; or adding the at least one user to a pre-selected set of users, and taking that set of users as the specific group.
  • acquiring the crowd relationship map of the specific crowd further comprises: acquiring a first relationship map constructed for the pre-selected user set; acquiring an association relationship between the at least one user and the users in the pre-selected user set; and adding the association relationship to the first relationship map to form the crowd relationship map of the specific crowd.
  • the population relationship map of the specific population described above is established based on one or more of the following relationships: a transaction relationship, a device relationship, a capital relationship, and a social relationship.
  • determining the relationship characteristics of the user comprises using a node-vector network structure feature extraction algorithm to convert the relationship map into a vector factor, and determining a relationship feature vector of the user based on the vector factor.
  • a risk probability of a business request event is determined using a pre-trained evaluation model that is trained based on a gradient boost decision tree algorithm.
  • an apparatus for determining a risk probability of a service request event includes:
  • An event feature obtaining unit configured to acquire an event feature of the service request event
  • a personal feature obtaining unit configured to acquire a user personal feature of at least one user involved in the service request event
  • a relationship feature acquiring unit configured to determine a relationship feature of the at least one user based on a crowd relationship map of a specific group, wherein the specific group includes the at least one user;
  • the risk determining unit is configured to determine a risk probability of the service request event according to the event feature, the user personal feature of the at least one user, and the relationship feature of the at least one user.
  • a computer readable storage medium having stored thereon a computer program for causing a computer to perform the method of the first aspect when the computer program is executed in a computer.
  • a computing device comprising a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, the method of the first aspect is implemented.
  • the risk probability of the service request event is comprehensively determined, thereby making the risk determination more efficient and accurate.
  • FIG. 1 is a schematic diagram showing an implementation scenario of an embodiment disclosed in the present specification
  • FIG. 2 illustrates a method flow diagram for determining a risk probability of a service request event, in accordance with one embodiment
  • FIG. 3 illustrates a flow of steps for determining a relationship feature of a related user, according to one embodiment
  • FIG. 5 shows a schematic block diagram of a risk determining device in accordance with one embodiment.
  • FIG. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in the present specification.
  • a risk review of a business request event is performed by a computing platform.
  • Users can send business request events to the computing platform, such as applying for a loan, applying for insurance claims, and so on.
  • After the computing platform receives such a business request, it needs to obtain several kinds of information to comprehensively evaluate the risk probability of the event.
  • This information includes the event features of the business request event, as well as the personal features of the users involved in the business request event.
  • the computing platform also places the users involved in the event into a specific group of people to obtain the relationship features of those users in the crowd relationship map. It then comprehensively assesses the risk probability of the business request event based on the event features, the user personal features, and the user relationship features.
  • the specific execution process of the above scenario will be described below.
  • the execution body of the method may be any system, device, apparatus, platform or server with computing and processing capabilities, such as the computing platform shown in FIG. 1, and more specifically, for example, the various back-end servers that need to analyze and manage business risks, such as Alipay servers, insurance business servers, financial approval servers, etc.
  • As shown in FIG. 2, the method includes the following steps: Step 21, acquiring an event feature of a service request event; Step 22, acquiring a user personal feature of at least one user involved in the service request event; Step 23, determining a relationship feature of the at least one user based on a crowd relationship map of a specific group of people, wherein the specific group of people includes the at least one user; Step 24, determining a risk probability of the service request event according to the event feature, the user personal feature of the at least one user, and the relationship feature of the at least one user.
  • event characteristics of the service request event to be evaluated are obtained.
  • the business request event to be evaluated may be an event for requesting various businesses that may be at risk, for example, applying for a loan, applying for a credit service, applying for insurance claims, and the like.
  • the event characteristics related to the service request event may include one or more of the following: the requested service type, the requested amount, the time when the request occurred, the service registration time, the time difference between the registration time and the request time, the place where the event occurred, and so on.
  • the foregoing service request event is an event for applying for an insurance claim
  • the event characteristics may include: the requested insurance type, the requested claim amount, the time of the claim application, the time the insurance was taken out, the time difference between taking out the insurance and filing the claim, the place of occurrence, etc.
  • the service request event is an event for applying for a loan
  • the event feature may include: a request amount, an application time, a registration time, a time difference between the registration time and the application time, a place of occurrence, and the like.
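The event features listed above could be assembled roughly as follows. This is a minimal sketch, not the method of the specification: the function name `event_features`, the field names, and the sample values are illustrative assumptions.

```python
from datetime import datetime

def event_features(amount, registered_at, occurred_at, location):
    """Build a flat feature dict for one service request event (hypothetical schema)."""
    gap = occurred_at - registered_at
    return {
        "amount": float(amount),
        "occurred_hour": occurred_at.hour,
        # time difference between registration and the event, in days
        "days_since_registration": gap.total_seconds() / 86400.0,
        "location": location,
    }

# Made-up sample values, for illustration only
features = event_features(
    amount=5000,
    registered_at=datetime(2018, 1, 1, 9, 0),
    occurred_at=datetime(2018, 1, 4, 21, 0),
    location="Hangzhou",
)
```

A short gap between registration and the request is one signal such a feature makes available to a downstream model.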
  • the personal characteristics of the relevant users involved in the service request event are also obtained.
  • the relevant user involved in the service request event is the service requester.
  • the relevant users involved in the business request event also include other stakeholders other than the requester.
  • the relevant users involved may include a guarantor, etc., in addition to the loan requester.
  • the relevant users involved may include, in addition to the claimant, the insurance beneficiaries. Therefore, a business request event can involve multiple related users. At step 22, the personal characteristics of each of these related users are obtained.
  • the user's personal characteristics include basic attributes of the user, such as gender, age, registration duration, contact details, and the like.
  • the user personal characteristics include user behavior characteristics. More specifically, the user behavior characteristics may include behavior information related to the user's historical business operations, such as the number of transactions, the average transaction amount, the number of application claims, the number of claims approved, the average claim amount, and the like.
  • the user personal characteristics also include user location characteristics, such as where each historical business operation occurs, a range of location changes, and the like.
  • the user personal characteristics may also include further aspects of the user. It can be understood that the user personal characteristics depend only on a single user and characterize that user's own attributes, operations, and the like. According to the embodiments of the present specification, in addition to acquiring the personal characteristics of an individual user, the user is placed in a certain crowd so as to discover the relationship characteristics of the user in the crowd relationship network, enabling a more comprehensive analysis and assessment based on the relationship features.
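As a rough illustration of the user behavior characteristics mentioned above (number of claims filed, number approved, average claim amount), the following sketch aggregates a user's claim history. The input layout and the name `behavior_features` are assumptions made for this example.

```python
def behavior_features(claims):
    """claims: list of (amount, approved) tuples for one user's past claims."""
    n = len(claims)
    approved = [amount for amount, ok in claims if ok]
    return {
        "claims_filed": n,
        "claims_approved": len(approved),
        "avg_claim_amount": sum(a for a, _ in claims) / n if n else 0.0,
    }

history = [(100, True), (300, False)]   # made-up history
user_feats = behavior_features(history)
```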
  • step 23 for each of the related users mentioned in step 22, the relationship characteristics of the respective users are determined based on the crowd relationship map of the specific group, wherein the specific group includes the related users.
  • FIG. 3 illustrates a flow of steps for determining a relationship feature of a related user, ie, a sub-step of step 23, in accordance with one embodiment. As shown in FIG. 3, in order to determine the relationship characteristics of each related user, in step 31, a specific crowd including related users is acquired.
  • a sufficiently large set of users is predetermined such that the set of users contains relevant users of the service request event to be evaluated, and the set of users can then be considered as a specific group of people.
  • the business request event is an application for insurance claims
  • a collection of all insured persons may be taken as the above specific group.
  • the set of full users is divided into a plurality of subsets of users based on certain characteristics of the user.
  • a subset of users to which the related user related to the service request event belongs is determined, and the subset of users is used as the specific group.
  • a portion of users having certain similarities or associations are pre-selected to form a set of users.
  • the business request event is an application for insurance claims
  • all users who have applied for claims may be pre-selected to form a user set.
  • the above specific population can also be obtained by other means, as long as the specific population includes the relevant users to be analyzed.
  • step 32 a population relationship map of the specific population described above is obtained.
  • the step includes reconstructing a population relationship map for the particular population described above.
  • the particular population is selected from a predetermined set of users, and the system has previously built a crowd relationship map for the set of users.
  • a specific group of people may be selected from the full set of users, or from a user subset obtained by dividing the full set of users, and the system may pre-establish a crowd relationship map for the full set of users, or for each user subset.
  • the crowd relationship map may be obtained directly, or the part related to the specific crowd may be extracted from a crowd relationship map pre-built for a larger range of users and used as the crowd relationship map of the specific crowd.
  • the particular population described above is formed by adding the related users to a pre-selected set of users. If the system has already constructed a relationship map for the pre-selected set of users, step 32 may include: first obtaining the relationship map constructed for the pre-selected set of users; then obtaining the association relationships between the related users and the users in the pre-selected set; and finally adding these association relationships to the relationship map to form the crowd relationship map of the specific group.
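The last variant of step 32 (take a pre-built relationship map, then add the related user and its associations) might look like the sketch below. It assumes the map is stored as an adjacency dict of undirected edges; the name `extend_map` and this representation are illustrative choices, not taken from the specification.

```python
def extend_map(base_map, new_user, associations):
    """Return a copy of base_map (dict: user -> set of neighbours)
    with new_user and its associations added as undirected edges."""
    graph = {user: set(nbrs) for user, nbrs in base_map.items()}
    graph.setdefault(new_user, set())
    for other in associations:
        if other in graph:  # only link to users already in the pre-selected set
            graph[new_user].add(other)
            graph[other].add(new_user)
    return graph

first_map = {"a": {"b"}, "b": {"a"}, "c": set()}
crowd_map = extend_map(first_map, "x", ["a", "c", "unknown"])
```

Copying the base map keeps the pre-built map reusable for other events.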
  • the construction of crowd relationship maps can be based on multiple relationships.
  • the crowd relationship map is established based on the trading relationship of the crowd. For example, if a product purchase transaction is reached between two users, a transaction association is established between the two users.
  • the transaction relationship between users can be determined by acquiring and analyzing the transaction records of a large number of users, thereby establishing a crowd relationship map.
  • the crowd relationship map is established based on the device relationship of the crowd. For example, when two or more user accounts log in using the same terminal device, it can be determined that there is a device association between the two or more user accounts. Two or more user accounts with a device association may be multiple accounts registered by the same physical user, or may be accounts of multiple closely associated users (such as family members, colleagues, etc.). The device relationship can be determined by obtaining information on the physical terminal used when each user logs in to an account.
  • the crowd relationship map is established based on the funding relationship. For example, when there is a fund transfer operation such as transfer, collection, etc. between two users, a fund association is established between the two users.
  • the relationship between the users can be determined by obtaining and analyzing the records of the user's operation using the electronic wallet, and then the relationship map is established based on the capital relationship.
  • the crowd relationship map is established based on social relationships.
  • people are increasingly using social applications to interact. For example, two users can interact through social applications such as chatting, red packets, file transfer, etc., so that social connections can be established between the two users.
  • a social relationship between the crowds can be determined based on a large number of social interactions captured by the social application, thereby establishing a crowd relationship map.
  • a population relationship map can also be established based on a greater variety of population associations. Moreover, the population relationship map can be established based on several kinds of population associations at the same time.
  • the crowd relationship map can be formed in the form of a network of nodes.
  • the crowd relationship map includes a plurality of nodes, each node corresponding to one user, and the nodes having the associated relationship are connected to each other.
  • the connections between the nodes may have various attributes, such as connection type, connection strength, etc. The connection types include, for example, a capital connection (a connection based on a capital relationship), a social connection (a connection based on social interaction), etc.; the connection strength may include, for example, strong connections, weak connections, and the like.
  • Figure 4 illustrates an example of a crowd relationship map in accordance with one embodiment.
  • the crowd relationship map includes a plurality of nodes, and each node corresponds to one user.
  • the connection between nodes indicates that there is an association between users. It is assumed that the crowd relationship map of FIG. 4 is established based on the capital relationship and social relationship of the crowd. Accordingly, the connection between the nodes can be a capital connection or a social connection.
  • different connection types are shown with different line types: the social connections between nodes are shown as dashed lines, and the capital connections between nodes are shown as solid lines. The strength of a connection is shown by the thickness of the connecting line.
  • thick lines show strong connections and thin lines show weak connections. More specifically, a thick solid line may show a stronger capital connection (e.g., the amount of capital interaction exceeds an amount threshold, for example 10,000 yuan), and a thin solid line a weaker capital connection (e.g., the amount of capital interaction does not exceed that threshold); a thick dashed line may show a stronger social connection (e.g., the frequency of interaction exceeds a frequency threshold, for example 10 times per day), and a thin dashed line a weaker social connection (e.g., the frequency of interaction does not exceed that threshold).
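The typed, weighted connections described for FIG. 4 can be sketched as an edge list. The two thresholds reuse the example values from the text (10,000 yuan, 10 interactions per day); the record layouts and the name `build_edges` are assumptions for illustration.

```python
AMOUNT_THRESHOLD = 10_000  # yuan, example threshold from the text
FREQ_THRESHOLD = 10        # interactions per day, example threshold from the text

def build_edges(fund_records, social_records):
    """fund_records: (u, v, total_amount); social_records: (u, v, interactions_per_day).
    Returns (u, v, connection_type, strength) tuples."""
    edges = []
    for u, v, amount in fund_records:
        strength = "strong" if amount > AMOUNT_THRESHOLD else "weak"
        edges.append((u, v, "capital", strength))
    for u, v, per_day in social_records:
        strength = "strong" if per_day > FREQ_THRESHOLD else "weak"
        edges.append((u, v, "social", strength))
    return edges

# Made-up records, for illustration only
edges = build_edges(
    [("a", "b", 25_000), ("b", "c", 300)],
    [("a", "c", 12), ("c", "d", 10)],
)
```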
  • the crowd relationship map may also be characterized in other forms, such as forms, graphics, and the like.
  • after the crowd relationship map constructed for the specific population has been acquired, in step 33 the relationship characteristics of the relevant users involved in the current event are determined based on the crowd relationship map.
  • connection information related to the user, such as the number of connections, the connection types, the connection strengths, and the other users connected to, may be extracted from the crowd relationship map, and these connection features are taken as the relationship feature of the user.
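Extracting simple connection features for one user from the crowd relationship map could be sketched as below. The edge-tuple layout `(u, v, type, strength)` and the feature names are assumptions made for this example.

```python
from collections import Counter

def connection_features(edges, user):
    """edges: iterable of (u, v, conn_type, strength) tuples."""
    mine = [e for e in edges if user in (e[0], e[1])]
    neighbours = {e[1] if e[0] == user else e[0] for e in mine}
    types = Counter(e[2] for e in mine)
    return {
        "num_connections": len(mine),
        "num_neighbours": len(neighbours),
        "num_capital": types["capital"],
        "num_social": types["social"],
        "num_strong": sum(1 for e in mine if e[3] == "strong"),
    }

sample_edges = [
    ("a", "b", "capital", "strong"),
    ("a", "c", "social", "weak"),
    ("b", "c", "social", "strong"),
    ("a", "b", "social", "weak"),
]
feats = connection_features(sample_edges, "a")
```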
  • the crowd relationship map is analyzed and characterized with the aid of machine learning.
  • the crowd relationship map can be understood as a network that contains a certain number of nodes (corresponding to users) and the connection relationship between nodes (the relationship between users).
  • network information is more difficult to structure into standard data, so it is difficult to apply to machine learning.
  • network representation learning algorithms have been proposed to characterize and analyze network structures. The goal of these algorithms is to represent the nodes of the network, together with their semantic relationships, as low-dimensional, dense, real-valued vectors. This facilitates computation and storage, removes the need to extract features manually, and allows heterogeneous information to be projected into the same low-dimensional space for easier downstream computation.
  • the network is embedded into a geometric space, and the spatial coordinates of each node are regarded as the characteristics of the node, so that they are put into the neural network for learning and training.
  • the map can be mapped into the geometric space, and the spatial coordinates of each user node are calculated as the relationship feature vector.
  • various algorithms can be employed for the calculation of the spatial coordinates of the network nodes.
  • the DeepWalk algorithm is used to determine a vector representation of each node in the network corresponding to the population relationship map.
  • in the DeepWalk algorithm, a large number of random walkers are released on the network, and each walker traces out a sequence of nodes within a given time. If each node is treated as a word, each resulting sequence forms a sentence, yielding a "language" composed of node sequences. The word-vector conversion (Word2Vec) algorithm can then be used to compute a vector representation of each node "word".
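The walk-generation half of DeepWalk can be sketched in a few lines. This sketch only produces the node "sentences"; feeding them to a Word2Vec implementation to obtain vectors is left out. The function name, the parameters, and the toy graph are illustrative assumptions.

```python
import random

def random_walks(graph, walk_length, walks_per_node, seed=0):
    """graph: dict node -> list of neighbours. Returns a list of node sequences."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in graph:
            walk = [start]
            # extend the walk until it is long enough or hits an isolated node
            while len(walk) < walk_length and graph[walk[-1]]:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks

toy_graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
walks = random_walks(toy_graph, walk_length=5, walks_per_node=2)
```

Each walk plays the role of a sentence whose words are user nodes.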
  • a node-vector (node2vec) structural feature extraction algorithm is employed to convert the population relationship map into a form of a vector factor.
  • the Node2vec node-vector structure feature extraction algorithm improves the random walk strategy in DeepWalk, achieving a balance between Depth-First Search (DFS) and Breadth-First Search (BFS).
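The balance between breadth-first and depth-first exploration comes from how the next step of a walk is weighted. The sketch below follows the weighting scheme of the published node2vec algorithm, with a return parameter `p` and an in-out parameter `q`; the function names and the toy graph are assumptions for this example.

```python
import random

def transition_weights(graph, prev, cur, p, q):
    """Unnormalised weights for each neighbour of cur, given the walk came from prev."""
    weights = []
    for nxt in graph[cur]:
        if nxt == prev:
            weights.append(1.0 / p)   # step back to the previous node
        elif nxt in graph[prev]:
            weights.append(1.0)       # stay near prev (BFS-like)
        else:
            weights.append(1.0 / q)   # move further away (DFS-like)
    return weights

def node2vec_step(graph, prev, cur, p, q, rng):
    """Sample the next node of a biased walk."""
    weights = transition_weights(graph, prev, cur, p, q)
    return rng.choices(graph[cur], weights=weights, k=1)[0]

toy = {"t": ["v", "x"], "v": ["t", "x", "y"], "x": ["t", "v"], "y": ["v"]}
weights = transition_weights(toy, "t", "v", p=2.0, q=0.5)
next_node = node2vec_step(toy, "t", "v", 2.0, 0.5, random.Random(0))
```

With `q` below 1 the walk is pushed outward, while a large `q` keeps it close to the starting neighbourhood.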
  • the user node in the crowd relationship map can be converted into a form of vector representation, so that the vector expression of the user involved in the current event in the crowd relationship map can be determined as its relationship feature vector.
  • relationship feature vector of the current event related to the user from the crowd relationship map.
  • the dimensions and elements of the obtained relationship feature vectors will be different.
  • the relationship feature vector comprehensively represents the relationship between the user and other users in the crowd relationship network by characterizing the position of the node corresponding to the user in the crowd relationship map and the connection relationship with other nodes.
  • based on the event characteristics acquired in step 21, the user personal characteristics acquired in step 22, and the user relationship characteristics acquired in step 23, in step 24 these features are combined to determine the risk probability of the service request event.
  • determining a first evaluation score of the service request event based on the event feature; determining a second evaluation score of the service request event based on the user personal characteristic; determining a third evaluation score of the service request event based on the user relationship feature
  • the first, second, and third evaluation scores are weighted and summed to determine the risk probability score of the service request event.
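The weighted summation of the three evaluation scores can be sketched as follows. Normalising the weights so they sum to 1 is an assumption made here so that scores in [0, 1] yield a result in [0, 1]; the specification does not state how the weights are chosen.

```python
def combine_scores(scores, weights):
    """Weighted sum of per-aspect scores, with weights normalised to sum to 1."""
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total

# Made-up scores for the event, personal and relationship aspects
risk = combine_scores([0.2, 0.6, 0.9], [1, 1, 2])
```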
  • the manner in which the first, second, and third evaluation scores are determined may be performed by a pre-trained model algorithm and model parameters.
  • both the user personal characteristics and the user relationship features are represented in the form of a vector.
  • the feature vector of the user personal feature and the feature vector of the user relationship feature are first concatenated to obtain a user comprehensive feature. Then, a first evaluation score of the service request event is determined based on the user comprehensive feature, a second evaluation score is determined based on the event feature of the service request event, and finally the risk probability score of the service request event is determined based on the first and second evaluation scores.
  • the manner in which the first and second evaluation scores are determined may be performed by a pre-trained model algorithm and model parameters.
  • an evaluation model is pre-trained that evaluates the risk probability of a business request event based directly on event characteristics, user personal characteristics, and user relationship characteristics. It will be appreciated that the evaluation model is based on training data sets that have been calibrated.
  • the event characteristics of the event and the personal characteristics of the users involved in the event are acquired.
  • the users involved are placed in a crowd to obtain their relationship characteristics in the crowd relationship map, in particular their relationship feature vectors. These data are added to the training data set.
  • using the model algorithm and model parameters, the risk probability of an event is computed from the event characteristics, the user personal characteristics and the user relationship characteristics in the training data set. Then, based on the difference between the computed risk probability and the actual known risk of the event (i.e., the loss function), the model algorithm and model parameters are continuously optimized, thereby training the evaluation model.
  • the above evaluation model can employ a variety of specific model algorithms.
  • the above evaluation model is trained using a Gradient Boosting Decision Tree (GBDT) method.
  • the gradient boosting decision tree (GBDT) method is a supervised ensemble learning method.
  • in ensemble learning, a plurality of learners each learn from the training sample set, and the final model is a combination of these learners.
  • the two main methods of ensemble learning are Bagging and Boosting.
  • in the Boosting algorithm, the learners are trained in sequence and carry different weights. Weights are also assigned to each sample. Initially, all samples have the same weight. After a learner has been trained on the training samples, the weights of misclassified samples are increased and the weights of correctly classified samples are reduced, and the next learner is then trained.
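The sample reweighting described above can be illustrated with the update rule of AdaBoost, one well-known Boosting algorithm. The text does not name AdaBoost, so this is a substituted concrete example, and the function name is hypothetical.

```python
import math

def reweight(weights, correct, error_rate):
    """AdaBoost-style update: raise weights of misclassified samples,
    lower the rest, then renormalise so the weights sum to 1."""
    alpha = 0.5 * math.log((1 - error_rate) / error_rate)  # learner weight
    updated = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated]

# Four equally weighted samples; the learner gets the last one wrong
new_weights = reweight([0.25] * 4, [True, True, True, False], error_rate=0.25)
```

After the update the single misclassified sample carries half of the total weight, so the next learner concentrates on it.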
  • the final prediction is the combination of multiple learner results.
  • a gradient-based method can be used to optimize the model function based on the prediction result; this is called the Gradient Boosting method.
  • each base learner uses the classification and regression tree (CART) algorithm, forming the gradient boosting decision tree (GBDT) model.
  • the classification and regression tree algorithm is a machine learning algorithm based on binary trees.
  • the accuracy and coverage of the model are more effective.
  • a plurality of learners using a classification regression tree can be trained for various features, including event features, user personal features, and user relationship features, thereby forming the above-described evaluation model.
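To make the idea of gradient boosting over tree learners concrete, here is a deliberately small sketch that boosts depth-1 regression trees (stumps) under squared loss, where the negative gradient is simply the residual. It is not the evaluation model of the specification: a production model would typically use deeper trees and a loss suited to probabilities, and all names and data here are illustrative.

```python
def fit_stump(xs, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(xs[0])):
        for t in sorted({x[j] for x in xs}):
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:] if best else None

def predict(model, x):
    """Base value plus the shrunken sum of all stump outputs."""
    base, rate, stumps = model
    return base + rate * sum(lm if x[j] <= t else rm for j, t, lm, rm in stumps)

def fit_gbdt(xs, ys, rounds=20, rate=0.5):
    """Each round fits a stump to the current residuals (negative gradients)."""
    model = (sum(ys) / len(ys), rate, [])
    for _ in range(rounds):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        stump = fit_stump(xs, residuals)
        if stump is None:
            break
        model[2].append(stump)
    return model

# Made-up data: feature 0 separates risky (1) from normal (0) events
xs = [[0.1, 0], [0.2, 1], [0.8, 0], [0.9, 1]]
ys = [0, 0, 1, 1]
model = fit_gbdt(xs, ys)
```

In practice the feature rows would combine event features, user personal features and relationship feature vectors, as the text describes.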
  • the above evaluation model may also be implemented by other algorithms, such as the aforementioned Bagging algorithm in ensemble learning, a learner using other algorithms, and the like.
  • the evaluation model can be directly used to determine the risk probability of the current business request event.
  • the risk probability of the service request event can be comprehensively evaluated, thereby controlling the business execution risk more efficiently and accurately.
  • FIG. 5 shows a schematic block diagram of a risk determining device in accordance with one embodiment.
  • The risk determining apparatus 500 includes an event feature acquiring unit 510 configured to acquire an event feature of a service request event, and a personal feature obtaining unit 520 configured to acquire a user personal feature of at least one user involved in the service request event.
  • The apparatus further includes a relationship feature obtaining unit 530 configured to determine a relationship feature of the at least one user based on a crowd relationship map of a specific crowd, wherein the specific crowd includes the at least one user; and a risk determining unit 540 configured to determine a risk probability of the service request event according to the event feature, the user personal feature of the at least one user, and the relationship feature of the at least one user.
  • The event feature acquired by the event feature acquiring unit 510 includes at least one of the following: a requested service amount, a service registration time, an event occurrence time, a time difference between the service registration time and the event occurrence time, and an event occurrence location.
  • At least one user involved in the service request event includes a requestor of the service request event, and a beneficiary of the service request.
  • the personal characteristics of the user acquired by the personal feature acquisition unit 520 include one or more of the following: a user basic attribute feature, a user behavior feature, and a user location feature.
  • The relationship feature obtaining unit 530 includes: a crowd obtaining module 531 configured to acquire a specific crowd including the at least one user; a map acquiring module 532 configured to acquire a crowd relationship map of the specific crowd; and an obtaining module 533 configured to determine a relationship feature of the at least one user based on the crowd relationship map.
  • The crowd obtaining module 531 is configured to determine, among a plurality of pre-divided user subsets, the user subset to which the at least one user belongs, and to use that user subset as the specific crowd.
  • Alternatively, the crowd obtaining module 531 is configured to add the at least one user to a pre-selected user set, and to use the resulting user set as the specific crowd.
  • The map acquiring module 532 is configured to: acquire a first relationship map constructed for the pre-selected user set; acquire an association relationship between the at least one user and the users in the pre-selected user set; and add the association relationship to the first relationship map to obtain the crowd relationship map of the specific crowd.
  • The crowd relationship map of the specific crowd is established based on one or more of the following relationships: a transaction relationship, a device relationship, a funding relationship, and a social relationship.
  • The relationship feature obtaining unit 530 is configured to convert the relationship map into vector factors using a node-vector network structure feature extraction algorithm, and to determine a relationship feature vector of the at least one user based on the vector factors.
  • the risk determination unit 540 is configured to determine a risk probability of the service request event using a pre-trained evaluation model that is trained based on a gradient boost decision tree algorithm.
  • the event characteristics, the user's personal characteristics and the user relationship characteristics of a service request event are integrated, and the risk probability of the service request event is comprehensively evaluated, thereby controlling the business execution risk more efficiently and accurately.
  • a computer readable storage medium having stored thereon a computer program for causing a computer to perform the method described in connection with FIG. 2 when the computer program is executed in a computer.
  • A computing device comprising a memory and a processor, the memory storing executable code, wherein the processor implements the method described in connection with FIG. 2 when it executes the executable code.
  • the functions described herein can be implemented in hardware, software, firmware, or any combination thereof.
  • the functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
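The Gradient Boost procedure described above for the evaluation model can be illustrated with a minimal sketch. It is not the implementation of the described method: it assumes a logistic loss, depth-1 regression trees ("stumps") as base learners, and a toy two-column feature layout, and all function names are hypothetical. Each round fits a stump to the residuals (the negative gradient of the loss) and adds it to the running score.

```python
import math
from typing import List, Sequence, Tuple

# (feature index, threshold, value if <= threshold, value if > threshold)
Stump = Tuple[int, float, float, float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_stump(stump: Stump, row: Sequence[float]) -> float:
    feature, threshold, left, right = stump
    return left if row[feature] <= threshold else right


def fit_stump(X: List[Sequence[float]], residuals: List[float]) -> Stump:
    """Least-squares fit of a single binary split to the residuals."""
    best = None
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [r for row, r in zip(X, residuals) if row[feature] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[feature] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, (feature, threshold, left_mean, right_mean))
    if best is None:  # no usable split: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, math.inf, mean, mean)
    return best[1]


def fit_gbdt(X, y, rounds: int = 20, learning_rate: float = 0.5):
    """Boosting loop: each new stump is fitted to the current residuals."""
    positive_rate = min(max(sum(y) / len(y), 1e-6), 1.0 - 1e-6)
    base = math.log(positive_rate / (1.0 - positive_rate))
    scores = [base] * len(X)
    stumps: List[Stump] = []
    for _ in range(rounds):
        # Negative gradient of the logistic loss with respect to the score.
        residuals = [label - sigmoid(score) for label, score in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [score + learning_rate * predict_stump(stump, row)
                  for score, row in zip(scores, X)]
    return base, learning_rate, stumps


def risk_probability(model, row: Sequence[float]) -> float:
    base, learning_rate, stumps = model
    return sigmoid(base + learning_rate * sum(predict_stump(s, row) for s in stumps))


# Toy data (assumed): [normalised requested amount, relationship score]; 1 = risky.
X = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.3, 0.2], [0.7, 0.7]]
y = [0, 0, 1, 1, 0, 1]
model = fit_gbdt(X, y)
```

In practice a library implementation with deeper trees would normally be used; the sketch only shows how successive learners are combined into one risk probability.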

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method and an apparatus for determining the risk probability of a service request event, the method comprising: acquiring event characteristics of a service request event (21); acquiring user personal characteristics of the user to which the service request event relates (22); on the basis of a population relationship graph based on a specific population, determining user relationship characteristics (23); and, on the basis of the event characteristics, the user personal characteristics, and the user relationship characteristics, determining the risk probability of the service request event (24). Thus, the risk of the service request event can be comprehensively evaluated.

Description

Method and apparatus for determining risk probability of a service request event
Technical field
One or more embodiments of this specification relate to the field of computer technology, and in particular to a method and apparatus for determining, by a computer, the risk probability of a service request event.
Background
With the development of computer and Internet technologies, more and more services are carried out through computing platforms, such as commodity transactions, payments, financial lending, and insurance claims. However, in executing and processing many of these services, failing to review the background of the service requester and the requested service is likely to bring considerable risk. For example, criminals may use electronic platforms to commit financial fraud, cash out loans, or defraud insurers.
In conventional approaches, risk review is often performed manually to prevent and reduce these risks. Some platforms also set simple rules to assist manual judgment. However, this approach is inefficient and struggles to keep up with rapid business growth. Moreover, the accuracy of identifying high-risk users and high-risk events depends on the personal experience of the reviewing staff, and differences in experience among reviewers introduce operational risk, so review accuracy is hard to guarantee and cases are often missed.
An improved solution is therefore desired that reduces service execution risk by determining the risk probability of a service request event efficiently and accurately.
Summary of the invention
One or more embodiments of this specification describe a method and apparatus for efficiently determining the risk probability of a service request event.
According to a first aspect, a method for determining the risk probability of a service request event is provided, comprising:
acquiring an event feature of the service request event;
acquiring a user personal feature of at least one user involved in the service request event;
determining a relationship feature of the at least one user based on a crowd relationship map of a specific crowd, wherein the specific crowd includes the at least one user; and
determining the risk probability of the service request event according to the event feature, the user personal feature of the at least one user, and the relationship feature of the at least one user.
In one embodiment, the event feature includes at least one of the following: a requested service amount, a service registration time, an event occurrence time, a time difference between the service registration time and the event occurrence time, and an event occurrence location.
In one embodiment, the at least one user includes the requester of the service request event and the beneficiary of the service request.
In one embodiment, the user personal feature includes one or more of the following: a user basic attribute feature, a user behavior feature, and a user location feature.
According to one implementation, determining the relationship feature vector of the at least one user specifically includes: acquiring the specific crowd that includes the at least one user; acquiring the crowd relationship map of the specific crowd; and determining the relationship feature of the at least one user based on the crowd relationship map.
In one embodiment, acquiring the specific crowd includes: determining, among a plurality of pre-divided user subsets, the user subset to which the at least one user belongs, and using that user subset as the specific crowd; or adding the at least one user to a pre-selected user set, and using the resulting user set as the specific crowd.
In one embodiment, acquiring the crowd relationship map of the specific crowd further includes: acquiring a first relationship map constructed for the pre-selected user set; acquiring an association relationship between the at least one user and the users in the pre-selected user set; and adding the association relationship to the first relationship map to obtain the crowd relationship map of the specific crowd.
According to one implementation, the crowd relationship map of the specific crowd is established based on one or more of the following relationships: a transaction relationship, a device relationship, a funding relationship, and a social relationship.
In one embodiment, determining the relationship feature of a user includes converting the relationship map into vector factors using a node-vector network structure feature extraction algorithm, and determining the relationship feature vector of the user based on the vector factors.
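The text names only a "node-vector network structure feature extraction algorithm" and does not give its details. As one possible reading, the sketch below assumes a random-walk embedding approach in the style of DeepWalk or node2vec and shows only its first stage: generating uniform random walks over the relationship map. The walks would then be passed to a skip-gram style model to learn one vector per user, which is omitted here. The function name, parameters, and toy graph are hypothetical.

```python
import random
from typing import Dict, List

# Relationship map as an adjacency list: user -> list of connected users.
Graph = Dict[str, List[str]]


def random_walks(graph: Graph, walk_length: int, walks_per_node: int,
                 seed: int = 0) -> List[List[str]]:
    """Generate uniform random walks starting from every node of the graph."""
    rng = random.Random(seed)  # fixed seed so the walks are reproducible
    walks: List[List[str]] = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = graph[walk[-1]]
                if not neighbours:  # isolated node: the walk cannot continue
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy relationship map (assumed): user D has no connections.
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": []}
walks = random_walks(graph, walk_length=4, walks_per_node=2)
```

Users that appear in similar walk contexts end up with similar vectors once the embedding stage is applied, which is what lets the structure of the relationship map be used as a numeric feature.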
In one embodiment, the risk probability of the service request event is determined using a pre-trained evaluation model, the evaluation model being trained based on a gradient boosting decision tree algorithm.
According to a second aspect, an apparatus for determining the risk probability of a service request event is provided, comprising:
an event feature acquiring unit configured to acquire an event feature of the service request event;
a personal feature obtaining unit configured to acquire a user personal feature of at least one user involved in the service request event;
a relationship feature obtaining unit configured to determine a relationship feature of the at least one user based on a crowd relationship map of a specific crowd, wherein the specific crowd includes the at least one user; and
a risk determining unit configured to determine the risk probability of the service request event according to the event feature, the user personal feature of the at least one user, and the relationship feature of the at least one user.
According to a third aspect, a computer readable storage medium is provided, having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
According to a fourth aspect, a computing device is provided, comprising a memory and a processor, wherein the memory stores executable code and the processor implements the method of the first aspect when it executes the executable code.
With the method and apparatus provided by the embodiments of this specification, the risk probability of a service request event is determined comprehensively from the event features of the service request event, the user personal features of the users involved, and the relationship features of those users, making risk determination more efficient and accurate.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation scenario of one embodiment disclosed in this specification;
FIG. 2 shows a flowchart of a method for determining the risk probability of a service request event according to one embodiment;
FIG. 3 shows the flow of steps for determining the relationship features of relevant users according to one embodiment;
FIG. 4 shows an example of a crowd relationship map according to one embodiment;
FIG. 5 shows a schematic block diagram of a risk determining apparatus according to one embodiment.
Detailed description
The solutions provided in this specification are described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of an implementation scenario of one embodiment disclosed in this specification. In this scenario, a computing platform performs the risk review of service request events. A user can submit a service request event to the computing platform, for example applying for a loan or filing an insurance claim. On receiving such a request, the computing platform gathers several kinds of information in order to evaluate the risk probability of the event comprehensively. This information includes the event information of the service request event and the user personal features of the users involved in it. In addition, the computing platform places the users involved in the event into a specific crowd to obtain their relationship features in a crowd relationship map. On this basis, the risk probability of the service request event is evaluated comprehensively according to the event features, the user personal features, and the user relationship features. The specific execution process is described below.
FIG. 2 shows a flowchart of a method for determining the risk probability of a service request event according to one embodiment. The method may be executed by any system, device, apparatus, platform, or server with computing and processing capabilities, such as the computing platform shown in FIG. 1, and more specifically by any back-end server that needs to analyze and control service risk, such as an Alipay server, an insurance service server, or a financial approval server. As shown in FIG. 2, the method includes the following steps: step 21, acquiring an event feature of the service request event; step 22, acquiring a user personal feature of at least one user involved in the service request event; step 23, determining a relationship feature of the at least one user based on a crowd relationship map of a specific crowd, wherein the specific crowd includes the at least one user; and step 24, determining the risk probability of the service request event according to the event feature, the user personal feature of the at least one user, and the relationship feature of the at least one user. The manner in which each of these steps is performed is described below.
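The way steps 21 to 24 fit together can be outlined in a few lines. This is only an illustrative sketch: the function names, the flat feature-vector layout, and the stand-in `toy_model` are assumptions and are not part of the described method, which uses a pre-trained evaluation model in step 24.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def assemble_features(event_vec: Vector,
                      personal_vecs: Sequence[Vector],
                      relation_vecs: Sequence[Vector]) -> List[float]:
    """Concatenate the event features with each user's personal and relationship features."""
    features = list(event_vec)
    for personal, relation in zip(personal_vecs, relation_vecs):
        features.extend(personal)
        features.extend(relation)
    return features


def determine_risk_probability(event_vec: Vector,
                               personal_vecs: Sequence[Vector],
                               relation_vecs: Sequence[Vector],
                               model: Callable[[List[float]], float]) -> float:
    """Step 24: feed the combined features to an evaluation model."""
    return model(assemble_features(event_vec, personal_vecs, relation_vecs))


def toy_model(features: List[float]) -> float:
    # Stand-in for a trained evaluation model: logistic function of the feature mean.
    return 1.0 / (1.0 + math.exp(-sum(features) / len(features)))


# One event, two involved users (for example requester and beneficiary).
combined = assemble_features([1.0], [[2.0], [3.0]], [[4.0, 5.0], [6.0, 7.0]])
probability = determine_risk_probability(
    [1.0], [[2.0], [3.0]], [[4.0, 5.0], [6.0, 7.0]], toy_model)
```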
First, in step 21, the event features of the service request event to be evaluated are acquired. The service request event to be evaluated may be a request for any service that may carry risk, for example a loan application, a credit service application, or an insurance claim. Accordingly, the event features related to the service request event may include one or more of the following: the requested service type, the requested amount, the time of the request, the service registration time, the time difference between the registration time and the request time, and the location of the event. In one specific example, the service request event is an insurance claim, and the event features may include: the type of insurance claimed, the claimed amount, the claim time, the policy purchase time, the time difference between the policy purchase time and the claim time, and the location of the event. In another example, the service request event is a loan application, and the event features may include: the requested amount, the application time, the registration time, the time difference between the registration time and the application time, and the location of the event.
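As an illustration of step 21, the event features listed above can be derived from a raw request record. The record layout, field names, and the choice of days as the unit for the time difference are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Union


@dataclass
class ServiceRequestEvent:
    service_type: str          # e.g. "insurance_claim" or "loan"
    requested_amount: float
    registered_at: datetime    # service registration (or policy purchase) time
    requested_at: datetime     # time the request was made
    location: str


def event_features(event: ServiceRequestEvent) -> Dict[str, Union[str, float, int]]:
    """Derive the event features, including the registration-to-request time gap."""
    gap = event.requested_at - event.registered_at
    return {
        "service_type": event.service_type,
        "requested_amount": event.requested_amount,
        "request_hour": event.requested_at.hour,
        "days_since_registration": gap.total_seconds() / 86400.0,
        "location": event.location,
    }


claim = ServiceRequestEvent(
    service_type="insurance_claim",
    requested_amount=5000.0,
    registered_at=datetime(2018, 1, 1, 9, 0),
    requested_at=datetime(2018, 1, 4, 21, 0),
    location="Hangzhou",
)
features = event_features(claim)
```

A very short gap between registration and request is the kind of signal such a feature is meant to expose to the evaluation model.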
In addition, in step 22, the user personal features of the relevant users involved in the service request event are acquired. In one embodiment, the relevant user is the service requester. In another embodiment, the relevant users also include stakeholders other than the requester. For example, for a loan application, the relevant users may include a guarantor in addition to the loan applicant; for an insurance claim, the relevant users may include the insurance beneficiary in addition to the claimant. A service request event may therefore involve multiple relevant users. In step 22, the user personal features of each relevant user are acquired.
In one embodiment, the user personal features include user basic attribute features, such as gender, age, length of registration, and contact information.
In one embodiment, the user personal features include user behavior features. More specifically, the user behavior features may include behavior information related to the user's historical service operations, such as the number of transactions, the average transaction amount, the number of claims filed, the number of claims approved, and the average claim amount.
In one embodiment, the user personal features also include user location features, such as the locations where historical service operations took place and the range over which the location changes.
In further embodiments, the user personal features may cover additional aspects. User personal features are features that depend only on an individual user and characterize that user's own attributes and operating habits. According to the embodiments of this specification, in addition to acquiring the personal features of an individual user, the user is placed in a crowd so that the user's relationship features in the crowd relationship network can be extracted, allowing a more comprehensive analysis and evaluation based on those relationship features.
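The user behavior features mentioned above are simple aggregates over a user's history. The sketch below assumes a hypothetical record format (a `kind` of `"transaction"` or `"claim"`, an `amount`, and an `approved` flag for claims) and computes the average claim amount over all filed claims, which is one of several possible readings.

```python
from typing import Dict, List


def behaviour_features(history: List[dict]) -> Dict[str, float]:
    """Aggregate a user's historical operations into behavior features."""
    transactions = [r for r in history if r["kind"] == "transaction"]
    claims = [r for r in history if r["kind"] == "claim"]
    approved = [r for r in claims if r.get("approved")]

    def mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else 0.0

    return {
        "transaction_count": len(transactions),
        "avg_transaction_amount": mean([r["amount"] for r in transactions]),
        "claim_count": len(claims),
        "approved_claim_count": len(approved),
        "avg_claim_amount": mean([r["amount"] for r in claims]),
    }


history = [
    {"kind": "transaction", "amount": 100.0},
    {"kind": "transaction", "amount": 300.0},
    {"kind": "claim", "amount": 1000.0, "approved": True},
    {"kind": "claim", "amount": 500.0, "approved": False},
]
behaviour = behaviour_features(history)
```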
Then, in step 23, for each relevant user mentioned in step 22, the relationship feature of the user is determined based on the crowd relationship map of a specific crowd, where the specific crowd includes the relevant users. FIG. 3 shows the flow of steps for determining the relationship features of the relevant users according to one embodiment, that is, the sub-steps of step 23. As shown in FIG. 3, to determine the relationship features of the relevant users, a specific crowd that includes the relevant users is first acquired in step 31.
In one embodiment, a sufficiently large user set is determined in advance so that it contains the relevant users of the service request event to be evaluated, and this user set is used as the specific crowd. For example, when the service request event is an insurance claim, the set of all insured persons may be used as the specific crowd.
In one embodiment, the full set of users is divided into a plurality of user subsets according to certain user features. In step 31, the user subset to which the relevant users of the service request event belong is determined, and that user subset is used as the specific crowd.
In one embodiment, users with a certain similarity or association are selected in advance to form a user set. For example, when the service request event is an insurance claim, all users who have ever filed a claim may be pre-selected to form a user set. Then, in step 31, it is determined whether the relevant users of the current event are in this user set; any that are not are added to it, and the resulting user set is used as the specific crowd.
The specific crowd may also be acquired in other ways, as long as it contains the relevant users to be analyzed.
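Two of the ways of acquiring the specific crowd in step 31 can be sketched with plain sets. The function names are hypothetical, and taking the union of subsets when the relevant users fall into different subsets is an assumption; the text does not say how that case is handled.

```python
from typing import Dict, Iterable, Set


def crowd_from_subsets(subsets: Dict[str, Set[str]], users: Iterable[str]) -> Set[str]:
    """Use the pre-divided subset(s) that the relevant users belong to."""
    wanted = set(users)
    crowd: Set[str] = set()
    for members in subsets.values():
        if members & wanted:
            crowd |= members  # assumption: union if the users span several subsets
    return crowd


def crowd_from_preselected(preselected: Set[str], users: Iterable[str]) -> Set[str]:
    """Add any relevant users that are missing from the pre-selected user set."""
    return set(preselected) | set(users)


subsets = {"north": {"u1", "u2"}, "south": {"u3"}}
crowd_a = crowd_from_subsets(subsets, ["u2"])
crowd_b = crowd_from_preselected({"u1", "u3"}, ["u3", "u9"])
```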
Next, in step 32, the crowd relationship map of the specific crowd is acquired.
In one embodiment, this step includes constructing a new crowd relationship map for the specific crowd.
In another embodiment, the specific crowd is selected from a predetermined user set for which the system has already built a crowd relationship map. For example, as in the examples above, the specific crowd may be the full set of users or a user subset obtained by dividing the full set, and the system may have built a crowd relationship map for the full set of users or for each user subset in advance. In this case, in step 32, the pre-built crowd relationship map can be acquired directly, or the portion related to the specific crowd can be extracted from a pre-built crowd relationship map covering a wider range of users and used as the crowd relationship map of the specific crowd.
In another embodiment, the specific crowd is formed by adding the relevant users to a pre-selected user set. If the system has already built a crowd relationship map for the pre-selected user set, step 32 may include: first acquiring the relationship map built for the pre-selected user set; acquiring the association relationships between the relevant users and the users in the pre-selected user set; and then adding these association relationships to the relationship map to obtain the crowd relationship map of the specific crowd.
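The last variant, extending a pre-built relationship map with the associations of newly added users, can be sketched as follows. The adjacency-set representation and the function name are assumptions; the pre-built map is copied so that it can be reused for other events.

```python
from typing import Dict, Iterable, Set, Tuple

RelationMap = Dict[str, Set[str]]  # user -> set of associated users


def extend_relation_map(first_map: RelationMap,
                        associations: Iterable[Tuple[str, str]]) -> RelationMap:
    """Return a copy of the pre-built map with the new associations added."""
    crowd_map = {user: set(neighbours) for user, neighbours in first_map.items()}
    for a, b in associations:
        crowd_map.setdefault(a, set()).add(b)
        crowd_map.setdefault(b, set()).add(a)
    return crowd_map


first_map = {"u1": {"u2"}, "u2": {"u1"}}
# Relevant user u9 is not in the pre-selected set but is associated with u2.
crowd_map = extend_relation_map(first_map, [("u9", "u2")])
```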
Whether built in advance or constructed on the spot, the crowd relationship map can be based on several kinds of relationships.
In one embodiment, the crowd relationship map is established based on transaction relationships within the crowd. For example, if two users complete a purchase transaction with each other, a transaction association is established between them. The transaction relationships between users can be determined by acquiring and analyzing the transaction records of a large number of users, and the crowd relationship map is built from them.
In one embodiment, the crowd relationship map is established based on device relationships within the crowd. For example, when two or more user accounts log in from the same terminal device, a device association can be determined to exist between those accounts. Accounts with a device association may be multiple accounts registered by the same real-world user, or accounts of multiple closely associated users (for example family members or colleagues). Device relationships can be determined by acquiring information about the physical terminal used when a user logs in to an account.
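A device relationship of this kind can be derived from login records by grouping accounts per device. The `(account, device_id)` record format and the function name are assumptions made for this sketch.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, Set, Tuple


def device_edges(logins: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Link every pair of accounts that has logged in from the same device."""
    accounts_by_device = defaultdict(set)
    for account, device_id in logins:
        accounts_by_device[device_id].add(account)

    edges: Set[Tuple[str, str]] = set()
    for accounts in accounts_by_device.values():
        # Sorting gives each undirected edge a single canonical orientation.
        for a, b in combinations(sorted(accounts), 2):
            edges.add((a, b))
    return edges


logins = [
    ("alice", "phone-1"),
    ("bob", "phone-1"),
    ("carol", "phone-2"),
    ("alice", "phone-2"),
    ("bob", "phone-1"),  # repeated logins do not create duplicate edges
]
edges = device_edges(logins)
```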
In one embodiment, the crowd relationship map is established based on funding relationships. For example, when a fund transfer operation such as a transfer or a payment collection takes place between two users, a funding association is established between them. The funding relationships between users can be determined by acquiring and analyzing records of the fund operations users perform with their electronic wallets, and the crowd relationship map is then built from these funding relationships.
In one embodiment, the crowd relationship map is established based on social relationships. People increasingly interact through social applications. For example, if two users chat, send red envelopes, or transfer files to each other through a social application, a social association can be established between them. The social relationships within the crowd can be determined from the large volume of social interactions captured by the social application, and the crowd relationship map is built from them.
Although several examples are given above, the crowd relationship map can also be established based on further kinds of association relationships, and it can be based on several kinds of association relationships at the same time.
In one embodiment, the crowd relationship map takes the form of a node network. In this form, the map includes a plurality of nodes, each node corresponds to one user, and nodes that have an association are connected to each other. In one embodiment, a connection between nodes may have several attributes, such as a connection type and a connection strength. Connection types include, for example, fund connections (based on fund relationships) and social connections (based on social interactions); connection strengths include, for example, strong connections and weak connections.
FIG. 4 shows an example of a crowd relationship map according to one embodiment. As shown in FIG. 4, the crowd relationship map in this example includes a plurality of nodes, and each node corresponds to one user. A connection between nodes indicates that the corresponding users have an association. Assume that the crowd relationship map of FIG. 4 is established based on both the fund relationships and the social relationships of the crowd. Accordingly, a connection between nodes can be a fund connection or a social connection. In the example of FIG. 4, different connection types are drawn with different line styles: social connections between nodes are drawn as dashed lines, and fund connections between nodes are drawn as solid lines. The thickness of a line indicates the strength of the connection: a thick line indicates a strong connection and a thin line indicates a weak connection. More specifically, a thick solid line may indicate a stronger fund connection (for example, fund interactions exceeding an amount threshold such as 10,000 yuan), and a thin solid line a weaker fund connection (for example, fund interactions not exceeding that amount threshold). A thick dashed line may indicate a stronger social connection (for example, an interaction frequency exceeding a frequency threshold such as 10 times per day), and a thin dashed line a weaker social connection (for example, an interaction frequency not exceeding that frequency threshold).
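The node network of FIG. 4 can be sketched in code as follows. This is a minimal illustration only, not the implementation of the embodiment: the names `Connection`, `classify_fund`, `classify_social`, and `add_connection` are hypothetical, and the two thresholds are simply the example values mentioned above.

```python
from dataclasses import dataclass

AMOUNT_THRESHOLD = 10_000   # yuan; example value from the description
FREQUENCY_THRESHOLD = 10    # interactions per day; example value from the description


@dataclass(frozen=True)
class Connection:
    kind: str      # "fund" or "social"
    strength: str  # "strong" or "weak"


def classify_fund(amount):
    # A fund connection is strong only when the amount exceeds the threshold.
    return Connection("fund", "strong" if amount > AMOUNT_THRESHOLD else "weak")


def classify_social(times_per_day):
    # A social connection is strong only when the frequency exceeds the threshold.
    return Connection("social", "strong" if times_per_day > FREQUENCY_THRESHOLD else "weak")


def add_connection(graph, user_a, user_b, connection):
    # Undirected graph: store the connection under both endpoints.
    graph.setdefault(user_a, {}).setdefault(user_b, set()).add(connection)
    graph.setdefault(user_b, {}).setdefault(user_a, set()).add(connection)


graph = {}
add_connection(graph, "A", "B", classify_fund(25_000))
add_connection(graph, "A", "C", classify_social(3))
add_connection(graph, "B", "C", classify_fund(10_000))
```

Storing a set of connections per node pair allows two users to be linked by a fund connection and a social connection at the same time, which matches a map built from several kinds of association.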
It will be appreciated that, in further embodiments, the crowd relationship map may also be represented in other forms, such as tables or diagrams.
Returning to FIG. 3, once the crowd relationship map constructed for the specific crowd has been obtained, step 33 determines, based on that crowd relationship map, the relationship features of the relevant users involved in the current event.
As mentioned above, users that have an association are connected to each other in the crowd relationship map. Accordingly, in one embodiment, the features of the connections relating to a given user can be extracted from the crowd relationship map, for example the number of connections, the types of the connections, the strengths of the connections, and the other users to which the user is connected. These connection features are used as the relationship features of that user.
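The extraction of connection features described above might look like the following sketch. The edge list `EDGES`, the function `connection_features`, and the returned field names are illustrative assumptions and do not appear in the original text.

```python
from collections import Counter

# Hypothetical edge list: (user_a, user_b, connection type, connection strength).
EDGES = [
    ("A", "B", "fund", "strong"),
    ("A", "C", "social", "weak"),
    ("B", "C", "fund", "weak"),
    ("A", "D", "social", "strong"),
]


def connection_features(user, edges):
    # Keep only the connections that touch the given user.
    own = [edge for edge in edges if user in (edge[0], edge[1])]
    # The other endpoint of each connection is a neighbor of the user.
    neighbors = sorted({b if a == user else a for a, b, _, _ in own})
    return {
        "num_connections": len(own),
        "by_type": dict(Counter(kind for _, _, kind, _ in own)),
        "by_strength": dict(Counter(strength for _, _, _, strength in own)),
        "neighbors": neighbors,
    }


features_a = connection_features("A", EDGES)
```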
In another embodiment, the crowd relationship map is analyzed and characterized with the aid of machine learning. The crowd relationship map can be understood as a network that contains a certain number of nodes (corresponding to users) and the connections between the nodes (the associations between users). Compared with text and images, network information is harder to structure into standard data and is therefore harder to use in machine learning. Recently, several network representation learning algorithms have been proposed to characterize and analyze network structure. The goal of these algorithms is to represent semantically related nodes of the network as low-dimensional, dense, real-valued vectors. This simplifies computation and storage, removes the need for manual feature extraction, and allows heterogeneous information to be projected into the same low-dimensional space for downstream computation.
According to a network representation learning algorithm, the network is embedded into a geometric space and the spatial coordinates of each node are treated as the features of that node, which can then be fed into a neural network for learning and training. Accordingly, the crowd relationship map can be mapped into a geometric space, and the spatial coordinates of each user node are computed and used as the relationship feature vector of that user. Various algorithms can be used to compute the spatial coordinates of the network nodes.
In one embodiment, the DeepWalk algorithm is used to determine a vector representation of each node in the network corresponding to the crowd relationship map. According to the DeepWalk algorithm, a large number of random walkers are released on the network, and within a given time each walker traces out a sequence of nodes. If each node is treated as a word, each generated sequence forms a sentence, so that a "language" made up of node sequences is obtained. A word-to-vector (Word2Vec) algorithm is then applied to compute the vector representation of each node "word".
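The walk-generation stage of DeepWalk can be sketched as below. This is an illustration under stated assumptions rather than the embodiment's implementation: the adjacency dictionary `ADJACENCY`, the function names, and the parameters `walks_per_node` and `length` are hypothetical. The subsequent Word2Vec training step, which would take `corpus` as its sentences, is omitted.

```python
import random


def random_walk(adjacency, start, length, rng):
    # Follow uniformly chosen neighbors until the walk reaches the target length.
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        walk.append(rng.choice(neighbors))
    return walk


def build_corpus(adjacency, walks_per_node, length, seed=0):
    # Each walk plays the role of a "sentence" whose "words" are node identifiers.
    rng = random.Random(seed)
    corpus = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)
        for node in nodes:
            corpus.append(random_walk(adjacency, node, length, rng))
    return corpus


ADJACENCY = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
corpus = build_corpus(ADJACENCY, walks_per_node=3, length=5)
```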
In one embodiment, the node-to-vector (node2vec) structural feature extraction algorithm is used to convert the crowd relationship map into vector form. The node2vec algorithm improves the random walk strategy of DeepWalk by striking a balance between depth-first search (DFS) and breadth-first search (BFS), so that both local and global information is taken into account and the generated vectors are improved. In this way, the user nodes in the crowd relationship map are converted into vector representations, and the vector representation of each user involved in the current event can be determined and used as that user's relationship feature vector.
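The biased walk that distinguishes node2vec from DeepWalk can be sketched as follows. The return parameter `p` and the in-out parameter `q` come from the published node2vec algorithm and are not named in the text above; the function name and the example graph are hypothetical.

```python
import random


def node2vec_walk(adjacency, start, length, p, q, rng):
    # Second-order random walk: the next step depends on the previous node.
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = adjacency[current]
        if not neighbors:
            break
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))  # first step is unbiased
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # step back to the previous node
            elif candidate in adjacency[previous]:
                weights.append(1.0)       # stay near the previous node (BFS-like)
            else:
                weights.append(1.0 / q)   # move farther away (DFS-like)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


ADJACENCY = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
walk = node2vec_walk(ADJACENCY, "A", length=8, p=4.0, q=0.5, rng=random.Random(1))
```

A value of `q` below 1 makes outward, DFS-like moves more likely, while a value above 1 keeps the walk close to its previous node in a BFS-like manner; setting `p = q = 1` reduces the walk to the uniform DeepWalk case.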
In other embodiments, further methods may be used to obtain, from the crowd relationship map, the relationship feature vectors of the users involved in the current event. Depending on how the crowd relationship map is constructed and represented, the dimensions and elements of the resulting relationship feature vector will differ. It will be appreciated, however, that the relationship feature vector characterizes the position of the user's node in the crowd relationship map and its connections with other nodes, and thereby comprehensively characterizes the user's associations with other users in the crowd relationship network.
Based on the event features acquired in step 21, the user personal features acquired in step 22, and the user relationship features acquired in step 23 as described above, step 24 combines these features to determine the risk probability of the service request event.
In one specific embodiment, a first evaluation score of the service request event is determined based on the event features; a second evaluation score of the service request event is determined based on the user personal features; a third evaluation score of the service request event is determined based on the user relationship features; finally, the first, second, and third evaluation scores are combined in a weighted sum to determine the risk probability score of the service request event. The first, second, and third evaluation scores can be determined using pre-trained model algorithms and model parameters.
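The weighted summation of the three evaluation scores can be written as a short sketch. The function name and the weight values are placeholders chosen for illustration; the text does not specify how the weights are set.

```python
def weighted_risk_score(event_score, personal_score, relation_score,
                        weights=(0.4, 0.3, 0.3)):
    # Hypothetical weights; in practice they would be tuned or learned.
    w_event, w_personal, w_relation = weights
    return (w_event * event_score
            + w_personal * personal_score
            + w_relation * relation_score)


score = weighted_risk_score(0.9, 0.2, 0.7)
```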
In another specific embodiment, the user personal features and the user relationship features are both represented as vectors. In step 24, the feature vector of the user personal features and the feature vector of the user relationship features are first concatenated to obtain a combined user feature. A first evaluation score of the service request event is then determined based on the combined user feature, a second evaluation score is determined based on the event features of the service request event, and finally the risk probability score of the service request event is determined based on the first and second evaluation scores. The first and second evaluation scores can be determined using pre-trained model algorithms and model parameters.
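The concatenation step can be illustrated as follows. The text does not specify the scoring models, so the logistic scorer, its weights, and the averaging of the two scores are all assumptions made purely for illustration.

```python
import math


def concatenate(personal_vector, relation_vector):
    # Combined user feature: personal features followed by relationship features.
    return list(personal_vector) + list(relation_vector)


def logistic_score(features, weights, bias=0.0):
    # Hypothetical stand-in for a pre-trained scoring model.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


combined = concatenate([0.2, 0.5], [0.1, -0.3, 0.8])
first_score = logistic_score(combined, weights=[0.5, -0.2, 1.0, 0.3, 0.7])
second_score = logistic_score([1.2, 0.4], weights=[0.6, -0.1])
# Assumption: the two scores are combined by a simple average.
risk_score = 0.5 * (first_score + second_score)
```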
In another embodiment, an evaluation model is trained in advance, and this model evaluates the risk probability of a service request event directly from the event features, the user personal features, and the user relationship features. It will be appreciated that the evaluation model is trained on a labeled training data set. In practice, for service request events whose risk probability is known, for example negative-sample events that manual review has determined to be fraudulent insurance claims, or positive-sample events that manual review has determined to be normal claims, the event features of the event and the user personal features of the users involved in the event are obtained. In addition, the involved users are placed into a crowd, and their relationship features in the crowd relationship map, in particular their relationship feature vectors, are obtained. These data are added to the training data set. A given model algorithm with given model parameters is then used to compute the risk probability of an event from the event features, user personal features, and user relationship features in the training data set. Based on a comparison between the computed risk probability and the known risk probability of the event (that is, a loss function), the model algorithm and model parameters are iteratively optimized, and the evaluation model is thereby obtained by training.
The evaluation model can use any of a variety of specific model algorithms. In one embodiment, the evaluation model is trained using the gradient boosting decision tree (GBDT) method.
As is known to those skilled in the art, the gradient boosting decision tree (GBDT) method is a supervised ensemble learning method. In ensemble learning, several learners each learn from the training sample set, and the final model combines these learners. The two main ensemble learning approaches are bagging and boosting. Under a boosting algorithm, the learners are applied in sequence and carry different weights, and a weight is also assigned to each sample. Initially, all samples have equal weights. After a learner has learned from the training samples, the weights of the misclassified samples are increased and the weights of the correctly classified samples are decreased, and the next learner then learns from the reweighted samples. The final prediction combines the results of the learners. On this basis, the model function can be optimized from the prediction results by passing gradients; this approach is called the gradient boosting method.
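The sample reweighting step of boosting can be illustrated with an AdaBoost-style update. AdaBoost is used here only as a well-known concrete instance of the reweighting described above; the text does not name it, and the function `reweight` is hypothetical.

```python
import math


def reweight(weights, correct):
    # Weighted error of the current learner; assumes 0 < error < 0.5.
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    # Learner weight: larger when the learner makes fewer weighted errors.
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Shrink the weights of correct samples and grow those of misclassified ones.
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha


initial = [0.25, 0.25, 0.25, 0.25]          # equal weights to start with
outcome = [True, True, True, False]         # the last sample was misclassified
new_weights, alpha = reweight(initial, outcome)
```

After the update, the single misclassified sample carries half of the total weight, so the next learner concentrates on it.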
Within the gradient boosting framework, using the classification and regression tree (CART) algorithm for each base learner yields the gradient boosting decision tree (GBDT) model. The classification and regression tree algorithm is a machine learning algorithm based on binary trees. Because the GBDT algorithm integrates a plurality of such classification and regression trees as learners, the accuracy and coverage of the model are improved.
More specifically, according to the GBDT algorithm, a plurality of learners based on classification and regression trees can be trained on the various features, including the event features, the user personal features, and the user relationship features, to form the evaluation model.
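A heavily simplified sketch of gradient boosting is given below. It makes several assumptions that are not in the text: each base learner is a depth-one regression tree (a stump) rather than a full classification and regression tree, the loss is squared error so that each tree is fitted to residuals, label 1 denotes a risky event, and the three-column feature rows are invented toy data. The output is a score, not a calibrated probability.

```python
def fit_stump(rows, residuals):
    # Search every feature and threshold for the split with the lowest squared error.
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("no valid split: all feature values are identical")
    return best[1:]


def stump_predict(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_gbdt(rows, labels, rounds=20, learning_rate=0.5):
    base = sum(labels) / len(labels)
    predictions = [base] * len(labels)
    stumps = []
    for _ in range(rounds):
        # For squared loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(labels, predictions)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, row)
                       for p, row in zip(predictions, rows)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Toy rows: [event feature, user personal feature, user relationship feature].
ROWS = [[0.9, 0.2, 0.8], [0.8, 0.1, 0.9], [0.2, 0.7, 0.1],
        [0.1, 0.9, 0.2], [0.7, 0.3, 0.7], [0.3, 0.8, 0.3]]
LABELS = [1, 1, 0, 0, 1, 0]  # assumption: 1 marks a risky event
model = fit_gbdt(ROWS, LABELS)
```

A production model for classification would more typically use a logistic loss and deeper trees; the sketch only shows how successive trees correct the remaining error of the ensemble.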
In other embodiments, the evaluation model can also be trained with other algorithms, for example the bagging approach to ensemble learning mentioned above, or learners based on other algorithms.
After the evaluation model has been trained, step 24 can use the evaluation model directly to determine the risk probability of the current service request event.
In this way, by combining the event features, user personal features, and user relationship features of a service request event, the risk probability of the service request event can be evaluated comprehensively, so that service execution risk is controlled more efficiently and accurately.
According to an embodiment of another aspect, an apparatus for determining a risk probability of a service request event is also provided. FIG. 5 shows a schematic block diagram of a risk determining apparatus according to one embodiment. As shown in FIG. 5, the risk determining apparatus 500 includes: an event feature acquiring unit 510, configured to acquire event features of a service request event; a personal feature acquiring unit 520, configured to acquire user personal features of at least one user involved in the service request event; a relationship feature acquiring unit 530, configured to determine relationship features of the at least one user based on a crowd relationship map of a specific crowd, wherein the specific crowd includes the at least one user; and a risk determining unit 540, configured to determine the risk probability of the service request event according to the event features, the user personal features of the at least one user, and the relationship features of the at least one user.
In one embodiment, the event features acquired by the event feature acquiring unit 510 include at least one of the following: a requested service amount, a service registration time, an event occurrence time, a time difference between the service registration time and the event occurrence time, and an event occurrence location.
According to one embodiment, the at least one user involved in the service request event includes the requester of the service request event and the beneficiary of the service request.
In one embodiment, the user personal features acquired by the personal feature acquiring unit 520 include one or more of the following: user basic attribute features, user behavior features, and user location features.
According to one implementation, the relationship feature acquiring unit 530 includes: a crowd acquiring module 531, configured to acquire a specific crowd that includes the at least one user; a map acquiring module 532, configured to acquire the crowd relationship map of the specific crowd; and a feature acquiring module 533, configured to determine the relationship features of the at least one user based on the crowd relationship map.
In one embodiment, the crowd acquiring module 531 is configured to determine, among a plurality of pre-divided user subsets, the user subset to which the at least one user belongs, and to use that user subset as the specific crowd.
In another embodiment, the crowd acquiring module 531 is configured to add the at least one user to a pre-selected user set, and to use the resulting user set as the specific crowd.
Further, in one embodiment, the map acquiring module 532 is configured to: acquire a first relationship map constructed for the pre-selected user set; acquire the associations between the at least one user and the users in the pre-selected user set; and add those associations to the first relationship map to obtain the crowd relationship map of the specific crowd.
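The way the map acquiring module extends a pre-built map can be sketched as follows. The adjacency-set representation and the names `extend_relationship_map`, `FIRST_MAP`, and `crowd_map` are assumptions for illustration only.

```python
import copy


def extend_relationship_map(first_map, new_user, associated_users):
    # Work on a copy so that the pre-built first relationship map can be reused.
    crowd_map = copy.deepcopy(first_map)
    crowd_map.setdefault(new_user, set())
    for other in associated_users:
        # Only associations with users already in the pre-selected set are added.
        if other in first_map and other != new_user:
            crowd_map[new_user].add(other)
            crowd_map[other].add(new_user)
    return crowd_map


# Hypothetical first relationship map for a pre-selected user set {A, B, C}.
FIRST_MAP = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
crowd_map = extend_relationship_map(FIRST_MAP, "U", ["A", "C", "Z"])
```

Because the first relationship map is built once for the pre-selected user set, only the associations of the newly added user need to be computed for each service request event.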
According to one implementation, the crowd relationship map of the specific crowd is established based on one or more of the following relationships: transaction relationships, device relationships, fund relationships, and social relationships.
In one embodiment, the relationship feature acquiring unit 530 is configured to convert the relationship map into vectors using the node-to-vector (node2vec) network structure feature extraction algorithm, and to determine the relationship feature vector of the at least one user based on those vectors.
In one embodiment, the risk determining unit 540 is configured to determine the risk probability of the service request event using a pre-trained evaluation model, the evaluation model being trained based on a gradient boosting decision tree algorithm.
With the above apparatus, the event features, user personal features, and user relationship features of a service request event are combined to evaluate the risk probability of the service request event comprehensively, so that service execution risk is controlled more efficiently and accurately.
According to an embodiment of another aspect, a computer-readable storage medium is also provided, on which a computer program is stored; when the computer program is executed in a computer, it causes the computer to perform the method described in connection with FIG. 2.
According to an embodiment of yet another aspect, a computing device is also provided, comprising a memory and a processor, wherein executable code is stored in the memory, and the processor, when executing the executable code, implements the method described in connection with FIG. 2.
Those skilled in the art will appreciate that, in one or more of the examples above, the functions described in the present invention can be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.

Claims (24)

  1. A method of determining a risk probability of a service request event, comprising:
    acquiring event features of the service request event;
    acquiring user personal features of at least one user involved in the service request event;
    determining relationship features of the at least one user based on a crowd relationship map of a specific crowd, wherein the specific crowd includes the at least one user;
    determining the risk probability of the service request event according to the event features, the user personal features of the at least one user, and the relationship features of the at least one user.
  2. The method according to claim 1, wherein the event features comprise at least one of the following: a requested service amount, a service registration time, an event occurrence time, a time difference between the service registration time and the event occurrence time, and an event occurrence location.
  3. The method according to claim 1, wherein the at least one user comprises a requester of the service request event and a beneficiary of the service request.
  4. The method according to claim 1, wherein the user personal features comprise one or more of the following: user basic attribute features, user behavior features, and user location features.
  5. The method according to claim 1, wherein determining the relationship feature vector of the at least one user based on the crowd relationship map of the specific crowd comprises:
    acquiring the specific crowd that includes the at least one user;
    acquiring the crowd relationship map of the specific crowd;
    determining the relationship features of the at least one user based on the crowd relationship map.
  6. The method according to claim 5, wherein acquiring the specific crowd that includes the at least one user comprises: determining, among a plurality of pre-divided user subsets, the user subset to which the at least one user belongs, and using that user subset as the specific crowd.
  7. The method according to claim 5, wherein acquiring the specific crowd that includes the at least one user comprises: adding the at least one user to a pre-selected user set, and using the user set as the specific crowd.
  8. The method according to claim 7, wherein acquiring the crowd relationship map of the specific crowd comprises:
    acquiring a first relationship map constructed for the pre-selected user set;
    acquiring associations between the at least one user and the users in the pre-selected user set;
    adding the associations to the first relationship map to obtain the crowd relationship map of the specific crowd.
  9. The method according to claim 1, wherein the crowd relationship map of the specific crowd is established based on one or more of the following relationships: transaction relationships, device relationships, fund relationships, and social relationships.
  10. The method according to claim 1, wherein determining the relationship features of the at least one user comprises: converting the relationship map into vectors using a node-to-vector network structure feature extraction algorithm, and determining the relationship feature vector of the at least one user based on the vectors.
  11. The method according to claim 1, wherein determining the risk probability of the service request event comprises: determining the risk probability of the service request event using a pre-trained evaluation model, the evaluation model being trained based on a gradient boosting decision tree algorithm.
  12. An apparatus for determining a risk probability of a service request event, comprising:
    an event feature acquiring unit, configured to acquire event features of the service request event;
    a personal feature acquiring unit, configured to acquire user personal features of at least one user involved in the service request event;
    a relationship feature acquiring unit, configured to determine relationship features of the at least one user based on a crowd relationship map of a specific crowd, wherein the specific crowd includes the at least one user;
    a risk determining unit, configured to determine the risk probability of the service request event according to the event features, the user personal features of the at least one user, and the relationship features of the at least one user.
  13. The apparatus according to claim 12, wherein the event features comprise at least one of the following: a requested service amount, a service registration time, an event occurrence time, a time difference between the service registration time and the event occurrence time, and an event occurrence location.
  14. The apparatus according to claim 12, wherein the at least one user comprises a requester of the service request event and a beneficiary of the service request.
  15. The apparatus according to claim 12, wherein the user personal features comprise one or more of the following: user basic attribute features, user behavior features, and user location features.
  16. The apparatus according to claim 12, wherein the relationship feature acquiring unit comprises:
    a crowd acquiring module, configured to acquire the specific crowd that includes the at least one user;
    a map acquiring module, configured to acquire the crowd relationship map of the specific crowd;
    a feature acquiring module, configured to determine the relationship features of the at least one user based on the crowd relationship map.
  17. The apparatus according to claim 16, wherein the crowd acquiring module is configured to determine, among a plurality of pre-divided user subsets, the user subset to which the at least one user belongs, and to use that user subset as the specific crowd.
  18. The apparatus according to claim 16, wherein the crowd acquiring module is configured to add the at least one user to a pre-selected user set, and to use the user set as the specific crowd.
  19. The apparatus according to claim 18, wherein the map acquiring module is configured to:
    acquire a first relationship map constructed for the pre-selected user set;
    acquire associations between the at least one user and the users in the pre-selected user set;
    add the associations to the first relationship map to obtain the crowd relationship map of the specific crowd.
  20. The apparatus according to claim 12, wherein the crowd relationship map of the specific crowd is established based on one or more of the following relationships: transaction relationships, device relationships, fund relationships, and social relationships.
  21. The apparatus according to claim 12, wherein the relationship feature acquiring unit is configured to convert the relationship map into vectors using a node-to-vector network structure feature extraction algorithm, and to determine the relationship feature vector of the at least one user based on the vectors.
  22. The apparatus according to claim 12, wherein the risk determining unit is configured to determine the risk probability of the service request event using a pre-trained evaluation model, the evaluation model being trained based on a gradient boosting decision tree algorithm.
  23. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed in a computer, causes the computer to perform the method of any one of claims 1-11.
  24. A computing device, comprising a memory and a processor, characterized in that executable code is stored in the memory, and the processor, when executing the executable code, implements the method of any one of claims 1-11.
PCT/CN2019/073869 2018-04-12 2019-01-30 Method and apparatus for determining risk probability of service request event WO2019196546A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810327337.1A CN108399509A (en) 2018-04-12 2018-04-12 Determine the method and device of the risk probability of service request event
CN201810327337.1 2018-04-12

Publications (1)

Publication Number Publication Date
WO2019196546A1 (en) 2019-10-17

Family

ID=63100004

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/073869 WO2019196546A1 (en) 2018-04-12 2019-01-30 Method and apparatus for determining risk probability of service request event

Country Status (3)

Country Link
CN (1) CN108399509A (en)
TW (1) TW201944305A (en)
WO (1) WO2019196546A1 (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108399509A (en) * 2018-04-12 2018-08-14 阿里巴巴集团控股有限公司 Determine the method and device of the risk probability of service request event
CN109377347B (en) * 2018-09-27 2020-07-24 深圳先进技术研究院 Network credit early warning method and system based on feature selection and electronic equipment
CN109636083A (en) * 2018-10-16 2019-04-16 深圳壹账通智能科技有限公司 Blacklist analysis method, device, equipment and computer readable storage medium
CN109636565A (en) * 2018-10-16 2019-04-16 深圳壹账通智能科技有限公司 Processing method, device, equipment and the computer readable storage medium of risk data
CN109636564A (en) * 2018-10-16 2019-04-16 平安科技(深圳)有限公司 Information verification mechanism, device, equipment and storage medium for air control
CN109559192A (en) * 2018-10-25 2019-04-02 深圳壹账通智能科技有限公司 Risk checking method, device, equipment and storage medium based on association map
CN110033151B (en) * 2018-11-09 2024-01-19 创新先进技术有限公司 Relation risk evaluation method and device, electronic equipment and computer storage medium
CN109657917B (en) * 2018-11-19 2022-04-29 平安科技(深圳)有限公司 Risk early warning method and device for evaluation object, computer equipment and storage medium
CN109598513B (en) * 2018-11-22 2023-06-20 创新先进技术有限公司 Risk identification method and risk identification device
CN110046784A (en) * 2018-12-14 2019-07-23 阿里巴巴集团控股有限公司 A kind of risk of user's access determines method and device
CN109685647B (en) * 2018-12-27 2021-08-10 阳光财产保险股份有限公司 Credit fraud detection method and training method and device of model thereof, and server
CN109801077A (en) * 2019-01-21 2019-05-24 北京邮电大学 A kind of arbitrage user detection method, device and equipment
CN109919782A (en) * 2019-01-24 2019-06-21 平安科技(深圳)有限公司 It is associated with case recognition methods, electronic device and computer readable storage medium
CN110009511A (en) * 2019-01-29 2019-07-12 阿里巴巴集团控股有限公司 Arbitrage behavior recognition methods, arbitrage behavior identification model training method and system
CN110008349B (en) * 2019-02-01 2020-11-10 创新先进技术有限公司 Computer-implemented method and apparatus for event risk assessment
CN110084468B (en) * 2019-03-14 2020-09-01 阿里巴巴集团控股有限公司 Risk identification method and device
CN110097450A (en) * 2019-03-26 2019-08-06 中国人民财产保险股份有限公司 Vehicle borrows methods of risk assessment, device, equipment and storage medium
CN110599329A (en) * 2019-09-09 2019-12-20 腾讯科技(深圳)有限公司 Credit evaluation method, credit evaluation device and electronic equipment
CN110544100A (en) * 2019-09-10 2019-12-06 北京三快在线科技有限公司 Business identification method, device and medium based on machine learning
CN111198967B (en) * 2019-12-20 2024-03-08 北京淇瑀信息科技有限公司 User grouping method and device based on relationship graph and electronic equipment
CN111291900A (en) * 2020-03-05 2020-06-16 支付宝(杭州)信息技术有限公司 Method and device for training risk recognition model
CN111798092B (en) * 2020-05-27 2024-03-12 深圳奇迹智慧网络有限公司 Customs inspection monitoring method, customs inspection monitoring device, computer equipment and storage medium
TW202147207A (en) * 2020-06-08 2021-12-16 財團法人資訊工業策進會 Risk detection system and risk detection method
CN114912717B (en) * 2022-07-13 2022-10-25 成都秦川物联网科技股份有限公司 Smart city guarantee housing application risk assessment method and system based on Internet of things

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106469376A (en) * 2015-08-20 2017-03-01 阿里巴巴集团控股有限公司 A kind of risk control method and equipment
CN107679686A (en) * 2017-08-28 2018-02-09 阿里巴巴集团控股有限公司 A kind of business performs method and device
CN107741953A (en) * 2017-09-14 2018-02-27 平安科技(深圳)有限公司 The real relationship match method, apparatus and readable storage medium storing program for executing of social platform user
CN107818513A (en) * 2017-11-24 2018-03-20 泰康保险集团股份有限公司 Methods of risk assessment and device, storage medium, electronic equipment
CN108399509A (en) * 2018-04-12 2018-08-14 阿里巴巴集团控股有限公司 Determine the method and device of the risk probability of service request event

Also Published As

Publication number Publication date
CN108399509A (en) 2018-08-14
TW201944305A (en) 2019-11-16

Similar Documents

Publication Publication Date Title
WO2019196546A1 (en) Method and apparatus for determining risk probability of service request event
TWI712981B (en) Risk identification model training method, device and server
WO2021114974A1 (en) User risk assessment method and apparatus, electronic device, and storage medium
US11257041B2 (en) Detecting disability and ensuring fairness in automated scoring of video interviews
US11403643B2 (en) Utilizing a time-dependent graph convolutional neural network for fraudulent transaction identification
JP2020522832A (en) System and method for issuing a loan to a consumer determined to be creditworthy
US11531987B2 (en) User profiling based on transaction data associated with a user
US20220319701A1 (en) Supervised machine learning-based modeling of sensitivities to potential disruptions
CN111553701A (en) Session-based risk transaction determination method and device
CN114202336A (en) Risk behavior monitoring method and system in financial scene
CN112016850A (en) Service evaluation method and device
CN115034886A (en) Default risk prediction method and device
US11538029B2 (en) Integrated machine learning and blockchain systems and methods for implementing an online platform for accelerating online transacting
CN113222732A (en) Information processing method, device, equipment and storage medium
CN113408627A (en) Target object determination method and device and server
US20190303781A1 (en) System and method for implementing a trust discretionary distribution tool
CN112446777A (en) Credit evaluation method, device, equipment and storage medium
US20230267352A1 (en) System, Method, and Computer Program Product for Time Series Based Machine Learning Model Reduction Strategy
CN110362981B (en) Method and system for judging abnormal behavior based on trusted device fingerprint
Shaik et al. Customer loan eligibility prediction using machine learning
CN113886539A (en) Method and device for recommending dialect, customer service equipment and storage medium
US20240161117A1 (en) Trigger-Based Electronic Fund Transfers
US20240152959A1 (en) Systems and methods for artificial intelligence using data analytics of unstructured data
US20230359944A1 (en) Automated systems for machine learning model development, analysis, and refinement
US20230334307A1 (en) Training an artificial intelligence engine to predict a user likelihood of attrition

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19785519; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19785519; Country of ref document: EP; Kind code of ref document: A1)