Disclosure of Invention
It is desirable to provide a means for enabling further acquisition of a group of customers related to an identified individual suspicious customer, thereby expanding the amount of customer coverage while improving the efficiency of audits.
According to one embodiment, a method of obtaining a customer population associated with a particular customer is provided, comprising obtaining data representative of the particular customer of a plurality of customers; obtaining customer-related data relating to each of a plurality of customers, the customer-related data including at least customer relationship data representing relationships between each of the plurality of customers and other customers and customer characteristic data of the customer; acquiring predefined extension rule data; and determining one or more of the plurality of customers that are related to the particular customer based on the customer-related data and the extended rule data to obtain a customer population related to the particular customer.
Currently, for customers who transact on an electronic payment platform, the identification and management of individual suspicious customers does not consider the association between an identified suspicious customer and other customers. Only the individual suspicious customers are managed, so the number of samples under management is limited. In fact, customers who make transactions are not isolated and are often correlated; in particular, when transactions are made for an illegal purpose such as money laundering, there is a strong financial and/or non-financial relationship between the customers. Recognizing this strong correlation, the various embodiments of the invention start from the identified individual suspicious customers, acquire other strongly correlated customers according to the customer-related data and the predefined extension rules, and push the identified suspicious customers together with the strongly correlated customers to the auditor as a whole. The auditor can then examine and manage the strongly correlated customer group together, which not only improves the efficiency of the audit but also makes it easier to obtain a complete evidence chain. In addition, this group-based audit expands the range of customers covered by the auditor and increases the defense radius of illegal-transaction auditing.
According to a further embodiment, a further one or more of the plurality of customers related to each of the one or more customers are determined based on the customer-related data and the extension rule data, and the further one or more customers are included in the customer population associated with the particular customer.
Therefore, starting from the one or more customers first obtained for the customer group related to the specific customer, customers strongly correlated with those customers can be obtained in turn and added to the same customer group, so that the range of customers to be audited is further enlarged and the defense radius of illegal-transaction auditing is increased. Multiple layers of such extension may be performed if desired.
According to a further embodiment, each customer in the customer population related to the specific customer is scored, based on the obtained customer-related data of that customer, using a machine-learning based scoring model; and the customers in the customer population are ranked based on the scores.
Scoring with a machine learning-based scoring model and then sorting increases scoring accuracy and makes it convenient to order the customers in the acquired, strongly correlated customer group as required; for example, customers with a high likelihood of illegal transactions can be ranked first.
According to a further embodiment, feature data representing a plurality of customer features of each reference customer of a set of reference customers is determined from the customer-related data of that reference customer; and the machine learning based scoring model is trained using the feature data of the reference customers.
Therefore, the scoring model based on machine learning can be trained in a targeted manner before scoring and sorting.
According to a further embodiment, feature data representing a plurality of customer features of the customer is determined from customer related data of each customer in the customer population; scoring each customer in the customer population based on the feature data representing a plurality of customer features of the customer using a machine learning-based scoring model; and ranking the customers in the customer population based on the score of each customer. This provides a specific example of scoring using the trained scoring model described above.
According to a further embodiment, the extended rule data comprises rule data representing a customer population related to the specific customer obtained by data mining of the plurality of customers based on customer-related data of each of the plurality of customers.
According to a further embodiment, the expanded rule data further comprises predetermined rule data representing a relevance of each customer in the customer population to the specific customer; and/or predetermined rule data relating to the purpose of obtaining the customer population.
By using the expansion rule data of at least two aspects, the client group related to the specific client can be acquired more comprehensively and accurately.
According to a further embodiment, one or more customer groups among the plurality of customers are identified by matching predefined templates, which define a relationship structure between corresponding individual customers and/or feature data of corresponding individual customers, with customer related data of the plurality of customers; and data representing a customer segment related to the particular customer from among the one or more customer segments is determined as rule data representing the customer segment related to the particular customer.
According to a further embodiment, one or more customer groups are determined from the plurality of customers based on customer related data of the plurality of customers, such that each customer of each of the one or more customer groups has at least a predetermined number of related customers in the customer group; and data representing a customer segment of the one or more customer segments that is relevant to the particular customer is determined as rule data representing the customer segment that is relevant to the particular customer.
Two ways of obtaining rule data representing a customer population associated with the particular customer are provided above.
According to another embodiment, there is provided an apparatus for obtaining a customer population associated with a particular customer, comprising a memory; and a processor configured to perform the methods according to various embodiments of the present invention when executing the program code from the memory.
According to another embodiment, a machine-readable medium is provided that stores computer program code, which when executed, causes a computer or processor to perform a method according to various embodiments of the invention.
According to another embodiment, there is provided an apparatus for acquiring a customer population related to a specific customer, including a first acquisition unit configured to acquire data representing the specific customer of a plurality of customers and to acquire customer-related data related to each customer of the plurality of customers, the customer-related data including at least customer relationship data representing a relationship between each customer of the plurality of customers and other customers and customer characteristic data of the customer; a second acquisition unit configured to acquire predefined extension rule data; and a determining unit configured to determine one or more customers related to the specific customer among the plurality of customers based on the customer-related data and the extended rule data to acquire a customer group related to the specific customer.
Detailed Description
Embodiments of the present invention are described below with reference to illegal-transaction review of customers of an electronic payment platform. It should be understood that the embodiments are not so limited and can be used in any application scenario in which it is desirable to expand from a specific customer to obtain a customer population associated with that customer. Thus, the customers referred to below are also not limited to customers that conduct transactions on an electronic payment platform.
FIG. 1 illustrates a block diagram of an apparatus 10 for obtaining a customer population associated with a particular customer, according to one embodiment. The apparatus 10 comprises a first acquisition unit 11, a second acquisition unit 12, a determination unit 13 and an output unit 14.
The first acquisition unit 11 acquires data representing a specific customer. An electronic payment platform such as pay pal audits its transaction customers for illegal transactions, e.g., the current pay pal audit platform audits the individual transactions of each transaction customer, and outputs a list of suspicious customers at predetermined intervals, e.g., weekly, for review by an auditor. The first acquisition unit 11 can acquire this list of suspicious customers, which are stored as specific customers. The specific customer may be one or more of a plurality of customers that conduct transactions within a predetermined time period.
In addition, the first acquisition unit 11 can also obtain customer-related data of a plurality of customers, for example, all customers who conduct transactions on the electronic payment platform within the predetermined period of time. The specific customer is included in the plurality of customers. The customer-related data includes at least customer relationship data, representing the relationship between each of the plurality of customers and the other customers, and customer characteristic data for each customer. Customer relationship data includes both fund and non-fund relationships between any two customers. A fund relationship refers to transaction activity, such as a transfer, that occurs between two customers within the predetermined time period. A non-fund relationship refers to any other relationship between two customers within the predetermined time period, for example a shared-device relationship such as a common MAC address, or a common mobile phone contact. The customer characteristic data is data about an individual customer, such as the customer's fund inflow and outflow amounts within a predetermined period of time, information about the customer's transaction counterparties, and whether the customer has received a system alarm about an illegal transaction or has been reported for an illegal transaction within a predetermined period of time.
The second acquisition unit 12 acquires predefined extension rule data. The extension rule data specifies how to expand from the current specific customer to obtain, among a plurality of customers conducting transactions, a customer population associated with that specific customer, and in particular how to expand the number of customers to be reviewed starting from the specific customer. In a preferred embodiment, the extension rule data may include rule data of two aspects. In one aspect, the extension rule data includes predetermined rule data representing customer groups related to a specific customer, which can be determined in advance by data mining based on the customer-related data of the payment platform's transaction customers over a predetermined period of time; alternatively, any previously determined rule data that characterizes a customer population associated with a specific customer can be used. In another aspect, the extension rule data may include rule data representing the relevance of each customer in a customer population to the specific customer and/or to the purpose for which the customer population is obtained. While it is preferred to use both aspects of the rule data to determine the customer population associated with a specific customer, this is not limiting: only one aspect of the rule data may be used, or other rule data may be introduced if necessary.
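The customer-related data described above can be pictured as a small in-memory structure. The following is a minimal sketch only; the class names `CustomerFeatures` and `CustomerData`, their fields, and the helper methods are hypothetical and are not taken from the embodiments.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerFeatures:
    """Per-customer characteristic data for one predetermined period (assumed fields)."""
    inflow: float = 0.0      # total funds received in the period
    outflow: float = 0.0     # total funds sent in the period
    alerted: bool = False    # received a system alarm about an illegal transaction
    reported: bool = False   # was reported for an illegal transaction


@dataclass
class CustomerData:
    """Customer-related data: characteristic data plus relationship data."""
    features: dict = field(default_factory=dict)        # customer id -> CustomerFeatures
    transfers: dict = field(default_factory=dict)       # (payer, payee) -> total amount
    shared_devices: dict = field(default_factory=dict)  # frozenset({a, b}) -> count

    def add_transfer(self, payer, payee, amount):
        """Record a fund relationship and update both customers' flow totals."""
        key = (payer, payee)
        self.transfers[key] = self.transfers.get(key, 0.0) + amount
        self.features.setdefault(payer, CustomerFeatures()).outflow += amount
        self.features.setdefault(payee, CustomerFeatures()).inflow += amount

    def add_shared_device(self, a, b):
        """Record a non-fund relationship, e.g. a common MAC address."""
        key = frozenset((a, b))
        self.shared_devices[key] = self.shared_devices.get(key, 0) + 1
        self.features.setdefault(a, CustomerFeatures())
        self.features.setdefault(b, CustomerFeatures())

    def related(self, customer):
        """Return every customer with a fund or non-fund relationship to `customer`."""
        out = set()
        for payer, payee in self.transfers:
            if payer == customer:
                out.add(payee)
            elif payee == customer:
                out.add(payer)
        for pair in self.shared_devices:
            if customer in pair:
                out |= pair - {customer}
        return out
```

Keeping fund and non-fund relationships in separate mappings lets rules treat them differently, which matches the distinction drawn in the text; a production system would more likely keep this data in a database or graph store.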
In order to predetermine rule data representing a customer population associated with a particular customer, data mining can be performed on a plurality of trading customers using various methods. In one embodiment, one or more customer groups among the plurality of customers can be identified by matching predefined templates defining relationship structures between corresponding individual customers and/or characteristic data of corresponding individual customers with the customer-related data of the plurality of customers. In a further embodiment, it is possible to first identify certain suspected customer groups on the basis of a relationship graph between a plurality of customers using templates representing, for example, relationship structures (e.g., fund flow relationships) between the various customers for an illegal transaction, and then further determine for each customer in the identified customer groups whether it should belong to one of the customer groups for an illegal transaction based on their characteristic data.
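The two-stage template approach can be illustrated with a deliberately simple template. The sketch below assumes a hypothetical "fan-in" relationship structure (one collector receiving funds from several payers) followed by a filter on characteristic data; the embodiments do not specify any particular template, and the function names are illustrative only.

```python
def match_fan_in(transfers, min_payers):
    """Stage 1: match a relationship-structure template.

    transfers: iterable of (payer, payee) pairs.
    Returns {collector: set of payers} for every collector that receives
    funds from at least `min_payers` distinct payers.
    """
    payers_of = {}
    for payer, payee in transfers:
        if payer != payee:
            payers_of.setdefault(payee, set()).add(payer)
    return {c: p for c, p in payers_of.items() if len(p) >= min_payers}


def filter_by_features(groups, features, keep):
    """Stage 2: keep only members whose characteristic data passes `keep`."""
    result = {}
    for collector, payers in groups.items():
        members = {m for m in payers | {collector} if keep(features.get(m, {}))}
        if len(members) > 1:  # a single customer is not a group
            result[collector] = members
    return result


transfers = [("A", "X"), ("B", "X"), ("C", "X"), ("C", "Y")]
features = {
    "A": {"alerted": True},
    "B": {"alerted": False},
    "C": {"alerted": True},
    "X": {"alerted": True},
}
groups = match_fan_in(transfers, min_payers=3)
suspect_groups = filter_by_features(
    groups, features, keep=lambda f: f.get("alerted", False)
)
```

Here only `X` matches the structural template, and `B` is dropped in the second stage because its characteristic data does not pass the assumed filter.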
In another embodiment, one or more customer groups can be determined from the plurality of customers based on customer-related data for the plurality of customers who are conducting transactions within, for example, a predetermined time period, such that each customer in each of the one or more customer groups has at least a predetermined number of related customers in the customer group.
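The condition that every member of a group has at least a predetermined number of related customers inside the group is satisfied by the k-core of a relationship graph. The embodiments do not name an algorithm, so the following peeling procedure is only one possible reading; `k_core_groups` and its input format are assumptions.

```python
def k_core_groups(adjacency, k):
    """Return connected groups in which every member has >= k related members.

    adjacency: dict mapping customer id -> set of related customer ids.
    The relation is assumed to be symmetric.
    """
    # Work on a copy, dropping self-links and links to unknown customers.
    graph = {
        node: {m for m in neigh if m in adjacency and m != node}
        for node, neigh in adjacency.items()
    }
    # Repeatedly peel customers that have fewer than k related customers left.
    queue = [n for n, neigh in graph.items() if len(neigh) < k]
    while queue:
        node = queue.pop()
        if node not in graph:
            continue  # already peeled
        for other in graph.pop(node):
            if other in graph:
                graph[other].discard(node)
                if len(graph[other]) < k:
                    queue.append(other)
    # Split what remains into connected components.
    groups, seen = [], set()
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        groups.append(component)
    return groups
```

For example, with a triangle `A`, `B`, `C`, a customer `D` linked only to `C`, and an isolated pair `E`, `F`, calling `k_core_groups(adjacency, 2)` keeps only the triangle.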
After one or more customer groups that may, for example, involve illegal transactions are determined from the plurality of customers through the various embodiments described above, data representing a customer group related to the particular customer from among the one or more customer groups is determined as rule data representing the customer group related to the particular customer. The above-described process of determining the rule data representing the customer group related to the specific customer can also be performed by the second acquisition unit 12 on the basis of the customer-related data of the plurality of customers who make transactions. Of course, it is also contemplated that the above-described rule data is determined in advance elsewhere and is merely used by the apparatus according to the embodiment of the present invention.
The determining unit 13 determines one or more customers related to a specific customer among the plurality of customers based not only on the rule data indicating the customer group related to the specific customer but also on the customer-related data including the customer relationship data and the customer characteristic data of each of the plurality of customers who make transactions, thereby obtaining the customer group related to the specific customer. The determined one or more customers are included in a customer segment associated with the particular customer. For example, if predetermined rule data representing a customer segment associated with a particular customer indicates that customers who use the same MAC address as the particular customer and whose outflow of funds exceeds that of the particular customer by a certain threshold over a predetermined period of time are members of the customer segment associated with the particular customer, then such customers can be identified based on the customer-related data of the plurality of customers conducting transactions.
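The MAC-address example in the preceding paragraph can be written as a single predicate over the customer-related data. This is a sketch under assumed field names (`macs`, `outflow`); the threshold value is arbitrary.

```python
def same_mac_and_higher_outflow(specific, customers, threshold):
    """Return ids of customers that share a MAC address with `specific` and
    whose outflow exceeds the specific customer's outflow by more than `threshold`.

    customers: dict id -> {"macs": set of str, "outflow": float}
    """
    base = customers[specific]
    return {
        cid
        for cid, c in customers.items()
        if cid != specific
        and c["macs"] & base["macs"]                    # at least one MAC in common
        and c["outflow"] - base["outflow"] > threshold  # outflow gap above threshold
    }


customers = {
    "S": {"macs": {"m1"}, "outflow": 1000.0},
    "A": {"macs": {"m1", "m2"}, "outflow": 6000.0},  # shared MAC, large outflow
    "B": {"macs": {"m1"}, "outflow": 1200.0},        # shared MAC, gap too small
    "C": {"macs": {"m3"}, "outflow": 9000.0},        # no shared MAC
}
members = same_mac_and_higher_outflow("S", customers, threshold=1000.0)
```

Only `A` satisfies both conditions in this data, so it alone joins the customer segment of `S`.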
In addition to identifying a relevant customer group using the rule data indicative of a customer group relevant to a particular customer as described above, a further determination can be made on the plurality of customers conducting a transaction based on the customer-related data to identify the customers relevant to the particular customer, e.g., rule data indicative of the relevance of each customer in the customer group to the particular customer and/or the purpose of obtaining the customer group can be utilized.
Such rule data includes, for example, rule data relating to the fund level, the non-fund relationship, the fund proportion, whether an alarm was received, and whether the customer was reported. The rule data relating to the fund level can specify a relationship between the total inflow and outflow of funds of a customer being evaluated for membership in the customer population (a candidate customer) and the total inflow and outflow of funds of the specific customer. The rule data relating to the non-fund relationship can specify whether a non-fund relationship index between a candidate customer and the specific customer is greater than a threshold. The rule data relating to the fund proportion can specify whether the ratio of a candidate customer's inflow or outflow fund level to the specific customer's inflow or outflow fund level is greater than a certain threshold. The rule data relating to whether an alarm was received can specify whether a candidate customer has received an alarm regarding, for example, an illegal transaction within a predetermined period of time. The rule data relating to whether the customer was reported can specify whether a candidate customer has been reported, for example in connection with an illegal transaction, within a predetermined period of time. The rule data relating to the fund level, the non-fund relationship, and the fund proportion indicates relevance to the specific customer, whereas the rule data relating to whether an alarm was received and whether the customer was reported indicates relevance to the purpose of acquiring the customer group. Rule data relating to other aspects is also contemplated.
Further, the rule data may be combined arbitrarily to identify customers related to a specific customer; for example, a customer among the plurality of customers that satisfies one or more of the rules may be determined to be related to the specific customer. A customer identified from the plurality of customers using the rule data of both aspects described above can be determined to be a member of the customer group associated with the specific customer.
After the determining unit 13 determines each member of the customer group related to the specific customer, the output unit 14 can output the specific customer and the determined customer group related to the specific customer together, so that the auditor can audit the customers as a group starting from the specific customer, which improves audit efficiency and makes it convenient to obtain a complete evidence chain during the audit.
In addition, the output unit 14 can output the specific customer and each customer in the related customer group together with the extension rules that each customer satisfies, to assist the auditor. For example, if a customer is determined to be a member of the customer group associated with a specific customer because it satisfies the rule data relating to the fund level and to whether an alarm was received, then that rule data is output along with the customer, or an indication associated with the rule data is output, such as the relationship between the customer's total fund inflow and outflow and that of the specific customer, and the alarm data for that customer.
As described above, after the determining unit 13 determines one or more customers related to a specific customer among the plurality of customers so as to acquire a customer group related to the specific customer, the output unit 14 can directly output the specific customer and its related customer group. However, to further extend the coverage of audited customers, in one embodiment the determining unit 13 can expand again from the determined one or more customers to the customers associated with each of them. Specifically, the determining unit 13 can determine, based on the customer-related data of each of the plurality of customers and the extension rule data, a further one or more customers, among the plurality of customers who made transactions, that are related to each of the previously determined one or more customers; the further one or more customers are included in the customer group related to the specific customer for output by the output unit 14. In this further extended embodiment, the extension rule data previously acquired by the second acquisition unit 12 can be used directly. It is also contemplated to score and rank each customer in the customer population obtained by this further expansion, as described below.
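One way to combine such rules is to express each as a named predicate and count how many a customer satisfies. The sketch below covers an illustrative subset of the rules; the `Profile` fields, the rule names, and the thresholds are all assumptions rather than values from the embodiments.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    total_flow: float      # inflow + outflow in the period
    non_fund_index: float  # assumed strength of non-fund ties to the specific customer
    alerted: bool
    reported: bool


def make_rules(specific, flow_ratio=0.5, non_fund_min=0.8):
    """Build named predicates; the threshold defaults are illustrative only."""
    return {
        # Rules indicating relevance to the specific customer
        "fund_proportion": lambda p: specific.total_flow > 0
        and p.total_flow / specific.total_flow > flow_ratio,
        "non_fund_relationship": lambda p: p.non_fund_index > non_fund_min,
        # Rules indicating relevance to the purpose of acquiring the group
        "alert_received": lambda p: p.alerted,
        "reported": lambda p: p.reported,
    }


def matched_rules(profile, rules):
    """Names of every rule the customer satisfies, in rule order."""
    return [name for name, rule in rules.items() if rule(profile)]


def related_customers(candidates, rules, min_matches=1):
    """Customers satisfying at least `min_matches` rules, with the rules they met."""
    result = {}
    for cid, profile in candidates.items():
        hits = matched_rules(profile, rules)
        if len(hits) >= min_matches:
            result[cid] = hits
    return result


specific = Profile(total_flow=10000.0, non_fund_index=0.0, alerted=False, reported=False)
candidates = {
    "A": Profile(8000.0, 0.1, True, False),
    "B": Profile(100.0, 0.9, False, False),
    "C": Profile(100.0, 0.0, False, False),
}
related = related_customers(candidates, make_rules(specific))
```

Returning the names of the satisfied rules alongside each customer keeps a record of why the customer was included; raising `min_matches` gives a stricter combination.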
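The layered expansion can be sketched as a breadth-first traversal with a depth limit. In this sketch `is_related` is a hypothetical callable standing in for the evaluation of the extension rule data against the customer-related data, and `layers` is an assumed parameter.

```python
def expand_population(specific, all_customers, is_related, layers=2):
    """Expand from `specific` one layer at a time.

    is_related(anchor, candidate) -> bool decides whether `candidate`
    is related to `anchor` under the extension rules.
    Returns {customer id: layer at which it was added}.
    """
    population = {}
    frontier = [specific]
    for layer in range(1, layers + 1):
        next_frontier = []
        for anchor in frontier:
            for candidate in all_customers:
                # Skip the specific customer and anyone already included.
                if candidate == specific or candidate in population:
                    continue
                if is_related(anchor, candidate):
                    population[candidate] = layer
                    next_frontier.append(candidate)
        if not next_frontier:
            break  # nothing new was found, so deeper layers are empty too
        frontier = next_frontier
    return population


links = {("S", "A"), ("A", "B"), ("B", "C")}


def linked(a, b):
    return (a, b) in links or (b, a) in links


two_layers = expand_population("S", ["S", "A", "B", "C", "D"], linked, layers=2)
```

With `layers=1` this reduces to the basic embodiment; recording the layer for each customer is an addition of this sketch and can help an auditor see how far a customer is from the specific customer.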
Fig. 2 shows a block diagram of an apparatus 20 for obtaining a customer population associated with a particular customer, according to another embodiment. The apparatus 20 shown in fig. 2 differs from the apparatus 10 shown in fig. 1 mainly in that the apparatus 20 further comprises a scoring unit 15 and a ranking unit 16. The scoring unit 15 receives from the determining unit 13 the specific customer and the determined customer group related to the specific customer. Where there are multiple specific customers, in one embodiment, the scoring unit 15 can receive each specific customer and the customer group determined for it, that is, the customer group corresponding to each specific customer. The scoring unit 15 can score each customer in each customer population related to a specific customer, based on the customer-related data of that customer, using a machine learning based scoring model M. The scoring model can be trained in advance using machine learning means. An example of a scoring model that may be used is a scoring model based on a gradient boosting decision tree algorithm. The ranking unit 16 can rank the customers in each customer segment based on the score of each customer in the customer segment. In this case, the output unit 14 outputs each customer group based on the ranking. It is also contemplated that the specific customer may be included in its associated customer population for scoring and ranking.
Fig. 3 shows a block diagram of an apparatus 30 for obtaining a customer population associated with a particular customer, according to yet another embodiment. The apparatus 30 shown in fig. 3 differs from the apparatus 20 shown in fig. 2 mainly in that it further includes a training unit 17. The training unit 17 can determine, for each reference customer of a set of reference customers, feature data representing a plurality of customer features of that reference customer from its customer-related data, and can train the machine learning based scoring model M using the feature data of the reference customers. The score of each reference customer is known. The feature data includes, but is not limited to, feature data indicating, for example, "inflow amount in predetermined time", "MAC relation index", and/or "reported or not". Any number and any kind of feature data are contemplated. The training process described above can be carried out before the apparatus according to an embodiment of the invention is used.
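The scoring and ranking steps can be sketched independently of the model. In the sketch below, `score` is any callable standing in for the trained scoring model M; the hand-written formula used in the example is purely a placeholder and is not a gradient boosting decision tree.

```python
def rank_population(population, score):
    """Score each customer and return (customer id, score) pairs, highest first.

    population: dict id -> feature data for that customer
    score: callable standing in for the trained scoring model M
    """
    scored = [(cid, score(features)) for cid, features in population.items()]
    # Sort by descending score; break ties by id so the order is reproducible.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def placeholder_score(features):
    """Stand-in for model M: weights and the inflow cap are arbitrary."""
    inflow_part = min(features["inflow"] / 10000.0, 1.0)
    return 0.6 * float(features["reported"]) + 0.4 * inflow_part


population = {
    "A": {"reported": False, "inflow": 2000.0},
    "B": {"reported": True, "inflow": 500.0},
    "C": {"reported": False, "inflow": 9000.0},
}
ranking = rank_population(population, placeholder_score)
order = [cid for cid, _ in ranking]
```

When there are several specific customers, the same function can be applied to each customer group in turn, for example with a dictionary comprehension over the groups.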
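To make the training step concrete, the following is a toy, pure-Python gradient boosting model with squared loss and depth-1 trees (stumps), trained on reference customers whose scores are known. It is a simplified stand-in rather than the embodiments' model: the feature layout (`[inflow, MAC relation index, reported]`), the hyperparameters, and the function names are assumptions, and a practical system would more likely use an established library implementation.

```python
def fit_stump(rows, residuals):
    """Find the single-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [e for r, e in zip(rows, residuals) if r[j] <= t]
            right = [e for r, e in zip(rows, residuals) if r[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((e - lm) ** 2 for e in left) + sum((e - rm) ** 2 for e in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:] if best else None


def train_gbdt(rows, targets, rounds=50, learning_rate=0.1):
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    base = sum(targets) / len(targets)
    preds = [base] * len(rows)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(targets, preds)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break  # no valid split exists
        j, t, lm, rm = stump
        stumps.append(stump)
        preds = [
            p + learning_rate * (lm if r[j] <= t else rm)
            for r, p in zip(rows, preds)
        ]
    return base, stumps, learning_rate


def predict(model, row):
    """Score one customer: base value plus the shrunken output of every stump."""
    base, stumps, lr = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)


# Reference customers: [inflow, MAC relation index, reported] with known scores.
reference_rows = [[100, 0.1, 0], [200, 0.2, 0], [9000, 0.9, 1], [8000, 0.8, 1]]
reference_scores = [0.0, 0.0, 1.0, 1.0]
model = train_gbdt(reference_rows, reference_scores)
high = predict(model, [8500, 0.85, 1])
low = predict(model, [150, 0.1, 0])
```

On this tiny reference set the model learns to give a high score to customers resembling the high-scoring references and a low score to the others.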
In the case of using the scoring model M trained as described above, the determining unit 13 determines feature data representing a plurality of customer features of each customer in the determined customer population from customer-related data of the customer. The scoring unit 15 scores each customer in the determined customer population based on feature data representing a plurality of customer features of the customer using a machine learning-based scoring model. The ranking unit 16 ranks the customers in the determined customer population based on the score of each customer.
While various embodiments of the present invention have been described with reference to the embodiments shown in fig. 1-3, it should be understood by those skilled in the art that the above embodiments are not limiting, and some features can be changed/modified/deleted on the basis of various embodiments, so as to obtain a new technical solution. For example, the training mode defined by the training unit 17 described above can be replaced by other training modes known in the art.
A flow diagram of a method 400 of obtaining a customer population associated with a particular customer in accordance with one embodiment of the present invention is described below with reference to fig. 4.
At 401, data representing a specific customer of a plurality of customers is obtained. In one embodiment, data indicating the specific customer, which can be determined in advance from the plurality of customers conducting transactions, can be received from an external source.
At 402, customer-related data is obtained relating to each of a plurality of customers, the customer-related data including at least customer relationship data representing relationships between each of the plurality of customers and other customers and customer characteristic data for the customer.
At 403, predefined extension rule data is obtained. As described above, in a preferred embodiment, the extension rule data may include rule data of two aspects. In one aspect, the extension rule data includes predetermined rule data representing a customer segment associated with a particular customer. In another aspect, the extension rule data may include rule data representing the relevance of each customer in a customer population to the particular customer and/or to the purpose for which the customer population is obtained.
At 404, one or more of the plurality of customers that are associated with a particular customer are determined based on the customer-related data and the extended rules data to obtain a customer segment associated with the particular customer. In a preferred embodiment, at 404, a further one or more of the plurality of customers related to each of the one or more customers is further determined based on the customer-related data and the expansion rule data, the further one or more customers being included in the customer population related to the specific customer, thereby further expanding the customer population related to the specific customer.
At 405, the particular customer and its associated customer segment are output.
Fig. 5 shows a flowchart 500 of a method of obtaining a customer population associated with a particular customer according to one embodiment of the present invention, where the processing at 501-503 is the same as the processing at 401-403 in the flowchart 400 shown in fig. 4.
At 504, in addition to the same processing as 404 above, in one embodiment, feature data representing a plurality of customer characteristics of each customer in the determined customer population is also determined based on the customer-related data for the customer.
At 505, each customer in each determined customer population is scored based on feature data representing a plurality of customer features of the customer using a machine learning based scoring model.
At 506, the customers in the customer population are ranked based on the scores for each customer.
At 507, the customer population is output based on the ranking.
Various embodiments of the method according to the invention have been described with reference to the flowcharts shown in figs. 4 and 5. It is understood that corresponding processes can be added, modified, or deleted on the basis of the flowcharts of the above embodiments, thereby constituting a new technical solution to achieve different effects.
In one embodiment, predefined extension rule data can be obtained at 403 by: matching a predefined template with customer-related data of a plurality of customers to identify one or more customer groups among the plurality of customers, the predefined template defining a relationship structure between corresponding individual customers and/or characteristic data of corresponding individual customers; data representing a customer segment related to a particular customer of one or more customer segments is determined as rule data representing the customer segment related to the particular customer.
In another embodiment, predefined extended rule data can be obtained at 403 by: determining one or more customer groups from the plurality of customers based on customer-related data for the plurality of customers, such that each customer in each of the one or more customer groups has at least a predetermined number of related customers in the customer group; data representing a customer segment associated with a particular customer of the one or more customer segments is determined as rule data representing the customer segment associated with the particular customer.
In another embodiment, each customer in a customer population associated with the particular customer can be scored at 505 based on customer-related data for each customer in the customer population using a machine-learning based scoring model; and at 506, ranking the customers in the customer population based on the score.
In yet another embodiment, the scoring model can be trained before step 505 is performed. Specifically, feature data representing a plurality of customer features of each reference customer of a set of reference customers is determined according to the customer-related data of that reference customer, and the scoring model is trained using the feature data of the reference customers.
It will be appreciated that the functions of the various elements of the apparatus and the flow of the method for obtaining a customer population associated with a particular customer of the various embodiments of the present invention can be implemented by a computer program/software. The software can be loaded into the working memory of a data processor and when running is used for performing the method according to embodiments of the invention.
Exemplary embodiments of the invention cover both a computer program/software that uses the invention from the outset and an existing program/software that is converted, by means of an update, into a program/software that uses the invention.
According to further embodiments of the invention, there is provided a machine (e.g. computer) readable medium, such as a CD-ROM, having stored thereon computer program code which, when executed, causes a computer or processor to perform a method according to embodiments of the invention. The machine-readable medium may be, for example, an optical storage medium or a solid-state medium supplied together with or as part of other hardware.
The computer program for performing the methods according to embodiments of the invention may also be distributed in other forms, for example via the internet or other wired or wireless telecommunication systems. The computer program may also be made available over a network, such as the World Wide Web, and can be downloaded from such a network into the working memory of a data processor.
It is also understood that the various elements of the apparatus and method for obtaining a customer population associated with a particular customer of the various embodiments of the present invention can be implemented in hardware or a combination of hardware and software.
In one embodiment, a system for obtaining a customer population associated with a particular customer can be implemented by a memory and a processor. The memory can store computer program code for executing the method flows according to various embodiments of the invention; when executing program code from memory, the processor performs procedures in accordance with various embodiments of the invention.
It has to be noted that embodiments of the invention have been described with reference to different subject matters. In particular, some embodiments are described with reference to method type claims whereas other embodiments are described with reference to apparatus type claims. However, a person skilled in the art will gather from the above and the following description that, unless otherwise notified, in addition to any combination of features belonging to one type of subject matter, any combination of features relating to different subject matters is also considered to be disclosed with this application. Also, all features can be combined, providing a synergistic effect greater than a simple sum of the features.
The present invention has been described above with reference to specific embodiments, and it will be understood by those skilled in the art that the technical solutions of the present invention can be implemented in various ways without departing from the spirit and essential features of the present invention. The specific embodiments are merely illustrative and not restrictive. In addition, the embodiments can be combined arbitrarily to achieve the object of the present invention. The scope of protection of the invention is defined by the appended claims.
The word "comprising" in the description and in the claims does not exclude the presence of other elements or steps. The functions of the respective elements described in the specification or recited in the claims may be divided or combined into plural corresponding elements or may be implemented by a single element.