CN114154588A - Method and device for training abnormal transaction detection model and detecting abnormal transaction - Google Patents

Publication number
CN114154588A
Authority
CN
China
Prior art keywords
transaction
sample data
transaction sample
abnormal
detection model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111506039.7A
Other languages
Chinese (zh)
Inventor
王昱森
周振华
李云鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd
Priority to CN202111506039.7A
Publication of CN114154588A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention provides a method and a device for training an abnormal transaction detection model and for detecting abnormal transactions. The training method comprises the following steps: receiving an acquired first transaction sample data set, wherein each first transaction sample data in the set is normal transaction sample data; performing feature extraction processing on the first transaction sample data set to obtain a first transaction sample feature set; and training an abnormal transaction detection model on the first transaction sample feature set using an unsupervised machine learning algorithm, and recording the core position and the radius of each cluster obtained by the unsupervised machine learning algorithm.

Description

Method and device for training abnormal transaction detection model and detecting abnormal transaction
This application is a divisional application of the patent application filed on March 28, 2019, with application number 201910243559.X, entitled "Method and device for training abnormal transaction detection model and detecting abnormal transaction".
Technical Field
The present invention relates to the intersection of machine learning and financial transactions, and more particularly, to a method and an apparatus for training an abnormal transaction detection model, an abnormal transaction detection method and apparatus, and a computing device and a computer-readable storage medium storing a computer program.
Background
With the development of science, technology, and the economy, financial transactions (e.g., internet-based financial transactions) have become more frequent and increasingly important. For example, financial institutions such as banks may assess financing credit lines and loan payments according to the financial transactions of a business.
However, because fabricating financial transactions is often cheap and highly profitable, enterprises may construct false financial transactions to obtain benefits fraudulently, for example to obtain bank loans by fraud. When abnormal transactions (such as fabricated transactions) occur, the traditional approach of verifying authenticity through invoices and similar means is not real-time and cannot meet the efficiency demands of banks and enterprises in the internet era. Real-time monitoring with a supervised machine learning method, in turn, requires a large number of labeled samples for training; collecting and labeling those samples costs considerable time and labor, and misjudgments still occur.
Disclosure of Invention
The invention aims to provide a training method of an abnormal transaction detection model and an abnormal transaction detection method.
One aspect of the present invention provides a method for training an abnormal transaction detection model, including: receiving an acquired first transaction sample data set, wherein each first transaction sample data in the first transaction sample data set is normal transaction sample data; performing feature extraction processing on the first transaction sample data set to obtain a first transaction sample feature set; and training an abnormal transaction detection model on the first transaction sample feature set using an unsupervised machine learning algorithm, wherein the core position and the radius of each cluster obtained by the unsupervised machine learning algorithm are recorded through the abnormal transaction detection model.
Optionally, the training method further comprises: outputting the abnormal transaction detection model and the recorded core position and radius of each cluster.
Optionally, the features of each first transaction sample in the first set of transaction sample features comprise one or more of the following features extracted from the first transaction sample data: time attribute feature, money amount distribution attribute feature, same-class attribute feature and same-region attribute feature.
Optionally, the step of training the abnormal transaction detection model on the first transaction sample feature set using an unsupervised machine learning algorithm includes: normalizing the features of the first transaction samples in the first transaction sample feature set column by column; and training the abnormal transaction detection model on the column-normalized features of the first transaction samples using the unsupervised machine learning algorithm.
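The column-wise normalization step can be illustrated with a short sketch. The text does not say which normalization is used, so min-max scaling to [0, 1] is an assumption here, and the function name `normalize_columns` and the sample values are hypothetical.

```python
def normalize_columns(rows):
    """Min-max scale each column (feature) to [0, 1].

    Assumption: min-max scaling; the source only says "normalize by columns".
    Constant columns are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))                      # transpose: one tuple per feature
    lows = [min(col) for col in columns]
    spans = [max(col) - low for col, low in zip(columns, lows)]
    return [
        [(value - low) / span if span else 0.0
         for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]

# Three hypothetical samples: [month, day of month, amount]
samples = [[1, 20, 100000.0], [2, 5, 200000.0], [3, 10, 150000.0]]
normalized = normalize_columns(samples)
```

Scaling each column separately keeps a large-valued feature such as an amount from dominating the distance computations used by clustering.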
Optionally, the unsupervised machine learning algorithm comprises a k-means algorithm, a DBSCAN algorithm, or an isolation forest algorithm.
Optionally, the unsupervised machine learning algorithm is a k-means algorithm, and the step of training the abnormal transaction detection model using the k-means algorithm includes: determining the core positions of k initial clusters in the first transaction sample feature set, wherein the value of k is determined based on the first transaction sample feature set; and clustering the first transaction sample feature set using the k-means algorithm, starting from the determined core positions of the k initial clusters, until a standard measure function converges.
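A minimal sketch of this training step, in plain Python, is given below. Several points are assumptions rather than details taken from the text: the initial core positions are simply passed in (how k and the initial cores are chosen is not shown), the sum of squared distances to the assigned core stands in for the "standard measure function", and a cluster's radius is taken to be the largest distance from its core to any training sample assigned to it. The names `assign` and `train_kmeans` are hypothetical.

```python
import math

def assign(points, cores):
    """Group each point with its nearest core (Euclidean distance)."""
    members = [[] for _ in cores]
    for p in points:
        nearest = min(range(len(cores)), key=lambda i: math.dist(p, cores[i]))
        members[nearest].append(p)
    return members

def train_kmeans(points, initial_cores, max_iter=100, tol=1e-9):
    """Run k-means from the given initial cores; return (cores, radii)."""
    cores = [list(c) for c in initial_cores]
    prev_sse = math.inf
    for _ in range(max_iter):
        members = assign(points, cores)
        # Move each core to the mean of its members; keep empty clusters in place.
        cores = [
            [sum(col) / len(m) for col in zip(*m)] if m else core
            for m, core in zip(members, cores)
        ]
        # Assumed measure function: sum of squared distances to the assigned core.
        sse = sum(math.dist(p, c) ** 2 for m, c in zip(members, cores) for p in m)
        if prev_sse - sse <= tol:   # no meaningful improvement: treat as converged
            break
        prev_sse = sse
    members = assign(points, cores)
    # Assumed radius: farthest assigned training sample from the core.
    radii = [max((math.dist(p, c) for p in m), default=0.0)
             for m, c in zip(members, cores)]
    return cores, radii

features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
cores, radii = train_kmeans(features, initial_cores=[[0.0, 0.0], [10.0, 10.0]])
```

Recording `cores` and `radii` alongside the model is what later allows a new sample to be checked against each cluster.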
Optionally, the first transaction sample data is a transaction sample from before the enterprise joined supply chain finance.
One aspect of the present invention provides an abnormal transaction detection method, including: receiving second transaction sample data to be detected; performing feature extraction processing on the second transaction sample data to obtain features of the second transaction sample; inputting the features of the second transaction sample into an abnormal transaction detection model based on an unsupervised machine learning algorithm to obtain a prediction result; and making a determination according to the prediction result and the core position and the radius of each cluster of the abnormal transaction detection model, and outputting a detection result that the second transaction sample data is an abnormal transaction when the features of the second transaction sample are determined not to belong to any cluster of the abnormal transaction detection model.
Optionally, the abnormal transaction detection model is obtained according to any one of the training methods described above, where the feature extraction processing on the second transaction sample data is the same as the process of performing the feature extraction processing on the first transaction sample data set in any one of the training methods described above.
Optionally, when the distance between the features of the second transaction sample and the core position of every cluster is greater than the product of a predetermined enabling coefficient and the radius of the corresponding cluster, the features of the second transaction sample are determined not to belong to any cluster of the abnormal transaction detection model.
Optionally, a detection result that the second transaction sample data is a normal transaction is output when the distance between the features of the second transaction sample and the core position of at least one cluster is less than or equal to the product of the predetermined enabling coefficient and the radius of that cluster.
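The two decision rules above can be sketched as a single predicate. Euclidean distance, the function name `is_abnormal`, and the example cores, radii, and coefficient value are assumptions made for illustration.

```python
import math

def is_abnormal(feature, cores, radii, coefficient=1.0):
    """Return True when the feature lies outside coefficient * radius of every cluster.

    Assumption: Euclidean distance between the feature vector and each core.
    """
    return all(
        math.dist(feature, core) > coefficient * radius
        for core, radius in zip(cores, radii)
    )

# Hypothetical recorded cluster parameters
cores = [[0.0, 1.0], [10.0, 11.0]]
radii = [1.0, 1.0]

inside = is_abnormal([0.5, 1.5], cores, radii, coefficient=1.5)   # False: within cluster 0
outside = is_abnormal([5.0, 5.0], cores, radii, coefficient=1.5)  # True: outside both clusters
```

A coefficient above 1 loosens the boundary, so samples slightly beyond the farthest training sample are still treated as normal; a coefficient below 1 tightens it.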
Optionally, in response to a detection result that the second transaction sample data is a normal transaction, the abnormal transaction detection model is selectively updated based on the second transaction sample data.
Optionally, the step of updating the abnormal transaction detection model based on the second transaction sample data comprises: using the second transaction sample data together with the training transaction sample data of the abnormal transaction detection model as new training transaction sample data, and using the new training transaction sample data as the training input of the abnormal transaction detection model so as to update the model.
Optionally, the second transaction sample data is a transaction sample from after the enterprise joined supply chain finance.
An aspect of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any one of the methods described above.
One aspect of the invention provides a computing device comprising: one or more processors; and one or more memories storing a computer program that, when executed by the one or more processors, implements any one of the methods described above.
One aspect of the present invention provides a training apparatus for an abnormal transaction detection model, including: a receiving unit configured to receive an acquired first transaction sample data set, wherein each first transaction sample data in the first transaction sample data set is normal transaction sample data; a feature processing unit configured to perform feature extraction processing on the first transaction sample data set to obtain a first transaction sample feature set; and a training and recording unit configured to train an abnormal transaction detection model on the first transaction sample feature set using an unsupervised machine learning algorithm, and to record the core position and the radius of each cluster obtained by the unsupervised machine learning algorithm.
Optionally, the training apparatus further comprises: an output unit configured to output the abnormal transaction detection model and the recorded core position and radius of each cluster.
Optionally, the features of each first transaction sample in the first set of transaction sample features comprise one or more of the following features extracted from the first transaction sample data: time attribute feature, money amount distribution attribute feature, same-class attribute feature and same-region attribute feature.
Optionally, the training and recording unit is configured to: normalize the features of the first transaction samples in the first transaction sample feature set column by column; and train the abnormal transaction detection model on the column-normalized features of the first transaction samples using the unsupervised machine learning algorithm.
Optionally, the unsupervised machine learning algorithm comprises a k-means algorithm, a DBSCAN algorithm, or an isolation forest algorithm.
Optionally, the unsupervised machine learning algorithm is a k-means algorithm, and the training and recording unit is configured to: determine the core positions of k initial clusters in the first transaction sample feature set, wherein the value of k is determined based on the first transaction sample feature set; and cluster the first transaction sample feature set using the k-means algorithm, starting from the determined core positions of the k initial clusters, until a standard measure function converges.
Optionally, the first transaction sample data is a transaction sample from before the enterprise joined supply chain finance.
An aspect of the present invention provides an abnormal transaction detection apparatus, including: a receiving unit configured to receive second transaction sample data to be detected; a feature processing unit configured to perform feature extraction processing on the second transaction sample data to obtain features of the second transaction sample; an input unit configured to input the features of the second transaction sample into an abnormal transaction detection model based on an unsupervised machine learning algorithm to obtain a prediction result; and a detection unit configured to make a determination according to the prediction result and the core position and the radius of each cluster of the abnormal transaction detection model, and to output a detection result that the second transaction sample data is an abnormal transaction when the features of the second transaction sample are determined not to belong to any cluster of the abnormal transaction detection model.
Optionally, the abnormal transaction detection model is obtained according to any one of the training methods described above, where the feature extraction processing performed on the second transaction sample data by the feature processing unit is the same as the process of performing the feature extraction processing on the first transaction sample data set in the training method described above.
Optionally, the detection unit is configured to: determine that the features of the second transaction sample do not belong to any cluster of the abnormal transaction detection model when the distance between the features of the second transaction sample and the core position of every cluster is greater than the product of a predetermined enabling coefficient and the radius of the corresponding cluster.
Optionally, the detection unit is configured to: output a detection result that the second transaction sample data is a normal transaction when the distance between the features of the second transaction sample and the core position of at least one cluster is less than or equal to the product of the predetermined enabling coefficient and the radius of that cluster.
Optionally, the abnormal transaction detection apparatus further includes: an updating unit configured to selectively update the abnormal transaction detection model based on the second transaction sample data in response to a detection result that the second transaction sample data is a normal transaction.
Optionally, the updating unit is configured to: use the second transaction sample data together with the training transaction sample data of the abnormal transaction detection model as new training transaction sample data, and use the new training transaction sample data as the training input of the abnormal transaction detection model so as to update the model.
Optionally, the second transaction sample data is a transaction sample from after the enterprise joined supply chain finance.
The technical solution of the present invention, which detects abnormal transactions using an unsupervised machine learning algorithm, takes the characteristics of financial transaction scenarios into account. It thereby yields a simple model that is interpretable and easy to visualize and that can meet regulatory requirements. The algorithm logic can be visualized for business personnel, which helps them better understand the early-warning logic, supports suggestions for subsequent handling of the transaction, and can provide higher detection accuracy.
Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
Drawings
The above and other objects and features of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings, which illustrate examples, and in which:
FIG. 1 shows a flow diagram of a method of training an anomalous transaction detection model in accordance with an embodiment of the present invention;
FIG. 2 shows a flow diagram of an anomalous transaction detection method in accordance with an embodiment of the invention;
FIG. 3 illustrates a training apparatus for an anomalous transaction detection model according to an embodiment of the present invention;
FIG. 4 illustrates an anomalous transaction detection device according to an embodiment of the present invention;
FIG. 5 illustrates a training apparatus for an anomalous transaction detection model according to an embodiment of the present invention;
FIG. 6 illustrates an abnormal transaction detection apparatus according to an embodiment of the present invention.
Detailed Description
The following description is provided with reference to the accompanying drawings to assist in a comprehensive understanding of exemplary embodiments of the invention as defined by the claims and their equivalents. The description includes various specific details to aid understanding, but these details are to be regarded as illustrative only. Thus, one of ordinary skill in the art will recognize that: various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present invention. Moreover, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.
In the present invention, a transaction may refer to a financial transaction. In one example, the transaction may be a supply chain financial transaction. Supply Chain Finance (SCF) is a specialized area of commercial bank credit business (at the bank level) and also a financing channel for enterprises, especially small and medium-sized enterprises (at the enterprise level). It refers to a bank providing financing and other settlement and financial services to customers (core enterprises), while providing the suppliers of these customers with the convenience of timely receipt of loans, or providing their distributors with prepaid proxy payment and inventory financing services. In short, it is a financing model in which a bank connects a core enterprise with its upstream and downstream enterprises to provide flexibly deployed financial products and services. This definition is very close to traditional factoring and goods-pledge (credit secured by movable property and rights to goods) services. The clear difference is that factoring and goods pledging are merely simple trade financing products, whereas supply chain finance is a systematic financing arrangement, agreed between the core enterprise and the bank, that is oriented to all member enterprises in the supply chain.
However, the present invention is not limited to supply chain financial transactions and is also applicable to other transaction-oriented scenarios (e.g., scenarios with high-frequency transactions and time-series characteristics). For example, the present invention may be applied to the following scenario: an exchange wants to detect insider trading; it has determined that no insider trading occurred before 2018 and wants to determine whether insider trading (i.e., abnormal transactions) occurred during 2018.
FIG. 1 shows a flow diagram of a method of training an anomalous transaction detection model in accordance with an embodiment of the present invention.
Referring to fig. 1, a method of training an abnormal transaction detection model according to an embodiment of the present invention includes step S110, step S120, and step S130.
In step S110, the acquired first transaction sample data set is received, where each first transaction sample data in the first transaction sample data set is a normal transaction sample.
Here, a normal transaction sample may refer to a lawful, compliant transaction sample. Further, in the present invention, the first transaction sample data set may contain a number of transaction samples greater than or equal to a predetermined number. For example, the predetermined number may be 20; however, the present invention does not limit the predetermined number, which may be any other number.
In one embodiment, the first transaction sample data is a transaction sample from before the enterprise joined supply chain finance. Specifically, the business behavior of an enterprise before it joins supply chain finance is generally considered normal, because without the incentive of bank loans the enterprise has no motive to fabricate transaction data (e.g., orders). That is, a transaction sample from before an enterprise joins supply chain finance may be regarded as a normal transaction sample. However, as described above, the first transaction sample data is not limited to such samples; in other transaction-oriented scenarios, it may be any transaction sample that can be regarded as and/or determined to be a normal transaction sample.
For ease of illustration and understanding, the following description is based primarily on a supply chain financial transaction scenario; however, the invention is not limited to this example application scenario.
In step S120, a feature extraction process is performed on the first transaction sample data set to obtain a first transaction sample feature set.
Here, performing feature extraction processing on the first transaction sample data set may mean performing feature extraction processing on each first transaction sample data in the set, so that the first transaction sample feature set includes the features obtained from each first transaction sample data. For example, the features of a first transaction sample may be a feature vector.
In one embodiment, the features of each first transaction sample in the first transaction sample feature set comprise one or more of the following features extracted from the first transaction sample data: time attribute feature, money amount distribution attribute feature, same-class attribute feature and same-region attribute feature. These features take the characteristics of transactions in the transaction scenario into account: they are derived and extracted along the transaction dimension and/or the time dimension, mapping the sample data into a high-dimensional discrete feature space so that anomaly detection on the sample data can be performed more accurately. Note that when the features of the first transaction sample include two or more of the above features, the feature vector of the first transaction sample may be composed of the feature vectors of those features. For example, if the features of the first transaction sample include a first feature with feature vector [a, b] and a second feature with feature vector [c, d], the feature vector of the first transaction sample is [a, b, c, d], where a, b, c, and d denote the corresponding feature values.
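The composition of a sample's feature vector from per-feature vectors amounts to an ordered concatenation, which might be sketched as follows. The function name `combine_feature_vectors` and the example values are hypothetical.

```python
from itertools import chain

def combine_feature_vectors(*feature_vectors):
    """Concatenate per-feature vectors, in order, into one sample feature vector."""
    return list(chain.from_iterable(feature_vectors))

first_feature = [1, 20, 1]            # hypothetical values for a first feature
second_feature = [100000, 200000]     # hypothetical values for a second feature
sample_vector = combine_feature_vectors(first_feature, second_feature)
```

The order of concatenation has to be the same for every sample, at training time and at detection time, so that each position in the vector always carries the same feature.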
The time attribute feature, the amount distribution attribute feature, the same-class attribute feature, and the same-region attribute feature of the first transaction sample data are described in more detail below.
In the present invention, the time attribute feature of the first transaction sample data may indicate the month and the day of the month of the transaction's order placing time and warehousing time, whether the date falls on a weekend, whether the date is a holiday, and the like. As an illustrative example, when the time attribute feature of the first transaction sample data includes the month and the day of the order placing time and whether it falls on a weekend, then for a first transaction sample whose order was placed on January 20, a weekend day, the time attribute feature may be represented as the feature vector [1, 20, 1]. The first value indicates the month of the order placing time, the second value indicates the day of the month (e.g., the 20th corresponds to 20 and the 30th corresponds to 30), and the third value indicates whether the order placing time falls on a weekend (e.g., 1 for a weekend and 0 otherwise). This example is for illustration only; the present invention may include any time feature, or any combination of time features, extracted from the first transaction sample data.
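The [month, day, weekend] example can be reproduced with the standard `datetime` module. The text gives no year, so 2019 is assumed here because January 20, 2019 fell on a Sunday; the function name `time_attribute_features` is hypothetical, and the holiday flag is left out because it would need a calendar the text does not specify.

```python
from datetime import date

def time_attribute_features(order_date):
    """Return [month, day of month, weekend flag] for an order placing date."""
    is_weekend = 1 if order_date.weekday() >= 5 else 0   # Saturday=5, Sunday=6
    return [order_date.month, order_date.day, is_weekend]

# Assumed year: January 20, 2019 was a Sunday.
weekend_order = time_attribute_features(date(2019, 1, 20))   # [1, 20, 1]
weekday_order = time_attribute_features(date(2019, 1, 21))   # [1, 21, 0]
```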
In the present invention, the amount attribute feature of the first transaction sample data may indicate various statistics of the transaction amount over a historical time window. In one example, the historical time window may be the 7, 14, 21, 30, 60, or 90 days, etc., preceding the transaction, or the 1, 3, 5, or 10 transactions, etc., preceding the transaction. Further, in one example, the statistics may include the mean, sum, median, standard deviation, maximum, minimum, etc., of the transaction amount. Note that the historical time windows and statistics described above are merely examples, and the present invention is not limited thereto. As an illustrative example, when the amount attribute feature of the first transaction sample data includes the average amount within the 7 days before the transaction and the maximum amount within the 14 days before the transaction, then for first transaction sample data whose 7-day average is 100000 and whose 14-day maximum is 200000, the feature vector of the amount attribute feature may be represented as [100000, 200000]. This example is for illustration only; the present invention may include any amount attribute feature, or any combination of amount attribute features, extracted from the first transaction sample data.
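A sketch of the 7-day mean and 14-day maximum from the example follows. The history format (a list of `(date, amount)` pairs), the half-open window that excludes the transaction's own date, the 0 default for an empty window, and the function names are all assumptions for illustration.

```python
from datetime import date, timedelta
from statistics import mean

def amounts_in_window(history, txn_date, days):
    """Amounts of past transactions dated within `days` days before txn_date."""
    start = txn_date - timedelta(days=days)
    return [amount for d, amount in history if start <= d < txn_date]

def amount_attribute_features(history, txn_date):
    """Return [mean amount over the prior 7 days, max amount over the prior 14 days]."""
    last_7 = amounts_in_window(history, txn_date, 7)
    last_14 = amounts_in_window(history, txn_date, 14)
    # Assumption: an empty window contributes 0 rather than raising an error.
    return [mean(last_7) if last_7 else 0, max(last_14, default=0)]

# Hypothetical transaction history: (date, amount)
history = [
    (date(2019, 3, 1), 200000),
    (date(2019, 3, 9), 80000),
    (date(2019, 3, 12), 120000),
]
amount_vector = amount_attribute_features(history, date(2019, 3, 14))
```

With this history the 7-day window holds 80000 and 120000 and the 14-day window additionally holds 200000, which gives the [100000, 200000] vector used in the example above.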
The above describes, using the time attribute feature and the amount attribute feature as examples, how a feature is calculated for a specific first transaction sample data. The amount distribution attribute feature, the same-class attribute feature, and the same-region attribute feature described below are calculated in a similar way. For brevity, the description of their calculation methods is therefore omitted below.
In the present invention, the amount distribution attribute feature of the first transaction sample data may indicate whether the transaction amount is ten, one hundred, one thousand, ten thousand, etc., whether the transaction amount exceeds 1,2, 3 times of the historical transaction amount, the number of times of exceeding in the historical time window, etc. For example, as an illustrative example, when the first transaction sample has a transaction amount of 200, 10, 230, 17 over the past 10 days, and an average transaction amount of 150 before supply chain finance, then one of the following features may be constructed: "the number of transactions over the past 10 days, numerically higher than the average transaction before supply chain finance", i.e., the characteristic count (200>150,10>150,230>150,17>150) ═ 2. However, the above illustrative examples are for illustration only, and the present invention may include any monetary distribution attribute feature or combination of monetary distribution attribute features extracted from the first transaction sample data.
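The counting feature from the example above can be sketched in a few lines. The helper names `count_exceeding` and `is_round_amount` are hypothetical, and treating "round amount" as "exactly divisible by the unit" is an assumed reading of the text.

```python
def count_exceeding(amounts, baseline, multiple=1.0):
    """Count amounts strictly greater than multiple * baseline."""
    return sum(1 for amount in amounts if amount > multiple * baseline)


def is_round_amount(amount, unit):
    """True if the amount is a whole multiple of unit (10, 100, 1000, ...)."""
    return amount % unit == 0


recent = [200, 10, 230, 17]   # amounts over the past 10 days
n_over = count_exceeding(recent, baseline=150)
print(n_over)                 # 2
```

Passing `multiple=2.0` or `multiple=3.0` gives the "exceeds 2 or 3 times the historical amount" variants mentioned in the text.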
In the present invention, the same-class attribute feature of the first transaction sample data may indicate a relationship between a transaction amount of the first transaction sample data and a transaction amount of a company and/or a business of the same class in a historical time window, and the like. For example, the relationship between the transaction amount of the first transaction sample data and the transaction amount of the company and/or business of the same class in the historical time window may indicate a multiple relationship between the transaction amount of the first transaction sample data and the average transaction amount of the company and/or business of the same class in the historical time window.
In the present invention, the same-region attribute feature of the first transaction sample data may indicate a relationship between the transaction amount of the first transaction sample data and the transaction amounts of companies and/or enterprises in the same region in a historical time window, and the like. For example, this relationship may be the multiple by which the transaction amount of the first transaction sample data exceeds or falls short of the average transaction amount of companies and/or enterprises in the same region in the historical time window.
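The multiple relationship used by the same-class and same-region features can be sketched as a simple ratio. The function name `peer_multiple` and the fallback of 0.0 when no peer data is available are assumptions for illustration; how peers are grouped by class or region is not shown.

```python
from statistics import mean


def peer_multiple(amount, peer_amounts):
    """Ratio of a transaction amount to the average amount of its peers.

    peer_amounts: amounts of same-class (or same-region) enterprises within
    the historical time window. Returns 0.0 if no usable peer data exists
    (an assumed fallback).
    """
    if not peer_amounts:
        return 0.0
    average = mean(peer_amounts)
    return amount / average if average else 0.0


ratio = peer_multiple(300, [100, 200, 300])
print(ratio)  # 1.5
```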
In step S130, based on the first transaction sample feature set, an unsupervised machine learning algorithm is used to train and obtain an abnormal transaction detection model, and the core position and radius of each cluster obtained based on the unsupervised machine learning algorithm are recorded. Here, the core position of the cluster may indicate the centroid position of the cluster, and the radius of the cluster may indicate the distance of the farthest point (i.e., sample) in the cluster to the core.
In the present invention, an unsupervised machine learning algorithm is adopted in view of the high-frequency and time-series characteristics specific to transactions, so that transaction detection can be simplified while its accuracy requirement is still met.
Here, the unsupervised machine learning algorithm may include a k-means algorithm, a DBSCAN algorithm, an isolation forest algorithm, etc. For simplicity and ease of understanding, the abnormal transaction detection model is mainly described below by taking the k-means algorithm as an example; however, it should be noted that the other unsupervised machine learning algorithms mentioned above can be applied to the abnormal transaction detection model of the present invention in a manner similar to the k-means algorithm.
Specifically, in one embodiment, when the unsupervised machine learning algorithm is a k-means algorithm, the step of training the abnormal transaction detection model using the k-means algorithm may include: determining the core positions of k initial clusters in the first transaction sample feature set, wherein the value of k is determined based on the first transaction sample feature set; and clustering the first transaction sample feature set using the k-means algorithm, starting from the determined core positions of the k initial clusters, until the standard measure function converges. Here, the standard measure function is generally a mean square error function.
In this embodiment, the value of k is determined based on the first transaction sample feature set. In other words, the value of k is determined based on each first sample feature in the first set of transaction sample features. Suitable values for k may be determined experimentally and/or calculated by a variety of methods as follows. However, the method of determining a suitable value of k is not limited to the example methods described below and/or any combination of the example methods described below, and any other known method suitable for determining a value of k is possible.
In one example, the appropriate value of k may be determined by the silhouette coefficient. The silhouette coefficient combines the cohesion and the separation of the clusters and is used to evaluate the clustering effect. Its value lies between -1 and 1, and a larger value indicates a better clustering effect. The specific calculation is as follows: for each sample point i, calculate the average distance between i and all other points in the same cluster, denoted a(i), which quantifies the cohesion within the cluster; select a cluster b that does not contain i, calculate the average distance between i and all points in b, traverse all other clusters in the same way, and take the smallest of these average distances, denoted b(i), which corresponds to the neighboring cluster of i and quantifies the separation between clusters; for sample point i, the silhouette coefficient is $s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$; finally, calculate the silhouette coefficients of all points i and take their average, which is the overall silhouette coefficient of the current clustering and measures how tight the clustering is. If s(i) is less than 0, the average distance between i and the points in its own cluster is greater than its average distance to the nearest other cluster, which indicates a poor clustering effect. If a(i) is close to 0, or b(i) is sufficiently large, i.e., a(i) << b(i), then s(i) is close to 1, which indicates a good clustering effect. In the present invention, when the silhouette coefficient is greater than or equal to a predetermined value, the k value at that time can be considered suitable. In addition, in the above processing, k is generally not set to a large value.
By enumeration, k is varied from 2 up to a fixed value (for example, 10); for each value of k, the k-means algorithm is run several times (to avoid local optima) and the average silhouette coefficient for that k is calculated; finally, the k with the largest silhouette coefficient is selected as the final number of clusters.
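The silhouette calculation described above can be sketched in pure Python as follows. This is an illustrative sketch under stated assumptions: Euclidean distance is used, singleton clusters are scored 0 (a common convention the text does not specify), at least two clusters with non-identical points are required, and the function name `silhouette` is hypothetical. The k-means runs that would produce the candidate labelings for each k are not shown; two hand-written labelings stand in for them.

```python
from math import dist
from statistics import mean


def silhouette(points, labels):
    """Average silhouette coefficient s(i) over all points."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a(i): mean distance to the other members of the same cluster
        # (the zero distance from p to itself does not affect the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(p, q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
candidates = {"left/right": [0, 0, 1, 1], "mixed": [0, 1, 0, 1]}
scores = {name: silhouette(points, labs) for name, labs in candidates.items()}
best = max(scores, key=scores.get)
print(best)  # left/right
```

In an enumeration over k, the same `max` over scores would be applied to the best labeling found for each candidate k.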
In another example, the appropriate value of k may be determined by the elbow method. Here, the core index of the elbow method is the SSE (sum of squared errors), $\mathrm{SSE} = \sum_{i=1}^{k} \sum_{p \in C_i} \lVert p - m_i \rVert^2$, where C_i is the i-th cluster, p is a sample point in C_i, and m_i is the centroid of C_i (the mean of all samples in C_i). The SSE is the clustering error over all samples and represents how good the clustering effect is. The core idea of the elbow method is as follows: as the number k of clusters increases, the samples are divided more finely, the cohesion of each cluster gradually increases, and the SSE naturally becomes smaller. When k is smaller than the true number of clusters, increasing k greatly increases the cohesion of each cluster, so the SSE drops sharply. Once k reaches the true number of clusters, the gain in cohesion from further increasing k diminishes rapidly, so the drop in SSE shrinks abruptly and then levels off as k continues to increase. In other words, the plot of SSE against k has the shape of an elbow, and the value of k at the elbow is the true number of clusters in the data.
After a suitable value of k is determined, the core positions of the k initial clusters need to be determined. In one example, a point (i.e., a feature of a first transaction sample) may be randomly selected as the first initial cluster center point (i.e., core position); the point farthest from that point is then selected as the second initial cluster center point; the point whose nearest distance to the first two points is the largest is then selected as the third initial cluster center point; and so on, until k initial cluster center points are selected. In another example, a hierarchical clustering algorithm or a Canopy algorithm is used for initial clustering, and the center points of the resulting clusters are then used as the initial cluster center points of the k-means algorithm. However, the above examples of determining the core positions of the k initial clusters are illustrative, and the present invention is not limited thereto.
When the standard measure function converges, clustering of the first transaction sample feature set by the k-means algorithm stops. At this point, training of the abnormal transaction detection model may be considered complete. In this case, the core position and radius of each cluster obtained based on the unsupervised machine learning algorithm (e.g., the k-means algorithm) are recorded.
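The SSE index can be computed directly from a labeling, as in the following sketch. The function name `sse` is hypothetical, and the labelings for k = 1 to 4 are written by hand here rather than produced by a clustering run.

```python
def sse(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    total = 0.0
    for members in clusters.values():
        # Centroid m_i: coordinate-wise mean of the cluster's points.
        centroid = [sum(column) / len(members) for column in zip(*members)]
        total += sum(
            sum((x - m) ** 2 for x, m in zip(p, centroid)) for p in members
        )
    return total


points = [(0,), (2,), (10,), (12,)]
curve = {
    1: sse(points, [0, 0, 0, 0]),
    2: sse(points, [0, 0, 1, 1]),
    3: sse(points, [0, 0, 1, 2]),
    4: sse(points, [0, 1, 2, 3]),
}
print(curve)  # {1: 104.0, 2: 4.0, 3: 2.0, 4: 0.0}
```

In this toy data the SSE falls from 104 to 4 between k = 1 and k = 2 and barely changes afterwards, so the elbow is at k = 2.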
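The first initialization example can be sketched as a farthest-point heuristic. The sketch assumes that each center after the first is the point whose distance to its nearest already-chosen center is largest; the function name `farthest_point_init` and the fixed random seed are illustrative choices.

```python
import random
from math import dist


def farthest_point_init(points, k, seed=0):
    """Pick k initial cluster centers by repeatedly taking the farthest point."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: chosen at random
    while len(centers) < k:
        # Next center: the point farthest from its nearest chosen center.
        next_center = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(next_center)
    return centers


points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
centers = farthest_point_init(points, k=3)
print(centers)
```

For this data the three centers always cover the left group, the right group, and the isolated point (5, 8), whichever point is drawn first.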
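The training loop and the recording of each cluster's core position and radius can be sketched as follows. This is a minimal pure-Python sketch, not the patented implementation: the names `assign` and `train`, the tolerance `tol`, the iteration cap, and the handling of empty clusters (keeping the previous center) are assumptions. Convergence is checked on the mean square error, following the text.

```python
from math import dist


def assign(points, centers):
    """Group points by their nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda j: dist(p, centers[j]))
        groups[nearest].append(p)
    return groups


def train(points, init_centers, max_iter=100, tol=1e-9):
    """Run k-means and return (core positions, radii) of the clusters."""
    centers = [tuple(float(x) for x in c) for c in init_centers]
    prev_mse = float("inf")
    for _ in range(max_iter):
        groups = assign(points, centers)
        # Move each center to the mean of its members; an empty cluster
        # keeps its previous center (assumed handling).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        mse = sum(
            dist(p, c) ** 2 for g, c in zip(groups, centers) for p in g
        ) / len(points)
        if prev_mse - mse <= tol:  # mean square error has stopped improving
            break
        prev_mse = mse
    # Radius: distance from the core to the farthest member of the cluster.
    groups = assign(points, centers)
    radii = [
        max((dist(p, c) for p in g), default=0.0)
        for g, c in zip(groups, centers)
    ]
    return centers, radii


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers, radii = train(points, init_centers=[(0, 0), (10, 2)])
print(centers, radii)  # [(0.0, 1.0), (10.0, 1.0)] [1.0, 1.0]
```

The returned `centers` and `radii` are the two quantities the text says are recorded for later use in detection.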
Further, optionally, before training the abnormal transaction detection model, the features of each first transaction sample in the first transaction sample feature set may be normalized by column (column-wise). Column-wise normalization normalizes each feature value in the feature vector corresponding to the features of each first transaction sample, thereby facilitating subsequent calculation. Since column-wise normalization is a well-known technique in the related art, it is not described in detail here.
After the column-wise normalization, the abnormal transaction detection model is trained using the unsupervised machine learning algorithm based on the column-normalized features of each first transaction sample. The specific training process may be the same as or similar to that described above with reference to the k-means algorithm.
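One common form of column-wise normalization is min-max scaling, sketched below. The text does not say which normalization is used, so min-max scaling is an assumption, as are the names `fit_minmax` and `apply_minmax` and the mapping of constant columns to 0.0.

```python
def fit_minmax(rows):
    """Return per-column minimums and maximums of a list of feature vectors."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_minmax(row, lows, highs):
    """Scale one feature vector column by column into [0, 1]."""
    return [
        (x - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0.0 (assumed)
        for x, lo, hi in zip(row, lows, highs)
    ]


rows = [[100000, 200000], [50000, 400000], [150000, 300000]]
lows, highs = fit_minmax(rows)
normalized = [apply_minmax(r, lows, highs) for r in rows]
print(normalized[0])  # [0.5, 0.0]
```

If features are normalized before training, it seems reasonable to apply the same `lows` and `highs` to a sample at detection time so that distances are measured on the same scale.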
Further, optionally, the training method further comprises: outputting the abnormal transaction detection model and the recorded core position and radius of each cluster.
FIG. 2 shows a flow diagram of an anomalous transaction detection method in accordance with an embodiment of the present invention.
Referring to fig. 2, the abnormal transaction detecting method according to the embodiment of the present invention includes steps S210, S220, S230, and S240.
In step S210, second transaction sample data to be detected is received.
In one embodiment, the second transaction sample data is a transaction sample from after the enterprise joins supply chain finance. In particular, after joining supply chain finance, an enterprise's business behavior may include false and/or abnormal transactions, because the prospect of bank loans gives the enterprise an incentive to fabricate transaction data (e.g., orders). Therefore, it is necessary to confirm whether a transaction sample from after the enterprise joins supply chain finance is an abnormal transaction sample. However, the second transaction sample data is not limited to such a transaction sample; it may be any transaction sample, in other transaction-centered scenarios, for which it is to be confirmed whether the sample is abnormal.
In step S220, feature extraction processing is performed on the second transaction sample data to obtain features of the second transaction sample.
In one embodiment, the feature extraction process on the second transaction sample data may be the same as the process of feature extraction process on the first transaction sample data set as described above.
In step S230, the characteristics of the second transaction sample are input into an abnormal transaction detection model based on the unsupervised machine learning algorithm, and a prediction result is obtained.
Here, the abnormal-transaction detection model based on the unsupervised machine learning algorithm can be obtained by any of the training methods described with reference to fig. 1. For the sake of brevity, the training method will not be described in detail here. Further, the prediction result may indicate that the features of the second transaction sample have been mapped to a space made up of the features of the respective first transaction samples used to train the anomalous transaction detection model.
In step S240, a determination is made according to the prediction result and the core position and radius of each cluster of the abnormal transaction detection model, and when it is determined that the feature of the second transaction sample does not belong to any cluster of the abnormal transaction detection model, the second transaction sample data is output as a detection result of the abnormal transaction.
Specifically, when, for every cluster, the distance between the feature of the second transaction sample and the core position of the cluster is greater than the product of a predetermined enabling coefficient and the radius of that cluster, it is determined that the feature of the second transaction sample does not belong to any cluster of the abnormal transaction detection model. Here, the feature of the second transaction sample may be understood as a feature vector, and the core position of each cluster likewise corresponds to a feature vector.
In general, it is difficult to define what counts as abnormal from the result of a clustering algorithm. Conventional methods usually treat samples that cannot be classified as abnormal, or treat a particular class as abnormal, but this does not fit financial business scenarios. Because a wide variety of businesses may participate in finance (e.g., a supply chain), an unclassifiable sample or a sample of a particular class may simply reflect a genuine business transaction of a specific kind rather than an anomaly, so the conventional methods are not effective. In contrast, as described above, the present invention does not judge abnormality by class membership; instead, it compares the distance in the high-dimensional feature space between the transaction sample to be detected and the normal transaction samples, and if the distance exceeds the early-warning threshold, the sample is considered an abnormal transaction. In this way, whether the transaction sample to be detected is abnormal can be detected effectively.
In the present invention, the enabling coefficient is also a risk tolerance coefficient. The invention introduces the enabling coefficient in order to better fit transaction scenarios and to improve the accuracy of identifying abnormal transactions. For example, in an example supply chain finance scenario, enterprise A was previously short of funds and could place an order with enterprise B only once a month, with an order amount of no more than 100 yuan, and had to wait until it had sold its goods and collected the proceeds before placing the next order. After financial services become available, enterprise A can obtain credit and place an order of 130 yuan once a month, with the bank paying enterprise B, so that enterprise A can sell more goods. With such healthy development, enterprise A grows in scale and its order amounts rise. Continuing to treat the earlier 100 yuan as the maximum amount of a normal transaction would clearly be unreasonable; the limit may instead be, for example, 130 yuan. In that case, supply chain financial services are considered able to help enterprise A grow by at most 30%, and the enabling coefficient is 1.3. Of course, too large an enabling coefficient also causes problems. For example, if the limit is raised to 200 yuan, the enabling coefficient is 2, indicating 100% growth in the business; this gives enterprise A an incentive to construct false transactions (e.g., orders) to obtain credit fraudulently, since the gain from fraudulently obtained credit is significantly greater than that from normal business activity. That is, the enabling coefficient reflects both the development status of the enterprise and the bank's tolerance of potential fraud risk.
The enabling coefficient may be set to a value greater than 1 when the enterprise's development status improves (e.g., it joins supply chain finance) and/or the bank's tolerance of potential fraud risk increases (e.g., the bank encourages lending). The enabling coefficient may be set to a value less than 1 when the enterprise's development status deteriorates (e.g., its credit decreases) and/or the bank's tolerance of potential fraud risk decreases (e.g., the bank tightens lending). In other words, the enabling coefficient of the present invention can be changed according to the development status of the enterprise and the bank's tolerance of potential fraud risk, so that the accuracy of abnormal transaction detection is improved.
Further, when the distance between the feature of the second transaction sample and the core position of at least one cluster is equal to or less than the product of the predetermined enabling coefficient and the radius of that cluster, a detection result that the second transaction sample data is a normal transaction is output.
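The decision rule of step S240 can be sketched as a single comparison per cluster. The function name `is_abnormal`, the Euclidean distance, and the example centers and radii are assumptions for illustration; the coefficient 1.3 is taken from the worked example in the text.

```python
from math import dist


def is_abnormal(feature, centers, radii, coefficient=1.3):
    """True if the feature lies outside every cluster's scaled radius.

    A sample is normal as soon as it falls within coefficient * radius of
    at least one cluster core, and abnormal only if it falls outside all.
    """
    return all(
        dist(feature, center) > coefficient * radius
        for center, radius in zip(centers, radii)
    )


centers = [(0.0, 1.0), (10.0, 1.0)]  # recorded core positions (example values)
radii = [1.0, 1.0]                   # recorded radii (example values)

print(is_abnormal((0.0, 2.2), centers, radii))                   # False
print(is_abnormal((5.0, 1.0), centers, radii))                   # True
print(is_abnormal((0.0, 2.2), centers, radii, coefficient=1.0))  # True
```

The first and third calls differ only in the coefficient: a point 1.2 away from a core of radius 1 is accepted with a coefficient of 1.3 and flagged with a coefficient of 1.0, which is the tolerance effect the text describes.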
In addition, optionally, the abnormal transaction detection method according to the embodiment of the present invention may further include: selectively updating the abnormal transaction detection model based on the second transaction sample data in response to a detection result that the second transaction sample data is a normal transaction. Since the abnormal transaction detection model can be selectively updated based on second transaction samples detected as normal transactions, the model can be kept in a relatively accurate state. Here, the selective updating adopted by the present invention effectively prevents a second transaction sample that was erroneously detected as a normal transaction from degrading the accuracy of the abnormal transaction detection model. For example, when the second transaction sample data is detected as a normal transaction, the second transaction sample data together with the training transaction sample data of the abnormal transaction detection model may be used as new training transaction sample data and input to train and thereby update the abnormal transaction detection model. In this case, the training steps in the update process may be the same as or similar to the training process described with reference to FIG. 1.
In one embodiment, the abnormal transaction detection model may be updated at a predetermined period. For example, the predetermined period may be one month or one quarter. Updating the abnormal transaction detection model at a predetermined period keeps the model easy to maintain and keeps its detection accuracy at a high level.
In another embodiment, when the second transaction sample data is detected as a normal transaction and is also confirmed as a normal transaction through bank and/or enterprise feedback (for example, through a transaction authenticity labeling platform inside the bank), the abnormal transaction detection model can be updated based on the second transaction sample data. In this embodiment, updating the abnormal transaction detection model based on second transaction sample data that has been confirmed twice as a normal transaction can further improve the detection accuracy of the model.
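The twice-confirmed selection described above can be sketched as a filter applied before retraining. The record layout `(transaction id, features, abnormal flag)`, the set of externally confirmed ids, and the function name `select_confirmed_normals` are all hypothetical; the retraining itself would reuse the training procedure and is not shown.

```python
def select_confirmed_normals(detections, confirmed_ids):
    """Keep features of samples detected as normal AND confirmed by feedback."""
    return [
        features
        for tx_id, features, abnormal in detections
        if not abnormal and tx_id in confirmed_ids
    ]


training_set = [[0.1, 0.2], [0.2, 0.1]]
detections = [
    ("t1", [0.15, 0.15], False),  # detected normal, confirmed -> used
    ("t2", [0.9, 0.9], True),     # detected abnormal -> never used
    ("t3", [0.3, 0.2], False),    # detected normal, not confirmed -> skipped
]
confirmed = {"t1", "t2"}

new_training_set = training_set + select_confirmed_normals(detections, confirmed)
print(new_training_set)  # [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15]]
```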
In one embodiment of the present invention, there is also provided a computing device comprising one or more processors and one or more memories, wherein the one or more memories store a computer program that, when executed by the one or more processors, implements any of the methods disclosed herein.
The computing device may specifically be the device shown in fig. 3 or fig. 4.
FIG. 3 illustrates a training apparatus of an abnormal transaction detection model according to an embodiment of the present invention.
Referring to fig. 3, the training apparatus 300 of the abnormal transaction detection model may include one or more processors 310 and a memory 320. The memory 320 stores a computer program, wherein the computer program, when executed by the one or more processors 310, implements any of the training methods described with reference to fig. 1. The one or more processors 310 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor, a microcontroller, or a microprocessor. By way of example, and not limitation, the processor may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like. Data and/or instructions between the one or more processors 310 and the memory 320 may be sent and received over a network via a network interface device (not shown) that may employ any known transmission protocol.
For example, the computer programs, when executed by the one or more processors 310, may cause the one or more processors 310 to perform and/or implement the following: receiving the acquired first transaction sample data set, wherein each first transaction sample data in the first transaction sample data set is normal transaction sample data; performing feature extraction processing on the first transaction sample data set to obtain a first transaction sample feature set; and training by adopting an unsupervised machine learning algorithm based on the first transaction sample feature set to obtain an abnormal transaction detection model, wherein the core position and the radius of each cluster obtained based on the unsupervised machine learning algorithm are recorded through the abnormal transaction detection model.
Fig. 4 shows an abnormal transaction detecting apparatus according to an embodiment of the present invention.
Referring to fig. 4, the anomalous transaction detection device 400 may include one or more processors 410 and memory 420. The memory 420 stores a computer program that, when executed by the one or more processors 410, implements any of the anomalous transaction detection methods described with reference to fig. 2. The one or more processors 410 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor, a microcontroller, or a microprocessor. By way of example, and not limitation, the processor may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like. Data and/or instructions between the one or more processors 410 and the memory 420 may be sent and received over a network via a network interface device (not shown) that may employ any known transmission protocol.
For example, the computer programs, when executed by the one or more processors 410, may cause the one or more processors 410 to perform and/or implement the following: receiving second transaction sample data to be detected; performing feature extraction processing on the second transaction sample data to obtain features of the second transaction sample; inputting the characteristics of the second transaction sample into an abnormal transaction detection model based on an unsupervised machine learning algorithm to obtain a prediction result; and judging according to the prediction result and the core position and the radius of each cluster of the abnormal transaction detection model, and outputting second transaction sample data as the detection result of the abnormal transaction when judging that the characteristics of the second transaction sample do not belong to any cluster of the abnormal transaction detection model.
FIG. 5 illustrates a training apparatus for an anomalous transaction detection model according to an embodiment of the present invention.
Referring to fig. 5, the training apparatus 500 of the abnormal transaction detection model according to an embodiment of the present invention may include a receiving unit 510, a feature extraction unit 520, and a training and recording unit 530. Here, the training device 500 of the abnormal transaction detection model may perform any of the training methods described with reference to fig. 1. The receiving unit 510, the feature extraction unit 520 and the training and recording unit 530 are described in more detail below. Note that, for the sake of brevity, a detailed description related to any of the training methods described with reference to fig. 1 is omitted below, however, a detailed description related to any of the training methods described with reference to fig. 1 may be applied to a corresponding unit (e.g., the receiving unit 510, the feature extraction unit 520, or the training and recording unit 530) that performs one or more steps in the training method.
In the present invention, the receiving unit 510 may be configured to receive the acquired first transaction sample data set, wherein each first transaction sample data in the first transaction sample data set is normal transaction sample data. In one embodiment, the first transaction sample data is a transaction sample prior to the enterprise joining the supply chain finance.
In the present invention, the feature extraction unit 520 may be configured to perform a feature extraction process on the first transaction sample data set to obtain a first transaction sample feature set. In one embodiment, the features of each first transaction sample in the first transaction sample feature set comprise one or more of the following features extracted from the first transaction sample data: time attribute feature, amount attribute feature, amount distribution attribute feature, same-class attribute feature, and same-region attribute feature.
In the present invention, the training and recording unit 530 may be configured to train and obtain an abnormal transaction detection model by using an unsupervised machine learning algorithm based on the first transaction sample feature set, and to record the core position and radius of each cluster obtained based on the unsupervised machine learning algorithm. Here, the unsupervised machine learning algorithm may include a k-means algorithm, a DBSCAN algorithm, an isolation forest algorithm, etc.
In one embodiment, the training and recording unit 530 may be configured to: normalizing the characteristics of each first transaction sample in the first transaction sample characteristic set by columns; and training by adopting an unsupervised machine learning algorithm to obtain an abnormal transaction detection model based on the characteristics of the first transaction samples normalized according to the columns.
When the unsupervised machine learning algorithm is a k-means algorithm, the training and recording unit 530 may be configured to: determine the core positions of k initial clusters in the first transaction sample feature set, wherein the value of k is determined based on the first transaction sample feature set; and cluster the first transaction sample feature set using the k-means algorithm, starting from the determined core positions of the k initial clusters, until the standard measure function converges.
In an optional embodiment, the training apparatus 500 for abnormal transaction detection model further comprises an output unit (not shown), wherein the output unit is configured to output the abnormal transaction detection model and the recorded core position and radius of each cluster.
Fig. 6 illustrates an abnormal transaction detecting apparatus according to an embodiment of the present invention.
Referring to fig. 6, the abnormal transaction detecting apparatus 600 according to an embodiment of the present invention may include a receiving unit 610, a feature processing unit 620, an input unit 630, and a detection unit 640.
Here, the abnormal transaction detecting apparatus 600 may perform any of the abnormal transaction detection methods described with reference to fig. 2. The receiving unit 610, the feature processing unit 620, the input unit 630, and the detection unit 640 are described in more detail below. Note that, for the sake of brevity, a detailed description of the abnormal transaction detection methods described with reference to fig. 2 is omitted below; however, that description applies equally to the corresponding unit (e.g., the receiving unit 610, the feature processing unit 620, the input unit 630, or the detection unit 640) that performs one or more steps of the abnormal transaction detection method.
In the present invention, the receiving unit 610 may be configured to receive second transaction sample data to be detected. In one embodiment, the second transaction sample data is a transaction sample after the enterprise joins the supply chain finance.
In the present invention, the feature processing unit 620 may be configured to perform a feature extraction process on the second transaction sample data, so as to obtain features of the second transaction sample. In one embodiment, the feature extraction performed by the feature processing unit 620 may be the same as or similar to that performed by the feature extraction unit 520 in FIG. 5.
In the present invention, the input unit 630 may be configured to input the characteristics of the second transaction sample into an abnormal transaction detection model based on an unsupervised machine learning algorithm, resulting in a predicted result. Here, the prediction result may indicate that the features of the second transaction sample have been mapped to a space made up of the features of the respective first transaction samples used to train the anomalous transaction detection model.
In the present invention, the detecting unit 640 may be configured to perform a determination according to the prediction result and a core position and a radius of each cluster of the abnormal transaction detection model, and output the second transaction sample data as a detection result of the abnormal transaction when it is determined that the feature of the second transaction sample does not belong to any cluster of the abnormal transaction detection model.
In one embodiment, the detection unit 640 may be configured to: and when the distance between the characteristic of the second transaction sample and the core position of each cluster is larger than the product of a preset enabling coefficient and the radius of the corresponding cluster, judging that the characteristic of the second transaction sample does not belong to any cluster of the abnormal transaction detection model. Here, the energization coefficient may be the energization coefficient described with reference to the embodiment of fig. 2. Further optionally, the detection unit is configured to: outputting a detection result that a second transaction sample data is a normal transaction when a distance between a feature of the second transaction sample and a core position of at least one cluster is equal to or less than a product of a predetermined enabling factor and a radius of the at least one cluster.
Further optionally, the abnormal transaction detection apparatus 600 may further comprise an updating unit (not shown), wherein the updating unit may be configured to selectively update the abnormal transaction detection model based on the second transaction sample data in response to a detection result indicating that the second transaction sample data is a normal transaction. In one embodiment, the updating unit may be configured to combine the second transaction sample data with the training transaction sample data of the abnormal transaction detection model to form new training transaction sample data, and to use the new training transaction sample data as the training input of the abnormal transaction detection model so as to update the abnormal transaction detection model.
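As a non-limiting illustration, the selective update may be sketched in Python as follows. The function name `update_with_normal_sample` and the `train` callback, which stands in for whatever training procedure produced the model, are hypothetical and are introduced only for this sketch.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Sample = Sequence[float]


def update_with_normal_sample(
    training_samples: Sequence[Sample],
    new_sample: Sample,
    is_normal: bool,
    train: Callable[[List[Sample]], Any],  # hypothetical training callback
) -> Tuple[List[Sample], Optional[Any]]:
    """Retrain on the enlarged sample set only when the new sample was judged normal.

    Returns the (possibly enlarged) training set and the retrained model,
    or None in place of the model when no update was performed.
    """
    if not is_normal:
        # Abnormal samples are never merged into the training data.
        return list(training_samples), None
    merged = list(training_samples) + [new_sample]
    return merged, train(merged)
```

Restricting the update to samples judged normal keeps the training data consistent with the premise that the model is trained on normal transaction samples only.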
The training method and the training apparatus of the abnormal transaction detection model, and the abnormal transaction detection method and the abnormal transaction detection apparatus according to the exemplary embodiments of the present invention have been described above with reference to fig. 1 to 6. However, it should be understood that: the devices, systems, units, etc. used in fig. 1-6 may each be configured as software, hardware, firmware, or any combination thereof that performs a particular function. For example, these systems, devices, units, etc. may correspond to dedicated integrated circuits, may correspond to pure software programs, and may correspond to units combining software and hardware. Further, one or more functions implemented by these systems, apparatuses, or units, etc. may also be uniformly executed by components in a physical entity device (e.g., processor, client, server, etc.).
Further, the training method described above may be realized by a computer program recorded on a computer-readable storage medium. For example, according to an exemplary embodiment of the present invention, a computer-readable storage medium may be provided, in which a computer program is stored, which, when being executed by a processor, carries out any of the training methods disclosed in the present application.
For example, the computer program when executed by a processor realizes the steps of: receiving the acquired first transaction sample data set, wherein each first transaction sample data in the first transaction sample data set is normal transaction sample data; performing feature extraction processing on the first transaction sample data set to obtain a first transaction sample feature set; and training by adopting an unsupervised machine learning algorithm based on the first transaction sample feature set to obtain an abnormal transaction detection model, wherein the core position and the radius of each cluster obtained based on the unsupervised machine learning algorithm are recorded through the abnormal transaction detection model.
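As a non-limiting illustration, the training and recording steps may be sketched in Python using a plain k-means procedure. The function name `train_kmeans`, the fixed iteration count, the random seed, and the definition of a cluster radius as the largest distance from the core position to any member of the cluster are all assumptions made for this sketch; other unsupervised algorithms and other radius definitions may equally be used.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def _mean(points: List[Point]) -> List[float]:
    """Column-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def _assign(features: Sequence[Point], centers: List[List[float]]) -> List[List[Point]]:
    """Group each feature vector with its nearest center (Euclidean distance)."""
    clusters: List[List[Point]] = [[] for _ in centers]
    for f in features:
        idx = min(range(len(centers)), key=lambda i: math.dist(f, centers[i]))
        clusters[idx].append(f)
    return clusters


def train_kmeans(
    features: Sequence[Point], k: int, iterations: int = 20, seed: int = 0
) -> Tuple[List[List[float]], List[float]]:
    """Cluster normal-transaction features and record each cluster's core position and radius."""
    rng = random.Random(seed)
    # Initialize the centers from k distinct training samples.
    centers = [list(p) for p in rng.sample(list(features), k)]
    for _ in range(iterations):
        clusters = _assign(features, centers)
        # An empty cluster keeps its previous center.
        centers = [_mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    clusters = _assign(features, centers)
    # Assumed radius definition: distance from the center to the farthest member.
    radii = [
        max((math.dist(f, centers[i]) for f in c), default=0.0)
        for i, c in enumerate(clusters)
    ]
    return centers, radii
```

The returned centers and radii are the quantities that the detection stage later compares against the features of a transaction to be detected.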
Further, the above-described abnormal transaction detecting method may be implemented by a computer program recorded on a computer-readable storage medium. For example, according to an exemplary embodiment of the present invention, a computer-readable storage medium may be provided, in which a computer program is stored, which when executed by a processor implements any of the abnormal transaction detecting methods disclosed in the present application.
For example, the computer program when executed by a processor realizes the steps of: receiving second transaction sample data to be detected; performing feature extraction processing on the second transaction sample data to obtain features of the second transaction sample; inputting the features of the second transaction sample into an abnormal transaction detection model based on an unsupervised machine learning algorithm to obtain a prediction result; and making a determination according to the prediction result and the core position and radius of each cluster of the abnormal transaction detection model, and outputting a detection result indicating that the second transaction sample data is an abnormal transaction when it is determined that the features of the second transaction sample do not belong to any cluster of the abnormal transaction detection model.
The computer program in the computer-readable storage medium may be executed in an environment deployed in a computer device such as a client, a host, a proxy apparatus, or a server. It should be noted that the computer program may also be used to perform additional steps beyond those described above, or to perform more specific processing when the above steps are performed. The contents of these additional steps and further processing have been mentioned in the description of the related methods and apparatuses with reference to FIGS. 1 to 4, and are therefore not described again here to avoid repetition.
The technical solution of the present invention for detecting abnormal transactions takes the characteristics of financial transaction scenarios into consideration and adopts an unsupervised machine learning algorithm, thereby realizing a simple model that is interpretable and easy to visualize. Such a model can meet regulatory requirements, can present the algorithm logic visually to business personnel so as to help them better understand the early-warning logic, can provide suggestions for subsequent handling of a transaction, and can provide higher detection accuracy.
While exemplary embodiments of the present application have been described above, it should be understood that the above description is exemplary only, and not exhaustive, and that the present application is not limited to the exemplary embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the present application. Therefore, the protection scope of the present application shall be subject to the scope of the claims.

Claims (10)

1. A method of training an anomalous transaction detection model, comprising:
receiving the acquired first transaction sample data set, wherein each first transaction sample data in the first transaction sample data set is normal transaction sample data;
performing feature extraction processing on the first transaction sample data set to obtain a first transaction sample feature set;
and training, based on the first transaction sample feature set, an abnormal transaction detection model using an unsupervised machine learning algorithm, and recording the core position and radius of each cluster obtained by the unsupervised machine learning algorithm.
2. The training method of claim 1, wherein the training method further comprises:
outputting the abnormal transaction detection model and the recorded core position and radius of each cluster.
3. The training method of claim 1, wherein the features of each first transaction sample in the first set of transaction sample features comprise one or more of the following features extracted from the first transaction sample data: time attribute feature, money amount distribution attribute feature, same-class attribute feature and same-region attribute feature.
4. The training method of claim 1, wherein the step of training an abnormal transaction detection model using an unsupervised machine learning algorithm based on the first transaction sample feature set comprises:
normalizing, column by column, the features of each first transaction sample in the first transaction sample feature set;
and training, based on the column-normalized features of the first transaction samples, an abnormal transaction detection model using an unsupervised machine learning algorithm.
5. The training method of claim 1, wherein the unsupervised machine learning algorithm comprises a k-means algorithm, a DBSCAN algorithm, or an isolation forest algorithm.
6. An anomalous transaction detection method comprising:
receiving second transaction sample data to be detected;
performing feature extraction processing on the second transaction sample data to obtain features of the second transaction sample;
inputting the features of the second transaction sample into an abnormal transaction detection model based on an unsupervised machine learning algorithm to obtain a prediction result;
and making a determination according to the prediction result and the core position and radius of each cluster of the abnormal transaction detection model, and outputting a detection result indicating that the second transaction sample data is an abnormal transaction when it is determined that the features of the second transaction sample do not belong to any cluster of the abnormal transaction detection model.
7. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of claims 1 to 6.
8. A computing device, comprising:
one or more processors;
one or more memories storing a computer program that, when executed by the one or more processors, implements the method of any one of claims 1-6.
9. An abnormal transaction detection model training device, comprising:
a receiving unit configured to receive the acquired first transaction sample data set, wherein each first transaction sample data in the first transaction sample data set is normal transaction sample data;
a feature processing unit configured to perform feature extraction processing on the first transaction sample data set to obtain a first transaction sample feature set;
and a training and recording unit configured to train, based on the first transaction sample feature set, an abnormal transaction detection model using an unsupervised machine learning algorithm, and to record the core position and radius of each cluster obtained by the unsupervised machine learning algorithm.
10. An anomalous transaction detection device comprising:
a receiving unit configured to receive second transaction sample data to be detected;
a feature processing unit configured to perform feature extraction processing on the second transaction sample data to obtain features of the second transaction sample;
an input unit configured to input the features of the second transaction sample into an abnormal transaction detection model based on an unsupervised machine learning algorithm to obtain a prediction result;
and a detection unit configured to make a determination according to the prediction result and the core position and radius of each cluster of the abnormal transaction detection model, and to output a detection result indicating that the second transaction sample data is an abnormal transaction when it is determined that the features of the second transaction sample do not belong to any cluster of the abnormal transaction detection model.
CN202111506039.7A 2019-03-28 2019-03-28 Method and device for training abnormal transaction detection model and detecting abnormal transaction Pending CN114154588A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111506039.7A CN114154588A (en) 2019-03-28 2019-03-28 Method and device for training abnormal transaction detection model and detecting abnormal transaction

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910243559.XA CN109948728A (en) 2019-03-28 2019-03-28 The method and apparatus of the training of abnormal transaction detection model and abnormal transaction detection
CN202111506039.7A CN114154588A (en) 2019-03-28 2019-03-28 Method and device for training abnormal transaction detection model and detecting abnormal transaction

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201910243559.XA Division CN109948728A (en) 2019-03-28 2019-03-28 The method and apparatus of the training of abnormal transaction detection model and abnormal transaction detection

Publications (1)

Publication Number Publication Date
CN114154588A true CN114154588A (en) 2022-03-08

Family

ID=67011030

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910243559.XA Pending CN109948728A (en) 2019-03-28 2019-03-28 The method and apparatus of the training of abnormal transaction detection model and abnormal transaction detection
CN202111506039.7A Pending CN114154588A (en) 2019-03-28 2019-03-28 Method and device for training abnormal transaction detection model and detecting abnormal transaction

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201910243559.XA Pending CN109948728A (en) 2019-03-28 2019-03-28 The method and apparatus of the training of abnormal transaction detection model and abnormal transaction detection

Country Status (1)

Country Link
CN (2) CN109948728A (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378607B (en) * 2019-07-24 2020-06-05 青岛鲁诺金融电子技术有限公司 Automobile financial service system based on algorithm
CN111798312B (en) * 2019-08-02 2024-03-01 深圳索信达数据技术有限公司 Financial transaction system anomaly identification method based on isolated forest algorithm
CN111026653B (en) * 2019-09-16 2022-04-08 腾讯科技(深圳)有限公司 Abnormal program behavior detection method and device, electronic equipment and storage medium
CN110751196B (en) * 2019-10-12 2020-09-18 东北石油大学 Oil-like drop attachment identification method in oil-water two-phase flow transparent pipe wall
CN111191720B (en) * 2019-12-30 2023-08-15 中国建设银行股份有限公司 Service scene identification method and device and electronic equipment
CN111428757B (en) * 2020-03-05 2021-09-10 支付宝(杭州)信息技术有限公司 Model training method, abnormal data detection method and device and electronic equipment
CN111833171B (en) * 2020-03-06 2021-06-25 北京芯盾时代科技有限公司 Abnormal operation detection and model training method, device and readable storage medium
CN111445254A (en) * 2020-03-10 2020-07-24 中国建设银行股份有限公司 Transaction behavior detection method, device and system
CN111353890A (en) * 2020-03-30 2020-06-30 中国工商银行股份有限公司 Application log-based application anomaly detection method and device
CN111461223B (en) * 2020-04-01 2023-04-07 支付宝(杭州)信息技术有限公司 Training method of abnormal transaction identification model and abnormal transaction identification method
CN111383030B (en) * 2020-05-28 2021-02-23 支付宝(杭州)信息技术有限公司 Transaction risk detection method, device and equipment
CN111882415A (en) * 2020-07-24 2020-11-03 未鲲(上海)科技服务有限公司 Training method and related device of quality detection model
CN112101952B (en) * 2020-09-27 2024-05-10 中国建设银行股份有限公司 Bank suspicious transaction evaluation and data processing method and device
CN113159790A (en) * 2021-05-19 2021-07-23 中国银行股份有限公司 Abnormal transaction identification method and device
CN113298184B (en) * 2021-06-21 2022-09-02 哈尔滨工程大学 Sample extraction and expansion method and storage medium for small sample image recognition
CN114471408B (en) * 2022-01-27 2023-08-08 广东天航动力科技有限公司 Automatic monitoring system for powder material production
CN114495137B (en) * 2022-04-15 2022-08-02 深圳高灯计算机科技有限公司 Bill abnormity detection model generation method and bill abnormity detection method
CN117171603B (en) * 2023-11-01 2024-02-06 海底鹰深海科技股份有限公司 Doppler velocity measurement data processing method based on machine learning

Also Published As

Publication number Publication date
CN109948728A (en) 2019-06-28

Similar Documents

Publication Publication Date Title
CN114154588A (en) Method and device for training abnormal transaction detection model and detecting abnormal transaction
US7296734B2 (en) Systems and methods for scoring bank customers direct deposit account transaction activity to match financial behavior to specific acquisition, performance and risk events defined by the bank using a decision tree and stochastic process
CN111967779B (en) Risk assessment method, device and equipment
CN111476660B (en) Intelligent wind control system and method based on data analysis
US20080033852A1 (en) Computer-based modeling of spending behaviors of entities
US11042930B1 (en) Insufficient funds predictor
US11580339B2 (en) Artificial intelligence based fraud detection system
CN112801529B (en) Financial data analysis method and device, electronic equipment and medium
US20210097543A1 (en) Determining fraud risk indicators using different fraud risk models for different data phases
US20220005041A1 (en) Enhancing explainability of risk scores by generating human-interpretable reason codes
WO2019196257A1 (en) Automatic repayment method and system, and terminal device
CN110895758A (en) Screening method, device and system for credit card account with cheating transaction
US20210334812A1 (en) System and method for managing chargeback risk
US20220172214A1 (en) Method for generating transferable tranches
US20220215465A1 (en) Predictive modeling based on pattern recognition
CN111144899B (en) Method and device for identifying false transaction and electronic equipment
US7719426B2 (en) Correctional supervision program and card
CN113034046A (en) Data risk metering method and device, electronic equipment and storage medium
CN112329862A (en) Decision tree-based anti-money laundering method and system
CN112241917A (en) Intelligent financial institution pre-loan management method and system
CN110910002A (en) Account receivable default risk identification method and system
CN111126788A (en) Risk identification method and device and electronic equipment
dos Reis Evaluating classical and artificial intelligence methods for credit risk analysis
WO2009048843A1 (en) Methods and systems of predicting mortgage payment risk
WO2022136692A1 (en) Method for calculating at least one score representative of a probable activity breakage of a merchant, system, apparatus and corresponding computer program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination