CA3156642A1

CA3156642A1 - Anti-fraud method and system based on automatic feature engineering

Info

Publication number: CA3156642A1
Application number: CA3156642A
Authority: CA
Inventors: Yang CHU; Xiaokai Dong
Original assignee: 10353744 Canada Ltd
Current assignee: 10353744 Canada Ltd
Priority date: 2021-04-30
Filing date: 2022-04-26
Publication date: 2022-10-30
Also published as: CN113139818A

Abstract

The present invention discloses an anti-fraud method and a system based on automatic feature engineering, whereby features can be augmented quickly, highly effectively and normatively through automatic feature engineering. The method comprises: obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector, and constructing a feature set F0 of original field features; performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa; calculating an information gain gf for each new feature in the feature set Fa of the current node when that feature serves as the structure tree partition attribute, selecting the feature f with the maximum information gain gf as the partition attribute, and partitioning the transaction dataset into left and right subtrees to obtain the structure tree.

Description

ANTI-FRAUD METHOD AND SYSTEM BASED ON AUTOMATIC FEATURE ENGINEERING
BACKGROUND OF THE INVENTION
Technical Field

[0001] The present invention relates to the field of artificial intelligence technology, and more particularly to an anti-fraud method and an anti-fraud system based on automatic feature engineering.
Description of Related Art

[0002] Financial fraud on the internet causes many social and financial problems. Network payment is one of the typical patterns of online finance, and fraudulent transactions in this pattern are one of the main forms of internet financial fraud. Identifying fraudulent transactions by constructing fraud detection models based on machine learning has become the mainstream approach in the field of network payment anti-fraud. In constructing a fraud detection model, feature engineering is the most essential step, because the quality of the features directly affects the performance of the model; it is usually also the most time-consuming step and the one requiring the most expertise in the relevant field. In currently available network payment fraud detection models, feature engineering is mainly carried out manually by professionals based on their business know-how. However, fraud scenarios under the network payment pattern are numerous and varied, and feature construction procedures differ from scenario to scenario, so manual feature construction can no longer meet the steadily increasing anti-fraud requirements.

Date Recue/Date Received 2022-04-26

SUMMARY OF THE INVENTION

[0003] An objective of the present invention is to provide an anti-fraud method and an anti-fraud system based on automatic feature engineering, whereby features can be augmented quickly, highly effectively and normatively through automatic feature engineering, so as to enhance training precision of the anti-fraud model, and to ensure accuracy of the recognition result of the anti-fraud model.

[0004] In order to achieve the above objective, according to the first aspect, the present invention provides an anti-fraud method based on automatic feature engineering, the method comprises:

[0005] obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features;

[0006] performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features r and the original field features in the feature set F0;

[0007] calculating an information gain gf for each new feature in the feature set Fa of the current node when that feature serves as the structure tree partition attribute, selecting the feature f with the maximum information gain gf as the partition attribute, and partitioning the transaction dataset into left and right subtrees to obtain the structure tree;

[0008] adding the feature f, if it is a new feature, to the feature set Fa of the current node of the structure tree, and simultaneously merging the feature f and a conversion function used for its construction into a feature set Fs; and

[0009] employing the feature set Fs of a leaf node of the structure tree and the feature set Fa as a training set, and training an anti-fraud model for recognizing a fraudulent transaction.

[0010] Preferably, the method further comprises:


[0011] entering the two subtrees left and right respectively, judging whether the number of samples of a transaction dataset in the current node is lower than a set minimum threshold T, and judging whether purity of the transaction dataset is higher than a set threshold G;

[0012] arriving at the leaf node if the number of samples of the transaction dataset in the current node is lower than the set minimum threshold T, and purity of the transaction dataset is higher than the set threshold G, and completing construction of the structure tree;

[0013] repeatedly constructing a feature set Fs of the next node and the corresponding feature set Fa if the number of samples of the transaction dataset in the current node is not lower than the set minimum threshold T, and/or purity of the transaction dataset is not higher than the set threshold G, until arriving at the leaf node and completing construction of the structure tree.

[0014] Preferably, the step of obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features includes:

[0015] obtaining a transaction dataset $D = \{X, Y\}$, where $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$, $x_i$ represents a feature vector of the ith transaction record, $y_i$ represents a fraud result vector of the ith transaction record, and $1 \le i \le n$; and

[0016] constructing the feature set F0 based on a feature vector set of n pieces of transaction records.

[0017] Moreover, types of the conversion function include one or more of a conversion function of a vertical mode, a conversion function of a horizontal mode, and a conversion function of a time window mode; and

[0018] there are k preset conversion functions, wherein $W = \{w_1, w_2, \ldots, w_k\}$ expresses the weight vector to which the various conversion functions correspond.

[0019] Furthermore, the step of performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features and the original field features in the feature set F0 includes:

[0020] initializing weights w of the conversion functions, so that the weight of each conversion function is $w = \frac{1}{|k|}$;

[0021] initializing a latest average information gain utility list $l_\varphi$ of each conversion function, wherein a length of the gain utility list $l_\varphi$ is set as k, and an initial value of the latest average information gain to which each conversion function corresponds is 0;

[0022] screening r original field features out of the s original field features of the feature set F0 to construct new features during the process of constructing the current node of the structure tree, employing the current conversion function m to calculate the information gain $g_{f_i}$ obtained when each constructed new feature $f_i$ serves as the structure tree partition attribute, calculating a latest average information gain $g_\varphi$ of the current conversion function m and updating the same into the gain utility list $l_\varphi$, calculating a reward value $\beta$ of the current conversion function m based on the gain utility list $l_\varphi$ and the latest average information gain $g_\varphi$, wherein an initial value of the conversion function index m is 1, letting $m = m + 1$, and repeating calculation of a reward value $\beta$ of the next conversion function; and

[0023] updating the weights of the corresponding conversion functions according to the reward values $\beta$ of the various conversion functions, and updating the weight of each conversion function after normalization thereof.

[0024] Preferably, the conversion function with the maximum weight is used in a process of constructing a feature set Fs of a next-level sub-node and a corresponding feature set Fa.

[0025] Exemplarily, a step of calculating the reward value $\beta$ includes:

[0026] $\beta = \max\left\{0, \frac{g_\varphi - l_\varphi^{med}}{l_\varphi^{max} - l_\varphi^{med}}\right\}$

[0027] wherein $l_\varphi^{max}$ is an information gain maximum value in the gain utility list $l_\varphi$, and $l_\varphi^{med}$ is an average information gain of the gain utility list $l_\varphi$.

[0028] Exemplarily, a weight calculation formula of the conversion function is $w_\varphi' = w_\varphi \cdot e^{\left(\frac{\beta}{1+\beta}\right)^{\alpha}}$, wherein $\alpha$ is an update rate of weights, $w_\varphi$ represents the weight of the conversion function before update, and $w_\varphi'$ represents the weight of the conversion function after update.
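The reward and weight update can be illustrated with a short Python sketch. It assumes the reward is read as beta = max{0, (g - l_med) / (l_max - l_med)} and the update as w' = w * exp((beta / (1 + beta)) ** alpha) followed by normalization; both readings are reconstructions of formulas that are partly illegible in this text, and the function names, the handling of the case l_max = l_med, and the sample values are illustrative assumptions rather than part of the disclosure.

```python
import math

def reward(avg_gain, utility_list):
    """Reward beta of one conversion function (assumed reading of the formula)."""
    l_max = max(utility_list)
    l_med = sum(utility_list) / len(utility_list)  # the text describes l_med as an average
    if l_max == l_med:
        # Assumption: the ratio is undefined when all entries are equal
        # (for example the all-zero initial list), so no reward is given.
        return 0.0
    return max(0.0, (avg_gain - l_med) / (l_max - l_med))

def update_weights(weights, rewards, alpha):
    """Scale each weight by exp((beta / (1 + beta)) ** alpha), then normalize to sum 1."""
    raw = [w * math.exp((b / (1.0 + b)) ** alpha) for w, b in zip(weights, rewards)]
    total = sum(raw)
    return [r / total for r in raw]

# Two conversion functions with equal weights; only the second earned a reward.
new_weights = update_weights([0.5, 0.5], [0.0, 1.0], alpha=1.0)
```

Under these assumptions a conversion function whose latest average gain exceeds the list average gains weight relative to the others, while one at or below the average keeps its pre-normalization weight.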

[0029] In comparison with prior-art technology, the anti-fraud method based on automatic feature engineering provided by the present invention achieves the following advantageous effects.

[0030] In the anti-fraud method based on automatic feature engineering provided by the present invention, a transaction dataset is first obtained, the dataset is cleaned to obtain a feature vector of each transaction record and a corresponding fraud result vector, and all transaction records are summarized to construct a feature set F0 containing all the original field features; a tree structure is thereafter used to automatically augment the features, in that linear calculation is performed on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, the feature set Fa including linearly extended new features r and the original field features in the feature set F0; an information gain gf is subsequently calculated for each new feature in the feature set Fa of the current node when that feature serves as the structure tree partition attribute, the feature f with the maximum information gain gf is selected as the partition attribute, and the transaction dataset is partitioned into left and right subtrees to obtain the structure tree; if the feature f is a new feature, it is added to the feature set Fa of the current node of the structure tree, and the feature f and the conversion function used for its construction are simultaneously merged into a feature set Fs; and the feature set Fs of a leaf node of the structure tree and the feature set Fa are finally employed as a training set to train an anti-fraud model for recognizing a fraudulent transaction.

[0031] As can be seen, the present invention makes use of the tree structure to construct features while partitioning the dataset, and makes it possible to construct features directed to various anti-fraud scenarios, particularly the network payment scenario, through customized conversion function design; when partial features are constructed at a node, the newly constructed features are retained to serve as base features from which the next node constructs new features, whereby construction of complicated features is realized. Accordingly, features can be augmented quickly, highly effectively and normatively through automatic feature engineering, so as to enhance training precision of the anti-fraud model, and to ensure accuracy of the recognition result of the anti-fraud model.

[0032] According to the second aspect, the present invention provides an anti-fraud device based on automatic feature engineering, the device is applied to the anti-fraud method based on automatic feature engineering as recited in the foregoing technical solution, and comprises:

[0033] a collecting unit, for obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features;

[0034] a linearly augmenting unit, for performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features r and the original field features in the feature set F0;

[0035] a gain calculating unit, for calculating an information gain gf for each new feature in the feature set Fa of the current node when that feature serves as the structure tree partition attribute, selecting the feature f with the maximum information gain gf as the partition attribute, and partitioning the transaction dataset into left and right subtrees to obtain the structure tree;


[0036] a non-linearly augmenting unit, for adding the feature f, if it is a new feature, to the feature set Fa of the current node of the structure tree, and simultaneously merging the feature f and a conversion function used for its construction into a feature set Fs; and

[0037] a model training unit, for employing the feature set Fs of a leaf node of the structure tree and the feature set Fa as a training set, and training an anti-fraud model for recognizing a fraudulent transaction.

[0038] In comparison with prior-art technology, the advantageous effects achieved by the anti-fraud device based on automatic feature engineering provided by the present invention are identical with the advantageous effects achievable by the anti-fraud method based on automatic feature engineering provided by the foregoing technical solution, so these are not redundantly described in this context.

[0039] According to the third aspect, the present invention provides a computer-readable storage medium storing a computer program thereon, the computer program executes the steps of the aforementioned anti-fraud method based on automatic feature engineering when it is run by a processor.

[0040] In comparison with prior-art technology, the advantageous effects achieved by the computer-readable storage medium provided by the present invention are identical with the advantageous effects achievable by the anti-fraud method based on automatic feature engineering provided by the foregoing technical solution, so these are not redundantly described in this context.
BRIEF DESCRIPTION OF THE DRAWINGS

[0041] The drawings described here are meant to provide further understanding of the present invention, and constitute a part of the present invention. The exemplary embodiments of the present invention and the descriptions thereof are meant to explain the present invention, rather than to restrict the present invention. In the drawings:

[0042] Fig. 1 is a flowchart schematically illustrating the anti-fraud method based on automatic feature engineering in an embodiment of the present invention;

[0043] Fig. 2 is a flowchart schematically illustrating training of the anti-fraud model in an embodiment of the present invention;

[0044] Fig. 3 is a view schematically illustrating the overall framework of the feature structure tree algorithm in an embodiment of the present invention;

[0045] Fig. 4 is a view schematically illustrating the action scope of the conversion function of a vertical mode in an embodiment of the present invention;

[0046] Fig. 5 is a view schematically illustrating the action scope of the conversion function of a horizontal mode in an embodiment of the present invention;

[0047] Fig. 6 is a view schematically illustrating the action scope of the conversion function of a time window mode in an embodiment of the present invention; and

[0048] Fig. 7 is an exemplary view illustrating the feature structure tree in an embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION

[0049] To make the objectives, features and advantages of the present invention more apparent and easily comprehensible, the technical solutions in the embodiments of the present invention will be described clearly and comprehensively below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the embodiments as described are merely some, rather than all, of the embodiments of the present invention. All other embodiments obtainable by persons ordinarily skilled in the art on the basis of the embodiments in the present invention without spending any creative effort shall fall within the protection scope of the present invention.

[0050] Embodiment 1

[0051] Referring to Fig. 1, this embodiment provides an anti-fraud method based on automatic feature engineering, and the method comprises:

[0052] obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features;
performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features r and the original field features in the feature set F0;
calculating an information gain gf for each new feature in the feature set Fa of the current node when that feature serves as the structure tree partition attribute, selecting the feature f with the maximum information gain gf as the partition attribute, and partitioning the transaction dataset into left and right subtrees to obtain the structure tree;
adding the feature f, if it is a new feature, to the feature set Fa of the current node of the structure tree, and simultaneously merging the feature f and a conversion function used for its construction into a feature set Fs; and
employing the feature set Fs of a leaf node of the structure tree and the feature set Fa as a training set, and training an anti-fraud model for recognizing a fraudulent transaction.

[0053] In the anti-fraud method based on automatic feature engineering provided by this embodiment, a transaction dataset is first obtained, the dataset is cleaned to obtain a feature vector of each transaction record and a corresponding fraud result vector, and all transaction records are summarized to construct a feature set F0 containing all the original field features; a tree structure is thereafter used to automatically augment the features, in that linear calculation is performed on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, the feature set Fa including linearly extended new features r and the original field features in the feature set F0; an information gain gf is subsequently calculated for each new feature in the feature set Fa of the current node when that feature serves as the structure tree partition attribute, the feature f with the maximum information gain gf is selected as the partition attribute, and the transaction dataset is partitioned into left and right subtrees to obtain the structure tree; if the feature f is a new feature, it is added to the feature set Fa of the current node of the structure tree, and the feature f and the conversion function used for its construction are simultaneously merged into a feature set Fs; and the feature set Fs of a leaf node of the structure tree and the feature set Fa are finally employed as a training set to train an anti-fraud model for recognizing a fraudulent transaction.

[0054] As can be seen, this embodiment makes use of the tree structure to construct features while partitioning the dataset, and makes it possible to construct features directed to various anti-fraud scenarios, particularly the network payment scenario, through customized conversion function design; when partial features are constructed at a node, the newly constructed features are retained to serve as base features from which the next node constructs new features, whereby construction of complicated features is realized. Accordingly, features can be augmented quickly, highly effectively and normatively through automatic feature engineering, so as to enhance training precision of the anti-fraud model, and to ensure accuracy of the recognition result of the anti-fraud model.

[0055] During specific implementation, taking the network payment scenario as an example of an anti-fraud scenario, the process of designing an anti-fraud detection model is as shown in Fig. 2. It mainly includes the steps of data obtaining, data preprocessing, feature engineering, model selecting and training, and real-time testing and maintenance. Feature engineering is the important means of realizing automatic augmentation of features, so the various sections of feature engineering are described in detail in this embodiment, in order to introduce the automatic feature engineering method proposed for network payment fraud detection.

[0056] This embodiment employs an automatic feature engineering method of a customized feature construction tree to automatically construct features. Its implementation algorithm mainly covers three sections, as shown in Fig. 3, including: a first section that is customized conversion function design directed to internet financial network payment, a second section that is a partial feature construction process at each node in the customized feature construction tree, and a third section that is a timeliness update mechanism of conversion function weight vectors in the customized feature construction tree. Specifics are described below.

[0057] First section - customized conversion function design:

[0058] Feature construction is a process of converting original field features, and involves the concept of a conversion function. A conversion function subsumes such operations as algebraic calculation and integrated computation, and can at the same time perform feature scaling or convert the relation between a feature and a category from a non-linear relation to a linear relation. It can map a feature from its original space to a completely new feature space, change the distribution of the original features, and change the value range of the original features; all such conversions are directed to the generation of new features. Conversion functions can be classified according to the number of features required as input, namely as univariate, binary, and multivariate conversion functions. In order to reduce the amount of data processing, the conversion functions preferably selected in this embodiment involve only the univariate conversion function and the binary conversion function. According to their action scope modes, the conversion functions in this embodiment are mainly classified into three types: a conversion function of a vertical mode, a conversion function of a horizontal mode, and a conversion function of a time window mode.

[0059] Specifically, the conversion function of a vertical mode is a conversion function acting on a single feature or between plural feature attributes. As shown in Fig. 4, where the conversion function acts on a single feature, a cubic value can be calculated, for example, for a feature column named transaction amount of money, so as to obtain a column of new features; square, sigmoid and tanh values can be calculated in a similar manner. Where the conversion function acts between features, a difference between two fields, for example the transaction amount of money and feature 2 shown in the two dotted-line boxes, can be calculated so as to obtain a column of new features; addition and multiplication can similarly be performed between features. In summary, for each transaction record, the calculation of the conversion function of a vertical mode takes place within a single field column or between plural field columns.
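A vertical-mode conversion can be sketched in a few lines of Python. The record layout, the field names "amount" and "feature_2", and the helper names `unary` and `binary` are hypothetical and chosen only for illustration; the point is that every new value is computed from fields of the same transaction record.

```python
import math

# Hypothetical transaction records; field names are illustrative only.
records = [
    {"amount": 2.0, "feature_2": 0.5},
    {"amount": 3.0, "feature_2": 1.0},
]

def unary(rows, field, fn, name):
    """Univariate conversion: apply fn to one column, record by record."""
    for row in rows:
        row[name] = fn(row[field])

def binary(rows, left, right, fn, name):
    """Binary conversion: combine two columns of the same record."""
    for row in rows:
        row[name] = fn(row[left], row[right])

unary(records, "amount", lambda v: v ** 3, "amount_cubed")
unary(records, "amount", math.tanh, "amount_tanh")
binary(records, "amount", "feature_2", lambda a, b: a - b, "amount_minus_feature_2")
```

Each call adds one new column, so the outputs of one conversion can be fed into another, which matches the idea of building features on top of previously constructed ones.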

[0060] The conversion function of a horizontal mode is a conversion function acting between plural different samples under a feature field, as shown in Fig. 5. For instance, a difference between two adjacent transactions can be calculated with respect to a column of features named transaction amount of money according to user groupings to obtain a column of new features of user transaction amount of money differences, and it is also possible to calculate accumulative sums with respect to a column of features named transaction amount of money according to user groupings to obtain a column of new features of user accumulated transaction amount of money. Similarly, it is further possible to calculate frequency, group accumulative summation or accumulative counting of a certain feature.
In summary, the calculation link of the conversion function of a horizontal mode occurs between plural rows under a field.
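As an illustration of the horizontal mode, the following Python sketch computes, per user, the difference between adjacent transactions and the accumulated amount. It assumes the rows are already in chronological order; the field names "user" and "amount" and the function name are hypothetical.

```python
from collections import defaultdict

def grouped_diff_and_cumsum(rows, key="user", field="amount"):
    """Per-user difference to the previous transaction and running sum of `field`."""
    last = {}                      # previous value seen for each user
    running = defaultdict(float)   # accumulated sum for each user
    out = []
    for row in rows:               # rows are assumed to be sorted by time
        user, value = row[key], row[field]
        diff = value - last[user] if user in last else None
        running[user] += value
        last[user] = value
        out.append({**row, f"{field}_diff": diff, f"{field}_cumsum": running[user]})
    return out

rows = [
    {"user": "u1", "amount": 10.0},
    {"user": "u2", "amount": 5.0},
    {"user": "u1", "amount": 25.0},
    {"user": "u1", "amount": 20.0},
]
result = grouped_diff_and_cumsum(rows)
```

The first transaction of each user has no predecessor, so its difference is left as `None` here; how such gaps are filled is a design choice the text does not specify.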


[0061] The conversion function of a time window mode employs the concept of the sliding time window, and this is important to analyze transaction behavior features within a period of time; it is a conversion function acting on the same and single feature field to manipulate a plurality of samples within a time window. As shown in Fig. 6, for instance, it is possible to calculate an accumulative sum with respect to a column of features named transaction amount of money according to user groupings to obtain a column of new features of accumulated transaction amount of money of users within a period of time.
Similarly, the conversion function of a time window mode can also calculate extremums, median values, variances, counts, distinct counts, and modes of transaction amounts of money within the time window.
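A time-window conversion can be sketched as a per-user sliding sum. The sketch below assumes rows sorted by time, a positive window length, and a window covering the interval (t - window, t]; the field names and the function name are hypothetical, and the window boundaries are an assumption, since the text does not fix them.

```python
from collections import defaultdict, deque

def windowed_sum(rows, window, key="user", time="t", field="amount"):
    """For each row, sum `field` over the same user's rows with time in (t - window, t]."""
    recent = defaultdict(deque)    # per-user (time, value) pairs still inside the window
    totals = defaultdict(float)    # per-user sum of the values currently in the window
    out = []
    for row in rows:               # rows are assumed to be sorted by time
        user, t, value = row[key], row[time], row[field]
        q = recent[user]
        q.append((t, value))
        totals[user] += value
        # Drop entries that have fallen out of the window; the current entry
        # is never dropped because window is assumed to be positive.
        while q[0][0] <= t - window:
            _, old = q.popleft()
            totals[user] -= old
        out.append(totals[user])
    return out

rows = [
    {"user": "u1", "t": 0, "amount": 10.0},
    {"user": "u1", "t": 30, "amount": 5.0},
    {"user": "u2", "t": 40, "amount": 7.0},
    {"user": "u1", "t": 70, "amount": 1.0},
]
sums = windowed_sum(rows, window=60)
```

Other window statistics named in the text, such as counts or extremums, would follow the same pattern with a different aggregate over the entries kept in the deque.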

[0062] Second section - partial feature construction process at node:

[0063] The feature construction tree customized in this embodiment, as shown in Fig. 7, not only constructs new features at each node on the basis of the original feature set of transaction records, but also supports combinations of conversion functions, namely continuing to construct features on the basis of the newly constructed features. Here, the feature construction tree retains the features constructed at the parent node for partitioning the dataset; these features form a new and enlarged feature space together with the original features, and on the enlarged feature space feature construction is again carried out and features for partitioning the dataset are selected. Such a partial feature construction process thus adds the capability of combining conversion functions, whereby the searching range of the feature space is enlarged.

[0064] The method of obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features in the above embodiment includes:

[0065] obtaining a transaction dataset $D = \{X, Y\}$, where $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$, $x_i$ represents a feature vector of the ith transaction record, $y_i$ represents a fraud result vector of the ith transaction record, and $1 \le i \le n$; and constructing the feature set F0 based on a feature vector set of n pieces of transaction records.

[0066] During specific implementation, in the anti-fraud of internet financial network payment, suppose that D is the entire network payment transaction dataset, $D = \{X, Y\}$, where $X = \{x_1, x_2, \ldots, x_n\}$, in which $x_i$ corresponds to the various fields of the ith transaction record, namely a feature vector, and X represents the feature vector set of all transaction records; $Y = \{y_1, y_2, \ldots, y_n\}$, in which $y_i$ corresponds to the result as to whether the ith transaction record is fraudulent, taking values $y_i \in \{0, 1\}$, where 0 indicates normality and 1 indicates abnormality, and Y represents the set of all transaction record labels. The two together make up the entire dataset D, and the total number of transaction record samples in the dataset is n.
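To make the notation concrete, the following Python sketch splits a small set of cleaned records into the feature vectors X, the labels Y, and the original field feature set F0. The record contents and the field names, including the label field "is_fraud", are invented for illustration.

```python
# Hypothetical cleaned transaction records; all field names are illustrative.
raw = [
    {"amount": 120.0, "hour": 14, "is_fraud": 0},
    {"amount": 9800.0, "hour": 3, "is_fraud": 1},
]
label_field = "is_fraud"

# F0: names of the original field features (every field except the label).
F0 = [name for name in raw[0] if name != label_field]

# X: one feature vector x_i per transaction record; Y: one label y_i per record,
# with 0 indicating a normal record and 1 indicating a fraudulent one.
X = [[rec[name] for name in F0] for rec in raw]
Y = [rec[label_field] for rec in raw]
```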

[0067] Let F0 represent a feature set of original fields in the dataset; let Fa represent a total set of features containing new features on the current node, which includes not only the feature set of the original fields but also the new features r that can be newly constructed through the conversion function for partitioning the dataset; and let Fs represent a set of the new features selected at the nodes of the structure tree to partition the dataset, together with their construction process. Table 1 shows examples of the various feature sets.

[0068]
Feature Set | Example
F0 | original features such as transaction amount of money, transaction time, etc.
Fa | original features such as transaction amount of money, transaction time, etc.; newly constructed features such as new feature 1, new feature 2, etc.
Fs | new feature 1: mean_window('transaction amount of money'); new feature 2: var_window('transaction amount of money'); ... such new features and their construction process.

[0069] Table 1

[0070] In the above embodiment, types of the conversion function include one or more of a conversion function of a vertical mode, a conversion function of a horizontal mode, and a conversion function of a time window mode, wherein there are k preset conversion functions, and $W = \{w_1, w_2, \ldots, w_k\}$ expresses the weight vector to which the various conversion functions correspond.

[0071] During specific implementation, let $\Phi$ represent the set of conversion functions used, let $W = \{w_1, w_2, \ldots, w_k\}$ represent the weight vector of the conversion function set, and let $w_i$ represent the weight of the ith conversion function; the higher the weight, the more probable it is that this conversion function will be selected. The total number of conversion functions in the above set is k, and $g_f$ represents the information gain obtained by selecting feature f at the node to serve as the structure tree partition attribute. As can be understood, the information gain is equivalent to the information gain calculated in an ID3 decision tree, and can also be replaced with the GINI index of a CART decision tree, while the following algorithm design and experiment sections of this embodiment all proceed on the basis of the information gain. $g_\varphi$ represents the average value of the information gains obtained by using all features generated by the conversion function at the node to respectively serve as the partition attribute; $l_\varphi = \{g_\varphi^{\{t-m+1\}}, g_\varphi^{\{t-m+2\}}, \ldots, g_\varphi^{\{t\}}\}$ represents the list of average information gain utilities of the conversion function over the latest m times it was selected, where m is the length of the list $l_\varphi$, and $g_\varphi^{\{t\}}$ represents the average information gain utility value obtained by using all new features generated by the conversion function selected at time t to serve as partition attributes.
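The information gain used to choose the partition attribute can be sketched as follows. The sketch uses the entropy-based gain of an ID3 decision tree, as the text indicates, and assumes that each node performs a binary split of a numeric feature at a threshold; the threshold search, the function names, and the sample data are illustrative assumptions.

```python
import math

def entropy(labels):
    """Binary entropy of 0/1 fraud labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n            # share of fraudulent (label 1) records
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def information_gain(values, labels, threshold):
    """Gain of splitting into value <= threshold (left) and value > threshold (right)."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    children = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - children

def best_feature(columns, labels):
    """Return (gain, feature name, threshold) of the best binary split."""
    best = (-1.0, None, None)
    for name, values in columns.items():
        # The largest value is skipped because it would leave the right side empty.
        for threshold in sorted(set(values))[:-1]:
            gain = information_gain(values, labels, threshold)
            if gain > best[0]:
                best = (gain, name, threshold)
    return best

columns = {"amount": [1, 2, 3, 4], "hour": [5, 1, 5, 1]}
labels = [0, 0, 1, 1]
gain, name, threshold = best_feature(columns, labels)
```

In this toy data the "amount" column separates the two classes perfectly at threshold 2, so it is selected over "hour", whose only split yields no gain.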

[0072] The method of performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features and the original field features in the feature set F0 includes:

[0073] initializing weights w of the conversion functions, so that the weight of each conversion function is $w_i = 1/k$; initializing a latest average information gain utility list $l_\varphi$ of each conversion function, wherein a length of the gain utility list $l_\varphi$ is set as k, and an initial value of the latest average information gain to which each conversion function corresponds is 0; screening r original field features out of the s original field features of the feature set F0 to construct new features during the process of constructing the current node of the structure tree; employing the current conversion function m to calculate the information gain $g_{f_i}$ obtained when each new feature $f_i$ serves as the structure tree partition attribute; calculating a latest average information gain $g_\varphi$ of the current conversion function m and updating the same into the gain utility list $l_\varphi$; calculating a reward value $\beta$ of the current conversion function m based on the gain utility list $l_\varphi$ and the latest average information gain $g_\varphi$, wherein the conversion function index m is initially 1; letting m = m + 1 and repeating the calculation of the reward value $\beta$ for the next conversion function; and updating the weights of the corresponding conversion functions according to the reward values $\beta$ of the various conversion functions, and normalizing the weight of each conversion function after the update.
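
The initialization described above can be sketched as follows. This is a minimal illustration only, assuming Python; the helper name `init_state` is hypothetical, a bounded `collections.deque` stands in for each gain utility list, and the list length is left as a parameter.

```python
from collections import deque


def init_state(k, list_length):
    """Give k conversion functions equal weights and zero-filled gain utility lists."""
    weights = [1.0 / k] * k
    # One independent bounded list per conversion function; appending to a full
    # deque drops the oldest value at the head.
    histories = [deque([0.0] * list_length, maxlen=list_length) for _ in range(k)]
    return weights, histories


weights, histories = init_state(k=4, list_length=4)
```

Because every weight starts at the same value, each conversion function is equally likely to be selected at the first node.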

[0074] The above embodiment further includes: entering the left and right subtrees respectively, judging whether the number of samples of the transaction dataset in the current node is lower than a set minimum threshold T, and judging whether the purity of the transaction dataset is higher than a set threshold G; arriving at a leaf node, and completing construction of the structure tree, if the number of samples of the transaction dataset in the current node is lower than the set minimum threshold T and the purity of the transaction dataset is higher than the set threshold G; and repeatedly constructing the feature set Fs of the next node and the corresponding feature set Fa if the number of samples of the transaction dataset in the current node is not lower than the set minimum threshold T, and/or the purity of the transaction dataset is not higher than the set threshold G, until arriving at the leaf node and completing construction of the structure tree. In other words, plural transaction dataset samples should be obtained to construct the structure tree.
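
The leaf test described in this paragraph can be sketched as below. It is a sketch only: the text does not define purity, so the share of the majority class is assumed, labels are assumed to be binary (1 for a fraudulent record), and the names `purity` and `is_leaf` are hypothetical.

```python
def purity(labels):
    """Assumed purity measure: fraction of samples in the majority class."""
    if not labels:
        return 1.0
    fraud = sum(labels)
    return max(fraud, len(labels) - fraud) / len(labels)


def is_leaf(labels, min_samples, min_purity):
    """Leaf when the node holds fewer than T samples and its purity exceeds G."""
    return len(labels) < min_samples and purity(labels) > min_purity


small_pure = is_leaf([1, 1, 1], min_samples=5, min_purity=0.9)
small_mixed = is_leaf([0, 1, 1], min_samples=5, min_purity=0.9)
```

Here `min_samples` plays the role of the threshold T and `min_purity` the role of the threshold G.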

[0075] The structure of the structure tree is as shown in Fig. 7, and steps for the entire structure tree to perform feature construction are specifically described below.

[0076] Step 1: the weight vector W of the conversion functions in the conversion function set $\Phi$ is initialized, wherein each $w_i = 1/k$; a latest average information gain utility list $l_\varphi$ of each conversion function is initialized, where the length of the list is set as k; and the feature sets are initialized as Fa = F0 and Fs = ∅;

[0077] Step 2: on a given node of the structure tree, a conversion function is selected at random with probabilities given by the weight vector W of the conversion functions (a conversion function with a higher weight is more likely to be selected). If the function is a univariate conversion function, r different features are selected from the s features of the feature set F0 on the dataset to which the node corresponds, where r < s and s = |F0|, i.e., the size of the set F0; the conversion function is applied to the r features, and r new features are constructed. If the function is a binary conversion function, r different feature pairs are selected from all s features of the dataset to which the node corresponds, where $r < C_s^2$; the conversion function is applied to these r feature pairs, and r new features are constructed;

[0078] Step 3: for the r newly constructed features and the original feature set Fa in the node, the information gain $g_f$ obtained when each feature f serves as the partition attribute is calculated. The feature f with the maximum information gain is selected as the partition attribute, and the dataset is partitioned into left and right parts according to the specific partition value of the feature f, forming the left and right subtrees: samples whose value of feature f is lower than the partition value are assigned to the left subtree, and the remaining samples are assigned to the right subtree, corresponding to the left child node and the right child node respectively. If the feature f is a newly constructed feature, it is added to the feature set Fa, i.e., Fa = Fa ∪ {f}, and the feature f together with its construction process is merged into the set Fs; as should be noted, the equal sign in the formula denotes assignment;

[0079] Step 4: weight values of the conversion functions are updated according to the timeliness update mechanism of weight vectors of the conversion functions specified below;

[0080] Step 5: the left and right child nodes are entered respectively, and it is judged whether the number of samples of the sub-dataset in the node is lower than a set minimum threshold T, and/or whether the purity of the samples of the sub-dataset is higher than a set threshold G; if yes, a leaf node has been reached and the process ends; if not, steps 2-4 are repeated until a leaf node is reached. When the structure tree has been constructed to completion, step 6 is entered;

[0081] Step 6: after the entire structure tree has been constructed to completion, features in the feature set Fs as finally obtained are precisely the new features constructed by the feature construction tree and their construction process.
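
Steps 2 and 3 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: labels are assumed to be binary (1 for fraud), candidate partition values are taken from the observed feature values, and the function names `pick_conversion`, `pick_inputs`, `entropy`, `info_gain` and `best_split`, as well as the sample feature names, are hypothetical.

```python
import math
import random
from itertools import combinations


def pick_conversion(weights, rng):
    """Step 2: sample a conversion-function index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def pick_inputs(features, r, arity, rng):
    """Step 2: choose r distinct features (arity 1) or r distinct feature pairs (arity 2)."""
    pool = list(combinations(features, arity))
    return rng.sample(pool, min(r, len(pool)))


def entropy(labels):
    """Shannon entropy of binary 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)


def info_gain(values, labels, threshold):
    """Step 3: gain from sending values below the threshold left and the rest right."""
    left = [y for v, y in zip(values, labels) if v < threshold]
    right = [y for v, y in zip(values, labels) if v >= threshold]
    n = len(labels)
    children = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - children


def best_split(columns, labels):
    """Step 3: return (gain, feature name, partition value) with the maximum gain."""
    best = (-1.0, None, None)
    for name, values in columns.items():
        # Skip the smallest value: it would leave the left subtree empty.
        for threshold in sorted(set(values))[1:]:
            gain = info_gain(values, labels, threshold)
            if gain > best[0]:
                best = (gain, name, threshold)
    return best


rng = random.Random(0)
chosen = pick_conversion([0.2, 0.5, 0.3], rng)
pairs = pick_inputs(["amount", "hour", "device_age"], 2, 2, rng)

labels = [0, 0, 1, 1]
columns = {"amount": [10, 20, 30, 40], "hour": [1, 9, 2, 8]}
gain, name, threshold = best_split(columns, labels)
```

In the toy data, splitting `amount` at 30 separates the two classes completely, so that feature and partition value are selected.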

[0082] Preferably, in the above embodiment, the conversion function used in the process of constructing the feature set Fs of the next node is the conversion function with the maximum weight after update.

[0083] In the above embodiment, the timeliness update mechanism of weight vectors of the conversion functions is as follows.

[0084] During specific implementation, this embodiment uses the average information gain to appraise the quality of the features constructed by each conversion function. Specifically, r new features are first constructed through a conversion function at the node; the average information gain is the average of the information gains obtained when each of these new features serves as the partition attribute of the dataset. If the average information gain of a conversion function is relatively high, the features it constructs perform relatively well, so the weight of this conversion function should be increased to make it more likely to be selected at subsequent nodes; conversely, the selection probability of conversion functions with a lower average information gain should be decreased. However, if the appraisal obtained each time a certain conversion function is selected is invariably high or invariably low, the weights of some conversion functions become extremely high while the weights of others become extremely low. Selection at subsequent nodes then leans heavily on one or a few conversion functions, and the constructed features lack diversity. This is a trade-off between exploration and exploitation: existing conversion functions with higher weights should be exploited, while other conversion functions should still be considered. Accordingly, a latest average information gain utility list is maintained for each conversion function, and the corresponding weight vector is updated according to the recent performance of the conversion function. This strengthens timeliness and keeps the weight vector from converging on one or a few values, so that the constructed features are more generalized.

[0085] The steps are as follows.

[0086] Step 1: according to the feature constructing steps in the above embodiment, if the conversion function selected by the current node is m, all r new features constructed by it are each taken as the partition attribute of the dataset, and the average information gain $g_\varphi$ is calculated according to the following formula:

[0087] $g_\varphi = \frac{1}{r}\sum_{i=1}^{r} g_{f_i}$ (1)

[0088] where $g_{f_i}$ represents the information gain obtained by using the ith new feature $f_i$ constructed by conversion function m as a partition attribute. $g_\varphi$ is used to update the latest average information gain utility list $l_\varphi$ of the current conversion function m: the average information gain $g_\varphi$ is appended to the tail of the list $l_\varphi$, and the first value at the head of the list $l_\varphi$ is deleted.

[0089] Step 2: a reward value $\beta$ of the current conversion function is calculated according to the latest average information gain utility list $l_\varphi$ and the average information gain $g_\varphi$ of the current conversion function m:

[0090] $\beta = \max\left\{0, \frac{g_\varphi - l_\varphi^{med}}{l_\varphi^{max} - l_\varphi^{med}}\right\}$ (2)

[0091] where $l_\varphi^{max}$ is the maximum information gain in the gain utility list $l_\varphi$, and $l_\varphi^{med}$ is the average information gain of the gain utility list $l_\varphi$.
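
Steps 1 and 2 can be sketched as follows. This is an illustration under assumptions rather than the embodiment's code: the reward is read as β = max{0, (g_φ − l_φ^med) / (l_φ^max − l_φ^med)}, the case where the denominator is not positive (which the text does not address) is assumed to yield a reward of 0, and `average_gain` and `reward` are hypothetical names.

```python
from collections import deque


def average_gain(gains):
    """Formula (1): mean information gain of the r new features."""
    return sum(gains) / len(gains)


def reward(history, g_phi):
    """Reward as read from formula (2), clipped below at 0."""
    l_max = max(history)
    l_med = sum(history) / len(history)  # the text defines this as the list average
    spread = l_max - l_med
    if spread <= 0:  # assumption: no reward when the list has no spread
        return 0.0
    return max(0.0, (g_phi - l_med) / spread)


history = deque([0.1, 0.2, 0.3], maxlen=3)
g_phi = average_gain([0.5, 0.7])  # gains of r = 2 new features
history.append(g_phi)             # tail gains g_phi, the head value 0.1 is dropped
beta = reward(history, g_phi)
```

Because the newest average gain is also the largest value in the list here, the reward reaches its upper bound; a gain at or below the list average would give a reward of 0.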

[0092] Step 3: the weight of the conversion function is updated according to the reward value $\beta$ of the current conversion function m in accordance with formula (3), and the weight vector of the conversion functions is then normalized in accordance with formula (4):

[0093] $w_\varphi' = w_\varphi \cdot e^{[\ldots]}$ (3)

[0094] $w_i = \frac{w_i}{\sum_{j=1}^{k} w_j}$ (4)

[0095] where $w_\varphi$ represents the weight of the conversion function m before the update and $w_\varphi'$ represents its weight after the update; $w_\varphi'$ in formula (3) increases monotonically with the reward $\beta$, in other words, the higher the reward value, the larger the increase in the weight of the conversion function; and $\alpha$ controls the rate at which the weight is updated;

[0096] $w_i$ in formula (4) represents the weight of the ith conversion function, and $\sum_{j=1}^{k} w_j$ represents the sum of the weights of all conversion functions.
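
The update and normalization of Step 3 can be sketched as below. The exact exponent of formula (3) cannot be read from this text, so the sketch substitutes exp(α·β) purely as an assumption; it shares the stated properties that the weight grows monotonically with the reward β and that α sets the update rate. The name `update_weights` and the default value of α are hypothetical.

```python
import math


def update_weights(weights, index, beta, alpha=0.5):
    """Boost one weight by its reward, then renormalize so the weights sum to 1."""
    updated = list(weights)
    # Assumed stand-in for formula (3): multiply by exp(alpha * beta).
    updated[index] *= math.exp(alpha * beta)
    total = sum(updated)  # formula (4): divide every weight by the sum
    return [w / total for w in updated]


weights = update_weights([0.25, 0.25, 0.25, 0.25], index=2, beta=1.0)
```

With a reward of 0 the multiplier is 1, so normalization returns the weights unchanged; a positive reward shifts selection probability toward the rewarded conversion function and away from the others.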

[0097] Step 4: at the next node, the current conversion function m is selected at random with probabilities given by the updated weight vector of the conversion functions, and steps 1-3 are repeated until a leaf node is reached.

[0098] In summary, to address network payment anti-fraud problems, this embodiment proposes an automatic feature engineering method based on a customized feature construction tree. The method constructs features while partitioning the dataset with the tree structure; its customized conversion function design makes it possible to construct features targeted at network payment; new features constructed at a node are retained and serve as base features from which the next node constructs further features, so that complicated features can be built up; and its timeliness update mechanism for the weight vectors of the conversion functions prevents the weights from falling into local extrema and keeps the constructed features general.
Inventive points of the method include, but are not limited to, the following:

[0099] 1. The design of conversion functions in this method is extensible without affecting the overall structure of the algorithm, and more types of conversion functions can be subsequently added;

[0100] 2. An attempt is made to verify the effectiveness and generality of the proposed method on datasets with a longer timespan and in a wider range of network payment scenarios;

[0101] 3. Extending the automatic feature engineering method from the online network payment pattern to other patterns in the field of internet finance is contemplated.

[0102] Embodiment 2

[0103] This embodiment provides an anti-fraud device based on automatic feature engineering, the device comprises:

[0104] a collecting unit, for obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features;

[0105] a linearly augmenting unit, for performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features r and the original field features in the feature set F0;

[0106] a gain calculating unit, for calculating an information gain gf, of each new feature, each serving as a structure tree partition attribute in the feature set Fa of the current node, selecting a feature f to which the maximum information gain gf corresponds to serve as a partition attribute, and partitioning the transaction dataset into two subtrees left and right to obtain the structure tree;

[0107] a non-linearly augmenting unit, for adding the feature f, if it is a new feature, to the feature set Fa of the current node of the structure tree, and simultaneously merging the feature f and a conversion function used for its construction into a feature set Fs; and

[0108] a model training unit, for employing the feature set Fs of a leaf node of the structure tree and the feature set Fa as a training set, and training an anti-fraud model for recognizing a fraudulent transaction.

[0109] In comparison with prior-art technology, the advantageous effects achieved by the anti-fraud device based on automatic feature engineering provided by this embodiment of the present invention are identical with the advantageous effects achievable by the anti-fraud method based on automatic feature engineering provided by Embodiment 1, so these are not redundantly described in this context.

[0110] Embodiment 3

[0111] This embodiment provides a computer-readable storage medium storing a computer program thereon, the computer program executes the steps of the aforementioned anti-fraud method based on automatic feature engineering when it is run by a processor.

[0112] In comparison with prior-art technology, the advantageous effects achieved by the computer-readable storage medium provided by this embodiment are identical with the advantageous effects achievable by the anti-fraud method based on automatic feature engineering provided by the foregoing technical solution, so these are not redundantly described in this context.

[0113] As comprehensible to persons ordinarily skilled in the art, the entire or partial steps that realize the above method can be completed via a program that instructs relevant hardware, the program can be stored in a computer-readable storage medium, and subsumes the various steps of the method according to the foregoing embodiment when it is executed, and the storage medium can be ROM/RAM, a magnetic disk, an optical disk, a memory card, etc.

[0114] What the above describes is merely directed to specific modes of execution of the present invention, but the protection scope of the present invention is not restricted thereby. Any change or replacement easily conceivable to persons ordinarily skilled in the art within the technical range disclosed in the present invention shall be covered by the protection scope of the present invention. Accordingly, the protection scope of the present invention shall be based on the protection scope as claimed in the Claims.

Claims

What is claimed is:

1. An anti-fraud method based on automatic feature engineering, characterized in comprising:
obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features;
performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features r and the original field features in the feature set F0;
calculating an information gain gf of each new feature, each serving as a structure tree partition attribute in the feature set Fa of the current node, selecting a feature f to which the maximum information gain gf corresponds to serve as a partition attribute, and partitioning the transaction dataset into two subtrees left and right to obtain the structure tree;
adding the feature f, if it is a new feature, to the feature set Fa of the current node of the structure tree, and simultaneously merging the feature f and a conversion function used for its construction into a feature set Fs; and
employing the feature set Fs of a leaf node of the structure tree and the feature set Fa as a training set, and training an anti-fraud model for recognizing a fraudulent transaction.

2. The method according to Claim 1, characterized in further comprising:
entering the two subtrees left and right respectively, judging whether the number of samples of a transaction dataset in the current node is lower than a set minimum threshold T, and judging whether purity of the transaction dataset is higher than a set threshold G;
arriving at the leaf node if the number of samples of the transaction dataset in the current node is lower than the set minimum threshold T, and purity of the transaction dataset is higher than the set threshold G, and completing construction of the structure tree;
repeatedly constructing a feature set Fs of the next node and the corresponding feature set Fa if the number of samples of the transaction dataset in the current node is not lower than the set minimum threshold T, and/or purity of the transaction dataset is not higher than the set threshold G, until arriving at the leaf node and completing construction of the structure tree.

3. The method according to Claim 2, characterized in that the step of obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features includes:
obtaining a transaction dataset $D = \{X, Y\}$, where $X = \{x_1, x_2, \ldots, x_n\}$, $Y = \{y_1, y_2, \ldots, y_n\}$, $x_i$ represents a feature vector of the ith transaction record, $y_i$ represents a fraud result vector of the ith transaction record, and $1 \le i \le n$; and constructing the feature set F0 based on a feature vector set of n pieces of transaction records.

4. The method according to Claim 2 or 3, characterized in that types of the conversion function include one or more of a conversion function of a vertical mode, a conversion function of a horizontal mode, and a conversion function of a time window mode; and that there are k number of preset conversion functions, wherein $W = \{w_1, w_2, \ldots, w_k\}$ expresses weight vectors to which the various conversion functions correspond.

5. The method according to Claim 4, characterized in that the step of performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features and the original field features in the feature set F0 includes:
initializing weights w of the conversion functions, so that the weight of each conversion function is $w_i = 1/k$;
initializing a latest average information gain utility list $l_\varphi$ of each conversion function, wherein a length of the gain utility list $l_\varphi$ is set as k, and an initial value of the latest average information gain to which each conversion function corresponds is 0;
screening r number of original field features out of s number of original field features of the feature set F0 to construct new features during the process of constructing the current node of the structure tree, employing the current conversion function m to calculate the information gain $g_{f_i}$ obtained when each new feature $f_i$ serves as the structure tree partition attribute, calculating a latest average information gain $g_\varphi$ of the current conversion function m and updating the same into the gain utility list $l_\varphi$, calculating a reward value $\beta$ of the current conversion function m based on the gain utility list $l_\varphi$ and the latest average information gain $g_\varphi$, wherein the conversion function index m is initially 1, letting m = m + 1, and repeating calculation of a reward value $\beta$ of the next conversion function; and
updating the weights of the corresponding conversion functions according to the reward values $\beta$ of the various conversion functions, and updating the weight of each conversion function after normalization thereof.

6. The method according to Claim 5, characterized in that the conversion function with the maximum weight is used in a process of constructing a feature set Fs of a next-level sub-node and a corresponding feature set Fa.

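7. The method according to Claim 5, characterized in that a step of calculating the reward value $\beta$ includes:
$\beta = \max\left\{0, \frac{g_\varphi - l_\varphi^{med}}{l_\varphi^{max} - l_\varphi^{med}}\right\}$,
wherein $l_\varphi^{max}$ is an information gain maximum value in the gain utility list $l_\varphi$, and $l_\varphi^{med}$ is an average information gain of the gain utility list $l_\varphi$.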

8. The method according to Claim 5, characterized in that a weight calculation formula of the conversion function is $w_\varphi' = w_\varphi \cdot e^{[\ldots]}$, wherein $\alpha$ is an update rate of weights, $w_\varphi$ represents the weight of the conversion function before update, and $w_\varphi'$ represents the weight of the conversion function after update.

9. An anti-fraud device based on automatic feature engineering, characterized in comprising:

a collecting unit, for obtaining a transaction dataset to acquire a transaction record feature vector and a fraud result vector after processing, and constructing a feature set F0 of original field features;
a linearly augmenting unit, for performing linear calculation on the original field features in the feature set F0 based on a preset conversion function to obtain a feature set Fa in a process of constructing a current node of a structure tree, wherein the feature set Fa includes linearly extended new features r and the original field features in the feature set F0;
a gain calculating unit, for calculating an information gain gf of each new feature, each serving as a structure tree partition attribute in the feature set Fa of the current node, selecting a feature f to which the maximum information gain gf corresponds to serve as a partition attribute, and partitioning the transaction dataset into two subtrees left and right to obtain the structure tree;
a non-linearly augmenting unit, for adding the feature f, if it is a new feature, to the feature set Fa of the current node of the structure tree, and simultaneously merging the feature f and a conversion function used for its construction into a feature set Fs; and
a model training unit, for employing the feature set Fs of a leaf node of the structure tree and the feature set Fa as a training set, and training an anti-fraud model for recognizing a fraudulent transaction.

10. A computer-readable storage medium, storing thereon a computer program, characterized in that the computer program executes steps of the method as recited in any one of Claims 1 to 7 when it is run by a processor.