WO2023124204A1 - Anti-fraud risk assessment method and apparatus, training method and apparatus, and readable storage medium - Google Patents

Info

Publication number
WO2023124204A1
Authority
WO
WIPO (PCT)
Prior art keywords
fraud
risk assessment
risk
vector
app
Prior art date
Application number
PCT/CN2022/117419
Other languages
French (fr)
Chinese (zh)
Inventor
骆浩楠
龚妙岚
李嘉
周凯
章文康
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国银联股份有限公司
Publication of WO2023124204A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Definitions

  • the invention belongs to the field of anti-fraud, and in particular relates to an anti-fraud risk assessment method, a training method, a device and a readable storage medium.
  • the present invention provides the following solutions.
  • in a first aspect, a training method for an anti-fraud risk assessment model is provided, including: obtaining a training sample set, where the training samples include multi-dimensional features and their fraud labels, and the multi-dimensional features include: user static features, user behavior features, and device risk APP features; inputting the training sample set into the anti-fraud risk assessment model to be trained for iterative training; wherein, in each round of iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network constructed based on the self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
  • a Transformer encoder is used as a feature learning network, and the Transformer encoder includes a self-attention layer, a residual and normalization layer, a feedforward network layer, and a summation and normalization layer.
  • the method further includes: obtaining usage timing information of the device risk APPs, and obtaining, based on the usage timing, the usage correlation between each risk APP used by the user device and the current fund APP; using the position encoding mechanism of the Transformer encoder to perform time-series encoding on the usage timing information to obtain a timing vector; combining the timing vector with the usage correlation corresponding to each risk APP to obtain a timing intensity vector; and combining the timing intensity vector with the input vector corresponding to the device risk APP features and inputting the result into the self-attention layer.
  • using the position encoding mechanism of the Transformer encoder to perform time-series encoding on the usage timing information further includes defining the time-series encoding rule by a formula [formula not reproduced in this text], where TE(t,2i) is the 2i-th dimension of the time-series encoding vector at time t, TE(t,2i+1) is the (2i+1)-th dimension of the time-series encoding vector at time t, and d_model is the dimension of the time-series encoding vector.
  • the method further includes: obtaining the global risk APP, and using the attribute information of each risk APP to find other associated and/or similar APPs with which to expand the global risk APP; the attribute information includes one or more of the following: developer information, name information, APP introduction information.
  • obtaining the training sample set also includes: collecting user transaction behavior information through event-tracking instrumentation ("buried points"), where the user transaction behavior information includes: transaction location IP and transaction counterparty information; and periodically collecting APP usage information of the user device, determining the risk APPs used by the user device according to the global risk APP, and obtaining the device risk APP features.
  • the multi-dimensional feature further includes: a text feature, where the text feature includes transaction message information.
  • the deep network employs random forest or XGB in machine learning.
  • a transaction amount weight factor is set in the loss function.
  • in a second aspect, an anti-fraud risk assessment method is provided, including: obtaining real-time transaction information, which includes: user static features, user behavior features, and device risk APP features; inputting the real-time transaction information into the anti-fraud risk assessment model, where the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into the feature learning network built based on the attention mechanism to obtain an encoding vector, and inputs the encoding vector into the deep network to obtain a risk prediction result; wherein the anti-fraud risk assessment model is trained using the method of the first aspect.
  • it further includes: if the risk prediction result meets the preset condition, performing corresponding interference processing and/or alarm processing based on real-time transaction information.
  • the method further includes: updating the training sample set based on the risk prediction results and the real-time transaction information; constructing a user transaction relationship graph based on the updated training sample set, where the graph uses users as nodes and transaction relationships between users as edges; mining gang nodes and/or gang transactions from the user transaction relationship graph through a clustering algorithm and/or a graph attention algorithm; identifying hidden fraud samples in the training sample set based on the gang nodes and/or gang transactions; and updating the training of the risk assessment model based on the hidden fraud samples that are fed back.
  • a training device for an anti-fraud risk assessment model, including: an acquisition module for acquiring a training sample set, where the training samples include multi-dimensional features and their fraud labels, and the multi-dimensional features include: user static features, user behavior features, and device risk APP features; and a training module for inputting the training sample set into the anti-fraud risk assessment model to be trained for iterative training; wherein, in each round of iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into the feature learning network built based on the self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into the deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
  • an anti-fraud risk assessment device, including: an acquisition module for acquiring real-time transaction information, where the real-time transaction information includes: user static features, user behavior features, and device risk APP features; and an evaluation module for inputting the real-time transaction information into the anti-fraud risk assessment model, where the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into the feature learning network built based on the attention mechanism to obtain an encoding vector, and inputs the encoding vector into a deep network to obtain a risk prediction result; wherein the anti-fraud risk assessment model is trained by the method of the first aspect.
  • a training device for an anti-fraud risk assessment model, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method of the first aspect.
  • an anti-fraud risk assessment device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method of the second aspect.
  • a computer-readable storage medium storing a program, where, when the program is executed by a multi-core processor, the multi-core processor executes the method according to the first aspect and/or the method according to the second aspect.
  • FIG. 1 is a schematic structural diagram of a training device for an anti-fraud risk assessment model according to an embodiment of the present invention
  • FIG. 2 is a schematic flow diagram of a training method of an anti-fraud risk assessment model according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of the training process of the anti-fraud risk assessment model according to an embodiment of the present invention.
  • FIG. 4 is a schematic flow diagram of an anti-fraud risk assessment method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of the use process of the anti-fraud risk assessment model according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a training device for an anti-fraud risk assessment model according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an anti-fraud risk assessment device according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a training device for an anti-fraud risk assessment model according to an embodiment of the present invention.
  • Fig. 9 is a schematic structural diagram of an anti-fraud risk assessment device according to an embodiment of the present invention.
  • "A/B" can mean A or B; "and/or" in this document merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone.
  • "first", "second", etc. are used for descriptive purposes only, and should not be understood as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined as "first", "second", etc. may expressly or implicitly include one or more of that feature. In the description of the embodiments of the present application, unless otherwise specified, "plurality" means two or more.
  • FIG. 1 is a schematic structural diagram of a hardware operating environment involved in the solution of the embodiment of the present invention.
  • FIG. 1 is a schematic structural diagram of a hardware operating environment of a training device for an anti-fraud risk assessment model.
  • the training device for the anti-fraud risk assessment model in the embodiment of the present invention may be a terminal device such as a PC or a portable computer.
  • the training device for the anti-fraud risk assessment model may include: a processor 1001 , such as a CPU, a network interface 1004 , a user interface 1003 , a memory 1005 , and a communication bus 1002 .
  • the communication bus 1002 is used to realize connection and communication between these components.
  • the user interface 1003 may include a display screen (Display), an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1005 can be a high-speed RAM memory, or a non-volatile memory, such as a disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a training program for an anti-fraud risk assessment model.
  • the operating system is a program that manages and controls the hardware and software resources of the training device for the anti-fraud risk assessment model, and supports the operation of the training program for the anti-fraud risk assessment model and other software or programs.
  • the user interface 1003 is mainly used to receive requests, data, etc. sent by the first terminal, the second terminal and the supervision terminal;
  • the network interface 1004 is mainly used to connect to the background server and perform data communication with the background server;
  • the processor 1001 can be used to call the training program for the anti-fraud risk assessment model stored in the memory 1005, and perform the following operations:
  • obtaining a training sample set, where the training samples include multi-dimensional features and their fraud labels, and the multi-dimensional features include: user static features, user behavior features, and device risk APP features; inputting the training sample set into the anti-fraud risk assessment model to be trained for iterative training; wherein, in each round of iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into the feature learning network built based on the self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into the deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
  • FIG. 2 is a schematic flowchart of a training method for an anti-fraud risk assessment model according to an embodiment of the present application.
  • the execution subject may be one or more electronic devices, and more specifically a processing module of such devices; from a program point of view, the execution subject may be a program running on these electronic devices.
  • the method includes:
  • obtaining a training sample set, where the training samples include multi-dimensional features and their fraud labels, and the multi-dimensional features include: user static features, user behavior features, and device risk APP features;
  • the training sample set contains a number of black and white samples, where black samples are training samples whose fraud label is "Yes" and white samples are training samples whose fraud label is "No". Each training sample is obtained according to the transaction side information.
  • for example, a training sample can include: user static features (user A, gender, age, occupation), user behavior features (transaction location IP, transaction counterparty information), and device risk APP features (app_1, t_1, app_2, t_2, ..., t_{n-1}, app_n), where app_n is the transaction APP; app_1, app_2, etc. are risk APPs installed and used on the user's device, that is, APPs in the risk list; and t_1, t_2, ..., t_{n-1} are the usage intervals between two adjacent APPs in the sequence, which reflect the user's usage habits for risk APPs.
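As an illustration of the sample layout described above, the following minimal sketch holds one training sample in a Python dataclass. The class name, field names and example values are all hypothetical; the source only lists the feature groups and the interleaved (app, interval) sequence.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TrainingSample:
    # User static features (field names are illustrative assumptions).
    user_id: str
    gender: str
    age: int
    occupation: str
    # User behavior features.
    transaction_ip: str
    counterparty: str
    # Device risk APP features: APPs in usage order, ending with the transaction APP.
    app_sequence: List[str] = field(default_factory=list)
    # gaps[k] is the usage interval between app_sequence[k] and app_sequence[k + 1].
    gaps: List[float] = field(default_factory=list)
    # Fraud label: True for a black sample, False for a white sample.
    is_fraud: bool = False

    def __post_init__(self) -> None:
        if self.app_sequence and len(self.gaps) != len(self.app_sequence) - 1:
            raise ValueError("expected one gap between each pair of adjacent APPs")

    def interleaved(self) -> Tuple:
        """Return the sequence as (app_1, t_1, app_2, ..., t_{n-1}, app_n)."""
        out = []
        for k, app in enumerate(self.app_sequence):
            out.append(app)
            if k < len(self.gaps):
                out.append(self.gaps[k])
        return tuple(out)


sample = TrainingSample(
    user_id="user_A", gender="F", age=34, occupation="teacher",
    transaction_ip="203.0.113.7", counterparty="merchant_42",
    app_sequence=["risk_app_1", "risk_app_2", "pay_app"],
    gaps=[3600.0, 120.0], is_fraud=True,
)
```

Keeping the APPs and the intervals in two parallel lists, rather than one interleaved tuple, makes the length invariant easy to check while still allowing the interleaved view to be produced on demand.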
  • the method further includes: obtaining a global risk APP, and using the attribute information of each risk APP to obtain associated and/or similar other APPs to expand the global risk APP; the attribute information includes one or more of the following : Developer information, name information, APP introduction information.
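The expansion step can be sketched as follows. This is only one possible reading: it assumes that "associated" means sharing a developer with a known risk APP and that "similar" means a name-similarity ratio above a threshold. The function name, the catalog layout and the 0.8 threshold are assumptions, and APP introduction text is not used here.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set


def expand_risk_apps(seed: Set[str], catalog: List[Dict[str, str]],
                     name_threshold: float = 0.8) -> Set[str]:
    """Add catalog APPs that share a developer with, or resemble the name of, a seed risk APP."""
    by_name = {app["name"]: app for app in catalog}
    seed_developers = {by_name[name]["developer"] for name in seed if name in by_name}
    expanded = set(seed)
    for app in catalog:
        if app["name"] in expanded:
            continue
        same_developer = app["developer"] in seed_developers
        similar_name = any(
            SequenceMatcher(None, app["name"].lower(), s.lower()).ratio() >= name_threshold
            for s in seed
        )
        if same_developer or similar_name:
            expanded.add(app["name"])
    return expanded


catalog = [
    {"name": "QuickLoan", "developer": "dev_x"},
    {"name": "QuickLoan Pro", "developer": "dev_y"},
    {"name": "EasyCash", "developer": "dev_x"},
    {"name": "Weather", "developer": "dev_z"},
]
expanded = expand_risk_apps({"QuickLoan"}, catalog)
```

The expansion is a single pass against the original seed set, so newly added APPs do not pull in further APPs; whether the expansion should be transitive is not stated in the source.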
  • obtaining the training sample set also includes: collecting user static features, which include the user's age and gender; collecting user transaction behavior information through event-tracking instrumentation ("buried points"), where the user transaction behavior information includes: transaction location IP and transaction counterparty information; and periodically collecting the APP usage information of the user device, determining the risk APPs used by the user device according to the global risk APP, and obtaining the device risk APP features.
  • for example, if device information is collected at a first time point and activity traces of app_1 are found, and usage traces of app_1 and app_2 are found at a second time point, the usage time of each APP can be estimated from the collection times.
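A minimal sketch of this estimate, assuming each periodic collection yields a timestamp plus the set of APPs with usage traces, and taking the first collection time at which an APP appears as its approximate usage time. The function name and the snapshot format are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def estimate_first_use(snapshots: List[Tuple[float, Set[str]]]) -> Dict[str, float]:
    """Map each APP to the earliest collection time at which its traces were observed."""
    first_seen: Dict[str, float] = {}
    # Process snapshots in chronological order of collection time.
    for collected_at, apps in sorted(snapshots, key=lambda s: s[0]):
        for app in sorted(apps):
            first_seen.setdefault(app, collected_at)
    return first_seen


snapshots = [
    (200.0, {"app_1", "app_2"}),  # second collection: traces of app_1 and app_2
    (100.0, {"app_1"}),           # first collection: traces of app_1 only
]
first_use = estimate_first_use(snapshots)
interval = first_use["app_2"] - first_use["app_1"]
```

The resolution of the estimate is bounded by the collection period, so shorter periods give tighter usage intervals.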
  • the multi-dimensional features further include: text features, where the text features include transaction message information. It can be understood that the transaction message information of some fraudulent transactions is quite special, and the risk can be identified by identifying the transaction message information.
  • the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain the input vector, and inputs the input vector into the feature learning network built based on the self-attention mechanism to obtain a weighted and fused encoding vector.
  • the encoding vector is input into the deep network to obtain the risk prediction result, and the parameters of the risk assessment model are updated using the risk prediction result and the loss function constructed by the fraud label.
  • referring to FIG. 3, a training architecture diagram of the anti-fraud risk assessment model is shown, in which the anti-fraud risk assessment model 300 includes an embedding layer 301 for converting the multi-dimensional features of the input training samples into vector form, that is, the input vector.
  • the feature extraction network 302 is used to extract effective features from the input vector sequence.
  • the feature extraction network is constructed based on a self-attention mechanism, and may specifically include a self-attention layer, a residual and normalization layer, a feedforward network layer, and a summation and normalization layer, so that the weighted and fused encoding vector can be obtained.
  • the deep network 303 is used to obtain a risk prediction result based on the encoding vector.
  • the deep network 303 also receives the fraud label of the sample, so that the parameters of the risk assessment model can be adjusted through backpropagation based on the error between the risk prediction result and the fraud label.
  • the self-attention layer uses three vectors: Q, K and V.
  • the matching weights QK^T are normalized for gradient stability, passed through softmax, and multiplied by V to give the attention-weighted result of the input vector; a residual connection is then applied to prevent degradation of the deep network.
  • features such as the APPs installed on the user device and the user's static attributes are vectorized and then fused through the attention mechanism into a weighted sum, so that a risk prediction result for the user dimension can be obtained.
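The attention computation described above can be sketched with NumPy as follows. This is a single-head illustration, not the patented network: the source only says QK^T is "normalized for gradient stability", and dividing by the square root of the key dimension is assumed here because that is the standard Transformer choice. The projection matrices, shapes and random inputs are all hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def layer_norm(x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)


def self_attention_block(x, w_q, w_k, w_v):
    """x has shape (n_features, d_model); returns the encoded features and attention weights."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)     # scaled QK^T (scaling factor is an assumption)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    attended = weights @ v              # weighted fusion of the value vectors
    return layer_norm(x + attended), weights  # residual connection, then normalization


rng = np.random.default_rng(0)
n_features, d_model = 4, 8
x = rng.normal(size=(n_features, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, weights = self_attention_block(x, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one feature attends to the others, which is the "weighted fusion" the text refers to.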
  • a Transformer encoder is used as a feature learning network, and the Transformer encoder includes a self-attention layer, a residual and normalization layer, a feedforward network layer, and a summation and normalization layer.
  • the method further includes: obtaining usage timing information of the device risk APPs, and obtaining, based on the usage timing, the usage correlation between each risk APP used by the user device and the current fund APP; using the position encoding mechanism of the Transformer encoder to perform time-series encoding on the usage timing information to obtain a timing vector, and combining the timing vector with the usage correlation corresponding to each risk APP to obtain a timing intensity vector; and combining the timing intensity vector with the input vector corresponding to the device risk APP features and inputting the result into the self-attention layer. A weighted sum is then obtained by fusion through the attention mechanism, so that a risk prediction result for the device APP dimension can also be obtained.
  • for example, suppose the usage timing information of the device risk APPs is (app_1, t_1, app_2, t_2, app_3, ..., t_{n-1}, app_n). If a risk APP was used shortly before the transaction APP, the correlation between the two is high; if a risk APP was used long before the transaction APP, the correlation between the two is low.
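The source does not say how the usage correlation is computed from the intervals. One simple possibility, shown here purely as an assumption, is an exponential decay of the total time elapsed between a risk APP and the transaction APP; the function name and the time constant `tau` are hypothetical.

```python
import math
from typing import List


def usage_correlations(gaps: List[float], tau: float = 3600.0) -> List[float]:
    """gaps[k] is the interval t_{k+1} between adjacent APPs; the last APP is the transaction APP.

    Returns one correlation per risk APP, in sequence order.
    """
    correlations = []
    elapsed = 0.0
    # Walk backwards from the transaction APP, accumulating elapsed time.
    for gap in reversed(gaps):
        elapsed += gap
        correlations.append(math.exp(-elapsed / tau))
    correlations.reverse()
    return correlations


# app_1 --7200 s--> app_2 --3600 s--> transaction APP
corr = usage_correlations([7200.0, 3600.0], tau=3600.0)
```

With this choice, a risk APP used immediately before the transaction has a correlation close to 1, and the value falls towards 0 as the elapsed time grows.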
  • the relative time is also very important. Therefore, you can refer to the position encoding mechanism of the Transformer encoder to encode the timing information.
  • the following time-series encoding rule can be used [formula not reproduced in this text], where TE(t,2i) is the 2i-th dimension of the time-series encoding vector at time t, TE(t,2i+1) is the (2i+1)-th dimension of the time-series encoding vector at time t, and d_model is the dimension of the time-series encoding vector.
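The encoding formula itself did not survive in this text. Because the passage says the rule borrows the position encoding mechanism of the Transformer encoder, one plausible form, assuming the standard sinusoidal encoding with base 10000 and the time t in place of the token position, is:

```latex
TE(t,\,2i)   = \sin\!\left(\frac{t}{10000^{\,2i/d_{model}}}\right), \qquad
TE(t,\,2i+1) = \cos\!\left(\frac{t}{10000^{\,2i/d_{model}}}\right)
```

The base 10000 and the sine/cosine assignment to even and odd dimensions are assumptions carried over from the standard Transformer; the patent's actual constants may differ.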
  • with this encoding, the time-series vector at time t+t_1 can be obtained as a linear transformation of the vector at time t, which makes it easier for the model to capture changes between relative time points.
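The linearity property can be checked numerically. The sketch below assumes the standard sinusoidal form (base 10000), which is not confirmed by the source, and shows that the encoding at time t + delta equals a fixed per-frequency rotation applied to the encoding at time t. Function names are hypothetical.

```python
import math
from typing import List


def time_encoding(t: float, d_model: int, base: float = 10000.0) -> List[float]:
    """Sinusoidal encoding: even dimensions use sin, odd dimensions use cos."""
    if d_model % 2:
        raise ValueError("d_model must be even")
    vec = []
    for i in range(d_model // 2):
        angle = t / base ** (2 * i / d_model)
        vec.extend([math.sin(angle), math.cos(angle)])
    return vec


def shift_encoding(vec: List[float], delta: float, base: float = 10000.0) -> List[float]:
    """Apply the 2x2 rotation per frequency that maps TE(t) to TE(t + delta)."""
    d_model = len(vec)
    out = []
    for i in range(d_model // 2):
        w = delta / base ** (2 * i / d_model)
        s, c = vec[2 * i], vec[2 * i + 1]
        # sin(a + w) = sin a cos w + cos a sin w ; cos(a + w) = cos a cos w - sin a sin w
        out.extend([s * math.cos(w) + c * math.sin(w),
                    c * math.cos(w) - s * math.sin(w)])
    return out


shifted = shift_encoding(time_encoding(5.0, 8), 3.0)
direct = time_encoding(8.0, 8)
max_error = max(abs(a - b) for a, b in zip(shifted, direct))
```

The rotation depends only on delta, not on t, which is why relative time differences are easy for a linear layer to pick up.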
  • the embedding layer 301 includes an input embedding (input embedding) layer and a temporal encoding layer.
  • each feature of the training sample can be embedded to obtain the word embedding tensor of each feature.
  • the tensor can be expressed as a one-dimensional vector, a two-dimensional matrix, data of three or more dimensions, etc.
  • the usage timing position of each risk APP in the user device can be obtained, and then a timing tensor is generated for the timing of each risk APP.
  • the timing tensor and embedding tensor of these features can be combined and input to the feature extraction network.
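How the timing tensor and the embedding tensor are "combined" is not specified. A minimal sketch, assuming the timing vector is scaled by the usage correlation (giving the timing intensity vector) and then added element-wise to the APP embedding, in the same way positional encodings are added in a standard Transformer, could look like this. The function name and all values are hypothetical.

```python
from typing import List


def build_app_inputs(app_embeddings: List[List[float]],
                     timing_vectors: List[List[float]],
                     correlations: List[float]) -> List[List[float]]:
    """Return one input vector per risk APP: embedding + correlation * timing vector."""
    inputs = []
    for emb, timing, corr in zip(app_embeddings, timing_vectors, correlations):
        if len(emb) != len(timing):
            raise ValueError("embedding and timing vector must have the same dimension")
        intensity = [corr * value for value in timing]          # timing intensity vector
        inputs.append([e + s for e, s in zip(emb, intensity)])  # element-wise sum
    return inputs


app_inputs = build_app_inputs(
    app_embeddings=[[1.0, 2.0], [3.0, 4.0]],
    timing_vectors=[[0.5, 0.5], [1.0, 0.0]],
    correlations=[0.5, 1.0],
)
```

Concatenation instead of addition would be an equally valid reading of the text; addition is used here only because it keeps the input dimension unchanged.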
  • the deep network employs random forest or XGB in machine learning.
  • a transaction amount weighting factor is set in the loss function. Fraudulent transactions often involve large amounts and cause serious losses. The weight factor in the loss function can therefore be set based on the transaction amount of each training sample, so that the model as a whole is better at identifying large-amount fraudulent transactions.
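The source does not give the form of the weight factor. The sketch below assumes a binary cross-entropy loss whose per-sample weight grows with the logarithm of the transaction amount; the function name, the `alpha` parameter and the log scaling are assumptions chosen for illustration.

```python
import math
from typing import List


def amount_weighted_bce(probs: List[float], labels: List[int], amounts: List[float],
                        alpha: float = 1.0, eps: float = 1e-12) -> float:
    """Weighted binary cross-entropy with weight w = 1 + alpha * log(1 + amount)."""
    total, weight_sum = 0.0, 0.0
    for p, y, amount in zip(probs, labels, amounts):
        w = 1.0 + alpha * math.log1p(amount)
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


amounts = [10000.0, 10.0]  # both samples are fraudulent (label 1)
# Missing the large-amount fraud is penalized more than missing the small one.
loss_miss_large = amount_weighted_bce([0.1, 0.9], [1, 1], amounts)
loss_miss_small = amount_weighted_bce([0.9, 0.1], [1, 1], amounts)
```

The logarithm keeps very large transactions from dominating the loss entirely; a linear weight would be another option.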
  • Fig. 4 is a schematic flowchart of an anti-fraud risk assessment method provided by an embodiment of the present invention.
  • method 400 includes:
  • Obtain real-time transaction information which includes: one or more of user static characteristics, user behavior characteristics, and device risk APP characteristics;
  • Input the real-time transaction information into the anti-fraud risk assessment model, which performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into the feature learning network constructed based on the attention mechanism to obtain an encoding vector, and inputs the encoding vector into the deep network to obtain a risk prediction result; wherein the anti-fraud risk assessment model is trained using the method of the above-mentioned embodiment.
  • referring to FIG. 5, a schematic view of the use of the anti-fraud risk assessment model is shown. The transaction information obtained in real time is input into the trained anti-fraud risk assessment model 300; the transaction information includes one or more of user static features, user behavior features, and device risk APP features. The embedding layer 301 embeds the transaction information to obtain vectorized data, that is, the input vector; the feature extraction network 302 extracts effective features from the input vector, that is, the encoding vector; and the trained deep network makes a prediction from the encoding vector to obtain the risk prediction result.
  • it also includes: if the risk prediction result meets the preset condition, performing corresponding interference processing and/or alarm processing based on real-time transaction information.
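The preset condition is not defined in the source. A minimal sketch, assuming the risk prediction result is a score in [0, 1] and that two hypothetical thresholds separate intervention, alarm and normal processing:

```python
def decide_action(risk_score: float,
                  intervene_threshold: float = 0.9,
                  alarm_threshold: float = 0.6) -> str:
    """Map a risk score to an action; both thresholds are illustrative assumptions."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must lie in [0, 1]")
    if risk_score >= intervene_threshold:
        return "intervene"  # e.g. delay or block the transaction
    if risk_score >= alarm_threshold:
        return "alarm"      # e.g. notify the user or a risk analyst
    return "pass"


actions = [decide_action(s) for s in (0.95, 0.7, 0.2)]
```

In practice the thresholds would be tuned against the cost of false positives and missed fraud.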
  • the method further includes: updating the training sample set based on the risk prediction results and the real-time transaction information; constructing a user transaction relationship graph based on the updated training sample set, where the graph uses users as nodes and transaction relationships between users as edges; mining gang nodes and/or gang transactions from the user transaction relationship graph through clustering algorithms and/or graph attention algorithms; identifying hidden fraud samples in the training samples based on the gang nodes and/or gang transactions; and updating the risk assessment model based on the hidden fraud samples that are fed back.
  • the above-mentioned training sample set may not be labeled comprehensively and accurately. Therefore, starting from the known black samples, clustering and graph algorithms can be used to further mine gang crimes, that is, to uncover additional black samples.
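As a simplified stand-in for the clustering or graph attention algorithms mentioned above, the sketch below groups users by connected components of the transaction graph (using union-find) and flags unlabeled users who share a component with known black samples. This is a deliberately crude substitute, and the function name and the `min_fraud_in_group` parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def find_suspect_users(transactions: Iterable[Tuple[str, str]],
                       known_fraud: Set[str],
                       min_fraud_in_group: int = 1) -> Set[str]:
    """Return unlabeled users connected (via transactions) to enough known fraud users."""
    parent: Dict[str, str] = {}

    def find(u: str) -> str:
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    # Users are nodes; each transaction is an undirected edge.
    for payer, payee in transactions:
        root_a, root_b = find(payer), find(payee)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for user in list(parent):
        groups[find(user)].add(user)

    suspects: Set[str] = set()
    for members in groups.values():
        if len(members & known_fraud) >= min_fraud_in_group:
            suspects |= members - known_fraud
    return suspects


transactions = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]
suspects = find_suspect_users(transactions, known_fraud={"u1"})
```

Connected components over-approximate gangs on dense graphs, so a real system would likely use community detection or a learned graph model and then have the flagged samples reviewed before relabeling them.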
  • first and second are used for descriptive purposes only, and cannot be interpreted as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features.
  • the features defined as “first” and “second” may explicitly or implicitly include at least one of these features.
  • “plurality” means at least two, such as two, three, etc., unless otherwise specifically defined.
  • an embodiment of the present invention also provides a training device for an anti-fraud risk assessment model, which is used to implement the training method for an anti-fraud risk assessment model provided in any of the above-mentioned embodiments.
  • FIG. 6 is a schematic structural diagram of a training device for an anti-fraud risk assessment model provided by an embodiment of the present invention.
  • the device 600 includes:
  • the acquisition module 601 is configured to acquire a training sample set, the training samples include multi-dimensional features and their fraud labels, and the multi-dimensional features include: user static features, user behavior features, and device risk APP features;
  • a training module 602 configured to input the training sample set into the anti-fraud risk assessment model to be trained for iterative training
  • the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain the input vector, and inputs the input vector into the feature learning network built based on the self-attention mechanism to obtain a weighted and fused encoding vector.
  • the encoding vector is input into the deep network to obtain the risk prediction result, and the parameters of the risk assessment model are updated using the risk prediction result and the loss function constructed by the fraud label.
  • an embodiment of the present invention also provides an anti-fraud risk assessment device, which is used to implement the anti-fraud risk assessment method provided in any one of the above embodiments.
  • Fig. 7 is a schematic structural diagram of an anti-fraud risk assessment device provided by an embodiment of the present invention.
  • the obtaining module 701 is used to obtain real-time transaction information, and the real-time transaction information includes: user static characteristics, user behavior characteristics and device risk APP characteristics;
  • the evaluation module 702 is used to input the real-time transaction information into the anti-fraud risk assessment model, where the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into the feature learning network constructed based on the attention mechanism to obtain an encoding vector, and inputs the encoding vector into the deep network to obtain the risk prediction result; wherein the anti-fraud risk assessment model is obtained by training with the above training method.
  • FIG. 8 shows a training device for an anti-fraud risk assessment model according to an embodiment of the present application, which is used to execute the training method for the anti-fraud risk assessment model shown in FIG. 2. The device includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the method of the above-mentioned embodiment.
  • FIG. 9 shows an anti-fraud risk assessment device according to an embodiment of the present application, which is used to execute the anti-fraud risk assessment method shown in FIG. 4. The device includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the method of the above-mentioned embodiment.
  • a non-volatile computer storage medium for the training method of the anti-fraud risk assessment model and/or the anti-fraud risk assessment method is provided, on which computer-executable instructions are stored, the computer-executable instructions being configured to perform, when executed by a processor, the method of the above-mentioned embodiments.
  • the apparatus, device, and computer-readable storage medium provided in the embodiments of the present application correspond one-to-one to the methods. Therefore, the apparatus, device, and computer-readable storage medium also have beneficial technical effects similar to those of their corresponding methods.
  • the beneficial technical effect of the method has been described in detail, therefore, the beneficial technical effect of the device, equipment and computer-readable storage medium will not be repeated here.
  • the embodiments of the present invention may be provided as methods, systems or computer program products. Accordingly, the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus, and the instruction apparatus implements the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-permanent storage in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
  • Computer-readable media, including both permanent and non-permanent, removable and non-removable media, can store information by any method or technology.
  • The information may be computer-readable instructions, data structures, modules of a program, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

Abstract

Provided in the present invention are an anti-fraud risk assessment method and apparatus, a training method and apparatus, and a readable storage medium. The training method comprises: acquiring a training sample set, wherein the training samples comprise multi-dimensional features and fraud labels thereof, and the multi-dimensional features comprise a static feature of a user, a behavior feature of the user and a device risk application feature; and inputting the training sample set into an anti-fraud risk assessment model to be trained, so as to perform iterative training, wherein in each round of iteration, the anti-fraud risk assessment model executes embedding processing on the input multi-dimensional features, so as to obtain an input vector; the input vector is input into a feature learning network, which is constructed on the basis of a self-attention mechanism, such that a coded vector after weighted fusion is obtained; the coded vector is input into a deep network, such that a risk prediction result is obtained; and parameters of the risk assessment model are updated by using a loss function which is constructed on the basis of the risk prediction result and the fraud labels. By using the method, a better anti-fraud risk assessment effect can be obtained.

Description

Anti-fraud risk assessment method, training method, apparatus, and readable storage medium
This application claims priority to Chinese patent application No. 202111640205.2, filed on December 29, 2021 and titled "Anti-fraud risk assessment method, training method, apparatus and readable storage medium", the disclosure of which is incorporated herein by reference.
Technical Field
The present invention belongs to the field of anti-fraud, and in particular relates to an anti-fraud risk assessment method, a training method, an apparatus, and a readable storage medium.
Background
This section is intended to provide background or context for the embodiments of the invention recited in the claims. The description herein is not admitted to be prior art by virtue of its inclusion in this section.
With the development of telecommunications networks, real-time communication and fund transactions have become increasingly convenient, which also gives fraudsters opportunities to exploit. Given that users have weak fraud-prevention awareness before a case occurs and that tracing is difficult afterwards, prevention on the transaction side is particularly important.
At present, however, the identification of fraudulent transactions still lags behind and is not very accurate.
Summary of the Invention
In view of the above problems in the prior art, an anti-fraud risk assessment method, a training method, an apparatus, and a readable storage medium are proposed, with which the above problems can be solved.
The present invention provides the following solutions.
In a first aspect, a training method for an anti-fraud risk assessment model is provided, including: acquiring a training sample set, where each training sample includes multi-dimensional features and a fraud label thereof, and the multi-dimensional features include user static features, user behavior features, and device risk APP features; and inputting the training sample set into the anti-fraud risk assessment model to be trained for iterative training; wherein, in each round of iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network constructed based on a self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
In one embodiment, a Transformer encoder is used as the feature learning network, and the Transformer encoder includes a self-attention layer, a residual and normalization layer, a feedforward network layer, and a summation and normalization layer.
In one embodiment, the method further includes: acquiring usage timing information of the device risk APPs, and acquiring, based on the usage timing, the usage correlation between each risk APP used on the user device and the current fund-type APP; performing time-series encoding on the usage timing information using the positional encoding mechanism of the Transformer encoder to obtain a timing vector, and combining the timing vector with the usage correlation corresponding to each risk APP to obtain a timing intensity vector; and combining the timing intensity vector with the input vector corresponding to the device risk APP features, and inputting the result into the self-attention layer.
In one embodiment, performing time-series encoding on the usage timing information using the positional encoding mechanism of the Transformer encoder further includes defining the time-series encoding rule by the following formulas:
Figure PCTCN2022117419-appb-000001
Figure PCTCN2022117419-appb-000002
where TE(t,2i) is the 2i-th dimension of the time-series encoding vector for timing t, TE(t,2i+1) is the (2i+1)-th dimension of the time-series encoding vector for timing t, and d_model is the dimension of the time-series encoding vector.
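The two formula images are not reproduced in this text. As a minimal sketch, assuming the rule follows the standard sinusoidal positional encoding of the Transformer with the usage time t in place of the token position, that is TE(t,2i) = sin(t / 10000^(2i/d_model)) and TE(t,2i+1) = cos(t / 10000^(2i/d_model)), the encoding could be computed as follows. The function name `time_encoding` and the requirement that d_model be even are assumptions made for illustration only.

```python
import math


def time_encoding(t, d_model):
    """Sinusoidal time-series encoding of a usage time t.

    Assumed form (not confirmed by the formula images):
    TE(t, 2i) = sin(t / 10000^(2i/d_model)),
    TE(t, 2i+1) = cos(t / 10000^(2i/d_model)).
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    vec = [0.0] * d_model
    for i in range(d_model // 2):
        angle = t / (10000 ** (2 * i / d_model))
        vec[2 * i] = math.sin(angle)      # even dimension 2i
        vec[2 * i + 1] = math.cos(angle)  # odd dimension 2i+1
    return vec
```

Under this assumption, each usage time maps to a fixed-length vector that can be combined with the APP embedding of the same dimension.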
In one embodiment, the method further includes: acquiring global risk APPs, and using attribute information of each risk APP to obtain other associated and/or similar APPs so as to expand the global risk APPs; the attribute information includes one or more of the following: developer information, name information, and APP introduction information.
In one embodiment, acquiring the training data set further includes: collecting user transaction behavior information by means of event tracking (buried points), where the user transaction behavior data includes the transaction location IP and transaction counterparty information; and periodically collecting APP usage information of the user device, determining the risk APPs used on the user device according to the global risk APPs, and obtaining the device risk APP features.
In one embodiment, the multi-dimensional features further include text features, and the text features include transaction message information.
In one embodiment, the deep network uses random forest or XGB from machine learning.
In one embodiment, a transaction amount weight factor is set in the loss function.
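The text does not specify how the transaction amount weight factor enters the loss. One possible reading, sketched here purely as an assumption, is a binary cross-entropy in which each sample is weighted by a slowly growing function of its transaction amount, so that errors on large transactions cost more. The function name `amount_weighted_bce`, the weight `1 + log1p(amount / scale)`, and the default `scale` are hypothetical.

```python
import math


def amount_weighted_bce(probs, labels, amounts, scale=1000.0, eps=1e-12):
    """Weighted binary cross-entropy (illustrative assumption).

    probs   : predicted fraud probabilities in (0, 1)
    labels  : fraud labels, 1 for fraud and 0 for non-fraud
    amounts : transaction amounts, used only to derive sample weights
    """
    total = 0.0
    weight_sum = 0.0
    for p, y, amount in zip(probs, labels, amounts):
        weight = 1.0 + math.log1p(amount / scale)  # hypothetical weight factor
        p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weight * loss
        weight_sum += weight
    return total / weight_sum
```

With all amounts equal to zero, every weight is 1 and the result reduces to the ordinary mean binary cross-entropy.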
In a second aspect, an anti-fraud risk assessment method is provided, including: acquiring real-time transaction information, where the real-time transaction information includes user static features, user behavior features, and device risk APP features; and inputting the real-time transaction information into an anti-fraud risk assessment model, where the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into a feature learning network constructed based on an attention mechanism to obtain an encoding vector, and inputs the encoding vector into a deep network to obtain a risk prediction result; wherein the anti-fraud risk assessment model is trained using the method of the first aspect.
In one embodiment, the method further includes: if the risk prediction result meets a preset condition, performing corresponding interference processing and/or alarm processing based on the real-time transaction information.
In one embodiment, the method further includes: updating the training sample set based on the risk prediction result and the real-time transaction information; constructing a user transaction relationship graph based on the training sample set updated in real time, where the graph takes users as nodes and the transaction relationships between users as edges; mining gang nodes and/or gang transactions from the user transaction relationship graph by a clustering algorithm and/or a graph attention algorithm; identifying hidden fraud samples from the training sample set based on the gang nodes and/or gang transactions; and updating and training the risk assessment model based on the fed-back hidden fraud samples.
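The text names a clustering algorithm and/or a graph attention algorithm without further detail. As a deliberately simple stand-in, and not the method of the application, the sketch below groups users by connected components of the transaction graph, treats a sufficiently large component containing a known fraud user as a suspected gang, and returns the non-fraud-labelled transactions inside it as hidden fraud candidates. The function name, the tuple layout, and `min_group_size` are hypothetical.

```python
from collections import defaultdict


def find_hidden_fraud(transactions, known_fraud_users, min_group_size=3):
    """transactions: list of (payer, payee, is_fraud) tuples.
    known_fraud_users: set of user ids already labelled as fraudulent."""
    # Users are nodes; each transaction adds an undirected edge.
    graph = defaultdict(set)
    for payer, payee, _ in transactions:
        graph[payer].add(payee)
        graph[payee].add(payer)

    # Group users into connected components with an iterative DFS.
    component_of = {}
    components = []
    for start in graph:
        if start in component_of:
            continue
        members, stack = set(), [start]
        while stack:
            user = stack.pop()
            if user in members:
                continue
            members.add(user)
            stack.extend(graph[user] - members)
        for user in members:
            component_of[user] = len(components)
        components.append(members)

    # Suspected gang: large enough and containing a known fraud user.
    gang_ids = {
        index for index, members in enumerate(components)
        if len(members) >= min_group_size and members & known_fraud_users
    }

    # Hidden fraud candidates: non-fraud-labelled transactions inside a gang.
    return [
        (payer, payee) for payer, payee, is_fraud in transactions
        if not is_fraud and component_of[payer] in gang_ids
    ]
```

The returned candidates would still need review before being fed back as relabelled samples for update training.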
In a third aspect, a training apparatus for an anti-fraud risk assessment model is provided, including: an acquisition module, configured to acquire a training sample set, where each training sample includes multi-dimensional features and a fraud label thereof, and the multi-dimensional features include user static features, user behavior features, and device risk APP features; and a training module, configured to input the training sample set into the anti-fraud risk assessment model to be trained for iterative training; wherein, in each round of iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network constructed based on a self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
In a fourth aspect, an anti-fraud risk assessment apparatus is provided, including: an acquisition module, configured to acquire real-time transaction information, where the real-time transaction information includes user static features, user behavior features, and device risk APP features; and an evaluation module, configured to input the real-time transaction information into an anti-fraud risk assessment model, where the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into a feature learning network constructed based on an attention mechanism to obtain an encoding vector, and inputs the encoding vector into a deep network to obtain a risk prediction result; wherein the anti-fraud risk assessment model is trained using the method of the first aspect.
In a fifth aspect, a training apparatus for an anti-fraud risk assessment model is provided, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method of the first aspect.
In a sixth aspect, an anti-fraud risk assessment apparatus is provided, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method of the second aspect.
In a seventh aspect, a computer-readable storage medium is provided, where the computer-readable storage medium stores a program that, when executed by a multi-core processor, causes the multi-core processor to perform the method of the first aspect and/or the method of the second aspect.
One of the advantages of the above embodiments is that a better anti-fraud risk assessment effect can be obtained.
Other advantages of the present invention will be explained in more detail in conjunction with the following description and the accompanying drawings.
It should be understood that the above description is only an overview of the technical solutions of the present invention, provided so that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification. To make the above and other objects, features, and advantages of the present invention more comprehensible, specific embodiments of the present invention are illustrated below.
Brief Description of the Drawings
The advantages and benefits described herein, as well as other advantages and benefits, will become apparent to those of ordinary skill in the art upon reading the following detailed description of the exemplary embodiments. The drawings are only for the purpose of illustrating exemplary embodiments and are not to be considered as limiting the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
FIG. 1 is a schematic structural diagram of a training device for an anti-fraud risk assessment model according to an embodiment of the present invention;
FIG. 2 is a schematic flow diagram of a training method for an anti-fraud risk assessment model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the training process of an anti-fraud risk assessment model according to an embodiment of the present invention;
FIG. 4 is a schematic flow diagram of an anti-fraud risk assessment method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the use process of an anti-fraud risk assessment model according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a training apparatus for an anti-fraud risk assessment model according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an anti-fraud risk assessment apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a training apparatus for an anti-fraud risk assessment model according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an anti-fraud risk assessment apparatus according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals denote the same or corresponding parts.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
In the description of the embodiments of the present application, it should be understood that terms such as "including" or "having" are intended to indicate the existence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof exist.
Unless otherwise specified, "/" means "or"; for example, A/B may mean A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone.
The terms "first", "second", and so on are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature defined as "first", "second", and so on may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present application, unless otherwise specified, "plurality" means two or more.
All code in this application is exemplary, and those skilled in the art will think of various modifications, depending on factors such as the programming language used, specific requirements, and personal habits, without departing from the idea of this application.
As shown in FIG. 1, FIG. 1 is a schematic structural diagram of the hardware operating environment involved in the solutions of the embodiments of the present invention.
It should be noted that FIG. 1 may be a schematic structural diagram of the hardware operating environment of the training device for the anti-fraud risk assessment model. The training device for the anti-fraud risk assessment model in the embodiments of the present invention may be a terminal device such as a PC or a portable computer.
As shown in FIG. 1, the training device for the anti-fraud risk assessment model may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art can understand that the structure of the training device for the anti-fraud risk assessment model shown in FIG. 1 does not constitute a limitation on the device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a training program for the anti-fraud risk assessment model. The operating system is a program that manages and controls the hardware and software resources of the training device for the anti-fraud risk assessment model, and supports the running of the training program for the anti-fraud risk assessment model and of other software or programs.
In the training device for the anti-fraud risk assessment model shown in FIG. 1, the user interface 1003 is mainly used to receive requests, data, and the like sent by a first terminal, a second terminal, and a supervision terminal; the network interface 1004 is mainly used to connect to a background server and perform data communication with the background server; and the processor 1001 may be used to call the training program for the anti-fraud risk assessment model stored in the memory 1005 and perform the following operations:
Acquiring a training sample set, where each training sample includes multi-dimensional features and a fraud label thereof, and the multi-dimensional features include user static features, user behavior features, and device risk APP features; and inputting the training sample set into the anti-fraud risk assessment model to be trained for iterative training; wherein, in each round of iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network constructed based on a self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
Thus, by using the attention mechanism to fuse multi-dimensional data such as user static features, user behavior features, and device risk APP features for risk prediction, a model with a better anti-fraud risk assessment effect can be trained.
FIG. 2 is a schematic flow diagram of a training method for an anti-fraud risk assessment model according to an embodiment of the present application. In this flow, from the device perspective, the execution subject may be one or more electronic devices, and more specifically a processing module thereof; from the program perspective, the execution subject may correspondingly be a program installed on these electronic devices.
Referring to FIG. 2, the method includes:
202. Acquire a training sample set, where each training sample includes multi-dimensional features and a fraud label thereof, and the multi-dimensional features include user static features, user behavior features, and device risk APP features.
The training sample set contains a number of black and white samples, where a black sample is a training sample whose fraud label is "yes" and a white sample is a training sample whose fraud label is "no". Each training sample is obtained from transaction-side information.
For example, a training sample may be: user static features (user A, gender, age, occupation), user behavior features (transaction location IP, transaction counterparty information), and device risk APP features (app_1, t_1, app_2, t_2, ..., t_{n-1}, app_n), where app_n is the transaction APP; app_1, app_2, and so on are risk APPs installed and used on the user device, that is, APPs on the risk list; and t_1, t_2, ..., t_{n-1} are the usage intervals between two adjacent risk APPs, from which the user's usage habits for risk APPs can be seen.
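The sample layout described above can be sketched as a small data structure. This is only an illustration: the class name `TrainingSample`, its field names, and the helper `split_black_white` are hypothetical and do not appear in the application.

```python
from dataclasses import dataclass


@dataclass
class TrainingSample:
    static_features: dict    # e.g. gender, age, occupation
    behavior_features: dict  # e.g. transaction location IP, counterparty
    apps: list               # app_1 ... app_n; app_n is the transaction APP
    intervals: list          # t_1 ... t_{n-1}, gaps between adjacent APPs
    is_fraud: bool           # fraud label: True = black sample

    def __post_init__(self):
        if len(self.intervals) != max(len(self.apps) - 1, 0):
            raise ValueError("expected one interval between each pair of adjacent APPs")

    def app_sequence(self):
        """Interleave APPs and intervals: app_1, t_1, app_2, ..., t_{n-1}, app_n."""
        sequence = []
        for index, app in enumerate(self.apps):
            sequence.append(app)
            if index < len(self.intervals):
                sequence.append(self.intervals[index])
        return sequence


def split_black_white(samples):
    """Separate black samples (fraud) from white samples (non-fraud)."""
    black = [s for s in samples if s.is_fraud]
    white = [s for s in samples if not s.is_fraud]
    return black, white
```

The interleaved sequence mirrors the (app_1, t_1, app_2, ..., app_n) form of the device risk APP features.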
In some embodiments, the method further includes: acquiring global risk APPs, and using attribute information of each risk APP to obtain other associated and/or similar APPs so as to expand the global risk APPs; the attribute information includes one or more of the following: developer information, name information, and APP introduction information.
It can be understood that new risk APPs keep appearing and that a list of known risk APPs can hardly be complete. Unknown risk APPs can therefore be inferred from the known risk APPs using association algorithms such as clustering, so that the global risk APPs are expanded in real time. Attribute information of risk APPs, such as developer information, name information, and APP introduction information, may be related or similar across APPs, and the expansion can be based on this.
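As a rough sketch of this expansion step, the example below uses plain string similarity from the Python standard library in place of the clustering or association algorithm, which the text does not specify. The dictionary keys `name`, `developer`, and `intro`, the function name, and the threshold value are all assumptions.

```python
from difflib import SequenceMatcher


def expand_risk_apps(risk_apps, candidates, threshold=0.8):
    """Return the risk list extended with candidates that share a developer
    with a known risk APP or have a sufficiently similar name or introduction.

    Each APP is a dict with the hypothetical keys 'name', 'developer', 'intro'.
    """
    def similar(a, b):
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    expanded = list(risk_apps)
    known_names = {app["name"] for app in risk_apps}
    for cand in candidates:
        if cand["name"] in known_names:
            continue
        for risk in risk_apps:
            same_developer = cand["developer"] == risk["developer"]
            close_name = similar(cand["name"], risk["name"]) >= threshold
            close_intro = similar(cand["intro"], risk["intro"]) >= threshold
            if same_developer or close_name or close_intro:
                expanded.append(cand)
                known_names.add(cand["name"])
                break
    return expanded
```

Candidates are compared only against the originally known risk APPs, so one expansion pass does not cascade through newly added entries.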
In some embodiments, acquiring the training data set further includes: collecting user static features, where the user static features include the user's age and gender; collecting user transaction behavior information by means of event tracking (buried points), where the user transaction behavior data includes the transaction location IP and transaction counterparty information; and periodically collecting APP usage information of the user device, determining the risk APPs used on the user device according to the global risk APPs, and obtaining the device risk APP features.
For example, user device information is collected at a first time point and activity traces of app_1 are found; at a second time point, usage traces of app_1 and app_2 are found. The usage time of each APP can then be estimated from the collection times.
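One way to read this estimate, offered as an assumption rather than the application's procedure, is to take the first collection time at which an APP's traces appear as its approximate usage time and to derive the intervals t_1, ..., t_{n-1} from consecutive first-seen times. The function names and the snapshot format are hypothetical.

```python
def first_seen_times(snapshots):
    """snapshots: list of (collection_time, set_of_risk_apps_with_traces),
    sorted by collection time. Returns {app: first collection time seen}."""
    first_seen = {}
    for collected_at, apps in snapshots:
        for app in apps:
            first_seen.setdefault(app, collected_at)
    return first_seen


def usage_sequence(snapshots):
    """Order APPs by estimated usage time and compute the gaps between them."""
    first_seen = first_seen_times(snapshots)
    ordered = sorted(first_seen.items(), key=lambda item: item[1])
    apps = [app for app, _ in ordered]
    times = [seen_at for _, seen_at in ordered]
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    return apps, intervals
```

The resolution of the estimate is bounded by the collection period: two APPs first seen in the same snapshot get an interval of zero.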
在一些实施例中,多维特征还包括:文本特征,文本特征包括交易留言信息。可以理解,某些欺诈交易的交易留言信息较为特别,可通过识别其交易留言信息来识别风险。In some embodiments, the multi-dimensional features further include: text features, where the text features include transaction message information. It can be understood that the transaction message information of some fraudulent transactions is quite special, and the risk can be identified by identifying the transaction message information.
204. Input the training sample set into the anti-fraud risk assessment model to be trained for iterative training;
In each iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network built on a self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
FIG. 3 shows the training architecture of the anti-fraud risk assessment model. The anti-fraud risk assessment model 300 includes an embedding layer 301, which converts the multi-dimensional features of an input training sample into vector form, that is, the input vector. A feature extraction network 302 extracts effective features from the input vector sequence. The feature extraction network is built on a self-attention mechanism and may specifically include a self-attention layer, a residual and normalization layer, a feedforward network layer, and a summation and normalization layer, from which the weighted and fused encoding vector is obtained. A deep network 303 obtains the risk prediction result from the encoding vector. The deep network 303 also receives the fraud label of the sample, so that the parameters of the risk assessment model are adjusted by backpropagation based on the error between the risk prediction result and the fraud label.
In the attention mechanism, each input vector is multiplied by three different weight matrices to obtain three vectors: the query vector Q, the key vector K and the value vector V. A similarity score, score = QK^T, gives the matching weights. The score is normalized for gradient stability, passed through a softmax activation and multiplied by V to give the result of the weighted input vector after the attention structure. The output is finally fed through a residual network structure to prevent degradation in deep learning.
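The attention computation described above can be sketched in plain Python. The sketch assumes that the normalization is the usual division by the square root of the key dimension, as in the standard Transformer; the text itself only says the score is normalized. The helper names are illustrative.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, wq, wk, wv):
    """Single-head self-attention with a residual connection.

    x: one row per input vector; wq, wk, wv: weight matrices.
    wv must map back to the input width so the residual add is valid.
    """
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # score = Q K^T
    # Assumed normalization: scale by sqrt(d_k) before softmax.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    attended = matmul(weights, v)
    output = [[a + b for a, b in zip(a_row, x_row)] for a_row, x_row in zip(attended, x)]
    return output, weights


identity = [[1.0, 0.0], [0.0, 1.0]]
out, attn = self_attention(identity, identity, identity, identity)
print(attn)
```

With identity weights, each vector attends most strongly to itself, which is why the diagonal attention weights are the largest.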
In the present invention, features such as the APPs installed on the user device and the static attributes are vectorized, then concatenated and combined into a weighted sum through the attention mechanism, so that a risk prediction result at the user dimension can be obtained.
In some embodiments, a Transformer encoder is used as the feature learning network. The Transformer encoder includes a self-attention layer, a residual and normalization layer, a feedforward network layer, and a summation and normalization layer.
In some embodiments, the method further includes: acquiring usage timing information of the device risk APPs, and obtaining, based on the usage timing, the usage correlation between each risk APP used on the user device and the current fund-related APP; performing time-series encoding on the usage timing information using the position encoding mechanism of the Transformer encoder to obtain a time-series vector, and combining the time-series vector with the usage correlation of each risk APP to obtain a time-series strength vector; and combining the time-series strength vector with the input vector corresponding to the device risk APP features and inputting the result into the self-attention layer. A weighted sum is then obtained through the attention mechanism, so that a risk prediction result at the device APP dimension can further be obtained.
For example, the usage timing information of the device risk APPs is (app_1, t_1, app_2, t_2, app_3, …, t_{n-1}, app_n). If a risk APP was used shortly before the transaction APP, the correlation between the two is high; if another risk APP was used long before the transaction APP, the correlation between the two is low. For example, the correlation between the risk APP app_{n-1} and the current transaction APP app_n may be set using the following formula:
Figure PCTCN2022117419-appb-000003
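The correlation formula itself is contained in the figure and is not reproduced in the text. Purely to illustrate the stated behavior (a short gap gives a high correlation, a long gap gives a low one), the sketch below assumes an exponential decay in the time gap. The decay form, the function name and the time constant `tau` are assumptions and should not be read as the patent's formula.

```python
import math


def usage_correlation(risk_app_time, trade_app_time, tau=3600.0):
    """Illustrative correlation in (0, 1] between a risk APP and the
    transaction APP, decaying with the time gap in seconds.

    ASSUMPTION: exponential decay with time constant tau; the actual
    formula is given only in the patent figure.
    """
    gap = max(0.0, trade_app_time - risk_app_time)
    return math.exp(-gap / tau)


recent = usage_correlation(risk_app_time=900.0, trade_app_time=1000.0)
distant = usage_correlation(risk_app_time=0.0, trade_app_time=100000.0)
print(recent > distant)
```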
Considering users' usage habits, relative time is also very important in addition to the absolute time relationship. The usage timing information can therefore be time-series encoded with reference to the position encoding mechanism of the Transformer encoder.
For example, the following time-series encoding rules may be used:
Figure PCTCN2022117419-appb-000004
Figure PCTCN2022117419-appb-000005
Here, TE(t,2i) is the 2i-th dimension of the time-series encoding vector for time t, TE(t,2i+1) is the (2i+1)-th dimension of the time-series encoding vector for time t, and d_model is the dimension of the time-series encoding vector.
From the above formulas, the time-series vector at time t+t1 can be obtained by a linear transformation of the time-series vector at time t, which makes it easier for the model to capture changes in relative timing.
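The encoding formulas appear only as figures. Assuming they follow the standard sinusoidal position encoding of the Transformer with time t in place of the position index (an assumption, including the base 10000), the linearity claim can be checked as follows, writing \omega_i = 10000^{-2i/d_{model}}:

```latex
% Assumed form, by analogy with the standard Transformer position encoding
TE(t, 2i)   = \sin(\omega_i t), \qquad
TE(t, 2i+1) = \cos(\omega_i t), \qquad
\omega_i = 10000^{-2i/d_{model}}

% Angle-addition identities give a linear map that depends only on t_1
\begin{pmatrix} TE(t+t_1, 2i) \\ TE(t+t_1, 2i+1) \end{pmatrix}
=
\begin{pmatrix}
 \cos(\omega_i t_1) & \sin(\omega_i t_1) \\
 -\sin(\omega_i t_1) & \cos(\omega_i t_1)
\end{pmatrix}
\begin{pmatrix} TE(t, 2i) \\ TE(t, 2i+1) \end{pmatrix}
```

Under this assumption the matrix does not depend on t, so the encoding of any later time is a fixed rotation of the encoding at time t for a given offset t_1.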
Referring to FIG. 3, the embedding layer 301 includes an input embedding layer and a time-series encoding layer. In the input embedding layer, each feature of a training sample is embedded to obtain the embedding tensor of that feature. A tensor may take the form of a one-dimensional vector, a two-dimensional matrix, or data of three or more dimensions. In the time-series encoding layer, the usage timing position of each risk APP on the user device is obtained, and a timing tensor is generated for each risk APP. After the embedding tensor of each feature and the timing tensors of certain features (the risk APPs) are obtained, the timing tensors and embedding tensors of these features are combined and input into the feature extraction network.
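A possible shape for the time-series encoding layer is sketched below. It assumes the standard sinusoidal encoding, element-wise addition as the way of combining the timing tensor with the embedding tensor, and an optional `strength` factor standing in for the usage correlation; none of these choices is confirmed by the text, and the function names are illustrative.

```python
import math


def time_encoding(t, d_model, base=10000.0):
    """Sinusoidal encoding of time t; d_model must be even.

    ASSUMPTION: same form as the standard Transformer position encoding.
    """
    vec = []
    for i in range(d_model // 2):
        angle = t / (base ** (2 * i / d_model))
        vec.extend([math.sin(angle), math.cos(angle)])
    return vec


def combine(embedding, t, strength=1.0):
    """Add the (optionally scaled) time encoding to a feature embedding.

    ASSUMPTION: combination by element-wise addition; `strength` is a
    hypothetical scale such as the usage correlation of the risk APP.
    """
    encoding = time_encoding(t, len(embedding))
    return [e + strength * p for e, p in zip(embedding, encoding)]


print(time_encoding(0, 4))
print(combine([0.5, 0.5, 0.5, 0.5], 0))
```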
In some embodiments, the deep network uses random forest or XGB from machine learning.
In some embodiments, the loss function includes a transaction amount weighting factor. It can be understood that fraud amounts are generally large and the resulting harm is serious, so the weighting factor in the loss function can be set based on the transaction amount of each training sample, making the model better at identifying fraudulent transactions with large amounts.
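As an illustration of an amount-weighted loss, the sketch below uses binary cross-entropy with a per-sample weight that grows with the logarithm of the transaction amount. The choice of cross-entropy, the weight formula `1 + alpha * log(1 + amount)` and the parameter `alpha` are assumptions; the text only states that a transaction amount weighting factor is present.

```python
import math


def amount_weighted_bce(probs, labels, amounts, alpha=1.0):
    """Weighted binary cross-entropy, normalized by the total weight.

    ASSUMPTION: weight = 1 + alpha * log(1 + amount), so larger
    transactions contribute more to the loss.
    """
    eps = 1e-12
    total, weight_sum = 0.0, 0.0
    for p, y, amount in zip(probs, labels, amounts):
        weight = 1.0 + alpha * math.log1p(amount)
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -weight * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += weight
    return total / weight_sum


# The same two predictions: one fraud missed (0.1), one caught (0.9).
missed_large = amount_weighted_bce([0.1, 0.9], [1, 1], [10000.0, 1.0])
missed_small = amount_weighted_bce([0.1, 0.9], [1, 1], [1.0, 10000.0])
print(missed_large > missed_small)
```

Missing the large-amount fraud yields the higher loss, which is the behavior the weighting factor is meant to produce.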
Based on the same technical concept, an embodiment of the present invention further provides an anti-fraud risk assessment method. FIG. 4 is a schematic flowchart of an anti-fraud risk assessment method provided by an embodiment of the present invention.
As shown in FIG. 4, the method 400 includes:
402. Acquire real-time transaction information, where the real-time transaction information includes one or more of user static features, user behavior features and device risk APP features;
404. Input the real-time transaction information into the anti-fraud risk assessment model, perform embedding processing on the input real-time transaction information to obtain an input vector, input the input vector into a feature learning network built on an attention mechanism to obtain an encoding vector, and input the encoding vector into a deep network to obtain a risk prediction result, where the anti-fraud risk assessment model is trained using the method of the above embodiments.
FIG. 5 shows how the anti-fraud risk assessment model is used. Transaction information obtained in real time is input into the trained anti-fraud risk assessment model 300. The transaction information includes one or more of user static features, user behavior features and device risk APP features. The embedding layer 301 embeds the transaction information to obtain vectorized data, that is, the input vector. The feature extraction network 302 extracts effective features, that is, the encoding vector, from the input vector. The trained deep network makes a prediction from the encoding vector to obtain the risk prediction result.
In some embodiments, the method further includes: if the risk prediction result meets a preset condition, performing corresponding intervention processing and/or alarm processing based on the real-time transaction information.
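The preset condition is not specified in the text. A simple reading is a pair of score thresholds, as in the sketch below; the threshold values, the action names and the function name are hypothetical.

```python
def handle_prediction(risk_score, intervene_threshold=0.9, alert_threshold=0.6):
    """Map a risk score in [0, 1] to a list of actions.

    ASSUMPTION: the preset condition is a threshold on the score; the
    thresholds and action names are illustrative only.
    """
    if risk_score >= intervene_threshold:
        return ["intervene", "alert"]
    if risk_score >= alert_threshold:
        return ["alert"]
    return []


print(handle_prediction(0.95), handle_prediction(0.7), handle_prediction(0.2))
```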
In some embodiments, the method further includes: updating the training sample set based on the risk prediction result and the real-time transaction information; constructing a user transaction relationship graph based on the training sample set updated in real time, where the graph takes users as nodes and the transaction relationships between users as edges; mining gang nodes and/or gang transactions from the user transaction relationship graph using a clustering algorithm and/or a graph attention algorithm; identifying hidden fraud samples in the training sample set based on the gang nodes and/or gang transactions; and performing update training on the risk assessment model based on the hidden fraud samples fed back.
Specifically, the labels in the training sample set may be incomplete or inaccurate. Clustering and graph algorithms can therefore be applied, starting from the known black (fraud) samples, to uncover gang activity, that is, to find the black samples that were not labeled.
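The graph-mining idea can be sketched with connected components standing in for the clustering or graph attention algorithm, which the text does not detail. Users in a sufficiently large component that contains a known fraud user are returned as candidates for hidden fraud samples. The function names and the `min_size` parameter are assumptions.

```python
from collections import defaultdict, deque


def find_groups(edges, min_size=3):
    """Connected components of the user transaction graph with at least
    min_size users (a simple stand-in for gang mining)."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if len(component) >= min_size:
            groups.append(component)
    return groups


def hidden_fraud_candidates(edges, known_fraud, min_size=3):
    """Unlabeled users that share a group with a known fraud user."""
    candidates = set()
    for component in find_groups(edges, min_size):
        if component & known_fraud:
            candidates |= component - known_fraud
    return candidates


edges = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]
print(hidden_fraud_candidates(edges, known_fraud={"u1"}))
```

Candidates found this way would still need review before being fed back as fraud samples, because sharing a component with a fraud user is only indirect evidence.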
In the description of this specification, descriptions referring to the terms "some possible implementations", "some embodiments", "examples", "specific examples" or "some examples" mean that the specific features, structures, materials or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, such uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict each other.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. A feature defined as "first" or "second" may thus explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example two or three, unless otherwise specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention pertain.
The method flowcharts of the embodiments of the present application describe certain operations as distinct steps performed in a certain order. Such flowcharts are illustrative and not restrictive. Certain steps described herein may be grouped together and performed in a single operation, certain steps may be split into multiple sub-steps, and certain steps may be performed in an order different from that shown herein. The steps shown in the flowcharts may be implemented in any manner by any circuit structure and/or tangible mechanism (for example, by software running on a computer device, by hardware such as logic functions implemented by a processor or chip, and/or by any combination thereof).
Based on the same technical concept, an embodiment of the present invention further provides a training apparatus for an anti-fraud risk assessment model, configured to perform the training method for an anti-fraud risk assessment model provided by any of the above embodiments. FIG. 6 is a schematic structural diagram of a training apparatus for an anti-fraud risk assessment model provided by an embodiment of the present invention.
As shown in FIG. 6, the apparatus 600 includes:
an acquisition module 601, configured to acquire a training sample set, where a training sample includes multi-dimensional features and their fraud label, and the multi-dimensional features include user static features, user behavior features and device risk APP features;
a training module 602, configured to input the training sample set into the anti-fraud risk assessment model to be trained for iterative training;
where, in each iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network built on a self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
Based on the same technical concept, an embodiment of the present invention further provides an anti-fraud risk assessment apparatus, configured to perform the anti-fraud risk assessment method provided by any of the above embodiments. FIG. 7 is a schematic structural diagram of an anti-fraud risk assessment apparatus provided by an embodiment of the present invention.
an acquisition module 701, configured to acquire real-time transaction information, where the real-time transaction information includes user static features, user behavior features and device risk APP features;
an evaluation module 702, configured to input the real-time transaction information into the anti-fraud risk assessment model, where the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into a feature learning network built on an attention mechanism to obtain an encoding vector, and inputs the encoding vector into a deep network to obtain a risk prediction result, and where the anti-fraud risk assessment model is trained using the above training method.
It should be noted that the apparatus in the embodiments of the present application can implement each process of the foregoing method embodiments and achieve the same effects and functions, which are not repeated here.
FIG. 8 shows a training apparatus for an anti-fraud risk assessment model according to an embodiment of the present application, configured to perform the training method for the anti-fraud risk assessment model shown in FIG. 2. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method of the above embodiments.
FIG. 9 shows an anti-fraud risk assessment apparatus according to an embodiment of the present application, configured to perform the anti-fraud risk assessment method shown in FIG. 4. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method of the above embodiments.
According to some embodiments of the present application, a non-volatile computer storage medium for the training method of the anti-fraud risk assessment model and/or the anti-fraud risk assessment method is provided. Computer-executable instructions are stored on the medium and are configured to perform, when run by a processor, the method of the above embodiments.
The embodiments in the present application are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the descriptions of the apparatus, device and computer-readable storage medium embodiments are simplified because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding parts of the method embodiments.
The apparatus, device and computer-readable storage medium provided by the embodiments of the present application correspond one-to-one with the method. They therefore have beneficial technical effects similar to those of the corresponding method. Since the beneficial technical effects of the method have been described in detail above, those of the apparatus, device and computer-readable storage medium are not repeated here.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
The memory may include non-permanent memory in computer-readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
In addition, although the operations of the method of the present invention are described in the drawings in a specific order, this does not require or imply that the operations must be performed in that specific order, or that all of the operations shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Although the spirit and principles of the present invention have been described with reference to several specific embodiments, it should be understood that the present invention is not limited to the specific embodiments disclosed. The division into aspects does not mean that features in these aspects cannot be combined to advantage; the division is made only for convenience of presentation. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (17)

  1. A training method for an anti-fraud risk assessment model, comprising:
    acquiring a training sample set, wherein a training sample comprises multi-dimensional features and their fraud label, and the multi-dimensional features comprise user static features, user behavior features and device risk APP features;
    inputting the training sample set into the anti-fraud risk assessment model to be trained for iterative training;
    wherein, in each iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network built on a self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud label.
  2. The method according to claim 1, wherein a Transformer encoder is used as the feature learning network, and the Transformer encoder comprises a self-attention layer, a residual and normalization layer, a feedforward network layer, and a summation and normalization layer.
  3. The method according to claim 2, further comprising:
    acquiring usage timing information of the device risk APPs, and obtaining, based on the usage timing, the usage correlation between each risk APP used on the user device and the current fund-related APP;
    performing time-series encoding on the usage timing information using the position encoding mechanism of the Transformer encoder to obtain a time-series vector, and combining the time-series vector with the usage correlation corresponding to each risk APP to obtain a time-series strength vector;
    combining the time-series strength vector with the input vector corresponding to the device risk APP features, and inputting the result into the self-attention layer.
  4. The method according to claim 3, wherein performing time-series encoding on the usage timing information using the position encoding mechanism of the Transformer encoder further comprises:
    defining the time-series encoding rules using the following formulas:
    Figure PCTCN2022117419-appb-100001
    Figure PCTCN2022117419-appb-100002
    wherein TE(t,2i) is the 2i-th dimension of the time-series encoding vector for time t, TE(t,2i+1) is the (2i+1)-th dimension of the time-series encoding vector for time t, and d_model is the dimension of the time-series encoding vector.
  5. The method according to any one of claims 1-4, wherein the method further comprises:
    acquiring global risk APPs, and using the attribute information of each risk APP to obtain other associated and/or similar APPs so as to expand the global risk APPs;
    wherein the attribute information comprises one or more of the following: developer information, name information and APP introduction information.
  6. The method according to any one of claims 1-5, wherein acquiring the training data set further comprises:
    通过埋点方式收集所述用户交易行为信息,所述用户交易行为数据包括:交易地点IP、交易对手方信息;Collect the user’s transaction behavior information by way of burying points, and the user’s transaction behavior data includes: transaction location IP, transaction counterparty information;
    周期性收集用户设备的APP使用信息,根据全局风险APP确定所述用户设备使用的风险APP,得到所述设备风险APP特征。Periodically collect APP usage information of the user equipment, determine the risk APP used by the user equipment according to the global risk APP, and obtain the characteristics of the equipment risk APP.
  7. The method according to any one of claims 1-6, wherein the multi-dimensional features further include a text feature, and the text feature includes transaction message information.
  8. The method according to any one of claims 1-7, wherein the deep network uses random forest or XGB from machine learning.
  9. The method according to any one of claims 1-8, wherein the loss function includes a transaction amount weight factor.
  10. An anti-fraud risk assessment method, comprising:
    acquiring real-time transaction information, the real-time transaction information including one or more of: user static features, user behavior features, and device risk APP features;
    inputting the real-time transaction information into an anti-fraud risk assessment model, wherein the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into a feature learning network constructed based on an attention mechanism to obtain an encoding vector, and inputs the encoding vector into a deep network to obtain a risk prediction result;
    wherein the anti-fraud risk assessment model is trained using the method according to any one of claims 1-9.
  11. The method according to claim 10, further comprising:
    if the risk prediction result meets a preset condition, performing corresponding interference processing and/or alarm processing based on the real-time transaction information.
  12. The method according to claim 10 or 11, further comprising:
    updating the training sample set based on the risk prediction result and the real-time transaction information;
    constructing a user relationship graph based on the training sample set updated in real time, the user transaction relationship graph taking users as nodes and transaction relationships between users as edges;
    mining gang nodes and/or gang transactions from the user transaction relationship graph using a clustering algorithm and/or a graph attention algorithm;
    identifying hidden fraud samples from the training sample set based on the gang nodes and/or the gang transactions;
    performing update training on the risk assessment prediction model based on the fed-back hidden fraud samples.
  13. A training apparatus for an anti-fraud risk assessment model, comprising:
    an acquisition module, configured to acquire a training sample set, the training samples including multi-dimensional features and their fraud labels, the multi-dimensional features including: user static features, user behavior features, and device risk APP features;
    a training module, configured to input the training sample set into the anti-fraud risk assessment model to be trained for iterative training;
    wherein, in each round of iteration, the anti-fraud risk assessment model performs embedding processing on the input multi-dimensional features to obtain an input vector, inputs the input vector into a feature learning network constructed based on a self-attention mechanism to obtain a weighted and fused encoding vector, inputs the encoding vector into a deep network to obtain a risk prediction result, and updates the parameters of the risk assessment model using a loss function constructed from the risk prediction result and the fraud labels.
  14. An anti-fraud risk assessment apparatus, comprising:
    an acquisition module, configured to acquire real-time transaction information, the real-time transaction information including: user static features, user behavior features, and device risk APP features;
    an assessment module, configured to input the real-time transaction information into an anti-fraud risk assessment model, wherein the anti-fraud risk assessment model performs embedding processing on the input real-time transaction information to obtain an input vector, inputs the input vector into a feature learning network constructed based on an attention mechanism to obtain an encoding vector, and inputs the encoding vector into a deep network to obtain a risk prediction result; wherein the anti-fraud risk assessment model is trained using the method according to any one of claims 1-9.
  15. A training apparatus for an anti-fraud risk assessment model, comprising:
    at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 1-9.
  16. An anti-fraud risk assessment apparatus, comprising:
    at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 10-12.
  17. A computer-readable storage medium storing a program which, when executed by a multi-core processor, causes the multi-core processor to perform the method according to any one of claims 1-9 or the method according to any one of claims 10-12.
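
The formula images referenced in claim 4 are not reproduced in this text. As a minimal sketch only, assuming the time-series encoding rule follows the standard sinusoidal positional encoding of the Transformer (sine on the even dimensions, cosine on the odd dimensions, with base 10000), the vector TE(t, ·) could be computed as below. The function name `time_encoding` and the constant 10000 are assumptions for illustration and are not taken from the claims.

```python
import math


def time_encoding(t: int, d_model: int) -> list[float]:
    """Return a d_model-dimensional encoding vector for time step t.

    Assumed rule (standard Transformer sinusoidal encoding):
        TE(t, 2i)   = sin(t / 10000^(2i / d_model))
        TE(t, 2i+1) = cos(t / 10000^(2i / d_model))
    """
    if d_model <= 0 or d_model % 2 != 0:
        raise ValueError("d_model must be a positive even integer")
    vec = [0.0] * d_model
    for i in range(d_model // 2):
        # The base 10000 is an assumption; the claims only define t, i and d_model.
        angle = t / (10000 ** (2 * i / d_model))
        vec[2 * i] = math.sin(angle)      # even dimension: TE(t, 2i)
        vec[2 * i + 1] = math.cos(angle)  # odd dimension: TE(t, 2i+1)
    return vec


if __name__ == "__main__":
    # Encode the first three time steps with a 4-dimensional vector.
    for step in range(3):
        print(step, [round(x, 4) for x in time_encoding(step, 4)])
```

Under this assumption every time step maps to a fixed-length vector with bounded values. How the resulting vector is combined with the usage correlation of each risk APP (claim 3) is not specified in the claims and is therefore not sketched here.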
PCT/CN2022/117419 2021-12-29 2022-09-07 Anti-fraud risk assessment method and apparatus, training method and apparatus, and readable storage medium WO2023124204A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111640205.2A CN114298417A (en) 2021-12-29 2021-12-29 Anti-fraud risk assessment method, anti-fraud risk training method, anti-fraud risk assessment device, anti-fraud risk training device and readable storage medium
CN202111640205.2 2021-12-29

Publications (1)

Publication Number Publication Date
WO2023124204A1 (en) 2023-07-06

Family

ID=80972114

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/117419 WO2023124204A1 (en) 2021-12-29 2022-09-07 Anti-fraud risk assessment method and apparatus, training method and apparatus, and readable storage medium

Country Status (3)

Country Link
CN (1) CN114298417A (en)
TW (1) TW202326537A (en)
WO (1) WO2023124204A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934098A (en) * 2023-09-14 2023-10-24 山东省标准化研究院(Wto/Tbt山东咨询工作站) Risk quantitative evaluation method for technical trade measures
CN117113148A (en) * 2023-08-30 2023-11-24 上海智租物联科技有限公司 Risk identification method, device and storage medium based on time sequence diagram neural network
CN117151851A (en) * 2023-09-12 2023-12-01 浪潮数字(山东)建设运营有限公司 Bank risk prediction method and device based on genetic algorithm and electronic equipment
CN117435918A (en) * 2023-12-20 2024-01-23 杭州市特种设备检测研究院(杭州市特种设备应急处置中心) Elevator risk early warning method based on spatial attention network and feature division
CN117556224A (en) * 2024-01-12 2024-02-13 国网四川省电力公司电力科学研究院 Grid facility anti-seismic risk assessment system, method and storage medium
CN117151851B (en) * 2023-09-12 2024-04-30 浪潮数字(山东)建设运营有限公司 Bank risk prediction method and device based on genetic algorithm and electronic equipment

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114298417A (en) * 2021-12-29 2022-04-08 中国银联股份有限公司 Anti-fraud risk assessment method, anti-fraud risk training method, anti-fraud risk assessment device, anti-fraud risk training device and readable storage medium
CN114757581A (en) * 2022-05-18 2022-07-15 华南理工大学 Financial transaction risk assessment method and device, electronic equipment and computer readable medium
CN114936723B (en) * 2022-07-21 2023-04-14 中国电子科技集团公司第三十研究所 Social network user attribute prediction method and system based on data enhancement
CN116258579B (en) * 2023-04-28 2023-08-04 成都新希望金融信息有限公司 Training method of user credit scoring model and user credit scoring method
CN116562901B (en) * 2023-06-25 2024-04-02 福建润楼数字科技有限公司 Automatic generation method of anti-fraud rule based on machine learning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180253737A1 (en) * 2017-03-06 2018-09-06 International Business Machines Corporation Dynamically Evaluating Fraud Risk
CN109978538A (en) * 2017-12-28 2019-07-05 阿里巴巴集团控股有限公司 Determine fraudulent user, training pattern, the method and device for identifying risk of fraud
CN112348520A (en) * 2020-10-21 2021-02-09 上海淇玥信息技术有限公司 XGboost-based risk assessment method and device and electronic equipment
CN112365338A (en) * 2020-11-11 2021-02-12 平安普惠企业管理有限公司 Artificial intelligence-based data fraud detection method, device, terminal and medium
US20210374756A1 (en) * 2020-05-29 2021-12-02 Mastercard International Incorporated Methods and systems for generating rules for unseen fraud and credit risks using artificial intelligence
CN114298417A (en) * 2021-12-29 2022-04-08 中国银联股份有限公司 Anti-fraud risk assessment method, anti-fraud risk training method, anti-fraud risk assessment device, anti-fraud risk training device and readable storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180253737A1 (en) * 2017-03-06 2018-09-06 International Business Machines Corporation Dynamically Evaluating Fraud Risk
CN109978538A (en) * 2017-12-28 2019-07-05 阿里巴巴集团控股有限公司 Determine fraudulent user, training pattern, the method and device for identifying risk of fraud
US20210374756A1 (en) * 2020-05-29 2021-12-02 Mastercard International Incorporated Methods and systems for generating rules for unseen fraud and credit risks using artificial intelligence
CN112348520A (en) * 2020-10-21 2021-02-09 上海淇玥信息技术有限公司 XGboost-based risk assessment method and device and electronic equipment
CN112365338A (en) * 2020-11-11 2021-02-12 平安普惠企业管理有限公司 Artificial intelligence-based data fraud detection method, device, terminal and medium
CN114298417A (en) * 2021-12-29 2022-04-08 中国银联股份有限公司 Anti-fraud risk assessment method, anti-fraud risk training method, anti-fraud risk assessment device, anti-fraud risk training device and readable storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117113148A (en) * 2023-08-30 2023-11-24 上海智租物联科技有限公司 Risk identification method, device and storage medium based on time sequence diagram neural network
CN117151851A (en) * 2023-09-12 2023-12-01 浪潮数字(山东)建设运营有限公司 Bank risk prediction method and device based on genetic algorithm and electronic equipment
CN117151851B (en) * 2023-09-12 2024-04-30 浪潮数字(山东)建设运营有限公司 Bank risk prediction method and device based on genetic algorithm and electronic equipment
CN116934098A (en) * 2023-09-14 2023-10-24 山东省标准化研究院(Wto/Tbt山东咨询工作站) Risk quantitative evaluation method for technical trade measures
CN117435918A (en) * 2023-12-20 2024-01-23 杭州市特种设备检测研究院(杭州市特种设备应急处置中心) Elevator risk early warning method based on spatial attention network and feature division
CN117435918B (en) * 2023-12-20 2024-03-15 杭州市特种设备检测研究院(杭州市特种设备应急处置中心) Elevator risk early warning method based on spatial attention network and feature division
CN117556224A (en) * 2024-01-12 2024-02-13 国网四川省电力公司电力科学研究院 Grid facility anti-seismic risk assessment system, method and storage medium
CN117556224B (en) * 2024-01-12 2024-03-22 国网四川省电力公司电力科学研究院 Grid facility anti-seismic risk assessment system, method and storage medium

Also Published As

Publication number Publication date
TW202326537A (en) 2023-07-01
CN114298417A (en) 2022-04-08

Similar Documents

Publication Publication Date Title
WO2023124204A1 (en) Anti-fraud risk assessment method and apparatus, training method and apparatus, and readable storage medium
US11190562B2 (en) Generic event stream processing for machine learning
Wang et al. LightLog: A lightweight temporal convolutional network for log anomaly detection on the edge
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
US10883345B2 (en) Processing of computer log messages for visualization and retrieval
Olmezogullari et al. Representation of click-stream datasequences for learning user navigational behavior by using embeddings
CN111612041A (en) Abnormal user identification method and device, storage medium and electronic equipment
Dey et al. Representation of developer expertise in open source software
CN108959305A (en) A kind of event extraction method and system based on internet big data
CN111737586B (en) Information recommendation method, device, equipment and computer readable storage medium
CN110516210B (en) Text similarity calculation method and device
CN110851761A (en) Infringement detection method, device and equipment based on block chain and storage medium
Lv et al. Computational intelligence in security of digital twins big graphic data in cyber-physical systems of smart cities
CN113986674A (en) Method and device for detecting abnormity of time sequence data and electronic equipment
CN113128196A (en) Text information processing method and device, storage medium
CN114090769A (en) Entity mining method, entity mining device, computer equipment and storage medium
CN113110843A (en) Contract generation model training method, contract generation method and electronic equipment
US20160004976A1 (en) System and methods for abductive learning of quantized stochastic processes
Atzberger et al. Large-Scale Evaluation of Topic Models and Dimensionality Reduction Methods for 2D Text Spatialization
CN113627514A (en) Data processing method and device of knowledge graph, electronic equipment and storage medium
Lee et al. Comparison between various multiple linear regression model for prediction of TBM performance
Bond et al. An unsupervised machine learning approach for ground‐motion spectra clustering and selection
CN112685574B (en) Method and device for determining hierarchical relationship of domain terms
Kumar et al. Arsknn: an efficient k-nearest neighbor classification technique using mass based similarity measure
CN115114627B (en) Malicious software detection method and device

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application
Ref document number: 22913478
Country of ref document: EP
Kind code of ref document: A1
Kind code of ref document: A1