WO2018166457A1 - Methods and apparatuses for neural network model training and transaction behavior risk identification - Google Patents

Methods and apparatuses for neural network model training and transaction behavior risk identification

Info

Publication number
WO2018166457A1
Authority
WO
WIPO (PCT)
Prior art keywords
gbdt
sample
sample data
path information
decision tree
Prior art date
Application number
PCT/CN2018/078906
Other languages
English (en)
French (fr)
Inventor
李龙飞
周俊
李小龙
Original Assignee
阿里巴巴集团控股有限公司
李龙飞
周俊
李小龙
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 李龙飞, 周俊, 李小龙
Publication of WO2018166457A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Definitions

  • The present application relates to the field of computer technology, and in particular to methods and apparatuses for neural network model training and for transaction behavior risk identification.
  • In conventional techniques, once sample data has been collected, the neural network model is trained directly on the sample data and its sample labels.
  • However, the collected sample data usually contains information in many dimensions, which makes neural network model training inefficient.
  • The present application describes methods and apparatuses for neural network model training and transaction behavior risk identification that can improve the efficiency of neural network model training.
  • In a first aspect, a neural network model training method is provided, including:
  • inputting a plurality of pre-collected sample data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT, where each sample data has a corresponding sample label; and
  • training the neural network model according to the path information corresponding to each sample data in the GBDT and the sample label.
  • In a second aspect, a transaction behavior risk identification method is provided, including: acquiring transaction behavior data of a user; inputting the transaction behavior data into a GBDT to determine the path information corresponding to the transaction behavior data in the GBDT; inputting the path information into a neural network model; and outputting a transaction behavior risk identification result.
  • In a third aspect, a neural network model training apparatus is provided, including:
  • a determining unit configured to input a plurality of pre-collected sample data into the GBDT to determine the path information corresponding to each sample data in the GBDT, where each sample data has a corresponding sample label; and
  • a training unit configured to train the neural network model according to the path information, determined by the determining unit, corresponding to each sample data in the GBDT and the sample label.
  • In a fourth aspect, a transaction behavior risk identification apparatus is provided, including:
  • an obtaining unit configured to acquire transaction behavior data of a user;
  • a determining unit configured to input the transaction behavior data acquired by the obtaining unit into the GBDT to determine the path information corresponding to the transaction behavior data in the GBDT;
  • an input unit configured to input the path information determined by the determining unit into a neural network model; and
  • an output unit configured to output a transaction behavior risk identification result.
  • In the methods and apparatuses provided by the present application, a plurality of pre-collected sample data is input into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT.
  • The neural network model is then trained according to the path information and the sample label corresponding to each sample data. That is, the application first determines the path information with the GBDT and then trains the neural network model on the path information and the sample labels. Because of the nature of a GBDT, one piece of path information usually covers several dimensions of the sample data, so the efficiency of neural network model training can be improved.
  • FIG. 1 is a flowchart of a neural network model training method according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of a decision tree provided by the present application.
  • FIG. 3 is a schematic diagram of a process of training a DNN provided by the present application.
  • FIG. 4 is a schematic diagram of a transaction behavior risk identification method provided by the present application.
  • FIG. 5 is a schematic diagram of a neural network model training apparatus according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a transaction behavior risk identification apparatus according to another embodiment of the present application.
  • the neural network model training method provided by the embodiment of the present application is applicable to a scenario of training a neural network model such as a deep neural network (DNN) or an artificial neural network (ANN).
  • a well-trained neural network model can be used for pattern recognition and classification scenarios, for example, for risk identification of trading behavior.
  • FIG. 1 is a flowchart of a neural network model training method according to an embodiment of the present application.
  • The method may be executed by any device with processing capability, such as a server, a system, or an apparatus. As shown in FIG. 1, the method specifically includes:
  • Step 110: Input a plurality of pre-collected sample data into a Gradient Boosting Decision Tree (GBDT) to determine the path information corresponding to each sample data in the GBDT.
  • Before step 110 is performed, the GBDT model can be trained first. The specific training process is described later.
  • Taking the scenario in which the trained neural network model is used for transaction behavior risk identification as an example, the sample data may be transaction behavior data of users.
  • Specifically, the sample data may be collected from a back-end database of the Alipay system.
  • The sample data can belong to the following five categories of user data: 1) Historical behavior information of the user. For example: a, the number of incoming calls from the user within a number of days (e.g., 180 days); b, the city of the last login; c, the time elapsed since the last login; d, the number of logins within a number of days (e.g., 90 days).
  • 2) Transaction information of the user. For example: a, the average payment amount over a number of days (e.g., 90 days); b, the number of days on which payments were made within a number of days (e.g., 180 days); c, the amount paid within a number of days (e.g., 180 days); d, the time elapsed since the last payment.
  • 3) Basic information of the user. For example: a, whether the user is single; b, whether the user is renovating a home; c, whether the user is married; d, the age of the user; e, how long the user has been registered; f, the education level of the user.
  • 4) Remote Procedure Call (RPC) behavior information of the user. RPC behavior information here refers to the RPC calls between the client and the server while the user uses the client. In one implementation, these operations can be collected for each user over a recent given time window. For example, variables counting the RPC interfaces the user accessed in the past 2 days can be collected.
  • 5) Uniform Resource Locator (URL) address information of the user.
  • For the collected sample data, if a sample is not related to the current user or can negatively affect the user, it is classified as positive sample data. For example, if a transaction was not performed by the user in person, or caused a loss to the user's account and was reported, the transaction behavior data is marked as positive sample data. Otherwise, if the sample is the user's own normal transaction behavior data, it is marked as negative sample data.
  • Negative sample data is usually easier to collect. For example, data on normal payment behavior is easy to collect from the back-end database of the Alipay system. Negative sample data therefore makes up the vast majority of the sample data set, for example more than 99.999%.
  • However, when the proportion of negative sample data is that high, the trained neural network model tends to be biased. For example, it may identify only safe transaction behavior and fail to identify risky transaction behavior, which reduces the accuracy of transaction behavior risk identification.
  • To improve the accuracy of transaction behavior risk identification, the sample data can be preprocessed.
  • In one implementation, the positive sample data may be upsampled, and/or the negative sample data may be downsampled.
  • Upsampling the positive sample data may include increasing the amount of positive sample data, for example by copying.
  • Downsampling the negative sample data may include reducing the amount of negative sample data, for example by deleting.
  • In one example, the ratio of positive to negative sample data can be adjusted to 1:300.
  • Corresponding sample labels may also be added to the preprocessed sample data. Specifically, a positive sample label is added to positive sample data and a negative sample label is added to negative sample data.
  • In step 110, inputting the pre-collected sample data into the GBDT may include: for each sample data, first determining the feature values of a plurality of features from the sample data, and then inputting the feature values into the decision trees of the GBDT.
  • The features can belong to several categories. In one implementation, some of the features may reuse model variables that an existing transaction behavior risk identification model has accumulated online. These model variables belong to the following three categories: 1) historical behavior information of the user; 2) transaction information of the user; 3) basic information of the user.
  • However, these model variables are derived from business data, which usually comes from different business departments and takes time to collect and organize. The model variables alone therefore cannot capture the user's latest state, so risk identification cannot be performed on the user's latest transaction behavior.
  • To solve this problem, the present application adds features belonging to the user's RPC behavior information and features belonging to the user's URL address information.
  • In summary, the features of the present application can belong to the following five categories: 1) historical behavior information of the user; 2) transaction information of the user; 3) basic information of the user; 4) RPC behavior information of the user; 5) URL address information of the user. Each category is as described above and is not repeated here.
  • After the feature values of these features are determined from a specific sample data, they can be input into the GBDT.
  • The GBDT here can be composed of multiple decision trees. Each decision tree includes multiple nodes, and each node corresponds to one feature. Taking one decision tree as an example, the tree can be as shown in FIG. 2, where node 1, node 2, and node 3 correspond to the features "whether the user's gender is male", "whether the user is older than 20", and "whether the transaction amount exceeds 1,000 yuan", respectively.
  • After the feature values are input into the decision tree, path information can be determined in the decision tree. For example, if the sample data indicates that the user's gender is male, the user is older than 20, and the transaction amount exceeds 1,000 yuan, the determined path information may be as shown by the thick line in FIG. 2. FIG. 2 shows only one piece of path information; when sample data is input into the GBDT, multiple pieces of path information can be determined.
  • Before a feature value is input into the GBDT, it may also be represented as a feature vector in one-hot form.
  • In that case, inputting the feature value into the GBDT may be replaced by inputting the corresponding feature vector into the decision tree to determine the path information.
  • Taking the feature "user gender" as an example: if the user's gender is male, that is, the feature value is "male", the corresponding feature vector may be [0 1]. If the user's gender is female, that is, the feature value is "female", the corresponding feature vector may be [1 0].
  • Taking the user's RPC behavior information as another example, the feature vector can be determined in either of two ways. In the first implementation, the rule is: 1 if the behavior appeared, otherwise 0. Specifically, suppose the preset RPC behaviors are a, b, and c, and a sample data contains the user's RPC behaviors within two days a, a, and b, that is, the feature values are a, a, and b. The corresponding feature vector is then [1 1 0]. In the other implementation, the rule is: count the frequency of each preset RPC behavior and then normalize.
  • Specifically, suppose the preset RPC behaviors are a, b, and c.
  • Suppose a sample data contains the user's RPC behaviors within two days a, a, b, b, and c, that is, the feature values are a, a, b, b, and c.
  • The counts are then 2, 2, and 1, and after normalization the final feature vector is [0.4 0.4 0.2].
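  • To improve the accuracy of the neural network model, the present application defines a relatively large number of features, and therefore many feature values. Processing them takes a lot of time, and because a person can observe only a limited number of feature values at once, it is hard to analyze the relationships among them in depth and hand-craft new feature values. The present application instead obtains path information by inputting the sample data into the GBDT. Because one piece of path information covers multiple feature values, the number of feature values can be greatly reduced, which significantly reduces manual work.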
  • Step 120: Train the neural network model according to the path information corresponding to each sample data and the sample label.
  • The neural network model here may be a DNN, an ANN, or the like.
  • DNNs have developed rapidly in recent years. Compared with traditionally used shallow models such as logistic regression (LR) and random forest (RF), a DNN has distinctive strengths: strong model expressiveness and suitability for big data and distributed training. Therefore, this specification takes training a DNN as the example.
  • The training process of the DNN can be as shown in FIG. 3.
  • The input layer of the DNN receives the pieces of path information from the GBDT, and the output layer outputs the first prediction result.
  • For each sample data, after the corresponding path information is input into the DNN, the DNN outputs a corresponding first prediction result.
  • For the sample data in the sample set, if the probability that the first prediction result matches the sample label reaches a preset threshold, the DNN can be considered optimized. The preset threshold may be set from empirical values.
  • The number of layers of the DNN in FIG. 3 can be changed depending on the number of pieces of path information.
  • According to the applicant's experiments, the neural network model trained in this application performs better than the other models (LR or RF). The time spent on feature processing is also greatly reduced, and the overall modeling process becomes much faster.
  • The GBDT model is trained as follows. After the feature values of the plurality of features are determined from each sample data, they may be input into each decision tree of the GBDT.
  • The conclusions of the individual decision trees are then summed to determine the second prediction result. For each sample data, the GBDT model outputs a corresponding second prediction result. For the sample data in the sample set, if the probability that the second prediction result matches the sample label reaches a preset threshold, the GBDT model can be considered optimized. The preset threshold may be set from empirical values.
  • If that probability does not reach the preset threshold, the number of decision trees, the depth of the decision trees, and the regularization term (used to represent features) can be adjusted, and the input and output operations above repeated until the preset threshold is reached.
  • In summary, the present application has the following advantages. 1) Because the features include features in the user RPC behavior information category, the trained neural network model can meet timeliness requirements, that is, it can identify the user's latest transaction behavior.
  • 2) The neural network model trained in the present application is more accurate than traditional shallow models.
  • 3) Path information is obtained by inputting the sample data into the GBDT. One piece of path information is composed of multiple feature values, that is, it covers multiple dimensions of the sample data. This greatly reduces the amount of data fed to the DNN input layer and thereby improves the efficiency of neural network model training.
  • After the neural network model has been trained through the steps shown in FIG. 1, it can be deployed online to perform risk identification on users' transaction behavior.
  • FIG. 4 is a schematic diagram of the process of the transaction behavior risk identification method provided by the present application. As shown in FIG. 4, the method may include:
  • Step 410: Acquire transaction behavior data of the user.
  • The transaction behavior data here has the same definition as the sample data above and is not described again.
  • Step 420: Input the transaction behavior data into the gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT.
  • The GBDT is composed of a plurality of decision trees, each decision tree includes a plurality of nodes, and each node corresponds to one feature.
  • Step 420 may specifically include: determining the feature values of a plurality of features from the transaction behavior data; and determining the path information in the decision trees according to the feature values.
  • For the process of determining the path information, refer to FIG. 2; details are not repeated here.
  • Step 430: Input the path information into the neural network model.
  • That is, the path information determined in step 420 is input into the input layer of the DNN.
  • Step 440: Output a transaction behavior risk identification result.
  • Specifically, the transaction behavior risk identification result is output by the output layer of the DNN.
  • If the identification result indicates risky transaction behavior, an alarm can be raised.
  • In a payment scenario, if the identification result indicates risky payment behavior, the user account can be frozen to prevent loss of property.
  • Corresponding to the neural network model training method above, an embodiment of the present application further provides a neural network model training apparatus. As shown in FIG. 5, the apparatus includes:
  • a determining unit 501, configured to input the plurality of pre-collected sample data into the gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT.
  • Here, each sample data has a corresponding sample label.
  • a training unit 502, configured to train the neural network model according to the path information, determined by the determining unit 501, corresponding to each sample data in the GBDT and the sample label.
  • Optionally, the GBDT is composed of a plurality of decision trees, each decision tree includes a plurality of nodes, and each node corresponds to one feature.
  • The determining unit 501 is specifically configured to:
  • for each of the plurality of sample data, determine the feature values of a plurality of features from the sample data.
  • Here, the features may include remote procedure call (RPC) behavior information of the user and/or uniform resource locator (URL) address information of the user.
  • determine the path information in the decision trees according to the feature values.
  • Optionally, the sample labels may include a positive sample label and a negative sample label.
  • The apparatus may further include:
  • a processing unit 503, configured to upsample the sample data whose sample label is a positive sample label; and/or,
  • downsample the sample data whose sample label is a negative sample label.
  • In the neural network model training apparatus provided by the present application, the determining unit 501 inputs the plurality of pre-collected sample data into the gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT.
  • The training unit 502 trains the neural network model according to the path information corresponding to each sample data in the GBDT and the sample label. The efficiency of neural network model training can thereby be improved.
  • Corresponding to the transaction behavior risk identification method above, an embodiment of the present application further provides a transaction behavior risk identification apparatus. As shown in FIG. 6, the apparatus includes:
  • an obtaining unit 601, configured to acquire transaction behavior data of the user.
  • a determining unit 602, configured to input the transaction behavior data acquired by the obtaining unit 601 into the gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT.
  • an input unit 603, configured to input the path information determined by the determining unit 602 into the neural network model.
  • an output unit 604, configured to output a transaction behavior risk identification result.
  • Optionally, the GBDT is composed of multiple decision trees, each decision tree includes multiple nodes, and each node corresponds to one feature.
  • The determining unit 602 is specifically configured to:
  • determine the feature values of a plurality of features from the transaction behavior data.
  • determine the path information in the decision trees according to the feature values.
  • The features may include remote procedure call (RPC) behavior information of the user and/or uniform resource locator (URL) address information of the user.
  • The transaction behavior risk identification apparatus provided by the present application can improve the efficiency and accuracy of transaction behavior risk identification.
  • The functions described herein can be implemented in hardware, software, firmware, or any combination thereof.
  • When implemented in software, the functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

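Methods and apparatuses for neural network model training and transaction behavior risk identification. The neural network model training method includes: inputting a plurality of pre-collected sample data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT (S110); and training a neural network model according to the path information corresponding to each sample data in the GBDT and the sample label (S120). The method first determines the path information with the GBDT and then trains the neural network model on the path information and the sample labels. Because of the nature of a GBDT, one piece of path information usually covers several dimensions of the sample data, so the efficiency of neural network model training can be improved.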

Description

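Methods and apparatuses for neural network model training and transaction behavior risk identification
Technical Field
The present application relates to the field of computer technology, and in particular to methods and apparatuses for neural network model training and transaction behavior risk identification.
Background
In conventional techniques, after sample data is collected, a neural network model is trained directly on the sample data and the sample labels of the sample data. However, the collected sample data usually includes information in multiple dimensions, which makes neural network model training relatively inefficient.
Summary
The present application describes methods and apparatuses for neural network model training and transaction behavior risk identification that can improve the efficiency of neural network model training.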
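In a first aspect, a neural network model training method is provided, including:
inputting a plurality of pre-collected sample data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT, where each sample data has a corresponding sample label;
training a neural network model according to the path information corresponding to each sample data in the GBDT and the sample label.
In a second aspect, a transaction behavior risk identification method is provided, including:
acquiring transaction behavior data of a user;
inputting the transaction behavior data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT;
inputting the path information into a neural network model;
outputting a transaction behavior risk identification result.
In a third aspect, a neural network model training apparatus is provided, including:
a determining unit, configured to input a plurality of pre-collected sample data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT, where each sample data has a corresponding sample label;
a training unit, configured to train a neural network model according to the path information, determined by the determining unit, corresponding to each sample data in the GBDT and the sample label.
In a fourth aspect, a transaction behavior risk identification apparatus is provided, including:
an obtaining unit, configured to acquire transaction behavior data of a user;
a determining unit, configured to input the transaction behavior data acquired by the obtaining unit into a gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT;
an input unit, configured to input the path information determined by the determining unit into a neural network model;
an output unit, configured to output a transaction behavior risk identification result.
In the methods and apparatuses provided by the present application, a plurality of pre-collected sample data is input into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT, and the neural network model is trained according to that path information and the sample labels. That is, the application first determines the path information with the GBDT and then trains the neural network model on the path information and the sample labels. Because of the nature of a GBDT, one piece of path information usually covers several dimensions of the sample data, so the efficiency of neural network model training can be improved.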
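Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a neural network model training method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a decision tree provided by the present application;
FIG. 3 is a schematic diagram of the process of training a DNN provided by the present application;
FIG. 4 is a schematic diagram of a transaction behavior risk identification method provided by the present application;
FIG. 5 is a schematic diagram of a neural network model training apparatus according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a transaction behavior risk identification apparatus according to another embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the drawings.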
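The neural network model training method provided by the embodiments of the present application is applicable to training neural network models such as a deep neural network (DNN) or an artificial neural network (ANN). A trained neural network model can be used for pattern recognition and classification, for example for risk identification of transaction behavior.
FIG. 1 is a flowchart of a neural network model training method according to an embodiment of the present application. The method may be executed by any device with processing capability, such as a server, a system, or an apparatus. As shown in FIG. 1, the method specifically includes:
Step 110: Input a plurality of pre-collected sample data into a Gradient Boosting Decision Tree (GBDT) to determine the path information corresponding to each sample data in the GBDT.
Before step 110 is performed, the GBDT model can be trained first. The specific training process is described later.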
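In step 110, taking the scenario in which the trained neural network model is used for transaction behavior risk identification as an example, the sample data may be transaction behavior data of users. Specifically, the sample data may be collected from a back-end database of the Alipay system. The sample data can belong to the following five categories of user data: 1) Historical behavior information of the user. For example: a, the number of incoming calls from the user within a number of days (e.g., 180 days); b, the city of the last login; c, the time elapsed since the last login; d, the number of logins within a number of days (e.g., 90 days). 2) Transaction information of the user. For example: a, the average payment amount over a number of days (e.g., 90 days); b, the number of days on which payments were made within a number of days (e.g., 180 days); c, the amount paid within a number of days (e.g., 180 days); d, the time elapsed since the last payment. 3) Basic information of the user. For example: a, whether the user is single; b, whether the user is renovating a home; c, whether the user is married; d, the age of the user; e, how long the user has been registered; f, the education level of the user. 4) Remote Procedure Call (RPC) behavior information of the user. RPC behavior information here refers to the RPC calls between the client and the server while the user uses the client. In one implementation, these operations can be collected for each user over a recent given time window. For example, variables counting the RPC interfaces the user accessed in the past 2 days can be collected. 5) Uniform Resource Locator (URL) address information of the user.
For the collected sample data, if a sample is not related to the current user or can negatively affect the user, it is classified as positive sample data. For example, if a transaction was not performed by the user in person, or caused a loss to the user's account and was reported, the transaction behavior data is marked as positive sample data. Otherwise, if the sample is the user's own normal transaction behavior data, it is marked as negative sample data.
Note that negative sample data is usually easier to collect. For example, data on normal payment behavior is easy to collect from the back-end database of the Alipay system. Negative sample data therefore makes up the vast majority of the sample data set, for example more than 99.999%. However, when the proportion of negative sample data is that high, the trained neural network model tends to be biased. For example, it may identify only safe transaction behavior and fail to identify risky transaction behavior, which reduces the accuracy of transaction behavior risk identification.
To improve the accuracy of transaction behavior risk identification, the sample data can be preprocessed. In one implementation, the positive sample data may be upsampled, and/or the negative sample data may be downsampled. Upsampling the positive sample data may include increasing the amount of positive sample data, for example by copying. Downsampling the negative sample data may include reducing the amount of negative sample data, for example by deleting. In one example, the ratio of positive to negative sample data can be adjusted to 1:300.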
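The resampling step can be illustrated with a minimal sketch. It is not the applicant's implementation: the function name `rebalance`, the `mode` switch, and the use of uniform random selection are assumptions made for illustration; only the ideas of copying positives, deleting negatives, and the 1:300 target come from the text.

```python
import math
import random


def rebalance(positives, negatives, neg_per_pos=300, mode="down", seed=0):
    """Return (positives, negatives) with at most neg_per_pos negatives per positive.

    mode="down" deletes random negatives; any other mode copies random positives.
    """
    rng = random.Random(seed)
    pos, neg = list(positives), list(negatives)
    if not pos or len(neg) <= len(pos) * neg_per_pos:
        return pos, neg  # nothing to do: ratio is already at or below the target
    if mode == "down":
        # Downsampling: keep a random subset of the negative samples.
        neg = rng.sample(neg, len(pos) * neg_per_pos)
    else:
        # Upsampling: duplicate randomly chosen positive samples.
        originals = tuple(pos)
        target_pos = math.ceil(len(neg) / neg_per_pos)
        pos += [rng.choice(originals) for _ in range(target_pos - len(pos))]
    return pos, neg
```

Downsampling discards data but keeps the set small, while upsampling keeps every negative sample at the cost of repeated positives; the text allows either or both.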
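Note also that corresponding sample labels may be added to the preprocessed positive and negative sample data. Specifically, a positive sample label is added to positive sample data and a negative sample label is added to negative sample data.
In step 110, inputting the pre-collected sample data into the GBDT may specifically include: for each sample data, first determining the feature values of a plurality of features from the sample data, and then inputting the feature values into the decision trees of the GBDT.
The features here can belong to several categories. In one implementation, some of the features may reuse model variables that an existing transaction behavior risk identification model has accumulated online. These model variables belong to the following three categories: 1) historical behavior information of the user; 2) transaction information of the user; 3) basic information of the user.
However, these model variables are derived from business data, which usually comes from different business departments and takes time to collect and organize. The model variables alone therefore cannot capture the user's latest state, so risk identification cannot be performed on the user's latest transaction behavior. To solve this problem, the present application adds features belonging to the user's RPC behavior information and features belonging to the user's URL address information.
In summary, the features of the present application can belong to the following five categories: 1) historical behavior information of the user; 2) transaction information of the user; 3) basic information of the user; 4) RPC behavior information of the user; 5) URL address information of the user. Each category is as described above and is not repeated here.
After the feature values of these features are determined from a specific sample data, they can be input into the GBDT. The GBDT here can be composed of multiple decision trees, each decision tree includes multiple nodes, and each node corresponds to one feature. Taking one decision tree as an example, the tree can be as shown in FIG. 2, where node 1, node 2, and node 3 correspond to the features "whether the user's gender is male", "whether the user is older than 20", and "whether the transaction amount exceeds 1,000 yuan", respectively. After the feature values are input into the decision tree, path information can be determined in the decision tree. For example, if the sample data indicates that the user's gender is male, the user is older than 20, and the transaction amount exceeds 1,000 yuan, the determined path information may be as shown by the thick line in FIG. 2.
FIG. 2 shows only one piece of path information for illustration. When sample data is input into the GBDT, multiple pieces of path information can be determined; details are not repeated here.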
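Determining path information in one decision tree can be sketched as follows. FIG. 2 is not reproduced in this text, so the tree layout (node 1, node 2, and node 3 chained along the "yes" branches), the `Node` class, the `find_path` function, and the leaf names are all hypothetical; only the three node features come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    name: str
    test: Optional[Callable[[dict], bool]] = None  # None marks a leaf
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def find_path(root, sample):
    """Walk one tree and return ([(node name, branch taken), ...], leaf name)."""
    path, node = [], root
    while node.test is not None:
        took_yes = node.test(sample)
        path.append((node.name, took_yes))
        node = node.yes if took_yes else node.no
    return path, node.name


# Hypothetical layout: node 1 -> node 2 -> node 3 along the "yes" branches.
node3 = Node("amount > 1000", lambda s: s["amount"] > 1000, Node("leaf_3"), Node("leaf_2"))
node2 = Node("age > 20", lambda s: s["age"] > 20, node3, Node("leaf_1"))
node1 = Node("gender is male", lambda s: s["gender"] == "male", node2, Node("leaf_0"))

sample = {"gender": "male", "age": 30, "amount": 1500}
path, leaf = find_path(node1, sample)
```

For this sample the walk visits all three nodes, which mirrors the thick-line path described for FIG. 2. In a GBDT with several trees, the same walk is repeated per tree, giving one path per tree.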
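Note that, in the present application, before a feature value is input into the GBDT, it may also be represented as a feature vector in one-hot form. When the feature vector corresponding to a feature value is also determined, inputting the feature value into the GBDT may be replaced by inputting the corresponding feature vector into the decision tree to determine the path information. The process of determining the feature vector of a feature value can be illustrated as follows:
Taking the feature "user gender" as an example: if the user's gender is male, that is, the feature value is "male", the corresponding feature vector may be [0 1]. If the user's gender is female, that is, the feature value is "female", the corresponding feature vector may be [1 0].
Taking the user's RPC behavior information as another example, the feature vector can be determined in either of two ways. In the first implementation, the rule is: 1 if the behavior appeared, otherwise 0. Specifically, suppose the preset RPC behaviors are a, b, and c, and a sample data contains the user's RPC behaviors within two days a, a, and b, that is, the feature values are a, a, and b. The corresponding feature vector is then [1 1 0]. In the other implementation, the rule is: count the frequency of each preset RPC behavior and then normalize. Specifically, suppose the preset RPC behaviors are a, b, and c, and a sample data contains the user's RPC behaviors within two days a, a, b, b, and c, that is, the feature values are a, a, b, b, and c. The counts are then 2, 2, and 1, and after normalization the final feature vector is [0.4 0.4 0.2].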
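The encodings in the examples above can be reproduced with a short sketch. The function names and the category order `["female", "male"]` are assumptions chosen so that the outputs match the vectors given in the text.

```python
from collections import Counter


def one_hot(categories, value):
    """One-hot encode a single categorical value."""
    return [1 if c == value else 0 for c in categories]


def presence_vector(vocab, events):
    """First rule: 1 if the preset behavior appeared at least once, else 0."""
    seen = set(events)
    return [1 if v in seen else 0 for v in vocab]


def frequency_vector(vocab, events):
    """Second rule: count each preset behavior, then normalize by the total."""
    counts = Counter(e for e in events if e in vocab)
    total = sum(counts.values())
    return [counts[v] / total if total else 0.0 for v in vocab]


print(one_hot(["female", "male"], "male"))                          # [0, 1]
print(presence_vector(["a", "b", "c"], ["a", "a", "b"]))            # [1, 1, 0]
print(frequency_vector(["a", "b", "c"], ["a", "a", "b", "b", "c"]))  # [0.4, 0.4, 0.2]
```

Behaviors outside the preset vocabulary are ignored here, which is one possible choice; the text does not say how unseen behaviors are handled.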
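Note that representing feature values as feature vectors is a conventional technique and is not described further here.
Note that, to improve the accuracy of the neural network model, the present application defines a relatively large number of features, and therefore many feature values. Processing a growing number of feature values takes a lot of time, and because a person can observe only a limited number of feature values at once, it is hard to analyze the relationships among them in depth and hand-craft new feature values. The present application instead obtains path information by inputting the sample data into the GBDT. Because one piece of path information covers multiple feature values, the number of feature values can be greatly reduced, which significantly reduces manual work.
Step 120: Train the neural network model according to the path information corresponding to each sample data and the sample label.
The neural network model here may be a DNN, an ANN, or the like. DNNs have developed rapidly in recent years. Compared with traditionally used shallow models such as logistic regression (LR) and random forest (RF), a DNN has distinctive strengths: strong model expressiveness and suitability for big data and distributed training. Therefore, this specification takes training a DNN as the example.
In the present application, the training process of the DNN can be as shown in FIG. 3. In FIG. 3, the input layer of the DNN receives the pieces of path information from the GBDT, and the output layer outputs the first prediction result. For each sample data, after the corresponding path information is input into the DNN, the DNN outputs a corresponding first prediction result. For the sample data in the sample set, if the probability that the first prediction result matches the sample label reaches a preset threshold, the DNN can be considered optimized. The preset threshold may be set from empirical values.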
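The text does not say how path information is encoded for the DNN input layer. One common choice, assumed here purely for illustration, is to identify each path by the index of the leaf it ends in and concatenate one one-hot block per tree. The sketch below shows that encoding together with the stopping check described above; `encode_leaves`, `agreement_rate`, and the 0.5 cutoff are hypothetical names and values.

```python
def encode_leaves(leaf_ids, leaves_per_tree):
    """Concatenate one one-hot block per tree (assumed encoding of the paths)."""
    vec = []
    for leaf, n_leaves in zip(leaf_ids, leaves_per_tree):
        block = [0.0] * n_leaves
        block[leaf] = 1.0
        vec.extend(block)
    return vec


def agreement_rate(scores, labels, cutoff=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((s >= cutoff) == bool(y) for s, y in zip(scores, labels))
    return hits / len(labels)


# A sample that lands in leaf 3 of a 4-leaf tree and leaf 0 of a 2-leaf tree.
dnn_input = encode_leaves([3, 0], [4, 2])  # [0, 0, 0, 1, 1, 0] as floats

# Training would stop once the agreement rate reaches the preset threshold.
rate = agreement_rate([0.9, 0.2, 0.6, 0.1], [1, 0, 0, 0])  # 0.75
```

With this encoding the input width is the total number of leaves across all trees, regardless of how many raw features the sample data contains.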
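The number of layers of the DNN in FIG. 3 can be changed depending on the number of pieces of path information.
According to the applicant's experiments, the neural network model trained in the present application performs better than the other models (LR or RF). The time spent on feature processing is also greatly reduced, and the overall modeling process becomes much faster.
How the GBDT model is trained is described below:
After the feature values of the plurality of features are determined from each sample data, they may be input into each decision tree of the GBDT. The conclusions of the individual decision trees are then summed to determine the second prediction result. For each sample data, the GBDT model outputs a corresponding second prediction result. For the sample data in the sample set, if the probability that the second prediction result matches the sample label reaches a preset threshold, the GBDT model can be considered optimized. The preset threshold may be set from empirical values. If that probability does not reach the preset threshold, the number of decision trees, the depth of the decision trees, and the regularization term (used to represent features) can be adjusted, and the input and output operations above repeated until the preset threshold is reached.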
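The two ideas in this paragraph, summing the tree outputs and adjusting hyperparameters until a threshold is met, can be sketched as follows. This is an illustration under stated assumptions: trees are modeled as plain callables, the sigmoid and `learning_rate` are standard GBDT conventions not mentioned in the text, and `train` and `accuracy` are hypothetical callbacks supplied by the caller.

```python
import math
from itertools import product


def gbdt_probability(trees, sample, learning_rate=0.1, bias=0.0):
    """Sum the per-tree outputs and squash the total into a probability."""
    raw = bias + learning_rate * sum(tree(sample) for tree in trees)
    return 1.0 / (1.0 + math.exp(-raw))


def tune(train, accuracy, n_trees_options, depth_options, threshold):
    """Try settings in order; stop at the first model that reaches the threshold."""
    tried = []
    for n_trees, depth in product(n_trees_options, depth_options):
        model = train(n_trees, depth)
        acc = accuracy(model)
        tried.append((n_trees, depth, acc))
        if acc >= threshold:
            return model, tried
    return None, tried  # no setting reached the preset threshold
```

A regularization term could be added to the search in the same way as one more option list; it is left out to keep the sketch short.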
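In summary, the present application has the following advantages:
1) Because the features of the present application include features in the user RPC behavior information category, the trained neural network model can meet timeliness requirements, that is, it can identify the user's latest transaction behavior.
2) The neural network model trained in the present application is more accurate than traditional shallow models.
3) Path information is obtained by inputting the sample data into the GBDT. One piece of path information is composed of multiple feature values, that is, it covers multiple dimensions of the sample data. This greatly reduces the amount of data fed to the DNN input layer and thereby improves the efficiency of neural network model training.
Note that, after the neural network model has been trained through the steps shown in FIG. 1, it can be deployed online to perform risk identification on users' transaction behavior.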
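FIG. 4 is a schematic diagram of the process of the transaction behavior risk identification method provided by the present application. As shown in FIG. 4, the method may include:
Step 410: Acquire transaction behavior data of the user.
The transaction behavior data here has the same definition as the sample data above and is not described again.
Step 420: Input the transaction behavior data into the gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT.
The GBDT is composed of multiple decision trees, each decision tree includes multiple nodes, and each node corresponds to one feature. Step 420 may specifically include: determining the feature values of a plurality of features from the transaction behavior data; and determining the path information in the decision trees according to the feature values. For the process of determining the path information, refer to FIG. 2; details are not repeated here.
Step 430: Input the path information into the neural network model.
That is, the path information determined in step 420 is input into the input layer of the DNN.
Step 440: Output a transaction behavior risk identification result.
Specifically, the transaction behavior risk identification result is output by the output layer of the DNN. If the identification result indicates risky transaction behavior, an alarm can be raised. In a payment scenario, if the identification result indicates risky payment behavior, the user account can be frozen to prevent loss of property.
Corresponding to the neural network model training method above, an embodiment of the present application further provides a neural network model training apparatus. As shown in FIG. 5, the apparatus includes: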
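a determining unit 501, configured to input a plurality of pre-collected sample data into the gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT.
Here, each sample data has a corresponding sample label.
a training unit 502, configured to train the neural network model according to the path information, determined by the determining unit 501, corresponding to each sample data in the GBDT and the sample label.
Optionally, the GBDT is composed of multiple decision trees, each decision tree includes multiple nodes, and each node corresponds to one feature.
The determining unit 501 is specifically configured to:
for each of the plurality of sample data, determine the feature values of a plurality of features from the sample data.
Here, the features may include remote procedure call (RPC) behavior information of the user and/or uniform resource locator (URL) address information of the user.
determine the path information in the decision trees according to the feature values.
Optionally, the sample labels may include a positive sample label and a negative sample label. The apparatus may further include:
a processing unit 503, configured to upsample the sample data whose sample label is a positive sample label; and/or,
downsample the sample data whose sample label is a negative sample label.
The functions of the functional modules of the apparatus in this embodiment can be implemented through the steps of the method embodiment above, so the specific working process of the apparatus is not repeated here.
In the neural network model training apparatus provided by the present application, the determining unit 501 inputs the plurality of pre-collected sample data into the gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT. The training unit 502 trains the neural network model according to the path information corresponding to each sample data in the GBDT and the sample label. The efficiency of neural network model training can thereby be improved.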
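Corresponding to the transaction behavior risk identification method above, an embodiment of the present application further provides a transaction behavior risk identification apparatus. As shown in FIG. 6, the apparatus includes:
an obtaining unit 601, configured to acquire transaction behavior data of the user.
a determining unit 602, configured to input the transaction behavior data acquired by the obtaining unit 601 into the gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT.
an input unit 603, configured to input the path information determined by the determining unit 602 into the neural network model.
an output unit 604, configured to output a transaction behavior risk identification result.
Optionally, the GBDT is composed of multiple decision trees, each decision tree includes multiple nodes, and each node corresponds to one feature;
The determining unit 602 is specifically configured to:
determine the feature values of a plurality of features from the transaction behavior data.
determine the path information in the decision trees according to the feature values.
The features may include remote procedure call (RPC) behavior information of the user and/or uniform resource locator (URL) address information of the user.
The functions of the functional modules of the apparatus in this embodiment can be implemented through the steps of the method embodiment above, so the specific working process of the apparatus is not repeated here.
The transaction behavior risk identification apparatus provided by the present application can improve the efficiency and accuracy of transaction behavior risk identification.
A person skilled in the art will appreciate that, in one or more of the examples above, the functions described in the present invention can be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific embodiments above describe the objectives, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that they are merely specific embodiments of the present invention and do not limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.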

Claims (14)

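  1. A neural network model training method, characterized by comprising:
    inputting a plurality of pre-collected sample data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT, wherein each sample data has a corresponding sample label;
    training a neural network model according to the path information corresponding to each sample data in the GBDT and the sample label.
  2. The method according to claim 1, characterized in that the GBDT is composed of multiple decision trees, each decision tree includes multiple nodes, and each node corresponds to one feature;
    the inputting a plurality of pre-collected sample data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to each sample data in the GBDT comprises:
    for each of the plurality of sample data, determining the feature values of a plurality of features from the sample data;
    determining the path information in the decision trees according to the feature values.
  3. The method according to claim 1 or 2, characterized in that the sample labels comprise a positive sample label and a negative sample label;
    before the inputting a plurality of pre-collected sample data into a gradient boosting decision tree (GBDT), the method further comprises:
    upsampling the sample data whose sample label is a positive sample label; and/or,
    downsampling the sample data whose sample label is a negative sample label.
  4. The method according to claim 2, characterized in that the features comprise:
    remote procedure call (RPC) behavior information of the user and/or uniform resource locator (URL) address information of the user.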
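  5. A transaction behavior risk identification method, characterized by comprising:
    acquiring transaction behavior data of a user;
    inputting the transaction behavior data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT;
    inputting the path information into a neural network model;
    outputting a transaction behavior risk identification result.
  6. The method according to claim 5, characterized in that the GBDT is composed of multiple decision trees, each decision tree includes multiple nodes, and each node corresponds to one feature;
    the inputting the transaction behavior data into a gradient boosting decision tree (GBDT) to determine the path information corresponding to the transaction behavior data in the GBDT comprises:
    determining the feature values of a plurality of features from the transaction behavior data;
    determining the path information in the decision trees according to the feature values.
  7. The method according to claim 6, characterized in that the features comprise:
    remote procedure call (RPC) behavior information of the user and/or uniform resource locator (URL) address information of the user.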
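  8. A neural network model training apparatus, comprising:
    a determining unit, configured to input multiple pieces of pre-collected sample data into a gradient boosting decision tree (GBDT) to determine path information corresponding to each piece of sample data in the GBDT, wherein each piece of sample data has a corresponding sample label; and
    a training unit, configured to train a neural network model according to the path information, determined by the determining unit, corresponding to each piece of sample data in the GBDT and the sample label.
  9. The apparatus according to claim 8, wherein the GBDT consists of multiple decision trees, each decision tree comprises multiple nodes, and each node corresponds to one feature;
    the determining unit is specifically configured to:
    for each piece of sample data among the multiple pieces of sample data, determine, according to the sample data, feature values corresponding to multiple features; and
    determine the path information in the decision trees according to the feature values.
  10. The apparatus according to claim 8 or 9, wherein the sample labels comprise positive sample labels and negative sample labels, and the apparatus further comprises:
    a processing unit, configured to perform up-sampling on the sample data whose sample label is a positive sample label; and/or
    perform down-sampling on the sample data whose sample label is a negative sample label.
  11. The apparatus according to claim 9, wherein the features comprise:
    remote procedure call (RPC) behavior information of a user and/or uniform resource locator (URL) address information of the user.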
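  12. A transaction behavior risk identification apparatus, comprising:
    an acquiring unit, configured to acquire transaction behavior data of a user;
    a determining unit, configured to input the transaction behavior data acquired by the acquiring unit into a gradient boosting decision tree (GBDT) to determine path information corresponding to the transaction behavior data in the GBDT;
    an input unit, configured to input the path information determined by the determining unit into a neural network model; and
    an output unit, configured to output a transaction behavior risk identification result.
  13. The apparatus according to claim 12, wherein the GBDT consists of multiple decision trees, each decision tree comprises multiple nodes, and each node corresponds to one feature;
    the determining unit is specifically configured to:
    determine, according to the transaction behavior data, feature values corresponding to multiple features; and
    determine the path information in the decision trees according to the feature values.
  14. The apparatus according to claim 13, wherein the features comprise:
    remote procedure call (RPC) behavior information of the user and/or uniform resource locator (URL) address information of the user.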
PCT/CN2018/078906 2017-03-15 2018-03-14 Neural network model training method and apparatus, and transaction behavior risk identification method and apparatus WO2018166457A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710153115.8A CN108629413B (zh) 2017-03-15 2017-03-15 Neural network model training method and apparatus, and transaction behavior risk identification method and apparatus
CN201710153115.8 2017-03-15

Publications (1)

Publication Number Publication Date
WO2018166457A1 (zh) 2018-09-20

Family

ID=63522791

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/078906 WO2018166457A1 (zh) 2017-03-15 2018-03-14 Neural network model training method and apparatus, and transaction behavior risk identification method and apparatus

Country Status (3)

Country Link
CN (1) CN108629413B (zh)
TW (1) TWI689874B (zh)
WO (1) WO2018166457A1 (zh)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
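CN109559232A (zh) * 2019-01-03 2019-04-02 深圳壹账通智能科技有限公司 Transaction data processing method and apparatus, computer device, and storage medium
WO2020088007A1 (zh) * 2018-10-30 2020-05-07 阿里巴巴集团控股有限公司 Method and apparatus for determining a user's financial default risk
CN111290922A (zh) * 2020-03-03 2020-06-16 中国工商银行股份有限公司 Method and apparatus for monitoring service operation health
CN111667028A (zh) * 2020-07-09 2020-09-15 腾讯科技(深圳)有限公司 Method for determining reliable negative samples and related apparatus
CN111667290A (zh) * 2019-03-08 2020-09-15 北京京东尚科信息技术有限公司 Service display method and apparatus, and computer-readable storage medium
CN111931690A (zh) * 2020-08-28 2020-11-13 Oppo广东移动通信有限公司 Model training method, apparatus, device, and storage medium
CN112161173A (zh) * 2020-09-10 2021-01-01 国网河北省电力有限公司检修分公司 Power grid wiring parameter detection apparatus and detection method
CN112541076A (zh) * 2020-11-09 2021-03-23 北京百度网讯科技有限公司 Method and apparatus for generating an expanded corpus for a target domain, and electronic device
CN112667940A (zh) * 2020-10-15 2021-04-16 广东电子工业研究院有限公司 Deep-learning-based web page body text extraction method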

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
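CN109389494B (zh) * 2018-10-25 2021-11-05 北京芯盾时代科技有限公司 Loan fraud detection model training method, and loan fraud detection method and apparatus
CN109583475B (zh) * 2018-11-02 2023-06-30 创新先进技术有限公司 Method and apparatus for monitoring abnormal information
CN110046179B (zh) * 2018-12-25 2023-09-08 创新先进技术有限公司 Method, apparatus, and device for mining alarm dimensions
CN109784403B (zh) * 2019-01-16 2022-07-05 武汉斗鱼鱼乐网络科技有限公司 Method for identifying risky devices and related device
CN110033092B (zh) * 2019-01-31 2020-06-02 阿里巴巴集团控股有限公司 Data label generation, model training, and event identification methods and apparatuses
CN110008349B (zh) * 2019-02-01 2020-11-10 创新先进技术有限公司 Computer-implemented method and apparatus for event risk assessment
CN110232400A (zh) * 2019-04-30 2019-09-13 冶金自动化研究设计院 Gradient boosting decision neural network classification and prediction method
CN110390041B (zh) * 2019-07-02 2022-05-20 上海上湖信息技术有限公司 Online learning method and apparatus, and computer-readable storage medium
CN110942248B (zh) * 2019-11-26 2022-05-31 支付宝(杭州)信息技术有限公司 Method and apparatus for training a transaction risk control network, and transaction risk detection method
CN111291900A (zh) * 2020-03-05 2020-06-16 支付宝(杭州)信息技术有限公司 Method and apparatus for training a risk identification model
CN111723083B (zh) * 2020-06-23 2024-04-05 北京思特奇信息技术股份有限公司 User identity recognition method and apparatus, electronic device, and storage medium
CN113610354A (zh) * 2021-07-15 2021-11-05 北京淇瑀信息科技有限公司 Policy allocation method and apparatus for third-party platform users, and electronic device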

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
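CN102890803A (zh) * 2011-07-21 2013-01-23 阿里巴巴集团控股有限公司 Method and apparatus for determining abnormal transaction processes of electronic commodities
CN105279691A (zh) * 2014-07-25 2016-01-27 中国银联股份有限公司 Financial transaction detection method and device based on a random forest model
CN105844501A (zh) * 2016-05-18 2016-08-10 上海亿保健康管理有限公司 Risk control system and method for consumption behavior
CN106296195A (zh) * 2015-05-29 2017-01-04 阿里巴巴集团控股有限公司 Risk identification method and apparatus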

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130054417A1 (en) * 2011-08-30 2013-02-28 Qualcomm Incorporated Methods and systems aggregating micropayments in a mobile device
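CN105975992A (zh) * 2016-05-18 2016-09-28 天津大学 Imbalanced data set classification method based on adaptive up-sampling
CN106096727B (zh) * 2016-06-02 2018-12-07 腾讯科技(深圳)有限公司 Machine-learning-based network model construction method and apparatus
CN106506454B (zh) * 2016-10-10 2019-11-12 江苏通付盾科技有限公司 Fraudulent service identification method and apparatus
CN106447333A (zh) * 2016-11-29 2017-02-22 中国银联股份有限公司 Fraudulent transaction detection method and server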

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
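WO2020088007A1 (zh) * 2018-10-30 2020-05-07 阿里巴巴集团控股有限公司 Method and apparatus for determining a user's financial default risk
CN109559232A (zh) * 2019-01-03 2019-04-02 深圳壹账通智能科技有限公司 Transaction data processing method and apparatus, computer device, and storage medium
CN111667290A (zh) * 2019-03-08 2020-09-15 北京京东尚科信息技术有限公司 Service display method and apparatus, and computer-readable storage medium
CN111290922B (zh) * 2020-03-03 2023-08-22 中国工商银行股份有限公司 Method and apparatus for monitoring service operation health
CN111290922A (zh) * 2020-03-03 2020-06-16 中国工商银行股份有限公司 Method and apparatus for monitoring service operation health
CN111667028A (zh) * 2020-07-09 2020-09-15 腾讯科技(深圳)有限公司 Method for determining reliable negative samples and related apparatus
CN111667028B (zh) * 2020-07-09 2024-03-12 腾讯科技(深圳)有限公司 Method for determining reliable negative samples and related apparatus
CN111931690A (zh) * 2020-08-28 2020-11-13 Oppo广东移动通信有限公司 Model training method, apparatus, device, and storage medium
CN112161173A (zh) * 2020-09-10 2021-01-01 国网河北省电力有限公司检修分公司 Power grid wiring parameter detection apparatus and detection method
CN112667940B (zh) * 2020-10-15 2022-02-18 广东电子工业研究院有限公司 Deep-learning-based web page body text extraction method
CN112667940A (zh) * 2020-10-15 2021-04-16 广东电子工业研究院有限公司 Deep-learning-based web page body text extraction method
CN112541076A (zh) * 2020-11-09 2021-03-23 北京百度网讯科技有限公司 Method and apparatus for generating an expanded corpus for a target domain, and electronic device
CN112541076B (zh) * 2020-11-09 2024-03-29 北京百度网讯科技有限公司 Method and apparatus for generating an expanded corpus for a target domain, and electronic device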

Also Published As

Publication number Publication date
TWI689874B (zh) 2020-04-01
TW201835819A (zh) 2018-10-01
CN108629413A (zh) 2018-10-09
CN108629413B (zh) 2020-06-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18766726; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 18766726; Country of ref document: EP; Kind code of ref document: A1)