CN115953031A

CN115953031A - Risk prediction model training method and device, computer-readable storage medium

Info

Publication number: CN115953031A
Application number: CN202310132757.5A
Authority: CN
Inventors: 范昊; 杨恺; 李娴; 郑邦祺; 黄志翔
Original assignee: Jingdong Technology Holding Co Ltd
Current assignee: Jingdong Technology Holding Co Ltd
Priority date: 2023-02-09
Filing date: 2023-02-09
Publication date: 2023-04-11

Abstract

The present disclosure relates to a training method and device for a risk prediction model and a computer-readable storage medium, and relates to the field of artificial intelligence. The training method of the risk prediction model includes: obtaining a plurality of training samples, wherein each training sample includes a risk label and user features, the risk label of at least one of the plurality of training samples includes a first risk label generated in a first time window and a second risk label generated in a second time window, and the length of the first time window is greater than the length of the second time window; for each training sample, generating a risk prediction result from the user features using the risk prediction model; and training the risk prediction model according to the risk prediction results of the plurality of training samples and the corresponding risk labels. The training method of the present disclosure can simultaneously improve the accuracy of the risk prediction model in long-term risk prediction and short-term risk prediction.

Description

Risk prediction model training method and device, computer-readable storage medium

Technical field

The present disclosure relates to the field of Internet technologies, and in particular to a training method for a risk prediction model, a risk prediction method and device, an electronic device, and a computer-readable storage medium.

Background

With the rapid development of the Internet and communication technology, traditional risk control methods have gradually become unable to support risk prediction needs, whereas Internet technology, with its intelligent processing of large volumes of multi-dimensional data and its batch-standardized execution processes, better meets the requirements of risk control in the information age. In the field of Internet technology, various risk prediction models are built to learn user behavior patterns and mine the value in massive data, so that risks can be reasonably avoided.

Risk on the Internet refers to the risk of loss caused by a user failing to fulfill the obligations of an agreed contract. In the related art, a risk prediction model is built to learn patterns from user features and predict whether a user is at risk of default, so that risk control and risk warnings can be applied to the user and the risk of user default can be avoided.

Summary of the invention

According to a first aspect of the present disclosure, a training method for a risk prediction model is provided, including:

obtaining a plurality of training samples, wherein each training sample includes a risk label and user features, the risk label of at least one of the plurality of training samples includes a first risk label generated in a first time window and a second risk label generated in a second time window, and the length of the first time window is greater than the length of the second time window;

for each training sample, generating a risk prediction result from the user features using the risk prediction model, wherein the risk prediction result of the at least one training sample includes a prediction result of a first risk prediction task corresponding to the first risk label and a prediction result of a second risk prediction task corresponding to the second risk label; and

training the risk prediction model according to the risk prediction results of the plurality of training samples and the corresponding risk labels.

In some embodiments, training the risk prediction model according to the risk prediction results of the plurality of training samples and the corresponding risk labels includes:

for each of the at least one training sample, calculating a loss function of that training sample according to the first risk label, the second risk label, the prediction result of the first risk prediction task, and the prediction result of the second risk prediction task; and

training the risk prediction model according to the loss functions of the plurality of training samples.

In some embodiments, calculating the loss function of each of the at least one training sample according to the first risk label, the second risk label, the prediction result of the first risk prediction task, and the prediction result of the second risk prediction task includes:

for each of the at least one training sample, calculating a first loss function of that training sample according to the first risk label and the prediction result of the first risk prediction task;

for each of the at least one training sample, calculating a second loss function of that training sample according to the second risk label and the prediction result of the second risk prediction task; and

calculating the loss function of each of the at least one training sample according to the first loss function and the second loss function of that training sample.

In some embodiments, calculating the loss function of each of the at least one training sample according to the first loss function and the second loss function of that training sample includes:

calculating the loss function of each of the at least one training sample according to the sum of the first loss function and the second loss function of that training sample.

In some embodiments, the user features of the at least one training sample are extracted at a specified time, the first time window is the time window from the specified time to a first time, and the second time window is the time window from the specified time to a second time.

In some embodiments, the training method of the risk prediction model further includes at least one of the following:

filling missing values in the user features with a default value;

filling missing values in the user features by mean imputation;

filling missing values in the user features by the nearest-completion (hot-deck) method;

filling missing values in the user features by the nearest-neighbor method;

filling missing values in the user features by multiple imputation.

In some embodiments, the training method of the risk prediction model further includes screening out at least one of the following user features:

user features whose missing rate is greater than a first threshold;

user features whose degree of discrimination over the training samples is lower than a second threshold;

user features whose variance is greater than a third threshold.

In some embodiments, the training method of the risk prediction model further includes:

discretizing the user features;

validating the risk prediction model using validation samples, wherein the validation samples are generated at a different time from the training samples.

In some embodiments, the risk prediction model is a multi-task deep neural network model or a tree model.

According to a second aspect of the present disclosure, a risk prediction method is provided, including:

generating a risk prediction result for a target user from the user features of the target user using a risk prediction model, wherein the risk prediction model is trained by the training method for a risk prediction model according to any embodiment of the present disclosure.

According to a third aspect of the present disclosure, a training device for a risk prediction model is provided, including:

an acquisition module configured to obtain a plurality of training samples, wherein each training sample includes a risk label and user features, the risk label of at least one of the plurality of training samples includes a first risk label generated in a first time window and a second risk label generated in a second time window, and the length of the first time window is greater than the length of the second time window;

a generation module configured to generate, for each training sample, a risk prediction result from the user features using the risk prediction model, wherein the risk prediction result of the at least one training sample includes a prediction result of a first risk prediction task corresponding to the first risk label and a prediction result of a second risk prediction task corresponding to the second risk label; and

a training module configured to train the risk prediction model according to the risk prediction results of the plurality of training samples and the corresponding risk labels.

According to a fourth aspect of the present disclosure, a risk prediction device is provided, including: a generation module configured to generate a risk prediction result for a target user from the user features of the target user using a risk prediction model, wherein the risk prediction model is trained by the training device for a risk prediction model according to any embodiment of the present disclosure.

According to a fifth aspect of the present disclosure, an electronic device is provided, including:

a memory; and

a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the training method for a risk prediction model according to any embodiment of the present disclosure, or the risk prediction method according to any embodiment of the present disclosure.

According to a sixth aspect of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored; when executed by a processor, the instructions implement the training method for a risk prediction model according to any embodiment of the present disclosure, or the risk prediction method according to any embodiment of the present disclosure.

Brief description of the drawings

The accompanying drawings, which constitute a part of this specification, illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.

The present disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:

FIG. 1 shows a flowchart of a training method for a risk prediction model according to some embodiments of the present disclosure;

FIG. 2 shows a method for constructing risk labels and user features according to some embodiments of the present disclosure;

FIG. 3 shows a data preprocessing method according to some embodiments of the present disclosure;

FIG. 4 shows a flowchart of training a risk prediction model using risk prediction results according to some embodiments of the present disclosure;

FIG. 5 shows a method of calculating a loss function according to some embodiments of the present disclosure;

FIG. 6 shows a block diagram of a training device for a risk prediction model according to some embodiments of the present disclosure;

FIG. 7 shows a block diagram of an electronic device according to other embodiments of the present disclosure;

FIG. 8 shows a block diagram of a computer system for implementing some embodiments of the present disclosure.

Detailed description

Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.

It should also be understood that, for ease of description, the dimensions of the parts shown in the drawings are not drawn to scale.

The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present disclosure, its application, or its uses.

Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be considered part of the specification.

In all examples shown and discussed herein, any specific value should be construed as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.

It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be discussed further in subsequent drawings.

In the related art, when a risk control model is built, each user is treated as one sample and the time at which the user applies is taken as the observation point. A time window before the observation point serves as the observation period, from which the user's behavioral data is extracted to construct the sample's features X.

The time window from the observation point to the performance point serves as the performance period. The performance period is the period over which the behavior of the user after the observation point is monitored; according to the user's behavior within this window, the risk label Y is classified as "0" or "1", indicating whether the user will default, become unreachable, and so on. Relative to the observation point, the performance period lies in the future: obtaining a risk label requires waiting several months or more, and fewer samples have a long performance period. The label Y therefore lags, which is a delayed-feedback phenomenon.

When a label with a longer performance period is used for modeling, the user's behavior is revealed more fully and the default risk is exposed more completely, but fewer samples are available, which degrades the training effect and accuracy of the model. When a risk label with a shorter performance period is used, more samples are available, but a short-performance-period label cannot fully expose the user's long-term default risk. The trained model therefore cannot learn the user's behavior patterns well, and its predictions of the likelihood of long-term default are not accurate enough, resulting in losses.

When selecting samples to train the model, it is difficult to balance the completeness of the samples against their quality.

In addition, a model trained with a single label can only predict the user's behavior with respect to the risk corresponding to that label, and does not generalize.

The present disclosure provides a training method for a risk prediction model, a risk prediction method and device, an electronic device, and a computer-readable storage medium, which can simultaneously improve the accuracy of the risk prediction model in long-term risk prediction and short-term risk prediction.

FIG. 1 shows a flowchart of a training method for a risk prediction model according to some embodiments of the present disclosure.

As shown in FIG. 1, the training method of the risk prediction model includes steps S1 to S3. In some embodiments, the training method is executed by a training device for the risk prediction model.

In step S1, a plurality of training samples are obtained, wherein each training sample includes a risk label and user features, the risk label of at least one of the plurality of training samples includes a first risk label generated in a first time window and a second risk label generated in a second time window, and the length of the first time window is greater than the length of the second time window.

FIG. 2 shows a method for constructing risk labels and user features according to some embodiments of the present disclosure.

As shown in FIG. 2, the user features of the at least one training sample are extracted at a specified time, the first time window is the time window from the specified time to a first time, and the second time window is the time window from the specified time to a second time.

The period before the specified time (that is, the observation point) is the observation period. It lies to the left of the observation point on the time axis and is the interval over which user features are observed. For example, at the specified time, the user features produced during the observation period are extracted.

Those skilled in the art will appreciate that the specified time may be a time interval, namely the period during which users submit applications; the users who submit applications within this interval constitute the samples used for modeling.

A time window after the observation point is used to generate risk labels. According to some embodiments of the present disclosure, some of the samples used to train the model simultaneously have labels generated from multiple time windows of different lengths. For example, training sample A has both a first risk label Y₁ and a second risk label Y₂. Y₁ and Y₂ indicate whether a first risk and a second risk, which are of different types, occur; that is, the first risk label and the second risk label correspond to different risk prediction tasks. For example, the first risk is that the user is in default for 90 days, and the second risk is that the user is in default for 30 days. Y₁ takes the value 1 or 0 to indicate whether the first risk occurs, and Y₂ takes the value 1 or 0 to indicate whether the second risk occurs. The first risk label Y₁ corresponds to the first time window (performance period), that is, the interval from the specified time to the first time, whose length is T₁. The second risk label Y₂ corresponds to the second time window (performance period), that is, the interval from the specified time to the second time, whose length is T₂, where T₁ > T₂.
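The label construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction used in the disclosure: the function name `build_labels`, the `as_of` date, and the rule "a label is 1 if a default event is recorded inside the window" are hypothetical, and the window lengths of 90 and 30 days are example values for T₁ and T₂. A label whose window has not yet closed is returned as `None`, which reflects the delayed-feedback problem: fewer samples have the long-window label.

```python
from datetime import date, timedelta

def build_labels(observation_date, default_date, as_of, t1_days=90, t2_days=30):
    """Return (y1, y2) for one sample; None means the window has not closed yet."""
    def label(window_days):
        window_end = observation_date + timedelta(days=window_days)
        if as_of < window_end:
            return None  # performance period not finished, label unavailable
        defaulted = (default_date is not None
                     and observation_date < default_date <= window_end)
        return 1 if defaulted else 0
    # y1 uses the longer window T1, y2 the shorter window T2 (T1 > T2)
    return label(t1_days), label(t2_days)

observation = date(2023, 1, 1)
both_labels = build_labels(observation, date(2023, 3, 1), as_of=date(2023, 6, 1))
short_only = build_labels(observation, None, as_of=date(2023, 2, 15))
```

In this sketch `both_labels` belongs to a sample like training sample A, which has both labels, while `short_only` belongs to a recent sample for which only the short-window label can be observed.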

A training sample can have more than two risk labels at the same time, for example Y₁, Y₂, Y₃, ..., Y_N; the present disclosure places no limit on this.

In addition, some training samples may have only one risk label. For example, the set of samples used to train the model (the training set) also includes training samples B and C, where training sample B has a risk label Y₃ and training sample C has a risk label Y₄, and the time window corresponding to Y₃ is longer than the time window corresponding to Y₄. That is, the set of training samples includes training samples whose risk labels come from time windows of different lengths.

In step S2, for each training sample, the risk prediction model is used to generate a risk prediction result from the user features, wherein the risk prediction result of the at least one training sample includes a prediction result of a first risk prediction task corresponding to the first risk label and a prediction result of a second risk prediction task corresponding to the second risk label.

For example, all samples are input into the model to obtain a prediction result for each risk label of each sample.

If a sample has multiple labels, multiple prediction results are generated, one for the task corresponding to each label. For example, training sample A has both the first risk label Y₁ and the second risk label Y₂, so the prediction result p₁ of the first risk prediction task and the prediction result p₂ of the second risk prediction task are generated.

The first risk label and the second risk label correspond to different risk prediction tasks. The risk prediction model can simultaneously learn the first risk prediction task corresponding to the first risk label and the second risk prediction task corresponding to the second risk label, and obtain an overall optimal solution across multiple risk prediction objectives. Compared with training a single-task model on training samples that have only one label, using multi-task learning in the risk prediction model has the following advantages:

(1) Because the risk prediction tasks corresponding to the multiple risk labels (generated from time windows of different lengths) are correlated to some degree and each carries its own noise, multi-task learning provides implicit data augmentation;

(2) Multi-task learning helps the risk prediction model focus on the important user features and risk prediction tasks, because the other risk prediction tasks provide additional evidence for the relevance or irrelevance of those features, which strengthens what the model learns for each risk prediction task;

(3) It enables the risk prediction model to learn a generalized representation of the user features, which acts as implicit regularization.

Preprocessing the training set before training the model can improve the prediction efficiency of the model. FIG. 3 shows a data preprocessing method according to some embodiments of the present disclosure.

As shown in FIG. 3, data preprocessing includes data preparation, feature processing, and feature engineering.

Feature processing includes missing-value handling, feature screening, and data partitioning. Feature processing methods according to some embodiments of the present disclosure are described below.

In some embodiments, the training method of the risk prediction model further includes at least one of the following: filling missing values in the user features with a default value; filling missing values in the user features by mean imputation; filling missing values in the user features by the nearest-completion (hot-deck) method; filling missing values in the user features by the nearest-neighbor method; filling missing values in the user features by multiple imputation.

For example, for the samples in the training set and the validation set, if some of a sample's features have missing values, the missing feature values are filled with a default value, while missing values in the labels are retained unchanged. Besides filling with a default value, missing values can also be handled by special-value filling, mean filling, the hot-deck (nearest-completion) method, the nearest-neighbor method, multiple imputation, model prediction, and other methods.
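A few of the filling strategies listed above can be sketched in plain Python. This is an illustrative sketch only: the function names are hypothetical, missing values are assumed to be represented as `None`, and `fill_previous` is a deliberately simplified stand-in for hot-deck filling that copies the most recent observed value rather than searching for the most similar record. Only feature columns are passed to these functions; labels are left untouched, as the text requires.

```python
def fill_default(values, default=0.0):
    """Replace every missing entry with a fixed default value."""
    return [default if v is None else v for v in values]

def fill_mean(values):
    """Replace every missing entry with the mean of the observed entries."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present) if present else 0.0
    return [mean if v is None else v for v in values]

def fill_previous(values, default=0.0):
    """Simplified hot-deck: copy the most recent observed value forward."""
    filled, last = [], default
    for v in values:
        last = last if v is None else v
        filled.append(last)
    return filled

feature_column = [1.0, None, 3.0]
mean_filled = fill_mean(feature_column)
```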

In some embodiments, the training method of the risk prediction model further includes screening out at least one of the following user features: user features whose missing rate is greater than a first threshold; user features whose degree of discrimination over the training samples is lower than a second threshold; user features whose variance is greater than a third threshold.

For example, feature screening is performed on the features of the training samples and/or validation samples. Features with a high missing rate and/or unstable features with a large variance are excluded. Ineffective features can also be excluded, that is, user features whose degree of discrimination over the training samples is lower than the second threshold, such as constant features whose value never changes. The degree of discrimination is an indicator of how well a feature distinguishes between users. It is calculated by treating a single feature as the input of the risk prediction model and computing AUC (Area Under Curve), KS (Kolmogorov-Smirnov statistic), or IV (Information Value).
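The three screening criteria can be sketched as below. This is a hedged illustration, not the disclosure's implementation: the function names and threshold defaults are hypothetical, the raw feature value is used directly as the single-feature score, and the degree of discrimination is taken here as `2 * |AUC - 0.5|`, which is one possible choice (KS is shown alongside it for comparison). The sketch assumes both label classes are present among the non-missing entries.

```python
def missing_rate(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def variance(values):
    """Population variance of the observed entries."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    mean = sum(present) / len(present)
    return sum((v - mean) ** 2 for v in present) / len(present)

def auc(scores, labels):
    """Probability that a positive sample scores above a negative one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def ks(scores, labels):
    """Largest gap between the score distributions of the two classes."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    gaps = []
    for t in sorted(set(scores)):
        pos_cdf = sum(s <= t for s in positives) / len(positives)
        neg_cdf = sum(s <= t for s in negatives) / len(negatives)
        gaps.append(abs(pos_cdf - neg_cdf))
    return max(gaps)

def keep_feature(values, labels, max_missing=0.5, min_discrimination=0.1,
                 max_variance=1e6):
    """Apply the three screening rules; thresholds are example values."""
    if missing_rate(values) > max_missing or variance(values) > max_variance:
        return False
    pairs = [(v, y) for v, y in zip(values, labels) if v is not None]
    scores = [v for v, _ in pairs]
    observed_labels = [y for _, y in pairs]
    return abs(auc(scores, observed_labels) - 0.5) * 2 >= min_discrimination

feature = [0.1, 0.4, 0.35, 0.8]
risk = [0, 0, 1, 1]
```

A constant feature yields an AUC of 0.5 and therefore a discrimination of 0, so it is screened out, which matches the constant-feature example in the text.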

In addition, feature dimensionality reduction methods such as PCA (principal component analysis) can also be used to select effective features.

When the training samples are divided, they are randomly split in a given proportion into a training set (train) and a test set (test), while the validation samples are kept undivided. The training set is used to train the model, the test set is used to monitor the training process and prevent overfitting, and the validation set is used to evaluate the performance of the model.
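The partitioning described above can be sketched as two steps. The sketch assumes, hypothetically, that each sample is a `(timestamp, payload)` pair and that validation samples are those generated at or after a cutoff time, which is one way to satisfy the requirement that validation samples are generated at a different time from the training samples. The function names, the 0.2 ratio, and the seed are illustrative choices.

```python
import random

def split_by_time(samples, cutoff):
    """Hold out later samples as validation samples (left undivided)."""
    modelling = [s for s in samples if s[0] < cutoff]
    validation = [s for s in samples if s[0] >= cutoff]
    return modelling, validation

def split_train_test(samples, test_ratio=0.2, seed=0):
    """Randomly split the remaining samples into a training set and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

# Twelve hypothetical samples, one per month; months 11 and 12 become validation.
samples = [(month, f"user{month}") for month in range(1, 13)]
modelling, validation = split_by_time(samples, cutoff=11)
train_set, test_set = split_train_test(modelling)
```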

In some embodiments, the training method of the risk prediction model further includes discretizing the user features. For example, features (such as continuous features and multi-category features) are discretized by binning.

Discretizing the features speeds up model iteration, reduces the risk of overfitting, and improves the stability of the model.
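Binning can be sketched as follows. The disclosure does not state which binning scheme is used, so equal-frequency binning is an assumption made here for illustration, and the function names are hypothetical. Bin edges are computed from the training values and then reused to map any value to a bin index.

```python
from bisect import bisect_right

def equal_frequency_edges(values, n_bins):
    """Pick n_bins - 1 cut points so each bin holds roughly the same count."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]

def discretize(value, edges):
    """Map a continuous value to the index of the bin it falls into."""
    return bisect_right(edges, value)

training_values = [1, 2, 3, 4, 5, 6, 7, 8]
edges = equal_frequency_edges(training_values, n_bins=4)
bins = [discretize(v, edges) for v in training_values]
```

Reusing the edges learned on the training set for the test and validation samples keeps the bin boundaries consistent across data sets.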

In addition, in feature engineering, the categorical features in the training set are one-hot encoded to obtain embedding vectors, and the continuous features are processed by data standardization (for example, normalization), logarithmic (log) transformation, and the like to obtain dense vectors. The validation samples are processed in the same way.
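These transformations can be sketched in plain Python. The function names are hypothetical, `log1p` is chosen here on the assumption that the continuous feature is non-negative, and standardization is shown as zero-mean, unit-variance scaling, which is one common form of the standardization the text mentions.

```python
import math

def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the known categories."""
    return [1.0 if value == c else 0.0 for c in categories]

def log_transform(x):
    """Compress a non-negative, heavy-tailed feature; log1p keeps 0 at 0."""
    return math.log1p(x)

def standardize(values):
    """Rescale a column to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std > 0 else 0.0 for v in values]

encoded = one_hot("b", ["a", "b", "c"])
scaled = standardize([1.0, 3.0])
```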

The user features may include irrelevant features, and features may also depend on one another. Through feature processing, the present disclosure can handle the data more flexibly.

In step S3, the risk prediction model is trained according to the risk prediction results of the plurality of training samples and the corresponding risk labels.

For example, for each training sample, the loss function of the training sample can be calculated from its risk prediction result and its label. Based on the loss functions of all training samples, the model parameters that minimize the loss function are solved for, so as to update the model parameters and train the risk prediction model.

In some embodiments, the risk prediction model is a multi-task prediction model such as a Multi-Task DNN (Multi-Task Deep Neural Network) or a tree model.

If the risk prediction model is a tree-structured model, new trees are built iteratively, following the principle of an additive model with stagewise computation, until the stopping condition for training is met.
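To make the additive, stagewise idea concrete, the sketch below fits one-split regression trees (stumps) to the current residuals and adds each one to the ensemble. It is a toy under stated assumptions: a single numeric feature, squared-error residuals, a fixed learning rate, and a residual-based stopping rule. None of these specifics come from the disclosure, and all names are hypothetical.

```python
from typing import Callable, List, Sequence

Tree = Callable[[float], float]


def fit_stump(xs: Sequence[float], residuals: Sequence[float]) -> Tree:
    """Best single-threshold split under squared error on the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # every x is identical: fall back to a constant tree
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs: Sequence[float], ys: Sequence[float],
          n_trees: int = 20, lr: float = 0.5, tol: float = 1e-6) -> Tree:
    """Additive model: each new tree is fitted to what the previous trees missed."""
    trees: List[Tree] = []
    preds = [0.0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        if max(abs(r) for r in residuals) < tol:
            break  # stopping condition reached
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + lr * tree(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * tree(x) for tree in trees)
```

A production tree model would use a library implementation with multi-feature splits and a task-specific loss; the loop structure (compute residuals, fit a tree, add it, check the stopping rule) is the part this sketch is meant to show.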

If a multi-task deep neural network model is used, the model parameters are updated iteratively during training, and the model can be trained for several rounds (epochs), where each epoch continues training from the result of the previous epoch. The neural network is trained by gradient descent with backpropagation, which updates the parameters to be optimized. The optimization method can be adjusted for the specific task type, for example gradient descent, stochastic gradient descent, mini-batch gradient descent, stochastic gradient descent with momentum, the Adagrad (adaptive gradient) method, or the Adam (adaptive moment estimation) method.
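As one concrete instance of the optimizers listed above, a single momentum update can be sketched as follows. The function name and the default learning rate and momentum coefficient are illustrative assumptions; in a real network the gradients would come from backpropagation over a mini-batch.

```python
from typing import List, Sequence, Tuple


def sgd_momentum_step(
    params: Sequence[float],
    grads: Sequence[float],
    velocity: Sequence[float],
    lr: float = 0.1,
    momentum: float = 0.9,
) -> Tuple[List[float], List[float]]:
    """One parameter update of gradient descent with momentum."""
    # The velocity is a decaying running sum of past (negative) gradients.
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity
```

Calling this function repeatedly, once per mini-batch and over several epochs, gives the iterative parameter update described in the paragraph above.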

The loss function may be a cross-entropy loss function, a 0-1 loss function, a hinge loss function, an exponential loss function, a logarithmic loss function, or the like.

After model training is completed, the model is evaluated with evaluation metrics such as AUC, recall, the balanced F score (F1), and KS.
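Two of the metrics named above can be sketched in plain Python. KS is computed here as the largest gap between the cumulative true-positive rate and false-positive rate when samples are ranked by score, which is the usual convention in credit scoring but is an assumption with respect to this disclosure; the function names are hypothetical.

```python
from typing import Sequence, Tuple


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Maximum |TPR - FPR| over all score thresholds."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = fp = 0
    best = 0.0
    for i, (score, y) in enumerate(pairs):
        tp += y == 1
        fp += y == 0
        # Only evaluate once a whole group of tied scores has been consumed.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            best = max(best, abs(tp / n_pos - fp / n_neg))
    return best


def recall_and_f1(preds: Sequence[int], labels: Sequence[int]) -> Tuple[float, float]:
    """Recall and balanced F score for binary predictions."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1
```

For a multi-task model these metrics would be computed separately for each risk prediction task, using only the samples whose label for that task is present.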

FIG. 4 shows a flowchart of training the risk prediction model using the risk prediction results according to some embodiments of the present disclosure.

As shown in FIG. 4, training the risk prediction model using the risk prediction results includes steps S31 and S32.

In step S31, for each training sample in the at least one training sample, the loss function of that training sample is calculated according to the first risk label, the second risk label, the prediction result of the first risk prediction task, and the prediction result of the second risk prediction task.

In step S32, the risk prediction model is trained according to the loss functions of the plurality of training samples.

For example, if training sample A includes both a first risk label Y₁ and a second risk label Y₂, then inputting the features of training sample A into the model yields a prediction result p₁ for whether the first risk occurs and a prediction result p₂ for whether the second risk occurs. The loss function of training sample A is calculated according to Y₁, Y₂, p₁, and p₂.

If training sample B includes only one risk label (for example, the first risk label Y₁), then after the features of training sample B are input into the model, the loss function of training sample B only needs to be calculated according to Y₁ and p₁.

In some embodiments, calculating, for each training sample in the at least one training sample, the loss function of that training sample according to the first risk label, the second risk label, the prediction result of the first risk prediction task, and the prediction result of the second risk prediction task includes: calculating a first loss function of the training sample according to the first risk label and the prediction result of the first risk prediction task; calculating a second loss function of the training sample according to the second risk label and the prediction result of the second risk prediction task; and calculating the loss function of the training sample according to its first loss function and second loss function.

For example, the first loss function L₁ of training sample A is calculated according to Y₁ and p₁, the second loss function L₂ of training sample A is calculated according to Y₂ and p₂, and the final loss function of training sample A is calculated according to the first loss function and the second loss function.

In some embodiments, calculating the loss function of each training sample in the at least one training sample according to its first loss function and second loss function includes: calculating the loss function of the training sample according to the sum of its first loss function and second loss function.

For example, the final loss function of training sample A is L₁ + L₂.
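As a worked form of this sum, and assuming that both per-task losses are binary cross-entropy (one of the options the disclosure lists, not a requirement), the loss of training sample A can be written as below. The numeric values Y₁ = 1, Y₂ = 0, p₁ = 0.8, p₂ = 0.3 are illustrative only.

```latex
\begin{aligned}
L_1 &= -\bigl[\,Y_1 \log p_1 + (1 - Y_1)\log(1 - p_1)\bigr] = -\log 0.8 \approx 0.223,\\
L_2 &= -\bigl[\,Y_2 \log p_2 + (1 - Y_2)\log(1 - p_2)\bigr] = -\log 0.7 \approx 0.357,\\
L   &= L_1 + L_2 \approx 0.580 .
\end{aligned}
```

For a sample that carries only Y₁, the second term is simply absent and L = L₁.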

The risk prediction model can have more than two risk prediction tasks, and correspondingly a training sample can have more than two risk labels. Taking a multi-task neural network model as an example, the following describes how the model is trained when there are N risk prediction tasks.

FIG. 5 illustrates a method of calculating the loss function according to some embodiments of the present disclosure.

For N risk prediction tasks, the corresponding risk labels are Y = {Y₁, Y₂, …, Y_N}. Because labels lag behind the user features, some labels may be missing, so not every training sample has all of the risk labels; that is, any risk label of a training sample may be null.

The risk prediction results of the risk prediction model for the N tasks are P = {p₁, p₂, …, p_N}.

For any training sample, all of its risk labels are traversed. If the current risk label Y_i is null, it is skipped; otherwise, the loss function Loss_i of the training sample on the i-th task is calculated according to Y_i and p_i. All of the loss functions of the training sample are summed to obtain its total loss function. The loss function Loss_i of the training sample on the i-th task may be a cross-entropy loss function.
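The traversal described above can be sketched as follows, assuming binary labels, cross-entropy as Loss_i, and `None` as the representation of a null label. The function names and the clipping constant `eps` are illustrative assumptions.

```python
import math
from typing import Optional, Sequence


def binary_cross_entropy(y: int, p: float, eps: float = 1e-12) -> float:
    """Cross-entropy of one binary label against a predicted probability."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def sample_loss(labels: Sequence[Optional[int]], preds: Sequence[float]) -> float:
    """Total loss of one training sample over its N tasks, skipping null labels."""
    total = 0.0
    for y_i, p_i in zip(labels, preds):
        if y_i is None:
            continue  # label not yet available for this task
        total += binary_cross_entropy(y_i, p_i)
    return total
```

A sample with every label present contributes to all N tasks, while a fresh sample with only short-window labels contributes only to those tasks, which is how samples with partial labels remain usable for training.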

In some embodiments, the risk prediction model is validated using validation samples, where the validation samples and the training samples are generated at different times. For example, a time is specified in the database as an observation point, the time windows are determined, and a training set and an out-of-time validation set are generated. The training set is used to train the model, and the validation set is used to evaluate the performance of the model.

The validation samples are OOT (Out of Time) validation samples, which can be used to simulate and evaluate how the model performs after it is deployed online. The labels of the OOT validation samples are generated at a different time from the labels of the training samples, so the OOT validation samples and the training samples are generated at different times. For example, the training samples are generated from this year's data and the validation samples from last year's data. The time windows corresponding to the risk labels of the training samples and of the validation samples do not overlap. Validating the model with out-of-time validation samples can ensure the stability of the model and improve its generalization ability.
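One way to build non-overlapping training and OOT validation sets is sketched below. The sample layout (a dict with a `label_date` key) and the function name are hypothetical; the only property taken from the text is that the two periods must not overlap.

```python
from datetime import date
from typing import Dict, List, Sequence, Tuple

Period = Tuple[date, date]  # inclusive (start, end)


def split_by_period(
    samples: Sequence[Dict], train_period: Period, validation_period: Period
) -> Tuple[List[Dict], List[Dict]]:
    """Assign samples to train or OOT validation by the date their label was generated."""
    (t0, t1), (v0, v1) = train_period, validation_period
    if t0 <= v1 and v0 <= t1:
        raise ValueError("training and validation periods must not overlap")
    train = [s for s in samples if t0 <= s["label_date"] <= t1]
    validation = [s for s in samples if v0 <= s["label_date"] <= v1]
    return train, validation
```

Samples whose label date falls in neither period are simply left out of both sets.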

When the model is tested with test samples and/or validated with validation samples, the loss function is calculated in a way similar to the training process, which is not repeated here.

According to the training method of the risk prediction model of the present disclosure, a plurality of training samples are obtained; for each training sample, the risk prediction model generates a risk prediction result according to the user features; and the risk prediction model is then trained according to the risk prediction results of the plurality of training samples and the corresponding risk labels. By using the first risk labels and second risk labels of some of the training samples, the risk prediction model is trained on multiple tasks at the same time and yields prediction results on both the first risk prediction task and the second risk prediction task.

The risk prediction model learns the relationship between long-term risk and user features, which reduces the impact of delayed risk label feedback on long-term risk prediction. It can also make use of more fresh training samples, so that it learns the relationship between short-term risk and user features while the number of usable training samples increases. In addition, the first risk prediction task and the second risk prediction task are correlated to some degree and each carries its own noise, so training the risk prediction model on multiple risk prediction tasks simultaneously achieves a data augmentation effect. The trained risk prediction model can therefore achieve higher accuracy in both long-term and short-term risk prediction.

In addition, a risk prediction model trained by the training method of the present disclosure can make predictions for multiple risk prediction tasks at the same time and generalizes better.

The present disclosure provides a risk prediction method, including: generating a risk prediction result for a target user by using a risk prediction model according to the user features of the target user, where the risk prediction model is trained by the training method of the risk prediction model according to any embodiment of the present disclosure. In some embodiments, the risk prediction method is performed by a risk prediction device.

The target user may be any user whose default risk needs to be assessed. After the risk prediction model has been trained, it can be used, according to the user features of the target user and as needed, to generate the prediction result of the first risk prediction task corresponding to the first risk label and/or the prediction result of the second risk prediction task corresponding to the second risk label, so as to determine whether the target user will default within the future first time window and/or within the second time window. The user features are extracted at a specified time, the first time window is the time window from the specified time to a first time, and the second time window is the time window from the specified time to a second time.
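The relation between the specified time, the two windows, and label availability can be sketched as follows. This is an assumed labeling rule for illustration, not one stated in the disclosure: a default observed inside the window yields label 1, a fully elapsed window without default yields 0, and a window that has not yet elapsed yields a null label. The window lengths in days and all names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def window_label(
    specified_time: date,
    default_on: Optional[date],
    window_days: int,
    today: date,
) -> Optional[int]:
    """Risk label for the window [specified_time, specified_time + window_days]."""
    window_end = specified_time + timedelta(days=window_days)
    if default_on is not None and specified_time < default_on <= window_end:
        return 1  # default observed inside the window
    if today < window_end:
        return None  # window still open: label not yet known
    return 0  # window elapsed with no default inside it
```

With a first window of, say, 180 days and a second window of 30 days, a user observed two months ago already has a second-window label but still has a null first-window label, which is the label lag that the multi-task training is designed to tolerate.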

FIG. 6 shows a block diagram of a training device for a risk prediction model according to some embodiments of the present disclosure.

As shown in FIG. 6, the training device 6 of the risk prediction model includes an acquisition module 61, a generation module 62, and a training module 63.

The acquisition module 61 is configured to acquire a plurality of training samples, where each training sample includes a risk label and user features, the risk label of at least one of the plurality of training samples includes a first risk label generated in a first time window and a second risk label generated in a second time window, and the length of the first time window is greater than the length of the second time window; for example, it performs step S1 as shown in FIG. 6.

The generation module 62 is configured to generate, for each training sample, a risk prediction result by using the risk prediction model according to the user features, where the risk prediction result of the at least one training sample includes the prediction result of the first risk prediction task corresponding to the first risk label and the prediction result of the second risk prediction task corresponding to the second risk label; for example, it performs step S2 as shown in FIG. 6.

The training module 63 is configured to train the risk prediction model according to the risk prediction results of the plurality of training samples and the corresponding risk labels; for example, it performs step S3 as shown in FIG. 6.

The present disclosure provides a risk prediction device including a generation module. The generation module is configured to generate a risk prediction result for a target user by using a risk prediction model according to the user features of the target user, where the risk prediction model is trained by the training device of the risk prediction model according to any embodiment of the present disclosure.

FIG. 7 shows a block diagram of an electronic device according to other embodiments of the present disclosure.

As shown in FIG. 7, the electronic device 7 includes a memory 71 and a processor 72 coupled to the memory 71. The memory 71 is used to store instructions for executing the training method of the risk prediction model. The processor 72 is configured to execute, based on the instructions stored in the memory 71, the training method of the risk prediction model in any of the embodiments of the present disclosure.

According to the training device and/or electronic device for the risk prediction model of some embodiments of the present disclosure, the first risk labels and second risk labels of some of the training samples are used to train the risk prediction model on multiple tasks at the same time and to obtain prediction results on the first risk prediction task and the second risk prediction task. The model thereby learns the relationship between long-term risk and user features, which reduces the impact of delayed risk label feedback on long-term risk prediction. It can also use more fresh training samples, learning the relationship between short-term risk and user features while increasing the number of usable training samples and achieving data augmentation. The trained risk prediction model can therefore achieve higher accuracy in both long-term and short-term risk prediction.

As shown in FIG. 8, the computer system 80 may take the form of a general-purpose computing device. The computer system 80 includes a memory 810, a processor 820, and a bus 800 that connects the different system components.

The memory 810 may include, for example, a system memory and a non-volatile storage medium. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs. The system memory may include volatile storage media, such as random access memory (RAM) and/or cache memory. The non-volatile storage medium stores, for example, instructions for executing the training method of the risk prediction model in any of the embodiments of the present disclosure. Non-volatile storage media include, but are not limited to, magnetic disk storage, optical storage, and flash memory.

The processor 820 may be implemented as a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, or discrete hardware components such as discrete gates or transistors. Accordingly, each module, such as the judging module and the determining module, may be implemented by a central processing unit (CPU) running instructions in the memory that perform the corresponding steps, or by a dedicated circuit that performs the corresponding steps.

The bus 800 may use any of a variety of bus structures. For example, the bus structures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, and a Peripheral Component Interconnect (PCI) bus.

The computer system 80 may also include an input/output interface 830, a network interface 840, a storage interface 850, and the like. These interfaces 830, 840, and 850, the memory 810, and the processor 820 may be connected to one another through the bus 800. The input/output interface 830 can provide a connection interface for input/output devices such as a display, a mouse, and a keyboard. The network interface 840 provides a connection interface for various networked devices. The storage interface 850 provides a connection interface for external storage devices such as floppy disks, USB flash drives, and SD cards.

Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, devices, and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable device to produce a machine, such that the instructions, when executed by the processor, create means for implementing the functions specified in one or more blocks of the flowcharts and/or block diagrams.

These computer-readable program instructions may also be stored in a computer-readable memory. The instructions cause a computer to operate in a specific manner, thereby producing an article of manufacture that includes instructions implementing the functions specified in one or more blocks of the flowcharts and/or block diagrams.

The present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.

The training method of the risk prediction model, the risk prediction method and device, the electronic device, and the computer-readable storage medium of the above embodiments improve the accuracy of the risk prediction model in both long-term and short-term risk prediction.

The training method and device for a risk prediction model and the computer-readable storage medium according to the present disclosure have now been described in detail. Some details well known in the art have not been described in order to avoid obscuring the concept of the present disclosure. From the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed herein.

Claims

1. A method of training a risk prediction model, comprising:

obtaining a plurality of training samples, wherein each training sample comprises a risk label and a user characteristic, the risk label of at least one training sample in the plurality of training samples comprises a first risk label generated in a first time window and a second risk label generated in a second time window, and the length of the first time window is greater than that of the second time window;

generating a risk prediction result by using a risk prediction model according to user characteristics for each training sample, wherein the risk prediction result of at least one training sample comprises a prediction result of a first risk prediction task corresponding to a first risk label and a prediction result of a second risk prediction task corresponding to a second risk label;

and training the risk prediction model according to the risk prediction results of the plurality of training samples and the corresponding risk labels.

2. The training method of the risk prediction model according to claim 1, wherein training the risk prediction model according to the risk prediction results and the corresponding risk labels of the plurality of training samples comprises:

for each training sample in the at least one training sample, calculating a loss function of each training sample in the at least one training sample according to the first risk label, the second risk label, the prediction result of the first risk prediction task and the prediction result of the second risk prediction task;

and training the risk prediction model according to the loss functions of the plurality of training samples.

3. The training method of the risk prediction model according to claim 2, wherein for each training sample in the at least one training sample, calculating a loss function of each training sample in the at least one training sample according to the first risk label, the second risk label, the prediction result of the first risk prediction task and the prediction result of the second risk prediction task comprises:

for each training sample in the at least one training sample, calculating a first loss function of each training sample in the at least one training sample according to the first risk label and the prediction result of the first risk prediction task;

for each training sample in the at least one training sample, calculating a second loss function of each training sample in the at least one training sample according to a second risk label and a prediction result of a second risk prediction task;

and calculating the loss function of each training sample in the at least one training sample according to the first loss function and the second loss function of that training sample.

4. The method of training a risk prediction model according to claim 3, wherein calculating the loss function of each training sample in the at least one training sample according to the first loss function and the second loss function of that training sample comprises:

calculating the loss function of each training sample in the at least one training sample according to the sum of the first loss function and the second loss function of that training sample.

5. The method of training a risk prediction model according to any one of claims 1 to 4, wherein the user features of the at least one training sample are extracted at a specified time, the first time window being a time window from the specified time to a first time, and the second time window being a time window from the specified time to a second time.

6. The method of training a risk prediction model according to any one of claims 1 to 4, further comprising at least one of:

filling missing values in the user characteristics with default values;

filling missing values in the user characteristics by using a mean filling method;

filling missing values in the user characteristics by using a nearby-value completion method;

filling missing values in the user characteristics by using a nearest neighbor method;

and filling missing values in the user characteristics by using a multiple imputation method.

7. The method of training a risk prediction model according to any one of claims 1 to 4, further comprising screening out at least one of the following user characteristics:

user features whose missing rate is greater than a first threshold;

user features whose discrimination with respect to the training samples is lower than a second threshold;

user features whose variance is greater than a third threshold.

8. The method of training a risk prediction model according to any one of claims 1 to 4, further comprising:

discretizing the user characteristics.

9. The method of training a risk prediction model according to any one of claims 1 to 4, further comprising:

validating the risk prediction model using validation samples, wherein the validation samples are generated at a different time from the training samples.

10. The training method of the risk prediction model according to any one of claims 1 to 4, wherein the risk prediction model is a multi-task deep neural network model or a tree model.

11. A method of risk prediction, comprising:

generating a risk prediction result for the target user by using a risk prediction model according to the user characteristics of the target user, wherein the risk prediction model is obtained by training according to the training method of the risk prediction model according to any one of claims 1-10.

12. A training apparatus for a risk prediction model, comprising:

an acquisition module configured to acquire a plurality of training samples, wherein each training sample comprises a risk label and a user characteristic, the risk label of at least one training sample in the plurality of training samples comprises a first risk label generated in a first time window and a second risk label generated in a second time window, and the length of the first time window is greater than that of the second time window;

a generating module configured to generate, for each training sample, a risk prediction result by using a risk prediction model according to the user characteristics, wherein the risk prediction result of the at least one training sample comprises a prediction result of a first risk prediction task corresponding to the first risk label and a prediction result of a second risk prediction task corresponding to the second risk label;

and a training module configured to train the risk prediction model according to the risk prediction results of the plurality of training samples and the corresponding risk labels.

13. A risk prediction device comprising:

a generating module configured to generate a risk prediction result for the target user by using a risk prediction model according to the user characteristics of the target user, wherein the risk prediction model is obtained by training with the training device of the risk prediction model according to claim 12.

14. An electronic device, comprising:

a memory; and

a processor coupled to the memory, the processor configured to perform, based on instructions stored in the memory, a method of training a risk prediction model according to any one of claims 1 to 10, or a method of risk prediction according to claim 11.

15. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement a method of training a risk prediction model according to any one of claims 1 to 10, or a method of risk prediction according to claim 11.