CN109978179A - Model training method and device, electronic equipment and readable storage medium - Google Patents
- Publication number: CN109978179A (application CN201910271480.8A)
- Authority: CN (China)
- Prior art keywords
- base model
- training
- model
- training data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The embodiments of the present disclosure disclose a model training method, a model training apparatus, an electronic device, and a readable storage medium. According to the technical solution, the base models used in a combined model and the corresponding combination coefficients of the used base models can be determined automatically, which improves parameter-tuning efficiency during model training and improves the accuracy and objectivity of the model.
Description
Technical field
The present disclosure relates to the field of computer technology, and in particular to a model training method and apparatus, an electronic device, and a readable storage medium.
Background technique
In order to improve the prediction precision of models in machine learning, technical staff generally combine multiple base models to improve the generalization ability of the model.
In the process of making the present disclosure, the inventors found that model combination in the prior art usually requires technical staff to first train multiple base models separately, then select and combine the trained base models, and train the combination to adjust its parameters. Existing model training is therefore time-consuming and labor-intensive, which seriously affects the efficiency of model training.
Summary of the invention
In order to solve the problems in the related art, the embodiments of the present disclosure provide a model training method and apparatus, an electronic device, and a readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a model training method.
Specifically, the model training method comprises:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model;
based on the second training data, determining, by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the used base models.
With reference to the first aspect, in a first implementation of the first aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
With reference to the first implementation of the first aspect, in a third implementation of the first aspect of the present disclosure, the training of multiple base models based on the first training data comprises:
processing the first training data with a gradient boosting tree model to obtain intermediate training data;
removing low-correlation features from the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect of the present disclosure, the training of multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
the training of the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
With reference to the first aspect, in a fifth implementation of the first aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the used base models comprises:
based on the second training data, determining the best-performing first model among the multiple base models and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the one whose addition yields the best combined-model performance, and the combination coefficient of each base model in the combined model is determined after each addition;
outputting the base models used in the combined model and the corresponding combination coefficients of the used base models.
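The greedy procedure above can be sketched in code as follows. This is an illustrative sketch only, not the disclosed implementation: the mean-squared-error metric, the candidate coefficient grid, and all function names are assumptions introduced for the example.

```python
def mse(pred, truth):
    """Mean squared error, used here as the (assumed) performance metric."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

def greedy_combine(base_preds, y_true, coef_grid=tuple(i / 10 for i in range(1, 10))):
    """Greedily select base models and their combination coefficients.

    base_preds maps a base-model name to its predictions on the second
    training data. Start from the single best base model, then repeatedly
    add the base model (with a blending weight from coef_grid) that most
    improves the combination, stopping when no addition helps or when all
    base models are used.
    """
    remaining = dict(base_preds)
    # Best-performing single base model becomes the initial combined model.
    first = min(remaining, key=lambda name: mse(remaining[name], y_true))
    combo = {first: 1.0}
    combined = list(remaining.pop(first))
    best_score = mse(combined, y_true)

    while remaining:
        candidate = None
        for name, pred in remaining.items():
            for w in coef_grid:
                blended = [(1 - w) * c + w * p for c, p in zip(combined, pred)]
                score = mse(blended, y_true)
                if score < best_score:
                    best_score, candidate = score, (name, w)
        if candidate is None:  # no new base model improves the combination
            break
        name, w = candidate
        combo = {k: v * (1 - w) for k, v in combo.items()}  # rescale old weights
        combo[name] = w
        pred = remaining.pop(name)
        combined = [(1 - w) * c + w * p for c, p in zip(combined, pred)]
    return combo, best_score
```

For two complementary base models, each wrong on a different sample, the sketch blends them with equal weights; for a dominant single model, it stops immediately with that model alone.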
With reference to the first aspect, in a sixth implementation of the first aspect of the present disclosure, the model training method further comprises:
removing low-correlation features from raw data to obtain preprocessed data;
obtaining the first training data, the second training data, and test data by splitting the preprocessed data randomly or by time.
With reference to the sixth implementation of the first aspect, in a seventh implementation of the first aspect, the present disclosure further comprises:
verifying the combined model based on the test data.
With reference to the first aspect, in an eighth implementation of the first aspect of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on the data related to user portraits.
In a second aspect, an embodiment of the present disclosure provides a model training apparatus, characterized by comprising:
an acquisition module, configured to obtain first training data and second training data;
a first determining module, configured to train multiple base models based on the first training data and determine the model parameters of each base model;
a second determining module, configured to determine, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the used base models.
In conjunction with the second aspect, in a first implementation of the second aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
In conjunction with the first implementation of the second aspect, in a second implementation of the second aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
In conjunction with the first implementation of the second aspect, in a third implementation of the second aspect of the present disclosure, the training of multiple base models based on the first training data comprises:
processing the first training data with a gradient boosting tree model to obtain intermediate training data;
removing low-correlation features from the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
In conjunction with the third implementation of the second aspect, in a fourth implementation of the second aspect of the present disclosure, the training of multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
the training of the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
In conjunction with the fourth implementation of the second aspect, in a fifth implementation of the second aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the used base models comprises:
based on the second training data, determining the best-performing first model among the multiple base models and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the one whose addition yields the best combined-model performance, and the combination coefficient of each base model in the combined model is determined after each addition;
outputting the base models used in the combined model and the corresponding combination coefficients of the used base models.
In conjunction with the second aspect, in a sixth implementation of the second aspect of the present disclosure, the model training apparatus further includes:
a removal module, configured to remove low-correlation features from raw data to obtain preprocessed data;
a splitting module, configured to obtain the first training data, the second training data, and test data by splitting the preprocessed data randomly or by time.
In conjunction with the sixth implementation of the second aspect, in a seventh implementation of the second aspect of the present disclosure, the model training apparatus further includes:
a verification module, configured to verify the combined model based on the test data.
In conjunction with the second aspect, in an eighth implementation of the second aspect of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on the data related to user portraits.
In a third aspect, an embodiment of the present disclosure provides an electronic device comprising a memory and a processor, wherein the memory stores one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the following method steps:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model;
based on the second training data, determining, by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the used base models.
In conjunction with the third aspect, in a first implementation of the third aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
In conjunction with the first implementation of the third aspect, in a second implementation of the third aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
In conjunction with the first implementation of the third aspect, in a third implementation of the third aspect, the present disclosure processes the first training data with a gradient boosting tree model to obtain intermediate training data;
removes low-correlation features from the intermediate training data to obtain third training data;
and trains the multiple base models based on the third training data.
In conjunction with the third implementation of the third aspect, in a fourth implementation of the third aspect of the present disclosure, the training of multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
the training of the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
In conjunction with the third aspect, in a fifth implementation of the third aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the used base models comprises:
based on the second training data, determining the best-performing first model among the multiple base models and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the one whose addition yields the best combined-model performance, and the combination coefficient of each base model in the combined model is determined after each addition;
outputting the base models used in the combined model and the corresponding combination coefficients of the used base models.
In conjunction with the third aspect, in a sixth implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method steps: removing low-correlation features from raw data to obtain preprocessed data;
obtaining the first training data, the second training data, and test data by splitting the preprocessed data randomly or by time.
In conjunction with the sixth implementation of the third aspect, in a seventh implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method step:
verifying the combined model based on the test data.
In conjunction with the third aspect, in an eighth implementation of the third aspect of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on the data related to user portraits.
In a fourth aspect, an embodiment of the present disclosure provides a readable storage medium on which computer instructions are stored; when executed by a processor, the computer instructions implement the method of any one of the first aspect and the first through eighth implementations of the first aspect.
The technical solutions provided by the embodiments of the present disclosure can include the following beneficial effects:
According to the technical solutions provided by the embodiments of the present disclosure, the base models used in a combined model and the corresponding combination coefficients of the used base models can be determined automatically, which improves the efficiency of parameter tuning during model training and improves the accuracy and objectivity of the model.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory and do not limit the present disclosure.
Brief description of the drawings
In conjunction with the accompanying drawings and the following detailed description of non-limiting embodiments, other features, purposes, and advantages of the present disclosure will become more apparent. In the drawings:
Fig. 1 shows a flow chart of a model training method according to an embodiment of the present disclosure;
Fig. 2 shows a flow chart of a model training method according to an embodiment of the present disclosure;
Fig. 3 shows a flow chart of a model training method according to an embodiment of the present disclosure;
Fig. 4 shows a flow chart of training multiple base models according to an embodiment of the present disclosure;
Fig. 5 shows a schematic diagram of a gradient boosting tree model according to an embodiment of the present disclosure;
Fig. 6 shows a flow chart of determining the base models used in a combined model and the corresponding coefficients of the used base models according to an embodiment of the present disclosure;
Fig. 7 shows an example process of a model training method according to an embodiment of the present disclosure;
Fig. 8 shows a structural block diagram of a model training apparatus according to an embodiment of the present disclosure;
Fig. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure;
Fig. 10 shows a schematic structural diagram of a computer system suitable for implementing the model training method according to an embodiment of the present disclosure.
Detailed description
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. In addition, for the sake of clarity, parts unrelated to describing the exemplary embodiments are omitted in the drawings.
In the present disclosure, it should be understood that terms such as "comprising" or "having" are intended to indicate the presence of the features, numbers, steps, behaviors, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, behaviors, components, parts, or combinations thereof exist or are added.
It should also be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the drawings and in conjunction with the embodiments.
As mentioned above, in order to improve the prediction precision of models in machine learning, technical staff generally combine multiple base models to improve the generalization ability of the model. In the process of making the present disclosure, the inventors found that model combination in the prior art usually requires technical staff to first train multiple base models separately, then select and combine the trained base models, and train the combination to adjust its parameters; existing model training is therefore time-consuming and labor-intensive and seriously affects the efficiency of model training.
In view of the above drawbacks, the technical solution provided by the embodiments of the present disclosure obtains first training data and second training data, trains multiple base models based on the first training data, determines the model parameters of each base model, and, based on the second training data, determines by a greedy algorithm the base models used in a combined model and the corresponding combination coefficients of the used base models. This technical solution can automatically determine the base models used in the combined model and their corresponding combination coefficients, which improves parameter-tuning efficiency during model training, reduces manual intervention in the model, and improves the generalization ability of the model.
Fig. 1 shows a flow chart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 1, the model training method includes the following steps S101-S103.
In step S101, first training data and second training data are obtained.
In step S102, multiple base models are trained based on the first training data, and the model parameters of each base model are determined.
In step S103, based on the second training data, the base models used in a combined model and the corresponding combination coefficients of the used base models are determined by a greedy algorithm.
According to an embodiment of the present disclosure, suppose, for example, that there are multiple base models M1, M2, ..., Mn, with n >= 2. The base models M1, M2, ..., Mn are first trained separately based on the first training data, and then the base models used by the combined model M and their corresponding combination coefficients are determined based on the greedy algorithm. Depending on actual needs and model performance, the combined model M may include all or only some of the base models M1, M2, ..., Mn. The combination coefficient of a base model in the combined model can represent the weight of that base model in the combined model. For example, suppose the combined model M includes base models M1, M2, and M3, whose combination coefficients are m1, m2, and m3, respectively; then M = m1*M1 + m2*M2 + m3*M3.
According to an embodiment of the present disclosure, the base models used in the combined model and their corresponding combination coefficients can be determined automatically, which improves the efficiency of parameter tuning during model training and improves the generalization ability of the model.
The model training method proposed in the present disclosure is widely applicable to the training of various model combinations. The first training data and the second training data may include various kinds of data, and the combined model and the base models can be used for various purposes; the present disclosure does not specifically limit this.
For example, the first training data and the second training data may include data related to user portraits, and the combined model and the base models are used for prediction based on the data related to user portraits. For example, the first training data and the second training data may include at least one of the following kinds of data related to user portraits: user attribute data (such as age, gender, and occupation), user behavior data (such as consumption preferences and browsing preferences), and data on the user's interactions with other entities (such as placing orders, adding favorites, and returns and exchanges). The combined model and the base models can be used to predict user behavior based on the data related to user portraits, such as the probability that a user clicks a recommended item in a recommendation list.
Fig. 2 shows a flow chart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 2, according to an embodiment of the present disclosure, the model training method further includes steps S104-S105 in addition to steps S101-S103.
In step S104, low-correlation features are removed from the raw data to obtain preprocessed data.
In step S105, the first training data, the second training data, and test data are obtained by splitting the preprocessed data randomly or by time.
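The random-or-temporal split of step S105 can be sketched as below. The split ratios, the seed, and the assumption that each record carries a timestamp are illustrative choices, not values fixed by the disclosure:

```python
import random

def split_data(records, ratios=(0.6, 0.2, 0.2), by_time=False, seed=0):
    """Split preprocessed records into first training data, second
    training data, and test data, either randomly or in time order.

    Each record is assumed to be a (timestamp, features) pair.
    """
    items = list(records)
    if by_time:
        items.sort(key=lambda record: record[0])  # temporal split: oldest first
    else:
        random.Random(seed).shuffle(items)        # random split, reproducible
    n = len(items)
    cut1 = int(n * ratios[0])
    cut2 = cut1 + int(n * ratios[1])
    return items[:cut1], items[cut1:cut2], items[cut2:]
```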
According to an embodiment of the present disclosure, a low-correlation feature can be a feature that is unrelated (or only weakly related) to the current model training; it can also be a redundant feature that can be derived from other features. For example, if the length and width of a rectangle are known, the area of the rectangle can be regarded as a redundant feature.
According to an embodiment of the present disclosure, removing the low-correlation features from the raw data includes removing them by feature-inspection methods such as a variance test and a correlation test, for example, removing features whose variance is zero and removing features whose correlation is below a threshold. In this way, dimensionality explosion can be effectively avoided, which on the one hand reduces the memory occupied by the data and improves the efficiency of model training, and on the other hand effectively avoids overfitting during model training caused by excessively sparse features.
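The variance test and correlation test above can be sketched as follows. The Pearson correlation, the threshold value, and the function names are assumptions for illustration; the disclosure does not fix a specific correlation measure or threshold:

```python
def variance(xs):
    """Population variance of a feature column."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either column is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return cov / denom if denom else 0.0

def filter_features(columns, target, corr_threshold=0.1):
    """Drop zero-variance features and features whose absolute correlation
    with the target falls below the (assumed) threshold."""
    kept = {}
    for name, xs in columns.items():
        if variance(xs) == 0:
            continue  # variance test: a constant feature carries no signal
        if abs(pearson(xs, target)) < corr_threshold:
            continue  # correlation test: nearly unrelated to the target
        kept[name] = xs
    return kept
```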
Fig. 3 shows a flow chart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 3, according to an embodiment of the present disclosure, the model training method further includes step S106 in addition to steps S101-S103.
In step S106, the combined model is verified based on the test data.
According to an embodiment of the present disclosure, the test data differs from both the first training data and the second training data; that is, the test data was not used in steps S101-S103. In this way, the fidelity of the verification results can be improved and the risk of overfitting effectively reduced, thereby improving the accuracy of the model.
Fig. 4 shows a flow chart of training multiple base models according to an embodiment of the present disclosure.
As shown in Fig. 4, training the multiple base models based on the first training data includes steps S201-S203.
In step S201, the first training data is processed with a gradient boosting tree model to obtain intermediate training data.
In step S202, low-correlation features are removed from the intermediate training data to obtain third training data.
In step S203, the multiple base models are trained based on the third training data.
According to an embodiment of the present disclosure, the gradient boosting tree model (Gradient Boosting Decision Tree model, GBDT model for short) consists of multiple regression trees and is an iterative decision-tree model.
According to an embodiment of the present disclosure, to process the first training data with the gradient boosting tree model and obtain the intermediate training data, the first training data can first be processed with the gradient boosting trees, and one-hot coding can then be used to record the leaf node of each regression tree: the leaf node where the first training data falls is recorded as 1 and the rest as 0, and all the codes are then concatenated to obtain the intermediate training data.
An example process of processing the first training data with the gradient boosting tree model to obtain the intermediate training data according to an embodiment of the present disclosure is described below with reference to Fig. 5.
Fig. 5 shows a schematic diagram of a gradient boosting tree model according to an embodiment of the present disclosure.
For convenience of description, the explanation takes the gradient boosting tree model T in Fig. 5, which includes three regression trees, as an example. It should be understood that this example is merely illustrative and not a limitation of the present disclosure; the gradient boosting tree model in the present disclosure may also consist of two or more regression trees. In addition, the depth and the number of leaf nodes of each regression tree can be set according to actual needs; the present disclosure does not specifically limit this.
As shown in Fig. 5, the gradient boosting tree model T includes three regression trees T1, T2, and T3, where regression trees T1 and T3 each have three leaf nodes and regression tree T2 has two leaf nodes. Each leaf node corresponds to a feature. After the first training data S1 is processed by the gradient boosting tree model T, the resulting intermediate training data X1 is an eight-dimensional feature vector.
Suppose that, after processing by the gradient boosting tree model T, the first training data S1 falls in the first leaf node of regression tree T1; the resulting code a1 is [1, 0, 0], indicating that the first training data S1 includes the feature corresponding to the first leaf node of regression tree T1. The first training data S1 falls in the second leaf node of regression tree T2, giving code a2 = [0, 1], and in the third leaf node of regression tree T3, giving code a3 = [0, 0, 1]. By concatenating code a1, code a2, and code a3, the intermediate training data X1 is obtained as the eight-dimensional vector [1, 0, 0, 0, 1, 0, 0, 1].
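The leaf-encoding example above can be reproduced with a short sketch; the function name is illustrative, and the leaf indices are exactly the ones assumed in the Fig. 5 example:

```python
def one_hot_leaf_encoding(leaf_indices, leaves_per_tree):
    """One-hot encode the leaf a sample falls into for each regression tree,
    then concatenate the per-tree codes into one feature vector."""
    code = []
    for leaf, n_leaves in zip(leaf_indices, leaves_per_tree):
        tree_code = [0] * n_leaves
        tree_code[leaf] = 1  # the leaf the sample reaches is marked 1
        code.extend(tree_code)
    return code

# Sample S1 falls in the 1st leaf of T1, the 2nd leaf of T2, the 3rd leaf of T3.
x1 = one_hot_leaf_encoding(leaf_indices=[0, 1, 2], leaves_per_tree=[3, 2, 3])
print(x1)  # [1, 0, 0, 0, 1, 0, 0, 1]
```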
In addition, in the process of processing the first training data with the gradient boosting trees, the parameters of the gradient boosting tree model can be adjusted according to the first training data so as to improve the performance of the gradient boosting tree model.
According to an embodiment of the present disclosure, the low-correlation features in the intermediate training data are removed to obtain third training data. Specifically, low-correlation features in the intermediate training data can be removed by attribute tests such as a variance test or a correlation test, for example, by removing features whose variance is zero or whose correlation is below a threshold.
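A sketch of this variance/correlation screening, assuming Pearson correlation with the target and an illustrative threshold value (the disclosure does not fix either choice):

```python
import numpy as np

def remove_low_correlation_features(X, y, corr_threshold=0.5):
    """Drop constant (zero-variance) columns, then drop columns whose
    absolute Pearson correlation with the target is below a threshold."""
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0)
    kept = []
    for j in range(X.shape[1]):
        if variances[j] == 0:  # variance test: remove constant features
            continue
        corr = np.corrcoef(X[:, j], y)[0, 1]
        if abs(corr) < corr_threshold:  # correlation test: remove weak features
            continue
        kept.append(j)
    return X[:, kept], kept

# Toy data: column 0 is constant, column 1 tracks the target, column 2 is weak.
X = np.array([[1.0, 0.0, 5.0],
              [1.0, 1.0, 4.0],
              [1.0, 2.0, 6.0],
              [1.0, 3.0, 5.0]])
y = np.array([0.0, 1.0, 2.0, 3.0])
X3, kept = remove_low_correlation_features(X, y)
print(kept)  # [1] - only the informative column survives both tests
```
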
According to an embodiment of the present disclosure, the dimensionality of the intermediate training data is determined by the number of features, and compared with the intermediate training data, the third training data has the low-correlation features removed. The dimensionality of the third training data is therefore lower than that of the intermediate training data, which effectively avoids a dimension explosion. On the one hand, this reduces the memory occupied by the data and improves the efficiency of model training. On the other hand, since the intermediate data obtained after GBDT processing is a combination of one-hot codes, and one-hot encoded features are themselves sparse, their combination is even sparser; by removing the low-correlation features from the intermediate data to obtain lower-dimensional third training data, overfitting caused by excessively sparse features during model training can also be effectively avoided.
According to an embodiment of the present disclosure, the multiple base models include at least one nonlinear model. According to an embodiment of the present disclosure, the nonlinear model includes at least one of an extreme gradient boosting (XGBoost) model, a factorization machine, and a random forest.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data.
According to an embodiment of the present disclosure, the first training data may be split into a first training set and a first test set. For each base model (for example, each nonlinear model), multiple groups of candidate parameters are first found based on the first training set by combining greedy search with grid search; the model parameters of the base model are then determined based on the first test set by cross-validation.
According to an embodiment of the present disclosure, determining the model parameters of a base model by cross-validation can avoid overfitting to a certain extent, and also helps extract as much useful information as possible from the limited preprocessed data, thereby improving the generalization ability of the base model.
According to an embodiment of the present disclosure, automatically determining the model parameters of each base model by algorithms such as greedy search and grid search, combined with cross-validation, can effectively reduce manual intervention and improve the efficiency of model training as well as the accuracy and objectivity of the model.
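The parameter search described above can be sketched with scikit-learn (an assumed library choice; the grid values and the random-forest base model are placeholders, not values prescribed by the disclosure):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the first training data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Split the first training data into a first training set and a first test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Grid search over candidate parameter groups with cross-validation on the
# first training set; the held-out first test set gives a final check.
param_grid = {"n_estimators": [10, 30], "max_depth": [2, 4]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

best_model = search.best_estimator_
test_score = best_model.score(X_test, y_test)
print(search.best_params_, test_score)
```
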
According to an embodiment of the present disclosure, the multiple base models include at least one linear model. According to an embodiment of the present disclosure, the linear model includes a logistic regression model.
According to an embodiment of the present disclosure, training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
According to an embodiment of the present disclosure, the third training data may be split into a third training set and a third test set. For each base model (for example, each linear model), multiple groups of candidate parameters are first found based on the third training set by grid search; the model parameters of the base model are then determined based on the third test set by cross-validation.
According to an embodiment of the present disclosure, the multiple base models may include base models of multiple different types, which helps make the prediction errors of the base models mutually independent, so that the error rate after combining the multiple base models is reduced, thereby improving the accuracy and reliability of model training.
Fig. 6 shows a flowchart of determining the base models used in a combined model and the corresponding coefficients of the base models used, according to an embodiment of the present disclosure.
As shown in Fig. 6, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used includes the following steps S301-S303.
In step S301, based on the second training data, the best-performing first model among the multiple base models is determined, and the first model is taken as the combined model.
In step S302, the number of base models in the combined model is gradually increased until adding a new base model no longer improves the performance of the combination, or the number of base models in the combined model equals the total number of the multiple base models. Each time, the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added.
In step S303, the base models used in the combined model and the corresponding combination coefficients of the base models used are output.
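Steps S301-S303 can be sketched as a greedy forward selection. This is a simplified illustration: the disclosure leaves the coefficient fit open, so uniform averaging weights (coefficient 1/k for k selected models) are an assumption here, and the model names and toy predictions are hypothetical.

```python
import numpy as np

def greedy_select(preds, y_val):
    """Greedy forward selection over base models (steps S301-S303).
    `preds` maps base-model names to validation predictions; models are
    combined by uniform averaging, so each coefficient is 1/len(selected)."""
    def score(names):  # higher is better: negative mean squared error
        avg = np.mean([preds[n] for n in names], axis=0)
        return -np.mean((avg - y_val) ** 2)

    remaining = list(preds)
    # S301: start from the single best-performing base model.
    best = max(remaining, key=lambda n: score([n]))
    selected = [best]
    remaining.remove(best)
    # S302: repeatedly add the model that most improves the combination,
    # stopping when no candidate improves it or all models are used.
    while remaining:
        cand = max(remaining, key=lambda n: score(selected + [n]))
        if score(selected + [cand]) <= score(selected):
            break
        selected.append(cand)
        remaining.remove(cand)
    # S303: output the selected base models and their coefficients.
    return selected, {n: 1.0 / len(selected) for n in selected}

# Hypothetical validation predictions from three already-trained base models.
y_val = np.array([0.0, 1.0, 0.0, 1.0])
preds = {
    "b1": np.array([0.1, 0.9, 0.1, 0.9]),    # biased toward the middle
    "b2": np.array([-0.1, 1.1, -0.1, 1.1]),  # opposite bias: cancels b1's
    "b3": np.array([0.5, 0.5, 0.5, 0.5]),    # uninformative
}
selected, coeffs = greedy_select(preds, y_val)
print(selected, coeffs)  # b1 and b2 combine perfectly; b3 is never added
```
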
According to an embodiment of the present disclosure, the second training data may be split into a second training set and a second test set. For each combined model, the combined model is first trained based on the second training set, and its performance is then tested based on the second test set.
For example, suppose the model parameters of 30 base models have been determined based on the first training data. Next, based on the second training data, the best-performing first model B1 among the 30 base models is determined and taken as the combined model. Then, among the remaining 29 base models, the second model B2 that yields the best-performing combined model when combined with the first model B1 is determined, the combination coefficients of the first model B1 and the second model B2 are determined, and the combined model is updated according to the first model B1, the second model B2, and their combination coefficients. By analogy, the number of base models in the combined model is gradually increased until adding a new base model no longer improves the performance of the combination, or the number of base models in the combined model equals 30, where each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added. Finally, the base models used in the combined model and the corresponding combination coefficients of the base models used are output.
According to an embodiment of the present disclosure, determining the base models used in the combined model and their corresponding coefficients by a greedy algorithm helps improve the efficiency of model training and reduces time complexity.
An example process of the model training method according to an embodiment of the present disclosure is described below with reference to Fig. 7.
Fig. 7 shows an example process of the model training method according to an embodiment of the present disclosure.
For convenience of description, only one nonlinear model and one linear model among the multiple base models are depicted in Fig. 7. It should be understood that this example is merely illustrative and does not limit the disclosure; the numbers of linear and nonlinear models among the multiple base models in the disclosure may be set according to actual needs, and the disclosure does not specifically limit this.
As shown in Fig. 7, after the raw data is obtained, the low-correlation features in the raw data are first removed to obtain preprocessed data. The preprocessed data is then split randomly or by time to obtain the first training data S1, the second training data S2, and the test data U.
For the first training data S1, the first training data S1 is first processed using the GBDT model T to obtain intermediate training data X1; the low-correlation features in the intermediate training data X1 are then removed to obtain third training data S3.
With continued reference to Fig. 7, the multiple base models include a linear model and a nonlinear model. After the first training data S1, the second training data S2, and the third training data S3 are obtained, the nonlinear model among the multiple base models is trained based on the first training data S1, and the linear model among the multiple base models is trained based on the third training data, thereby determining the model parameters of each base model.
After the model parameters of the multiple base models are determined, the base models used in the combined model and the corresponding combination coefficients of the base models used are determined by a greedy algorithm based on the second training data S2.
Then, the combined model is verified based on the test data U, so as to assess the generalization ability of the combined model.
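The S1 → X1 step of the Fig. 7 pipeline can be sketched with scikit-learn (an assumed library choice; the disclosure does not name one). `apply` returns, for each sample, the index of the leaf it reaches in every tree, and one-hot encoding those indices yields the intermediate training data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the first training data S1.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# A small GBDT model T: three trees, as in the Fig. 5 example.
gbdt = GradientBoostingClassifier(n_estimators=3, max_depth=2, random_state=0)
gbdt.fit(X, y)

# apply() gives, per sample, the index of the leaf reached in each tree.
leaf_indices = gbdt.apply(X)[:, :, 0]  # shape: (n_samples, n_trees)

# One-hot encode the leaf index of every tree and concatenate: this is X1.
X1 = OneHotEncoder().fit_transform(leaf_indices).toarray()
print(X1.shape)  # (200, total number of distinct leaves over the three trees)
```
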
Fig. 8 shows a structural block diagram of a model training apparatus 700 according to an embodiment of the present disclosure. The apparatus may be implemented as part or all of an electronic device by software, hardware, or a combination of both.
As shown in Fig. 8, the model training apparatus 700 includes an acquisition module 701, a first determining module 702, and a second determining module 703.
The acquisition module 701 is configured to obtain first training data and second training data;
The first determining module 702 is configured to train multiple base models based on the first training data and determine the model parameters of each base model;
The second determining module 703 is configured to determine, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
The linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data comprises:
processing the first training data using a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data; and
training the multiple base models based on the third training data.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
According to an embodiment of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combination or the number of base models in the combined model equals the total number of the multiple base models, wherein each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added; and
outputting the base models used in the combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the apparatus 700 further includes a removal module 704 and a splitting module 705.
The removal module 704 is configured to remove the low-correlation features in the raw data to obtain preprocessed data;
the splitting module 705 is configured to split the preprocessed data randomly or by time to obtain the first training data, the second training data, and the test data.
According to an embodiment of the present disclosure, the apparatus 700 further includes a verification module 706.
The verification module 706 is configured to verify the combined model based on the test data.
According to an embodiment of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used to make predictions based on data related to user portraits.
The disclosure also discloses an electronic device, and Fig. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in Fig. 9, the electronic device 800 includes a memory 801 and a processor 802. The memory 801 is configured to store one or more computer instructions, where the one or more computer instructions are executed by the processor 802 to implement the following method steps:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model; and
determining, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
According to an embodiment of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data comprises:
processing the first training data using a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data; and
training the multiple base models based on the third training data.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
According to an embodiment of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combination, wherein each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added; and
outputting the base models used in the combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the one or more computer instructions are further executed by the processor 802 to implement the following method steps:
removing the low-correlation features in the raw data to obtain preprocessed data; and
splitting the preprocessed data randomly or by time to obtain the first training data, the second training data, and the test data.
According to an embodiment of the present disclosure, the one or more computer instructions are further executed by the processor 802 to implement the following method step:
verifying the combined model based on the test data.
According to an embodiment of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used to make predictions based on data related to user portraits.
Fig. 10 shows a schematic structural diagram of a computer system suitable for implementing the model training method according to an embodiment of the present disclosure.
As shown in Fig. 10, the computer system 900 includes a central processing unit (CPU) 901, which can execute the various processes in the above embodiments according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 909 into a random access memory (RAM) 903. The RAM 903 also stores the various programs and data required for the operation of the system 900. The CPU 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The I/O interface 905 is connected to the following components: an input section 906 including a keyboard, a mouse, and the like; an output section 908 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 908 including a hard disk and the like; and a communications section 909 including a network interface card such as a LAN card or a modem. The communications section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read therefrom can be installed into the storage section 908 as needed.
In particular, according to an embodiment of the present disclosure, the method described above may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the method described above. In such an embodiment, the computer program may be downloaded and installed from a network through the communications section 909, and/or installed from the removable medium 911.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the disclosure. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that shown in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes therein, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the disclosure may be implemented by software or by programmable hardware. The described units or modules may also be provided in a processor, and in some cases the names of these units or modules do not constitute a limitation on the units or modules themselves.
As another aspect, the disclosure also provides a readable storage medium, which may be the readable storage medium included in the electronic device or computer system of the above embodiments, or may exist separately without being assembled into a device. The readable storage medium stores one or more programs, which are used by one or more processors to execute the method described in the disclosure.
The above description is merely a preferred embodiment of the disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but also covers, without departing from the inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the disclosure.
Claims (10)
1. A model training method, characterized by comprising:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model; and
determining, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the base models used.
2. The method according to claim 1, characterized in that:
the multiple base models include at least one linear model and/or at least one nonlinear model.
3. The method according to claim 2, characterized in that:
the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
4. The method according to claim 2, characterized in that training the multiple base models based on the first training data comprises:
processing the first training data using a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data; and
training the multiple base models based on the third training data.
5. The method according to claim 4, characterized in that:
training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
6. The method according to claim 1, characterized in that determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combination or the number of base models in the combined model equals the total number of the multiple base models, wherein each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added; and
outputting the base models used in the combined model and the corresponding combination coefficients of the base models used.
7. The method according to claim 1, characterized by further comprising:
removing the low-correlation features in raw data to obtain preprocessed data; and
splitting the preprocessed data randomly or by time to obtain the first training data, the second training data, and test data.
8. A model training apparatus, characterized by comprising:
an acquisition module, configured to obtain first training data and second training data;
a first determining module, configured to train multiple base models based on the first training data and determine the model parameters of each base model; and
a second determining module, configured to determine, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the base models used.
9. An electronic device, characterized by comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method steps of any one of claims 1-7.
10. A readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the method steps of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910271480.8A CN109978179A (en) | 2019-04-04 | 2019-04-04 | Model training method and device, electronic equipment and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109978179A true CN109978179A (en) | 2019-07-05 |
Family
ID=67083015
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910271480.8A Pending CN109978179A (en) | 2019-04-04 | 2019-04-04 | Model training method and device, electronic equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109978179A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765110A (en) * | 2019-10-24 | 2020-02-07 | 深圳前海微众银行股份有限公司 | Generalization capability processing method, device, equipment and storage medium |
CN112906554A (en) * | 2021-02-08 | 2021-06-04 | 智慧眼科技股份有限公司 | Model training optimization method and device based on visual image and related equipment |
CN112906554B (en) * | 2021-02-08 | 2022-12-23 | 智慧眼科技股份有限公司 | Model training optimization method and device based on visual image and related equipment |
CN113139463A (en) * | 2021-04-23 | 2021-07-20 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium and program product for training a model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190705 |