CN109978179A - Model training method and device, electronic equipment and readable storage medium - Google Patents

Model training method and device, electronic equipment and readable storage medium

Info

Publication number
CN109978179A
CN109978179A (application CN201910271480.8A)
Authority
CN
China
Prior art keywords
base model
model type
training
model
training data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910271480.8A
Other languages
Chinese (zh)
Inventor
赵呈路
李雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lazas Network Technology Shanghai Co Ltd
Original Assignee
Lazas Network Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lazas Network Technology Shanghai Co Ltd filed Critical Lazas Network Technology Shanghai Co Ltd
Priority to CN201910271480.8A priority Critical patent/CN109978179A/en
Publication of CN109978179A publication Critical patent/CN109978179A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure disclose a model training method and device, an electronic device, and a readable storage medium. According to the technical solution, the base models used in a combined model and the combination coefficients corresponding to the base models used can be determined automatically, which can improve the efficiency of parameter tuning during model training and improve the accuracy and objectivity of the model.

Description

Model training method and device, electronic equipment and readable storage medium
Technical field
The present disclosure relates to the field of computer technology, and in particular to a model training method and device, an electronic device, and a readable storage medium.
Background technique
To improve the prediction accuracy of models in machine learning, engineers typically combine multiple base models to improve the generalization ability of the model.
In the course of making the present disclosure, the inventors found that model combination in the prior art typically requires engineers to first train multiple base models separately, then select and combine the trained base models, and finally train the model combination to tune its parameters. Existing model training is therefore time-consuming and labor-intensive, which seriously affects the efficiency of model training.
Summary of the invention
To solve the problems in the related art, embodiments of the present disclosure provide a model training method and device, an electronic device, and a readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a model training method.
Specifically, the model training method comprises:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining model parameters of each base model;
based on the second training data, determining, by a greedy algorithm, the base models used in a combined model and the combination coefficients corresponding to the base models used.
With reference to the first aspect, in a first implementation of the first aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
With reference to the first implementation of the first aspect, in a third implementation of the first aspect of the present disclosure, training the multiple base models based on the first training data comprises:
processing the first training data with a gradient boosted tree model to obtain intermediate training data;
removing low-correlation features from the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
With reference to the first aspect, in a fifth implementation of the first aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the coefficients corresponding to the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model, or until the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the base model whose addition yields the best combined-model performance, and the combination coefficients of the base models in the combined model are determined after each base model is added;
outputting the base models used in the combined model and the combination coefficients corresponding to the base models used.
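The greedy selection procedure described above can be sketched in a few lines. This is a minimal illustrative sketch, not the patent's implementation: the `evaluate` function stands in for scoring a candidate combination on the second training data, and the toy performance table is invented for demonstration.

```python
# Minimal sketch of greedy forward selection of base models: start from the
# best single model, then repeatedly add the model that most improves the
# combination, stopping when no addition helps or all models are used.
def greedy_select(base_models, evaluate):
    """Greedily grow a combined model until adding a model stops helping."""
    remaining = list(base_models)
    # Step 1: start from the single best-performing base model.
    first = max(remaining, key=lambda m: evaluate([m]))
    combo, score = [first], evaluate([first])
    remaining.remove(first)
    # Step 2: add the model whose addition yields the best combined score.
    while remaining:
        best = max(remaining, key=lambda m: evaluate(combo + [m]))
        new_score = evaluate(combo + [best])
        if new_score <= score:  # adding a new model no longer helps
            break
        combo.append(best)
        remaining.remove(best)
        score = new_score
    return combo

# Toy performance table: "a" and "b" complement each other; "c" adds nothing.
TABLE = {("a",): 0.6, ("b",): 0.5, ("c",): 0.3,
         ("a", "b"): 0.8, ("a", "c"): 0.55, ("b", "c"): 0.5,
         ("a", "b", "c"): 0.75}

def evaluate(models):
    return TABLE[tuple(sorted(models))]

print(greedy_select(["a", "b", "c"], evaluate))  # ['a', 'b']
```

Because each step only ever adds the locally best model, the loop runs at most n rounds for n base models, which is what keeps the procedure cheap compared with evaluating every subset.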
With reference to the first aspect, in a sixth implementation of the first aspect of the present disclosure, the model training method further comprises:
removing low-correlation features from raw data to obtain preprocessed data;
splitting the preprocessed data randomly or by time to obtain the first training data, the second training data, and test data.
With reference to the sixth implementation of the first aspect, a seventh implementation of the first aspect of the present disclosure further comprises:
verifying the combined model based on the test data.
With reference to the first aspect, in an eighth implementation of the first aspect of the present disclosure, the first training data and the second training data include data related to user profiles;
the combined model and the base models are used to make predictions based on the data related to user profiles.
In a second aspect, an embodiment of the present disclosure provides a model training device, comprising:
an acquisition module configured to obtain first training data and second training data;
a first determining module configured to train multiple base models based on the first training data and determine model parameters of each base model;
a second determining module configured to determine, based on the second training data and by a greedy algorithm, the base models used in a combined model and the combination coefficients corresponding to the base models used.
With reference to the second aspect, in a first implementation of the second aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
With reference to the first implementation of the second aspect, in a second implementation of the second aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
With reference to the first implementation of the second aspect, in a third implementation of the second aspect of the present disclosure, training the multiple base models based on the first training data comprises:
processing the first training data with a gradient boosted tree model to obtain intermediate training data;
removing low-correlation features from the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
With reference to the third implementation of the second aspect, in a fourth implementation of the second aspect of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
With reference to the fourth implementation of the second aspect, in a fifth implementation of the second aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model, or until the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the base model whose addition yields the best combined-model performance, and the combination coefficients of the base models in the combined model are determined after each base model is added;
outputting the base models used in the combined model and the combination coefficients corresponding to the base models used.
With reference to the second aspect, in a sixth implementation of the second aspect of the present disclosure, the model training device further includes: a removal module configured to remove low-correlation features from raw data to obtain preprocessed data;
a splitting module configured to split the preprocessed data randomly or by time to obtain the first training data, the second training data, and test data.
With reference to the sixth implementation of the second aspect, in a seventh implementation of the second aspect of the present disclosure, the model training device further includes:
a verification module configured to verify the combined model based on the test data.
With reference to the second aspect, in an eighth implementation of the second aspect of the present disclosure, the first training data and the second training data include data related to user profiles;
the combined model and the base models are used to make predictions based on the data related to user profiles.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including a memory and a processor, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the following method steps:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining model parameters of each base model;
based on the second training data, determining, by a greedy algorithm, the base models used in a combined model and the combination coefficients corresponding to the base models used.
With reference to the third aspect, in a first implementation of the third aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
With reference to the first implementation of the third aspect, in a second implementation of the third aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
With reference to the first implementation of the third aspect, in a third implementation of the third aspect of the present disclosure, the first training data is processed with a gradient boosted tree model to obtain intermediate training data;
low-correlation features are removed from the intermediate training data to obtain third training data;
the multiple base models are trained based on the third training data.
With reference to the third implementation of the third aspect, in a fourth implementation of the third aspect of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
With reference to the third aspect, in a fifth implementation of the third aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model, or until the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the base model whose addition yields the best combined-model performance, and the combination coefficients of the base models in the combined model are determined after each base model is added;
outputting the base models used in the combined model and the combination coefficients corresponding to the base models used.
With reference to the third aspect, in a sixth implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method steps: removing low-correlation features from raw data to obtain preprocessed data;
splitting the preprocessed data randomly or by time to obtain the first training data, the second training data, and test data.
With reference to the sixth implementation of the third aspect, in a seventh implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method step:
verifying the combined model based on the test data.
With reference to the third aspect, in an eighth implementation of the third aspect of the present disclosure, the first training data and the second training data include data related to user profiles;
the combined model and the base models are used to make predictions based on the data related to user profiles.
In a fourth aspect, an embodiment of the present disclosure provides a readable storage medium having computer instructions stored thereon, the computer instructions, when executed by a processor, implementing the method of the first aspect or any one of the first through eighth implementations of the first aspect.
The technical solutions provided by the embodiments of the present disclosure can include the following beneficial effects:
according to the technical solutions provided by the embodiments of the present disclosure, the base models used in a combined model and the combination coefficients corresponding to the base models used can be determined automatically, which can improve the efficiency of parameter tuning during model training and improve the accuracy and objectivity of the model.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory and do not limit the present disclosure.
Brief description of the drawings
With reference to the accompanying drawings and through the following detailed description of non-limiting embodiments, other features, objects, and advantages of the present disclosure will become more apparent. In the drawings:
Fig. 1 shows a flowchart of a model training method according to an embodiment of the present disclosure;
Fig. 2 shows a flowchart of a model training method according to an embodiment of the present disclosure;
Fig. 3 shows a flowchart of a model training method according to an embodiment of the present disclosure;
Fig. 4 shows a flowchart of training multiple base models according to an embodiment of the present disclosure;
Fig. 5 shows a schematic diagram of a gradient boosted tree model according to an embodiment of the present disclosure;
Fig. 6 shows a flowchart of determining the base models used in a combined model and the coefficients corresponding to the base models used, according to an embodiment of the present disclosure;
Fig. 7 shows an example process of a model training method according to an embodiment of the present disclosure;
Fig. 8 shows a structural block diagram of a model training device according to an embodiment of the present disclosure;
Fig. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure;
Fig. 10 shows a schematic structural diagram of a computer system suitable for implementing the model training method according to an embodiment of the present disclosure.
Detailed description of embodiments
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings, so that those skilled in the art can easily implement them. In addition, for the sake of clarity, parts unrelated to the description of the exemplary embodiments are omitted from the drawings.
In the present disclosure, it should be understood that terms such as "comprising" or "having" are intended to indicate the presence of the features, numbers, steps, actions, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, actions, components, parts, or combinations thereof exist or are added.
It should also be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
As mentioned above, to improve the prediction accuracy of models in machine learning, engineers typically combine multiple base models to improve the generalization ability of the model. In the course of making the present disclosure, the inventors found that model combination in the prior art typically requires engineers to first train multiple base models separately, then select and combine the trained base models, and finally train the model combination to tune its parameters, so that existing model training is time-consuming and labor-intensive, which seriously affects the efficiency of model training.
In view of the above drawbacks, the technical solution provided by the embodiments of the present disclosure obtains first training data and second training data, trains multiple base models based on the first training data, determines model parameters of each base model, and, based on the second training data, determines by a greedy algorithm the base models used in a combined model and the combination coefficients corresponding to the base models used. This technical solution can automatically determine the base models used in the combined model and the combination coefficients corresponding to the base models used, which can improve the efficiency of parameter tuning during model training, reduce manual intervention in the model, and improve the generalization ability of the model.
Fig. 1 shows a flowchart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 1, the model training method includes the following steps S101 to S103.
In step S101, first training data and second training data are obtained.
In step S102, multiple base models are trained based on the first training data, and model parameters of each base model are determined.
In step S103, based on the second training data, the base models used in a combined model and the combination coefficients corresponding to the base models used are determined by a greedy algorithm.
According to an embodiment of the present disclosure, for example, suppose there are multiple base models M1, M2, ..., Mn, with n ≥ 2. The base models M1, M2, ..., Mn are first trained separately based on the first training data, and the base models used in the combined model M and the combination coefficients corresponding to the base models used are then determined based on a greedy algorithm. Depending on actual needs and model performance, the combined model M may include all or only some of the base models M1, M2, ..., Mn. The combination coefficient of a base model in the combined model can represent the weight of that base model in the combined model. For example, if the combined model M includes base models M1, M2, and M3 with combination coefficients m1, m2, and m3, respectively, then M = m1*M1 + m2*M2 + m3*M3.
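The weighted combination M = m1*M1 + m2*M2 + m3*M3 above can be illustrated with stand-in base models. This is a minimal sketch under assumed values: the base models here are placeholder scoring functions and the coefficients are invented for demonstration.

```python
# Minimal sketch of a combined model as a weighted sum of base-model
# predictions, mirroring M = m1*M1 + m2*M2 + m3*M3.
def combined_predict(base_models, coefficients, x):
    """Weighted sum of the base models' predictions on input x."""
    return sum(c * model(x) for model, c in zip(base_models, coefficients))

# Three toy base models returning fixed scores in [0, 1] for any input.
m1 = lambda x: 0.8
m2 = lambda x: 0.5
m3 = lambda x: 0.2
coefficients = [0.5, 0.3, 0.2]  # combination coefficients m1, m2, m3

score = combined_predict([m1, m2, m3], coefficients, x=None)
print(round(score, 2))  # 0.5*0.8 + 0.3*0.5 + 0.2*0.2 = 0.59
```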
According to embodiments of the present disclosure, the base models used in a combined model and the combination coefficients corresponding to the base models used can be determined automatically, which can improve the efficiency of parameter tuning during model training and improve the generalization ability of the model.
The model training method proposed in the present disclosure is widely applicable to the training of various model combinations. The first training data and the second training data may include various kinds of data, and the combined model and the base models can be used for various purposes; the present disclosure is not specifically limited in this respect.
For example, the first training data and the second training data may include data related to user profiles, and the combined model and the base models may be used to make predictions based on the data related to user profiles. For example, the first training data and the second training data may include at least one of the following kinds of data related to user profiles: user attribute data (such as age, gender, and occupation), user behavior data (such as consumption preferences and browsing preferences), and data on the user's interactions with other entities (such as placing orders, adding to favorites, and returns or exchanges). The combined model and the base models can be used to predict user behavior based on the data related to user profiles, such as the probability that a user clicks a recommended item in a recommendation list.
Fig. 2 shows a flowchart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 2, according to an embodiment of the present disclosure, in addition to steps S101 to S103, the model training method further includes steps S104 and S105.
In step S104, low-correlation features are removed from raw data to obtain preprocessed data.
In step S105, the preprocessed data is split randomly or by time to obtain the first training data, the second training data, and test data.
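Step S105 can be sketched roughly as follows. This is an illustrative sketch, assuming records carry a timestamp field (here called `ts`) and a 60/20/20 split ratio; neither the field name nor the ratios are specified by the disclosure.

```python
import random

# Minimal sketch of splitting preprocessed records into first training
# data, second training data, and test data, either randomly or by time.
def split_data(records, ratios=(0.6, 0.2, 0.2), by_time=False, seed=0):
    if by_time:
        ordered = sorted(records, key=lambda r: r["ts"])  # temporal split
    else:
        # random.sample of the full length produces a shuffled copy
        ordered = random.Random(seed).sample(records, len(records))
    n = len(ordered)
    a = int(n * ratios[0])
    b = a + int(n * ratios[1])
    return ordered[:a], ordered[a:b], ordered[b:]

data = [{"ts": t, "x": 2 * t} for t in range(10)]
first_train, second_train, test = split_data(data, by_time=True)
print(len(first_train), len(second_train), len(test))  # 6 2 2
```

A time-based split keeps the test data strictly later than the training data, which matters when the model will be used to predict future behavior.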
According to an embodiment of the present disclosure, a low-correlation feature can be a feature that is irrelevant (or barely relevant) to the current model training, or a redundant feature that can be derived from other features; for example, if the length and width of a rectangle are known, the area of the rectangle can be regarded as a redundant feature.
According to an embodiment of the present disclosure, removing the low-correlation features from the raw data includes removing them by feature-inspection methods such as variance tests and correlation tests, for example, removing features whose variance is zero and removing features whose correlation is below a threshold. In this way, dimension explosion can be effectively avoided, which on the one hand reduces the memory footprint of the data and improves the efficiency of model training, and on the other hand effectively avoids overfitting during model training caused by excessively sparse features.
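The variance and correlation checks described above can be sketched as follows; the 0.5 correlation threshold, the helper names, and the toy data are illustrative assumptions, not values from the disclosure.

```python
# Minimal sketch of low-correlation feature removal: drop zero-variance
# features, then drop features whose absolute Pearson correlation with the
# target falls below a threshold.
def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def remove_low_correlation_features(rows, target, corr_threshold=0.5):
    """Return (filtered rows, kept column indices)."""
    columns = list(zip(*rows))
    keep = []
    for j, col in enumerate(columns):
        if max(col) == min(col):                 # zero variance: remove
            continue
        if abs(pearson(col, target)) >= corr_threshold:
            keep.append(j)                       # sufficiently correlated
    return [[row[j] for j in keep] for row in rows], keep

# Column 0 is constant, column 1 tracks the target, column 2 is weakly related.
X = [[1.0, 0.0, 5.0],
     [1.0, 1.0, 1.0],
     [1.0, 2.0, 4.0],
     [1.0, 3.0, 2.0]]
y = [0.0, 1.0, 2.0, 3.0]
X_filtered, kept = remove_low_correlation_features(X, y)
print(kept)  # [1]
```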
Fig. 3 shows a flowchart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 3, according to an embodiment of the present disclosure, in addition to steps S101 to S103, the model training method further includes step S106.
In step S106, the combined model is verified based on the test data.
According to an embodiment of the present disclosure, the test data differs from both the first training data and the second training data; that is, the test data was not used in steps S101 to S103. In this way, the fidelity of the verification results can be improved and the risk of overfitting effectively reduced, thereby improving the accuracy of the model.
Fig. 4 shows a flowchart of training multiple base models according to an embodiment of the present disclosure.
As shown in Fig. 4, training the multiple base models based on the first training data includes steps S201 to S203.
In step S201, the first training data is processed with a gradient boosted tree model to obtain intermediate training data.
In step S202, low-correlation features are removed from the intermediate training data to obtain third training data.
In step S203, the multiple base models are trained based on the third training data.
According to an embodiment of the present disclosure, the gradient boosted tree model (Gradient Boosting Decision Tree model, GBDT model for short) includes multiple regression trees and is an iterative decision-tree model.
According to an embodiment of the present disclosure, processing the first training data with the gradient boosted tree model to obtain intermediate training data may first process the first training data with the gradient boosted trees, then record the leaf nodes of each regression tree using one-hot encoding, that is, the leaf node in which the first training data falls is marked 1 and the rest are marked 0, and then combine all the encodings to obtain the intermediate training data.
An example process of processing the first training data with the gradient boosted tree model to obtain intermediate training data according to an embodiment of the present disclosure is described below with reference to Fig. 5.
Fig. 5 shows a schematic diagram of a gradient boosted tree model according to an embodiment of the present disclosure.
For convenience of description, the explanation is given for a gradient boosted tree model T that includes three regression trees in Fig. 5. It should be understood that this example is merely illustrative and does not limit the present disclosure; the gradient boosted tree model in the present disclosure may also consist of two or more regression trees. In addition, the depth and the number of leaf nodes of each regression tree can be set according to actual needs, and the present disclosure is not specifically limited in this respect.
As shown in Fig. 5, the gradient boosted tree model T includes three regression trees T1, T2, and T3, where regression trees T1 and T3 each have three leaf nodes and regression tree T2 has two leaf nodes. Each leaf node corresponds to a feature. After the first training data S1 is processed by the gradient boosted tree model T, the resulting intermediate training data X1 is an eight-dimensional feature vector.
Suppose that after being processed by the gradient boosted tree model T, the first training data S1 falls in the first leaf node of regression tree T1; the resulting encoding a1 is [1, 0, 0], indicating that the first training data S1 includes the feature corresponding to the first leaf node of regression tree T1. The first training data S1 falls in the second leaf node of regression tree T2, so the encoding a2 is [0, 1]; and it falls in the third leaf node of regression tree T3, so the encoding a3 is [0, 0, 1]. By combining encodings a1, a2, and a3, the intermediate training data X1 is obtained as the eight-dimensional vector [1, 0, 0, 0, 1, 0, 0, 1].
In addition, while processing the first training data with the gradient boosted trees, the parameters of the gradient boosted tree model can be adjusted according to the first training data to improve the performance of the gradient boosted tree model.
According to an embodiment of the present disclosure, low-correlation features are removed from the intermediate training data to obtain the third training data. Specifically, low-correlation features can be removed from the intermediate training data by feature inspections such as variance tests and correlation tests, for example, removing features whose variance is zero and removing features whose correlation is below a threshold.
According to an embodiment of the present disclosure, the dimension of the intermediate training data is determined by the number of features, and compared with the intermediate training data, the third training data has the low-correlation features removed. The dimension of the third training data is therefore lower than that of the intermediate training data, which effectively avoids dimension explosion: on the one hand, it reduces the memory footprint of the data and improves the efficiency of model training; on the other hand, since the intermediate data obtained after GBDT processing is a combination of one-hot encodings, and one-hot encoded features are themselves sparse, the features obtained by combining them are even sparser. Therefore, by removing low-correlation features from the intermediate data to obtain lower-dimensional third training data, overfitting during model training caused by excessively sparse features can also be effectively avoided.
In accordance with an embodiment of the present disclosure, the multiple base models include at least one nonlinear model. According to an embodiment of the present disclosure, the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
In accordance with an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data.
In accordance with an embodiment of the present disclosure, the first training data may be cut into a first training set and a first test set. For each base model (for example, a nonlinear model), multiple groups of parameters are first searched based on the first training set by combining greedy search with grid search; then, based on the first test set, the model parameters of the base model are determined by cross-validation.
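As a sketch of this parameter search (the dataset, grid, and model choice are all hypothetical), scikit-learn's `GridSearchCV` combines the grid search over parameter groups with the cross-validation in a single call; the greedy narrowing of the grid mentioned above is omitted for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical stand-in for the first training data S1.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Cut S1 into a first training set and a first test set.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Grid search over a few parameter groups, scored by cross-validation.
grid = {"n_estimators": [10, 50], "max_depth": [2, 4]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=3)
search.fit(X_tr, y_tr)

best_params = search.best_params_     # the base model's parameters
held_out = search.score(X_te, y_te)   # check on the first test set
```

Repeating this search per base model yields the model parameters referred to in the embodiment, with the held-out score guarding against over-fitting the cross-validation itself.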
In accordance with an embodiment of the present disclosure, determining the model parameters of the base model by cross-validation can avoid over-fitting to a certain extent, and also helps extract as much effective information as possible from the limited preprocessed data, thereby improving the generalization ability of the base model.
In accordance with an embodiment of the present disclosure, automatically determining the model parameters of each base model by algorithms such as greedy search and grid search, combined with cross-validation, can effectively reduce manual intervention and improve the efficiency of model training as well as the accuracy and objectivity of the model.
In accordance with an embodiment of the present disclosure, the multiple base models include at least one linear model. According to an embodiment of the present disclosure, the linear model includes a logistic regression model.
In accordance with an embodiment of the present disclosure, training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
In accordance with an embodiment of the present disclosure, the third training data may be cut into a third training set and a third test set. For each base model (for example, a linear model), multiple groups of parameters are first searched based on the third training set by grid search; then, based on the third test set, the model parameters of the base model are determined by cross-validation.
According to an embodiment of the present disclosure, the multiple base models may include base models of multiple different types, which helps make the prediction errors of the base models mutually independent, so that the error rate after the multiple base models are combined is reduced, thereby improving the accuracy and reliability of model training.
Fig. 6 shows a flow chart of determining the base models used in the combined model and the coefficients corresponding to the base models used, according to an embodiment of the present disclosure.
As shown in Fig. 6, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the coefficients corresponding to the base models used includes the following steps S301-S303.
In step S301, based on the second training data, the first model with the best performance among the multiple base models is determined, and the first model is taken as the combined model.
In step S302, the number of base models in the combined model is gradually increased until adding a new base model no longer improves the performance of the model combination, or the number of base models in the combined model equals the total number of the multiple base models. The base model added to the combined model each time is the base model that yields the best combined-model performance after it is added, and the combination coefficient of each base model in the combined model after the addition is determined.
In step S303, the base models used in the combined model and the combination coefficients corresponding to the base models used are output.
In accordance with an embodiment of the present disclosure, the second training data may be cut into a second training set and a second test set. For each combined model, the combined model is first trained based on the second training set, and its performance is then tested based on the second test set.
For example, assume that the model parameters of 30 base models have been determined based on the first training data. Next, based on the second training data, the first model B1 with the best performance among the 30 base models is determined, and the first model B1 is taken as the combined model. Then, the second model B2 that yields the best combined-model performance when combined with the first model B1 is determined among the remaining 29 base models, the combination coefficients of the first model B1 and the second model B2 are determined, and the combined model is updated according to the first model B1, the second model B2 and their combination coefficients. By analogy, the number of base models in the combined model is gradually increased until adding a new base model no longer improves the performance of the model combination, or the number of base models in the combined model equals 30; the base model added to the combined model each time is the base model that yields the best combined-model performance after it is added, and the combination coefficient of each base model in the combined model after the addition is determined. Finally, the base models used in the combined model and the combination coefficients corresponding to the base models used are output.
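The iteration in the example above can be sketched as a greedy forward selection. This toy version (all names and data hypothetical) averages predictions with uniform combination coefficients at each step, whereas the embodiment may fit the coefficients each round; it starts from the best single model and then adds whichever remaining model most improves the combination, stopping when no addition helps.

```python
import numpy as np

def greedy_combine(preds, y_true):
    """preds: dict mapping base-model name to its predictions on S2."""
    def err(p):                                   # mean-squared error on S2
        return float(np.mean((np.asarray(p) - y_true) ** 2))

    chosen = [min(preds, key=lambda n: err(preds[n]))]       # step S301
    best = err(preds[chosen[0]])
    while len(chosen) < len(preds):                          # step S302
        trial = {n: err(np.mean([preds[m] for m in chosen + [n]], axis=0))
                 for n in preds if n not in chosen}
        name = min(trial, key=trial.get)
        if trial[name] >= best:       # a new model no longer helps: stop
            break
        chosen.append(name)
        best = trial[name]
    # uniform coefficients for the base models actually used    (step S303)
    return {n: 1.0 / len(chosen) for n in chosen}

# Toy check: model "A" alone is perfect, so the combination keeps only "A".
y = np.array([0.0, 1.0, 1.0, 0.0])
preds = {"A": y.copy(), "B": 1.0 - y, "C": np.full(4, 0.5)}
coeffs = greedy_combine(preds, y)
```

With 30 base models as in the example, the same loop would grow the combination one model at a time until performance stops improving or all 30 are used.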
In accordance with an embodiment of the present disclosure, using a greedy algorithm to determine the base models used in the combined model and the coefficients corresponding to the base models used helps improve the efficiency of model training and reduce time complexity.
An example flow of the model training method according to an embodiment of the present disclosure is described below with reference to Fig. 7.
Fig. 7 shows an example flow of the model training method according to an embodiment of the present disclosure.
For ease of description, only one nonlinear model and one linear model among the multiple base models are depicted in Fig. 7. It will be appreciated that this example is merely illustrative and does not limit the disclosure; in the disclosure, the numbers of linear models and nonlinear models among the multiple base models can be set according to actual needs, and the disclosure does not specifically limit this.
As shown in Fig. 7, after the raw data is obtained, the low-correlation features in the raw data are first removed to obtain preprocessed data. Then, by randomly cutting the preprocessed data or cutting it by time, the first training data S1, the second training data S2 and the test data U are obtained.
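A brief sketch of the random cut in Fig. 7 (the split proportions and data are hypothetical): two successive `train_test_split` calls produce S1, S2 and the test data U.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 8))         # stand-in for preprocessed data
labels = (data[:, 0] > 0).astype(int)

# 60% first training data S1, 20% second training data S2, 20% test data U.
X1, rest_X, y1, rest_y = train_test_split(data, labels, test_size=0.4,
                                          random_state=0)
X2, XU, y2, yU = train_test_split(rest_X, rest_y, test_size=0.5,
                                  random_state=0)
```

A time-based cut would instead sort the preprocessed data by timestamp and slice it at fixed dates, which avoids leaking future information into training.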
For the first training data S1, the first training data S1 is first processed with the GBDT model T to obtain intermediate training data X1; then the low-correlation features in the intermediate training data X1 are removed to obtain third training data S3.
With continued reference to Fig. 7, the multiple base models include linear models and nonlinear models. After the first training data S1, the second training data S2 and the third training data S3 are obtained, the nonlinear models among the multiple base models are trained based on the first training data S1, and the linear models among the multiple base models are trained based on the third training data, thereby determining the model parameters of each base model.
After the model parameters of the multiple base models are determined, the base models used in the combined model and the combination coefficients corresponding to the base models used are determined by a greedy algorithm based on the second training data S2.
Then, the combined model is verified based on the test data U, so as to assess the generalization ability of the combined model.
Fig. 8 shows a structural block diagram of a model training apparatus 700 according to an embodiment of the present disclosure. The apparatus can be implemented by software, hardware, or a combination of both as part or all of an electronic device.
As shown in Fig. 8, the model training apparatus 700 includes an acquisition module 701, a first determining module 702 and a second determining module 703.
The acquisition module 701 is configured to obtain first training data and second training data;
The first determining module 702 is configured to train multiple base models based on the first training data, and determine the model parameters of each base model;
The second determining module 703 is configured to determine, based on the second training data and by a greedy algorithm, the base models used in a combined model and the combination coefficients corresponding to the base models used.
In accordance with an embodiment of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
The linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
In accordance with an embodiment of the present disclosure, training the multiple base models based on the first training data includes:
processing the first training data with a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
In accordance with an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
In accordance with an embodiment of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the coefficients corresponding to the base models used includes:
based on the second training data, determining the first model with the best performance among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the model combination, or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the base model that yields the best combined-model performance after it is added, and the combination coefficient of each base model in the combined model after the addition is determined;
outputting the base models used in the combined model and the combination coefficients corresponding to the base models used. In accordance with an embodiment of the present disclosure, the apparatus 700 further includes a removal module 704 and a cutting module 705.
The removal module 704 is configured to remove the low-correlation features in raw data to obtain preprocessed data;
The cutting module 705 is configured to obtain the first training data, the second training data and test data by randomly cutting the preprocessed data or cutting it by time.
In accordance with an embodiment of the present disclosure, the apparatus 700 further includes a verification module 706.
The verification module 706 is configured to verify the combined model based on the test data.
In accordance with an embodiment of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on data related to user portraits.
The disclosure also discloses an electronic device. Fig. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in Fig. 9, the electronic device 800 includes a memory 801 and a processor 802. The memory 801 is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor 802 to implement the following method steps:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model;
determining, based on the second training data and by a greedy algorithm, the base models used in a combined model and the combination coefficients corresponding to the base models used.
In accordance with an embodiment of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
In accordance with an embodiment of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
In accordance with an embodiment of the present disclosure, training the multiple base models based on the first training data includes:
processing the first training data with a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
In accordance with an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
In accordance with an embodiment of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the coefficients corresponding to the base models used includes:
based on the second training data, determining the first model with the best performance among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the model combination, wherein the base model added to the combined model each time is the base model that yields the best combined-model performance after it is added, and the combination coefficient of each base model in the combined model after the addition is determined;
outputting the base models used in the combined model and the combination coefficients corresponding to the base models used.
In accordance with an embodiment of the present disclosure, the one or more computer instructions are further executed by the processor 802 to implement the following method steps:
removing the low-correlation features in raw data to obtain preprocessed data;
obtaining the first training data, the second training data and test data by randomly cutting the preprocessed data or cutting it by time.
In accordance with an embodiment of the present disclosure, the one or more computer instructions are further executed by the processor 802 to implement the following method step:
verifying the combined model based on the test data.
In accordance with an embodiment of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on data related to user portraits.
Fig. 10 shows a schematic structural diagram of a computer system suitable for implementing the model training method according to an embodiment of the present disclosure.
As shown in Fig. 10, the computer system 900 includes a central processing unit (CPU) 901, which can execute the various processes in the above embodiments according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the system 900. The CPU 901, the ROM 902 and the RAM 903 are connected to one another through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The I/O interface 905 is connected to the following components: an input portion 906 including a keyboard, a mouse and the like; an output portion 907 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker and the like; a storage section 908 including a hard disk and the like; and a communications portion 909 including a network interface card such as a LAN card or a modem. The communications portion 909 performs communication processing via a network such as the Internet. A driver 910 is also connected to the I/O interface 905 as needed. A detachable medium 911, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the driver 910 as needed, so that a computer program read therefrom can be installed into the storage section 908 as needed.
In particular, in accordance with an embodiment of the present disclosure, the method described above may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the method described above. In such an embodiment, the computer program can be downloaded and installed from a network through the communications portion 909, and/or installed from the detachable medium 911.
The flow charts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the disclosure. In this regard, each box in a flow chart or block diagram may represent a module, a program segment or a part of code, and the module, program segment or part of code contains one or more executable instructions for implementing the specified logic function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that indicated in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flow charts, and combinations of boxes in the block diagrams and/or flow charts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure can be implemented by software or by programmable hardware. The described units or modules can also be provided in a processor, and in some cases the names of these units or modules do not constitute a limitation of the units or modules themselves.
As another aspect, the disclosure also provides a readable storage medium, which may be the readable storage medium included in the electronic device or computer system of the above embodiments, or may exist separately without being assembled into a device. The readable storage medium stores one or more programs, which are used by one or more processors to execute the method described in the disclosure.
The above description covers only the preferred embodiments of the disclosure and an explanation of the applied technical principles. Those skilled in the art should appreciate that the scope of the invention involved in the disclosure is not limited to technical solutions formed by the specific combinations of the above technical features, and should also cover, without departing from the inventive concept, other technical solutions formed by any combination of the above technical features or their equivalent features, for example, technical solutions formed by mutually replacing the above features with (but not limited to) technical features having similar functions disclosed in the disclosure.

Claims (10)

1. A model training method, characterized by comprising:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model;
determining, based on the second training data and by a greedy algorithm, the base models used in a combined model and the combination coefficients corresponding to the base models used.
2. The method according to claim 1, characterized in that:
the multiple base models include at least one linear model and/or at least one nonlinear model.
3. The method according to claim 2, characterized in that:
the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
4. The method according to claim 2, characterized in that training the multiple base models based on the first training data includes:
processing the first training data with a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
5. The method according to claim 4, characterized in that:
training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
6. The method according to claim 1, characterized in that determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the coefficients corresponding to the base models used includes:
based on the second training data, determining the first model with the best performance among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the model combination, or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the base model that yields the best combined-model performance after it is added, and the combination coefficient of each base model in the combined model after the addition is determined;
outputting the base models used in the combined model and the combination coefficients corresponding to the base models used.
7. The method according to claim 1, characterized by further comprising:
removing the low-correlation features in raw data to obtain preprocessed data;
obtaining the first training data, the second training data and test data by randomly cutting the preprocessed data or cutting it by time.
8. A model training apparatus, characterized by comprising:
an acquisition module, configured to obtain first training data and second training data;
a first determining module, configured to train multiple base models based on the first training data and determine the model parameters of each base model;
a second determining module, configured to determine, based on the second training data and by a greedy algorithm, the base models used in a combined model and the combination coefficients corresponding to the base models used.
9. An electronic device, characterized by comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method steps of any one of claims 1-7.
10. A readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the method steps of any one of claims 1-7.
CN201910271480.8A 2019-04-04 2019-04-04 Model training method and device, electronic equipment and readable storage medium Pending CN109978179A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910271480.8A CN109978179A (en) 2019-04-04 2019-04-04 Model training method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN109978179A true CN109978179A (en) 2019-07-05

Family
ID=67083015

Country Status (1)

Country Link
CN (1) CN109978179A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765110A (en) * 2019-10-24 2020-02-07 深圳前海微众银行股份有限公司 Generalization capability processing method, device, equipment and storage medium
CN112906554A (en) * 2021-02-08 2021-06-04 智慧眼科技股份有限公司 Model training optimization method and device based on visual image and related equipment
CN112906554B (en) * 2021-02-08 2022-12-23 智慧眼科技股份有限公司 Model training optimization method and device based on visual image and related equipment
CN113139463A (en) * 2021-04-23 2021-07-20 北京百度网讯科技有限公司 Method, apparatus, device, medium and program product for training a model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190705