CN109978179A - Model training method and device, electronic equipment and readable storage medium - Google Patents
- Publication number: CN109978179A (application CN201910271480.8A)
- Authority: CN (China)
- Prior art keywords
- base model
- training
- model
- training data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The embodiments of the present disclosure disclose a model training method, a model training apparatus, an electronic device, and a readable storage medium. According to the technical solution, the base models used in a combined model and the corresponding combination coefficients of the used base models can be determined automatically, which improves parameter-tuning efficiency during model training and improves the accuracy and objectivity of the model.
Description
Technical field
The present disclosure relates to the field of computer technology, and in particular to a model training method and apparatus, an electronic device, and a readable storage medium.
Background technique
In order to improve the prediction precision of models in machine learning, technical staff generally combine multiple base models to improve the generalization ability of the model.
In the process of making the present disclosure, the inventors found that model combination in the prior art usually requires technical staff to first train multiple base models separately, then select and combine the trained base models, and train the combination to adjust its parameters. Existing model training is therefore time-consuming and labor-intensive, which seriously affects the efficiency of model training.
Summary of the invention
In order to solve the problems in the related art, the embodiments of the present disclosure provide a model training method and apparatus, an electronic device, and a readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a model training method.
Specifically, the model training method comprises:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model;
based on the second training data, determining, by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the used base models.
With reference to the first aspect, in a first implementation of the first aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
With reference to the first implementation of the first aspect, in a third implementation of the first aspect of the present disclosure, the training of multiple base models based on the first training data comprises:
processing the first training data with a gradient boosting tree model to obtain intermediate training data;
removing low-correlation features from the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect of the present disclosure, the training of multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
the training of the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
With reference to the first aspect, in a fifth implementation of the first aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the used base models comprises:
based on the second training data, determining the best-performing first model among the multiple base models and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the one whose addition yields the best combined-model performance, and the combination coefficient of each base model in the combined model is determined after each addition;
outputting the base models used in the combined model and the corresponding combination coefficients of the used base models.
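The greedy procedure above can be sketched in code as follows. This is an illustrative sketch only, not the disclosed implementation: the mean-squared-error metric, the candidate coefficient grid, and all function names are assumptions introduced for the example.

```python
def mse(pred, truth):
    """Mean squared error, used here as the (assumed) performance metric."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

def greedy_combine(base_preds, y_true, coef_grid=tuple(i / 10 for i in range(1, 10))):
    """Greedily select base models and their combination coefficients.

    base_preds maps a base-model name to its predictions on the second
    training data. Start from the single best base model, then repeatedly
    add the base model (with a blending weight from coef_grid) that most
    improves the combination, stopping when no addition helps or when all
    base models are used.
    """
    remaining = dict(base_preds)
    # Best-performing single base model becomes the initial combined model.
    first = min(remaining, key=lambda name: mse(remaining[name], y_true))
    combo = {first: 1.0}
    combined = list(remaining.pop(first))
    best_score = mse(combined, y_true)

    while remaining:
        candidate = None
        for name, pred in remaining.items():
            for w in coef_grid:
                blended = [(1 - w) * c + w * p for c, p in zip(combined, pred)]
                score = mse(blended, y_true)
                if score < best_score:
                    best_score, candidate = score, (name, w)
        if candidate is None:  # no new base model improves the combination
            break
        name, w = candidate
        combo = {k: v * (1 - w) for k, v in combo.items()}  # rescale old weights
        combo[name] = w
        pred = remaining.pop(name)
        combined = [(1 - w) * c + w * p for c, p in zip(combined, pred)]
    return combo, best_score
```

For two complementary base models, each wrong on a different sample, the sketch blends them with equal weights; for a dominant single model, it stops immediately with that model alone.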
With reference to the first aspect, in a sixth implementation of the first aspect of the present disclosure, the model training method further comprises:
removing low-correlation features from raw data to obtain preprocessed data;
obtaining the first training data, the second training data, and test data by splitting the preprocessed data randomly or by time.
With reference to the sixth implementation of the first aspect, in a seventh implementation of the first aspect, the present disclosure further comprises:
verifying the combined model based on the test data.
With reference to the first aspect, in an eighth implementation of the first aspect of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on the data related to user portraits.
In a second aspect, an embodiment of the present disclosure provides a model training apparatus, characterized by comprising:
an acquisition module, configured to obtain first training data and second training data;
a first determining module, configured to train multiple base models based on the first training data and determine the model parameters of each base model;
a second determining module, configured to determine, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the used base models.
In conjunction with the second aspect, in a first implementation of the second aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
In conjunction with the first implementation of the second aspect, in a second implementation of the second aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
In conjunction with the first implementation of the second aspect, in a third implementation of the second aspect of the present disclosure, the training of multiple base models based on the first training data comprises:
processing the first training data with a gradient boosting tree model to obtain intermediate training data;
removing low-correlation features from the intermediate training data to obtain third training data;
training the multiple base models based on the third training data.
In conjunction with the third implementation of the second aspect, in a fourth implementation of the second aspect of the present disclosure, the training of multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
the training of the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
In conjunction with the fourth implementation of the second aspect, in a fifth implementation of the second aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the used base models comprises:
based on the second training data, determining the best-performing first model among the multiple base models and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the one whose addition yields the best combined-model performance, and the combination coefficient of each base model in the combined model is determined after each addition;
outputting the base models used in the combined model and the corresponding combination coefficients of the used base models.
In conjunction with the second aspect, in a sixth implementation of the second aspect of the present disclosure, the model training apparatus further includes:
a removal module, configured to remove low-correlation features from raw data to obtain preprocessed data;
a splitting module, configured to obtain the first training data, the second training data, and test data by splitting the preprocessed data randomly or by time.
In conjunction with the sixth implementation of the second aspect, in a seventh implementation of the second aspect of the present disclosure, the model training apparatus further includes:
a verification module, configured to verify the combined model based on the test data.
In conjunction with the second aspect, in an eighth implementation of the second aspect of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on the data related to user portraits.
In a third aspect, an embodiment of the present disclosure provides an electronic device comprising a memory and a processor, wherein the memory stores one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the following method steps:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model;
based on the second training data, determining, by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the used base models.
In conjunction with the third aspect, in a first implementation of the third aspect of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
In conjunction with the first implementation of the third aspect, in a second implementation of the third aspect of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
In conjunction with the first implementation of the third aspect, in a third implementation of the third aspect, the present disclosure processes the first training data with a gradient boosting tree model to obtain intermediate training data;
removes low-correlation features from the intermediate training data to obtain third training data;
and trains the multiple base models based on the third training data.
In conjunction with the third implementation of the third aspect, in a fourth implementation of the third aspect of the present disclosure, the training of multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
the training of the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
In conjunction with the third aspect, in a fifth implementation of the third aspect of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the used base models comprises:
based on the second training data, determining the best-performing first model among the multiple base models and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combined model or the number of base models in the combined model equals the total number of the multiple base models, wherein the base model added to the combined model each time is the one whose addition yields the best combined-model performance, and the combination coefficient of each base model in the combined model is determined after each addition;
outputting the base models used in the combined model and the corresponding combination coefficients of the used base models.
In conjunction with the third aspect, in a sixth implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method steps: removing low-correlation features from raw data to obtain preprocessed data;
obtaining the first training data, the second training data, and test data by splitting the preprocessed data randomly or by time.
In conjunction with the sixth implementation of the third aspect, in a seventh implementation of the third aspect of the present disclosure, the one or more computer instructions are further executed by the processor to implement the following method step:
verifying the combined model based on the test data.
In conjunction with the third aspect, in an eighth implementation of the third aspect of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used for prediction based on the data related to user portraits.
In a fourth aspect, an embodiment of the present disclosure provides a readable storage medium on which computer instructions are stored; when executed by a processor, the computer instructions implement the method of any one of the first aspect and the first through eighth implementations of the first aspect.
The technical solutions provided by the embodiments of the present disclosure can include the following beneficial effects:
According to the technical solutions provided by the embodiments of the present disclosure, the base models used in a combined model and the corresponding combination coefficients of the used base models can be determined automatically, which improves the efficiency of parameter tuning during model training and improves the accuracy and objectivity of the model.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory and do not limit the present disclosure.
Brief description of the drawings
In conjunction with the accompanying drawings and the following detailed description of non-limiting embodiments, other features, purposes, and advantages of the present disclosure will become more apparent. In the drawings:
Fig. 1 shows a flow chart of a model training method according to an embodiment of the present disclosure;
Fig. 2 shows a flow chart of a model training method according to an embodiment of the present disclosure;
Fig. 3 shows a flow chart of a model training method according to an embodiment of the present disclosure;
Fig. 4 shows a flow chart of training multiple base models according to an embodiment of the present disclosure;
Fig. 5 shows a schematic diagram of a gradient boosting tree model according to an embodiment of the present disclosure;
Fig. 6 shows a flow chart of determining the base models used in a combined model and the corresponding coefficients of the used base models according to an embodiment of the present disclosure;
Fig. 7 shows an example process of a model training method according to an embodiment of the present disclosure;
Fig. 8 shows a structural block diagram of a model training apparatus according to an embodiment of the present disclosure;
Fig. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure;
Fig. 10 shows a schematic structural diagram of a computer system suitable for implementing the model training method according to an embodiment of the present disclosure.
Detailed description
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. In addition, for the sake of clarity, parts unrelated to describing the exemplary embodiments are omitted in the drawings.
In the present disclosure, it should be understood that terms such as "comprising" or "having" are intended to indicate the presence of the features, numbers, steps, behaviors, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, behaviors, components, parts, or combinations thereof exist or are added.
It should also be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the drawings and in conjunction with the embodiments.
As mentioned above, in order to improve the prediction precision of models in machine learning, technical staff generally combine multiple base models to improve the generalization ability of the model. In the process of making the present disclosure, the inventors found that model combination in the prior art usually requires technical staff to first train multiple base models separately, then select and combine the trained base models, and train the combination to adjust its parameters; existing model training is therefore time-consuming and labor-intensive and seriously affects the efficiency of model training.
In view of the above drawbacks, the technical solution provided by the embodiments of the present disclosure obtains first training data and second training data, trains multiple base models based on the first training data, determines the model parameters of each base model, and, based on the second training data, determines by a greedy algorithm the base models used in a combined model and the corresponding combination coefficients of the used base models. This technical solution can automatically determine the base models used in the combined model and their corresponding combination coefficients, which improves parameter-tuning efficiency during model training, reduces manual intervention in the model, and improves the generalization ability of the model.
Fig. 1 shows a flow chart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 1, the model training method includes the following steps S101-S103.
In step S101, first training data and second training data are obtained.
In step S102, multiple base models are trained based on the first training data, and the model parameters of each base model are determined.
In step S103, based on the second training data, the base models used in a combined model and the corresponding combination coefficients of the used base models are determined by a greedy algorithm.
According to an embodiment of the present disclosure, suppose, for example, that there are multiple base models M1, M2, ..., Mn, with n >= 2. The base models M1, M2, ..., Mn are first trained separately based on the first training data, and then the base models used by the combined model M and their corresponding combination coefficients are determined based on the greedy algorithm. Depending on actual needs and model performance, the combined model M may include all or only some of the base models M1, M2, ..., Mn. The combination coefficient of a base model in the combined model can represent the weight of that base model in the combined model. For example, suppose the combined model M includes base models M1, M2, and M3, whose combination coefficients are m1, m2, and m3, respectively; then M = m1*M1 + m2*M2 + m3*M3.
According to an embodiment of the present disclosure, the base models used in the combined model and their corresponding combination coefficients can be determined automatically, which improves the efficiency of parameter tuning during model training and improves the generalization ability of the model.
The model training method proposed in the present disclosure is widely applicable to the training of various model combinations. The first training data and the second training data may include various kinds of data, and the combined model and the base models can be used for various purposes; the present disclosure does not specifically limit this.
For example, the first training data and the second training data may include data related to user portraits, and the combined model and the base models are used for prediction based on the data related to user portraits. For example, the first training data and the second training data may include at least one of the following kinds of data related to user portraits: user attribute data (such as age, gender, and occupation), user behavior data (such as consumption preferences and browsing preferences), and data on the user's interactions with other entities (such as placing orders, adding favorites, and returns and exchanges). The combined model and the base models can be used to predict user behavior based on the data related to user portraits, such as the probability that a user clicks a recommended item in a recommendation list.
Fig. 2 shows a flow chart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 2, according to an embodiment of the present disclosure, the model training method further includes steps S104-S105 in addition to steps S101-S103.
In step S104, low-correlation features are removed from the raw data to obtain preprocessed data.
In step S105, the first training data, the second training data, and test data are obtained by splitting the preprocessed data randomly or by time.
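The random-or-temporal split of step S105 can be sketched as below. The split ratios, the seed, and the assumption that each record carries a timestamp are illustrative choices, not values fixed by the disclosure:

```python
import random

def split_data(records, ratios=(0.6, 0.2, 0.2), by_time=False, seed=0):
    """Split preprocessed records into first training data, second
    training data, and test data, either randomly or in time order.

    Each record is assumed to be a (timestamp, features) pair.
    """
    items = list(records)
    if by_time:
        items.sort(key=lambda record: record[0])  # temporal split: oldest first
    else:
        random.Random(seed).shuffle(items)        # random split, reproducible
    n = len(items)
    cut1 = int(n * ratios[0])
    cut2 = cut1 + int(n * ratios[1])
    return items[:cut1], items[cut1:cut2], items[cut2:]
```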
According to an embodiment of the present disclosure, a low-correlation feature can be a feature that is unrelated (or only weakly related) to the current model training; it can also be a redundant feature that can be derived from other features. For example, if the length and width of a rectangle are known, the area of the rectangle can be regarded as a redundant feature.
According to an embodiment of the present disclosure, removing the low-correlation features from the raw data includes removing them by feature-inspection methods such as a variance test and a correlation test, for example, removing features whose variance is zero and removing features whose correlation is below a threshold. In this way, dimensionality explosion can be effectively avoided, which on the one hand reduces the memory occupied by the data and improves the efficiency of model training, and on the other hand effectively avoids overfitting during model training caused by excessively sparse features.
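The variance test and correlation test above can be sketched as follows. The Pearson correlation, the threshold value, and the function names are assumptions for illustration; the disclosure does not fix a specific correlation measure or threshold:

```python
def variance(xs):
    """Population variance of a feature column."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either column is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return cov / denom if denom else 0.0

def filter_features(columns, target, corr_threshold=0.1):
    """Drop zero-variance features and features whose absolute correlation
    with the target falls below the (assumed) threshold."""
    kept = {}
    for name, xs in columns.items():
        if variance(xs) == 0:
            continue  # variance test: a constant feature carries no signal
        if abs(pearson(xs, target)) < corr_threshold:
            continue  # correlation test: nearly unrelated to the target
        kept[name] = xs
    return kept
```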
Fig. 3 shows a flow chart of a model training method according to an embodiment of the present disclosure.
As shown in Fig. 3, according to an embodiment of the present disclosure, the model training method further includes step S106 in addition to steps S101-S103.
In step S106, the combined model is verified based on the test data.
According to an embodiment of the present disclosure, the test data differs from both the first training data and the second training data; that is, the test data was not used in steps S101-S103. In this way, the fidelity of the verification results can be improved and the risk of overfitting effectively reduced, thereby improving the accuracy of the model.
Fig. 4 shows a flow chart of training multiple base models according to an embodiment of the present disclosure.
As shown in Fig. 4, training the multiple base models based on the first training data includes steps S201-S203.
In step S201, the first training data is processed with a gradient boosting tree model to obtain intermediate training data.
In step S202, low-correlation features are removed from the intermediate training data to obtain third training data.
In step S203, the multiple base models are trained based on the third training data.
According to an embodiment of the present disclosure, the gradient boosting tree model (Gradient Boosting Decision Tree model, GBDT model for short) consists of multiple regression trees and is an iterative decision-tree model.
According to an embodiment of the present disclosure, to process the first training data with the gradient boosting tree model and obtain the intermediate training data, the first training data can first be processed with the gradient boosting trees, and one-hot coding can then be used to record the leaf node of each regression tree: the leaf node where the first training data falls is recorded as 1 and the rest as 0, and all the codes are then concatenated to obtain the intermediate training data.
An example process of processing the first training data with the gradient boosting tree model to obtain the intermediate training data according to an embodiment of the present disclosure is described below with reference to Fig. 5.
Fig. 5 shows a schematic diagram of a gradient boosting tree model according to an embodiment of the present disclosure.
For convenience of description, the explanation takes the gradient boosting tree model T in Fig. 5, which includes three regression trees, as an example. It should be understood that this example is merely illustrative and not a limitation of the present disclosure; the gradient boosting tree model in the present disclosure may also consist of two or more regression trees. In addition, the depth and the number of leaf nodes of each regression tree can be set according to actual needs; the present disclosure does not specifically limit this.
As shown in Fig. 5, the gradient boosting tree model T includes three regression trees T1, T2, and T3, where regression trees T1 and T3 each have three leaf nodes and regression tree T2 has two leaf nodes. Each leaf node corresponds to a feature. After the first training data S1 is processed by the gradient boosting tree model T, the resulting intermediate training data X1 is an eight-dimensional feature vector.
Suppose that, after processing by the gradient boosting tree model T, the first training data S1 falls in the first leaf node of regression tree T1; the resulting code a1 is [1, 0, 0], indicating that the first training data S1 includes the feature corresponding to the first leaf node of regression tree T1. The first training data S1 falls in the second leaf node of regression tree T2, giving code a2 = [0, 1], and in the third leaf node of regression tree T3, giving code a3 = [0, 0, 1]. By concatenating code a1, code a2, and code a3, the intermediate training data X1 is obtained as the eight-dimensional vector [1, 0, 0, 0, 1, 0, 0, 1].
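The leaf-encoding example above can be reproduced with a short sketch; the function name is illustrative, and the leaf indices are exactly the ones assumed in the Fig. 5 example:

```python
def one_hot_leaf_encoding(leaf_indices, leaves_per_tree):
    """One-hot encode the leaf a sample falls into for each regression tree,
    then concatenate the per-tree codes into one feature vector."""
    code = []
    for leaf, n_leaves in zip(leaf_indices, leaves_per_tree):
        tree_code = [0] * n_leaves
        tree_code[leaf] = 1  # the leaf the sample reaches is marked 1
        code.extend(tree_code)
    return code

# Sample S1 falls in the 1st leaf of T1, the 2nd leaf of T2, the 3rd leaf of T3.
x1 = one_hot_leaf_encoding(leaf_indices=[0, 1, 2], leaves_per_tree=[3, 2, 3])
print(x1)  # [1, 0, 0, 0, 1, 0, 0, 1]
```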
In addition, in the process of processing the first training data with the gradient boosting trees, the parameters of the gradient boosting tree model can be adjusted according to the first training data so as to improve the performance of the gradient boosting tree model.
According to an embodiment of the present disclosure, the low-correlation features in the intermediate training data are removed to obtain third training data. Specifically, low-correlation features in the intermediate training data can be removed by attribute tests such as a variance test or a correlation test, for example, by removing features whose variance is zero or whose correlation is below a threshold.
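A sketch of this variance/correlation screening, assuming Pearson correlation with the target and an illustrative threshold value (the disclosure does not fix either choice):

```python
import numpy as np

def remove_low_correlation_features(X, y, corr_threshold=0.5):
    """Drop constant (zero-variance) columns, then drop columns whose
    absolute Pearson correlation with the target is below a threshold."""
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0)
    kept = []
    for j in range(X.shape[1]):
        if variances[j] == 0:  # variance test: remove constant features
            continue
        corr = np.corrcoef(X[:, j], y)[0, 1]
        if abs(corr) < corr_threshold:  # correlation test: remove weak features
            continue
        kept.append(j)
    return X[:, kept], kept

# Toy data: column 0 is constant, column 1 tracks the target, column 2 is weak.
X = np.array([[1.0, 0.0, 5.0],
              [1.0, 1.0, 4.0],
              [1.0, 2.0, 6.0],
              [1.0, 3.0, 5.0]])
y = np.array([0.0, 1.0, 2.0, 3.0])
X3, kept = remove_low_correlation_features(X, y)
print(kept)  # [1] - only the informative column survives both tests
```
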
According to an embodiment of the present disclosure, the dimensionality of the intermediate training data is determined by the number of features, and compared with the intermediate training data, the third training data has the low-correlation features removed. The dimensionality of the third training data is therefore lower than that of the intermediate training data, which effectively avoids a dimension explosion. On the one hand, this reduces the memory occupied by the data and improves the efficiency of model training. On the other hand, since the intermediate data obtained after GBDT processing is a combination of one-hot codes, and one-hot encoded features are themselves sparse, their combination is even sparser; by removing the low-correlation features from the intermediate data to obtain lower-dimensional third training data, overfitting caused by excessively sparse features during model training can also be effectively avoided.
According to an embodiment of the present disclosure, the multiple base models include at least one nonlinear model. According to an embodiment of the present disclosure, the nonlinear model includes at least one of an extreme gradient boosting (XGBoost) model, a factorization machine, and a random forest.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data.
According to an embodiment of the present disclosure, the first training data may be split into a first training set and a first test set. For each base model (for example, each nonlinear model), multiple groups of candidate parameters are first found based on the first training set by combining greedy search with grid search; the model parameters of the base model are then determined based on the first test set by cross-validation.
According to an embodiment of the present disclosure, determining the model parameters of a base model by cross-validation can avoid overfitting to a certain extent, and also helps extract as much useful information as possible from the limited preprocessed data, thereby improving the generalization ability of the base model.
According to an embodiment of the present disclosure, automatically determining the model parameters of each base model by algorithms such as greedy search and grid search, combined with cross-validation, can effectively reduce manual intervention and improve the efficiency of model training as well as the accuracy and objectivity of the model.
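The parameter search described above can be sketched with scikit-learn (an assumed library choice; the grid values and the random-forest base model are placeholders, not values prescribed by the disclosure):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the first training data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Split the first training data into a first training set and a first test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Grid search over candidate parameter groups with cross-validation on the
# first training set; the held-out first test set gives a final check.
param_grid = {"n_estimators": [10, 30], "max_depth": [2, 4]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

best_model = search.best_estimator_
test_score = best_model.score(X_test, y_test)
print(search.best_params_, test_score)
```
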
According to an embodiment of the present disclosure, the multiple base models include at least one linear model. According to an embodiment of the present disclosure, the linear model includes a logistic regression model.
According to an embodiment of the present disclosure, training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
According to an embodiment of the present disclosure, the third training data may be split into a third training set and a third test set. For each base model (for example, each linear model), multiple groups of candidate parameters are first found based on the third training set by grid search; the model parameters of the base model are then determined based on the third test set by cross-validation.
According to an embodiment of the present disclosure, the multiple base models may include base models of multiple different types, which helps make the prediction errors of the base models mutually independent, so that the error rate after combining the multiple base models is reduced, thereby improving the accuracy and reliability of model training.
Fig. 6 shows a flowchart of determining the base models used in a combined model and the corresponding coefficients of the base models used, according to an embodiment of the present disclosure.
As shown in Fig. 6, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used includes the following steps S301-S303.
In step S301, based on the second training data, the best-performing first model among the multiple base models is determined, and the first model is taken as the combined model.
In step S302, the number of base models in the combined model is gradually increased until adding a new base model no longer improves the performance of the combination, or the number of base models in the combined model equals the total number of the multiple base models. Each time, the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added.
In step S303, the base models used in the combined model and the corresponding combination coefficients of the base models used are output.
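Steps S301-S303 can be sketched as a greedy forward selection. This is a simplified illustration: the disclosure leaves the coefficient fit open, so uniform averaging weights (coefficient 1/k for k selected models) are an assumption here, and the model names and toy predictions are hypothetical.

```python
import numpy as np

def greedy_select(preds, y_val):
    """Greedy forward selection over base models (steps S301-S303).
    `preds` maps base-model names to validation predictions; models are
    combined by uniform averaging, so each coefficient is 1/len(selected)."""
    def score(names):  # higher is better: negative mean squared error
        avg = np.mean([preds[n] for n in names], axis=0)
        return -np.mean((avg - y_val) ** 2)

    remaining = list(preds)
    # S301: start from the single best-performing base model.
    best = max(remaining, key=lambda n: score([n]))
    selected = [best]
    remaining.remove(best)
    # S302: repeatedly add the model that most improves the combination,
    # stopping when no candidate improves it or all models are used.
    while remaining:
        cand = max(remaining, key=lambda n: score(selected + [n]))
        if score(selected + [cand]) <= score(selected):
            break
        selected.append(cand)
        remaining.remove(cand)
    # S303: output the selected base models and their coefficients.
    return selected, {n: 1.0 / len(selected) for n in selected}

# Hypothetical validation predictions from three already-trained base models.
y_val = np.array([0.0, 1.0, 0.0, 1.0])
preds = {
    "b1": np.array([0.1, 0.9, 0.1, 0.9]),    # biased toward the middle
    "b2": np.array([-0.1, 1.1, -0.1, 1.1]),  # opposite bias: cancels b1's
    "b3": np.array([0.5, 0.5, 0.5, 0.5]),    # uninformative
}
selected, coeffs = greedy_select(preds, y_val)
print(selected, coeffs)  # b1 and b2 combine perfectly; b3 is never added
```
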
According to an embodiment of the present disclosure, the second training data may be split into a second training set and a second test set. For each combined model, the combined model is first trained based on the second training set, and its performance is then tested based on the second test set.
For example, suppose the model parameters of 30 base models have been determined based on the first training data. Next, based on the second training data, the best-performing first model B1 among the 30 base models is determined and taken as the combined model. Then, among the remaining 29 base models, the second model B2 that yields the best-performing combined model when combined with the first model B1 is determined, the combination coefficients of the first model B1 and the second model B2 are determined, and the combined model is updated according to the first model B1, the second model B2, and their combination coefficients. By analogy, the number of base models in the combined model is gradually increased until adding a new base model no longer improves the performance of the combination, or the number of base models in the combined model equals 30, where each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added. Finally, the base models used in the combined model and the corresponding combination coefficients of the base models used are output.
According to an embodiment of the present disclosure, determining the base models used in the combined model and their corresponding coefficients by a greedy algorithm helps improve the efficiency of model training and reduces time complexity.
An example process of the model training method according to an embodiment of the present disclosure is described below with reference to Fig. 7.
Fig. 7 shows an example process of the model training method according to an embodiment of the present disclosure.
For convenience of description, only one nonlinear model and one linear model among the multiple base models are depicted in Fig. 7. It should be understood that this example is merely illustrative and does not limit the disclosure; the numbers of linear and nonlinear models among the multiple base models in the disclosure may be set according to actual needs, and the disclosure does not specifically limit this.
As shown in Fig. 7, after the raw data is obtained, the low-correlation features in the raw data are first removed to obtain preprocessed data. The preprocessed data is then split randomly or by time to obtain the first training data S1, the second training data S2, and the test data U.
For the first training data S1, the first training data S1 is first processed using the GBDT model T to obtain intermediate training data X1; the low-correlation features in the intermediate training data X1 are then removed to obtain third training data S3.
With continued reference to Fig. 7, the multiple base models include a linear model and a nonlinear model. After the first training data S1, the second training data S2, and the third training data S3 are obtained, the nonlinear model among the multiple base models is trained based on the first training data S1, and the linear model among the multiple base models is trained based on the third training data, thereby determining the model parameters of each base model.
After the model parameters of the multiple base models are determined, the base models used in the combined model and the corresponding combination coefficients of the base models used are determined by a greedy algorithm based on the second training data S2.
Then, the combined model is verified based on the test data U, so as to assess the generalization ability of the combined model.
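The S1 → X1 step of the Fig. 7 pipeline can be sketched with scikit-learn (an assumed library choice; the disclosure does not name one). `apply` returns, for each sample, the index of the leaf it reaches in every tree, and one-hot encoding those indices yields the intermediate training data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the first training data S1.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# A small GBDT model T: three trees, as in the Fig. 5 example.
gbdt = GradientBoostingClassifier(n_estimators=3, max_depth=2, random_state=0)
gbdt.fit(X, y)

# apply() gives, per sample, the index of the leaf reached in each tree.
leaf_indices = gbdt.apply(X)[:, :, 0]  # shape: (n_samples, n_trees)

# One-hot encode the leaf index of every tree and concatenate: this is X1.
X1 = OneHotEncoder().fit_transform(leaf_indices).toarray()
print(X1.shape)  # (200, total number of distinct leaves over the three trees)
```
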
Fig. 8 shows a structural block diagram of a model training apparatus 700 according to an embodiment of the present disclosure. The apparatus may be implemented as part or all of an electronic device by software, hardware, or a combination of both.
As shown in Fig. 8, the model training apparatus 700 includes an acquisition module 701, a first determining module 702, and a second determining module 703.
The acquisition module 701 is configured to obtain first training data and second training data;
The first determining module 702 is configured to train multiple base models based on the first training data and determine the model parameters of each base model;
The second determining module 703 is configured to determine, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
The linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data comprises:
processing the first training data using a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data; and
training the multiple base models based on the third training data.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
According to an embodiment of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combination or the number of base models in the combined model equals the total number of the multiple base models, wherein each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added; and
outputting the base models used in the combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the apparatus 700 further includes a removal module 704 and a splitting module 705.
The removal module 704 is configured to remove the low-correlation features in the raw data to obtain preprocessed data;
the splitting module 705 is configured to split the preprocessed data randomly or by time to obtain the first training data, the second training data, and the test data.
According to an embodiment of the present disclosure, the apparatus 700 further includes a verification module 706.
The verification module 706 is configured to verify the combined model based on the test data.
According to an embodiment of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used to make predictions based on data related to user portraits.
The disclosure also discloses an electronic device, and Fig. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in Fig. 9, the electronic device 800 includes a memory 801 and a processor 802. The memory 801 is configured to store one or more computer instructions, where the one or more computer instructions are executed by the processor 802 to implement the following method steps:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model; and
determining, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the multiple base models include at least one linear model and/or at least one nonlinear model.
According to an embodiment of the present disclosure, the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data comprises:
processing the first training data using a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data; and
training the multiple base models based on the third training data.
According to an embodiment of the present disclosure, training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
According to an embodiment of the present disclosure, determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combination, wherein each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added; and
outputting the base models used in the combined model and the corresponding combination coefficients of the base models used.
According to an embodiment of the present disclosure, the one or more computer instructions are further executed by the processor 802 to implement the following method steps:
removing the low-correlation features in the raw data to obtain preprocessed data; and
splitting the preprocessed data randomly or by time to obtain the first training data, the second training data, and the test data.
According to an embodiment of the present disclosure, the one or more computer instructions are further executed by the processor 802 to implement the following method step:
verifying the combined model based on the test data.
According to an embodiment of the present disclosure, the first training data and the second training data include data related to user portraits;
the combined model and the base models are used to make predictions based on data related to user portraits.
Fig. 10 shows a schematic structural diagram of a computer system suitable for implementing the model training method according to an embodiment of the present disclosure.
As shown in Fig. 10, the computer system 900 includes a central processing unit (CPU) 901, which can execute the various processes in the above embodiments according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 909 into a random access memory (RAM) 903. The RAM 903 also stores the various programs and data required for the operation of the system 900. The CPU 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The I/O interface 905 is connected to the following components: an input section 906 including a keyboard, a mouse, and the like; an output section 908 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 908 including a hard disk and the like; and a communications section 909 including a network interface card such as a LAN card or a modem. The communications section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read therefrom can be installed into the storage section 908 as needed.
In particular, according to an embodiment of the present disclosure, the method described above may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the method described above. In such an embodiment, the computer program may be downloaded and installed from a network through the communications section 909, and/or installed from the removable medium 911.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the disclosure. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that shown in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes therein, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the disclosure may be implemented by software or by programmable hardware. The described units or modules may also be provided in a processor, and in some cases the names of these units or modules do not constitute a limitation on the units or modules themselves.
As another aspect, the disclosure also provides a readable storage medium, which may be the readable storage medium included in the electronic device or computer system of the above embodiments, or may exist separately without being assembled into a device. The readable storage medium stores one or more programs, which are used by one or more processors to execute the method described in the disclosure.
The above description is merely a preferred embodiment of the disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but also covers, without departing from the inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the disclosure.
Claims (10)
1. A model training method, characterized by comprising:
obtaining first training data and second training data;
training multiple base models based on the first training data, and determining the model parameters of each base model; and
determining, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the base models used.
2. The method according to claim 1, characterized in that:
the multiple base models include at least one linear model and/or at least one nonlinear model.
3. The method according to claim 2, characterized in that:
the linear model includes a logistic regression model; and/or
the nonlinear model includes at least one of an extreme gradient boosting model, a factorization machine, and a random forest.
4. The method according to claim 2, characterized in that training the multiple base models based on the first training data comprises:
processing the first training data using a gradient boosting tree model to obtain intermediate training data;
removing the low-correlation features in the intermediate training data to obtain third training data; and
training the multiple base models based on the third training data.
5. The method according to claim 4, characterized in that:
training the multiple base models based on the first training data includes training the nonlinear models among the multiple base models based on the first training data; and/or
training the multiple base models based on the third training data includes training the linear models among the multiple base models based on the third training data.
6. The method according to claim 1, characterized in that determining, based on the second training data and by a greedy algorithm, the base models used in the combined model and the corresponding coefficients of the base models used comprises:
based on the second training data, determining the best-performing first model among the multiple base models, and taking the first model as the combined model;
gradually increasing the number of base models in the combined model until adding a new base model no longer improves the performance of the combination or the number of base models in the combined model equals the total number of the multiple base models, wherein each time the base model added to the combined model is the one that yields the best-performing combined model after its addition, and the combination coefficient of each base model in the combined model is determined after the base model is added; and
outputting the base models used in the combined model and the corresponding combination coefficients of the base models used.
7. The method according to claim 1, characterized by further comprising:
removing the low-correlation features in raw data to obtain preprocessed data; and
splitting the preprocessed data randomly or by time to obtain the first training data, the second training data, and test data.
8. A model training apparatus, characterized by comprising:
an acquisition module, configured to obtain first training data and second training data;
a first determining module, configured to train multiple base models based on the first training data and determine the model parameters of each base model; and
a second determining module, configured to determine, based on the second training data and by a greedy algorithm, the base models used in a combined model and the corresponding combination coefficients of the base models used.
9. An electronic device, characterized by comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method steps of any one of claims 1-7.
10. A readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the method steps of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910271480.8A CN109978179A (en) | 2019-04-04 | 2019-04-04 | Model training method and device, electronic equipment and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109978179A true CN109978179A (en) | 2019-07-05 |
Family
ID=67083015
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910271480.8A Pending CN109978179A (en) | 2019-04-04 | 2019-04-04 | Model training method and device, electronic equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109978179A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765110A (en) * | 2019-10-24 | 2020-02-07 | 深圳前海微众银行股份有限公司 | Generalization capability processing method, device, equipment and storage medium |
CN112906554A (en) * | 2021-02-08 | 2021-06-04 | 智慧眼科技股份有限公司 | Model training optimization method and device based on visual image and related equipment |
CN112906554B (en) * | 2021-02-08 | 2022-12-23 | 智慧眼科技股份有限公司 | Model training optimization method and device based on visual image and related equipment |
CN113139463A (en) * | 2021-04-23 | 2021-07-20 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium and program product for training a model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190705 |