CN107665349A - Method and device for training multiple targets in a classification model - Google Patents

Method and device for training multiple targets in a classification model

Info

Publication number
CN107665349A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610614088.5A
Other languages
Chinese (zh)
Other versions
CN107665349B (en)
Inventor
尤爱华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610614088.5A
Publication of CN107665349A
Application granted
Publication of CN107665349B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Abstract

The invention discloses a method and device for training multiple targets in a classification model, which improve the utilization of machine learning resources and the update efficiency of the classification model. An embodiment of the invention provides a method for training multiple targets in a classification model, including: extracting the training set used for the current training from a training database, the training set including M training samples; obtaining the target field in each of the M training samples in the training set, selecting, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields from N targets, and configuring each selected target value to the training sample containing the corresponding target field; and learning from the M training samples configured with the M target values using a preset classification algorithm to construct a classification model including the N targets, where the N targets correspond to different model parameters in the classification model.

Description

Method and device for training multiple targets in a classification model
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and device for training multiple targets in a classification model.
Background
In the prior art, an advertisement prediction model can be used to compute the predicted click-through rate of an advertisement (PCTR, Predict Click-Through Rate). PCTR is the core technology in advertising algorithms: it estimates the probability that a particular user, in a particular context, clicks a particular advertisement in a particular ad slot. Current advertisement prediction models usually consider only one target, the ad click-through rate, so the click target can be labeled 0/1 in the training data, where 0 denotes an ad exposure and 1 denotes an ad click. This PCTR estimation method suits only traditional advertisement prediction models, in which PCTR estimation does not need to consider factors other than the ad click-through rate. However, advertising keeps expanding into new fields, such as friend-circle advertising. Unlike traditional advertising, PCTR estimation for friend-circle advertising must predict multiple targets besides the ad click-through rate, such as clicks on ad details and clicks to share. With the prior art, training and predicting each target separately would consume a huge amount of machine learning resources and would be hard to implement in engineering terms.
Summary of the Invention
Embodiments of the invention provide a method and device for training multiple targets in a classification model, which improve the utilization of machine learning resources and the update efficiency of the classification model.
To solve the above technical problem, embodiments of the invention provide the following technical solutions:
In a first aspect, an embodiment of the invention provides a method for training multiple targets in a classification model, including:
extracting the training set used for the current training from a training database, the training set including M training samples, where M is a natural number;
obtaining the target field in each of the M training samples in the training set, selecting, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields from N targets, and configuring each selected target value to the training sample containing the corresponding target field, where N is the number of targets to be trained and N is a natural number greater than or equal to 2;
learning from the M training samples configured with the M target values using a preset classification algorithm to construct a classification model including the N targets, where the N targets correspond to different model parameters in the classification model.
In a second aspect, an embodiment of the invention further provides a device for training multiple targets in a classification model, including:
a sample extraction module, configured to extract the training set used for the current training from a training database, the training set including M training samples, where M is a natural number;
a multi-target configuration module, configured to obtain the target field in each of the M training samples in the training set, select, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields from N targets, and configure each selected target value to the training sample containing the corresponding target field, where N is the number of targets to be trained and N is a natural number greater than or equal to 2;
a target training module, configured to learn from the M training samples configured with the M target values using a preset classification algorithm to construct a classification model including the N targets, where the N targets correspond to different model parameters in the classification model.
As can be seen from the above technical solutions, embodiments of the invention have the following advantages:
In the embodiments of the invention, the training set used for the current training is first extracted from the training database, the training set including M training samples. The target field in each of the M training samples is then obtained, M target values corresponding to the M target fields are selected from N targets according to the field contents of the M obtained target fields, and each selected target value is configured to the training sample containing the corresponding target field. Finally, a preset classification algorithm learns from the M training samples configured with the M target values to construct a classification model including the N targets, where the N targets correspond to different model parameters in the classification model. Because M target values drawn from the N targets are configured for the M training samples once the training set is obtained, learning from those samples with the classification algorithm yields a classification model that includes the N targets and can be used to estimate multiple targets. The value of N is determined by the number of targets to be trained. A single load of the training set can therefore train N targets, and the training set does not need to be loaded N times. This solves the prior-art problem of training and predicting each target separately, improving the utilization of machine learning resources and the update efficiency of the classification model.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the invention more clearly, the accompanying drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention, and those skilled in the art can derive other drawings from them.
Fig. 1 is a schematic flow diagram of a method for training multiple targets in a classification model according to an embodiment of the invention;
Fig. 2 is a schematic diagram of the training process for multiple targets in an advertisement prediction model according to an embodiment of the invention;
Fig. 3-a is a schematic structural diagram of a device for training multiple targets in a classification model according to an embodiment of the invention;
Fig. 3-b is a schematic structural diagram of a target training module according to an embodiment of the invention;
Fig. 3-c is a schematic structural diagram of another target training module according to an embodiment of the invention;
Fig. 3-d is a schematic structural diagram of another device for training multiple targets in a classification model according to an embodiment of the invention;
Fig. 4 is a schematic structural diagram of a server to which the method for training multiple targets in a classification model according to an embodiment of the invention is applied.
Detailed Description
Embodiments of the invention provide a method and device for training multiple targets in a classification model, which improve the utilization of machine learning resources and the update efficiency of the classification model.
To make the objects, features, and advantages of the invention clearer and easier to understand, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the invention fall within the scope of protection of the invention.
The terms "comprise" and "have" and any variants of them in the specification, claims, and drawings of the invention are intended to cover non-exclusive inclusion, so that a process, method, system, product, or device comprising a series of units is not necessarily limited to those units, but may include other units that are not expressly listed or that are inherent to such a process, method, product, or device.
In the prior art, the PCTR estimation method for advertisements can predict only one target, the ad click-through rate. However, advertising keeps expanding into new fields, such as friend-circle advertising. Unlike traditional advertising, PCTR estimation for friend-circle advertising must predict multiple targets besides the ad click-through rate, such as clicks on ad details and clicks to share. With the prior art, training and predicting each target separately would consume a huge amount of machine learning resources and would be hard to implement in engineering terms. To reduce the use of machine learning resources while still training multiple targets, the method of the embodiments of the invention proposes multi-target training based on corrected training data. It maintains the training efficiency of the model when machine resources are limited, and it supports flexible configuration of the number of targets in the model, so that the balance between machine resources and model efficiency can be adjusted flexibly. Details are given below.
Referring to Fig. 1, one embodiment of the method for training multiple targets in a classification model according to the invention can include the following steps:
101. Extract the training set used for the current training from a training database, the training set including M training samples, where M is a natural number.
In the embodiments of the invention, the training database stores multiple pieces of training data and can be updated in real time according to user needs. Each time target training is performed, the training data in the training database in its current state is extracted first and used as the training set; for example, the training set used for the current training can be extracted from the training database by random selection. Each piece of training data in the training set is treated as one training sample for subsequent processing, and the embodiments describe the subsequent steps for a training set of M training samples. For example, in some embodiments of the invention, a training sample in the training set can include the user's basic attributes and behavior data, such as gender, age, region, phone type, network type, and user operation behavior. User operation behavior can be, for instance, the user clicking a picture, forwarding information, or posting a comment, and is determined by the application scenario.
In some embodiments of the invention, after step 101 extracts the training set used for the current training from the training database, the method for training multiple targets in a classification model can further include the following steps:
A1. Determine whether the number M of training samples in the training set used for the current training exceeds a preset sample capacity threshold;
A2. If the number M of training samples exceeds the preset sample capacity threshold, sample the M training samples in the training set to obtain the training samples sampled from the training set.
In practical application scenarios, steps A1 and A2 can be performed to save machine learning resources and to cope with an excessively large training sample scale. It is first determined whether the number M of training samples obtained in step 101 exceeds the preset sample capacity threshold. If it does, the extracted training set is too large, and the M training samples can be sampled to reduce the sample size, yielding the training samples sampled from the training set. Sampling can be done in several ways: for example, a specific algorithm can draw the training samples labeled with a particular value, or training samples can be drawn from the training set at random. Sampling the training set reduces the number of training samples loaded and improves the efficiency of machine learning.
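Steps 101 and A1 to A2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function names, the list-of-dicts database, and the choice to sample down to exactly the threshold size are all hypothetical; the description only says that random selection is one possible option.

```python
import random


def extract_training_set(training_database, m, seed=0):
    """Step 101 (sketch): randomly draw up to m records as the current training set."""
    rng = random.Random(seed)
    m = min(m, len(training_database))
    return rng.sample(training_database, m)


def downsample_if_needed(training_set, capacity_threshold, seed=0):
    """Steps A1/A2 (sketch): sample only when the set exceeds the capacity threshold."""
    # A1: compare the number of samples M with the preset threshold.
    if len(training_set) <= capacity_threshold:
        return list(training_set)
    # A2: random sampling; reducing to exactly the threshold is an assumption.
    rng = random.Random(seed)
    return rng.sample(training_set, capacity_threshold)
```

A caller would chain the two, for example `downsample_if_needed(extract_training_set(db, 50), 20)`, so that later steps only read the sampled training samples.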
In some embodiments of the invention, extracting the training set used for the current training from the training database in step 101 can include the following step:
B1. According to the data update cycle of the training samples, extract from the training database the training set updated within each data update cycle as the training set used for the current training.
The training data in the training database can be updated periodically, so the classification model also needs to be trained periodically, and the training set is extracted according to the data update cycle of the training samples. For example, if the data update cycle is 30 minutes, a training set can be extracted every 30 minutes and the multi-target training described in the subsequent steps performed on it, which maintains the model update cycle while saving more machine learning resources.
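One way to picture step B1 is to bucket records by update cycle. The sketch below assumes each record carries a hypothetical integer `timestamp` field in seconds; neither that field nor the function name comes from the description.

```python
def training_sets_per_cycle(records, cycle_seconds=30 * 60):
    """Group records by data update cycle; each group is one cycle's training set."""
    cycles = {}
    for record in records:
        # Hypothetical field: seconds since some fixed origin.
        cycle_index = record["timestamp"] // cycle_seconds
        cycles.setdefault(cycle_index, []).append(record)
    # Return the per-cycle training sets in chronological order.
    return [cycles[index] for index in sorted(cycles)]
```

With the 30-minute cycle from the example above, records stamped 0 s and 100 s fall into the first training set, and a record stamped 1800 s starts the second.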
102. Obtain the target field in each of the M training samples in the training set, select, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields from N targets, and configure each selected target value to the training sample containing the corresponding target field, where N is the number of targets to be trained and N is a natural number greater than or equal to 2.
In the embodiments of the invention, after the training set is extracted, a specific target field can be obtained from each of the M training samples in the training set. The target field is written at a specific position in the training sample, and the field contents configured in it can be set according to the actual training sample. For example, in PCTR estimation for friend-circle advertising, the target field of each of the M training samples can carry different field contents for an ad click, a click on ad details, a click to share the ad, and so on. Taking N as the number of targets to be trained, the target value configured for each of the M target fields is determined by the corresponding field content, and is the target value of one of the N targets. N can be 2, 3, or larger; how many targets to configure for the target fields depends on the actual training scenario. For the target field obtained from a training sample, the target value of one of the N targets is configured according to the field content: each target corresponds to one target value, the target values of the N targets differ from one another, and the target values of different targets are updated in real time while the training samples are learned. The types and number N of targets to be trained can be configured for the specific scenario. Taking PCTR estimation for friend-circle advertising as an example, the N targets can be the ad click-through rate, clicks on ad details, and clicks to share the ad, and one target value is set for each target: the target value of the 1st target can be 1, that of the 2nd target can be 2, that of the 3rd target can be 3, and so on. As an illustration, suppose the target fields stored at a specific position in training sample 1, training sample 2, and training sample 3 have field contents A, B, and C respectively. Target value a can then be configured for content A, target value b for content B, and target value c for content C, so that target value a is configured to training sample 1, target value b to training sample 2, and target value c to training sample 3. In practice, which target value is configured for a given field content is determined by the application scenario.
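The configuration in step 102 amounts to a lookup from field content to target value. The following sketch assumes samples are dictionaries with a hypothetical `target_field` key and uses made-up content strings; the values 1, 2, 3 follow the example above, and 0 for exposure follows the 0/1 labeling described in the background.

```python
# Hypothetical mapping from target-field content to target value.
TARGET_VALUES = {
    "exposure": 0,       # ad shown but no action (negative sample)
    "click": 1,          # 1st target: ad click
    "click_details": 2,  # 2nd target: click on ad details
    "click_share": 3,    # 3rd target: click to share the ad
}


def configure_target_values(training_set, field_name="target_field"):
    """Return copies of the samples with a target value configured on each."""
    configured = []
    for sample in training_set:
        content = sample[field_name]
        # Copy the sample so the original training set is left untouched.
        configured.append({**sample, "target_value": TARGET_VALUES[content]})
    return configured
```

An unknown field content raises `KeyError` here; a real system would need to decide how to treat such samples, which the description leaves to the application scenario.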
In some embodiments of the invention, where steps A1 and A2 have been performed, obtaining the target field from the training samples in step 102 includes: obtaining the target field of each training sample sampled from the training set. That is, if the training set has been sampled, the target field is read only from the sampled training samples.
103. Learn from the M training samples configured with the M target values using a preset classification algorithm to construct a classification model including the N targets, where the N targets correspond to different model parameters in the classification model.
In the embodiments of the invention, after one target value has been configured for each of the M training samples, the classification model is created from the M training samples configured with the M target values. The preset classification algorithm used to build the classification model can be logistic regression, a decision tree, a neural network, or the like. Because the M target values configured in the M training samples come from the N targets to be trained, N targets can be trained in the constructed classification model, and different targets have different model parameters in the classification model. A classification model that includes multiple targets can therefore produce predictions for multiple targets when it is used for estimation. In the embodiments of the invention, the training of multiple targets in the classification model is completed on the same training set, which uses machine learning resources more efficiently and removes the need to train a separate model for each target. For example, PCTR estimation for friend-circle advertising according to the embodiments of the invention can predict multiple targets such as the ad click-through rate, clicks on ad details, and clicks to share the ad.
In some embodiments of the invention, learning from the M training samples configured with the M target values using a preset classification algorithm to construct a classification model including the N targets in step 103 can include the following steps:
C1. Obtain the model parameters corresponding to each of the N targets as follows: determine that the target value configured for the target field of the training sample currently being trained corresponds to the N_t-th target, and determine whether the training sample corresponding to the N_t-th target is a positive sample or a negative sample; if it is a positive sample, update the model parameters of the N_t-th target using the preset classification algorithm; if it is a negative sample, update the model parameters of every one of the N targets using the classification algorithm, where N_t denotes any natural number less than N;
C2. After the N targets have been trained using the M training samples configured with the M target values, combine the model parameters corresponding to each of the N targets to obtain the classification model including the N targets.
In step C1, take any training sample currently being trained as an example. Step 102 has configured one target value for the training sample. If that target value belongs to one of the N targets, denoted the N_t-th target, it is first determined whether the training sample corresponding to the N_t-th target is a positive sample or a negative sample. A positive sample is a training sample that belongs to a given class; a negative sample, also called a counter-example, is a training sample that does not belong to that class. As an illustration, if N targets need to be trained, they can be denoted 1, ..., N; a target value of 0 then indicates that the training sample is a negative sample, and a target value of 1, 2, ..., N indicates that it is a positive sample. If the training sample corresponding to the N_t-th target is a positive sample, only the model parameters of the N_t-th target are updated using the classification algorithm. If it is a negative sample, the model parameters of every one of the N targets are updated, that is, all model parameters in the classification model need to be updated. The model parameters of each of the N targets are obtained in this way. Step C2 is then performed: the model parameters corresponding to each of the N targets are combined. In other words, the model parameters are divided into multiple dimensions according to the multiple targets, and combining the model parameters of each dimension yields model parameters with multiple dimensions, which gives the classification model including the N targets.
The preceding embodiment illustrates one way of creating the classification model by training the N targets in the classification model. In other embodiments of the invention, learning from the M training samples configured with the M target values using a preset classification algorithm to construct a classification model including the N targets in step 103 can include the following steps:
D1. Obtain the model parameters corresponding to each of the N targets as follows: determine the N_1 targets that currently need to be trained, and filter out of the M training samples those whose target value does not belong to the N_1 targets, obtaining M_1 training samples, where M_1 denotes the number of training samples, among the M training samples configured with the M target values, whose target value belongs to the N_1 targets, and N_1 denotes any natural number greater than or equal to 1 and less than N; learn from the M_1 training samples using the preset classification algorithm to construct the model parameters corresponding to the N_1 targets;
D2. After all N targets have been trained, combine the model parameters corresponding to each of the N targets to obtain the classification model including the N targets.
In the scenario of step D1, N_1 of the N targets are first taken as the targets that currently need to be trained, where N_1 is greater than or equal to 1 and less than N. For example, N can be 10 and N_1 can be 3, in which case the first 3 of the 10 targets are trained first. The training samples that do not belong to the N_1 targets must then be filtered out of all training samples in the training set: among the M training samples configured with the M target values, those configured with targets other than the N_1 targets are removed, leaving M_1 training samples. After filtering, the preset classification algorithm learns from the M_1 training samples to construct the model parameters corresponding to the N_1 targets. The remaining targets among the N targets can be trained in the same way as described in step D1, which produces the model parameters of every target. Step D2 is implemented like step C2 in the preceding embodiment and is not described again.
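Steps D1 and D2 can be sketched as a filter followed by a merge. The names below are hypothetical, and `train_fn` stands in for whatever preset classification algorithm is used; it is assumed to return a dictionary from target number to that target's model parameters. The filter keeps only samples whose target value is among the current targets, as the text states; how negative samples are handled in this variant is not spelled out, so the sketch does not treat them specially.

```python
def filter_for_targets(samples, current_targets):
    """Step D1 filter: keep samples whose target value is among the current targets."""
    wanted = set(current_targets)
    return [s for s in samples if s["target_value"] in wanted]


def train_in_batches(samples, target_batches, train_fn):
    """Train each batch of targets on its filtered samples, then merge (step D2)."""
    model = {}
    for batch in target_batches:
        # M_1 samples: those configured with one of the N_1 current targets.
        subset = filter_for_targets(samples, batch)
        # train_fn is a placeholder for the preset classification algorithm.
        model.update(train_fn(subset, batch))
    return model
```

For N = 10 and N_1 = 3, the first call to `train_fn` would see only the samples configured with targets 1 to 3, and later batches would cover the remaining targets.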
The preceding embodiment illustrates how the classification model is created by training the N targets in the classification model. The N targets can also be grouped into multiple groups, and a classification model can then be created separately for each group. For example, N can be 10 and N_1 can be 3, in which case the 10 targets can be divided into 3 groups: the 1st to 4th targets form one group, the 5th to 8th targets form one group, and the 9th and 10th targets form one group, and 3 classification models can be obtained for these 3 groups. Finally, the targets in these classification models are merged to obtain one classification model that includes more targets, which gives the classification model including the N targets described in step 103.
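The grouping and merging described above can be sketched as follows. The helper names are hypothetical, and each per-group model is assumed to be a dictionary from target number to that target's model parameters.

```python
def group_targets(n_targets, group_size):
    """Split targets 1..N into consecutive groups of at most group_size."""
    targets = list(range(1, n_targets + 1))
    return [targets[i:i + group_size] for i in range(0, n_targets, group_size)]


def merge_group_models(group_models):
    """Merge per-group models into one model covering all targets."""
    merged = {}
    for model in group_models:
        # Each target should be trained in exactly one group.
        overlap = merged.keys() & model.keys()
        if overlap:
            raise ValueError(f"targets trained twice: {sorted(overlap)}")
        merged.update(model)
    return merged
```

`group_targets(10, 4)` reproduces the grouping in the example: targets 1 to 4, targets 5 to 8, and targets 9 and 10.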
As the above description of the embodiments of the invention shows, the training set used for the current training is first extracted from the training database, the training set including M training samples. The target field in each of the M training samples is then obtained, M target values corresponding to the M target fields are selected from N targets according to the field contents of the M obtained target fields, and each selected target value is configured to the training sample containing the corresponding target field. Finally, a preset classification algorithm learns from the M training samples configured with the M target values to construct a classification model including the N targets, where the N targets correspond to different model parameters in the classification model. Because M target values drawn from the N targets are configured for the M training samples once the training set is obtained, learning from those samples with the classification algorithm yields a classification model that includes the N targets and can be used to estimate multiple targets. The value of N is determined by the number of targets to be trained. A single load of the training set can therefore train N targets, and the training set does not need to be loaded N times. This solves the prior-art problem of training and predicting each target separately, improving the utilization of machine learning resources and the update efficiency of the classification model.
To facilitate a better understanding and implementation of the above solutions of the embodiments of the present invention, a corresponding application scenario is described in detail below. The following takes as an example the application of the multi-target training method provided in the embodiments of the present invention to PCTR (predicted click-through rate) estimation. Traditional PCTR estimation considers only one target, the advertisement click-through rate, so the target in the training data is expressed as 0/1, where 0 represents exposure and 1 represents a click. In the circle-of-friends estimation scenario, however, multiple targets need to be estimated. If each target is trained separately as in the prior art, the consumption of machine resources is too large. To reduce machine resource consumption, the training samples can therefore be corrected according to the embodiments of the present invention.
For example, a training sample includes basic information about the user, such as gender, age, region, mobile phone type, and network type. Different targets can be set for the target field in a training sample according to its field contents. For example, if N targets can be set for the different field contents of the target field, the target value of each target field may range from 0 to N. Taking 10 targets configured in the training samples as an example, 0 may represent exposure, 1 clicking the avatar, 2 clicking the title, 3 clicking a picture, 4 clicking video playback, 5 clicking share, 6 clicking favorite, 7 the user posting a comment, 8 the user following, 9 clicking details, and 10 the user type. In practical applications, the training set may be stored using a mixed-language data standard; for example, the training samples may be stored in the PB (Protocol Buffer) file format and parsed according to the PB protocol when they are used. N, which represents the number of targets, may be configured in the training sample. The embodiments of the present invention ensure that the N targets can use the same training samples, that is, N targets can be trained with one set of training samples, which reduces machine storage resources.
In addition, within the training time of a practical PCTR prediction model, loading the data into the memory of the training machine at the initial stage takes a certain amount of time. Therefore, when multiple targets are trained in the embodiments of the present invention, the training of the multiple sets of model parameters can share one load of the training samples and one pass of feature parsing. Then, based on the corrected training samples, during model training all model parameters are updated when a negative sample is encountered, and only the model parameters corresponding to the positive sample are updated when a positive sample is encountered. In this way, multiple targets can be trained with a single load of the training samples. Machine resources are saved to a greater extent while the model update cycle is still guaranteed, and different targets can be flexibly configured for each model, which allows a flexible trade-off between machine resources and model efficiency.
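The sample correction described above, in which the content of the target field is mapped to a target value, can be sketched as follows. This is only an illustration under stated assumptions: the field name `action`, the content strings, and the function name `configure_target_values` are hypothetical, samples are shown as plain dictionaries rather than parsed Protocol Buffer messages, and only part of the example mapping from the text is included.

```python
# Hypothetical mapping from target-field content to target value,
# following part of the example in the text (0 = exposure, 1 = click avatar, ...).
TARGET_VALUES = {
    "exposure": 0,
    "click_avatar": 1,
    "click_title": 2,
    "click_picture": 3,
    "click_video": 4,
    "click_share": 5,
}


def configure_target_values(samples, target_field="action"):
    """Return copies of the samples with a 'target_value' entry added.

    The target value is chosen from the content of the target field,
    so one copy of each sample can serve every target.
    """
    configured = []
    for sample in samples:
        value = TARGET_VALUES[sample[target_field]]
        configured.append({**sample, "target_value": value})
    return configured


raw_samples = [
    {"gender": "f", "age": 30, "action": "click_title"},
    {"gender": "m", "age": 25, "action": "exposure"},
]
corrected = configure_target_values(raw_samples)
```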
Next, refer to Fig. 2, which is a schematic diagram of the training process for multiple targets in an advertisement prediction model provided in an embodiment of the present invention. The process mainly includes the following steps:
1. Training set generation: In the embodiments of the present invention, the training samples correspond to multiple targets. Assuming that the number of targets is N, the target of each training sample needs to be mapped to 0-N; for example, 0 represents exposure, 1 represents clicking details, 2 represents clicking the avatar, and so on. The main purpose of target mapping is to allow different targets to be distinguished in the subsequent filtering and model training.
2. Training sample loading: Given the data cycle parameter of the training samples, the corresponding training samples are loaded. For example, training samples may be loaded every 30 minutes to ensure real-time updating of the classification model.
3. Training sample filtering: To improve the training effect of the classification model, the training samples may, for example, be sampled, because the loaded training set may be too large. As another example, samples of non-training targets are filtered out: if only targets 1 and 2 are being trained, the training samples of targets 3 to N are all filtered out during data filtering.
4. The new training samples are obtained, and step 5 is then performed.
5. Classification model training and updating: When the input training sample is a negative sample, all model parameters are updated; when the input training sample is a positive sample, only the model parameters of the one corresponding target are updated.
6. Merging of the model parameters of multiple targets: The results of the individual models obtained by training are merged directly into one large model, which is exported as a local file. This gives the final multi-target model.
7. Loading of the multi-target classification model: After the classification model is generated, it can be exported as a local file. The recommendation engine loads this file and can then predict the click-through rate of advertisements online.
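The update rule in step 5 can be sketched as follows. The text only refers to a "preset classification algorithm", so the use of logistic regression trained by stochastic gradient descent here is an assumption, as are the function names and the representation of a sample as a `(features, target_value)` pair with target value 0 meaning exposure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(weights, features, label, lr=0.1):
    """One logistic-regression SGD step on a single sample (updates in place)."""
    z = sum(w * x for w, x in zip(weights, features))
    error = sigmoid(z) - label
    for i, x in enumerate(features):
        weights[i] -= lr * error * x


def train_multi_target(samples, num_targets, num_features, lr=0.1):
    """Train one parameter vector per target in a single pass over the samples.

    samples: iterable of (features, target_value); 0 is assumed to mean
    exposure (a negative sample), 1..num_targets a positive for that target.
    """
    params = {t: [0.0] * num_features for t in range(1, num_targets + 1)}
    for features, target_value in samples:
        if target_value == 0:
            # Negative sample: update the parameters of every target.
            for weights in params.values():
                sgd_step(weights, features, 0.0, lr)
        else:
            # Positive sample: update only the matching target's parameters.
            sgd_step(params[target_value], features, 1.0, lr)
    return params


params = train_multi_target([([1.0, 0.0], 1), ([0.0, 1.0], 0)],
                            num_targets=2, num_features=2)
```

The returned dictionary already holds the parameters of all targets side by side, which corresponds to the merged multi-target model of step 6 in this simplified setting.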
In the embodiments of the present invention, multi-model training can be performed based on the corrected training samples. This supports the requirement of multi-target PCTR estimation for the circle of friends, while allowing a flexible trade-off between model update efficiency and machine resources.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
To facilitate better implementation of the above solutions of the embodiments of the present invention, related apparatus for implementing the above solutions is also provided below.
Referring to Fig. 3-a, an apparatus 300 for training multiple targets in a classification model provided in an embodiment of the present invention may include a sample extraction module 301, a multi-target configuration module 302, and a target training module 303, wherein:
The sample extraction module 301 is configured to extract, from a training database, the training set used for the current training, the training set including M training samples, M being a natural number;
The multi-target configuration module 302 is configured to obtain the target field in each training sample from the M training samples in the training set, to select from N targets, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields, and to configure each selected target value to the training sample containing the target field that corresponds to that target value, where N is the number of targets to be trained and N is a natural number greater than or equal to 2;
The target training module 303 is configured to construct, using a preset classification algorithm and by learning from the M training samples configured with the M target values, a classification model including the N targets, the N targets corresponding to different model parameters in the classification model.
In some embodiments of the present invention, referring to Fig. 3-b, the target training module 303 includes:
A first model update module 3031, configured to obtain the model parameters corresponding to each of the N targets as follows: determine that the target value configured for the target field in the training sample currently being trained corresponds to the N_t-th target, and judge whether the training sample corresponding to the N_t-th target is a positive sample or a negative sample; if the training sample corresponding to the N_t-th target is a positive sample, update the model parameters of the N_t-th target using the preset classification algorithm; if the training sample corresponding to the N_t-th target is a negative sample, update the model parameters of every target among the N targets using the classification algorithm, where N_t denotes any natural number less than N;
A first parameter combination module 3032, configured to combine, after the training of the N targets has been completed using the M training samples configured with the M target values, the model parameters corresponding to each of the N targets, to obtain the classification model including the N targets.
In some embodiments of the present invention, referring to Fig. 3-c, the target training module 303 includes:
A second model update module 3033, configured to obtain the model parameters corresponding to each of the N targets as follows: determine the N_1 targets that currently need to be trained, and filter out, from the M training samples, the training samples whose target values do not belong to the N_1 targets, to obtain M_1 training samples, where M_1 denotes the number of training samples, among the M training samples configured with the M target values, whose target values belong to the N_1 targets, and N_1 denotes any natural number greater than or equal to 1 and less than N; and construct, using the preset classification algorithm and by learning from the M_1 training samples, the model parameters corresponding to the N_1 targets;
A second parameter combination module 3034, configured to combine, after all of the N targets have been trained, the model parameters corresponding to each of the N targets, to obtain the classification model including the N targets.
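The filtering performed before training a subset of N_1 targets can be sketched as follows. The function name and the `(features, target_value)` sample representation are hypothetical. Keeping samples with target value 0 (exposure) is an assumption of this sketch, based on the filtering example in the text, where only targets 3 to N are filtered out when targets 1 and 2 are trained.

```python
def filter_for_targets(samples, active_targets, negative_value=0):
    """Keep only the samples whose target value belongs to the active targets.

    Samples with the negative (exposure) value are kept as well, since every
    target is assumed to need negative samples for training.
    """
    keep = set(active_targets) | {negative_value}
    return [sample for sample in samples if sample[1] in keep]


all_samples = [
    ([1.0, 0.0], 0),  # exposure
    ([0.0, 1.0], 1),  # positive for target 1
    ([1.0, 1.0], 3),  # positive for target 3, not trained in this round
    ([0.5, 0.5], 2),  # positive for target 2
]
subset = filter_for_targets(all_samples, active_targets=[1, 2])
```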
In some embodiments of the present invention, referring to Fig. 3-d, the apparatus 300 for training multiple targets in a classification model further includes a sample sampling module 304, wherein:
The sample sampling module 304 is configured to judge, after the sample extraction module 301 extracts from the training database the training set used for the current training, whether the number M of training samples included in the training set used for the current training is greater than a preset sample capacity threshold, and, if the number M of training samples is greater than the preset sample capacity threshold, to sample the M training samples included in the training set to obtain the sampled training samples in the training set;
The multi-target configuration module 302 is specifically configured to obtain the target fields in the sampled training samples from the training set.
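The threshold check performed by the sample sampling module can be sketched as follows. The text does not specify the sampling method or the resulting sample count, so uniform random sampling without replacement down to the threshold size is an assumption of this sketch, as is the function name.

```python
import random


def downsample_if_needed(samples, threshold, seed=None):
    """Return the samples unchanged if there are at most `threshold` of them,
    otherwise a uniform random subset of size `threshold`."""
    samples = list(samples)
    if len(samples) <= threshold:
        return samples
    rng = random.Random(seed)
    return rng.sample(samples, threshold)


training_set = list(range(100))
sampled = downsample_if_needed(training_set, threshold=10, seed=0)
```

Passing a fixed `seed` makes the sampled subset reproducible between runs, which can help when comparing models trained on the same data cycle.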
In some embodiments of the present invention, the sample extraction module 301 is specifically configured to extract from the training database, according to the data update cycle of the training samples, the training set updated within each data update cycle as the training set used for the current training.
As described in the above embodiments of the present invention, the training set used for the current training is first extracted from the training database, the training set including M training samples. The target field in each training sample is then obtained from the M training samples in the training set, and, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields are selected from N targets; each selected target value is configured to the training sample containing the target field that corresponds to that target value. Finally, a preset classification algorithm learns from the M training samples configured with the M target values to construct a classification model including the N targets, the N targets corresponding to different model parameters in the classification model. In the embodiments of the present invention, after the training samples in the training set are obtained, M target values can be configured from the N targets for the M training samples, so that once the classification algorithm has learned from these samples, a classification model including N targets can be constructed and used to estimate multiple targets. The value of N can be determined by the number of targets that need to be trained. Because a single load of the training set can train N targets, the training set does not need to be loaded N times. This solves the prior-art problem that each target is trained and predicted separately, which improves the utilization of machine learning resources and the update efficiency of the classification model.
Fig. 4 is a schematic structural diagram of a server provided in an embodiment of the present invention. The server 1100 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1122 (for example, one or more processors), a memory 1132, and one or more storage media 1130 (for example, one or more mass storage devices) storing application programs 1142 or data 1144. The memory 1132 and the storage medium 1130 may provide transient or persistent storage. A program stored in the storage medium 1130 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 1122 may be configured to communicate with the storage medium 1130 and to execute, on the server 1100, the series of instruction operations in the storage medium 1130.
The server 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input/output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, or FreeBSD™.
The steps of the training method for multiple targets in a classification model that are performed by the server in the above embodiments may be based on the server structure shown in Fig. 4.
It should also be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. In addition, in the accompanying drawings of the apparatus embodiments provided by the present invention, a connection between modules indicates that they have a communication connection, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention may be implemented by software plus the necessary general-purpose hardware, or by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function performed by a computer program can easily be implemented with corresponding hardware, and the specific hardware structures used to implement the same function can vary, for example analog circuits, digital circuits, or dedicated circuits. For the present invention, however, a software implementation is the preferred embodiment in most cases. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
In summary, the above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the above embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. A training method for multiple targets in a classification model, characterized by comprising:
    extracting, from a training database, the training set used for the current training, the training set including M training samples, M being a natural number;
    obtaining the target field in each training sample from the M training samples in the training set, selecting from N targets, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields, and configuring each selected target value to the training sample containing the target field that corresponds to that target value, where N is the number of targets to be trained and N is a natural number greater than or equal to 2;
    constructing, using a preset classification algorithm and by learning from the M training samples configured with the M target values, a classification model including the N targets, the N targets corresponding to different model parameters in the classification model.
  2. The method according to claim 1, characterized in that constructing, using a preset classification algorithm and by learning from the M training samples configured with the M target values, the classification model including the N targets comprises:
    obtaining the model parameters corresponding to each of the N targets as follows: determining that the target value configured for the target field in the training sample currently being trained corresponds to the N_t-th target, and judging whether the training sample corresponding to the N_t-th target is a positive sample or a negative sample; if the training sample corresponding to the N_t-th target is a positive sample, updating the model parameters of the N_t-th target using the preset classification algorithm; if the training sample corresponding to the N_t-th target is a negative sample, updating the model parameters of every target among the N targets using the classification algorithm, where N_t denotes any natural number less than N;
    after the training of the N targets has been completed using the M training samples configured with the M target values, combining the model parameters corresponding to each of the N targets to obtain the classification model including the N targets.
  3. The method according to claim 1, characterized in that constructing, using a preset classification algorithm and by learning from the M training samples configured with the M target values, the classification model including the N targets comprises:
    obtaining the model parameters corresponding to each of the N targets as follows: determining the N_1 targets that currently need to be trained, and filtering out, from the M training samples, the training samples whose target values do not belong to the N_1 targets, to obtain M_1 training samples, where M_1 denotes the number of training samples, among the M training samples configured with the M target values, whose target values belong to the N_1 targets, and N_1 denotes any natural number greater than or equal to 1 and less than N; and constructing, using the preset classification algorithm and by learning from the M_1 training samples, the model parameters corresponding to the N_1 targets;
    after all of the N targets have been trained, combining the model parameters corresponding to each of the N targets to obtain the classification model including the N targets.
  4. The method according to any one of claims 1 to 3, characterized in that, after extracting from the training database the training set used for the current training, the method further comprises:
    judging whether the number M of training samples included in the training set used for the current training is greater than a preset sample capacity threshold;
    if the number M of training samples is greater than the preset sample capacity threshold, sampling the M training samples included in the training set to obtain the sampled training samples in the training set;
    wherein obtaining the target field in each training sample from the M training samples in the training set comprises:
    obtaining the target fields in the sampled training samples from the training samples sampled out of the training set.
  5. The method according to any one of claims 1 to 3, characterized in that extracting from the training database the training set used for the current training comprises:
    extracting from the training database, according to the data update cycle of the training samples, the training set updated within each data update cycle as the training set used for the current training.
  6. An apparatus for training multiple targets in a classification model, characterized by comprising:
    a sample extraction module, configured to extract, from a training database, the training set used for the current training, the training set including M training samples, M being a natural number;
    a multi-target configuration module, configured to obtain the target field in each training sample from the M training samples in the training set, to select from N targets, according to the field contents of the M obtained target fields, M target values corresponding to the M target fields, and to configure each selected target value to the training sample containing the target field that corresponds to that target value, where N is the number of targets to be trained and N is a natural number greater than or equal to 2;
    a target training module, configured to construct, using a preset classification algorithm and by learning from the M training samples configured with the M target values, a classification model including the N targets, the N targets corresponding to different model parameters in the classification model.
  7. The apparatus according to claim 6, characterized in that the target training module comprises:
    a first model update module, configured to obtain the model parameters corresponding to each of the N targets as follows: determine that the target value configured for the target field in the training sample currently being trained corresponds to the N_t-th target, and judge whether the training sample corresponding to the N_t-th target is a positive sample or a negative sample; if the training sample corresponding to the N_t-th target is a positive sample, update the model parameters of the N_t-th target using the preset classification algorithm; if the training sample corresponding to the N_t-th target is a negative sample, update the model parameters of every target among the N targets using the classification algorithm, where N_t denotes any natural number less than N;
    a first parameter combination module, configured to combine, after the training of the N targets has been completed using the M training samples configured with the M target values, the model parameters corresponding to each of the N targets, to obtain the classification model including the N targets.
  8. The apparatus according to claim 6, characterized in that the target training module comprises:
    a second model update module, configured to obtain the model parameters corresponding to each of the N targets as follows: determine the N_1 targets that currently need to be trained, and filter out, from the M training samples, the training samples whose target values do not belong to the N_1 targets, to obtain M_1 training samples, where M_1 denotes the number of training samples, among the M training samples configured with the M target values, whose target values belong to the N_1 targets, and N_1 denotes any natural number greater than or equal to 1 and less than N; and construct, using the preset classification algorithm and by learning from the M_1 training samples, the model parameters corresponding to the N_1 targets;
    a second parameter combination module, configured to combine, after all of the N targets have been trained, the model parameters corresponding to each of the N targets, to obtain the classification model including the N targets.
  9. The apparatus according to any one of claims 6 to 8, characterized in that the apparatus for training multiple targets in a classification model further comprises a sample sampling module, wherein:
    the sample sampling module is configured to judge, after the sample extraction module extracts from the training database the training set used for the current training, whether the number M of training samples included in the training set used for the current training is greater than a preset sample capacity threshold, and, if the number M of training samples is greater than the preset sample capacity threshold, to sample the M training samples included in the training set to obtain the sampled training samples in the training set;
    the multi-target configuration module is specifically configured to obtain the target fields in the sampled training samples from the training set.
  10. The apparatus according to any one of claims 6 to 8, characterized in that the sample extraction module is specifically configured to extract from the training database, according to the data update cycle of the training samples, the training set updated within each data update cycle as the training set used for the current training.
CN201610614088.5A 2016-07-29 2016-07-29 Training method and device for multiple targets in classification model Active CN107665349B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610614088.5A CN107665349B (en) 2016-07-29 2016-07-29 Training method and device for multiple targets in classification model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610614088.5A CN107665349B (en) 2016-07-29 2016-07-29 Training method and device for multiple targets in classification model

Publications (2)

Publication Number Publication Date
CN107665349A true CN107665349A (en) 2018-02-06
CN107665349B CN107665349B (en) 2020-12-04

Family

ID=61115690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610614088.5A Active CN107665349B (en) 2016-07-29 2016-07-29 Training method and device for multiple targets in classification model

Country Status (1)

Country Link
CN (1) CN107665349B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110737446A (en) * 2018-07-20 2020-01-31 杭州海康威视数字技术股份有限公司 Method and device for updating parameters
CN111325228A (en) * 2018-12-17 2020-06-23 上海游昆信息技术有限公司 Model training method and device
CN112785005A (en) * 2021-01-22 2021-05-11 中国平安人寿保险股份有限公司 Multi-target task assistant decision-making method and device, computer equipment and medium
CN114169536A (en) * 2022-02-11 2022-03-11 希望知舟技术(深圳)有限公司 Data management and control method and related device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103217280A (en) * 2013-03-18 2013-07-24 西安交通大学 Multivariable support vector machine prediction method for aero-engine rotor residual life
CN106203487A (en) * 2016-06-30 2016-12-07 北京航空航天大学 A kind of image classification method based on Multiple Kernel Learning Multiple Classifier Fusion and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HANEN BORCHANI: "A survey on multi-output regression", 《WILEY INTERDISCIPLINARY REVIEWS:DATA MINING AND KNOWLEDGE DISCOVERY》 *
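王国勋 (Wang Guoxun): "基于多目标决策的数据挖掘模型选择研究" [Research on data mining model selection based on multi-objective decision making], 《中国博士学位论文全文数据库 信息科技辑》 [China Doctoral Dissertations Full-text Database, Information Science and Technology] *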

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110737446A (en) * 2018-07-20 2020-01-31 杭州海康威视数字技术股份有限公司 Method and device for updating parameters
CN111325228A (en) * 2018-12-17 2020-06-23 上海游昆信息技术有限公司 Model training method and device
CN111325228B (en) * 2018-12-17 2021-04-06 上海游昆信息技术有限公司 Model training method and device
CN112785005A (en) * 2021-01-22 2021-05-11 中国平安人寿保险股份有限公司 Multi-target task assistant decision-making method and device, computer equipment and medium
CN114169536A (en) * 2022-02-11 2022-03-11 希望知舟技术(深圳)有限公司 Data management and control method and related device
CN114169536B (en) * 2022-02-11 2022-05-06 希望知舟技术(深圳)有限公司 Data management and control method and related device

Also Published As

Publication number Publication date
CN107665349B (en) 2020-12-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant