CN110033094A - Model training method and device based on disturbance samples

Info

Publication number
CN110033094A
Authority
CN
China
Prior art keywords
sample
disturbance
model
initial
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910133409.3A
Other languages
Chinese (zh)
Inventor
林建滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910133409.3A
Publication of CN110033094A
Priority to PCT/CN2020/070290 (WO2020168843A1)
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of this specification provide a method and device for obtaining a disturbance sample set and for training a model based on the disturbance sample set. The method for obtaining the disturbance sample set includes: calculating, for each dimension of the multiple feature vectors corresponding to multiple initial samples, the standard deviation of the feature values of that dimension; and, for each dimension of each of the feature vectors, generating a corresponding random number and updating the current feature value of that dimension to the sum of the current feature value and the random number, thereby generating multiple disturbance samples respectively corresponding to the feature vectors and obtaining the disturbance sample set. The value range of each random number is determined by the product of a predetermined first parameter and the feature-value standard deviation of the dimension to which the random number corresponds.

Description

Model training method and device based on disturbance samples
Technical field
Embodiments of this specification relate to the field of machine learning, and more particularly to a method and device for obtaining a disturbance sample set based on an original training set, a method and device for obtaining a model training sample set, a method and device for obtaining a model training sample set and a test sample set, and a model training method and device based on disturbance samples.
Background
A machine learning model deployed in a real environment faces various challenges, an important one of which is the stability of the model. Taking a speech recognition model as an example, the data used to train a machine learning model has usually been properly preprocessed and denoised. In a real environment, however, the conditions the model faces are far more complex: a noisy setting, microphone echo, and similar factors all make the noise in the data the model must process inconsistent with the training data, so that the accuracy of the model changes greatly. Improving the robustness of machine learning models is therefore of great significance for their practical application. Machine learning algorithms currently rely mainly on L1 and L2 regularization to enhance robustness; both methods achieve robustness by restricting the search space of the model parameters (the absolute values of the parameters).
Therefore, a model training method that enhances model robustness more effectively is needed.
Summary of the invention
Embodiments of this specification aim to provide a more effective model training method that addresses the deficiencies in the prior art.
To achieve the above object, one aspect of this specification provides a method for obtaining a disturbance sample set based on an original training set, wherein the original training set includes multiple initial samples, each initial sample includes a corresponding feature vector, and the method includes:
calculating, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the standard deviation (mean square deviation) of the feature values of that dimension; and
for each dimension of each feature vector among the multiple feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension corresponding to that random number.
In one embodiment, the random number follows a Gaussian distribution whose standard deviation is the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
In one embodiment, the random number is uniformly distributed, with a value range between the positive and negative of a first value, where the first value is the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
Another aspect of this specification provides a method for obtaining a model training sample set based on an original training set, wherein the original training set includes multiple initial samples, and the method includes:
obtaining a disturbance sample set by the above method, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
obtaining a training sample set by merging the multiple initial samples with the multiple disturbance samples.
Another aspect of this specification provides a method for obtaining a model training sample set and a test sample set based on an original training set, wherein the original training set includes multiple initial samples, and the method includes:
obtaining a disturbance sample set by the above method for obtaining a disturbance sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
obtaining a training sample set by merging a portion of the multiple initial samples with a portion of the multiple disturbance samples; and
obtaining a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the proportion of the multiple initial samples taken up by the portion of initial samples is the same as the proportion of the multiple disturbance samples taken up by the portion of disturbance samples.
In one embodiment, obtaining the test sample set by merging at least part of the remaining initial samples with at least part of the remaining disturbance samples includes: obtaining the test sample set by merging all of the remaining initial samples among the multiple initial samples with all of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the portion of initial samples and the portion of disturbance samples correspond to each other one to one.
Another aspect of this specification provides a model training method, including:
obtaining an original training set, wherein the original training set includes multiple initial samples;
obtaining, based on the original training set and by the above method for obtaining a training sample set and a test sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
training a current model with each of the multiple training sample sets to obtain multiple updated models respectively;
evaluating each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and the updated model it evaluates correspond to the same training sample set; and
determining, based on the evaluation results, the updated model of the current model from among the multiple updated models.
In one embodiment, the model is any one of the following: a supervised learning model, an unsupervised learning model, or a reinforcement learning model.
Another aspect of this specification provides a device for obtaining a disturbance sample set based on an original training set, wherein the original training set includes multiple initial samples, each initial sample includes a corresponding feature vector, and the device includes:
a calculation unit configured to calculate, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the standard deviation of the feature values of that dimension; and
a generation unit configured to, for each dimension of each feature vector among the multiple feature vectors, generate a corresponding random number and update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension corresponding to that random number.
Another aspect of this specification provides a device for obtaining a model training sample set based on an original training set, wherein the original training set includes multiple initial samples, and the device includes:
an acquisition unit configured to obtain a disturbance sample set by the above device, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
a merging unit configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
Another aspect of this specification provides a device for obtaining a model training sample set and a test sample set based on an original training set, wherein the original training set includes multiple initial samples, and the device includes:
an acquisition unit configured to obtain a disturbance sample set by the above device for obtaining a disturbance sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
a first merging unit configured to obtain a training sample set by merging a portion of the multiple initial samples with a portion of the multiple disturbance samples; and
a second merging unit configured to obtain a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the second merging unit is further configured to obtain the test sample set by merging all of the remaining initial samples among the multiple initial samples with all of the remaining disturbance samples among the multiple disturbance samples.
Another aspect of this specification provides a model training device, including:
a first acquisition unit configured to obtain an original training set, wherein the original training set includes multiple initial samples;
a second acquisition unit configured to obtain, based on the original training set and by the above device for obtaining a training sample set and a test sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
a training unit configured to train a current model with each of the multiple training sample sets to obtain multiple updated models respectively;
an evaluation unit configured to evaluate each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and the updated model it evaluates correspond to the same training sample set; and
a determination unit configured to determine, based on the evaluation results, the updated model of the current model from among the multiple updated models.
Another aspect of this specification provides a computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform any of the above methods.
Another aspect of this specification provides a computing device including a memory and a processor, wherein the memory stores executable code and the processor, when executing the executable code, implements any of the above methods.
Embodiments of this specification perturb the training data of the model to simulate the data noise found in real environments, thereby increasing the robustness of the model. The model is trained and evaluated with the noisy data, and the predetermined parameter of the model is determined from the evaluation results, which quantitatively improves the effectiveness of the model on abnormal data. In addition, these embodiments place no restriction on the parameters of the machine learning model and therefore do not limit the learning capacity of the model.
Brief description of the drawings
The embodiments of this specification can be made clearer by describing them with reference to the accompanying drawings:
Fig. 1 shows a schematic diagram of a model training system 100 according to an embodiment of this specification;
Fig. 2 shows a method for obtaining a disturbance sample set based on an original training set according to an embodiment of this specification;
Fig. 3 schematically illustrates the calculation of the standard deviation of the feature values of one dimension across multiple feature vectors;
Fig. 4 shows n disturbed feature vectors respectively corresponding to the n feature vectors in Fig. 3;
Fig. 5 shows a flowchart of a method for obtaining a model training sample set based on an original training set according to an embodiment of this specification;
Fig. 6 shows a flowchart of a method for obtaining a model training sample set and a test sample set based on an original training set according to an embodiment of this specification;
Fig. 7 shows a flowchart of a model training method according to an embodiment of this specification;
Fig. 8 shows a device 800 for obtaining a disturbance sample set based on an original training set according to an embodiment of this specification;
Fig. 9 shows a device 900 for obtaining a model training sample set based on an original training set according to an embodiment of this specification;
Fig. 10 shows a device 1000 for obtaining a model training sample set and a test sample set based on an original training set according to an embodiment of this specification;
Fig. 11 shows a model training device 1100 according to an embodiment of this specification.
Detailed description
The embodiments of this specification are described below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of a model training system 100 according to an embodiment of this specification. As shown in Fig. 1, the system 100 includes a data processing module 11, a training module 12, and an evaluation module 13. In the data processing module 11, a disturbance is applied to the value of each dimension of the feature vector of each sample in data set A to obtain data set B. In the training module 12, at least part of the data is taken from data set A (for example 70% of the data in data set A) and at least part of the data is taken from data set B (for example 70% of the data in data set B); the two parts are merged to obtain a training data set, and a machine learning model is trained with the training data set. The machine learning model can be any model, for example the speech recognition model mentioned above. The speech recognition model is, for example, a supervised learning model or a reinforcement learning model; it will be understood that the model can also be an unsupervised learning model. In the evaluation module 13, the remaining data is taken from data set A (for example 30% of the data in data set A) and from data set B (for example 30% of the data in data set B); the two parts are merged to obtain a test data set, and the trained model is evaluated with the test data set.
Each of these processes is described in detail below.
Fig. 2 shows a method for obtaining a disturbance sample set based on an original training set according to an embodiment of this specification, wherein the original training set includes multiple initial samples, each initial sample includes a corresponding feature vector, and the method includes:
in step S202, calculating, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the standard deviation of the feature values of that dimension; and
in step S204, for each dimension of each feature vector among the multiple feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension corresponding to that random number.
First, in step S202, the standard deviation of the feature values of each dimension is calculated across the multiple feature vectors respectively corresponding to the multiple initial samples.
The original training set is, for example, data set A shown in Fig. 1, which includes, say, n initial samples. Each sample includes its own feature vector, for example an m-dimensional feature vector in which the value of each dimension is one feature value. In addition, where the model to be trained is, for example, a supervised model, each sample also includes a corresponding label value.
The mean square deviation is the standard deviation, that is, the square root of the variance σ², and can be denoted σ. Over multiple samples it is usually calculated with formula (1):
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2} \tag{1}$$
where n is the total number of samples and μ is the mean of the n values x_i.
Fig. 3 schematically illustrates the calculation of the standard deviation of the feature values of one dimension across multiple feature vectors. In Fig. 3, assume there are n feature vectors, each containing the feature values of m features. The feature value of the i-th dimension (feature i in the figure) of the j-th feature vector (vector j in the figure) can be written x_ij, where i ∈ [1, m] and j ∈ [1, n]. The standard deviation σ_i of the feature values of the i-th dimension across the n feature vectors can then be calculated by formula (2):
$$\sigma_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(x_{ij} - \mu_i\right)^2} \tag{2}$$
where μ_i is calculated by formula (3):
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij} \tag{3}$$
Specifically, for example, for the feature values x_21, x_22, …, x_2n of the second dimension (feature 2) of each vector, shown in the dotted box in Fig. 3, the mean μ_2 can be calculated from these n values by formula (3), and the standard deviation σ_2 by formula (2).
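As an illustration of step S202, the per-dimension calculation can be sketched in a few lines of Python. This is a minimal sketch, not the patented implementation: the function name `column_stats` and the sample data are hypothetical, and the population form of the standard deviation (dividing by n) is an assumption. Note that the code stores vector j as `vectors[j]`, so the text's x_ij corresponds to `vectors[j][i]`.

```python
import math

def column_stats(vectors):
    """Return (means, stds), one entry per feature dimension.

    Assumption: population standard deviation (divide by n).
    """
    n = len(vectors)      # number of feature vectors
    m = len(vectors[0])   # number of dimensions per vector
    means = [sum(v[i] for v in vectors) / n for i in range(m)]
    stds = [
        math.sqrt(sum((v[i] - means[i]) ** 2 for v in vectors) / n)
        for i in range(m)
    ]
    return means, stds

# Three 2-dimensional feature vectors (hypothetical data).
vectors = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means, stds = column_stats(vectors)
print(means, stds)
```

Because each dimension gets its own σ_i, features measured on very different scales (as in the two columns here) later receive noise proportional to their own spread.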
Next, in step S204, for each dimension of each feature vector among the multiple feature vectors, a corresponding random number is generated and the current feature value of that dimension of the feature vector is updated to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension corresponding to that random number.
Still referring to Fig. 3, take vector 1 in the figure: for each of its dimensions 1, 2, …, m, corresponding random numbers a_11, a_21, …, a_m1 are generated, and the current feature value x_i1 of each dimension is updated to x_i1 + a_i1, where i ∈ [1, m], yielding the disturbed feature vector corresponding to feature vector 1 (shown in the dotted box in Fig. 4). A disturbance can be applied to every feature vector in the same way to obtain the corresponding disturbed feature vectors. Fig. 4 shows the n disturbed feature vectors respectively corresponding to the n feature vectors in Fig. 3. Training the model with the disturbed feature vectors, that is, adding noise to the training samples, improves the resistance of the model to noise and enhances its stability.
In one embodiment, each random number a_ij in Fig. 4, where i ∈ [1, m] and j ∈ [1, n], can be drawn from a Gaussian random variable A = norm(0, λσ_i), that is, a random variable with mean 0 and standard deviation λσ_i. Here σ_i is the feature-value standard deviation of dimension i (feature i) calculated by formula (2), and λ is a predetermined parameter that can take a value between 0.0001 and 0.1. For example, with reference to Fig. 3 and Fig. 4, every a_2j corresponding to the feature 2 dimension in Fig. 4 is generated by A = norm(0, λσ_2), where σ_2 is the σ_2 shown in Fig. 3. As those skilled in the art know from the probability density of a Gaussian variable, a_ij falls within the interval [-3λσ_i, 3λσ_i] with a probability of about 99.7%. In other words, the product λσ_i of the predetermined parameter λ and the feature-value standard deviation σ_i of dimension i bounds the value range of each random number a_ij of dimension i, where j ranges from 1 to n.
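The Gaussian embodiment can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `perturb_gaussian`, the seed, and the sample data are hypothetical, and the population standard deviation is assumed for σ_i. Only the Python standard library is used.

```python
import math
import random

def perturb_gaussian(vectors, lam, rng):
    """Return disturbed copies of the vectors; the inputs are left unchanged.

    Each value in dimension i receives noise drawn from a Gaussian with
    mean 0 and standard deviation lam * sigma_i.
    """
    n = len(vectors)
    m = len(vectors[0])
    stds = []
    for i in range(m):
        mean_i = sum(v[i] for v in vectors) / n
        # Assumption: population standard deviation (divide by n).
        stds.append(math.sqrt(sum((v[i] - mean_i) ** 2 for v in vectors) / n))
    return [
        [v[i] + rng.gauss(0.0, lam * stds[i]) for i in range(m)]
        for v in vectors
    ]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
vectors = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
perturbed = perturb_gaussian(vectors, lam=0.01, rng=rng)
print(perturbed)
```

With λ = 0 the noise has zero spread and the disturbed vectors equal the originals, which is a convenient sanity check when wiring this into a pipeline.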
In one embodiment, each random number a_ij in Fig. 4 can be drawn from a uniform random variable B, for example a random variable uniformly distributed over the range [-λσ_i, λσ_i]. That is, the product λσ_i bounds the value range of each random number a_ij. The value range of the uniform random variable B is not limited to [-λσ_i, λσ_i]; it can also be, for example, [-3λσ_i, 3λσ_i]. Moreover, the random numbers a_ij are not limited to the Gaussian or uniform random variables described above and can be drawn from any other random variable, such as a Poisson random variable, as long as its value range is bounded through λσ_i.
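The uniform embodiment differs from the Gaussian one only in how the noise is drawn. The sketch below is illustrative: `feature_stds`, `perturb_uniform`, and the `width` multiplier (1.0 for [-λσ_i, λσ_i], 3.0 for [-3λσ_i, 3λσ_i]) are hypothetical names, and `statistics.pstdev` (population standard deviation) is an assumed choice for σ_i.

```python
import random
import statistics

def feature_stds(vectors):
    """Population standard deviation of each feature dimension (assumed form)."""
    m = len(vectors[0])
    return [statistics.pstdev([v[i] for v in vectors]) for i in range(m)]

def perturb_uniform(vectors, lam, rng, width=1.0):
    """Add noise drawn uniformly from [-width*lam*sigma_i, +width*lam*sigma_i]."""
    stds = feature_stds(vectors)
    disturbed = []
    for v in vectors:
        row = []
        for i, value in enumerate(v):
            bound = width * lam * stds[i]
            row.append(value + rng.uniform(-bound, bound))
        disturbed.append(row)
    return disturbed

lam = 0.05
vectors = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
stds = feature_stds(vectors)
perturbed = perturb_uniform(vectors, lam, random.Random(0))
print(perturbed)
```

Unlike the Gaussian case, every disturbance here is strictly bounded by λσ_i, which may be preferable when outlier noise values are undesirable.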
The parameter λ balances the added noise against model performance. The smaller λ is, the smaller the value range of a_ij, that is, the smaller the applied disturbance. When the applied disturbance is too small, its effect on the feature vectors is too small to improve model performance; when it is too large, it degrades the prediction accuracy of the model. The value of λ is therefore important for improving model performance. In one embodiment, the value of λ can be determined by the specific environment in which the model is applied. For a speech recognition model, for example, λ can be set larger when the application environment is noisy, and smaller when the application environment is quiet and the noise is low. In one embodiment, as described in detail below, the value of λ is determined by evaluating the trained models, that is, the λ with the better evaluation result is selected as the λ for final model training. In one embodiment, once λ has been determined while training the model with an earlier batch of training sample sets, the same λ value can be reused in subsequent rounds of training.
After the disturbance sample corresponding to each initial sample of the original training set has been obtained by the method shown in Fig. 2, the disturbance sample set is available, and the training sample set and test sample set of the model can be obtained from the original training set and the disturbance sample set.
Fig. 5 shows a flowchart of a method for obtaining a model training sample set based on an original training set according to an embodiment of this specification, wherein the original training set includes multiple initial samples, and the method includes:
in step S502, obtaining a disturbance sample set based on the original training set by the method shown in Fig. 2, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
in step S504, obtaining a training sample set by merging the multiple initial samples with the multiple disturbance samples.
First, in step S502, multiple disturbance samples respectively corresponding to the multiple initial samples are generated by the method shown in Fig. 2, wherein the multiple disturbance samples correspond to a first parameter with the same value. The multiple disturbance samples are, for example, the disturbance samples shown in Fig. 4. As described above, each random number a_ij in these disturbance samples can be drawn from the Gaussian random variable A = norm(0, λσ_i); that is, this batch of disturbance samples corresponds to the same λ value.
In step S504, a training sample set is obtained by merging the multiple initial samples with the multiple disturbance samples. When the trained model does not need to be evaluated, all samples in the original training set can be merged with all samples in the disturbance sample set to obtain the training sample set. Including both the original training set and the disturbance sample set in the training sample set enriches the training samples of the model, so that the model can adapt to different real environments.
Fig. 6 shows a flowchart of a method for obtaining a model training sample set and a test sample set based on an original training set according to an embodiment of this specification, wherein the original training set includes multiple initial samples, and the method includes:
in step S602, obtaining a disturbance sample set based on the original training set by the method shown in Fig. 2, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
in step S604, obtaining a training sample set by merging a portion of the multiple initial samples with a portion of the multiple disturbance samples; and
in step S606, obtaining a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
In this method, after the disturbance sample set has been obtained, a portion of the initial samples can be merged with a portion of the disturbance samples to obtain the training sample set. In one embodiment, for example, 70% of the initial samples in the original training set can be merged with 70% of the disturbance samples in the disturbance sample set to obtain the training sample set. The remaining initial samples in the original training set and the remaining disturbance samples in the disturbance sample set, for example 30% of each, are then merged to obtain the test sample set corresponding to that training sample set. The 70% of initial samples and the 70% of disturbance samples in the training sample set may correspond to each other one to one, or may not.
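The 70/30 split of steps S604 and S606 can be sketched as below for the case where the selected initial samples and disturbance samples correspond one to one. The function name `split_train_test` and the stand-in data are hypothetical, and drawing the split by random shuffling is an assumption: the text does not say how the portion is chosen.

```python
import random

def split_train_test(initial, disturbed, train_fraction, rng):
    """Split paired initial/disturbance samples into training and test sets.

    The same indices are used for both lists, so each selected initial
    sample appears together with its own disturbance sample.
    """
    if len(initial) != len(disturbed):
        raise ValueError("expected one disturbance sample per initial sample")
    indices = list(range(len(initial)))
    rng.shuffle(indices)  # assumption: the portion is drawn at random
    cut = round(train_fraction * len(indices))
    train_idx, test_idx = indices[:cut], indices[cut:]
    train = [initial[k] for k in train_idx] + [disturbed[k] for k in train_idx]
    test = [initial[k] for k in test_idx] + [disturbed[k] for k in test_idx]
    return train, test

# Stand-in data: ten 1-dimensional samples and their disturbed counterparts.
initial = [[float(k)] for k in range(10)]
disturbed = [[k + 0.5] for k in range(10)]
train, test = split_train_test(initial, disturbed, 0.7, random.Random(0))
print(len(train), len(test))
```

For the non-corresponding variant mentioned in the text, the two lists would simply be shuffled with independent index orders before cutting.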
In one embodiment, the proportion of the multiple initial samples taken up by the portion of initial samples can also differ from the proportion of the multiple disturbance samples taken up by the portion of disturbance samples. For example, when the actual application environment of the model is noisy, the training sample set can include a larger share of disturbance samples, such as 80% of all disturbance samples together with 20% of all initial samples. Correspondingly, the test sample set can be configured in the same ratio, for example including the remaining disturbance samples (20% of all disturbance samples) and, from the remaining initial samples, a portion amounting to 5% of all initial samples.
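The unequal-proportion variant can be sketched by drawing disjoint training and test subsets from each list separately. The helper name `split_by_ratio`, the tagged stand-in samples, and the random selection are assumptions made for illustration; the fractions reproduce the example in the text (80% and 20% of the disturbance samples, 20% and 5% of the initial samples).

```python
import random

def split_by_ratio(samples, train_fraction, test_fraction, rng):
    """Draw disjoint training and test subsets, each a fraction of the full list."""
    if train_fraction + test_fraction > 1.0 + 1e-9:
        raise ValueError("fractions must not sum to more than 1")
    shuffled = list(samples)
    rng.shuffle(shuffled)  # assumption: subsets are drawn at random
    n_train = round(train_fraction * len(shuffled))
    n_test = round(test_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:n_train + n_test]

rng = random.Random(0)
initial = [("initial", k) for k in range(100)]      # stand-in initial samples
disturbed = [("disturbed", k) for k in range(100)]  # stand-in disturbance samples

train_init, test_init = split_by_ratio(initial, 0.20, 0.05, rng)
train_dist, test_dist = split_by_ratio(disturbed, 0.80, 0.20, rng)

train = train_init + train_dist  # 20 initial + 80 disturbance samples
test = test_init + test_dist     # 5 initial + 20 disturbance samples
print(len(train), len(test))
```

Both sets keep the same 1:4 ratio of initial to disturbance samples, so the evaluation reflects the same noise mix the model was trained on.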
After the training sample set and test sample set of the model have been obtained as shown in Fig. 5 and Fig. 6, the model can be trained with the training sample set and evaluated with the test sample set. The following describes a method that selects the predetermined parameter λ of the model based on evaluation with the test sample set, thereby further optimizing the model.
Fig. 7 shows a flowchart of a model training method according to an embodiment of this specification, including:
in step S702, obtaining an original training set, wherein the original training set includes multiple initial samples;
in step S704, obtaining, based on the original training set and by the method shown in Fig. 6, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
in step S706, training a current model with each of the multiple training sample sets to obtain multiple updated models respectively;
in step S708, evaluating each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and the updated model it evaluates correspond to the same training sample set; and
in step S710, determining, based on the evaluation results, the updated model of the current model from among the multiple updated models.
First, in step S702, an initial sample set is obtained, which includes multiple initial samples. The model is not limited to a specific type; as described above, it can be any of a supervised learning model, an unsupervised learning model, and a reinforcement learning model. For example, the model is the speech recognition model described above, which is a supervised learning model. In this case, speech can be input manually and the corresponding feature vector extracted from it, so that the feature vector and the label value (the semantics) are used as an initial sample of the model. In practical applications, however, the model may encounter different environments, such as a quiet environment and various noisy environments with different kinds of noise. Initial samples collected manually in a single environment cannot simulate so many different environments, and collecting samples manually in each environment is costly. The samples are therefore expanded from the initial sample set by this method to obtain the training sample set.
In step S704, multiple training sample sets and multiple test sample sets respectively corresponding to them are obtained by the method shown in Fig. 6 based on the initial sample set, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values.
That is, the method shown in Fig. 6 is performed multiple times on the initial sample set, with the predetermined parameter λ taking a different value each time, to obtain multiple training sample sets and the test sample sets respectively corresponding to them. For example, λ can take the values 0.0001, 0.001, 0.01, and 0.1, so that the influence of the order of magnitude of λ on model training can be determined. It can be understood that the values of λ are not limited to these values or to this number of values, and can be set according to the specific model. Specifically, for the four λ values above, four disturbance sample sets B1, B2, B3, B4 can be obtained from the initial sample set A by the method shown in Fig. 2, and four pairs of sample sets (C1, D1), (C2, D2), (C3, D3), (C4, D4) are then obtained from these four disturbance sample sets, where Ci denotes a training sample set and Di denotes a test sample set.
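To make the role of λ concrete, the four disturbance sample sets can be written out for the Gaussian variant described in the claims. The notation below (sample index n, dimension index d, run index i) and the zero mean of the noise are assumptions introduced here for illustration; the population form of the standard deviation is likewise assumed.

```latex
\sigma_d = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\bigl(x_{n,d}-\bar{x}_d\bigr)^2},
\qquad
\tilde{x}^{(i)}_{n,d} = x_{n,d} + \varepsilon^{(i)}_{n,d},
\qquad
\varepsilon^{(i)}_{n,d} \sim \mathcal{N}\bigl(0,\ (\lambda_i\,\sigma_d)^2\bigr),
\qquad
\lambda_i \in \{10^{-4},\,10^{-3},\,10^{-2},\,10^{-1}\}

B_i = \bigl\{\tilde{x}^{(i)}_{n}\bigr\}_{n=1}^{N}, \qquad i = 1,\dots,4
```

For a dimension with \sigma_d = 2, for instance, the noise standard deviation in the four runs is 0.0002, 0.002, 0.02, and 0.2, so each step in λ raises the disturbance by one order of magnitude relative to the natural spread of that feature.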
In step S706, the current model is trained with each of the multiple training sample sets, to obtain multiple updated models respectively.
In the example above, this means that the current model is trained separately with each of the training sample sets C1, C2, C3, C4, to obtain four updated models M1, M2, M3, M4 respectively.
In step S708, each updated model is evaluated with the corresponding one of the multiple test sample sets, wherein a test sample set and its corresponding updated model correspond to the same training sample set.
In the example above, this means that the test sample sets D1, D2, D3, D4 are used to evaluate the four updated models M1, M2, M3, M4 respectively. Test sample set D1 and updated model M1 both correspond to training sample set C1, so D1 corresponds to M1; likewise, D2 corresponds to M2, D3 corresponds to M3, and D4 corresponds to M4. Various evaluation metrics of the corresponding updated model, such as accuracy, precision, and recall, can be computed on the test samples in order to evaluate that updated model; for example, the evaluation metrics can be combined into an overall evaluation value of the model.
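The evaluation metrics named above can be computed as in the following sketch. It assumes binary labels, and the equal-weight average used for the overall evaluation value is only one possible combination; the text does not fix how the metrics are combined. The function names and the toy labels are hypothetical.

```python
def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def overall_score(metrics, weights=None):
    """Combine the metrics into one value (equal weights are an assumption)."""
    weights = weights or {name: 1.0 for name in metrics}
    total = sum(weights.values())
    return sum(weights[name] * value for name, value in metrics.items()) / total

# Hypothetical labels of a test sample set D_i and predictions of model M_i.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
score = overall_score(metrics)
```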
In step S710, based on the evaluation results, the updated model of the current model is determined from among the multiple updated models.
In the example above, after the evaluation value of the updated model corresponding to each λ is obtained, the updated model with the highest evaluation value can, for example, be determined as the updated model of the current model, that is, the trained model. The determined updated model is retained for subsequent use, such as model prediction.
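Steps S702 to S710 can be put together in a small end-to-end sketch. Everything concrete in it is an assumption made for illustration: the toy two-class data, the nearest-centroid classifier standing in for the "current model" (it is retrained from scratch for each λ rather than updated), the use of accuracy alone as the evaluation value, and all function names.

```python
import random
import statistics

def perturb(features, lam, rng):
    """Add Gaussian noise with standard deviation lam * sigma_d to dimension d."""
    sigmas = [statistics.pstdev(col) for col in zip(*features)]
    return [[x + rng.gauss(0.0, lam * s) for x, s in zip(row, sigmas)]
            for row in features]

def split(samples, frac, rng):
    """Shuffle a copy of the samples and cut it into two parts."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = round(frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def train_centroids(samples):
    """Stand-in model: one mean feature vector per label."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: [statistics.fmean(col) for col in zip(*xs)]
            for y, xs in by_label.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))

def accuracy(model, samples):
    return sum(predict(model, x) == y for x, y in samples) / len(samples)

rng = random.Random(0)
# S702: toy initial sample set A, two Gaussian blobs labelled 0 and 1.
A = [([rng.gauss(c, 1.0), rng.gauss(c, 1.0)], label)
     for label, c in ((0, 0.0), (1, 3.0)) for _ in range(50)]
features = [x for x, _ in A]
labels = [y for _, y in A]

results = {}
for lam in (0.0001, 0.001, 0.01, 0.1):
    B = list(zip(perturb(features, lam, rng), labels))  # disturbance set B_i
    A_train, A_test = split(A, 0.7, rng)                # S704: build (C_i, D_i)
    B_train, B_test = split(B, 0.7, rng)
    C, D = A_train + B_train, A_test + B_test
    M = train_centroids(C)                              # S706: updated model M_i
    results[lam] = (accuracy(M, D), M)                  # S708: evaluate M_i on D_i

# S710: keep the updated model with the highest evaluation value.
best_lam = max(results, key=lambda lam: results[lam][0])
best_model = results[best_lam][1]
```

Each model is scored only on the test set built from the same λ, which mirrors the pairing of Di with Mi described above.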
Fig. 8 shows a device 800 for obtaining a disturbance sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes multiple initial samples and each initial sample includes a corresponding feature vector. The device includes:
a computing unit 81, configured to calculate, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the feature-value standard deviation (mean square deviation) of the feature values of that dimension; and
a generation unit 82, configured to, for each dimension of each of the multiple feature vectors, generate a corresponding random number and update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension corresponding to that random number.
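The two units can be sketched as two functions. This is an illustration under stated assumptions rather than the device itself: the function names, the `mode` switch, the use of the population standard deviation, and the zero mean of the Gaussian noise are choices made here. The Gaussian and uniform variants follow the two random-number types named in the claims.

```python
import random
import statistics

def feature_std(vectors):
    """Computing unit: standard deviation of every feature dimension."""
    return [statistics.pstdev(column) for column in zip(*vectors)]

def disturb(vectors, lam, mode="gaussian", seed=0):
    """Generation unit: add one random number to every dimension of every vector.

    gaussian: noise drawn from N(0, (lam * sigma_d)^2)
    uniform:  noise drawn from U(-lam * sigma_d, +lam * sigma_d)
    """
    rng = random.Random(seed)
    sigmas = feature_std(vectors)
    disturbed = []
    for vector in vectors:
        new_vector = []
        for value, sigma in zip(vector, sigmas):
            scale = lam * sigma  # first parameter times per-dimension std
            if mode == "gaussian":
                noise = rng.gauss(0.0, scale)
            elif mode == "uniform":
                noise = rng.uniform(-scale, scale)
            else:
                raise ValueError(f"unknown mode: {mode}")
            new_vector.append(value + noise)
        disturbed.append(new_vector)
    return disturbed

# Hypothetical feature vectors with very different scales per dimension.
vectors = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0], [4.0, 400.0]]
sigmas = feature_std(vectors)
uniform_set = disturb(vectors, lam=0.1, mode="uniform")
gaussian_set = disturb(vectors, lam=0.1, mode="gaussian")
```

Scaling the noise by the per-dimension standard deviation means a single λ disturbs a feature measured in units and a feature measured in hundreds by the same relative amount.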
Fig. 9 shows a device 900 for obtaining a model training sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes multiple initial samples. The device includes:
an acquiring unit 91, configured to obtain, by means of the above device for obtaining a disturbance sample set, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
a merging unit 92, configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
Fig. 10 shows a device 1000 for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes multiple initial samples. The device includes:
an acquiring unit 101, configured to obtain, by means of the above device for obtaining a disturbance sample set, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
a first merging unit 102, configured to obtain a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and
a second merging unit 103, configured to obtain a test sample set by merging at least a part of the remaining initial samples among the multiple initial samples with at least a part of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the second merging unit is further configured to obtain the test sample set by merging the remaining initial samples among the multiple initial samples with the remaining disturbance samples among the multiple disturbance samples.
Fig. 11 shows a model training device 1100 according to an embodiment of this specification, including:
a first acquiring unit 111, configured to obtain an initial sample set, wherein the initial sample set includes multiple initial samples;
a second acquiring unit 112, configured to obtain, by means of the above device for obtaining a training sample set and a test sample set and based on the initial sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
a training unit 113, configured to train the current model with each of the multiple training sample sets, to obtain multiple updated models respectively;
an evaluation unit 114, configured to evaluate each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
a determining unit 115, configured to determine, based on the evaluation results, the updated model of the current model from among the multiple updated models.
In another aspect, this specification provides a computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform any of the above methods.
In another aspect, this specification provides a computing device including a memory and a processor, wherein the memory stores executable code and the processor, when executing the executable code, implements any of the above methods.
The embodiments of this specification disturb the training data of a model to simulate the data noise of a real environment, thereby increasing the robustness of the model. The model is trained and evaluated with noisy data, and the predetermined parameter of the model is determined based on the evaluation results, so that the effectiveness of the model on abnormal data is improved quantitatively. In addition, the embodiments of this specification do not constrain the parameters of the machine learning model and therefore do not limit the learning capacity of the model.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Those of ordinary skill in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of this application.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (22)

  1. A method for obtaining a disturbance sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples and each initial sample includes a corresponding feature vector, the method comprising:
    calculating, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the feature-value standard deviation (mean square deviation) of the feature values of that dimension; and
    for each dimension of each of the multiple feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension corresponding to that random number.
  2. The method according to claim 1, wherein the random number is a Gaussian-distributed random number whose standard deviation is the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
  3. The method according to claim 1, wherein the random number is a uniformly distributed random number whose value range lies between the positive and the negative of a first value, the first value being the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
  4. A method for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the method comprising:
    obtaining, by the method according to claim 1, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
    obtaining a training sample set by merging the multiple initial samples with the multiple disturbance samples.
  5. A method for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the method comprising:
    obtaining, by the method according to claim 1, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
    obtaining a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and
    obtaining a test sample set by merging at least a part of the remaining initial samples among the multiple initial samples with at least a part of the remaining disturbance samples among the multiple disturbance samples.
  6. The method according to claim 5, wherein the proportion of the multiple initial samples taken as the part of the initial samples is the same as the proportion of the multiple disturbance samples taken as the part of the disturbance samples.
  7. The method according to claim 6, wherein obtaining the test sample set by merging at least a part of the remaining initial samples among the multiple initial samples with at least a part of the remaining disturbance samples among the multiple disturbance samples comprises obtaining the test sample set by merging the remaining initial samples among the multiple initial samples with the remaining disturbance samples among the multiple disturbance samples.
  8. The method according to claim 6, wherein the part of the initial samples and the part of the disturbance samples respectively correspond to each other.
  9. A model training method, comprising:
    obtaining an initial sample set, wherein the initial sample set includes multiple initial samples;
    obtaining, by the method according to claim 5 and based on the initial sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
    training the current model with each of the multiple training sample sets, to obtain multiple updated models respectively;
    evaluating each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
    determining, based on the evaluation results, the updated model of the current model from among the multiple updated models.
  10. The method according to claim 9, wherein the model is any one of the following types of model: a supervised learning model, an unsupervised learning model, and a reinforcement learning model.
  11. A device for obtaining a disturbance sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples and each initial sample includes a corresponding feature vector, the device comprising:
    a computing unit, configured to calculate, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the feature-value standard deviation (mean square deviation) of the feature values of that dimension; and
    a generation unit, configured to, for each dimension of each of the multiple feature vectors, generate a corresponding random number and update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension corresponding to that random number.
  12. The device according to claim 11, wherein the random number is a Gaussian-distributed random number whose standard deviation is the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
  13. The device according to claim 11, wherein the random number is a uniformly distributed random number whose value range lies between the positive and the negative of a first value, the first value being the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
  14. A device for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the device comprising:
    an acquiring unit, configured to obtain, by means of the device according to claim 11, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
    a merging unit, configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
  15. A device for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the device comprising:
    an acquiring unit, configured to obtain, by means of the device according to claim 11, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
    a first merging unit, configured to obtain a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and
    a second merging unit, configured to obtain a test sample set by merging at least a part of the remaining initial samples among the multiple initial samples with at least a part of the remaining disturbance samples among the multiple disturbance samples.
  16. The device according to claim 15, wherein the proportion of the multiple initial samples taken as the part of the initial samples is the same as the proportion of the multiple disturbance samples taken as the part of the disturbance samples.
  17. The device according to claim 16, wherein the second merging unit is further configured to obtain the test sample set by merging the remaining initial samples among the multiple initial samples with the remaining disturbance samples among the multiple disturbance samples.
  18. The device according to claim 16, wherein the part of the initial samples and the part of the disturbance samples respectively correspond to each other.
  19. A model training device, comprising:
    a first acquiring unit, configured to obtain an initial sample set, wherein the initial sample set includes multiple initial samples;
    a second acquiring unit, configured to obtain, by means of the device according to claim 15 and based on the initial sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
    a training unit, configured to train the current model with each of the multiple training sample sets, to obtain multiple updated models respectively;
    an evaluation unit, configured to evaluate each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
    a determining unit, configured to determine, based on the evaluation results, the updated model of the current model from among the multiple updated models.
  20. The device according to claim 19, wherein the model is any one of the following types of model: a supervised learning model, an unsupervised learning model, and a reinforcement learning model.
  21. A computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform the method according to any one of claims 1-10.
  22. A computing device, including a memory and a processor, wherein the memory stores executable code and the processor, when executing the executable code, implements the method according to any one of claims 1-10.
CN201910133409.3A 2019-02-22 2019-02-22 A kind of model training method and device based on disturbance sample Pending CN110033094A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910133409.3A CN110033094A (en) 2019-02-22 2019-02-22 A kind of model training method and device based on disturbance sample
PCT/CN2020/070290 WO2020168843A1 (en) 2019-02-22 2020-01-03 Model training method and apparatus based on disturbance samples

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910133409.3A CN110033094A (en) 2019-02-22 2019-02-22 A kind of model training method and device based on disturbance sample

Publications (1)

Publication Number Publication Date
CN110033094A true CN110033094A (en) 2019-07-19

Family

ID=67234962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910133409.3A Pending CN110033094A (en) 2019-02-22 2019-02-22 A kind of model training method and device based on disturbance sample

Country Status (2)

Country Link
CN (1) CN110033094A (en)
WO (1) WO2020168843A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111062442A (en) * 2019-12-20 2020-04-24 支付宝(杭州)信息技术有限公司 Method and device for explaining service processing result of service processing model
WO2020168843A1 (en) * 2019-02-22 2020-08-27 阿里巴巴集团控股有限公司 Model training method and apparatus based on disturbance samples
CN111783551A (en) * 2020-06-04 2020-10-16 中国人民解放军军事科学院国防科技创新研究院 Confrontation sample defense method based on Bayes convolutional neural network
CN113780365A (en) * 2021-08-19 2021-12-10 支付宝(杭州)信息技术有限公司 Sample generation method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060048010A1 (en) * 2004-08-30 2006-03-02 Hung-En Tai Data analyzing method for a fault detection and classification system
CN105808500A (en) * 2016-02-26 2016-07-27 山西牡丹深度智能科技有限公司 Realization method and device of deep learning
CN107193863A (en) * 2017-04-01 2017-09-22 广东工业大学 A kind of Data Quality Assessment Methodology of data untagged
CN107315918B (en) * 2017-07-06 2020-05-01 青岛大学 Method for improving steady estimation by using noise
CN110033094A (en) * 2019-02-22 2019-07-19 阿里巴巴集团控股有限公司 A kind of model training method and device based on disturbance sample

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020168843A1 (en) * 2019-02-22 2020-08-27 阿里巴巴集团控股有限公司 Model training method and apparatus based on disturbance samples
CN111062442A (en) * 2019-12-20 2020-04-24 支付宝(杭州)信息技术有限公司 Method and device for explaining service processing result of service processing model
CN111062442B (en) * 2019-12-20 2022-04-12 支付宝(杭州)信息技术有限公司 Method and device for explaining service processing result of service processing model
CN111783551A (en) * 2020-06-04 2020-10-16 中国人民解放军军事科学院国防科技创新研究院 Confrontation sample defense method based on Bayes convolutional neural network
CN111783551B (en) * 2020-06-04 2023-07-25 中国人民解放军军事科学院国防科技创新研究院 Countermeasure sample defense method based on Bayesian convolutional neural network
CN113780365A (en) * 2021-08-19 2021-12-10 支付宝(杭州)信息技术有限公司 Sample generation method and device
CN113780365B (en) * 2021-08-19 2024-06-14 支付宝(杭州)信息技术有限公司 Sample generation method and device

Also Published As

Publication number Publication date
WO2020168843A1 (en) 2020-08-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190719