CN110033094A - Model training method and device based on disturbance samples
- Publication number
- CN110033094A (CN201910133409.3A)
- Authority
- CN
- China
- Prior art keywords
- sample
- disturbance
- model
- initial
- training
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Complex Calculations (AREA)
- Image Analysis (AREA)
Abstract
Embodiments of this specification provide a method and device for obtaining a disturbance sample set, and a method and device for training a model based on the disturbance sample set. The method for obtaining the disturbance sample set includes: calculating, for each dimension of the multiple feature vectors corresponding to multiple initial samples, the standard deviation (mean square deviation) of the feature values of that dimension; and, for each dimension of each of the multiple feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature value standard deviation of the dimension corresponding to that random number.
Description
Technical field
Embodiments of this specification relate to the field of machine learning, and more particularly to a method and device for obtaining a disturbance sample set based on an original training set, a method and device for obtaining a model training sample set, a method and device for obtaining a model training sample set and a test sample set, and a model training method and device based on disturbance samples.
Background
A machine learning model deployed in a real environment faces various challenges, an important one being the stability of the model. Taking a speech recognition model as an example, the data used to train a machine learning model has usually been properly preprocessed and denoised. In a real environment, however, the conditions the model faces are very complex: a noisy environment, microphone echo and the like all cause the noise in the data the model has to process to differ from that in the actual training data, so that the accuracy of the model changes greatly. Improving the robustness of a machine learning model is therefore of great significance for its practical application. At present, machine learning algorithms generally use L1 regularization and L2 regularization to enhance model robustness; both methods achieve robustness by limiting the search space of the model parameters (the absolute values of the parameters).
Therefore, a model training method that enhances model robustness more effectively is needed.
Summary of the invention
Embodiments of this specification aim to provide a more effective model training method that addresses the deficiencies in the prior art.
To achieve the above object, one aspect of this specification provides a method for obtaining a disturbance sample set based on an original training set, the original training set including multiple initial samples and each initial sample including a corresponding feature vector, the method including:
calculating, for each dimension of the multiple feature vectors corresponding to the multiple initial samples, the standard deviation (mean square deviation) of the feature values of that dimension; and
for each dimension of each of the multiple feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature value standard deviation of the dimension corresponding to that random number.
In one embodiment, the random number is a Gaussian-distributed random number whose standard deviation is the product of the first parameter and the feature value standard deviation of the dimension corresponding to the random number.
In one embodiment, the random number is a uniformly distributed random number whose value lies between the negative and the positive of a first value, the first value being the product of the first parameter and the feature value standard deviation of the dimension corresponding to the random number.
Another aspect of this specification provides a method for obtaining a model training sample set based on an original training set, the original training set including multiple initial samples, the method including:
obtaining a disturbance sample set by the above method, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
obtaining a training sample set by merging the multiple initial samples with the multiple disturbance samples.
Another aspect of this specification provides a method for obtaining a model training sample set and a test sample set based on an original training set, the original training set including multiple initial samples, the method including:
obtaining a disturbance sample set by the above method for obtaining a disturbance sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
obtaining a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and
obtaining a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the proportion of the multiple initial samples taken up by the part of the initial samples is the same as the proportion of the multiple disturbance samples taken up by the part of the disturbance samples.
In one embodiment, obtaining the test sample set by merging at least part of the remaining initial samples with at least part of the remaining disturbance samples includes obtaining the test sample set by merging all of the remaining initial samples among the multiple initial samples with all of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the part of the initial samples and the part of the disturbance samples correspond to each other one to one.
Embodiments of this specification provide a model training method, including:
obtaining an original training set, the original training set including multiple initial samples;
obtaining, based on the original training set and by the above method for obtaining a training sample set and a test sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
training the current model separately with each of the multiple training sample sets, to obtain multiple updated models;
assessing each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and its updated model correspond to the same training sample set; and
determining, based on the assessment results, the updated model of the current model from among the multiple updated models.
In one embodiment, the model is any one of the following: a supervised learning model, an unsupervised learning model, and a reinforcement learning model.
Another aspect of this specification provides a device for obtaining a disturbance sample set based on an original training set, the original training set including multiple initial samples and each initial sample including a corresponding feature vector, the device including:
a computing unit configured to calculate, for each dimension of the multiple feature vectors corresponding to the multiple initial samples, the standard deviation (mean square deviation) of the feature values of that dimension; and
a generation unit configured to, for each dimension of each of the multiple feature vectors, generate a corresponding random number and update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature value standard deviation of the dimension corresponding to that random number.
Another aspect of this specification provides a device for obtaining a model training sample set based on an original training set, the original training set including multiple initial samples, the device including:
an acquiring unit configured to obtain a disturbance sample set by means of the above device, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
a merging unit configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
Another aspect of this specification provides a device for obtaining a model training sample set and a test sample set based on an original training set, the original training set including multiple initial samples, the device including:
an acquiring unit configured to obtain a disturbance sample set by means of the above device for obtaining a disturbance sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
a first merging unit configured to obtain a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and
a second merging unit configured to obtain a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the second merging unit is further configured to obtain the test sample set by merging all of the remaining initial samples among the multiple initial samples with all of the remaining disturbance samples among the multiple disturbance samples.
Another aspect of this specification provides a model training device, including:
a first acquiring unit configured to obtain an original training set, the original training set including multiple initial samples;
a second acquiring unit configured to obtain, based on the original training set and by means of the above device for obtaining a training sample set and a test sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
a training unit configured to train the current model separately with each of the multiple training sample sets, to obtain multiple updated models;
an assessment unit configured to assess each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and its updated model correspond to the same training sample set; and
a determination unit configured to determine, based on the assessment results, the updated model of the current model from among the multiple updated models.
Another aspect of this specification provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed in a computer, causes the computer to perform any of the above methods.
Another aspect of this specification provides a computing device including a memory and a processor, characterized in that executable code is stored in the memory and the processor, when executing the executable code, implements any of the above methods.
Embodiments of this specification disturb the training data of the model to simulate the data noise found in a real environment, which increases the robustness of the model. The model is trained and assessed with the noisy data, and the predetermined parameter of the model is determined based on the assessment results, so that the model's effectiveness on abnormal data is improved in a quantifiable way. In addition, the embodiments of this specification place no restriction on the parameters of the machine learning model and therefore do not limit the learning ability of the model.
Brief description of the drawings
Embodiments of this specification can be made clearer by describing them with reference to the accompanying drawings:
Fig. 1 shows a schematic diagram of a model training system 100 according to an embodiment of this specification;
Fig. 2 shows a method for obtaining a disturbance sample set based on an original training set according to an embodiment of this specification;
Fig. 3 schematically illustrates the calculation of the standard deviation of the feature values of one dimension across multiple feature vectors;
Fig. 4 shows n disturbed feature vectors corresponding to the n feature vectors in Fig. 3;
Fig. 5 shows a flowchart of a method for obtaining a model training sample set based on an original training set according to an embodiment of this specification;
Fig. 6 shows a flowchart of a method for obtaining a model training sample set and a test sample set based on an original training set according to an embodiment of this specification;
Fig. 7 shows a flowchart of a model training method according to an embodiment of this specification;
Fig. 8 shows a device 800 for obtaining a disturbance sample set based on an original training set according to an embodiment of this specification;
Fig. 9 shows a device 900 for obtaining a model training sample set based on an original training set according to an embodiment of this specification;
Fig. 10 shows a device 1000 for obtaining a model training sample set and a test sample set based on an original training set according to an embodiment of this specification;
Fig. 11 shows a model training device 1100 according to an embodiment of this specification.
Detailed description of embodiments
Embodiments of this specification are described below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of a model training system 100 according to an embodiment of this specification. As shown in Fig. 1, the system 100 includes a data processing module 11, a training module 12 and an evaluation module 13. In the data processing module 11, a disturbance is applied to each dimension value of the feature vector of each sample in data set A, to obtain data set B. In the training module 12, at least part of the data is taken from data set A (for example 70% of the data in data set A) and at least part of the data is taken from data set B (for example 70% of the data in data set B); the two parts are merged to obtain a training data set, and a machine learning model is trained with the training data set. The machine learning model may be any model, for example the speech recognition model mentioned above. The speech recognition model is, for example, a supervised learning model or a reinforcement learning model; it will be understood that the model may also be an unsupervised learning model. In the evaluation module 13, the remaining data is taken from data set A (for example 30% of the data in data set A) and the remaining data is taken from data set B (for example 30% of the data in data set B); the two parts are merged to obtain a test data set, and the trained model is assessed with the test data set.
Each of the above processes is described in detail below.
Fig. 2 shows a method for obtaining a disturbance sample set based on an original training set according to an embodiment of this specification. The original training set includes multiple initial samples, each initial sample includes a corresponding feature vector, and the method includes:
in step S202, calculating, for each dimension of the multiple feature vectors corresponding to the multiple initial samples, the standard deviation (mean square deviation) of the feature values of that dimension; and
in step S204, for each dimension of each of the multiple feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature value standard deviation of the dimension corresponding to that random number.
First, in step S202, the standard deviation of the feature values of each dimension of the multiple feature vectors corresponding to the multiple initial samples is calculated.
The original training set is, for example, data set A shown in Fig. 1. Data set A includes, for example, n initial samples, and each sample includes its own feature vector. The feature vector is, for example, an m-dimensional feature vector in which each dimension value corresponds to one feature value. In addition, when the model to be trained is, for example, a supervised model, each sample also includes a corresponding label value.
The mean square deviation is the standard deviation, that is, the square root of the variance $\sigma^2$, and can be denoted $\sigma$. The variance of multiple samples is usually calculated with the following formula (1):
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 \tag{1}$$
where $n$ is the total number of samples and $\mu$ is the mean of the $n$ values $x_i$.
Fig. 3 schematically illustrates the calculation of the standard deviation of the feature values of one dimension across multiple feature vectors. Assume in Fig. 3 that there are n feature vectors, each including the feature values of m features. The feature value of the i-th dimension feature (feature i in the figure) of the j-th feature vector (vector j in the figure) can be denoted $x_{ij}$, where $i \in [1, m]$ and $j \in [1, n]$. The standard deviation $\sigma_i$ of the feature values of the i-th dimension of the n feature vectors can then be calculated by the following formula (2):
$$\sigma_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}(x_{ij} - \mu_i)^2} \tag{2}$$
where $\mu_i$ is calculated by the following formula (3):
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij} \tag{3}$$
Specifically, for example, for the feature values $x_{21}, x_{22}, \ldots, x_{2n}$ of the 2nd-dimension feature (feature 2) of the vectors in the dashed box in Fig. 3, the mean $\mu_2$ can be calculated from these n values by formula (3), and the standard deviation $\sigma_2$ by formula (2).
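Formulas (2) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `per_dimension_std` and the example data are hypothetical, and the population form (division by n) is assumed, matching formula (1) as written above.

```python
import math

def per_dimension_std(vectors):
    """Return (means, stds) for each feature dimension of n feature vectors.

    vectors[j][i] plays the role of x_ij: dimension i of feature vector j.
    The population form (division by n) is assumed.
    """
    n = len(vectors)
    m = len(vectors[0])
    # Formula (3): mean of dimension i over the n vectors.
    means = [sum(v[i] for v in vectors) / n for i in range(m)]
    # Formula (2): standard deviation of dimension i over the n vectors.
    stds = [
        math.sqrt(sum((v[i] - means[i]) ** 2 for v in vectors) / n)
        for i in range(m)
    ]
    return means, stds

# Hypothetical example: n = 3 vectors with m = 2 features each.
vectors = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
means, stds = per_dimension_std(vectors)
```

In this example the second feature is constant, so its standard deviation is zero and that dimension would later receive no disturbance.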
In step S204, for each dimension of each of the multiple feature vectors, a corresponding random number is generated and the current feature value of that dimension of the feature vector is updated to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature value standard deviation of the dimension corresponding to that random number.
Referring again to Fig. 3, take vector 1 in the figure: for the dimensions 1, 2, ..., m of this feature vector, corresponding random numbers $a_{11}, a_{21}, \ldots, a_{m1}$ are generated respectively, and the current feature value $x_{i1}$ of each dimension of the feature vector is updated to $x_{i1} + a_{i1}$, where $i \in [1, m]$. This gives the disturbed feature vector corresponding to feature vector 1 (shown in the dashed box in Fig. 4). A disturbance can be applied to every feature vector in the same way to obtain the corresponding disturbed feature vector. Fig. 4 shows the n disturbed feature vectors corresponding to the n feature vectors in Fig. 3. Training the model with the disturbed feature vectors, that is, adding noise to the training samples, improves the model's resistance to noise and enhances its stability.
In one embodiment, each random number $a_{ij}$ in Fig. 4, where $i \in [1, m]$ and $j \in [1, n]$, can be obtained from a Gaussian random variable $A = \mathrm{norm}(0, \lambda\sigma_i)$, that is, a random variable with mean 0 and standard deviation $\lambda\sigma_i$. Here $\sigma_i$ is the feature value standard deviation of dimension i (feature i) calculated by formula (2) above, and $\lambda$ is a predetermined parameter whose value may lie between 0.0001 and 0.1. For example, with reference to Figs. 3 and 4, each $a_{2j}$ corresponding to the dimension of feature 2 in Fig. 4 is generated by $A = \mathrm{norm}(0, \lambda\sigma_2)$, where $\sigma_2$ is the $\sigma_2$ shown in Fig. 3. As is well known to those skilled in the art, according to the probability density of a Gaussian variable, $a_{ij}$ falls in the interval $[-3\lambda\sigma_i, 3\lambda\sigma_i]$ with a probability of more than 99%. In other words, the product $\lambda\sigma_i$ of the predetermined parameter $\lambda$ and the feature value standard deviation $\sigma_i$ of dimension i defines the value range of each random number $a_{ij}$ of dimension i, where j ranges from 1 to n.
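The Gaussian embodiment can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `gaussian_disturb`, the `seed` argument and the example data are assumptions, and the per-dimension standard deviation is taken in population form via `statistics.pstdev`.

```python
import random
import statistics

def gaussian_disturb(vectors, lam, seed=None):
    """Add Gaussian noise a_ij ~ norm(0, lam * sigma_i) to every feature value.

    lam is the predetermined parameter (lambda); sigma_i is the population
    standard deviation of dimension i over all vectors.
    """
    rng = random.Random(seed)  # seed is only for reproducibility of the sketch
    m = len(vectors[0])
    stds = [statistics.pstdev([v[i] for v in vectors]) for i in range(m)]
    return [
        [x + rng.gauss(0.0, lam * stds[i]) for i, x in enumerate(v)]
        for v in vectors
    ]

# Hypothetical example data.
vectors = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
disturbed = gaussian_disturb(vectors, lam=0.01, seed=0)
```

Each disturbed vector keeps the position of its source vector in the list, so the one-to-one correspondence between initial and disturbance samples is preserved.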
In one embodiment, each random number $a_{ij}$ in Fig. 4 can be obtained from a uniform random variable B, which may be, for example, a uniform random variable on the range $[-\lambda\sigma_i, \lambda\sigma_i]$. That is, the product $\lambda\sigma_i$ limits the value range of each random number $a_{ij}$. The value range of the uniform random variable B is not limited to $[-\lambda\sigma_i, \lambda\sigma_i]$; it may also be, for example, $[-3\lambda\sigma_i, 3\lambda\sigma_i]$. In addition, the random numbers $a_{ij}$ are not limited to being obtained from the above Gaussian or uniform random variable; they can be obtained from any other random variable, such as a Poisson random variable, as long as its value range is limited by $\lambda\sigma_i$.
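The uniform embodiment differs from the Gaussian one only in how the random number is drawn. In the sketch below, the function name `uniform_disturb` and the `width` argument (1.0 for the range $[-\lambda\sigma_i, \lambda\sigma_i]$, 3.0 for $[-3\lambda\sigma_i, 3\lambda\sigma_i]$) are hypothetical names introduced for illustration.

```python
import random
import statistics

def uniform_disturb(vectors, lam, width=1.0, seed=None):
    """Add uniform noise drawn from [-width*lam*sigma_i, +width*lam*sigma_i]."""
    rng = random.Random(seed)  # seed is only for reproducibility of the sketch
    m = len(vectors[0])
    stds = [statistics.pstdev([v[i] for v in vectors]) for i in range(m)]
    disturbed = []
    for v in vectors:
        row = []
        for i, x in enumerate(v):
            bound = width * lam * stds[i]  # half-width of the noise interval
            row.append(x + rng.uniform(-bound, bound))
        disturbed.append(row)
    return disturbed

# Hypothetical example data.
vectors = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
disturbed = uniform_disturb(vectors, lam=0.01, seed=0)
```

Unlike the Gaussian case, the disturbance here is strictly bounded, which may be preferable when feature values must stay within a known tolerance.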
The parameter $\lambda$ balances the added noise against model performance. The smaller the value of $\lambda$, the smaller the value range of $a_{ij}$, that is, the smaller the applied disturbance. When the applied disturbance is too small, its influence on the feature vector is too small to improve model performance; when the applied disturbance is too large, it impairs the prediction accuracy of the model. The value of $\lambda$ is therefore important for improving model performance. In one embodiment, the value of $\lambda$ can be determined from the specific environment in which the model is applied. For example, for a speech recognition model, $\lambda$ can be set larger when the environment of application is noisy and the noise is strong, and smaller when the environment is quiet and the noise is weak. In one embodiment, as described in detail below, the value of $\lambda$ is determined by assessing the trained models, that is, the $\lambda$ with the better assessment value is selected as the $\lambda$ for the final model training. In one embodiment, once $\lambda$ has been determined while training the model with an earlier batch of training sample sets, that $\lambda$ value can be reused in subsequent repeated training.
After the disturbance sample corresponding to each initial sample of the original training set has been obtained by the method shown in Fig. 2, the disturbance sample set is available, and the training sample set and the test sample set of the model can be obtained based on the original training set and the disturbance sample set.
Fig. 5 shows a flowchart of a method for obtaining a model training sample set based on an original training set according to an embodiment of this specification, the original training set including multiple initial samples, the method including:
in step S502, obtaining a disturbance sample set based on the original training set by the method shown in Fig. 2, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
in step S504, obtaining a training sample set by merging the multiple initial samples with the multiple disturbance samples.
First, in step S502, multiple disturbance samples respectively corresponding to the multiple initial samples are generated by the method shown in Fig. 2, wherein the multiple disturbance samples correspond to a first parameter with the same value. The multiple disturbance samples are, for example, the disturbance samples shown in Fig. 4. As described above, each random number $a_{ij}$ in the multiple disturbance samples can be obtained from the Gaussian random variable $A = \mathrm{norm}(0, \lambda\sigma_i)$, that is, this batch of disturbance samples corresponds to the same $\lambda$ value.
In step S504, the training sample set is obtained by merging the multiple initial samples with the multiple disturbance samples. When the trained model does not need to be assessed, all samples in the original training set can be merged with all samples in the disturbance sample set to obtain the training sample set. Including both the original training set and the disturbance sample set in the training sample set enriches the training samples of the model, so that the model can adapt to different real environments.
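Steps S502 and S504 can be sketched for labeled samples as below. The representation of a sample as a `(feature_vector, label)` pair, the function names, and the choice that a disturbance sample keeps the label of its initial sample are assumptions made for this illustration; the Gaussian variant is used for the noise.

```python
import random
import statistics

def disturb_samples(samples, lam, seed=None):
    """Step S502 (sketch): one disturbance sample per initial sample, same lambda."""
    rng = random.Random(seed)
    vectors = [x for x, _ in samples]
    # zip(*vectors) yields one tuple of feature values per dimension.
    stds = [statistics.pstdev(col) for col in zip(*vectors)]
    # Assumption: the label y is carried over unchanged.
    return [
        ([v + rng.gauss(0.0, lam * s) for v, s in zip(x, stds)], y)
        for x, y in samples
    ]

def build_training_set(samples, lam, seed=None):
    """Step S504 (sketch): merge all initial samples with all disturbance samples."""
    return list(samples) + disturb_samples(samples, lam, seed)

# Hypothetical labeled samples.
samples = [([1.0, 10.0], "a"), ([3.0, 10.0], "b"), ([5.0, 10.0], "a")]
training = build_training_set(samples, lam=0.01, seed=0)
```

The resulting training set is twice the size of the original training set, with the initial samples first and their disturbed counterparts after them.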
Fig. 6 shows a flowchart of a method for obtaining a model training sample set and a test sample set based on an original training set according to an embodiment of this specification, the original training set including multiple initial samples, the method including:
in step S602, obtaining a disturbance sample set based on the original training set by the method shown in Fig. 2, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
in step S604, obtaining a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and
in step S606, obtaining a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
In this method, after the disturbance sample set has been obtained, the training sample set can be obtained by merging a part of the initial samples with a part of the disturbance samples. In one embodiment, for example, 70% of the initial samples in the original training set can be merged with 70% of the disturbance samples in the disturbance sample set to obtain the training sample set. The remaining initial samples in the original training set (for example 30%) are then merged with the remaining disturbance samples in the disturbance sample set (for example 30%) to obtain the test sample set corresponding to the training sample set. The 70% of initial samples and the 70% of disturbance samples in the training sample set may correspond to each other, or may not.
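The 70%/30% split of steps S604 and S606 can be sketched as follows, here in the variant where the selected initial samples and disturbance samples correspond to each other. The function name `split_sets`, the random shuffling and the placeholder samples are assumptions for illustration; `disturbed[k]` is assumed to be the disturbed copy of `initial[k]`.

```python
import random

def split_sets(initial, disturbed, train_ratio=0.7, seed=None):
    """Build a training set and a test set from corresponding sample lists."""
    rng = random.Random(seed)
    idx = list(range(len(initial)))
    rng.shuffle(idx)  # random choice of which samples go to training
    cut = round(train_ratio * len(idx))
    train_idx, test_idx = idx[:cut], idx[cut:]
    # S604: part of the initial samples plus the matching disturbance samples.
    train = [initial[k] for k in train_idx] + [disturbed[k] for k in train_idx]
    # S606: all remaining initial samples plus all remaining disturbance samples.
    test = [initial[k] for k in test_idx] + [disturbed[k] for k in test_idx]
    return train, test

# Placeholder samples: (origin, index) tuples stand in for real feature vectors.
initial = [("initial", k) for k in range(10)]
disturbed = [("disturbed", k) for k in range(10)]
train, test = split_sets(initial, disturbed, seed=0)
```

Keeping the disturbed copy of a sample on the same side of the split as its source avoids testing on a near-duplicate of a training sample; the source text allows either choice.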
In one embodiment, the proportion of the multiple initial samples taken up by the part of the initial samples may also differ from the proportion of the multiple disturbance samples taken up by the part of the disturbance samples. For example, when the actual application environment of the model is noisy, the training sample set can include a larger proportion of disturbance samples, for example 80% of all disturbance samples and 20% of all initial samples. Correspondingly, the test sample set can be configured in the same proportion, for example with disturbance samples taken from the remaining disturbance samples and amounting to 20% of all disturbance samples, and initial samples taken from the remaining initial samples and amounting to 5% of all initial samples.
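The unequal proportions in this example (20% initial and 80% disturbance samples for training, 5% initial and 20% disturbance samples for testing) can be sketched as below. The function name `split_with_ratios` and its parameters are hypothetical, and the two lists are shuffled independently, so the selected initial and disturbance samples do not necessarily correspond.

```python
import random

def split_with_ratios(initial, disturbed, train_init, train_dist,
                      test_init, test_dist, seed=None):
    """Each ratio is a fraction of ALL initial or ALL disturbance samples."""
    rng = random.Random(seed)
    init = list(initial)
    dist = list(disturbed)
    rng.shuffle(init)
    rng.shuffle(dist)
    a = round(train_init * len(init))  # initial samples used for training
    b = round(train_dist * len(dist))  # disturbance samples used for training
    c = round(test_init * len(init))   # initial samples used for testing
    d = round(test_dist * len(dist))   # disturbance samples used for testing
    train = init[:a] + dist[:b]
    # Test samples are drawn only from what was not used for training.
    test = init[a:a + c] + dist[b:b + d]
    return train, test

# Placeholder samples standing in for real feature vectors.
initial = [("initial", k) for k in range(100)]
disturbed = [("disturbed", k) for k in range(100)]
train, test = split_with_ratios(initial, disturbed, 0.2, 0.8, 0.05, 0.2, seed=0)
```

With 100 samples of each kind this yields 20 + 80 training samples and 5 + 20 test samples, so both sets contain disturbance and initial samples in a 4:1 ratio.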
After the training sample set and the test sample set of the model have been obtained as shown in Figs. 5 and 6, the model can be trained with the training sample set and assessed with the test sample set. The following describes a method that selects the predetermined parameter $\lambda$ based on the assessment of the model on the test sample set, and thereby further optimizes the model.
Fig. 7 shows a flowchart of a model training method according to an embodiment of this specification, including:
in step S702, obtaining an original training set, the original training set including multiple initial samples;
in step S704, obtaining, based on the original training set and by the method shown in Fig. 6, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
in step S706, training the current model separately with each of the multiple training sample sets, to obtain multiple updated models;
in step S708, assessing each updated model with the corresponding one of the multiple test sample sets, wherein a test sample set and its updated model correspond to the same training sample set; and
in step S710, determining, based on the assessment results, the updated model of the current model from among the multiple updated models.
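Steps S706 to S710 amount to a selection loop over the candidate values of the first parameter. The sketch below assumes that the (training set, test set) pairs from step S704 are already available in a dictionary keyed by the parameter value, that `train` and `evaluate` are caller-supplied callables, and that a higher score is better; all names, including the toy stand-ins at the end, are hypothetical.

```python
from statistics import mean

def select_updated_model(current_model, sample_sets, train, evaluate):
    """Return (score, lam, updated_model) for the best-scoring candidate.

    sample_sets maps each lambda value to its (training_set, test_set) pair.
    """
    best = None
    for lam, (train_set, test_set) in sample_sets.items():
        updated = train(current_model, train_set)  # S706: one updated model per set
        score = evaluate(updated, test_set)        # S708: assess on the matching test set
        if best is None or score > best[0]:
            best = (score, lam, updated)
    return best                                    # S710: keep the best updated model

# Toy stand-ins: the "model" is a single number fitted as the training mean.
def toy_train(model, train_set):
    return mean(train_set)

def toy_evaluate(model, test_set):
    return -abs(model - mean(test_set))  # higher (closer to zero) is better

sample_sets = {0.01: ([1.0, 1.1], [1.0]), 0.1: ([1.0, 2.0], [1.0])}
score, best_lam, best_model = select_updated_model(0.0, sample_sets, toy_train, toy_evaluate)
```

In a real setting `train` would start each run from a copy of the same current model, so that the candidates differ only in the training sample set they were given.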
First, in step S702, an original training set is obtained, the original training set including multiple initial samples.
The model is not limited to a specific type and may be any of a supervised learning model, an unsupervised learning model and a reinforcement learning model, as described above. For example, the model is the speech recognition model described above, which is, for example, a supervised learning model. In this case, speech can be entered manually and the corresponding feature vector extracted from the speech, so that the feature vector and a label value (the semantics) are used as an initial sample of the model. In practical applications, however, the model may encounter different environments, such as a quiet environment and various noisy environments with different noise. Initial samples obtained manually in a single environment cannot simulate so many different environments, and obtaining samples manually in each environment is relatively costly. The samples are therefore expanded based on the original training set by this method, to obtain the training sample sets.
In step S704, multiple training sample sets and multiple test sample sets respectively corresponding to them are obtained by the method shown in Fig. 6 based on the initial sample set, wherein the multiple training sample sets respectively correspond to multiple different values of the first parameter.
That is, with the above predetermined parameter λ taking different values, the method shown in Fig. 6 is performed multiple times on the initial sample set to obtain multiple training sample sets and the test sample sets respectively corresponding to them. For example, λ may take the values 0.0001, 0.001, 0.01, and 0.1, so that the influence of the order of magnitude of λ on model training can be determined. It will be appreciated that the values of λ are not limited to the above manner or number, and may be set according to the specific model. Specifically, for the above four values of λ, four disturbance sample sets B1, B2, B3, B4 can be obtained from the initial sample set A by the method shown in Fig. 2. Suppose that four pairs of sample sets (C1, D1), (C2, D2), (C3, D3), (C4, D4) are obtained from these four disturbance sample sets respectively, where Ci denotes a training sample set and Di denotes a test sample set.
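In symbols, the construction above can be written roughly as follows. This is a sketch that assumes the Gaussian variant of the random number; the sample index k, the dimension index j, the symbol σ_j for the feature-value mean square deviation of dimension j, and the operator name "split" (standing for the Fig. 6 procedure) are notation introduced here for illustration and do not appear in the specification.

```latex
\begin{align*}
\lambda_i &\in \{10^{-4},\ 10^{-3},\ 10^{-2},\ 10^{-1}\}, \quad i = 1,\dots,4 \\
\tilde{x}^{(i)}_{k,j} &= x_{k,j} + \varepsilon^{(i)}_{k,j}, \quad
  \varepsilon^{(i)}_{k,j} \sim \mathcal{N}\!\left(0,\ (\lambda_i \sigma_j)^2\right) \\
B_i &= \{\tilde{x}^{(i)}_k : x_k \in A\}, \quad
  (C_i, D_i) = \operatorname{split}(A, B_i)
\end{align*}
```

For instance, if dimension j has σ_j = 2 and λ = 0.01, the random number added to that dimension has a mean square deviation of 0.02, so each tenfold step in λ scales the disturbance tenfold.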
In step S706, the current model is trained separately with each of the training sample sets to obtain multiple updated models.
In the above example, that is, the current model is trained separately with each of the training sample sets C1, C2, C3, C4 to obtain four updated models M1, M2, M3, M4.
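The separate training runs can be sketched as follows. This is only an illustration under stated assumptions: the specification does not prescribe a model type or an API, so `CentroidModel` is a toy stand-in classifier and `train_candidates` is a hypothetical helper. Copying the current model before each run, so that every updated model starts from the same state, is likewise an assumption.

```python
import copy


class CentroidModel:
    """Toy stand-in classifier (hypothetical): nearest class centroid."""

    def __init__(self):
        self.centroids = {}

    def fit(self, samples):
        # samples: iterable of (feature_vector, label) pairs
        sums, counts = {}, {}
        for features, label in samples:
            acc = sums.setdefault(label, [0.0] * len(features))
            for j, value in enumerate(features):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, features):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def train_candidates(current_model, training_sets):
    """Train an independent copy of the current model on each training set."""
    candidates = []
    for samples in training_sets:
        model = copy.deepcopy(current_model)  # leave the current model untouched
        model.fit(samples)
        candidates.append(model)
    return candidates
```

With four training sample sets C1 to C4, `train_candidates(current_model, [C1, C2, C3, C4])` would return the four candidates M1 to M4 in the same order.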
In step S708, each updated model is evaluated with its corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set.
In the above example, that is, the test sample sets D1, D2, D3, D4 are used to evaluate the four updated models M1, M2, M3, M4 respectively. Test sample set D1 and updated model M1 both correspond to training sample set C1, so D1 corresponds to M1; likewise, D2 corresponds to M2, D3 corresponds to M3, and D4 corresponds to M4. Various evaluation metrics of the corresponding updated model, such as accuracy, precision, and recall, can be computed on the test samples in order to evaluate that model; for example, an evaluation value of the model can be obtained by combining the various metrics.
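A minimal sketch of such an evaluation for a binary classifier is given below. The function names are hypothetical, and combining the metrics by an unweighted mean is an assumption made here; the specification only says that the metrics can be combined into an evaluation value, not how.

```python
def evaluate(predictions, labels, positive=1):
    """Return accuracy, precision and recall for binary predictions."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == positive and y == positive)
    fp = sum(1 for p, y in pairs if p == positive and y != positive)
    fn = sum(1 for p, y in pairs if p != positive and y == positive)
    correct = sum(1 for p, y in pairs if p == y)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def overall_score(metrics):
    """Combine the metrics into one value (unweighted mean: an assumption)."""
    return sum(metrics.values()) / len(metrics)
```

Each updated model Mi would be scored only on the predictions it makes for its own test sample set Di, so that the scores of different λ values are obtained under matching disturbance levels.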
In step S710, the updated model of the current model is determined from among the multiple updated models based on the evaluation results.
In the above example, after the evaluation value of the updated model corresponding to each λ is obtained, the updated model with the highest evaluation value can, for example, be determined as the updated model of the current model, that is, the trained model, and this model is retained for subsequent use, such as model prediction.
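The selection step can be sketched as follows. `select_best` is a hypothetical helper; breaking ties in favor of the first candidate is a choice made here, since the specification does not say how ties are handled.

```python
def select_best(lambdas, models, scores):
    """Return (lambda, model, score) for the highest-scoring updated model."""
    if not scores or not (len(lambdas) == len(models) == len(scores)):
        raise ValueError("need one model and one score per lambda value")
    # max() returns the first index on ties, so earlier candidates win ties.
    best = max(range(len(scores)), key=scores.__getitem__)
    return lambdas[best], models[best], scores[best]


# Usage with placeholder scores (illustrative values only, not results
# reported in the specification):
lam, model, score = select_best(
    [0.0001, 0.001, 0.01, 0.1], ["M1", "M2", "M3", "M4"], [0.81, 0.86, 0.90, 0.74]
)
```

Returning the λ value together with the model keeps a record of which disturbance magnitude produced the retained model.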
Fig. 8 shows a device 800 for obtaining a disturbance sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes multiple initial samples and each initial sample includes a corresponding feature vector. The device includes:
A computing unit 81, configured to calculate, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the mean square deviation of the feature values of that dimension; and
A generation unit 82, configured to generate, for each dimension of each of the multiple feature vectors, a corresponding random number, and to update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value mean square deviation of the dimension corresponding to that random number.
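What the computing unit and the generation unit do can be sketched in a few lines. The sketch assumes that feature vectors are plain lists of floats and that "mean square deviation" means the population standard deviation; the function names and the `kind` switch are hypothetical. The two noise kinds follow the Gaussian and uniform variants recited in the claims.

```python
import random
import statistics


def feature_stds(vectors):
    """Per-dimension standard deviation over all initial feature vectors."""
    # zip(*vectors) turns the sample-by-dimension list into per-dimension columns.
    return [statistics.pstdev(column) for column in zip(*vectors)]


def perturb(vectors, lam, kind="gaussian", rng=None):
    """Return one disturbance copy of each feature vector.

    lam is the first parameter; the noise scale for dimension j is
    lam * sigma_j, where sigma_j comes from feature_stds().
    """
    rng = rng or random.Random()
    scales = [lam * s for s in feature_stds(vectors)]
    perturbed = []
    for vec in vectors:
        if kind == "gaussian":
            # zero-mean Gaussian noise with standard deviation lam * sigma_j
            noise = [rng.gauss(0.0, s) for s in scales]
        elif kind == "uniform":
            # uniform noise in [-lam * sigma_j, +lam * sigma_j]
            noise = [rng.uniform(-s, s) for s in scales]
        else:
            raise ValueError(f"unknown noise kind: {kind}")
        perturbed.append([x + n for x, n in zip(vec, noise)])
    return perturbed
```

Scaling the noise by each dimension's own deviation keeps the disturbance proportionate across features with very different numeric ranges; a dimension that is constant across all samples receives no noise at all.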
Fig. 9 shows a device 900 for obtaining a model training sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes multiple initial samples. The device includes:
An acquiring unit 91, configured to obtain, by means of the above device, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and
A combining unit 92, configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
Fig. 10 shows a device 1000 for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes multiple initial samples. The device includes:
An acquiring unit 101, configured to obtain, by means of the above device for obtaining a disturbance sample set, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples;
A first combining unit 102, configured to obtain a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and
A second combining unit 103, configured to obtain a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
In one embodiment, the second combining unit is further configured to obtain the test sample set by merging the remaining initial samples among the multiple initial samples with the remaining disturbance samples among the multiple disturbance samples.
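The two combining units can be sketched as a single split over paired samples. The sketch assumes that the i-th disturbance sample was generated from the i-th initial sample and uses the same indices on both lists, which is one way to make the selected parts correspond and take up the same proportion; the function name and the 0.8 default fraction are assumptions, not values from the specification.

```python
import random


def split_train_test(initial, perturbed, train_fraction=0.8, rng=None):
    """Split paired initial/disturbance samples into training and test sets.

    The same indices are applied to both lists, so the chosen initial
    samples and the chosen disturbance samples correspond one to one.
    """
    if len(initial) != len(perturbed):
        raise ValueError("expected one disturbance sample per initial sample")
    rng = rng or random.Random()
    indices = list(range(len(initial)))
    rng.shuffle(indices)
    cut = int(len(indices) * train_fraction)
    train_idx, test_idx = indices[:cut], indices[cut:]
    # Training set: selected initial samples plus their disturbance samples.
    train = [initial[i] for i in train_idx] + [perturbed[i] for i in train_idx]
    # Test set: all remaining initial samples plus all remaining disturbance samples.
    test = [initial[i] for i in test_idx] + [perturbed[i] for i in test_idx]
    return train, test
```

Splitting by index keeps a disturbance sample out of the test set whenever its initial sample was used for training, so the test set is not a noisy copy of training data.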
Fig. 11 shows a model training device 1100 according to an embodiment of this specification, comprising:
A first acquiring unit 111, configured to obtain an initial sample set, wherein the initial sample set includes multiple initial samples;
A second acquiring unit 112, configured to obtain, by means of the above device for obtaining a training sample set and a test sample set and based on the initial sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the training sample sets, wherein the multiple training sample sets respectively correspond to multiple different values of the first parameter;
A training unit 113, configured to train the current model separately with each of the training sample sets, to obtain multiple updated models;
An evaluation unit 114, configured to evaluate each updated model with its corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
A determination unit 115, configured to determine, based on the evaluation results, the updated model of the current model from among the multiple updated models.
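One possible software arrangement of units 111 to 115 is sketched below, with each unit supplied as a callable. The class name, the field names, and the callable signatures are all hypothetical; the specification leaves the realization of the units open (hardware, software, or both).

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass
class ModelTrainingApparatus:
    """Hypothetical composition of units 111-115 as injected callables."""

    get_initial_samples: Callable[[], Sequence[Any]]                # unit 111
    build_sets: Callable[[Sequence[Any], float], Tuple[Any, Any]]   # unit 112
    train: Callable[[Any, Any], Any]                                # unit 113
    evaluate: Callable[[Any, Any], float]                           # unit 114

    def run(self, current_model, lambdas):
        initial = self.get_initial_samples()
        results = []
        for lam in lambdas:
            train_set, test_set = self.build_sets(initial, lam)
            # train() is expected to return a new model, not mutate current_model
            updated = self.train(current_model, train_set)
            results.append((self.evaluate(updated, test_set), lam, updated))
        # unit 115: keep the candidate with the highest evaluation value
        score, lam, model = max(results, key=lambda r: r[0])
        return model, lam, score
```

Injecting the units keeps the selection loop independent of the model type, which matches the statement that the model may be supervised, unsupervised, or reinforcement learning.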
In another aspect, this specification provides a computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform any of the above methods.
In another aspect, this specification provides a computing device including a memory and a processor, wherein the memory stores executable code and the processor, when executing the executable code, implements any of the above methods.
The embodiments of this specification perturb the training data of a model to simulate the data noise found in real environments, thereby increasing the robustness of the model. By training and evaluating the model with the noisy data and determining the model's predetermined parameter based on the evaluation results, the model's effectiveness on abnormal data is improved in a quantitative way. In addition, the embodiments of this specification place no constraints on the parameters of the machine learning model, and therefore do not limit the learning capacity of the model.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Those of ordinary skill in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or in software depends on the specific application and design constraints of the technical solution. Those of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The specific embodiments described above further explain in detail the objectives, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing is merely specific embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (22)
- 1. A method for obtaining a disturbance sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples and each initial sample includes a corresponding feature vector, the method comprising: calculating, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the mean square deviation of the feature values of that dimension; and, for each dimension of each of the multiple feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value mean square deviation of the dimension corresponding to that random number.
- 2. The method according to claim 1, wherein the random number is a Gaussian-distributed random number, and the mean square deviation of the Gaussian-distributed random number is the product of the first parameter and the feature-value mean square deviation of the dimension corresponding to the random number.
- 3. The method according to claim 1, wherein the random number is a uniformly distributed random number whose value range lies between the positive and the negative of a first value, the first value being the product of the first parameter and the feature-value mean square deviation of the dimension corresponding to the random number.
- 4. A method for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the method comprising: obtaining, by the method according to claim 1, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and obtaining a training sample set by merging the multiple initial samples with the multiple disturbance samples.
- 5. A method for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the method comprising: obtaining, by the method according to claim 1, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; obtaining a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and obtaining a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
- 6. The method according to claim 5, wherein the proportion of the multiple initial samples made up by the part of the initial samples is the same as the proportion of the multiple disturbance samples made up by the part of the disturbance samples.
- 7. The method according to claim 6, wherein obtaining the test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples comprises obtaining the test sample set by merging the remaining initial samples among the multiple initial samples with the remaining disturbance samples among the multiple disturbance samples.
- 8. The method according to claim 6, wherein the part of the initial samples and the part of the disturbance samples respectively correspond to each other.
- 9. A model training method, comprising: obtaining an initial sample set, wherein the initial sample set includes multiple initial samples; obtaining, by the method according to claim 5 and based on the initial sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the training sample sets, wherein the multiple training sample sets respectively correspond to multiple different values of the first parameter; training the current model separately with each of the training sample sets, to obtain multiple updated models; evaluating each updated model with its corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and determining, based on the evaluation results, the updated model of the current model from among the multiple updated models.
- 10. The method according to claim 9, wherein the model is any one of the following types of model: a supervised learning model, an unsupervised learning model, and a reinforcement learning model.
- 11. A device for obtaining a disturbance sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples and each initial sample includes a corresponding feature vector, the device comprising: a computing unit, configured to calculate, for each dimension of the multiple feature vectors respectively corresponding to the multiple initial samples, the mean square deviation of the feature values of that dimension; and a generation unit, configured to generate, for each dimension of each of the multiple feature vectors, a corresponding random number, and to update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate multiple disturbance samples respectively corresponding to the multiple feature vectors and thereby obtain a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value mean square deviation of the dimension corresponding to that random number.
- 12. The device according to claim 11, wherein the random number is a Gaussian-distributed random number, and the mean square deviation of the Gaussian-distributed random number is the product of the first parameter and the feature-value mean square deviation of the dimension corresponding to the random number.
- 13. The device according to claim 11, wherein the random number is a uniformly distributed random number whose value range lies between the positive and the negative of a first value, the first value being the product of the first parameter and the feature-value mean square deviation of the dimension corresponding to the random number.
- 14. A device for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the device comprising: an acquiring unit, configured to obtain, by means of the device according to claim 11, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; and a combining unit, configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
- 15. A device for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes multiple initial samples, the device comprising: an acquiring unit, configured to obtain, by means of the device according to claim 11, a disturbance sample set based on the initial sample set, the disturbance sample set including multiple disturbance samples respectively corresponding to the multiple initial samples; a first combining unit, configured to obtain a training sample set by merging a part of the multiple initial samples with a part of the multiple disturbance samples; and a second combining unit, configured to obtain a test sample set by merging at least part of the remaining initial samples among the multiple initial samples with at least part of the remaining disturbance samples among the multiple disturbance samples.
- 16. The device according to claim 15, wherein the proportion of the multiple initial samples made up by the part of the initial samples is the same as the proportion of the multiple disturbance samples made up by the part of the disturbance samples.
- 17. The device according to claim 16, wherein the second combining unit is further configured to obtain the test sample set by merging the remaining initial samples among the multiple initial samples with the remaining disturbance samples among the multiple disturbance samples.
- 18. The device according to claim 16, wherein the part of the initial samples and the part of the disturbance samples respectively correspond to each other.
- 19. A model training device, comprising: a first acquiring unit, configured to obtain an initial sample set, wherein the initial sample set includes multiple initial samples; a second acquiring unit, configured to obtain, by means of the device according to claim 15 and based on the initial sample set, multiple training sample sets and multiple test sample sets respectively corresponding to the training sample sets, wherein the multiple training sample sets respectively correspond to multiple different values of the first parameter; a training unit, configured to train the current model separately with each of the training sample sets, to obtain multiple updated models; an evaluation unit, configured to evaluate each updated model with its corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and a determination unit, configured to determine, based on the evaluation results, the updated model of the current model from among the multiple updated models.
- 20. The device according to claim 19, wherein the model is any one of the following types of model: a supervised learning model, an unsupervised learning model, and a reinforcement learning model.
- 21. A computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1-10.
- 22. A computing device comprising a memory and a processor, characterized in that the memory stores executable code and the processor, when executing the executable code, implements the method of any one of claims 1-10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910133409.3A CN110033094A (en) | 2019-02-22 | 2019-02-22 | A kind of model training method and device based on disturbance sample |
PCT/CN2020/070290 WO2020168843A1 (en) | 2019-02-22 | 2020-01-03 | Model training method and apparatus based on disturbance samples |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910133409.3A CN110033094A (en) | 2019-02-22 | 2019-02-22 | A kind of model training method and device based on disturbance sample |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110033094A true CN110033094A (en) | 2019-07-19 |
Family
ID=67234962
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910133409.3A Pending CN110033094A (en) | 2019-02-22 | 2019-02-22 | A kind of model training method and device based on disturbance sample |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110033094A (en) |
WO (1) | WO2020168843A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111062442A (en) * | 2019-12-20 | 2020-04-24 | 支付宝(杭州)信息技术有限公司 | Method and device for explaining service processing result of service processing model |
WO2020168843A1 (en) * | 2019-02-22 | 2020-08-27 | 阿里巴巴集团控股有限公司 | Model training method and apparatus based on disturbance samples |
CN111783551A (en) * | 2020-06-04 | 2020-10-16 | 中国人民解放军军事科学院国防科技创新研究院 | Confrontation sample defense method based on Bayes convolutional neural network |
CN113780365A (en) * | 2021-08-19 | 2021-12-10 | 支付宝(杭州)信息技术有限公司 | Sample generation method and device |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060048010A1 (en) * | 2004-08-30 | 2006-03-02 | Hung-En Tai | Data analyzing method for a fault detection and classification system |
CN105808500A (en) * | 2016-02-26 | 2016-07-27 | 山西牡丹深度智能科技有限公司 | Realization method and device of deep learning |
CN107193863A (en) * | 2017-04-01 | 2017-09-22 | 广东工业大学 | A kind of Data Quality Assessment Methodology of data untagged |
CN107315918B (en) * | 2017-07-06 | 2020-05-01 | 青岛大学 | Method for improving steady estimation by using noise |
CN110033094A (en) * | 2019-02-22 | 2019-07-19 | 阿里巴巴集团控股有限公司 | A kind of model training method and device based on disturbance sample |
-
2019
- 2019-02-22 CN CN201910133409.3A patent/CN110033094A/en active Pending
-
2020
- 2020-01-03 WO PCT/CN2020/070290 patent/WO2020168843A1/en active Application Filing
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020168843A1 (en) * | 2019-02-22 | 2020-08-27 | 阿里巴巴集团控股有限公司 | Model training method and apparatus based on disturbance samples |
CN111062442A (en) * | 2019-12-20 | 2020-04-24 | 支付宝(杭州)信息技术有限公司 | Method and device for explaining service processing result of service processing model |
CN111062442B (en) * | 2019-12-20 | 2022-04-12 | 支付宝(杭州)信息技术有限公司 | Method and device for explaining service processing result of service processing model |
CN111783551A (en) * | 2020-06-04 | 2020-10-16 | 中国人民解放军军事科学院国防科技创新研究院 | Confrontation sample defense method based on Bayes convolutional neural network |
CN111783551B (en) * | 2020-06-04 | 2023-07-25 | 中国人民解放军军事科学院国防科技创新研究院 | Countermeasure sample defense method based on Bayesian convolutional neural network |
CN113780365A (en) * | 2021-08-19 | 2021-12-10 | 支付宝(杭州)信息技术有限公司 | Sample generation method and device |
CN113780365B (en) * | 2021-08-19 | 2024-06-14 | 支付宝(杭州)信息技术有限公司 | Sample generation method and device |
Also Published As
Publication number | Publication date |
---|---|
WO2020168843A1 (en) | 2020-08-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190719 |