WO2020168843A1 - Model training method and apparatus based on disturbance samples - Google Patents

Model training method and apparatus based on disturbance samples

Info

Publication number
WO2020168843A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample set
samples
initial
disturbance
model
Application number
PCT/CN2020/070290
Other languages
French (fr)
Chinese (zh)
Inventor
林建滨
Original Assignee
阿里巴巴集团控股有限公司
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020168843A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • the embodiments of this specification relate to the field of machine learning, and more specifically, to a method and device for obtaining a disturbance sample set based on an initial sample set, a method and device for obtaining a model training sample set, a method and device for obtaining a model training sample set and a test sample set, and a model training method and device based on disturbance samples.
  • the embodiments of this specification aim to provide a more effective model training method to solve the deficiencies in the prior art.
  • one aspect of this specification provides a method for obtaining a disturbance sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples and each initial sample includes a corresponding feature vector, the method including:
  • calculating the standard deviation of the feature values of each dimension across the feature vectors respectively corresponding to the plurality of initial samples; and
  • for each dimension of each feature vector, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors, thereby obtaining a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
  • the random number is a Gaussian distributed random number
  • the standard deviation of the Gaussian distributed random number is the product of the first parameter and the standard deviation of the feature values of the dimension corresponding to the random number.
  • the random number is a uniformly distributed random number whose value range lies between plus and minus a first value, wherein the first value is the product of the first parameter and the standard deviation of the feature values of the dimension corresponding to the random number.
  • Another aspect of this specification provides a method for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, and the method includes:
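  • obtaining a disturbance sample set by the above method, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples; and
  • obtaining a training sample set by merging the plurality of initial samples with the plurality of disturbance samples.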
  • Another aspect of this specification provides a method for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, and the method includes:
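  • obtaining a disturbance sample set by the above method for obtaining a disturbance sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
  • obtaining a training sample set by merging part of the initial samples in the plurality of initial samples with part of the disturbance samples in the plurality of disturbance samples; and
  • obtaining a test sample set by merging at least part of the remaining initial samples in the plurality of initial samples with at least part of the remaining disturbance samples in the plurality of disturbance samples.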
  • the proportion of the partial initial samples in the plurality of initial samples is the same as the proportion of the partial disturbance samples in the plurality of disturbance samples.
  • obtaining a test sample set includes: merging the remaining initial samples in the plurality of initial samples with the remaining disturbance samples in the plurality of disturbance samples to obtain the test sample set.
  • the partial initial samples respectively correspond to the partial disturbance samples.
  • the embodiment of this specification provides a model training method, including:
  • obtaining an initial sample set, wherein the initial sample set includes a plurality of initial samples;
  • obtaining, by the above method for obtaining a training sample set and a test sample set, a plurality of training sample sets and a plurality of test sample sets respectively corresponding to the plurality of training sample sets based on the initial sample set, wherein the plurality of training sample sets respectively correspond to a plurality of first parameters with different values;
  • using the plurality of training sample sets to respectively train the current model, so as to obtain a plurality of update models;
  • using the plurality of test sample sets to respectively evaluate the corresponding update models, wherein each test sample set and its corresponding update model correspond to the same training sample set; and
  • based on the evaluation results, determining an update model of the current model among the plurality of update models.
  • the model is any of the following types of models: supervised learning models, unsupervised learning models, and reinforcement learning models.
  • Another aspect of this specification provides a device for obtaining a disturbance sample set based on an initial sample set, the initial sample set includes a plurality of initial samples, each initial sample includes a corresponding feature vector, and the device includes:
  • the calculation unit is configured to calculate the standard deviation of the feature values of each dimension across the multiple feature vectors respectively corresponding to the multiple initial samples;
  • the generating unit is configured to generate a corresponding random number for each dimension of each of the multiple feature vectors, and to update the current feature value of the feature vector in that dimension to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors, thereby obtaining a disturbance sample set, wherein the value range of each random number is determined based on the product of the predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
  • Another aspect of this specification provides a device for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, and the device includes:
  • the obtaining unit is configured to obtain a disturbance sample set through the above-mentioned device, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
  • the merging unit is configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
  • Another aspect of this specification provides a device for acquiring a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, and the device includes:
  • the obtaining unit is configured to obtain a disturbance sample set through the foregoing apparatus for obtaining a disturbance sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
  • the first merging unit is configured to obtain a training sample set by merging part of the initial samples in the plurality of initial samples and part of the disturbance samples in the plurality of disturbance samples;
  • the second merging unit is configured to obtain a test sample set by merging at least part of the remaining initial samples in the plurality of initial samples with at least part of the remaining disturbance samples in the plurality of disturbance samples.
  • the second merging unit is further configured to obtain a test sample set by merging the remaining initial samples in the plurality of initial samples with the remaining disturbance samples in the plurality of disturbance samples.
  • Another aspect of this specification provides a model training device, including:
  • the first obtaining unit is configured to obtain an initial sample set, wherein the initial sample set includes a plurality of initial samples
  • the second obtaining unit is configured to obtain, through the above-mentioned apparatus for obtaining training sample sets and test sample sets, a plurality of training sample sets and a plurality of test sample sets respectively corresponding to the plurality of training sample sets based on the initial sample set, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
  • the training unit is configured to use the multiple training sample sets to train the current model respectively to obtain multiple updated models
  • the evaluation unit is configured to use the multiple test sample sets to respectively evaluate corresponding update models, wherein the test sample set and the corresponding update model correspond to the same training sample set;
  • the determining unit is configured to determine the update model of the current model among the multiple update models based on the evaluation result.
  • Another aspect of this specification provides a computer-readable storage medium on which a computer program is stored, wherein when the computer program is executed in a computer, the computer is caused to execute any of the above methods.
  • Another aspect of this specification provides a computing device including a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, any one of the above methods is implemented.
  • the embodiment of this specification perturbs the training data of the model to simulate the data noise of a real environment, thereby increasing the robustness of the model; it trains and evaluates the model using the perturbed data and determines the predetermined parameter of the model based on the evaluation result, thereby quantitatively improving the effectiveness of the model on abnormal data.
  • the parameters of the machine learning model are not limited in the embodiments of this specification, so that the learning ability of the model is not limited.
  • FIG. 1 shows a schematic diagram of a model training system 100 according to an embodiment of the present specification;
  • FIG. 2 shows a method for obtaining a disturbance sample set based on an initial sample set according to an embodiment of the present specification;
  • FIG. 3 schematically shows the calculation of the standard deviation of the feature values of one dimension across multiple feature vectors;
  • FIG. 4 shows n disturbance feature vectors respectively corresponding to the n feature vectors in FIG. 3;
  • FIG. 5 shows a flowchart of a method for obtaining a model training sample set based on an initial sample set according to an embodiment of the present specification;
  • FIG. 6 shows a flowchart of a method for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of the present specification;
  • FIG. 7 shows a flowchart of a model training method according to an embodiment of the present specification;
  • FIG. 8 shows an apparatus 800 for obtaining a disturbance sample set based on an initial sample set according to an embodiment of the present specification;
  • FIG. 9 shows an apparatus 900 for obtaining a model training sample set based on an initial sample set according to an embodiment of the present specification;
  • FIG. 10 shows an apparatus 1000 for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of the present specification;
  • FIG. 11 shows a model training device 1100 according to an embodiment of the present specification.
  • Fig. 1 shows a schematic diagram of a model training system 100 according to an embodiment of the present specification.
  • the system 100 includes a data processing module 11, a training module 12 and an evaluation module 13.
  • the data set B is obtained by applying a disturbance to the value of each dimension of the feature vector of each sample in the data set A.
  • in the training module 12, at least part of the data is obtained from the data set A (for example, 70% of the data in the data set A) and at least part of the data is obtained from the data set B (for example, 70% of the data in the data set B); the two parts are merged to obtain the training data set, and the machine learning model is trained using the training data set.
  • the machine learning model can be any model, for example, the speech recognition model mentioned above.
  • the speech recognition model is, for example, a supervised learning model or a reinforcement learning model. It can be understood that the arbitrary model may also be an unsupervised learning model.
  • the remaining data is obtained from the data set A (for example, 30% of the data in the data set A) and from the data set B (for example, 30% of the data in the data set B), and these are merged to obtain a test data set, which is used to evaluate the trained model.
  • FIG. 2 shows a method for obtaining a disturbance sample set based on an initial sample set according to an embodiment of the present specification, wherein the initial sample set includes a plurality of initial samples and each initial sample includes a corresponding feature vector, the method including:
  • in step S202, calculating the standard deviation of the feature values of each dimension across the feature vectors respectively corresponding to the multiple initial samples;
  • in step S204, for each dimension of each feature vector in the plurality of feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors, thereby obtaining a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
  • in step S202, the standard deviation of the feature values of each dimension across the feature vectors respectively corresponding to the multiple initial samples is calculated.
  • the initial sample set is, for example, the data set A shown in FIG. 1.
  • the data set A includes, for example, n initial samples, each of which includes its own feature vector; the feature vector is, for example, an m-dimensional feature vector, each dimension of which holds one feature value.
  • each sample also includes a corresponding label value.
  • the mean square error here is the standard deviation, that is, the square root of the variance σ², and can be denoted by σ.
  • n is the total number of samples
  • x̄ is the mean value of the n values x_i.
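The defining formula for σ is not reproduced in this text. Assuming the population form is intended (division by n; the text does not say whether n or n - 1 is used), it can be written as:

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```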
  • FIG. 3 schematically shows the calculation of the standard deviation of the feature values of one dimension across a plurality of feature vectors.
  • each feature vector includes the feature values of m features, and the feature value of the i-th dimension feature (feature i in the figure) of the j-th feature vector (vector j in the figure) is expressed as x_ij, where i ∈ [1, m] and j ∈ [1, n]. Therefore, the standard deviation σ_i of the feature values of the i-th dimension across the n feature vectors can be calculated by the following formula (2):
  • where the mean x̄_i is calculated by the following formula (3)
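Formulas (2) and (3) are not reproduced in this text. A reconstruction consistent with the surrounding definitions, assuming the population form (division by n) and assuming that formula (3) defines the per-dimension mean, would be:

```latex
\sigma_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(x_{ij} - \bar{x}_i\right)^2} \tag{2}

\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij} \tag{3}
```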
  • in step S204, for each dimension of each feature vector in the plurality of feature vectors, a corresponding random number is generated, and the current feature value of that dimension of the feature vector is updated to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors, thereby obtaining a disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
  • FIG. 4 shows n disturbance feature vectors corresponding to the n feature vectors in FIG. 3 respectively. Training the model through the disturbance feature vector, that is, adding noise to the training sample, will improve the model's resistance to noise and enhance the stability of the model.
  • each random number a_ij in FIG. 4 can be obtained from a Gaussian random variable A ~ norm(0, ασ_i), that is, the mean of the random variable is 0 and its standard deviation is ασ_i, where σ_i is the standard deviation of the feature values of dimension i (feature i) calculated by formula (2) above, and α is a predetermined parameter whose value can be taken between 0.0001 and 0.1.
  • alternatively, each random number a_ij in FIG. 4 can be obtained from a uniform random variable B, for example one distributed over the range [-ασ_i, ασ_i].
  • in either case, the value range of each random number a_ij is limited by the product ασ_i.
  • the value range of the uniform random variable B is not limited to [-ασ_i, ασ_i]; for example, it can also be [-3ασ_i, 3ασ_i], and so on.
  • each random number a_ij is not limited to being obtained from the above Gaussian or uniform random variable, but can be obtained from any other random variable, such as a Poisson random variable, as long as its value range is limited by ασ_i.
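Steps S202 and S204 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the specification's own implementation: the function names `column_stds` and `perturb`, the `kind` and `seed` arguments, and the use of the population standard deviation are all assumptions made for the example.

```python
import random
import statistics


def column_stds(vectors):
    """Step S202: standard deviation of each dimension across all vectors.

    Assumption: the population form (divide by n) is used.
    """
    return [statistics.pstdev(column) for column in zip(*vectors)]


def perturb(vectors, alpha, kind="gaussian", seed=None):
    """Step S204: return one disturbance vector per input vector.

    Each feature value receives noise whose scale is alpha * sigma_i,
    where sigma_i is the standard deviation of that dimension.
    """
    rng = random.Random(seed)
    stds = column_stds(vectors)
    disturbed = []
    for vector in vectors:
        row = []
        for value, sigma in zip(vector, stds):
            scale = alpha * sigma
            if kind == "gaussian":
                noise = rng.gauss(0.0, scale)       # mean 0, std alpha * sigma_i
            elif kind == "uniform":
                noise = rng.uniform(-scale, scale)  # range [-alpha*sigma_i, alpha*sigma_i]
            else:
                raise ValueError(f"unknown noise kind: {kind}")
            row.append(value + noise)
        disturbed.append(row)
    return disturbed


# Hypothetical initial sample set: n = 4 feature vectors with m = 2 dimensions.
initial = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
uniform_set = perturb(initial, alpha=0.1, kind="uniform", seed=0)
gaussian_set = perturb(initial, alpha=0.1, kind="gaussian", seed=0)
```

Because the noise scale is tied to each dimension's own spread, features measured on very different scales are perturbed by comparable relative amounts.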
  • α is used to balance the added noise against model performance.
  • the value of α is therefore important for improving model performance.
  • the value of α can be determined by the specific environment in which the model is applied. For example, for a speech recognition model applied in a relatively noisy environment, α can be set to a larger value.
  • the value of α can also be determined by evaluating the trained model, that is, the α that yields the better evaluation value is selected as the final α for model training.
  • the α value can be reused in subsequent repeated training.
  • the disturbance sample set can be obtained, so that the training sample set and the test sample set of the model can be obtained based on the initial sample set and the disturbance sample set.
  • Fig. 5 shows a flowchart of a method for acquiring a model training sample set based on an initial sample set according to an embodiment of the present specification, wherein the initial sample set includes a plurality of initial samples, and the method includes:
  • a disturbance sample set is obtained based on the initial sample set by the method shown in FIG. 2, the disturbance sample set includes a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
  • step S504 a training sample set is obtained by combining the multiple initial samples with the multiple disturbance samples.
  • step S502 by using the method shown in FIG. 2, a plurality of disturbance samples respectively corresponding to the plurality of initial samples are generated, wherein the plurality of disturbance samples correspond to the first parameter with the same value.
  • the multiple perturbation samples are, for example, multiple perturbation samples as shown in FIG. 4.
  • a training sample set is obtained by combining the multiple initial samples with the multiple disturbance samples.
  • all the samples in the initial sample set and all the samples in the disturbance sample set can be combined to obtain the training sample set.
  • the training samples of the model are enriched, so that the model can be adapted to different actual environments.
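The merge in step S504 can be sketched as below. The sample layout (a `(feature_vector, label)` pair), the function name `build_training_set`, and the assumption that each disturbance sample keeps the label of its initial sample are illustrative choices rather than details stated in the text.

```python
def build_training_set(initial_samples, disturbance_samples):
    """Merge all initial samples with all disturbance samples (step S504)."""
    if len(initial_samples) != len(disturbance_samples):
        raise ValueError("expected one disturbance sample per initial sample")
    return list(initial_samples) + list(disturbance_samples)


# Hypothetical samples as (feature_vector, label) pairs.
initial = [([1.0, 2.0], 0), ([3.0, 4.0], 1)]
disturbed = [([1.01, 1.98], 0), ([2.97, 4.02], 1)]
training_set = build_training_set(initial, disturbed)
```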
  • Fig. 6 shows a flow chart of a method for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of the present specification, wherein the initial sample set includes a plurality of initial samples, and the method includes:
  • step S602 by using the method shown in FIG. 2, a disturbance sample set is obtained based on the initial sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
  • step S604 a training sample set is obtained by merging part of the initial samples in the plurality of initial samples with part of the disturbance samples in the plurality of disturbance samples;
  • a test sample set is obtained by merging at least part of the remaining initial samples in the plurality of initial samples with at least part of the remaining disturbance samples in the plurality of disturbance samples.
  • part of the initial sample and part of the disturbance sample can be merged together to obtain the training sample set.
  • 70% of the initial samples in the initial sample set and 70% of the disturbance samples in the disturbance sample set may be combined to obtain a training sample set.
  • the remaining samples, for example the remaining 30% of the initial samples in the initial sample set and the remaining 30% of the disturbance samples in the disturbance sample set, are combined to obtain a test sample set corresponding to the training sample set.
  • the 70% initial samples and the 70% disturbance samples in the training sample set may respectively correspond to each other, or may not correspond to each other.
  • the proportion of the partial initial samples in the plurality of initial samples and the proportion of the partial disturbance samples in the plurality of disturbance samples may also be different; for example, where the model's actual application environment is noisy, a larger proportion of disturbance samples may be included in the training sample set.
  • for example, the training sample set may include 80% of all disturbance samples and 20% of all initial samples.
  • the test sample set can also be configured in the same proportion, for example, including the remaining 20% of the disturbance samples (as a share of all disturbance samples) together with initial samples, taken from the remaining initial samples, that amount to 5% of all initial samples.
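A minimal sketch of the split in FIG. 6 follows, for the case where the same proportion (70% here) is taken from both sets and the selected initial samples correspond to the selected disturbance samples. The function name `split_train_test`, its parameters, and the random shuffling of indices are assumptions made for illustration.

```python
import random


def split_train_test(initial, disturbed, train_fraction=0.7, seed=None):
    """Build a training set and a test set from paired sample lists.

    The same indices are used for both lists, so each selected initial
    sample is accompanied by its own disturbance sample.
    """
    if len(initial) != len(disturbed):
        raise ValueError("expected one disturbance sample per initial sample")
    indices = list(range(len(initial)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train_idx, test_idx = indices[:cut], indices[cut:]
    train = [initial[i] for i in train_idx] + [disturbed[i] for i in train_idx]
    test = [initial[i] for i in test_idx] + [disturbed[i] for i in test_idx]
    return train, test


# Hypothetical (feature_vector, label) samples.
initial = [([float(i)], i % 2) for i in range(10)]
disturbed = [([float(i) + 0.01], i % 2) for i in range(10)]
train, test = split_train_test(initial, disturbed, train_fraction=0.7, seed=0)
```

Unequal proportions, such as the 80%/20% example, would need separate index selections for the two lists.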
  • after the training sample set and the test sample set of the model are obtained as shown in FIG. 5 and FIG. 6, the model can be trained with the training sample set and evaluated with the test sample set.
  • the following describes the method of selecting the predetermined parameter ⁇ of the model based on the evaluation of the model by the test sample set, so as to further optimize the model.
  • Fig. 7 shows a flow chart of a model training method according to an embodiment of the specification, including:
  • step S702 an initial sample set is obtained, wherein the initial sample set includes a plurality of initial samples
  • step S704 multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets are obtained based on the initial sample set by the method shown in FIG. 6, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
  • step S706 use the multiple training sample sets to train the current model respectively to obtain multiple updated models
  • step S708 use the multiple test sample sets to evaluate corresponding update models respectively, wherein the test sample set and the corresponding update model correspond to the same training sample set;
  • step S710 based on the evaluation result, an update model of the current model is determined among the multiple update models.
  • an initial sample set is obtained, wherein the initial sample set includes a plurality of initial samples.
  • the model is not limited to a specific type. As described above, it can be any type of a supervised learning model, an unsupervised learning model, and a reinforcement learning model.
  • the model is a speech recognition model as described above, which is, for example, a supervised learning model.
  • the corresponding feature vector can be extracted from manually input speech, so that the feature vector and the label value (semantics) are used as an initial sample of the model.
  • the model may encounter different environments in practical applications, such as a quiet environment, a variety of noisy environments with different noises, and so on.
  • the initial samples obtained manually in a single environment cannot simulate so many different environments, and the cost of obtaining samples manually in different environments is relatively high. Therefore, the method can be used to expand the sample based on the initial sample set to obtain the training sample set.
  • in step S704, multiple training sample sets and multiple test sample sets respectively corresponding to the multiple training sample sets are obtained based on the initial sample set by the method shown in FIG. 6, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values.
  • α can be set to 0.0001, 0.001, 0.01, and 0.1, respectively, so that the influence of the magnitude of α on model training can be determined.
  • the values of α are not limited to the foregoing values or the foregoing number, and may be set according to the specific model. Specifically, for the above four α values, four disturbance sample sets B_1, B_2, B_3, and B_4 can be obtained from the above initial sample set A by the method shown in FIG. 2.
  • based on the initial sample set A and each of these disturbance sample sets, four groups of sample sets (C_1, D_1), (C_2, D_2), (C_3, D_3), and (C_4, D_4) are obtained, where C_i represents a training sample set and D_i represents the corresponding test sample set.
  • step S706 the multiple training sample sets are used to train the current model respectively to obtain multiple updated models.
  • the training sample sets C_1, C_2, C_3, and C_4 are each used to train the current model, yielding four update models M_1, M_2, M_3, and M_4 respectively.
  • step S708 the multiple test sample sets are used to respectively evaluate the corresponding update models, where the test sample set and the corresponding update model correspond to the same training sample set.
  • the test sample sets D_1, D_2, D_3, and D_4 are used to evaluate the four update models M_1, M_2, M_3, and M_4 respectively. The test sample set D_1 and the update model M_1 both correspond to the training sample set C_1, that is, D_1 corresponds to M_1. Similarly, the test sample set D_2 corresponds to the update model M_2, the test sample set D_3 corresponds to the update model M_3, and the test sample set D_4 corresponds to the update model M_4.
  • the test samples can be used to calculate various evaluation indicators of the corresponding update model, such as accuracy, precision, and recall, so as to evaluate the corresponding update model.
  • the above evaluation indicators can be combined to obtain the evaluation value of the model.
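The text does not say how the indicators are combined. One possible sketch, assuming binary labels (1 = positive) and a weighted average with equal weights by default, is shown below; the function name `evaluation_value` and the weighting scheme are assumptions, not the specification's method.

```python
def evaluation_value(y_true, y_pred, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine accuracy, precision and recall into a single score.

    Assumption: binary labels and a weighted average of the three metrics.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    w_acc, w_prec, w_rec = weights
    return w_acc * accuracy + w_prec * precision + w_rec * recall


# Hypothetical labels and predictions for one update model.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
score = evaluation_value(y_true, y_pred)
```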
  • step S710 based on the evaluation result, an update model of the current model is determined among the multiple update models.
  • the update model with the highest evaluation value may be determined as the update model of the current model, that is, the trained model, and the determined update model is retained for subsequent use, such as model prediction.
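Steps S704 to S710 can be sketched end to end as a search over candidate α values. Everything named here is hypothetical: `select_model`, the callables `train_fn` and `eval_fn`, the toy threshold classifier, and the use of Gaussian noise with a 70/30 split. The sketch also trains each candidate from scratch, whereas the text speaks of training the current model.

```python
import random
import statistics


def perturb(samples, alpha, rng):
    """Add Gaussian noise with standard deviation alpha * sigma_i to each feature."""
    stds = [statistics.pstdev(col) for col in zip(*(x for x, _ in samples))]
    return [([v + rng.gauss(0.0, alpha * s) for v, s in zip(x, stds)], y)
            for x, y in samples]


def select_model(initial, alphas, train_fn, eval_fn, train_fraction=0.7, seed=0):
    """Train one candidate per alpha and keep the best-scoring one."""
    rng = random.Random(seed)
    best = None
    for alpha in alphas:
        disturbed = perturb(initial, alpha, rng)             # disturbance set B_k
        order = list(range(len(initial)))
        rng.shuffle(order)
        cut = int(round(train_fraction * len(order)))
        train = [s for i in order[:cut] for s in (initial[i], disturbed[i])]  # C_k
        test = [s for i in order[cut:] for s in (initial[i], disturbed[i])]   # D_k
        model = train_fn(train)       # update model M_k trained on C_k
        score = eval_fn(model, test)  # M_k is evaluated on its own test set D_k
        if best is None or score > best[2]:
            best = (alpha, model, score)
    return best


def train_threshold(samples):
    """Toy one-feature classifier: midpoint between the two class means."""
    zeros = [x[0] for x, y in samples if y == 0]
    ones = [x[0] for x, y in samples if y == 1]
    return (statistics.mean(zeros) + statistics.mean(ones)) / 2.0


def accuracy(threshold, samples):
    """Fraction of samples whose side of the threshold matches the label."""
    hits = sum(1 for x, y in samples if (x[0] > threshold) == (y == 1))
    return hits / len(samples)


candidate_alphas = [0.0001, 0.001, 0.01, 0.1]
initial = ([([float(i)], 0) for i in range(10)]
           + [([float(i) + 20.0], 1) for i in range(10)])
alpha, model, score = select_model(initial, candidate_alphas, train_threshold, accuracy)
```

With the strict `>` comparison, ties are resolved in favour of the earliest α in the list.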
  • FIG. 8 shows a device 800 for obtaining a disturbance sample set based on an initial sample set according to an embodiment of the present specification, wherein the initial sample set includes a plurality of initial samples and each initial sample includes a corresponding feature vector, the device including:
  • the calculation unit 81 is configured to calculate the standard deviation of the feature values of each dimension across the feature vectors respectively corresponding to the multiple initial samples;
  • the generating unit 82 is configured to generate a corresponding random number for each dimension of each feature vector in the plurality of feature vectors, and to update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors, thereby obtaining a disturbance sample set, wherein the value range of each random number is determined based on the product of the predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
  • Fig. 9 shows a device 900 for acquiring a model training sample set based on an initial sample set according to an embodiment of the present specification, wherein the initial sample set includes a plurality of initial samples, and the device includes:
  • the obtaining unit 91 is configured to obtain a disturbance sample set based on the initial sample set through the foregoing device, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
  • the merging unit 92 is configured to obtain a training sample set by merging the multiple initial samples with the multiple disturbance samples.
  • Fig. 10 shows an apparatus 1000 for acquiring a model training sample set and a test sample set based on an initial sample set according to an embodiment of the present specification, wherein the initial sample set includes a plurality of initial samples, and the device includes:
  • the obtaining unit 101 is configured to obtain a disturbance sample set based on the initial sample set through the aforementioned apparatus for obtaining a disturbance sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
  • the first merging unit 102 is configured to obtain a training sample set by merging part of the initial samples in the plurality of initial samples and part of the disturbance samples in the plurality of disturbance samples;
  • the second merging unit 103 is configured to obtain a test sample set by merging at least part of the remaining initial samples in the plurality of initial samples with at least part of the remaining disturbance samples in the plurality of disturbance samples.
  • the second merging unit is further configured to obtain a test sample set by merging the remaining initial samples in the plurality of initial samples with the remaining disturbance samples in the plurality of disturbance samples.
  • FIG. 11 shows a model training device 1100 according to an embodiment of this specification, including:
  • the first obtaining unit 111 is configured to obtain an initial sample set, wherein the initial sample set includes a plurality of initial samples;
  • the second obtaining unit 112 is configured to obtain, through the foregoing apparatus for obtaining training sample sets and test sample sets, a plurality of training sample sets and a plurality of test sample sets respectively corresponding to the plurality of training sample sets based on the initial sample set, wherein the multiple training sample sets respectively correspond to multiple first parameters with different values;
  • the training unit 113 is configured to separately train the current model using the multiple training sample sets to obtain multiple updated models respectively;
  • the evaluation unit 114 is configured to use the plurality of test sample sets to respectively evaluate the corresponding updated models, where a test sample set and its corresponding updated model correspond to the same training sample set;
  • the determining unit 115 is configured to determine, based on the evaluation result, the updated model of the current model from among the multiple updated models.
  • Another aspect of this specification provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to execute any of the above methods.
  • Another aspect of this specification provides a computing device including a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, any one of the above methods is implemented.
  • the embodiments of this specification perturb the training data of the model to simulate data noise in the real environment, thereby increasing the robustness of the model; by training and evaluating the model with the disturbed data and determining the predetermined parameter of the model based on the evaluation result, they quantitatively improve the effectiveness of the model on abnormal data.
  • in addition, the embodiments of this specification do not restrict the parameters of the machine learning model, so the learning capacity of the model is not limited.
  • the steps of the method or algorithm described in the embodiments disclosed in this document can be implemented by hardware, a software module executed by a processor, or a combination of the two.
  • the software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium known in the technical field.

Abstract

Provided are a method and apparatus for acquiring a disturbance sample set and training a model on the basis of the disturbance sample set. The method for acquiring a disturbance sample set comprises: calculating a mean square error of feature values of each dimension in a plurality of feature vectors respectively corresponding to a plurality of initial samples (S202); and for each dimension in each of the plurality of feature vectors, generating a corresponding random number, and updating the current feature value of the dimension of the feature vector to be the sum of the current feature value and the corresponding random number so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors, and thereby acquiring a disturbance sample set (S204), wherein a value range of each random number is determined based on the product of a predetermined first parameter and a mean square error of feature values of a dimension corresponding to the random number.

Description

Model training method and apparatus based on disturbance samples
Technical field
The embodiments of this specification relate to the field of machine learning and, more specifically, to a method and apparatus for obtaining a disturbance sample set based on an initial sample set, a method and apparatus for obtaining a model training sample set, a method and apparatus for obtaining a model training sample set and a test sample set, and a model training method and apparatus based on disturbance samples.
Background
Machine learning models face a variety of challenges when they are deployed in real environments, and one of the most important is model stability. Take a speech recognition model as an example. The data used to train a machine learning model has usually been carefully processed and denoised, whereas the conditions the model meets in practice are far more complex: a noisy environment, microphone echo, and similar factors make the data to be processed noisy and inconsistent with the training data, so that the accuracy of the model changes considerably. Improving the robustness of machine learning models is therefore of great significance for their practical application. At present, machine learning algorithms commonly use L1 regularization and L2 regularization to enhance robustness; both methods achieve robustness by restricting the search space of the model parameters (the absolute values of the parameters).
Therefore, a more effective model training method for enhancing model robustness is needed.
Summary of the invention
The embodiments of this specification aim to provide a more effective model training method that addresses the deficiencies of the prior art.
To achieve the above objective, one aspect of this specification provides a method for obtaining a disturbance sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples and each initial sample includes a corresponding feature vector, the method including:
calculating, for each dimension of the plurality of feature vectors respectively corresponding to the plurality of initial samples, the standard deviation (referred to in this specification as the mean square error) of the feature values of that dimension; and
for each dimension of each feature vector in the plurality of feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
In one embodiment, the random number is a Gaussian-distributed random number whose standard deviation is the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
In one embodiment, the random number is a uniformly distributed random number whose value lies between plus and minus a first value, the first value being the product of the first parameter and the feature-value standard deviation of the dimension corresponding to the random number.
Another aspect of this specification provides a method for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, the method including:
obtaining a disturbance sample set by the above method, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples; and
obtaining a training sample set by merging the plurality of initial samples with the plurality of disturbance samples.
Another aspect of this specification provides a method for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, the method including:
obtaining a disturbance sample set by the above method for obtaining a disturbance sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
obtaining a training sample set by merging part of the initial samples in the plurality of initial samples with part of the disturbance samples in the plurality of disturbance samples; and
obtaining a test sample set by merging at least part of the remaining initial samples in the plurality of initial samples with at least part of the remaining disturbance samples in the plurality of disturbance samples.
In one embodiment, the proportion of the part of the initial samples in the plurality of initial samples is the same as the proportion of the part of the disturbance samples in the plurality of disturbance samples.
In one embodiment, obtaining the test sample set by merging at least part of the remaining initial samples with at least part of the remaining disturbance samples includes obtaining the test sample set by merging the remaining initial samples in the plurality of initial samples with the remaining disturbance samples in the plurality of disturbance samples.
In one embodiment, the part of the initial samples and the part of the disturbance samples correspond to each other one to one.
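The splitting embodiments above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function name split_train_test, the random shuffle, the seed argument, and the default training fraction of 0.7 are hypothetical choices, and initial[k] and disturbed[k] are assumed to be the corresponding initial and disturbance samples. The sketch uses the same indices for both sets, so the proportions are equal and the selected samples correspond one to one, and it places all remaining samples in the test set.

```python
import random

def split_train_test(initial, disturbed, train_fraction=0.7, seed=None):
    """Split paired initial/disturbance samples into training and test sets.

    Assumption: initial[k] and disturbed[k] are corresponding samples.
    """
    if len(initial) != len(disturbed):
        raise ValueError("each initial sample needs one disturbance sample")
    indices = list(range(len(initial)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train_idx, test_idx = indices[:cut], indices[cut:]
    # Training set: selected initial samples plus their disturbance samples.
    train = [initial[k] for k in train_idx] + [disturbed[k] for k in train_idx]
    # Test set: all remaining initial samples plus their disturbance samples.
    test = [initial[k] for k in test_idx] + [disturbed[k] for k in test_idx]
    return train, test

# Hypothetical placeholder samples; real samples would carry feature vectors.
initial = [("x", k) for k in range(10)]
disturbed = [("x'", k) for k in range(10)]
train, test = split_train_test(initial, disturbed, train_fraction=0.7, seed=1)
```

Selecting by shared index is one simple way to keep an initial sample and its disturbance sample on the same side of the split, so that no disturbed copy of a training sample ends up in the test set.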
The embodiments of this specification provide a model training method, including:
obtaining an initial sample set, wherein the initial sample set includes a plurality of initial samples;
obtaining, based on the initial sample set and by the above method for obtaining a training sample set and a test sample set, a plurality of training sample sets and a plurality of test sample sets respectively corresponding to the plurality of training sample sets, wherein the plurality of training sample sets respectively correspond to a plurality of first parameters with different values;
training the current model with each of the plurality of training sample sets to obtain a plurality of updated models respectively;
evaluating the corresponding updated models with the plurality of test sample sets respectively, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
determining, based on the evaluation results, the updated model of the current model from among the plurality of updated models.
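The selection procedure above can be sketched in Python as follows. This is a hedged illustration, not the implementation of this specification: the nearest-centroid classifier is a toy stand-in for "the current model", accuracy is an assumed evaluation metric, and the names disturb, train_centroids, predict, accuracy, and select_lambda, the shared 70/30 split, and the sample data are all hypothetical. For each candidate value of the first parameter (called lam in the code), the sketch builds a training set and a test set, trains one model, evaluates it on its own test set, and keeps the best-scoring value.

```python
import random
import statistics

def disturb(features, lam, rng):
    """Gaussian disturbance with per-dimension scale lam * sigma_i."""
    m = len(features[0])
    sigmas = [statistics.pstdev([f[i] for f in features]) for i in range(m)]
    return [[x + rng.gauss(0.0, lam * sigmas[i]) for i, x in enumerate(f)]
            for f in features]

def train_centroids(samples):
    """Toy stand-in model: one mean vector (centroid) per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: [statistics.fmean(col) for col in zip(*rows)]
            for label, rows in by_label.items()}

def predict(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(model, key=lambda label: sum(
        (c - x) ** 2 for c, x in zip(model[label], features)))

def accuracy(model, samples):
    return sum(predict(model, f) == y for f, y in samples) / len(samples)

def select_lambda(features, labels, lambdas, train_fraction=0.7, seed=0):
    """Train and evaluate one model per candidate lam; return the best lam."""
    rng = random.Random(seed)
    order = list(range(len(features)))
    rng.shuffle(order)
    cut = int(round(train_fraction * len(order)))
    results = {}
    for lam in lambdas:
        noisy = disturb(features, lam, rng)

        def pick(idx):
            # Initial samples plus their disturbance samples, labels unchanged.
            return ([(features[k], labels[k]) for k in idx]
                    + [(noisy[k], labels[k]) for k in idx])

        model = train_centroids(pick(order[:cut]))
        results[lam] = accuracy(model, pick(order[cut:]))
    best = max(results, key=results.get)
    return best, results

# Made-up two-class data with two features per sample.
features = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.1, 0.1], [0.0, 0.2],
            [5.0, 5.0], [5.1, 5.2], [5.2, 5.1], [5.1, 5.1], [5.0, 5.2]]
labels = [0] * 5 + [1] * 5
best_lam, scores = select_lambda(features, labels, [0.0001, 0.01, 0.1])
```

When several candidates reach the same score, this sketch simply keeps the first one in the candidate list; a real system would need its own tie-breaking rule and evaluation metric.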
In one embodiment, the model is any one of the following types of model: a supervised learning model, an unsupervised learning model, or a reinforcement learning model.
Another aspect of this specification provides an apparatus for obtaining a disturbance sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples and each initial sample includes a corresponding feature vector, the apparatus including:
a calculation unit configured to calculate, for each dimension of the plurality of feature vectors respectively corresponding to the plurality of initial samples, the standard deviation of the feature values of that dimension; and
a generating unit configured to, for each dimension of each feature vector in the plurality of feature vectors, generate a corresponding random number and update the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
Another aspect of this specification provides an apparatus for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, the apparatus including:
an obtaining unit configured to obtain a disturbance sample set through the above apparatus, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples; and
a merging unit configured to obtain a training sample set by merging the plurality of initial samples with the plurality of disturbance samples.
Another aspect of this specification provides an apparatus for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, the apparatus including:
an obtaining unit configured to obtain a disturbance sample set through the above apparatus for obtaining a disturbance sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
a first merging unit configured to obtain a training sample set by merging part of the initial samples in the plurality of initial samples with part of the disturbance samples in the plurality of disturbance samples; and
a second merging unit configured to obtain a test sample set by merging at least part of the remaining initial samples in the plurality of initial samples with at least part of the remaining disturbance samples in the plurality of disturbance samples.
In one embodiment, the second merging unit is further configured to obtain the test sample set by merging the remaining initial samples in the plurality of initial samples with the remaining disturbance samples in the plurality of disturbance samples.
Another aspect of this specification provides a model training apparatus, including:
a first obtaining unit configured to obtain an initial sample set, wherein the initial sample set includes a plurality of initial samples;
a second obtaining unit configured to obtain, based on the initial sample set and through the above apparatus for obtaining a training sample set and a test sample set, a plurality of training sample sets and a plurality of test sample sets respectively corresponding to the plurality of training sample sets, wherein the plurality of training sample sets respectively correspond to a plurality of first parameters with different values;
a training unit configured to train the current model with each of the plurality of training sample sets to obtain a plurality of updated models respectively;
an evaluation unit configured to evaluate the corresponding updated models with the plurality of test sample sets respectively, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
a determining unit configured to determine, based on the evaluation results, the updated model of the current model from among the plurality of updated models.
Another aspect of this specification provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to execute any of the above methods.
Another aspect of this specification provides a computing device including a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, any one of the above methods is implemented.
By perturbing the training data of the model, the embodiments of this specification simulate data noise in the real environment and thereby increase the robustness of the model. By training and evaluating the model with the disturbed data and determining the predetermined parameter of the model based on the evaluation result, they quantitatively improve the effectiveness of the model on abnormal data. In addition, the embodiments do not restrict the parameters of the machine learning model and therefore do not limit the learning capacity of the model.
Description of the drawings
The embodiments of this specification can be made clearer by describing them with reference to the accompanying drawings:
Fig. 1 shows a schematic diagram of a model training system 100 according to an embodiment of this specification;
Fig. 2 shows a method for obtaining a disturbance sample set based on an initial sample set according to an embodiment of this specification;
Fig. 3 schematically shows the calculation of the standard deviation of the feature values of one dimension across multiple feature vectors;
Fig. 4 shows n disturbance feature vectors respectively corresponding to the n feature vectors in Fig. 3;
Fig. 5 shows a flowchart of a method for obtaining a model training sample set based on an initial sample set according to an embodiment of this specification;
Fig. 6 shows a flowchart of a method for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of this specification;
Fig. 7 shows a flowchart of a model training method according to an embodiment of this specification;
Fig. 8 shows an apparatus 800 for obtaining a disturbance sample set based on an initial sample set according to an embodiment of this specification;
Fig. 9 shows an apparatus 900 for obtaining a model training sample set based on an initial sample set according to an embodiment of this specification;
Fig. 10 shows an apparatus 1000 for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of this specification;
Fig. 11 shows a model training apparatus 1100 according to an embodiment of this specification.
Detailed description
The embodiments of this specification are described below with reference to the drawings.
Fig. 1 shows a schematic diagram of a model training system 100 according to an embodiment of this specification. As shown in Fig. 1, the system 100 includes a data processing module 11, a training module 12, and an evaluation module 13. In the data processing module 11, data set B is obtained by applying a disturbance to the value of every dimension of the feature vector of every sample in data set A. In the training module 12, at least part of the data is taken from data set A (for example, 70% of the data in data set A) and at least part of the data is taken from data set B (for example, 70% of the data in data set B); the two parts are merged into a training data set, which is used to train a machine learning model. The machine learning model may be any model, for example the speech recognition model mentioned above. The speech recognition model is, for example, a supervised learning model or a reinforcement learning model; it will be understood that the model may also be an unsupervised learning model. In the evaluation module 13, the remaining data is taken from data set A (for example, the remaining 30% of data set A) and from data set B (for example, the remaining 30% of data set B); the two parts are merged into a test data set, which is used to evaluate the trained model.
Each of the above processing steps is described in detail below.
Fig. 2 shows a method for obtaining a disturbance sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes a plurality of initial samples and each initial sample includes a corresponding feature vector, the method including:
in step S202, calculating, for each dimension of the plurality of feature vectors respectively corresponding to the plurality of initial samples, the standard deviation of the feature values of that dimension; and
in step S204, for each dimension of each feature vector in the plurality of feature vectors, generating a corresponding random number and updating the current feature value of that dimension of the feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
First, in step S202, the standard deviation of the feature values of each dimension is calculated across the plurality of feature vectors respectively corresponding to the plurality of initial samples.
The initial sample set is, for example, data set A shown in Fig. 1. Data set A includes, for example, n initial samples, each of which includes its own feature vector. The feature vector is, for example, an m-dimensional feature vector in which the value of each dimension corresponds to one feature value. In addition, where the model to be trained is, for example, a supervised model, each sample also includes a corresponding label value.
The mean square error referred to here is the standard deviation $\sigma$, that is, the square root of the variance $\sigma^2$. For a set of samples, the variance is usually calculated with the following formula (1):
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 \qquad (1)$$
where $n$ is the total number of samples and $\mu$ is the mean of the $n$ values $x_i$.
Fig. 3 schematically shows the calculation of the standard deviation of the feature values of one dimension across multiple feature vectors. In Fig. 3, assume there are n feature vectors, each containing the feature values of m features. The feature value of the i-th feature (feature i in the figure) of the j-th feature vector (vector j in the figure) can be denoted $x_{ij}$, where $i \in [1, m]$ and $j \in [1, n]$. The standard deviation $\sigma_i$ of the i-th feature values of the n feature vectors can then be calculated with the following formula (2):
$$\sigma_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}(x_{ij} - \mu_i)^2} \qquad (2)$$
where $\mu_i$ is calculated with the following formula (3):
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij} \qquad (3)$$
Specifically, for example, for the feature values $x_{21}, x_{22}, \ldots, x_{2n}$ of the second feature dimension (feature 2) of the vectors, shown in the dashed box in Fig. 3, the mean $\mu_2$ of these n values can be calculated with formula (3), and their standard deviation $\sigma_2$ can then be calculated with formula (2).
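As an illustration of the per-dimension calculation in step S202, the following Python sketch computes the mean and the standard deviation of each feature dimension over a small set of feature vectors. It assumes the population form (division by n); the function name per_dimension_mean_and_std and the sample values are hypothetical and do not come from this specification. Note that the code indexes vectors[j][i], whereas the text places the dimension index first.

```python
import math

def per_dimension_mean_and_std(vectors):
    """Return (means, stds) over n feature vectors of equal length m.

    Assumption: population form, i.e. sums are divided by n.
    """
    n = len(vectors)
    m = len(vectors[0])
    # Mean of feature i over all n vectors.
    means = [sum(v[i] for v in vectors) / n for i in range(m)]
    # Standard deviation of feature i over all n vectors.
    stds = [
        math.sqrt(sum((v[i] - means[i]) ** 2 for v in vectors) / n)
        for i in range(m)
    ]
    return means, stds

# Three made-up feature vectors (n = 3) with two features each (m = 2).
vectors = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
means, stds = per_dimension_mean_and_std(vectors)
```

In this example the second feature is constant, so its standard deviation is zero and any disturbance scaled by it would leave that feature unchanged.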
In step S204, for each dimension of each feature vector in the plurality of feature vectors, a corresponding random number is generated and the current feature value of that dimension of the feature vector is updated to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples respectively corresponding to the plurality of feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the standard deviation of the feature values of the dimension corresponding to that random number.
Still referring to Fig. 3, take vector 1 in the figure as an example. For each dimension 1, 2, …, m of this feature vector, a corresponding random number $a_{11}, a_{21}, \ldots, a_{m1}$ is generated, and the current feature value $x_{i1}$ of each dimension is updated to $x_{i1} + a_{i1}$, where $i \in [1, m]$, which yields the disturbance feature vector corresponding to feature vector 1 (shown in the dashed box in Fig. 4). A disturbance can be applied to every feature vector in the same way to obtain its corresponding disturbance feature vector. Fig. 4 shows the n disturbance feature vectors respectively corresponding to the n feature vectors in Fig. 3. Training the model with these disturbance feature vectors amounts to adding noise to the training samples, which improves the model's resistance to noise and enhances its stability.
In one embodiment, each random number $a_{ij}$ in Fig. 4, where $i \in [1, m]$ and $j \in [1, n]$, can be drawn from a Gaussian random variable $A = \mathrm{norm}(0, \lambda\sigma_i)$, that is, a random variable with mean 0 and standard deviation $\lambda\sigma_i$. Here $\sigma_i$ is the standard deviation of the feature values of dimension i (feature i) calculated with formula (2), and $\lambda$ is a predetermined parameter that may take a value between 0.0001 and 0.1. For example, referring to Figs. 3 and 4, each $a_{2j}$ corresponding to the feature 2 dimension in Fig. 4 is generated by $A = \mathrm{norm}(0, \lambda\sigma_2)$, where $\sigma_2$ is the $\sigma_2$ shown in Fig. 3. As is well known to those skilled in the art, according to the probability density of a Gaussian variable, more than 99% (about 99.7%) of the values of $a_{ij}$ fall within the interval $[-3\lambda\sigma_i, 3\lambda\sigma_i]$. In other words, the product $\lambda\sigma_i$ of the predetermined parameter $\lambda$ and the feature-value standard deviation $\sigma_i$ of dimension i bounds the value range of the random numbers $a_{ij}$ of dimension i, where j ranges from 1 to n.
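A minimal Python sketch of this Gaussian embodiment is given below, under stated assumptions: the standard deviation of each dimension is computed in population form with statistics.pstdev, the noise is drawn with random.gauss from the standard library, and the function name gaussian_disturbance, the seed argument, and the sample values are hypothetical.

```python
import random
import statistics

def gaussian_disturbance(vectors, lam, seed=None):
    """Return disturbed copies of the feature vectors.

    Feature i of every vector receives noise drawn from a Gaussian with
    mean 0 and standard deviation lam * sigma_i, where sigma_i is the
    (population) standard deviation of feature i over all vectors.
    """
    rng = random.Random(seed)
    m = len(vectors[0])
    sigmas = [statistics.pstdev([v[i] for v in vectors]) for i in range(m)]
    return [
        [x + rng.gauss(0.0, lam * sigmas[i]) for i, x in enumerate(v)]
        for v in vectors
    ]

# Made-up initial feature vectors; lam plays the role of the first parameter.
initial_vectors = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
disturbed_vectors = gaussian_disturbance(initial_vectors, lam=0.01, seed=0)
```

The input list is left unchanged, so the initial sample set and the disturbance sample set can later be merged. A dimension whose values are all identical has a standard deviation of zero and is therefore not disturbed in this sketch.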
In one embodiment, each random number $a_{ij}$ in Fig. 4 can be drawn from a uniform random variable B, for example a random variable uniformly distributed over the range $[-\lambda\sigma_i, \lambda\sigma_i]$. That is, the product $\lambda\sigma_i$ limits the value range of each random number $a_{ij}$. The range of the uniform random variable B is not limited to $[-\lambda\sigma_i, \lambda\sigma_i]$; it may also be, for example, $[-3\lambda\sigma_i, 3\lambda\sigma_i]$. Moreover, the random numbers $a_{ij}$ are not limited to being drawn from the Gaussian or uniform random variables described above and may be drawn from any other random variable, such as a Poisson random variable, as long as its value range is limited by $\lambda\sigma_i$.
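The uniform embodiment can be sketched in the same way. The following Python code is an illustration under stated assumptions: offsets are drawn with random.uniform from the standard library over the range from minus lam * sigma_i to plus lam * sigma_i, and the function name uniform_disturbance, the seed argument, the extra return values, and the sample values are hypothetical.

```python
import random
import statistics

def uniform_disturbance(vectors, lam, seed=None):
    """Disturb each feature with an offset uniform in [-lam*sigma_i, lam*sigma_i].

    Returns the disturbed vectors together with the offsets and the
    per-dimension bounds, so the bound on each offset can be inspected.
    """
    rng = random.Random(seed)
    m = len(vectors[0])
    sigmas = [statistics.pstdev([v[i] for v in vectors]) for i in range(m)]
    bounds = [lam * s for s in sigmas]
    # One offset per dimension for every vector.
    offsets = [[rng.uniform(-b, b) for b in bounds] for _ in vectors]
    disturbed = [
        [x + a for x, a in zip(v, row)] for v, row in zip(vectors, offsets)
    ]
    return disturbed, offsets, bounds

# Made-up initial feature vectors.
initial_vectors = [[1.0, 5.0], [2.0, 7.0], [3.0, 9.0]]
disturbed_vectors, offsets, bounds = uniform_disturbance(
    initial_vectors, lam=0.05, seed=3
)
```

Unlike the Gaussian case, every offset here lies strictly within a hard bound, which may be preferable when no feature value should move by more than a known amount.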
Here $\lambda$ balances the added noise against model performance. The smaller the value of $\lambda$, the smaller the value range of $a_{ij}$, that is, the smaller the disturbance applied. If the disturbance is too small, its effect on the feature vectors is too small to improve model performance; if the disturbance is too large, it degrades the prediction accuracy of the model. The choice of $\lambda$ is therefore important for improving model performance. In one embodiment, the value of $\lambda$ can be determined from the specific environment in which the model is applied. For example, for a speech recognition model, $\lambda$ can be set to a larger value when the application environment is noisy and to a smaller value when the application environment is quiet. In one embodiment, as described in detail below, the value of $\lambda$ is determined by evaluating the trained models, that is, the $\lambda$ with the better evaluation value is selected as the $\lambda$ for the final model training. In one embodiment, once $\lambda$ has been determined during training with an earlier batch of training sample sets, the same $\lambda$ value can be reused in subsequent repeated training.
在通过图2所示方法获取初始样本集的各个初始样本分别对应的扰动样本之后,可获取扰动样本集,从而可基于初始样本集和扰动样本集获取模型的训练样本集和测试样本集。After obtaining the disturbance samples corresponding to each initial sample of the initial sample set by the method shown in FIG. 2, the disturbance sample set can be obtained, so that the training sample set and the test sample set of the model can be obtained based on the initial sample set and the disturbance sample set.
FIG. 5 shows a flowchart of a method for obtaining a model training sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes a plurality of initial samples, and the method includes:
In step S502, obtaining a disturbance sample set based on the initial sample set by the method shown in FIG. 2, the disturbance sample set including a plurality of disturbance samples corresponding respectively to the plurality of initial samples; and
In step S504, obtaining a training sample set by merging the plurality of initial samples with the plurality of disturbance samples.
First, in step S502, a plurality of disturbance samples corresponding respectively to the plurality of initial samples are generated by the method shown in FIG. 2, wherein the plurality of disturbance samples correspond to the same value of the first parameter. The disturbance samples are, for example, those shown in FIG. 4. As described above, each random number a_ij in these disturbance samples may be drawn from the Gaussian random variable A = norm(0, λσ_i); that is, the whole batch of disturbance samples corresponds to the same value of λ.
In step S504, a training sample set is obtained by merging the plurality of initial samples with the plurality of disturbance samples. Where the trained model does not need to be evaluated, all samples in the initial sample set may be merged with all samples in the disturbance sample set to obtain the training sample set. Including both the initial sample set and the disturbance sample set in the training sample set enriches the model's training samples, so that the model can adapt to different real-world environments.
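The merge in step S504 can be illustrated with the short sketch below. The function name `build_training_set` and the toy arrays are hypothetical. The sketch also assumes a supervised setting in which each disturbance sample keeps the label of the initial sample it was derived from, which the specification implies but does not state in this passage.

```python
import numpy as np

def build_training_set(X_init, y_init, X_dist):
    """Merge all initial samples with all disturbance samples (step S504).

    Assumption: the k-th disturbance sample keeps the label of the
    k-th initial sample.
    """
    X_init = np.asarray(X_init, dtype=float)
    X_dist = np.asarray(X_dist, dtype=float)
    y_init = np.asarray(y_init)
    if X_init.shape != X_dist.shape:
        raise ValueError("expected one disturbance sample per initial sample")
    X_train = np.concatenate([X_init, X_dist], axis=0)
    y_train = np.concatenate([y_init, y_init], axis=0)
    return X_train, y_train

# Toy usage: 3 initial samples with 2 features each.
X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
y = np.array([0, 1, 0])
X_disturbed = X + 0.01   # placeholder for the output of the disturbance step
X_train, y_train = build_training_set(X, y, X_disturbed)
```

The merged set is twice the size of the initial set: the first half holds the initial samples and the second half their disturbed counterparts.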
FIG. 6 shows a flowchart of a method for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes a plurality of initial samples, and the method includes:
In step S602, obtaining a disturbance sample set based on the initial sample set by the method shown in FIG. 2, the disturbance sample set including a plurality of disturbance samples corresponding respectively to the plurality of initial samples;
In step S604, obtaining a training sample set by merging some of the plurality of initial samples with some of the plurality of disturbance samples; and
In step S606, obtaining a test sample set by merging at least some of the remaining initial samples with at least some of the remaining disturbance samples.
In this method, after the disturbance sample set is obtained, a training sample set can be obtained by merging some of the initial samples with some of the disturbance samples. In one embodiment, for example, 70% of the initial samples in the initial sample set may be merged with 70% of the disturbance samples in the disturbance sample set to obtain the training sample set. The remaining samples, for example the remaining 30% of the initial samples and 30% of the disturbance samples, are then merged to obtain the test sample set corresponding to that training sample set. The 70% of initial samples and the 70% of disturbance samples in the training sample set may correspond to each other one to one, or they may not.
In one embodiment, the proportion of initial samples selected may differ from the proportion of disturbance samples selected. For example, when the model's actual application environment is noisy, the training sample set may include a larger share of disturbance samples, such as 80% of all disturbance samples and 20% of all initial samples. Correspondingly, the test sample set may be configured in the same ratio, for example comprising the remaining 20% of all disturbance samples and, drawn from the remaining initial samples, 5% of all initial samples.
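A hedged sketch of the split in steps S604 and S606 follows. The function `split_sets` and its parameters `init_frac` and `pert_frac` are illustrative names, not from the specification. For brevity it handles feature arrays only and places all remaining samples in the test set, whereas the text also allows using only part of the remainder.

```python
import numpy as np

def split_sets(X_init, X_dist, init_frac=0.7, pert_frac=0.7, rng=None):
    """Split initial and disturbance samples into a training and a test set.

    Hypothetical helper: `init_frac` of the initial samples and `pert_frac`
    of the disturbance samples go to training; everything left goes to test.
    """
    rng = np.random.default_rng() if rng is None else rng
    X_init = np.asarray(X_init, dtype=float)
    X_dist = np.asarray(X_dist, dtype=float)
    n = len(X_init)
    init_idx = rng.permutation(n)
    # Independent shuffle: the selected disturbance samples need not be the
    # counterparts of the selected initial samples.
    pert_idx = rng.permutation(len(X_dist))
    n_init = int(round(init_frac * n))
    n_pert = int(round(pert_frac * len(X_dist)))
    train = np.concatenate([X_init[init_idx[:n_init]], X_dist[pert_idx[:n_pert]]])
    test = np.concatenate([X_init[init_idx[n_init:]], X_dist[pert_idx[n_pert:]]])
    return train, test

# Toy usage: 10 initial samples and their (placeholder) disturbed versions.
X = np.arange(20, dtype=float).reshape(10, 2)
X_disturbed = X + 100.0
train_70, test_30 = split_sets(X, X_disturbed, rng=np.random.default_rng(0))
# Noisy deployment: more disturbance samples than initial samples in training.
train_noisy, test_noisy = split_sets(X, X_disturbed, init_frac=0.2, pert_frac=0.8,
                                     rng=np.random.default_rng(0))
```

To keep the selected initial and disturbance samples in one-to-one correspondence, the same index permutation could be reused for both arrays.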
After the model's training sample set and test sample set have been obtained as shown in FIG. 5 and FIG. 6, the model can be trained with the training sample set and evaluated with the test sample set. The following describes a method that selects the predetermined parameter λ based on the evaluation of the model on the test sample set, thereby further optimizing the model.
FIG. 7 shows a flowchart of a model training method according to an embodiment of this specification, including:
In step S702, obtaining an initial sample set, wherein the initial sample set includes a plurality of initial samples;
In step S704, obtaining, by the method shown in FIG. 6 and based on the initial sample set, a plurality of training sample sets and a plurality of test sample sets corresponding respectively to the training sample sets, wherein the plurality of training sample sets correspond respectively to a plurality of first parameters having different values;
In step S706, training the current model with each of the plurality of training sample sets to obtain a plurality of updated models;
In step S708, evaluating each updated model with the corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
In step S710, determining, based on the evaluation results, the updated model of the current model from among the plurality of updated models.
First, in step S702, an initial sample set including a plurality of initial samples is obtained. The model is not limited to a specific type; as described above, it may be any of a supervised learning model, an unsupervised learning model, and a reinforcement learning model. For example, the model may be the speech recognition model described above, which is, for example, a supervised learning model. In that case, speech can be input manually and a corresponding feature vector extracted from it, so that the feature vector and the label value (the semantics) together serve as an initial sample of the model. In practice, however, the model may encounter many different environments, such as a quiet environment or various noisy environments with different kinds of noise. Initial samples collected manually in a single environment cannot simulate so many different environments, and collecting samples manually in different environments is costly. This method can therefore be used to augment the samples based on the initial sample set and thereby obtain the training sample set.
In step S704, a plurality of training sample sets, and a plurality of test sample sets corresponding respectively to the training sample sets, are obtained from the initial sample set by the method shown in FIG. 6, wherein the plurality of training sample sets correspond respectively to a plurality of first parameters having different values.
That is, with the predetermined parameter λ set to different values, the method shown in FIG. 6 is applied to the initial sample set several times to obtain a plurality of training sample sets and their respective test sample sets. For example, λ may be set to 0.0001, 0.001, 0.01, and 0.1 in turn, which makes it possible to determine how the order of magnitude of λ affects model training. It will be understood that the values of λ are not limited to this choice or to this number of values, and may be set according to the specific model. Specifically, for the four λ values above and the initial sample set A, four disturbance sample sets B_1, B_2, B_3, and B_4 can be obtained by the method shown in FIG. 2. Suppose that four pairs of sample sets (C_1, D_1), (C_2, D_2), (C_3, D_3), and (C_4, D_4) are then obtained from these four disturbance sample sets, where C_i denotes a training sample set and D_i the corresponding test sample set.
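The construction of the pairs (C_k, D_k) for several λ values can be sketched as below. This is an illustration under stated assumptions: the function name `make_sample_sets`, the Gaussian disturbance, the 70/30 split, and the choice to select corresponding initial and disturbance samples are all made here for concreteness.

```python
import numpy as np

LAMBDAS = [0.0001, 0.001, 0.01, 0.1]   # the example values from the text

def make_sample_sets(A, lambdas=LAMBDAS, train_frac=0.7, seed=0):
    """Build one (training set C_k, test set D_k) pair per lambda value.

    Hypothetical helper: A is the initial sample set (n_samples x n_dims).
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    sigma = A.std(axis=0)                     # per-dimension sigma_i
    n_train = int(round(train_frac * len(A)))
    sets = {}
    for lam in lambdas:
        # Disturbance sample set B_k: Gaussian offsets with std lambda*sigma_i.
        B = A + rng.normal(0.0, lam * sigma, size=A.shape)
        idx = rng.permutation(len(A))
        tr, te = idx[:n_train], idx[n_train:]
        C = np.concatenate([A[tr], B[tr]])    # training sample set C_k
        D = np.concatenate([A[te], B[te]])    # test sample set D_k
        sets[lam] = (C, D)
    return sets

# Toy usage: 20 initial samples with 2 feature dimensions.
A = np.arange(40, dtype=float).reshape(20, 2)
sample_sets = make_sample_sets(A)
```

Each dictionary entry maps a λ value to its own training and test sets, so that a model trained on C_k is later evaluated on the matching D_k.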
In step S706, the current model is trained with each of the plurality of training sample sets to obtain a plurality of updated models.
In the example above, this means that the current model is trained separately with each of the training sample sets C_1, C_2, C_3, and C_4 to obtain four updated models M_1, M_2, M_3, and M_4.
In step S708, each updated model is evaluated with the corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set.
In the example above, this means that the test sample sets D_1, D_2, D_3, and D_4 are used to evaluate the four updated models M_1, M_2, M_3, and M_4 respectively. The test sample set D_1 and the updated model M_1 both correspond to the training sample set C_1, so D_1 corresponds to M_1; likewise, D_2 corresponds to M_2, D_3 to M_3, and D_4 to M_4. Various evaluation metrics of an updated model, such as accuracy, precision, and recall, can be computed on its test samples in order to evaluate it. For example, these metrics may be combined into a single evaluation value for the model.
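One way to combine the metrics named above into a single evaluation value is sketched below. The specification does not say how the metrics are combined. The unweighted mean used here, the binary labels with positive class 1, and the function name `evaluation_value` are assumptions made purely for illustration.

```python
import numpy as np

def evaluation_value(y_true, y_pred):
    """Combine accuracy, precision and recall into one score.

    Assumptions: binary labels with positive class 1, and an unweighted
    mean as the combination rule.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))   # true positives
    fp = int(np.sum((y_pred == 1) & (y_true != 1)))   # false positives
    fn = int(np.sum((y_pred != 1) & (y_true == 1)))   # false negatives
    accuracy = float(np.mean(y_pred == y_true))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return (accuracy + precision + recall) / 3.0

# Toy usage: one positive sample is missed by the model.
score = evaluation_value([1, 0, 1, 1], [1, 0, 0, 1])
```

A weighted combination, or a single metric such as the F1 score, would fit the description equally well.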
In step S710, based on the evaluation results, the updated model of the current model is determined from among the plurality of updated models.
In the example above, after the evaluation value of the updated model for each λ has been obtained, the updated model with the highest evaluation value may, for example, be determined to be the updated model of the current model, that is, the trained model. This model is retained for subsequent use, such as making predictions.
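Steps S704 to S710 can be put together in a small end-to-end sketch. Everything model-specific here is a stand-in: a nearest-centroid classifier trained from scratch takes the place of "training the current model", plain accuracy serves as the evaluation value, and the names `fit_centroids`, `predict`, and `select_lambda` are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in for model training: one centroid per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict(model, X):
    """Assign each sample to the class of its nearest centroid."""
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

def select_lambda(X, y, lambdas=(0.0001, 0.001, 0.01, 0.1), train_frac=0.7, seed=0):
    """Train one model per lambda and keep the one with the best test score."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    sigma = X.std(axis=0)
    n_train = int(round(train_frac * len(X)))
    results = {}
    for lam in lambdas:
        X_dist = X + rng.normal(0.0, lam * sigma, size=X.shape)   # disturbance set
        idx = rng.permutation(len(X))
        tr, te = idx[:n_train], idx[n_train:]
        X_tr = np.concatenate([X[tr], X_dist[tr]])                # training set C_k
        y_tr = np.concatenate([y[tr], y[tr]])
        X_te = np.concatenate([X[te], X_dist[te]])                # test set D_k
        y_te = np.concatenate([y[te], y[te]])
        model = fit_centroids(X_tr, y_tr)                         # updated model M_k
        score = float(np.mean(predict(model, X_te) == y_te))      # evaluation value
        results[lam] = (score, model)
    best_lam = max(results, key=lambda k: results[k][0])
    scores = {k: v[0] for k, v in results.items()}
    return best_lam, results[best_lam][1], scores

# Toy usage: two well-separated clusters with labels 0 and 1.
data_rng = np.random.default_rng(42)
X = np.concatenate([data_rng.normal(0.0, 1.0, (20, 2)), data_rng.normal(10.0, 1.0, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
best_lam, best_model, scores = select_lambda(X, y)
```

When several λ values tie for the best score, `max` returns the first of them in iteration order; a real pipeline might prefer an explicit tie-breaking rule.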
FIG. 8 shows an apparatus 800 for obtaining a disturbance sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes a plurality of initial samples, each initial sample includes a corresponding feature vector, and the apparatus includes:
a computing unit 81, configured to compute, for each dimension of the plurality of feature vectors corresponding respectively to the plurality of initial samples, the standard deviation of the feature values in that dimension; and
a generating unit 82, configured to, for each dimension of each of the plurality of feature vectors, generate a corresponding random number and update the current feature value of that dimension of that feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples corresponding respectively to the plurality of feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension to which that random number corresponds.
FIG. 9 shows an apparatus 900 for obtaining a model training sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes a plurality of initial samples, and the apparatus includes:
an obtaining unit 91, configured to obtain, by means of the apparatus described above, a disturbance sample set based on the initial sample set, the disturbance sample set including a plurality of disturbance samples corresponding respectively to the plurality of initial samples; and
a merging unit 92, configured to obtain a training sample set by merging the plurality of initial samples with the plurality of disturbance samples.
FIG. 10 shows an apparatus 1000 for obtaining a model training sample set and a test sample set based on an initial sample set according to an embodiment of this specification, wherein the initial sample set includes a plurality of initial samples, and the apparatus includes:
an obtaining unit 101, configured to obtain, by means of the above apparatus for obtaining a disturbance sample set, a disturbance sample set based on the initial sample set, the disturbance sample set including a plurality of disturbance samples corresponding respectively to the plurality of initial samples;
a first merging unit 102, configured to obtain a training sample set by merging some of the plurality of initial samples with some of the plurality of disturbance samples; and
a second merging unit 103, configured to obtain a test sample set by merging at least some of the remaining initial samples with at least some of the remaining disturbance samples.
In one embodiment, the second merging unit is further configured to obtain the test sample set by merging all of the remaining initial samples with all of the remaining disturbance samples.
FIG. 11 shows a model training apparatus 1100 according to an embodiment of this specification, including:
a first obtaining unit 111, configured to obtain an initial sample set, wherein the initial sample set includes a plurality of initial samples;
a second obtaining unit 112, configured to obtain, by means of the above apparatus for obtaining a training sample set and a test sample set and based on the initial sample set, a plurality of training sample sets and a plurality of test sample sets corresponding respectively to the training sample sets, wherein the plurality of training sample sets correspond respectively to a plurality of first parameters having different values;
a training unit 113, configured to train the current model with each of the plurality of training sample sets to obtain a plurality of updated models;
an evaluation unit 114, configured to evaluate each updated model with the corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
a determining unit 115, configured to determine, based on the evaluation results, the updated model of the current model from among the plurality of updated models.
Another aspect of this specification provides a computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform any of the methods described above.
Another aspect of this specification provides a computing device including a memory and a processor, wherein the memory stores executable code and the processor, when executing the executable code, implements any of the methods described above.
The embodiments of this specification perturb the model's training data to simulate the data noise found in real environments, thereby increasing the robustness of the model. By training and evaluating the model with the disturbed data and determining the model's predetermined parameter from the evaluation results, they quantitatively improve the model's effectiveness on abnormal data. In addition, the embodiments place no restrictions on the parameters of the machine learning model, and therefore do not limit the model's learning capacity.
The embodiments in this specification are described in a progressive manner. Identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant details, refer to the description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In certain embodiments, multitasking and parallel processing are also possible or may be advantageous.
Those of ordinary skill in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those of ordinary skill in the art may implement the described functions in different ways for each specific application, but such implementations should not be regarded as going beyond the scope of this application.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The specific embodiments described above explain the objectives, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that the foregoing are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (22)

  1. A method for obtaining a disturbance sample set based on an initial sample set, the initial sample set including a plurality of initial samples, each initial sample including a corresponding feature vector, the method comprising:
    computing, for each dimension of the plurality of feature vectors corresponding respectively to the plurality of initial samples, the standard deviation of the feature values in that dimension; and
    for each dimension of each of the plurality of feature vectors, generating a corresponding random number and updating the current feature value of that dimension of that feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples corresponding respectively to the plurality of feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension to which that random number corresponds.
  2. The method according to claim 1, wherein the random number is a Gaussian-distributed random number whose standard deviation is the product of the first parameter and the feature-value standard deviation of the dimension to which the random number corresponds.
  3. The method according to claim 1, wherein the random number is a uniformly distributed random number whose value lies between plus and minus a first value, the first value being the product of the first parameter and the feature-value standard deviation of the dimension to which the random number corresponds.
  4. A method for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, the method comprising:
    obtaining a disturbance sample set based on the initial sample set by the method according to claim 1, the disturbance sample set including a plurality of disturbance samples corresponding respectively to the plurality of initial samples; and
    obtaining a training sample set by merging the plurality of initial samples with the plurality of disturbance samples.
  5. A method for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, the method comprising:
    obtaining a disturbance sample set based on the initial sample set by the method according to claim 1, the disturbance sample set including a plurality of disturbance samples corresponding respectively to the plurality of initial samples;
    obtaining a training sample set by merging some of the plurality of initial samples with some of the plurality of disturbance samples; and
    obtaining a test sample set by merging at least some of the remaining initial samples with at least some of the remaining disturbance samples.
  6. The method according to claim 5, wherein the proportion of the plurality of initial samples made up by said some of the initial samples is the same as the proportion of the plurality of disturbance samples made up by said some of the disturbance samples.
  7. The method according to claim 6, wherein obtaining a test sample set by merging at least some of the remaining initial samples with at least some of the remaining disturbance samples comprises obtaining the test sample set by merging the remaining initial samples with the remaining disturbance samples.
  8. The method according to claim 6, wherein said some of the initial samples and said some of the disturbance samples correspond to each other respectively.
  9. A model training method, comprising:
    obtaining an initial sample set, wherein the initial sample set includes a plurality of initial samples;
    obtaining, by the method according to claim 5 and based on the initial sample set, a plurality of training sample sets and a plurality of test sample sets corresponding respectively to the training sample sets, wherein the plurality of training sample sets correspond respectively to a plurality of first parameters having different values;
    training the current model with each of the plurality of training sample sets to obtain a plurality of updated models;
    evaluating each updated model with the corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
    determining, based on the evaluation results, the updated model of the current model from among the plurality of updated models.
  10. The method according to claim 9, wherein the model is any one of the following types: a supervised learning model, an unsupervised learning model, and a reinforcement learning model.
  11. An apparatus for obtaining a disturbance sample set based on an initial sample set, the initial sample set including a plurality of initial samples, each initial sample including a corresponding feature vector, the apparatus comprising:
    a computing unit, configured to compute, for each dimension of the plurality of feature vectors corresponding respectively to the plurality of initial samples, the standard deviation of the feature values in that dimension; and
    a generating unit, configured to, for each dimension of each of the plurality of feature vectors, generate a corresponding random number and update the current feature value of that dimension of that feature vector to the sum of the current feature value and the corresponding random number, so as to generate a plurality of disturbance samples corresponding respectively to the plurality of feature vectors and thereby obtain the disturbance sample set, wherein the value range of each random number is determined based on the product of a predetermined first parameter and the feature-value standard deviation of the dimension to which that random number corresponds.
  12. The apparatus according to claim 11, wherein the random number is a Gaussian-distributed random number whose standard deviation is the product of the first parameter and the feature-value standard deviation of the dimension to which the random number corresponds.
  13. The apparatus according to claim 11, wherein the random number is a uniformly distributed random number whose value lies between plus and minus a first value, the first value being the product of the first parameter and the feature-value standard deviation of the dimension to which the random number corresponds.
  14. A device for obtaining a model training sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, and the device comprises:
    an obtaining unit configured to obtain, through the device of claim 11, a disturbance sample set based on the initial sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples; and
    a merging unit configured to obtain a training sample set by merging the plurality of initial samples with the plurality of disturbance samples.
  15. A device for obtaining a model training sample set and a test sample set based on an initial sample set, wherein the initial sample set includes a plurality of initial samples, and the device comprises:
    an obtaining unit configured to obtain, through the device of claim 11, a disturbance sample set based on the initial sample set, the disturbance sample set including a plurality of disturbance samples respectively corresponding to the plurality of initial samples;
    a first merging unit configured to obtain a training sample set by merging a part of the plurality of initial samples with a part of the plurality of disturbance samples; and
    a second merging unit configured to obtain a test sample set by merging at least some of the remaining initial samples of the plurality of initial samples with at least some of the remaining disturbance samples of the plurality of disturbance samples.
  16. The device according to claim 15, wherein the proportion of the partial initial samples among the plurality of initial samples is the same as the proportion of the partial disturbance samples among the plurality of disturbance samples.
  17. The device according to claim 16, wherein the second merging unit is further configured to obtain the test sample set by merging the remaining initial samples of the plurality of initial samples with the remaining disturbance samples of the plurality of disturbance samples.
  18. The device according to claim 16, wherein the partial initial samples and the partial disturbance samples respectively correspond to each other.
  19. A model training device, comprising:
    a first obtaining unit configured to obtain an initial sample set, wherein the initial sample set includes a plurality of initial samples;
    a second obtaining unit configured to obtain, through the device of claim 15 and based on the initial sample set, a plurality of training sample sets and a plurality of test sample sets respectively corresponding to the plurality of training sample sets, wherein the plurality of training sample sets respectively correspond to a plurality of first parameters having different values;
    a training unit configured to train a current model with each of the plurality of training sample sets, so as to obtain a plurality of updated models respectively;
    an evaluation unit configured to evaluate each updated model with its corresponding test sample set, wherein a test sample set and its corresponding updated model correspond to the same training sample set; and
    a determining unit configured to determine, based on the evaluation results, the updated model of the current model from among the plurality of updated models.
  20. The device according to claim 19, wherein the model is any one of the following types of model: a supervised learning model, an unsupervised learning model, or a reinforcement learning model.
  21. A computer-readable storage medium storing a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1-10.
  22. A computing device comprising a memory and a processor, characterized in that the memory stores executable code, and the processor, when executing the executable code, implements the method of any one of claims 1-10.
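
As an illustrative sketch only, and not part of the claims, the perturbation of claims 11 to 13, the train/test merging of claims 15 to 18, and the model selection of claim 19 could be expressed in Python roughly as follows. The function names (perturb, split_train_test, select_model), the zero-mean noise, the use of the population standard deviation, the random shuffling, the caller-supplied train_fn and eval_fn callbacks, and the "higher score is better" convention are all assumptions made for illustration; the claims do not specify them.

```python
import random
import statistics


def perturb(features, alpha, kind="gaussian", rng=None):
    """Return a disturbance sample for every row of `features`.

    `features` is a list of equal-length lists of floats; `alpha` plays the
    role of the "first parameter" in the claims.
    """
    rng = random.Random() if rng is None else rng
    # Standard deviation of each dimension across all initial samples
    # (population form chosen here as an assumption).
    sigmas = [statistics.pstdev(column) for column in zip(*features)]
    perturbed = []
    for row in features:
        new_row = []
        for value, sigma in zip(row, sigmas):
            bound = alpha * sigma  # first parameter times per-dimension std
            if kind == "gaussian":
                noise = rng.gauss(0.0, bound)  # zero mean is an assumption
            elif kind == "uniform":
                noise = rng.uniform(-bound, bound)
            else:
                raise ValueError(f"unknown kind: {kind}")
            new_row.append(value + noise)
        perturbed.append(new_row)
    return perturbed


def split_train_test(initial, perturbed, train_fraction, rng=None):
    """Merge matching initial and disturbance samples into train and test sets."""
    rng = random.Random() if rng is None else rng
    order = list(range(len(initial)))
    rng.shuffle(order)
    n_train = round(train_fraction * len(initial))
    train_idx, test_idx = order[:n_train], order[n_train:]
    # Using the same indices for both sets gives equal proportions and
    # keeps each initial sample together with its own disturbance sample.
    train = [initial[i] for i in train_idx] + [perturbed[i] for i in train_idx]
    test = [initial[i] for i in test_idx] + [perturbed[i] for i in test_idx]
    return train, test


def select_model(initial, alphas, train_fn, eval_fn, train_fraction=0.8, seed=0):
    """Train one candidate per alpha and return (score, alpha, model) of the best."""
    best = None
    for alpha in alphas:
        rng = random.Random(seed)  # same seed so candidates differ only in alpha
        perturbed = perturb(initial, alpha, rng=rng)
        train, test = split_train_test(initial, perturbed, train_fraction, rng)
        model = train_fn(train)        # hypothetical caller-supplied trainer
        score = eval_fn(model, test)   # hypothetical scorer, higher is better
        if best is None or score > best[0]:
            best = (score, alpha, model)
    return best
```

A caller would pass its own training and scoring callbacks to select_model along with a list of candidate values for the first parameter; the returned tuple identifies which candidate scored best on its matching test set.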
PCT/CN2020/070290 2019-02-22 2020-01-03 Model training method and apparatus based on disturbance samples WO2020168843A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910133409.3A CN110033094A (en) 2019-02-22 2019-02-22 A kind of model training method and device based on disturbance sample
CN201910133409.3 2019-02-22

Publications (1)

Publication Number Publication Date
WO2020168843A1 (en) 2020-08-27

Family

ID=67234962

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070290 WO2020168843A1 (en) 2019-02-22 2020-01-03 Model training method and apparatus based on disturbance samples

Country Status (2)

Country Link
CN (1) CN110033094A (en)
WO (1) WO2020168843A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033094A (en) * 2019-02-22 2019-07-19 阿里巴巴集团控股有限公司 A kind of model training method and device based on disturbance sample
CN111062442B (en) * 2019-12-20 2022-04-12 支付宝(杭州)信息技术有限公司 Method and device for explaining service processing result of service processing model
CN111783551B (en) * 2020-06-04 2023-07-25 中国人民解放军军事科学院国防科技创新研究院 Countermeasure sample defense method based on Bayesian convolutional neural network
CN113780365A (en) * 2021-08-19 2021-12-10 支付宝(杭州)信息技术有限公司 Sample generation method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060048010A1 (en) * 2004-08-30 2006-03-02 Hung-En Tai Data analyzing method for a fault detection and classification system
CN105808500A (en) * 2016-02-26 2016-07-27 山西牡丹深度智能科技有限公司 Realization method and device of deep learning
CN107193863A (en) * 2017-04-01 2017-09-22 广东工业大学 A kind of Data Quality Assessment Methodology of data untagged
CN107315918A (en) * 2017-07-06 2017-11-03 青岛大学 A kind of method that utilization noise improves robust iterative
CN110033094A (en) * 2019-02-22 2019-07-19 阿里巴巴集团控股有限公司 A kind of model training method and device based on disturbance sample

Also Published As

Publication number Publication date
CN110033094A (en) 2019-07-19

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20759724; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20759724; Country of ref document: EP; Kind code of ref document: A1)