CN114565086A - Model training method, device, equipment and storage medium - Google Patents

Model training method, device, equipment and storage medium Download PDF

Info

Publication number
CN114565086A
Authority
CN
China
Prior art keywords
training
sample
samples
training set
difficult
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210205109.3A
Other languages
Chinese (zh)
Inventor
何旋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd filed Critical Chongqing Unisinsight Technology Co Ltd
Priority to CN202210205109.3A priority Critical patent/CN114565086A/en
Publication of CN114565086A publication Critical patent/CN114565086A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a model training method, which comprises the following steps: performing data enhancement on the samples in a first training set to obtain a second training set; constructing a plurality of third training sets in batches and iteratively training a neural network model a plurality of times with the plurality of third training sets; the third training samples in each third training set are randomly extracted from the first training set or/and the second training set; in each pass of training the neural network model, data enhancement is performed on first training samples of the first training set, and each resulting enhanced sample replaces the second training sample with the same number in the second training set. The method reproduces the sample-generation strategy of online data enhancement by means of offline data enhancement, obtaining augmentation as rich as online data enhancement while consuming less storage than conventional offline data enhancement. A dirty-data and hard-sample mining method is also provided, which mines hard samples and dirty data in the data set, strengthens the model's ability to recognize hard samples, and removes dirty data from the data set.

Description

Model training method, device, equipment and storage medium
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to a model training method, a model training device, model training equipment and a storage medium.
Background
In the training of convolutional neural networks, data enhancement is used to expand the data set and improve the generalization ability of the model. Generally, either offline data enhancement or online data enhancement is adopted during model training. In offline data enhancement, one or more kinds of data enhancement are applied to part or all of the training set before training, a certain amount of enhanced data is generated and stored, and the enhanced data is combined with the original training set to form a new training set for training. Its advantage is that the data enhancement work is completed in advance, so no extra time is needed for data enhancement during model training; its disadvantage is that the enhanced data must be saved, and the larger the data set, the more space is consumed. In online data enhancement, one or more kinds of data enhancement are applied to part or all of the training data of each batch during training, and the enhanced data is sent directly to the model for training. The enhancement methods used for each batch can differ, the enhanced data is regenerated every time, and it is not stored. Its advantage is that no additional storage space is occupied and, because each batch can be enhanced differently, the enhanced data set is richer; its disadvantage is that extra time is spent on data enhancement before each batch is trained. In addition, during model training, some dirty data in the data set interferes with the model, and some samples are difficult to fit. How to effectively mine dirty data and hard samples from the data set during training and thereby improve model metrics is therefore also a current research hotspot.
Disclosure of Invention
In view of the above drawbacks of the prior art, the present invention provides a model training method, apparatus, device and storage medium to solve the problem that the offline data enhancement method in the prior art consumes a large amount of space and takes a long time.
To achieve the above and other related objects, the present invention provides a model training method, comprising:
acquiring a first training set, wherein the first training set comprises a plurality of first training samples;
performing data enhancement on each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample has the same number as the corresponding second sample;
constructing a plurality of third training sets in batches, and performing iterative training on the neural network model for a plurality of times by using the plurality of third training sets until a stopping condition is met; wherein, the third training samples in the third training set are randomly extracted from the first training set or/and the second training set, and the serial numbers of each third training sample in a plurality of third training sets are different;
in each process of training the neural network model, performing data enhancement on a first training sample of a first training set to obtain an enhanced sample; replacing the enhancement sample with a second training sample in the second training set having the same number as the enhancement sample; and the first training sample after data enhancement and the second training sample in the third training set have the same number.
Optionally, the method further comprises:
constructing a difficult sample set;
establishing a plurality of fourth training sets in batches, and performing iterative training on the neural network model by using the fourth training sets until a stopping condition is met; and randomly extracting fourth training samples in the fourth training set from the first training set or/and the second training set or/and the hard sample set, wherein the serial numbers of the fourth training samples in a plurality of fourth training sets are different.
Optionally, constructing the third training set comprises:
reordering the numbers of second training samples in the second training set;
selecting a plurality of numbers from the reordered numbers as the numbers of a third training sample in the third training set;
and selecting the first training sample or/and the second training sample corresponding to the plurality of numbers from the first training set or/and the second training set as a third training sample, and constructing a third training set based on the third training sample.
Optionally, constructing the fourth training set comprises:
reordering the numbers of samples in the second training set;
selecting a plurality of numbers from the reordered numbers as the number of a fourth training sample in the fourth training set;
and selecting the first training samples or/and the second training samples or/and the difficult samples corresponding to the plurality of numbers from the first training set or/and the second training set or/and the difficult sample set as fourth training samples, and constructing a fourth training set based on the fourth training samples.
Optionally, if a difficult sample corresponding to one of the plurality of numbers exists in the difficult sample set, selecting a difficult sample of the corresponding number from the difficult sample set;
if a plurality of difficult samples corresponding to one of the numbers exist in the difficult sample set, randomly selecting one difficult sample from the plurality of difficult samples;
and if the difficult samples corresponding to the plurality of numbers do not exist in the difficult sample set, taking out first training samples or/and second training samples corresponding to the plurality of numbers from the first training set or/and the second training set.
Optionally, when the first training sample and the second training sample in the fourth training set are identified through a neural network model, adding a sample with an identification error or a confidence coefficient lower than a set confidence coefficient into the difficult sample set;
when the difficult samples in the fourth training set are identified through a neural network model, deleting the correctly identified samples from the difficult sample set;
and when identifying the difficult samples in the fourth training set through a neural network model, deleting the samples with the identification times larger than the set times from the difficult sample set, and keeping the samples in the dirty data set.
Optionally, if there is a first training sample belonging to a first training set in the difficult sample set, only the link of the first training sample is retained.
To achieve the above and other related objects, the present invention provides a model training apparatus, comprising:
the training set acquisition module is used for acquiring a first training set, and the first training set comprises a plurality of first training samples;
the first enhancement module is used for enhancing data of each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample has the same number as the corresponding second sample;
the training set construction module is used for constructing a plurality of third training sets in batches and carrying out a plurality of times of iterative training on the neural network model by utilizing the plurality of third training sets until a stopping condition is met; wherein, the third training samples in the third training set are randomly extracted from the first training set or/and the second training set, and the serial numbers of each third training sample in a plurality of third training sets are different;
the second enhancement module is used for performing data enhancement on a first training sample of the first training set in each process of training the neural network model to obtain an enhanced sample, and for replacing the second training sample in the second training set that has the same number as the enhanced sample with the enhanced sample; the first training sample subjected to data enhancement has the same number as a second training sample in the third training set.
To achieve the above and other related objects, the present invention provides a model training apparatus, comprising:
a memory for storing a computer program;
a processor for executing the computer program stored in the memory to cause the apparatus to perform the model training method.
To achieve the above and other related objects, the present invention provides a storage medium storing a computer program which, when executed by a processor, performs the model training method.
As described above, the model training method, apparatus and storage medium of the present invention have the following advantages:
the invention discloses a model training method, which comprises the following steps: acquiring a first training set, wherein the first training set comprises a plurality of first training samples; performing data enhancement on each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample and the corresponding second sample have the same number; constructing a plurality of third training sets in batches, and performing iterative training on the neural network model for a plurality of times by using the plurality of third training sets until a stopping condition is met; wherein, the third training samples in the third training set are randomly extracted from the first training set or/and the second training set, and the serial numbers of each third training sample in a plurality of third training sets are different; in each process of training the neural network model, performing data enhancement on a first training sample of a first training set to obtain an enhanced sample; replacing the enhancement sample with a second training sample in the second training set having the same number as the enhancement sample; and the first training sample after data enhancement and the second training sample in the third training set have the same number. The model training method simulates a data enhancement sample generation strategy of on-line data enhancement in an off-line data enhancement mode; obtaining the enhancement effect which is as rich as the enhancement effect of the online data with the storage consumption which is less than the enhancement of the offline data; compared with on-line data enhancement, the generation of the data enhancement sample is parallel to the training of the model, and the additional training time cost is not increased. The invention also provides a dirty data and hard sample mining method, which is used for mining hard samples and dirty data in the data set, enhancing the identification capability of the model on the hard samples and removing the dirty data in the data set.
Drawings
FIG. 1 is a flow chart of a model training method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a relationship between a first training set, a second training set, and a hard sample set according to an embodiment of the present invention;
FIG. 3 is a flowchart of the third training set construction according to an embodiment of the present invention;
FIG. 4 is a flowchart of the fourth training set construction according to an embodiment of the present invention;
FIG. 5 is a schematic block diagram of a model training apparatus according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
As shown in fig. 1, an embodiment of the present application provides a model training method, including:
s100, obtaining a first training set, wherein the first training set comprises a plurality of first training samples;
s101, performing data enhancement on each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample has the same number as the corresponding second sample;
s102, constructing a plurality of third training sets in batches, and performing iterative training on the neural network model for a plurality of times by using the plurality of third training sets until a stopping condition is met; wherein, the third training samples in the third training set are randomly extracted from the first training set or/and the second training set, and the serial numbers of each third training sample in a plurality of third training sets are different;
s103, in each process of training the neural network model, enhancing a first training sample of a first training set to obtain an enhanced sample; replacing a second training sample in the second training set with the same number as the enhancement sample with the enhancement sample; and the first training sample after data enhancement and the second training sample in the third training set have the same number.
The model training method reproduces the sample-generation strategy of online data enhancement by means of offline data enhancement, obtaining augmentation as rich as online data enhancement while consuming less storage than conventional offline data enhancement; meanwhile, compared with online data enhancement, the generation of the enhanced samples runs in parallel with model training, so no additional training time is incurred.
The respective steps will be described in detail below.
In step S100, a first training set is obtained, where the first training set includes a plurality of first training samples;
the first training sample may be understood as a set of picture data, which may be pictures of the same object under different angles, different environmental conditions, different pixel colors, and the like. Of course, the image may also be a picture of different objects and different scenes, and the embodiment of the present application is not limited at all.
In step S101, performing data enhancement on each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample has the same number as the corresponding second sample;
data enhancement includes methods that use some computer vision to make some transformations on existing picture data to obtain equivalent new data that is highly correlated with the original data. In the present embodiment, the data enhancement strategy includes one of cropping, rotating, translating, flipping, sharpening, illuminating, and occluding. For example, the data enhancement strategies include cropping, rotating, translating, flipping, sharpening, illuminating, or occluding. Different first training samples can adopt different data enhancement strategies, and the same data enhancement strategy can also be adopted, so that only one training sample needs to use one data enhancement strategy.
Specifically, the construction of the second training set may be accomplished by:
firstly, determining a data enhancement strategy required to be used in a model training process, and then performing data enhancement on each training sample (namely each picture) in a first training set train _ origin once by using the data enhancement strategy to obtain a sample set train _ aug with the same size as the first training set train _ origin, recording the sample set train _ aug as a second training set, and storing the second training set locally; the second training samples in the second training set train _ aug correspond to the second training samples in the first training set train _ origin one to one. Fig. 2 is a diagram illustrating the correspondence relationship between the first training set Train _ origin and the second training set Train _ aug. In the first training set Train _ origin and the second training set, the pictures having a correspondence relationship have the same number.
In step S102, a plurality of third training sets are constructed in batches, and the neural network model is iteratively trained a plurality of times with the plurality of third training sets until a stop condition is satisfied; the third training samples in each third training set are randomly extracted from the first training set or/and the second training set, and the numbers of the third training samples in the plurality of third training sets are all different.
In an embodiment, as shown in fig. 3, constructing the third training set includes:
s300, reordering the numbers of the second training samples in the second training set;
s301, selecting a plurality of numbers from the reordered numbers as the numbers of a third training sample in the third training set;
s302 selects a first training sample or/and a second training sample corresponding to the plurality of numbers from the first training set or/and the second training set as a third training sample, and constructs a third training set based on the third training sample.
Specifically, a plurality of third training sets are constructed in batches, and the neural network model is iteratively trained a plurality of times with the plurality of third training sets until a stopping condition is met, which comprises the following steps:
Step 3.1: assuming that the number of samples in the first training set train_origin is n, the samples in the first training set train_origin and the second training set train_aug are numbered a = [1, n]; a is randomly shuffled to obtain a new number sequence b:
b = shuffle(a)
and 3.2, training the model in batches, namely dividing the training sample of the whole training set into a plurality of sub-training sets, training the model by using the first sub-training set during the first training, then training the model by using the second sub-training set, and so on until the model training is finished. Thus, the first bs numbers from b constitute bs _ c, while the elements in b are shifted to the left by the bs position. Ensuring that the learning times of each picture in the whole training process are consistent through the step 3.1 and the step 3.2;
and 3.3, selecting bs training samples from the first training set train _ origin and the second training set train _ aug according to bs _ c to form a third training set bs _ set. The specific selection process comprises the following steps: traversing the numbers in bs _ c, randomly taking the corresponding numbered training samples from the first training set train _ origin or the second training set train _ aug. The probability that each numbered sample is from the first training set train _ origin or the second training set train _ aug may be the same or different. Step 3.3 ensures that the training samples in the third training set bs _ set of each training batch do not have the corresponding relationship expressed in fig. 2 with other samples.
bs_set ⊆ (train_origin ∪ train_aug)
Step 3.4: the third training set bs_set of the current batch is fed into the neural network model to train the neural network model.
Step 3.5: steps 3.2-3.4 are repeated until m iterations have been performed or another stop condition is triggered.
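The batched construction of the third training sets in steps 3.1-3.5 could be sketched as follows, assuming train_origin and train_aug are in-memory lists indexed by number and train_one_batch is a hypothetical stand-in for the actual training call; this is an illustrative sketch under those assumptions, not the literal implementation of the disclosure.

```python
import random

def train_epochs(train_origin, train_aug, model, bs, m, train_one_batch, p_aug=0.5):
    """Sketch of steps 3.1-3.5: batch construction and iterative training."""
    n = len(train_origin)
    b = list(range(n))
    random.shuffle(b)                                   # step 3.1: b = shuffle(a)
    for _ in range(m):                                  # step 3.5: iterate m times (or another stop condition)
        bs_c = b[:bs]                                   # step 3.2: take the first bs numbers ...
        b = b[bs:] + b[:bs]                             # ... then rotate b left by bs positions
        bs_set = {}                                     # step 3.3: for each number, draw the sample
        for a in bs_c:                                  # at random from train_origin or train_aug
            source = train_aug if random.random() < p_aug else train_origin
            bs_set[a] = source[a]
        train_one_batch(model, list(bs_set.values()))   # step 3.4: feed the batch to the model
```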
In step S103, in each process of training the neural network model, a first training sample of the first training set is enhanced to obtain an enhanced sample; the second training sample in the second training set that has the same number as the enhanced sample is replaced with the enhanced sample; and the first training sample subjected to data enhancement has the same number as a second training sample in the third training set.
Specifically, the numbers of the training samples in bs_set that were taken from the second training set train_aug are collected and denoted bs_c_aug. Meanwhile, a data enhancement thread performs, in parallel, data enhancement on the pictures of the first training set train_origin whose numbers are in bs_c_aug, and the resulting pictures replace the samples with the same numbers in the second training set train_aug. This ensures that the pictures in train_aug are always ones that have not yet been used for training, and that every time a picture with a given number is taken from train_aug it has been generated by a different data enhancement strategy, achieving the effect of online data enhancement. The number of pictures in train_aug remains unchanged, so the storage space it occupies remains unchanged. In addition, model training generally runs on the GPU while data enhancement generally runs on the CPU, so running them in parallel does not slow down model training; since, in theory, about half of the training pictures of each batch come from train_aug, the parallel data enhancement thread only needs to generate roughly bs/2 pictures each time, and the speed at which the CPU generates enhanced pictures can keep pace with model training.
bs_c_aug = {a | a ∈ bs_c, bs_set[a] ∈ train_aug}
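A sketch of the parallel regeneration of train_aug described above, under the assumption that the augmentation runs in a background Python thread and reuses the hypothetical augment_once helper from the earlier sketch; all function and variable names are illustrative.

```python
import threading

def refresh_train_aug_async(train_origin, train_aug, bs_c, bs_set, augment_once):
    """Regenerate, in a background thread, the train_aug entries whose numbers were just drawn
    from train_aug, so the next draw of those numbers yields freshly enhanced pictures."""
    # bs_c_aug = {a | a in bs_c, bs_set[a] in train_aug}
    bs_c_aug = [a for a in bs_c if bs_set[a] is train_aug[a]]

    def worker():
        for a in bs_c_aug:
            train_aug[a] = augment_once(train_origin[a])   # overwrite the sample with the same number

    t = threading.Thread(target=worker, daemon=True)
    t.start()                                              # runs in parallel with the GPU training step
    return t
```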
After step S103, the neural network model has already fitted most of the data and can accurately predict most of the original data and the data generated by data enhancement; the main remaining difficulty is the hard samples that the model has not yet fitted. Based on this, the present application further provides a method comprising:
constructing a difficult sample set; establishing a plurality of fourth training sets in batches, and performing iterative training on the neural network model by using the fourth training sets until a stopping condition is met; and randomly extracting fourth training samples in the fourth training set from the first training set or/and the second training set or/and the hard sample set, wherein the serial numbers of the fourth training samples in a plurality of fourth training sets are different. In this case, when the difficult sample set is initially constructed, the difficult sample set is empty.
In an embodiment, as shown in fig. 4, constructing the fourth training set includes:
s400, reordering the numbers of the samples in the second training set;
s401, selecting a plurality of numbers from the reordered numbers as the numbers of a fourth training sample in the fourth training set;
s402, selecting first training samples or/and second training samples or/and difficult samples corresponding to the numbers from the first training set or/and the second training set or/and the difficult sample set as fourth training samples, and constructing a fourth training set based on the fourth training samples.
In an embodiment, if a difficult sample corresponding to one of the plurality of numbers exists in the difficult sample set, selecting a difficult sample with a corresponding number from the difficult sample set;
if a plurality of difficult samples corresponding to one of the numbers exist in the difficult sample set, randomly selecting one difficult sample from the plurality of difficult samples;
and if the difficult samples corresponding to the plurality of numbers do not exist in the difficult sample set, taking out first training samples or/and second training samples corresponding to the plurality of numbers from the first training set or/and second training set.
In an embodiment, when the first training sample and the second training sample in the fourth training set are identified through a neural network model, adding a sample with an identification error or a confidence coefficient lower than a set confidence coefficient into the difficult sample set; when the difficult samples in the fourth training set are identified through a neural network model, deleting the correctly identified samples from the difficult sample set; and when identifying the difficult samples in the fourth training set through a neural network model, deleting the samples with the identification times larger than the set times from the difficult sample set, and keeping the samples in the dirty data set.
Specifically, a plurality of fourth training sets are constructed in batches, and the neural network model is iteratively trained with the plurality of fourth training sets until a stopping condition is met, which comprises the following steps:
Step 4.1: assuming that the number of samples in the first training set train_origin is n, the samples in the first training set train_origin and the second training set train_aug are numbered a1 = [1, n]; a1 is randomly shuffled to obtain a new number sequence b1:
b1 = shuffle(a1)
and 4.2, training the model in batches, namely dividing the training sample of the whole training set into a plurality of sub-training sets, training the model by using the first sub-training set during the first training, then training the model by using the second sub-training set, and so on until the model training is finished. Thus, the previous bs1 numbers from b1 constitute bs _ c1, while the element in b1 is shifted left by bs 1. Ensuring that the learning times of each picture in the whole training process are consistent through the step 4.1 and the step 4.2;
and 4.3, according to the bs _ c1, selecting bs1 training samples from the first training set train _ origin, the second training set train _ aug and the difficult sample set train _ hard to form a fourth training set bs _ set 1. The specific process is as follows: traversing the serial numbers in the bs _ c1, if the difficult sample set train _ hard contains the pictures with the serial numbers, taking the samples with the corresponding serial numbers from the difficult sample set train _ hard, and if the serial numbers in the difficult sample set train _ hard contain a plurality of samples, randomly selecting one sample; if there is no numbered sample in the difficult sample set train _ hard, the corresponding numbered training sample is randomly taken from the first training set train _ origin or the second training set train _ aug. Step 4.3 also ensures that the training samples in the picture set bs _ set1 of each training batch do not have the corresponding relationship with other samples as expressed in fig. 2.
Step 4.4: the fourth training set bs_set1 of the current batch is fed into the neural network model to train the neural network model.
Step 4.5: the inference result of each batch of pictures is recorded. Pictures in bs_set1 that belong to the first training set train_origin or to the second training set train_aug and are misidentified by the neural network model, or whose confidence is lower than the set confidence c_t, are added to the hard sample set train_hard as hard samples, and train_hard is saved; the samples in train_hard keep their number correspondence.
Pictures in bs_set1 that originally belong to the hard sample set train_hard and are correctly identified by the neural network model are deleted from train_hard. The number of times each picture belonging to train_hard appears in bs_set1 for training is also recorded; when this count e exceeds the set number e_t, the hard sample is considered to deviate from the overall data distribution and to be potentially dirty data, so it is deleted from the hard sample set train_hard and kept in the dirty data set train_dirty for later analysis. It should be noted that the dirty data set train_dirty is empty at the beginning, that is, train_dirty is empty when the model is trained with the first batch of data.
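The bookkeeping of step 4.5 could be sketched as follows, assuming the per-batch inference results are available as a mapping from number to (correctness, confidence) and that the training count e is tracked per number; the thresholds c_t and e_t and all names are illustrative assumptions.

```python
def update_hard_and_dirty(bs_set1, predictions, train_hard, train_dirty, hard_counts,
                          c_t=0.5, e_t=5):
    """predictions: {number: (is_correct, confidence)} for the samples of this batch."""
    for a, sample in bs_set1.items():
        is_correct, confidence = predictions[a]
        came_from_hard = any(sample is h for h in train_hard.get(a, []))
        if came_from_hard:
            hard_counts[a] = hard_counts.get(a, 0) + 1   # count trainings of this hard sample (per number here)
            if is_correct:
                train_hard[a].remove(sample)             # now fitted -> drop it from the hard set
            elif hard_counts[a] > e_t:
                train_hard[a].remove(sample)             # trained too often without fitting:
                train_dirty.setdefault(a, []).append(sample)   # treat as potential dirty data
        elif (not is_correct) or confidence < c_t:
            train_hard.setdefault(a, []).append(sample)  # new hard sample, keeps its number
```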
Regarding the storage space occupied by the hard sample set train_hard: in the extreme case, train_hard gradually grows until it contains pictures with every number; from then on, all training samples of every batch come from train_hard, no new hard samples are added after a batch of training, train_hard stops growing, and the space it occupies no longer increases. The theoretical maximum storage occupied by train_hard is therefore equal to the size of the original data set. In addition, for pictures in train_hard that belong to the first training set train_origin, only links to the pictures need to be kept, which further reduces the storage occupied. In summary, with the hard-sample selection of step 4.5, the storage space occupied by train_hard is kept within a reasonable range. Moreover, after all training is finished, the data in the dirty data set train_dirty and the hard sample set train_hard may be retained and analyzed manually to provide a basis for the next round of training.
Step 4.6: steps 4.1-4.5 are repeated until a stop condition is triggered.
To achieve the above and other related objects, the present invention provides a model training apparatus, as shown in fig. 5, comprising:
a training set obtaining module 500, configured to obtain a first training set, where the first training set includes a plurality of first training samples;
a first enhancing module 501, configured to enhance each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample has the same number as the corresponding second sample;
a training set constructing module 502, configured to construct a plurality of third training sets in batches, and perform multiple iterative training on the neural network model by using the plurality of third training sets until a stop condition is met; wherein, the third training samples in the third training set are randomly extracted from the first training set or/and the second training set, and the serial numbers of each third training sample in a plurality of third training sets are different;
a second enhancing module 503, configured to perform data enhancement on a first training sample of the first training set in each process of training the neural network model to obtain an enhanced sample, and to replace the second training sample in the second training set that has the same number as the enhanced sample with the enhanced sample; the first training sample subjected to data enhancement has the same number as a second training sample in the third training set.
It should be noted that, because the embodiment of the apparatus portion and the embodiment of the method portion correspond to each other, please refer to the description of the embodiment of the method portion for the content of the embodiment of the apparatus portion, and details are not repeated here.
The invention also provides a storage medium storing a computer program which, when executed by a processor, performs the aforementioned model training method.
The present invention also provides an apparatus comprising:
a memory for storing a computer program;
a processor for executing the computer program stored by the memory to cause the apparatus to perform the aforementioned model training method.
The Processor may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory may be an internal storage unit or an external storage device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital Card (SD), a flash memory Card (FlashCard), and the like. Further, the memory may also include both an internal storage unit and an external storage device. The memory is used for storing the computer program and other programs and data. The memory may also be used to temporarily store data that has been or will be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated module/unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may comprise any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, etc.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (10)

1. A method of model training, the method comprising:
acquiring a first training set, wherein the first training set comprises a plurality of first training samples;
performing data enhancement on each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample has the same number as the corresponding second sample;
constructing a plurality of third training sets in batches, and performing iterative training on the neural network model for a plurality of times by using the plurality of third training sets until a stopping condition is met; wherein, the third training samples in the third training set are randomly extracted from the first training set or/and the second training set, and the serial numbers of each third training sample in a plurality of third training sets are different;
in each process of training the neural network model, performing data enhancement on a first training sample of the first training set to obtain an enhanced sample; replacing the second training sample in the second training set that has the same number as the enhanced sample with the enhanced sample; and the first training sample after data enhancement and a second training sample in the third training set have the same number.
2. The model training method of claim 1, further comprising:
constructing a difficult sample set;
establishing a plurality of fourth training sets in batches, and performing iterative training on the neural network model by using the fourth training sets until a stopping condition is met; and randomly extracting fourth training samples in the fourth training set from the first training set or/and the second training set or/and the hard sample set, wherein the serial numbers of the fourth training samples in a plurality of fourth training sets are different.
3. The model training method of claim 1, wherein constructing the third training set comprises:
reordering the numbers of second training samples in the second training set;
selecting a plurality of numbers from the reordered numbers as the numbers of a third training sample in the third training set;
and selecting the first training sample or/and the second training sample corresponding to the plurality of numbers from the first training set or/and the second training set as a third training sample, and constructing a third training set based on the third training sample.
4. The model training method of claim 2, wherein constructing the fourth training set comprises:
reordering the numbers of samples in the second training set;
selecting a plurality of numbers from the reordered numbers as the number of a fourth training sample in the fourth training set;
and selecting the first training samples or/and the second training samples or/and the difficult samples corresponding to the plurality of numbers from the first training set or/and the second training set or/and the difficult sample set as fourth training samples, and constructing a fourth training set based on the fourth training samples.
5. Model training method according to claim 4,
if a difficult sample corresponding to one of the plurality of numbers exists in the difficult sample set, selecting the difficult sample with the corresponding number from the difficult sample set;
if a plurality of difficult samples corresponding to one of the numbers exist in the difficult sample set, randomly selecting one difficult sample from the plurality of difficult samples;
and if the difficult samples corresponding to the plurality of numbers do not exist in the difficult sample set, taking out first training samples or/and second training samples corresponding to the plurality of numbers from the first training set or/and second training set.
6. The model training method according to claim 2,
when a first training sample and a second training sample in the fourth training set are identified through a neural network model, adding a sample with identification error or confidence coefficient lower than a set confidence coefficient into the difficult sample set;
when the difficult samples in the fourth training set are identified through a neural network model, deleting the correctly identified samples from the difficult sample set;
and when identifying the difficult samples in the fourth training set through a neural network model, deleting the samples with the identification times larger than the set times from the difficult sample set, and keeping the samples in the dirty data set.
7. The model training method according to claim 6, wherein if there is a first training sample belonging to a first training set in the difficult sample set, only a link of the first training sample is retained.
8. A model training apparatus, comprising:
the training set acquisition module is used for acquiring a first training set, and the first training set comprises a plurality of first training samples;
the first enhancement module is used for enhancing data of each first training sample in the first training set to obtain a second training set; the second training set comprises a plurality of second samples; wherein the first training sample has the same number as the corresponding second sample;
the training set construction module is used for constructing a plurality of third training sets in batches and carrying out a plurality of times of iterative training on the neural network model by utilizing the plurality of third training sets until a stopping condition is met; wherein, the third training samples in the third training set are randomly extracted from the first training set or/and the second training set, and the serial numbers of each third training sample in a plurality of third training sets are different;
the second enhancement module is used for performing data enhancement on the first training sample of the first training set in each process of training the neural network model to obtain an enhanced sample, and for replacing the second training sample in the second training set that has the same number as the enhanced sample with the enhanced sample; and the first training sample after data enhancement and the second training sample in the third training set have the same number.
9. A model training apparatus, comprising:
a memory for storing a computer program;
a processor for executing the memory-stored computer program to cause the apparatus to perform the model training method of any one of claims 1 to 7.
10. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, performs the model training method according to any one of claims 1 to 7.
CN202210205109.3A 2022-03-02 2022-03-02 Model training method, device, equipment and storage medium Pending CN114565086A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210205109.3A CN114565086A (en) 2022-03-02 2022-03-02 Model training method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210205109.3A CN114565086A (en) 2022-03-02 2022-03-02 Model training method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114565086A true CN114565086A (en) 2022-05-31

Family

ID=81718021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210205109.3A Pending CN114565086A (en) 2022-03-02 2022-03-02 Model training method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114565086A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642667A (en) * 2021-08-30 2021-11-12 重庆紫光华山智安科技有限公司 Enhancement strategy determination method and device, electronic equipment and storage medium
CN113642667B (en) * 2021-08-30 2024-02-02 重庆紫光华山智安科技有限公司 Picture enhancement strategy determination method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination