CN112215248A - Deep learning model training method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN112215248A
- Authority
- CN
- China
- Prior art keywords
- image set
- training image
- loss function
- training
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Abstract
The application provides a deep learning model training method and apparatus, an electronic device, and a storage medium, relating to the field of computer technology. A first training image set comprising a plurality of unlabeled images and a second training image set comprising a plurality of labeled images are used as inputs of a deep learning model, and the model parameters of the deep learning model are updated using a first loss function and a second loss function obtained from the first training image set and the second training image set, respectively. In this way, the model parameters are updated by combining unsupervised and supervised learning, so the training method is not limited to a specific model structure, which improves the generality of model training.
Description
Technical Field
The application relates to the field of computer technology, and in particular to a deep learning model training method and apparatus, an electronic device, and a storage medium.
Background
With the rise of deep learning, supervised learning has become more and more powerful, but its strong performance depends on massive amounts of manually labeled training data, and labeling massive image data consumes a great deal of manpower, material resources, and time. Unsupervised learning, on the other hand, requires no data labeling; but because it trains without real labels, it can rely only on a small amount of prior knowledge to design algorithms, or on a few artificially given labels carrying weak supervision information, so its performance falls significantly short of supervised learning.
Semi-supervised learning trains the model with labeled and unlabeled image data at the same time, which effectively reduces the model's demand for labeled image data. Although unlabeled image data cannot provide real labels for supervised learning, it reflects the data distribution of real images and helps the network learn a good representation of the image data. Mining the features hidden in the data distribution, or otherwise better characterizing the image data, helps the model learn the image recognition task.
However, in current semi-supervised learning schemes for deep learning models, such as schemes based on generative adversarial networks (GANs), the training scheme is highly specific to one model structure and is difficult to apply to the training of other models.
Disclosure of Invention
The application aims to provide a deep learning model training method and apparatus, an electronic device, and a storage medium that improve the generality of model training.
In order to achieve the above purpose, the embodiments of the present application employ the following technical solutions:
in a first aspect, an embodiment of the present application provides a deep learning model training method, where the method includes:
obtaining a first training image set and a second training image set, wherein the first training image set comprises a plurality of unlabeled images, and the second training image set comprises a plurality of labeled images;
respectively taking the first training image set and the second training image set as the input of the deep learning model to obtain a first loss function and a second loss function, wherein the first loss function is a loss function when the first training image set is used as the input for training the deep learning model, and the second loss function is a loss function when the second training image set is used as the input for training the deep learning model;
and updating the model parameters of the deep learning model according to the first loss function and the second loss function.
In a second aspect, an embodiment of the present application provides a deep learning model training apparatus, where the apparatus includes:
a processing module, configured to obtain a first training image set and a second training image set, where the first training image set includes a plurality of unlabeled images, and the second training image set includes a plurality of labeled images;
the processing module is further configured to take the first training image set and the second training image set as inputs of the deep learning model respectively to obtain a first loss function and a second loss function, where the first loss function is a loss function when the first training image set is used as an input to train the deep learning model, and the second loss function is a loss function when the second training image set is used as an input to train the deep learning model;
and the updating module is used for updating the model parameters of the deep learning model according to the first loss function and the second loss function.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory for storing one or more programs, and a processor. The one or more programs, when executed by the processor, implement the deep learning model training method described above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the deep learning model training method described above.
Compared with the prior art, when the deep learning model is trained, the model parameters are updated by combining unsupervised learning and supervised learning rather than by relying on model structure characteristics, so the training method is not limited to a specific model structure, which improves the generality of model training.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be regarded as limiting the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a schematic block diagram of a semi-supervised learning approach based on a generative adversarial network;
fig. 2 is a schematic structural block diagram of an electronic device provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of a deep learning model training method provided by an embodiment of the present application;
FIG. 4 is a schematic training scenario diagram of a deep learning model provided in an embodiment of the present application;
FIG. 5 is a schematic flow chart of the substeps of S203 in FIG. 3;
FIG. 6 is another schematic flow chart of the substeps of S203 in FIG. 3;
FIG. 7 is a schematic flow chart of the substeps of S205 of FIG. 3;
FIG. 8 is a schematic block diagram of a deep learning model training apparatus according to an embodiment of the present disclosure;
in the figure: 100-an electronic device; 101-a memory; 102-a processor; 103-a communication interface; 300-deep learning model training device; 301-a processing module; 302-update module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
In a semi-supervised learning scheme based on a generative adversarial network, for example, the main idea is to use the generative adversarial network to learn the distribution of the real input image data (which includes both labeled and unlabeled images) and encode that distribution information into the weight parameters of the discriminator; because the discriminator shares its weight parameters with the classifier, the distribution information is transferred to the classifier, enabling the classifier to better predict image categories.
Referring to fig. 1, fig. 1 is a schematic block diagram of a semi-supervised learning method based on a generative adversarial network; the framework mainly includes a generator, a discriminator, and a classifier.
The generator consists of a multi-layer deconvolution model and generates a fake image from N-dimensional random noise sampled from a Gaussian distribution. The discriminator is a multi-layer convolution model trained to distinguish real images from the fake images produced by the generator. The classifier shares its weight parameters and network structure with the discriminator; it receives real image data (both labeled and unlabeled) and outputs the category of each real image (assuming the real images fall into K categories).
Through the adversarial training of the generator and the discriminator, the fake images generated by the generator come closer and closer to real images. To distinguish real images from generated fakes more accurately, the discriminator learns the distribution of the real image data more precisely and encodes this distribution information in its weight parameters, so the classifier, by sharing weight parameters with the discriminator, can more accurately predict both the authenticity and the category of an image.
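By way of illustration only — the patent gives no code — a minimal PyTorch sketch of this arrangement might look as follows. All layer sizes, the noise dimension, and K = 10 categories are assumptions, and the weight sharing is realized here as a shared convolutional trunk with separate real/fake and category heads:

```python
import torch
import torch.nn as nn

class SharedTrunk(nn.Module):
    """Convolutional feature extractor whose weights are shared by the
    discriminator and the classifier (hypothetical sizes)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, x):
        return self.features(x)

class Generator(nn.Module):
    """Multi-layer deconvolution model: N-dimensional Gaussian noise in,
    fake image out."""
    def __init__(self, noise_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(noise_dim, 128, 4), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

trunk = SharedTrunk()
disc_head = nn.Linear(128, 1)    # discriminator head: real vs. fake score
cls_head = nn.Linear(128, 10)    # classifier head: K = 10 categories (assumed)

z = torch.randn(8, 100)                   # noise sampled from a Gaussian
fake = Generator()(z)                     # eight fake 16x16 images
real_fake_score = disc_head(trunk(fake))  # discriminator path over the shared trunk
```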
However, in such a training process, the inherent structural characteristics of the generative adversarial network limit the freedom in designing the classification network, making the training scheme highly specific: if the structure or even the type of the model changes, the scheme is difficult to apply to the training of other models.
In view of the above defects, a possible implementation provided by the embodiments of the present application is as follows: a first training image set containing a plurality of unlabeled images and a second training image set containing a plurality of labeled images are used as inputs of the deep learning model, and the model parameters of the deep learning model are updated using the first loss function and the second loss function obtained from the first training image set and the second training image set, respectively.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 2, fig. 2 is a schematic block diagram of an electronic device 100 according to an embodiment of the present disclosure. The electronic device 100 may be used as a device for training a deep learning model to implement the deep learning model training method provided in the embodiment of the present application, such as a mobile phone, a Personal Computer (PC), a tablet computer, a laptop computer, and the like.
The electronic device 100 includes a memory 101, a processor 102, and a communication interface 103, wherein the memory 101, the processor 102, and the communication interface 103 are electrically connected to each other directly or indirectly to enable data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.
The memory 101 may be used to store software programs and modules, such as program instructions/modules corresponding to the deep learning model training apparatus 300 provided in the embodiments of the present application, and the processor 102 executes the software programs and modules stored in the memory 101, thereby executing various functional applications and data processing. The communication interface 103 may be used for communicating signaling or data with other node devices.
The Memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 102 may be an integrated circuit chip having signal processing capabilities. The processor 102 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
It will be appreciated that the configuration shown in FIG. 2 is merely illustrative and that electronic device 100 may include more or fewer components than shown in FIG. 2 or have a different configuration than shown in FIG. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
The deep learning model training method provided in the embodiment of the present application is further described below by taking the electronic device 100 provided in fig. 2 as an exemplary execution subject.
Referring to fig. 3, fig. 3 is a schematic flowchart of a deep learning model training method according to an embodiment of the present application, including the following steps:
S201, a first training image set and a second training image set are obtained.
The first training image set comprises a plurality of unlabeled images, which are used for unsupervised learning of the deep learning model; the second training image set comprises a plurality of labeled images, which are used for supervised learning of the deep learning model.
S203, the first training image set and the second training image set are respectively used as the input of the deep learning model to obtain a first loss function and a second loss function.
The first loss function is a loss function when the first training image set is used as input for training the deep learning model, and the second loss function is a loss function when the second training image set is used as input for training the deep learning model.
S205, the model parameters of the deep learning model are updated according to the first loss function and the second loss function.
In the embodiment of the application, the deep learning model is trained by adopting two training image sets, namely a first training image set and a second training image set.
When the deep learning model is trained, the obtained first training image set and the second training image set are respectively used as the input of the deep learning model to obtain a first loss function and a second loss function.
For example, referring to fig. 4, fig. 4 is a schematic training scene diagram of a deep learning model provided in an embodiment of the present application. The deep learning model may consist of multiple layers of convolution models and can simultaneously receive input from both kinds of training pictures, i.e., the first training image set and the second training image set. Because the first training image set is a set of unlabeled images, when it serves as the input of the deep learning model the model performs unsupervised learning, and correspondingly the first loss function is the model's loss function under unsupervised learning. Because the second training image set is a set of labeled images, when it serves as the input the model performs supervised learning, and correspondingly the second loss function is the model's loss function under supervised learning.
It is worth noting that in some possible application scenarios of the embodiments of the present application, the first training image set and the second training image set may be fed to the deep learning model at the same time, and the first and second loss functions obtained after the two losses are calculated separately. In other possible application scenarios, the first training image set may first be used as the input to obtain the first loss function, and the second training image set then used as the input to obtain the second loss function. The embodiments of the present application do not limit this: the input order of the two image sets and the calculation order of the two loss functions depend on the specific application scenario. As long as the first and second training image sets are each used as input of the deep learning model and their losses are calculated separately, the first loss function and the second loss function can be obtained.
The model parameters of the deep learning model are therefore updated according to the first loss function and the second loss function obtained by training on the first and second training image sets respectively, so that unsupervised learning and supervised learning are combined when updating the model parameters, until the trained deep learning model converges.
Based on the above design, the deep learning model training method provided in the embodiments of the present application uses a first training image set including a plurality of unlabeled images and a second training image set including a plurality of labeled images as inputs of the deep learning model, and updates the model parameters using the first and second loss functions obtained from the respective sets. The model parameters are thus updated by combining unsupervised and supervised learning, so the training method is not limited to a specific model structure, which improves the generality of model training.
It should be noted that labeled image data generally depends on manually labeling a large number of unlabeled images, which consumes a great deal of manpower, material resources, and time.
Therefore, optionally, as a possible implementation, the number of unlabeled images in the first training image set is greater than the number of labeled images in the second training image set. That is, the embodiments of the present application use unsupervised training on a large amount of unlabeled picture data so that the deep learning model learns to extract features from picture data, which facilitates learning the image recognition task, reduces the amount of labeled picture data used, and reduces the manpower and material resources needed to train the deep learning model, making the model easier to train under a semi-supervised learning scheme.
In addition, to achieve the objective of obtaining the first loss function in S203, optionally, referring to fig. 5, fig. 5 is a schematic flow chart of the sub-step of S203 in fig. 3, as a possible implementation manner, S203 includes the following sub-steps:
S203-1, removing an image block from each picture in the first training image set to obtain a third training image set.
S203-2, the third training image set is used as the input of the deep learning model, and the restored training image set is obtained.
Each picture in the restored training image set is restored by the deep learning model from the corresponding picture with the missing image block.
S203-3, calculating loss according to the restored training image set and the first training image set to obtain a first loss function.
When the first training image set is used for unsupervised learning of the deep learning model, an image block is removed from each picture in the first training image set, and all pictures with removed image blocks are gathered to obtain the third training image set.
It should be noted that, in general, the first and third training image sets have the same number of pictures and correspond one to one; the only difference is that each picture in the third training image set is missing an image block compared with its counterpart in the first training image set.
The third training image set is then used as the input of the deep learning model, which restores each picture in it; all restored pictures are gathered to obtain the restored training image set.
Likewise, the restored training image set and the first training image set generally have the same number of pictures and correspond one to one; the difference is that the pictures in the first training image set are the originals, while the corresponding pictures in the restored training image set were restored from pictures missing image blocks.
It can be understood that, since each picture in the restored training image set is obtained by restoring the corresponding picture that is missing an image block, it may differ from the corresponding original picture in the first training image set; the deep learning model can learn to extract picture features from this difference between the restored and original pictures.
Therefore, the loss is calculated from the restored training image set and the first training image set, yielding the first loss function of the deep learning model under unsupervised learning.
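For illustration, a minimal sketch of this first-loss computation is given below. PyTorch, a zero-filled square patch, and a mean-squared-error reconstruction loss are assumptions — the patent fixes neither the masking scheme nor the loss form — and `model` is treated abstractly as a network that outputs a restored image:

```python
import torch
import torch.nn.functional as F

def remove_random_patch(images, patch=16):
    """Return copies of the input pictures, each missing one randomly
    positioned image block of a set size (patch size and zero fill
    value are assumptions)."""
    masked = images.clone()
    _, _, h, w = images.shape
    for i in range(images.size(0)):
        top = torch.randint(0, h - patch + 1, (1,)).item()
        left = torch.randint(0, w - patch + 1, (1,)).item()
        masked[i, :, top:top + patch, left:left + patch] = 0.0
    return masked

def first_loss(model, first_set_batch):
    """Unsupervised first loss: remove blocks, restore, then compare
    the restorations with the originals."""
    third_set_batch = remove_random_patch(first_set_batch)  # S203-1
    restored = model(third_set_batch)                       # S203-2
    return F.mse_loss(restored, first_set_batch)            # S203-3
```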
Based on the above design, in the deep learning model training method provided in the embodiments of the present application, image blocks are removed from the pictures in the training image set, the deep learning model restores the pictures with missing blocks, and the loss function is calculated from the restored pictures and the original pictures. In this way, during self-supervised learning the model learns to extract picture features from the difference between the restored pictures and the originals, and does not need to learn the distribution information of the images in the training image set, thereby avoiding deterioration of model performance due to inaccurate image distribution information.
Optionally, as a possible implementation of S203-1, the third training image set may be obtained as follows:
and randomly missing image blocks with the set size of each picture in the first training image set, and collecting all pictures without the image blocks to obtain a third training image set.
It can be understood that the above implementation of S203-1, randomly removing image blocks of a set size, is only one possibility; in other possible application scenarios of the embodiments of the present application, image blocks may be removed in other manners to obtain the third training image set, for example by removing a fixed image block at set coordinate points, as sketched below.
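Continuing the sketch above, the fixed-coordinate alternative could look like this (the coordinates and patch size are placeholders):

```python
def remove_fixed_patch(images, top=0, left=0, patch=16):
    """Remove a fixed image block at set coordinate points from every
    picture (coordinates and size here are assumptions)."""
    masked = images.clone()
    masked[:, :, top:top + patch, left:left + patch] = 0.0
    return masked
```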
On the other hand, to achieve the purpose of obtaining the second loss function in S203, optionally, referring to fig. 6, fig. 6 is another schematic flow chart of the sub-step of S203 in fig. 3, as another possible implementation manner, S203 includes the following sub-steps:
S203-6, the second training image set is used as the input of the deep learning model to obtain the training label corresponding to each picture in the second training image set.
S203-7, calculating loss according to the training label and the actual label corresponding to each picture in the second training image set to obtain a second loss function.
When the second training image set is used for supervised learning of the deep learning model, the second training image set is used as the input of the deep learning model, which predicts a label for each picture in the set; these predictions are the training labels corresponding to the pictures in the second training image set.
The training label the deep learning model predicts for a picture may differ from that picture's actual label. Therefore, in the embodiments of the present application, the loss is calculated from the training label predicted for each picture in the second training image set and the actual label corresponding to that picture, yielding the second loss function of the deep learning model under supervised learning.
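A corresponding sketch of the second-loss computation might be as follows; that the actual labels are class indices and that a cross-entropy loss is used are assumptions the patent does not mandate:

```python
import torch.nn.functional as F

def second_loss(model, second_set_batch, actual_labels):
    """Supervised second loss: predict a training label for each
    picture, then compare it with the actual label."""
    training_labels = model(second_set_batch)               # S203-6: class logits
    return F.cross_entropy(training_labels, actual_labels)  # S203-7
```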
To implement the above S205, optionally, referring to fig. 7, fig. 7 is a schematic flowchart of the sub-steps of S205 in fig. 3, and as a possible implementation, S205 includes the following sub-steps:
S205-1, the first loss function and the second loss function are each weighted by their corresponding scaling ratio and summed to obtain a training loss sum.
S205-2, the model parameters of the deep learning model are updated according to the gradient of the training loss sum with respect to the model parameters.
When the first loss function and the second loss function are used to update the model parameters of the deep learning model, each is weighted by its corresponding scaling ratio and the two are summed to obtain the training loss sum.
The gradient of the training loss sum with respect to the model parameters of the deep learning model is then computed, and the model parameters are updated according to this gradient.
The training loss sum contains a part from the first loss function and a part from the second loss function; that is, it carries the learning information of the deep learning model under unsupervised learning as well as under supervised learning. When the model parameters are updated and iterated, unsupervised learning and supervised learning therefore optimize the deep learning model simultaneously and the two learning modes promote each other: the deep learning model learns to extract picture features from the first training image set of unlabeled pictures, which assists the model in recognizing picture data and improves its generalization.
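Putting S205-1 and S205-2 together, one possible training step is sketched below, reusing the `first_loss` and `second_loss` sketches from earlier; the scaling ratios `alpha` and `beta` and the optimizer choice are assumptions. Note that `model` is treated abstractly throughout: a concrete implementation would likely give a shared trunk a reconstruction head for the first loss and a classification head for the second.

```python
import torch

def train_step(model, optimizer, first_batch, second_batch, actual_labels,
               alpha=1.0, beta=1.0):
    """One parameter update: weight and sum the two losses (S205-1),
    then take a gradient step on the model parameters (S205-2)."""
    loss_sum = (alpha * first_loss(model, first_batch)
                + beta * second_loss(model, second_batch, actual_labels))
    optimizer.zero_grad()
    loss_sum.backward()  # gradient of the training loss sum w.r.t. the parameters
    optimizer.step()     # update the model parameters
    return loss_sum.item()
```

A training loop would call `train_step` repeatedly over batches drawn from both image sets — for example with `torch.optim.SGD(model.parameters(), lr=0.01)` as the optimizer — until the model converges.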
Optionally, as a possible implementation manner, the scaling ratio of each of the first loss function and the second loss function is 1.
It is worth noting that, in other possible implementations of the embodiments of the present application, other values may be adopted as the scaling ratios of the first and second loss functions, for example 0.5 for the first loss function and 0.8 for the second, or 0.6 and 0.9 respectively; the choice depends on the specific application scenario.
In addition, in some possible implementations of the embodiments of the present application, the scaling ratios corresponding to the first and second loss functions may be pre-stored in the training device before the deep learning model is trained; in other possible implementations they may be received as user input while the deep learning model is being trained.
Referring to fig. 8, based on the same inventive concept as the deep learning model training method provided in the foregoing embodiments, fig. 8 is a schematic structural diagram of a deep learning model training apparatus 300 provided in an embodiment of the present application; the apparatus 300 includes a processing module 301 and an updating module 302.
The processing module 301 is configured to obtain a first training image set and a second training image set, where the first training image set includes a plurality of unlabeled images, and the second training image set includes a plurality of labeled images;
the processing module 301 is further configured to take the first training image set and the second training image set as inputs of the deep learning model, respectively, to obtain a first loss function and a second loss function, where the first loss function is a loss function when the first training image set is used as an input for training the deep learning model, and the second loss function is a loss function when the second training image set is used as an input for training the deep learning model;
the updating module 302 is configured to update the model parameters of the deep learning model according to the first loss function and the second loss function.
Optionally, as a possible implementation manner, when the processing module 301 takes the first training image set as an input of the deep learning model to obtain the first loss function, it is specifically configured to:
removing an image block from each picture in the first training image set to obtain a third training image set;
taking the third training image set as the input of the deep learning model to obtain a restored training image set, where each picture in the restored training image set is restored by the deep learning model from the corresponding picture with the missing image block;
and calculating loss according to the restored training image set and the first training image set to obtain a first loss function.
Optionally, as a possible implementation, when the processing module 301 removes an image block from each picture in the first training image set to obtain the third training image set, it is specifically configured to:
and randomly missing image blocks with the set size of each picture in the first training image set, and collecting all pictures without the image blocks to obtain a third training image set.
Optionally, as a possible implementation manner, the number of unlabeled images included in the first training image set is greater than the number of labeled images included in the second training image set.
Optionally, as a possible implementation manner, when the processing module 301 takes the second training image set as an input of the deep learning model to obtain the second loss function, the processing module is specifically configured to:
taking the second training image set as the input of the deep learning model to obtain a training label corresponding to each picture in the second training image set;
and calculating loss according to the training label and the actual label corresponding to each picture in the second training image set to obtain a second loss function.
Optionally, as a possible implementation, when the updating module 302 updates the model parameters of the deep learning model according to the first loss function and the second loss function, it is specifically configured to:
respectively carrying out weighted summation on the first loss function and the second loss function according to corresponding scaling ratios to obtain a training loss sum;
and updating the model parameters of the deep learning model according to the gradient of the training loss sum with respect to the model parameters.
Optionally, as a possible implementation manner, the scaling ratio of each of the first loss function and the second loss function is 1.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: u disk, removable hard disk, read only memory, random access memory, magnetic or optical disk, etc. for storing program codes.
In summary, the deep learning model training method and apparatus, electronic device, and storage medium provided in the embodiments of the present application use a first training image set including a plurality of unlabeled images and a second training image set including a plurality of labeled images as inputs of the deep learning model, and update the model parameters of the deep learning model using the first and second loss functions obtained from the first and second training image sets, respectively. Model parameters are thus updated by combining unsupervised and supervised learning, so the training method is not limited to a specific model structure, which improves the generality of model training.
Moreover, after image blocks are removed from the pictures in the training image set, the deep learning model restores the pictures with missing blocks, and the loss function is calculated from the restored pictures and the original pictures. During self-supervised learning the model therefore learns to extract picture features from the difference between the restored pictures and the originals, and need not learn the distribution information of the images in the training image set, avoiding deterioration of model performance due to inaccurate image distribution information.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Claims (10)
1. A deep learning model training method, the method comprising:
obtaining a first training image set and a second training image set, wherein the first training image set comprises a plurality of unlabeled images, and the second training image set comprises a plurality of labeled images;
respectively taking the first training image set and the second training image set as the input of the deep learning model to obtain a first loss function and a second loss function, wherein the first loss function is a loss function when the first training image set is used as the input for training the deep learning model, and the second loss function is a loss function when the second training image set is used as the input for training the deep learning model;
and updating the model parameters of the deep learning model according to the first loss function and the second loss function.
2. The method of claim 1, wherein the step of deriving a first loss function using the first set of training images as input to the deep learning model comprises:
removing image blocks from each picture in the first training image set to obtain a third training image set;
taking the third training image set as the input of the deep learning model to obtain a restored training image set, wherein each picture in the restored training image set is restored by the deep learning model from the corresponding picture with the missing image block;
and calculating loss according to the restored training image set and the first training image set to obtain the first loss function.
3. The method of claim 2, wherein the step of removing image blocks from each picture in the first training image set to obtain a third training image set comprises:
randomly removing an image block of a set size from each picture in the first training image set, and gathering all pictures with removed image blocks to obtain the third training image set.
4. The method of claim 1, wherein the first set of training images contains a greater number of unlabeled images than the second set of training images contains labeled images.
5. The method of claim 1, wherein the step of deriving a second loss function using the second set of training images as input to the deep learning model comprises:
taking the second training image set as the input of the deep learning model to obtain a training label corresponding to each picture in the second training image set;
and calculating loss according to the training label and the actual label corresponding to each picture in the second training image set to obtain the second loss function.
6. The method of any one of claims 1-5, wherein updating model parameters of the deep learning model based on the first loss function and the second loss function comprises:
respectively carrying out weighted summation on the first loss function and the second loss function according to corresponding scaling ratios to obtain a training loss sum;
and updating the model parameters of the deep learning model according to the gradient of the training loss sum with respect to the model parameters.
7. The method of claim 6, wherein the scaling ratio for each of the first loss function and the second loss function is 1.
8. An apparatus for deep learning model training, the apparatus comprising:
a processing module, configured to obtain a first training image set and a second training image set, where the first training image set includes a plurality of unlabeled images, and the second training image set includes a plurality of labeled images;
the processing module is further configured to take the first training image set and the second training image set as inputs of the deep learning model respectively to obtain a first loss function and a second loss function, where the first loss function is a loss function when the first training image set is used as an input to train the deep learning model, and the second loss function is a loss function when the second training image set is used as an input to train the deep learning model;
and the updating module is used for updating the model parameters of the deep learning model according to the first loss function and the second loss function.
9. An electronic device, comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910625339.3A CN112215248A (en) | 2019-07-11 | 2019-07-11 | Deep learning model training method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910625339.3A CN112215248A (en) | 2019-07-11 | 2019-07-11 | Deep learning model training method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112215248A true CN112215248A (en) | 2021-01-12 |
Family
ID=74048160
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910625339.3A Pending CN112215248A (en) | 2019-07-11 | 2019-07-11 | Deep learning model training method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112215248A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112784749A (en) * | 2021-01-22 | 2021-05-11 | 北京百度网讯科技有限公司 | Target model training method, target object identification method, target model training device, target object identification device and medium |
CN113052025A (en) * | 2021-03-12 | 2021-06-29 | 咪咕文化科技有限公司 | Training method of image fusion model, image fusion method and electronic equipment |
CN113111729A (en) * | 2021-03-23 | 2021-07-13 | 广州大学 | Training method, recognition method, system, device and medium of personnel recognition model |
CN113298135A (en) * | 2021-05-21 | 2021-08-24 | 南京甄视智能科技有限公司 | Model training method and device based on deep learning, storage medium and equipment |
CN113610911A (en) * | 2021-07-27 | 2021-11-05 | Oppo广东移动通信有限公司 | Training method and device of depth prediction model, medium and electronic equipment |
CN114282615A (en) * | 2021-12-27 | 2022-04-05 | 华中科技大学 | Intelligent roadbed compactness identification method and system |
WO2022151591A1 (en) * | 2021-01-18 | 2022-07-21 | 平安科技(深圳)有限公司 | Coupled multitask feature extraction method and apparatus, electronic device, and storage medium |
CN115063813A (en) * | 2022-07-05 | 2022-09-16 | 深圳大学 | Training method and training device of alignment model aiming at character distortion |
CN116152577A (en) * | 2023-04-19 | 2023-05-23 | 深圳须弥云图空间科技有限公司 | Image classification method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108122197A (en) * | 2017-10-27 | 2018-06-05 | 江西高创保安服务技术有限公司 | A kind of image super-resolution rebuilding method based on deep learning |
CN108416370A (en) * | 2018-02-07 | 2018-08-17 | 深圳大学 | Image classification method, device based on semi-supervised deep learning and storage medium |
CN109034205A (en) * | 2018-06-29 | 2018-12-18 | 西安交通大学 | Image classification method based on the semi-supervised deep learning of direct-push |
US20190197368A1 (en) * | 2017-12-21 | 2019-06-27 | International Business Machines Corporation | Adapting a Generative Adversarial Network to New Data Sources for Image Classification |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108122197A (en) * | 2017-10-27 | 2018-06-05 | 江西高创保安服务技术有限公司 | A kind of image super-resolution rebuilding method based on deep learning |
US20190197368A1 (en) * | 2017-12-21 | 2019-06-27 | International Business Machines Corporation | Adapting a Generative Adversarial Network to New Data Sources for Image Classification |
CN108416370A (en) * | 2018-02-07 | 2018-08-17 | 深圳大学 | Image classification method, device based on semi-supervised deep learning and storage medium |
CN109034205A (en) * | 2018-06-29 | 2018-12-18 | 西安交通大学 | Image classification method based on the semi-supervised deep learning of direct-push |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022151591A1 (en) * | 2021-01-18 | 2022-07-21 | 平安科技(深圳)有限公司 | Coupled multitask feature extraction method and apparatus, electronic device, and storage medium |
CN112784749A (en) * | 2021-01-22 | 2021-05-11 | 北京百度网讯科技有限公司 | Target model training method, target object identification method, target model training device, target object identification device and medium |
CN112784749B (en) * | 2021-01-22 | 2023-11-10 | 北京百度网讯科技有限公司 | Training method of target model, recognition method, device and medium of target object |
CN113052025A (en) * | 2021-03-12 | 2021-06-29 | 咪咕文化科技有限公司 | Training method of image fusion model, image fusion method and electronic equipment |
CN113111729A (en) * | 2021-03-23 | 2021-07-13 | 广州大学 | Training method, recognition method, system, device and medium of personnel recognition model |
CN113111729B (en) * | 2021-03-23 | 2023-08-18 | 广州大学 | Training method, recognition method, system, device and medium for personnel recognition model |
CN113298135A (en) * | 2021-05-21 | 2021-08-24 | 南京甄视智能科技有限公司 | Model training method and device based on deep learning, storage medium and equipment |
CN113298135B (en) * | 2021-05-21 | 2023-04-18 | 小视科技(江苏)股份有限公司 | Model training method and device based on deep learning, storage medium and equipment |
CN113610911A (en) * | 2021-07-27 | 2021-11-05 | Oppo广东移动通信有限公司 | Training method and device of depth prediction model, medium and electronic equipment |
CN114282615A (en) * | 2021-12-27 | 2022-04-05 | 华中科技大学 | Intelligent roadbed compactness identification method and system |
CN114282615B (en) * | 2021-12-27 | 2024-09-10 | 华中科技大学 | Intelligent recognition method and system for roadbed compactness |
CN115063813A (en) * | 2022-07-05 | 2022-09-16 | 深圳大学 | Training method and training device of alignment model aiming at character distortion |
CN115063813B (en) * | 2022-07-05 | 2023-03-24 | 深圳大学 | Training method and training device of alignment model aiming at character distortion |
CN116152577A (en) * | 2023-04-19 | 2023-05-23 | 深圳须弥云图空间科技有限公司 | Image classification method and device |
CN116152577B (en) * | 2023-04-19 | 2023-08-29 | 深圳须弥云图空间科技有限公司 | Image classification method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |