CN111325281A - Deep learning network training method and device, computer equipment and storage medium - Google Patents
Deep learning network training method and device, computer equipment and storage medium
- Publication number
- CN111325281A (application number CN202010146486.5A)
- Authority
- CN
- China
- Prior art keywords
- training
- image
- images
- deep learning
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
Abstract
The application relates to a training method and apparatus for a deep learning network, a computer device, and a storage medium. The method comprises the following steps: loading a training image and a labeled image, the labeled image being obtained by marking the feature objects in the training image; performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks; performing feature extraction processing on the image blocks, and then mapping the processed image blocks to the labeled image; and performing parameter-adjustment learning on the mapped image blocks and labeled image to obtain the parameters of the deep learning network. With this method, training efficiency can be improved while the accuracy of the recognition result is ensured.
Description
Technical Field
The present application relates to the field of image processing technology, and in particular to a training method and apparatus for a deep learning network, a computer device, and a storage medium.
Background
In the farming industry, deep learning networks can be used to identify animals, count animals, or analyze animal behavior. In the training process of a deep learning network, however, the collected animal images used as training samples vary in size, for example 2000 × 3000 × 3, 3000 × 2000 × 3, or 1000 × 600 × 3; not only are the lengths and widths unequal, they may differ greatly. Because of the fully connected layer in a deep learning regression model, the training samples loaded in a batch must share a consistent size (such as 896 × 896 × 3), a process that can be understood as size normalization. Size normalization, however, loses image definition, and small objects in an image may be lost entirely; that is, the accuracy of feature extraction in later training is affected, and the accuracy of the recognition result is reduced. Although training with a conventional image segmentation method can preserve the accuracy of the recognition result, the segmented image regions must still be normalized, which reduces training speed. Therefore, when training a deep learning network, how to improve training efficiency while ensuring the accuracy of the recognition result is a problem that needs to be solved.
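As an illustration only (not from the patent; the function, sizes, and single-pixel "object" are assumptions), the following NumPy sketch shows how crude size normalization can make a small object vanish completely:

```python
import numpy as np

def downsample(image, factor):
    """Crude nearest-neighbour size normalization: keep every `factor`-th pixel."""
    return image[::factor, ::factor]

# An 8 x 8 image containing a single small (one-pixel) object at (1, 1).
img = np.zeros((8, 8))
img[1, 1] = 1.0

# Normalize to 4 x 4: the object falls between the sampled pixels and is lost.
small = downsample(img, 2)
```

After downsampling, `small` contains no trace of the object, mirroring the accuracy loss described above.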
Disclosure of Invention
In view of the above, it is necessary to provide a training method and apparatus for a deep learning network, a computer device, and a storage medium that can improve training efficiency while ensuring the accuracy of the recognition result.
In a first aspect, a training method for a deep learning network is provided, where the method includes:
loading a training image and a labeled image, the labeled image being obtained by marking the feature objects in the training image;
performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
performing feature extraction processing on the plurality of image blocks, and then mapping the processed image blocks to the labeled image; and
performing parameter-adjustment learning on the mapped image blocks and labeled image to obtain parameters of the deep learning network.
In an embodiment, when there are multiple training images, each training image corresponds to a different image block set, and each image block set contains the same number of image blocks.
In an embodiment, the step of performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks includes:
acquiring a preset sliding step size and the block size; and
performing sliding segmentation processing on the training image according to the sliding step size and the block size to obtain the plurality of image blocks.
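The two steps above can be sketched as follows (a minimal NumPy illustration, not the patent's implementation; the function name and the clamping of the last window to the image border are assumptions):

```python
import numpy as np

def sliding_blocks(image, block_size, step):
    """Slide a block_size window over `image` with the given step size,
    clamping the last window to the border so every pixel is covered."""
    h, w = image.shape[:2]
    bh, bw = block_size
    ys = sorted(set(list(range(0, h - bh + 1, step)) + [h - bh]))
    xs = sorted(set(list(range(0, w - bw + 1, step)) + [w - bw]))
    return [image[y:y + bh, x:x + bw] for y in ys for x in xs]
```

With a step equal to the block size the blocks tile the image exactly; with a smaller step they overlap, which is one way to keep adjacent blocks continuous.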
In one embodiment, when there are multiple training images, each training image corresponds to a different image block set, and the stitched size obtained by stitching the image blocks in each image block set is the same as the size of the training image before segmentation.
In one embodiment, before the loading the training image, the method further includes:
acquiring a plurality of training images;
randomly shuffling the plurality of training images using a shuffle module;
classifying the shuffled training images into batches to obtain multiple batches of training images; and
loading the training images batch by batch for training.
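The shuffle-then-batch steps above might be sketched as follows (an illustrative assumption using Python's standard `random` module as the "shuffle module"; the function name and seeding are not from the patent):

```python
import random

def make_batches(image_paths, batch_size, seed=None):
    """Randomly shuffle the list of training images, then split it into batches."""
    order = list(image_paths)
    random.Random(seed).shuffle(order)   # the "shuffle module" step
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
```

Each returned batch can then be loaded for one training step, giving the batch training described in the embodiment.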
In one embodiment, further comprising:
if the training images of every batch have been loaded for training, determining that a round of training is completed; and
randomly shuffling the indexes of the plurality of training images using a shuffle module, and performing the next round of training.
In one embodiment, after the step of determining that the training round is completed, the method further includes:
updating the number of training rounds;
and if the number of training rounds has not reached a preset number of rounds, randomly shuffling the indexes of the training images using a shuffle module, and performing the next round of training.
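The round-counting embodiment above can be sketched as a loop (illustrative only; the function names, the per-batch callback, and the seeding are assumptions, not the patent's implementation):

```python
import random

def train_rounds(indices, num_epochs, train_one_batch, batch_size=16, seed=0):
    """Run `num_epochs` rounds of training; reshuffle the index list before
    every round and feed indexes to `train_one_batch` batch by batch."""
    rng = random.Random(seed)
    order = list(indices)
    epoch = 0
    while epoch < num_epochs:        # stop once the preset round count is reached
        rng.shuffle(order)           # shuffle the indexes, not the image data
        for i in range(0, len(order), batch_size):
            train_one_batch(order[i:i + batch_size])
        epoch += 1                   # update the number of training rounds
    return epoch
```

Reshuffling only the indexes keeps the loaded images in place while still varying the batch composition between rounds.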
In a second aspect, an apparatus for training a deep learning network is provided, the apparatus comprising:
an image loading module, used for loading the training image and the labeled image, the labeled image being obtained by marking the feature objects in the training image;
a block segmentation module, used for performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
a mapping processing module, used for performing feature extraction processing on the image blocks and then mapping the processed image blocks to the labeled image; and
a parameter-adjustment learning module, used for performing parameter-adjustment learning on the mapped image blocks and labeled image to obtain parameters of the deep learning network.
In a third aspect, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
loading a training image and a labeled image, the labeled image being obtained by marking the feature objects in the training image;
performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
performing feature extraction processing on the plurality of image blocks, and then mapping the processed image blocks to the labeled image; and
performing parameter-adjustment learning on the mapped image blocks and labeled image to obtain parameters of the deep learning network.
In a fourth aspect, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
loading a training image and a labeled image, the labeled image being obtained by marking the feature objects in the training image;
performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
performing feature extraction processing on the plurality of image blocks, and then mapping the processed image blocks to the labeled image; and
performing parameter-adjustment learning on the mapped image blocks and labeled image to obtain parameters of the deep learning network.
With the training method and apparatus of the deep learning network, the computer device, and the storage medium, the training image and the labeled image are loaded first; the training image is then divided into image blocks according to the preset block size; after feature extraction processing, the image blocks are mapped to the labeled image; and parameter-adjustment learning is performed on the mapped image blocks and labeled image to obtain the parameters of the deep learning network. Loss of image information is thereby avoided and the accuracy of the recognition result is ensured. Because the training image and the labeled image are loaded first and block segmentation is performed afterwards, the image blocks obtained after segmentation do not need to be loaded, and the labeled image does not need to be correspondingly block-segmented, which saves the loading time and block segmentation time of the deep learning network and improves training efficiency.
Drawings
FIG. 1 is a diagram illustrating an internal structure of a computer device according to an embodiment;
FIG. 2 is a schematic flow chart illustrating a method for training a deep learning network according to an embodiment;
FIG. 3 is a flowchart illustrating a method for training a deep learning network according to another embodiment;
FIG. 4 is a block diagram of a training apparatus for a deep learning network according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The training method of the deep learning network provided by the application can be applied to the computer device shown in fig. 1. The computer device may be a server, and its internal structure may be as shown in fig. 1. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing training data of the deep learning network. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement a training method for a deep learning network.
Those skilled in the art will appreciate that the architecture shown in fig. 1 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or arrange the components differently.
In the farming and grazing industry, images of animals can be collected and used to train a deep learning network; the trained network can then identify the animals, count them, or analyze their behavior. During deep learning network training, however, it is difficult to improve training efficiency while ensuring the accuracy of the recognition result. In the training method of the deep learning network provided by the application, after the deep learning network obtains the training images and the labeled images, block segmentation is performed according to a preset block size to obtain image blocks, and the feature-extracted image blocks are mapped to the labeled images, so that the parameters of the deep learning network are obtained and the training efficiency of the deep learning network is improved while the accuracy of the recognition result is ensured.
The training image can be understood as an image used for training the deep learning network; correspondingly, a parameter corresponding to the training image may be called a training parameter. The labeled image can be understood as a standard image used for reference in the training process; correspondingly, a parameter corresponding to the labeled image may be called a standard parameter. In the feature learning process of the deep learning network, the training parameters are continuously adjusted to approach the standard parameters.
It can be understood that the training method of the deep learning network provided by the application can also be used for identifying automobiles and the like; identifying animals does not limit the training method provided by the application.
In an embodiment, as shown in fig. 2, a training method for a deep learning network is provided. This embodiment is described as applied to a server; the method may also be applied to a system including a terminal and a server and implemented through interaction between the terminal and the server, for example as follows: the terminal sends the training images and the labeled images to the server, and the deep learning network on the server trains by loading the training images and the labeled images. In this embodiment, the method includes the following steps:
step S202, loading a training image and a marking image.
The labeled image (also referred to as the "ground truth") is an image obtained by marking the feature objects in a training image. The feature objects may differ for deep learning networks with different purposes: for example, in a deep learning network for identifying animals, the feature objects may be pigs, cows, and the like; in a deep learning network for identifying automobiles, the feature objects may be off-road vehicles, buses, trucks, and the like. Feature objects in the training image may be marked by processing such as drawing bounding boxes and annotating. Taking a deep learning network for identifying automobiles as an example, bounding boxes are drawn around the automobiles in the training image to mark them, and the boxed training image is used as the labeled image.
In this step, the deep learning network loads a training image and its corresponding labeled image. It can be understood that, since each training image has a corresponding labeled image, the deep learning network loads labeled images in the same number as the training images; for example, with 16 training images and their 16 corresponding labeled images, the deep learning network needs to load 16 training images and 16 labeled images.
Step S204, performing block segmentation on the training image according to the preset block size to obtain a plurality of image blocks.
After the deep learning network loads the training image, the training image resides in the deep learning network; the server then obtains the preset block size and performs block segmentation on the training image to obtain a plurality of image blocks.
The block size can be determined according to the size that the full-convolution module of the deep learning network can process. For example, if the maximum size that the full-convolution module can process is a × b, the block size may be any size within a × b; further, to further increase training speed, the maximum size that the full-convolution module can process may be taken directly as the block size, in which case a × b is used as the block size.
Since an image can be regarded as composed of many pixels, an image block obtained by segmenting a training image can be understood as an image region between the pixel level and the image level. Block segmentation of the training image may be implemented with a pixel-level method (for example, a patch method, in which case the resulting image blocks may be called patch images). When block segmentation is performed with a pixel-level method, the block size can be understood as the number of pixels contained in each image block obtained by the segmentation.
To ensure the accuracy of the recognition result, an embodiment of the present application trains with multiple training images. Specifically, after the deep learning network loads multiple training images, each training image is segmented according to the preset block size to obtain a corresponding image block set; because every training image is segmented with the same preset block size, each image block set contains the same number of image blocks. It should be noted that the deep learning network may load the labeled images corresponding to the training images at the same time. Take 16 training images as an example, where segmenting one training image with the preset block size yields 4 image blocks; those 4 image blocks can be regarded as one image block set. After the deep learning network loads the 16 training images and the 16 labeled images, block segmentation of each training image produces 16 image block sets, that is, 64 image blocks. In this case, the deep learning network does not need to perform block segmentation on the labeled images.
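A minimal sketch of the per-image block segmentation described above (illustrative only; the function name, the tiny 4 × 4 arrays, and the 2 × 2 grid are assumptions — real training images would be far larger):

```python
import numpy as np

def split_into_blocks(image, block_size):
    """Split an image whose sides are multiples of block_size into equal blocks."""
    bh, bw = block_size
    h, w = image.shape[:2]
    return [image[y:y + bh, x:x + bw]
            for y in range(0, h, bh) for x in range(0, w, bw)]
```

Applying this to 16 loaded images, each split into 4 blocks, gives the 16 image block sets (64 blocks) of the example, while the labeled images stay unsegmented.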
Step S206, performing feature extraction processing on the plurality of image blocks, and then mapping the processed image blocks to the labeled image.
Before block segmentation, each labeled image has a corresponding training image; after the block segmentation of step S204, the training image is divided into a plurality of image blocks, and feature extraction processing is performed on those image blocks. When there are image block sets for multiple training images, the server maps the feature-extracted image blocks to the corresponding labeled images; that is, feature extraction is performed on the image blocks in each image block set to obtain the feature-extracted image block sets, and each feature-extracted image block set is mapped to its labeled image so that the image blocks in the set and the labeled image form a mapping relation. Taking 16 image block sets of 4 image blocks each as an example: the server segments the 16 training images to obtain 16 image block sets, performs feature extraction on every image block to obtain 16 feature-extracted image block sets, and then maps each feature-extracted image block set to its corresponding labeled image, so that the 4 image blocks in each set form a mapping relation with the same labeled image.
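The set-to-label mapping described above can be sketched as follows (an illustrative assumption — the function name and the use of plain Python pairs stand in for whatever internal representation the network actually uses):

```python
def map_blocks_to_labels(block_sets, labeled_images):
    """Pair every (feature-extracted) block in a set with the single labeled
    image of the training image that set came from."""
    assert len(block_sets) == len(labeled_images)
    mapping = []
    for blocks, label in zip(block_sets, labeled_images):
        mapping.extend((block, label) for block in blocks)
    return mapping
```

With 16 sets of 4 blocks, this yields 64 pairs, and all 4 blocks of a set share the same labeled image — the mapping relation of the example.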
Step S208, performing parameter-adjustment learning on the mapped image blocks and labeled images to obtain the parameters of the deep learning network.
The deep learning network processes the image blocks to obtain the corresponding training parameters and processes the labeled images to obtain the standard parameters, so parameter-adjustment learning can be understood as the process of adjusting the training parameters to approach the standard parameters. After the image blocks and the labeled images have been mapped, the server performs parameter-adjustment learning with the mapped image blocks and labeled images so that the training parameters approach the standard parameters, thereby determining the parameters of the deep learning network.
In one possible approach, the training images are first block-segmented and the resulting image blocks are then loaded into the deep learning network for training. For example, block-segmenting 16 training images (each into 4 image blocks) yields 64 image blocks, all of which must be loaded into the deep learning network; this obviously increases the time the network spends loading images and thus reduces training speed. This approach also increases the space needed for training: because during training the deep learning network can perform parameter-adjustment learning only when every loaded image has a corresponding labeled counterpart, if the training images are segmented first and the image blocks loaded, then each image block needs a corresponding labeled block — that is, the labeled images must undergo the same block segmentation to produce labeled blocks, and the image blocks and labeled blocks are loaded into the deep learning network together. Taking 16 training images with 4 image blocks each as an example: block-segmenting the 16 training images and the 16 labeled images yields 64 image blocks and 64 labeled blocks, all of which are loaded into the deep learning network, so the training space must be large enough to accommodate 64 image blocks and 64 labeled blocks.
Therefore, in the approach of first block-segmenting the training images and then loading the resulting image blocks into the deep learning network for training, although the loss of image information can be avoided and accuracy ensured, the number of loaded images increases, which lengthens loading time, reduces training speed, and enlarges the space required for training.
Compared with the segment-first-then-load approach, the training method of the deep learning network provided by the application loads the training image and the labeled image, block-segments the training image according to the preset block size to obtain image blocks, performs feature extraction on the image blocks, maps the feature-extracted image blocks to the labeled image, and performs parameter-adjustment learning on the mapped image blocks and labeled image to obtain the parameters of the deep learning network. Loss of image information is avoided and the accuracy of the recognition result is ensured; and because loading happens first and block segmentation afterwards, the image blocks obtained after segmentation do not need to be loaded and the labeled image does not need to be correspondingly segmented, which saves the loading time and block segmentation time of the deep learning network and improves training efficiency. Moreover, since the labeled image does not need block segmentation, the space required during training is saved.
It should be noted that, in step S204, the server presets the block size, which improves training speed while ensuring the accuracy of the recognition result. If, instead, the server block-segmented the training image into arbitrary sizes, then, because the full convolution network has a requirement on the processable size, the server would also have to normalize the arbitrarily sized image blocks, which increases training time and thus reduces training speed; and because normalization loses part of the image information, the accuracy of the recognition result would be hard to ensure.
In another possible case, if the training image is segmented according to the block size alone, some image regions may fail to fall into any segmented block — that is, adjacent image blocks may be discontinuous — so the image blocks cannot contain all the information of the training image, and the accuracy of the recognition result is reduced. One embodiment of the application therefore performs block segmentation with both a sliding step size and a block size, so that the image blocks obtained after segmentation contain the information of the whole training image. Specifically, the server acquires the preset sliding step size and block size and performs sliding segmentation processing on the training image accordingly to obtain a plurality of image blocks. It can be understood that these image blocks can then be directly stitched to recover the original training image; that is, the stitched size of the image blocks equals the size of the training image before segmentation. Further, when there are multiple training images, each training image corresponds to a different image block set, and the stitched size of the image blocks in each image block set equals the size of that training image before segmentation.
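The stitching property described above — blocks reassembling exactly into the original training image — can be checked with a small sketch (illustrative only; the function name, the row-major grid convention, and the toy 4 × 4 image are assumptions):

```python
import numpy as np

def stitch(blocks, grid_shape):
    """Reassemble a row-major list of equally sized blocks into one image."""
    rows, cols = grid_shape
    return np.block([[blocks[r * cols + c] for c in range(cols)]
                     for r in range(rows)])

# A 4 x 4 image cut into a 2 x 2 grid of non-overlapping blocks.
img = np.arange(16).reshape(4, 4)
blocks = [img[:2, :2], img[:2, 2:], img[2:, :2], img[2:, 2:]]
```

When the sliding step equals the block size and the blocks tile the image, `stitch(blocks, (2, 2))` reproduces `img` exactly, matching the "stitched size equals original size" condition.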
When a plurality of training images (which can be regarded as an image training set) are used to train the deep learning network, they can be divided into multiple batches for loading, realizing batch training. Once all the training images have been loaded — that is, once the deep learning network has loaded and trained on every batch — one round of training is determined to be complete, and the server can begin the next round.
In one example, the batch training performed by the server may specifically include: the server first acquires the plurality of training images, performs random sequencing processing on them using a shuffle module, then classifies the randomly sequenced training images into batches to obtain a plurality of batches of training images, and loads the training images batch by batch for training. The random sequencing can be understood as a shuffle process; correspondingly, the module that implements the shuffle process can be called the shuffle module.
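A minimal sketch of this shuffle-then-batch step, assuming Python's standard `random` module stands in for the shuffle module:

```python
import random

def shuffle_and_batch(images, batch_size, seed=None):
    """Randomly reorder the training images (the shuffle step), then group
    them into consecutive batches of at most `batch_size` images each."""
    order = list(images)
    random.Random(seed).shuffle(order)     # the shuffle module's role
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

batches = shuffle_and_batch([f"img_{i}.png" for i in range(10)], batch_size=4)
assert [len(b) for b in batches] == [4, 4, 2]
```

Every image appears in exactly one batch; only the order in which images are grouped changes between runs.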
To prevent the server from training indefinitely, the server may stop training once the number of training rounds reaches a preset number of rounds. Specifically, after completing one round of training, the server updates the number of training rounds and, if the number of training rounds has not reached the preset number, continues with the next round; the server updates the number of training rounds by adding 1 to epoch, where epoch represents the number of training rounds.
Further, to better ensure the accuracy of the recognition result, the server may, before the next round of training, perform random sequencing processing on the plurality of training images that have already been trained on and use the reordered images for the next round. That is, after the server finishes loading every batch of training images, one round of training is determined to be complete; the server then randomly reorders the plurality of training images and carries out the next round of training with them. The random sequencing may be performed by first acquiring an index of the training images and then randomly sequencing the index with the shuffle module.
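The epoch counting and between-round index shuffling described above can be sketched as follows (an illustrative sketch assuming Python's `random` module in place of the shuffle module; `train_batch` is a hypothetical per-batch callback, not a name from the application):

```python
import random

def train(images, max_epochs, batch_size, train_batch):
    """Outer training loop sketch: one epoch means every batch has been
    loaded once; the index is reshuffled before each round, and training
    stops once the preset number of rounds is reached."""
    indices = list(range(len(images)))
    epoch = 0
    while epoch < max_epochs:              # preset number of rounds
        random.shuffle(indices)            # shuffle the index, not the data
        for i in range(0, len(indices), batch_size):
            batch = [images[j] for j in indices[i:i + batch_size]]
            train_batch(batch)             # hypothetical per-batch step
        epoch += 1                         # epoch = epoch + 1 after each round
    return epoch

seen = []
train(list("abcdef"), max_epochs=3, batch_size=2, train_batch=seen.append)
assert len(seen) == 9                      # 3 rounds x 3 batches per round
```

Shuffling the index rather than the image data itself avoids moving the images in memory between rounds.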
Fig. 3 shows another embodiment of the training method of the deep learning network of the present application, which is described below with reference to fig. 3, taking the identification of animals as an example application scenario:
the image training set of the animal comprises training images and labeled images, where a labeled image can be obtained by drawing bounding boxes around the pigs (in this embodiment, the pig is taken as the feature object) in the corresponding training image. After obtaining the image training set of the animal, the server loads the training images and labeled images into the deep learning network in batches and trains on them, where training with each batch of training images includes steps S302 to S308, specifically:
step S302, loading a training image and the corresponding labeled image;
step S304, performing patch segmentation on the training image according to a preset sliding step size and block size to obtain a plurality of image block sets, wherein each image block set contains the same number of image blocks;
step S306, after performing feature extraction processing on the image blocks in the image block sets, performing mapping processing on the processed image blocks and the labeled images;
step S308, performing parameter-adjustment learning on the mapped image blocks and labeled images to obtain the parameters of the deep learning network;
step S310, after all batches of training images have been loaded, determining that one round of training is complete and updating the number of training rounds, i.e., adding 1 to epoch;
step S312, acquiring the index of the training images and randomly sequencing the index with a shuffle module;
step S314, obtaining the correspondingly reordered training set after the random sequencing processing, and classifying the image training set into batches to obtain a plurality of batches of training images.
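Putting steps S302-S308 together, one round of batch training might look like the following sketch; `StubNet` and its methods are hypothetical placeholders for the network-specific operations, not names from this application:

```python
class StubNet:
    """Minimal stand-in for the deep learning network; every method is a
    hypothetical placeholder, not an API from this application."""
    def extract(self, patch):                   # feature extraction per block
        return sum(patch)
    def map_to_labels(self, features, labels):  # map processed blocks to labels
        return list(zip(features, labels))
    def update_parameters(self, mapped):        # parameter-adjustment learning
        self.last = mapped

def training_round(batches, net, split):
    """One pass over the loaded batches, i.e. steps S302-S308."""
    for images, labels in batches:                    # S302: load a batch
        patch_sets = [split(img) for img in images]   # S304: patch segmentation
        feats = [[net.extract(p) for p in ps] for ps in patch_sets]  # S306
        net.update_parameters(net.map_to_labels(feats, labels))      # S308

net = StubNet()
training_round([([[1, 2, 3, 4]], ["pig"])], net,
               split=lambda im: [im[:2], im[2:]])
assert net.last == [([3, 7], "pig")]
```

The sketch deliberately keeps segmentation inside the batch loop, mirroring the embodiment's choice of segmenting after loading rather than pre-building a patch dataset offline.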
In this embodiment, patch segmentation is performed according to the preset sliding step size and block size, which avoids the pixel loss and reduced precision caused by size normalization of the image blocks, preserves the precision of local feature extraction in deep learning training, and thus ensures the accuracy of the recognition result. Moreover, patch segmentation is performed after the training images are loaded, so the labeled images do not need to be patch-segmented correspondingly; this reduces the time spent on patch-based reconstruction, offline storage, preprocessing and the like of the training set, improves training efficiency, and saves the space required for training.
It should be understood that although the steps in the flowcharts of figs. 2-3 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in figs. 2-3 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; their order of execution is not necessarily sequential, and they may be performed in turn or alternately with other steps or with sub-steps of other steps.
In one embodiment, as shown in fig. 4, there is provided a training apparatus 400 for a deep learning network, including: an image loading module 402, a block segmentation module 404, a mapping processing module 406, and a parameter adjustment learning module 408, wherein:
an image loading module 402, configured to load a training image and a labeled image; the labeled image is obtained by labeling the feature object in the training image;
a block division module 404, configured to perform block division on the training image according to a preset division size to obtain a plurality of image blocks;
a mapping processing module 406, configured to perform, after feature extraction processing on the plurality of image blocks, mapping processing on the processed plurality of image blocks and the labeled image;
and the parameter adjusting and learning module 408 is configured to perform parameter adjusting and learning on the mapped image blocks and the labeled images to obtain parameters of the deep learning network.
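The four modules of apparatus 400 can be viewed as one pipeline; the sketch below wires them together with injected callables (all names are illustrative stand-ins, not the application's code):

```python
class TrainingApparatus:
    """Sketch of apparatus 400's four modules as one pipeline; the callables
    passed in (loader, splitter, mapper, tuner) are hypothetical stand-ins."""
    def __init__(self, loader, splitter, mapper, tuner):
        self.load_images = loader        # image loading module 402
        self.split_blocks = splitter     # block segmentation module 404
        self.map_features = mapper       # mapping processing module 406
        self.tune_parameters = tuner     # parameter-adjustment learning module 408

    def run(self, batch_id):
        image, label = self.load_images(batch_id)
        blocks = self.split_blocks(image)
        mapped = self.map_features(blocks, label)
        return self.tune_parameters(mapped)

app = TrainingApparatus(
    loader=lambda b: ([1, 2, 3, 4], "pig"),
    splitter=lambda img: [img[:2], img[2:]],
    mapper=lambda blocks, label: [(blk, label) for blk in blocks],
    tuner=len,
)
assert app.run(0) == 2
```

Injecting the four stages as callables mirrors the apparatus description: each module can be implemented in software or hardware independently of the others.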
In one embodiment, when there are a plurality of training images, each training image corresponds to a different image block set, and each image block set contains the same number of image blocks.
In an embodiment, the block segmentation module 404 is further configured to obtain a preset sliding step size and a block size; and performing sliding segmentation processing on the training image according to the sliding step length and the block size to obtain a plurality of image blocks.
In one embodiment, when there are a plurality of training images, each training image corresponds to a different image block set, and the stitched size obtained by stitching the image blocks in each image block set is the same as the size of the training image before segmentation.
In one embodiment, the training apparatus 400 of the deep learning network further includes: a first sequencing module, configured to acquire a plurality of training images and randomly sequence the plurality of training images using a shuffle module; and a batch classification module, configured to classify the randomly sequenced training images into batches to obtain a plurality of batches of training images; the image loading module is further configured to load the training images in batches for training.
In one embodiment, the training apparatus 400 of the deep learning network further includes: a training-round determining module, configured to determine that one round of training is complete after every batch of training images has been loaded for training; and a second sequencing module, configured to randomly sequence the indexes of the plurality of training images using the shuffle module and perform the next round of training.
In one embodiment, the training-round determining module is further configured to update the number of training rounds; and a third sequencing module is configured to, if the number of training rounds has not reached the preset number of rounds, randomly sequence the indexes of the plurality of training images using the shuffle module and perform the next round of training.
For specific limitations of the training apparatus of the deep learning network, reference may be made to the limitations of the training method of the deep learning network above; details are not repeated here. Each module in the training apparatus of the deep learning network may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
loading a training image and a labeled image; the labeled image is obtained by labeling the feature object in the training image;
according to the preset block size, performing block division on the training image to obtain a plurality of image blocks;
after performing feature extraction processing on the plurality of image blocks, performing mapping processing on the processed plurality of image blocks and the labeled image;
and performing parameter adjustment learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
In one embodiment, when there are a plurality of training images, each training image corresponds to a different image block set, and each image block set contains the same number of image blocks.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a preset sliding step size and a block size; and performing sliding segmentation processing on the training image according to the sliding step size and the block size to obtain a plurality of image blocks.
In one embodiment, when there are a plurality of training images, each training image corresponds to a different image block set, and the stitched size obtained by stitching the image blocks in each image block set is the same as the size of the training image before segmentation.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a plurality of training images; randomly sequencing a plurality of training images by using a shuffle module; carrying out batch classification on the training images after random sequencing to obtain a plurality of batches of training images; and loading training images in batches for training.
In one embodiment, the processor, when executing the computer program, further performs the steps of: if the training images of each batch are loaded for training, determining to finish a round of training; and randomly sequencing the indexes of the plurality of training images by using a shuffle module, and performing the next round of training.
In one embodiment, the processor, when executing the computer program, further performs the steps of: updating the number of training rounds; and if the number of the training rounds does not reach the preset number of the rounds, randomly sequencing the indexes of the training images by using a shuffle module, and performing the next round of training.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
loading a training image and a labeled image; the labeled image is obtained by labeling the feature object in the training image;
according to the preset block size, performing block division on the training image to obtain a plurality of image blocks;
after performing feature extraction processing on the plurality of image blocks, performing mapping processing on the processed plurality of image blocks and the labeled image;
and performing parameter adjustment learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
In one embodiment, when there are a plurality of training images, each training image corresponds to a different image block set, and each image block set contains the same number of image blocks.
In one embodiment, the computer program, when executed by the processor, further implements the steps of: acquiring a preset sliding step size and a block size; and performing sliding segmentation processing on the training image according to the sliding step size and the block size to obtain a plurality of image blocks.
In one embodiment, when there are a plurality of training images, each training image corresponds to a different image block set, and the stitched size obtained by stitching the image blocks in each image block set is the same as the size of the training image before segmentation.
In one embodiment, the computer program, when executed by the processor, further implements the steps of: acquiring a plurality of training images; randomly sequencing the plurality of training images by using a shuffle module; classifying the randomly sequenced training images into batches to obtain a plurality of batches of training images; and loading the training images in batches for training.
In one embodiment, the computer program, when executed by the processor, further implements the steps of: after every batch of training images has been loaded for training, determining that one round of training is complete; and randomly sequencing the indexes of the plurality of training images by using a shuffle module, and performing the next round of training.
In one embodiment, the computer program, when executed by the processor, further implements the steps of: updating the number of training rounds; and if the number of training rounds has not reached the preset number of rounds, randomly sequencing the indexes of the plurality of training images by using a shuffle module, and performing the next round of training.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, and the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) and dynamic random access memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art could make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A training method of a deep learning network comprises the following steps:
loading a training image and a marking image; the marked image is obtained by marking the feature objects in the training image;
according to the preset block size, performing block segmentation on the training image to obtain a plurality of image blocks;
after the plurality of image blocks are subjected to feature extraction processing, mapping the processed plurality of image blocks and the label image;
and performing parameter adjustment learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
2. The method of claim 1, wherein when the number of the training images is multiple, each training image corresponds to a different image block set, and each image block set comprises the same number of image blocks.
3. The method of claim 1, wherein the step of performing block segmentation on the training image according to a predetermined block size to obtain a plurality of image blocks comprises:
acquiring a preset sliding step length and the block size;
and performing sliding segmentation processing on the training image according to the sliding step length and the block size to obtain a plurality of image blocks.
4. The method according to claim 3, wherein when the number of the training images is multiple, each training image corresponds to a different image block set, and a stitching size obtained by stitching the image blocks in each image block set is the same as a size of the training image before segmentation.
5. The method of claim 1, further comprising, prior to said loading training images:
acquiring a plurality of training images;
randomly sequencing a plurality of training images by using a shuffle module;
carrying out batch classification on the training images after random sequencing to obtain a plurality of batches of training images;
and loading training images in batches for training.
6. The method of claim 5, further comprising:
if the training images of each batch are loaded for training, determining to finish a round of training;
and utilizing a shuffle module to carry out random sequencing processing on the indexes of the plurality of training images, and carrying out the next round of training.
7. The method of claim 6, further comprising, after the step of determining completion of a round of training:
updating the number of training rounds;
and if the training round number does not reach the preset round number, randomly sequencing the indexes of the training images by using a shuffle module, and performing the next round of training.
8. An apparatus for training a deep learning network, comprising:
the image loading module is used for loading the training images and the marked images; the marked image is obtained by marking the features in the training image;
the block segmentation module is used for carrying out block segmentation on the training image according to a preset segmentation size to obtain a plurality of image blocks;
the mapping processing module is used for mapping the processed image blocks and the label image after the feature extraction processing is carried out on the image blocks;
and the parameter adjusting and learning module is used for performing parameter adjusting and learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010146486.5A CN111325281B (en) | 2020-03-05 | 2020-03-05 | Training method and device for deep learning network, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111325281A true CN111325281A (en) | 2020-06-23 |
CN111325281B CN111325281B (en) | 2023-10-27 |
Family
ID=71173168
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010146486.5A Active CN111325281B (en) | 2020-03-05 | 2020-03-05 | Training method and device for deep learning network, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111325281B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105551036A (en) * | 2015-12-10 | 2016-05-04 | 中国科学院深圳先进技术研究院 | Training method and device for deep learning network |
CN109427052A (en) * | 2017-08-29 | 2019-03-05 | 中国移动通信有限公司研究院 | Correlation technique and equipment based on deep learning processing eye fundus image |
WO2019049060A1 (en) * | 2017-09-08 | 2019-03-14 | Stone Three Mining Solutions (Pty) Ltd | Froth segmentation in flotation cells |
CN109859203A (en) * | 2019-02-20 | 2019-06-07 | 福建医科大学附属口腔医院 | Defect dental imaging recognition methods based on deep learning |
CN109993197A (en) * | 2018-12-07 | 2019-07-09 | 天津大学 | A kind of zero sample multi-tag classification method based on the end-to-end example differentiation of depth |
CN110309855A (en) * | 2019-05-30 | 2019-10-08 | 上海联影智能医疗科技有限公司 | Training method, computer equipment and the storage medium of image segmentation |
CN110335199A (en) * | 2019-07-17 | 2019-10-15 | 上海骏聿数码科技有限公司 | A kind of image processing method, device, electronic equipment and storage medium |
US20190392267A1 (en) * | 2018-06-20 | 2019-12-26 | International Business Machines Corporation | Framework for integrating deformable modeling with 3d deep neural network segmentation |
CN110827330A (en) * | 2019-10-31 | 2020-02-21 | 河海大学 | Time sequence integrated multispectral remote sensing image change detection method and system |
Non-Patent Citations (1)
Title |
---|
YIN Rui: "Scene Labeling Based on Multi-scale Convolutional Neural Networks", Modern Computer (Professional Edition), no. 06 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114155555A (en) * | 2021-12-02 | 2022-03-08 | 北京中科智易科技有限公司 | Human behavior artificial intelligence judgment system and method |
CN114155555B (en) * | 2021-12-02 | 2022-06-10 | 北京中科智易科技有限公司 | Human behavior artificial intelligence judgment system and method |
Also Published As
Publication number | Publication date |
---|---|
CN111325281B (en) | 2023-10-27 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |