CN111325281A - Deep learning network training method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN111325281A
CN111325281A
Authority
CN
China
Prior art keywords
training
image
images
deep learning
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010146486.5A
Other languages
Chinese (zh)
Other versions
CN111325281B (en)
Inventor
刘旭
杨龙
彭端
赵凌云
Current Assignee
New Hope Liuhe Co Ltd
Original Assignee
New Hope Liuhe Co Ltd
Priority date
Filing date
Publication date
Application filed by New Hope Liuhe Co Ltd filed Critical New Hope Liuhe Co Ltd
Priority to CN202010146486.5A
Publication of CN111325281A
Application granted
Publication of CN111325281B
Legal status: Active

Classifications

    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/213 — Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/08 — Neural networks; Learning methods
    • G06V10/26 — Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a training method and apparatus for a deep learning network, a computer device, and a storage medium. The method comprises: loading a training image and a label image, the label image being obtained by marking the feature objects in the training image; performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks; performing feature extraction on the plurality of image blocks and then mapping the processed image blocks to the label image; and performing parameter-adjustment learning on the mapped image blocks and label image to obtain the parameters of the deep learning network. With this method, training efficiency can be improved while the accuracy of the recognition result is ensured.

Description

Deep learning network training method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a training method and apparatus for a deep learning network, a computer device, and a storage medium.
Background
In the farming industry, deep learning networks can be used to identify animals, count them, or analyze their behavior. During training, however, the collected animal images used as training samples vary in size, for example 2000 × 3000 × 3, 3000 × 2000 × 3, or 1000 × 600 × 3; not only do length and width differ between images, they may differ greatly. Because the deep learning regression model contains a fully connected layer, training samples loaded in a batch must be limited to a consistent size (e.g., 896 × 896 × 3), a process that can be understood as size normalization. Size normalization, however, reduces image definition, and small objects in the image may be lost entirely; this degrades local feature extraction in the later stages of deep learning training and thus lowers the accuracy of the recognition result. Although training with a conventional image segmentation method can preserve the accuracy of the recognition result, the segmented image regions still require normalization, which reduces training speed. Therefore, how to improve training efficiency while ensuring the accuracy of the recognition result is a problem that needs to be solved in deep learning network training.
Disclosure of Invention
In view of the above, it is necessary to provide a deep learning network training method, apparatus, computer device, and storage medium that can improve training efficiency while ensuring the accuracy of the recognition result.
In a first aspect, a training method for a deep learning network is provided, where the method includes:
loading a training image and a label image, the label image being obtained by marking the feature objects in the training image;
performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
after performing feature extraction on the plurality of image blocks, mapping the processed image blocks to the label image;
and performing parameter-adjustment learning on the mapped image blocks and label image to obtain parameters of the deep learning network.
In an embodiment, when there are multiple training images, each training image corresponds to a different image block set, and each image block set contains the same number of image blocks.
In an embodiment, the step of performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks includes:
acquiring a preset sliding step and the block size;
and performing sliding segmentation on the training image according to the sliding step and the block size to obtain a plurality of image blocks.
In one embodiment, when there are multiple training images, each training image corresponds to a different image block set, and the size obtained by stitching the image blocks in each set is the same as the size of that training image before segmentation.
In one embodiment, before loading the training images, the method further includes:
acquiring a plurality of training images;
randomly ordering the plurality of training images using a shuffle module;
classifying the randomly ordered training images into batches to obtain multiple batches of training images;
and loading the training images batch by batch for training.
In one embodiment, the method further includes:
after the training images of every batch have been loaded for training, determining that one round of training is completed;
and randomly reordering the indexes of the plurality of training images using the shuffle module, then performing the next round of training.
In one embodiment, after the step of determining that one round of training is completed, the method further includes:
updating the number of training rounds;
and if the number of training rounds has not reached a preset number, randomly reordering the indexes of the training images using the shuffle module and performing the next round of training.
In a second aspect, an apparatus for training a deep learning network is provided, the apparatus comprising:
an image loading module for loading a training image and a label image, the label image being obtained by marking the feature objects in the training image;
a block segmentation module for performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
a mapping module for mapping the processed image blocks to the label image after performing feature extraction on the plurality of image blocks;
and a parameter-adjustment learning module for performing parameter-adjustment learning on the mapped image blocks and label image to obtain parameters of the deep learning network.
In a third aspect, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
loading a training image and a label image, the label image being obtained by marking the feature objects in the training image;
performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
after performing feature extraction on the plurality of image blocks, mapping the processed image blocks to the label image;
and performing parameter-adjustment learning on the mapped image blocks and label image to obtain parameters of the deep learning network.
In a fourth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
loading a training image and a label image, the label image being obtained by marking the feature objects in the training image;
performing block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
after performing feature extraction on the plurality of image blocks, mapping the processed image blocks to the label image;
and performing parameter-adjustment learning on the mapped image blocks and label image to obtain parameters of the deep learning network.
With the deep learning network training method, apparatus, computer device, and storage medium described above, the training image and the label image are loaded first; the training image is then block-segmented according to the preset block size to obtain image blocks; after feature extraction, the processed image blocks are mapped to the label image; and parameter-adjustment learning is performed on the mapped image blocks and label image to obtain the parameters of the deep learning network. Loss of image information is avoided, so the accuracy of the recognition result is ensured. Because the training image and label image are loaded before block segmentation, the image blocks obtained after segmentation do not need to be loaded, and the label image does not need a corresponding block segmentation; this saves the deep learning network's loading time and block-segmentation time and improves training efficiency.
Drawings
FIG. 1 is a diagram illustrating an internal structure of a computer device according to an embodiment;
FIG. 2 is a schematic flow chart illustrating a method for training a deep learning network according to an embodiment;
FIG. 3 is a flowchart illustrating a method for training a deep learning network according to another embodiment;
FIG. 4 is a block diagram of a training apparatus for a deep learning network according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The training method of the deep learning network provided by the application can be applied to computer equipment shown in fig. 1. The computer device may be a server, and its internal structure diagram may be as shown in fig. 1. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing training data of the deep learning network. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a training method for a deep learning network.
Those skilled in the art will appreciate that the architecture shown in fig. 1 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or arrange its components differently.
In the farming and grazing industry, images of animals can be collected and used to train a deep learning network, and the trained network can then identify the animals, count them, or analyze their behavior. During training, however, it is difficult to improve training efficiency while ensuring the accuracy of the recognition result. With the training method provided by the present application, after the deep learning network obtains the training images and label images, block segmentation is performed according to the preset block size to obtain image blocks, and the feature-extracted image blocks are mapped to the label images to obtain the parameters of the deep learning network, so that training efficiency is improved while the accuracy of the recognition result is ensured.
A training image can be understood as an image used to train the deep learning network, and the parameters derived from it may be called training parameters; a label image can be understood as a standard image used for reference during training, and the parameters derived from it may be called standard parameters. During feature learning, the deep network continually adjusts the training parameters so that they approach the standard parameters.
It can be understood that the training method provided by the present application can also be used to identify automobiles and the like; identifying animals does not limit the training method provided by the present application.
In an embodiment, as shown in fig. 2, a training method for a deep learning network is provided. This embodiment is applied to a server, and may also be applied to a system comprising a terminal and a server, with the method realized through interaction between them: the terminal sends the training images and label images to the server, and the server's deep learning network then trains by loading them. In this embodiment, the method includes the following steps:
step S202, loading a training image and a marking image.
The label image (also called the "ground truth") is an image obtained by marking the feature objects in a training image. The feature objects differ for deep learning networks with different purposes: in a network for identifying animals, the feature objects may be pigs, cattle, and the like; in a network for identifying automobiles, they may be off-road vehicles, buses, trucks, and the like. Feature objects in the training image may be marked by drawing bounding boxes and adding annotations. Taking a network for identifying automobiles as an example, the automobiles in the training image are boxed to realize the marking, and the boxed training image serves as the label image.
In this step, the deep learning network loads a training image and its corresponding label image. Since each training image has a corresponding label image, the deep learning network loads label images equal in number to the training images; for example, with 16 training images and 16 corresponding label images, the network needs to load 16 training images and 16 label images.
Step S204, performing block segmentation on the training image according to the preset block size to obtain a plurality of image blocks.
After the deep learning network loads the training image, the image resides in the network; the server then acquires the preset block size and block-segments the training image to obtain a plurality of image blocks.
The block size can be determined from the size that the full-convolution module of the deep learning network can process: if the maximum size the module can process is a × b, the block size may be any size within a × b. Further, to increase training speed, the maximum processable size itself may be used, i.e., a × b is taken as the block size.
Since an image can be regarded as composed of a plurality of pixels, an image block obtained by segmenting a training image can be understood as an image region between the pixel level and the image level. Block segmentation of the training image may be implemented with a pixel-level method (for example, a patch method, in which case the resulting image blocks may be called patch images). When block segmentation is performed with a pixel-level method, the block size can be understood as the number of pixels contained in each image block obtained by the segmentation.
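As an illustration of pixel-level block segmentation, the sketch below splits an image array into non-overlapping fixed-size blocks. The NumPy representation and function name are assumptions for illustration, not the patent's implementation, and the image dimensions are assumed to be exact multiples of the block size (the sliding variant described later handles the general case).

```python
import numpy as np

def split_into_blocks(image, block_h, block_w):
    """Split an H x W x C image into non-overlapping blocks of size
    block_h x block_w, scanned top-to-bottom, left-to-right.
    Assumes H and W are exact multiples of the block size."""
    h, w, _ = image.shape
    blocks = []
    for top in range(0, h, block_h):
        for left in range(0, w, block_w):
            blocks.append(image[top:top + block_h, left:left + block_w, :])
    return blocks
```

For example, a 2000 × 3000 × 3 training image split with a 500 × 500 block size would yield 4 × 6 = 24 image blocks, each small enough for the full-convolution module to process without size normalization.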
To ensure the accuracy of the recognition result, one embodiment of the present application trains with multiple training images. Specifically, after the deep learning network loads the multiple training images, each training image is segmented according to the preset block size to obtain a corresponding image block set; because every training image is segmented with the same preset block size, each image block set contains the same number of image blocks. It should be noted that the deep learning network may load the label images corresponding to the training images at the same time. Taking 16 training images as an example, with 4 image blocks obtained by segmenting each training image according to the preset block size, the 4 blocks from one training image can be regarded as one image block set. After the deep learning network loads the 16 training images and 16 label images and block-segments each training image, 16 image block sets, i.e., 64 image blocks, are obtained; in this case, the deep learning network does not need to block-segment the label images.
Step S206, after performing feature extraction on the plurality of image blocks, mapping the processed image blocks to the label image.
Before the training image is block-segmented, each label image has a corresponding training image. After the block segmentation of step S204, the training image has been divided into a plurality of image blocks, on which feature extraction is then performed. When there are image block sets from multiple training images, the server maps the feature-extracted image blocks to their corresponding label images: feature extraction is applied to the image blocks in each set to obtain the feature-extracted image block sets, and each feature-extracted set is mapped to its label image, so that a mapping relation is formed between the image blocks in the set and that label image. Taking 16 image block sets of 4 image blocks each as an example: the server segments the 16 training images to obtain 16 image block sets, performs feature extraction on every image block in the 16 sets, and then maps each feature-extracted set to its corresponding label image, so that the 4 image blocks in each set form a mapping relation with the same label image.
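The mapping described above — every block in a set paired with the single label image of its parent training image — can be sketched as follows. The helper function is hypothetical; it only illustrates the many-to-one pairing, not the patent's internal data structures.

```python
def map_blocks_to_labels(block_sets, label_images):
    """Pair every image block with the label image of its parent
    training image. block_sets[i] holds the blocks segmented from
    training image i, and label_images[i] is its label image."""
    assert len(block_sets) == len(label_images)
    pairs = []
    for blocks, label in zip(block_sets, label_images):
        for block in blocks:
            # All blocks from one training image map to the same label.
            pairs.append((block, label))
    return pairs
```

With 16 sets of 4 blocks each, this yields 64 (block, label) pairs while only 16 label images are ever loaded, which is the space saving the method claims over segmenting the label images too.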
Step S208, performing parameter-adjustment learning on the mapped image blocks and label images to obtain the parameters of the deep learning network.
The deep learning network can process the image blocks to obtain the corresponding training parameters and process the label images to obtain the standard parameters, so parameter-adjustment learning can be understood as the process of adjusting the training parameters to approach the standard parameters. After the image blocks and label images have been mapped, the server performs parameter-adjustment learning with them, making the training parameters approach the standard parameters and thereby determining the parameters of the deep learning network.
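The adjust-toward-standard idea can be illustrated with a toy one-parameter gradient descent: a single training parameter w is nudged so that its predictions approach the standard values derived from the labels. This is purely illustrative; the patent does not specify the network architecture or loss function.

```python
def fit_parameter(train_values, targets, lr=0.1, steps=200):
    """Toy parameter-adjustment learning: minimize squared error
    between w * x (the 'training parameter' output) and the target
    (the 'standard parameter') by gradient descent."""
    w = 0.0
    for _ in range(steps):
        for x, y in zip(train_values, targets):
            pred = w * x
            grad = 2 * (pred - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad             # adjust w toward the standard
    return w
```

In the patent's setting, w corresponds to the full set of network weights and the gradient is computed over the mapped (image block, label image) pairs, but the approach-the-standard dynamic is the same.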
In one possible approach, the training images are first block-segmented and the resulting image blocks are then loaded into the deep learning network for training. For example, block-segmenting 16 training images (4 image blocks per image) yields 64 image blocks, and loading 64 blocks clearly increases the network's image-loading time, reducing training speed. This approach also increases the space needed for training: because the network can perform parameter-adjustment learning only when every loaded image has a corresponding label, if the training images are block-segmented first and the blocks loaded into the network, then every image block needs a corresponding label block, i.e., the label images must undergo the same block segmentation to obtain label blocks, and the image blocks and label blocks must be loaded into the network together. With 16 training images and 4 blocks per image, the 16 training images and 16 label images are each block-segmented, yielding 64 image blocks and 64 label blocks, all of which are loaded into the network; the training space must therefore be large enough to hold 64 image blocks plus 64 label blocks.
Thus, although block-segmenting the training images first and loading the resulting image blocks into the deep learning network avoids loss of image information and preserves image accuracy, it increases the number of images to load and the loading time, reduces training speed, and enlarges the space required for training.
In contrast with this segment-then-load approach, the training method provided by the present application loads the training image and the label image first, block-segments the training image according to the preset block size to obtain the image blocks, performs feature extraction on the image blocks, maps the feature-extracted blocks to the label image, and performs parameter-adjustment learning on the mapped blocks and label image to obtain the parameters of the deep learning network. Loss of image information is avoided, so the accuracy of the recognition result is ensured. Because loading happens before segmentation, the image blocks obtained after segmentation do not need to be loaded, and the label image never needs a corresponding block segmentation; this saves the network's loading time and block-segmentation time and improves training efficiency, and because the label images are not block-segmented, the space required during training is also saved.
It should be noted that in step S204 the server may set the block size in advance, which improves training speed while ensuring the accuracy of the recognition result. If the server instead segmented the training image into blocks of arbitrary size, the full convolution network's limit on processable size would force the server to normalize those arbitrary-size blocks, increasing training time and reducing training speed; and since normalization loses part of the image information, the accuracy of the recognition result would be hard to guarantee.
If the training image were segmented by block size alone, some image regions might not be included in any segmented block, i.e., adjacent image blocks would be discontinuous, so the image blocks would not contain all the information of the training image and the accuracy of the recognition result would drop. One embodiment of the present application therefore segments using both a sliding step and a block size, so that the image blocks obtained after segmentation contain the information of the whole training image. Specifically, the server acquires a preset sliding step and block size and performs sliding segmentation on the training image accordingly to obtain a plurality of image blocks. It can be understood that the image blocks can then be stitched directly back into the original training image, i.e., the stitched size equals the size of the training image before segmentation; further, when there are multiple training images, each corresponds to a different image block set whose stitched size equals the size of that training image before segmentation.
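One way to realise sliding segmentation with full coverage is to advance the window by the sliding step and clamp the final window to the image edge. The exact boundary handling below is an assumption, not taken from the patent; it simply guarantees that no image region is left outside every block.

```python
import numpy as np

def sliding_blocks(image, block, stride):
    """Sliding block segmentation: window start positions advance by
    `stride`, and a final window clamped to the image edge guarantees
    that every pixel of the training image falls inside some block."""
    h, w = image.shape[:2]
    last_top = max(h - block, 0)
    last_left = max(w - block, 0)
    # Regular stride positions, plus the clamped edge position.
    tops = sorted(set(list(range(0, last_top + 1, stride)) + [last_top]))
    lefts = sorted(set(list(range(0, last_left + 1, stride)) + [last_left]))
    return [image[t:t + block, l:l + block] for t in tops for l in lefts]
```

With a stride equal to the block size the blocks tile the image exactly and stitching them reproduces the original training image, matching the stitched-size condition in the embodiment above; a smaller stride gives overlapping blocks.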
When multiple training images (which can be regarded as an image training set) are used to train the deep learning network, the training images can be loaded in multiple batches to realize batch training. Once all training images have been loaded, i.e., the network has loaded and trained on every batch, one round of training is determined to be completed, and the server can proceed to the next round.
In one example, the batch training performed by the server may specifically include: the server first acquires the training images, performs random ordering processing on them using a shuffle module, and then performs batch classification on the reordered training images, obtaining multiple batches of training images; training is then performed by loading the training images batch by batch. The random ordering can be understood as a shuffle process, and correspondingly the module that implements the shuffle process may be called the shuffle module.
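As a hedged sketch of this shuffle-then-batch step (the function name and the use of Python's random module are illustrative assumptions; the patent only names a "shuffle module"):

```python
import random

def shuffle_and_batch(images, batch_size, seed=None):
    """Randomly reorder the training images (the shuffle process), then
    classify the reordered sequence into consecutive batches."""
    rng = random.Random(seed)
    order = list(images)
    rng.shuffle(order)                      # random ordering processing
    return [order[i:i + batch_size]         # batch classification
            for i in range(0, len(order), batch_size)]
```

The last batch may be smaller when the number of images is not a multiple of the batch size; the deep learning network would then load these batches one by one for training.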
To prevent the server from training indefinitely, a preset number of rounds may be set, and the server stops training when the number of training rounds reaches that preset number. Specifically, after completing one round of training, the server updates the number of training rounds and, if it has not yet reached the preset number, continues with the next round. The server may update the number of training rounds by adding 1 to epoch, where epoch denotes the number of completed training rounds.
Further, to help ensure the accuracy of the recognition result, before the next round of training the server may perform random ordering processing on the training images that have already been trained on, and use the reordered images for the next round. That is, after the server finishes loading and training on every batch, it determines that one round of training is complete, randomly reorders the training images, and performs the next round with the reordered training images. The random ordering may be implemented by first obtaining an index of the training images and then shuffling that index with the shuffle module, so that only the index, rather than the images themselves, needs to be rearranged.
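The round (epoch) loop with index-based reshuffling described in the last two paragraphs can be sketched as follows (illustrative only; `train_batch` stands in for the per-batch training the patent describes, and the function name is an assumption):

```python
import random

def train(images, num_epochs, batch_size, train_batch, seed=0):
    """Run the preset number of training rounds. Between rounds, only an
    index list is shuffled, so the images (and their label images) stay
    in place; epoch is incremented by 1 after each completed round."""
    rng = random.Random(seed)
    indices = list(range(len(images)))
    epoch = 0
    while epoch < num_epochs:
        rng.shuffle(indices)                # shuffle the index, not the data
        for start in range(0, len(indices), batch_size):
            batch = [images[i] for i in indices[start:start + batch_size]]
            train_batch(batch)              # one batch of one round
        epoch += 1                          # update the number of rounds
    return epoch
```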
Fig. 3 shows another embodiment of the training method for a deep learning network of the present application, described below with reference to fig. 3 by taking the application scenario of identifying an animal as an example:
the image training set of the animal comprises training images and labeled images, where a labeled image may be obtained by drawing labeling boxes around the pigs (in this embodiment, the pig is taken as the feature object) in the corresponding training image; after obtaining the image training set of the animal, the server loads the training images and labeled images into the deep learning network in batches and trains on them, where the process of training with each batch of training images comprises steps S302 to S308, specifically:
step S302, loading a training image and the corresponding labeled image;
step S304, performing patch segmentation on the training image according to a preset sliding step size and block size to obtain a plurality of image block sets, where the number of image blocks in each image block set is the same;
step S306, performing feature extraction processing on the image blocks in each image block set, and then mapping the processed image blocks onto the labeled image;
step S308, performing parameter adjustment learning on the mapped image blocks and labeled images to obtain the parameters of the deep learning network;
after all batches of training images have been loaded, it is determined that one round of training is complete, and the number of training rounds is updated, i.e. epoch is incremented by 1 (step S310);
step S312, acquiring the index of the training images and randomly reordering the index with the shuffle module;
step S314, obtaining the correspondingly reordered training image set after the random reordering, and classifying it into batches to obtain multiple batches of training images for the next round.
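Steps S302 to S308 for one batch can be tied together in a rough sketch (all function names here are placeholders for the patent's segmentation, feature-extraction, mapping, and parameter-adjustment stages, not a definitive implementation):

```python
def train_one_round(loader, segment, extract, map_to_labels, adjust):
    """One round of Fig. 3: for every batch, load images with their
    label images (S302), patch-segment each image (S304), extract
    features from the blocks and map them onto the label image (S306),
    then perform parameter adjustment learning (S308)."""
    for images, labels in loader:                      # S302
        patches = [segment(img) for img in images]     # S304
        features = [[extract(p) for p in blocks]       # S306: extraction
                    for blocks in patches]
        mapped = [map_to_labels(f, lbl)                # S306: mapping
                  for f, lbl in zip(features, labels)]
        for m in mapped:
            adjust(m)                                  # S308
```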
In this embodiment, patch segmentation is performed according to a preset sliding step size and block size, which avoids the pixel loss and reduced precision that size normalization of the image would cause, preserves the precision of local feature extraction in deep learning training, and thereby ensures the accuracy of the recognition result. In addition, patch segmentation is performed after the training images are loaded, so the labeled images do not need to be segmented correspondingly; this removes the need to reconstruct, store offline, and preprocess a patch-based training set, which improves training efficiency and saves the space required for training.
It should be understood that although the steps in the flow charts of figs. 2-3 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in figs. 2-3 may comprise multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a training apparatus 400 for a deep learning network, including: an image loading module 402, a block segmentation module 404, a mapping processing module 406, and a parameter adjustment learning module 408, wherein:
an image loading module 402, configured to load a training image and a labeled image; the labeled image is obtained by labeling the feature objects in the training image;
a block segmentation module 404, configured to perform block segmentation on the training image according to a preset block size to obtain a plurality of image blocks;
a mapping processing module 406, configured to perform feature extraction processing on the plurality of image blocks and then perform mapping processing on the processed plurality of image blocks and the labeled image;
and the parameter adjusting and learning module 408 is configured to perform parameter adjusting and learning on the mapped image blocks and the labeled images to obtain parameters of the deep learning network.
In one embodiment, when the number of the training images is multiple, each training image corresponds to a different image block set, and each image block set includes the same number of image blocks.
In an embodiment, the block segmentation module 404 is further configured to obtain a preset sliding step size and a block size; and performing sliding segmentation processing on the training image according to the sliding step length and the block size to obtain a plurality of image blocks.
In one embodiment, when the number of training images is multiple, each training image corresponds to a different image block set, and the size of the mosaic obtained by mosaicing the image blocks in each image block set is the same as the size of the training image before segmentation.
In one embodiment, the training apparatus 400 for deep learning network further includes: the sorting processing first module is used for acquiring a plurality of training images; randomly sequencing a plurality of training images by using a shuffle module; the batch classification module is used for performing batch classification on the training images after random sequencing processing to obtain a plurality of batches of training images; and the image loading module is used for loading training images in batches for training.
In one embodiment, the training apparatus 400 for deep learning network further includes: the training round number determining module is used for determining that one round of training is finished after the training of the training images of each batch is loaded; and the second sorting module is used for randomly sorting the indexes of the plurality of training images by using the shuffle module and performing the next round of training.
In one embodiment, the training round number determining module is further configured to update the training round number; and the sorting processing third module is used for randomly sorting the indexes of the plurality of training images by using the shuffle module and carrying out the next round of training if the number of the training rounds does not reach the preset number of rounds.
For the specific limitations of the training apparatus of the deep learning network, reference may be made to the limitations of the training method of the deep learning network above, which are not repeated here. Each module in the training apparatus of the deep learning network may be implemented wholly or partially in software, hardware, or a combination of the two. The modules may be embedded in hardware in, or independent of, a processor in the computer device, or stored in software in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
loading a training image and a labeled image; the labeled image is obtained by labeling the feature object in the training image;
according to the preset block size, performing block division on the training image to obtain a plurality of image blocks;
performing feature extraction processing on the plurality of image blocks, and then performing mapping processing on the processed plurality of image blocks and the labeled image;
and performing parameter adjustment learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
In one embodiment, when the number of the training images is multiple, each training image corresponds to a different image block set, and each image block set includes the same number of image blocks.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a preset sliding step length and a block size; and performing sliding segmentation processing on the training image according to the sliding step length and the block size to obtain a plurality of image blocks.
In one embodiment, when the number of training images is multiple, each training image corresponds to a different image block set, and the size of the mosaic obtained by mosaicing the image blocks in each image block set is the same as the size of the training image before segmentation.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a plurality of training images; randomly sequencing a plurality of training images by using a shuffle module; carrying out batch classification on the training images after random sequencing to obtain a plurality of batches of training images; and loading training images in batches for training.
In one embodiment, the processor, when executing the computer program, further performs the steps of: if the training images of each batch are loaded for training, determining to finish a round of training; and randomly sequencing the indexes of the plurality of training images by using a shuffle module, and performing the next round of training.
In one embodiment, the processor, when executing the computer program, further performs the steps of: updating the number of training rounds; and if the number of the training rounds does not reach the preset number of the rounds, randomly sequencing the indexes of the training images by using a shuffle module, and performing the next round of training.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
loading a training image and a labeled image; the labeled image is obtained by labeling the feature object in the training image;
according to the preset block size, performing block division on the training image to obtain a plurality of image blocks;
performing feature extraction processing on the plurality of image blocks, and then performing mapping processing on the processed plurality of image blocks and the labeled image;
and performing parameter adjustment learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
In one embodiment, when the number of the training images is multiple, each training image corresponds to a different image block set, and each image block set includes the same number of image blocks.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a preset sliding step length and a block size; and performing sliding segmentation processing on the training image according to the sliding step length and the block size to obtain a plurality of image blocks.
In one embodiment, when the number of training images is multiple, each training image corresponds to a different image block set, and the size of the mosaic obtained by mosaicing the image blocks in each image block set is the same as the size of the training image before segmentation.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a plurality of training images; randomly sequencing a plurality of training images by using a shuffle module; carrying out batch classification on the training images after random sequencing to obtain a plurality of batches of training images; and loading training images in batches for training.
In one embodiment, the processor, when executing the computer program, further performs the steps of: if the training images of each batch are loaded for training, determining to finish a round of training; and randomly sequencing the indexes of the plurality of training images by using a shuffle module, and performing the next round of training.
In one embodiment, the processor, when executing the computer program, further performs the steps of: updating the number of training rounds; and if the number of the training rounds does not reach the preset number of the rounds, randomly sequencing the indexes of the training images by using a shuffle module, and performing the next round of training.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A training method of a deep learning network comprises the following steps:
loading a training image and a marking image; the marked image is obtained by marking the feature objects in the training image;
according to the preset block size, performing block segmentation on the training image to obtain a plurality of image blocks;
after the plurality of image blocks are subjected to feature extraction processing, mapping the processed plurality of image blocks and the label image;
and performing parameter adjustment learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
2. The method of claim 1, wherein when the number of the training images is multiple, each training image corresponds to a different image block set, and each image block set comprises the same number of image blocks.
3. The method of claim 1, wherein the step of performing block segmentation on the training image according to a predetermined block size to obtain a plurality of image blocks comprises:
acquiring a preset sliding step length and the block size;
and performing sliding segmentation processing on the training image according to the sliding step length and the block size to obtain a plurality of image blocks.
4. The method according to claim 3, wherein when the number of the training images is multiple, each training image corresponds to a different image block set, and a stitching size obtained by stitching the image blocks in each image block set is the same as a size of the training image before segmentation.
5. The method of claim 1, further comprising, prior to said loading training images:
acquiring a plurality of training images;
randomly sequencing a plurality of training images by using a shuffle module;
carrying out batch classification on the training images after random sequencing to obtain a plurality of batches of training images;
and loading training images in batches for training.
6. The method of claim 5, further comprising:
if the training images of each batch are loaded for training, determining to finish a round of training;
and utilizing a shuffle module to carry out random sequencing processing on the indexes of the plurality of training images, and carrying out the next round of training.
7. The method of claim 6, further comprising, after the step of determining completion of a round of training:
updating the number of training rounds;
and if the training round number does not reach the preset round number, randomly sequencing the indexes of the training images by using a shuffle module, and performing the next round of training.
8. An apparatus for training a deep learning network, comprising:
the image loading module is used for loading the training images and the labeled images; the labeled image is obtained by labeling the feature objects in the training image;
the block segmentation module is used for carrying out block segmentation on the training image according to a preset segmentation size to obtain a plurality of image blocks;
the mapping processing module is used for mapping the processed image blocks and the label image after the feature extraction processing is carried out on the image blocks;
and the parameter adjusting and learning module is used for performing parameter adjusting and learning on the image blocks and the marked images after mapping processing to obtain parameters of the deep learning network.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202010146486.5A 2020-03-05 2020-03-05 Training method and device for deep learning network, computer equipment and storage medium Active CN111325281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010146486.5A CN111325281B (en) 2020-03-05 2020-03-05 Training method and device for deep learning network, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010146486.5A CN111325281B (en) 2020-03-05 2020-03-05 Training method and device for deep learning network, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111325281A true CN111325281A (en) 2020-06-23
CN111325281B CN111325281B (en) 2023-10-27

Family

ID=71173168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010146486.5A Active CN111325281B (en) 2020-03-05 2020-03-05 Training method and device for deep learning network, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111325281B (en)


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105551036A (en) * 2015-12-10 2016-05-04 中国科学院深圳先进技术研究院 Training method and device for deep learning network
CN109427052A (en) * 2017-08-29 2019-03-05 中国移动通信有限公司研究院 Correlation technique and equipment based on deep learning processing eye fundus image
WO2019049060A1 (en) * 2017-09-08 2019-03-14 Stone Three Mining Solutions (Pty) Ltd Froth segmentation in flotation cells
CN109859203A (en) * 2019-02-20 2019-06-07 福建医科大学附属口腔医院 Defect dental imaging recognition methods based on deep learning
CN109993197A (en) * 2018-12-07 2019-07-09 天津大学 A kind of zero sample multi-tag classification method based on the end-to-end example differentiation of depth
CN110309855A (en) * 2019-05-30 2019-10-08 上海联影智能医疗科技有限公司 Training method, computer equipment and the storage medium of image segmentation
CN110335199A (en) * 2019-07-17 2019-10-15 上海骏聿数码科技有限公司 A kind of image processing method, device, electronic equipment and storage medium
US20190392267A1 (en) * 2018-06-20 2019-12-26 International Business Machines Corporation Framework for integrating deformable modeling with 3d deep neural network segmentation
CN110827330A (en) * 2019-10-31 2020-02-21 河海大学 Time sequence integrated multispectral remote sensing image change detection method and system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YIN RUI: "Scene Labeling Based on Multi-Scale Convolutional Neural Networks", Modern Computer (Professional Edition), no. 06 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114155555A (en) * 2021-12-02 2022-03-08 北京中科智易科技有限公司 Human behavior artificial intelligence judgment system and method
CN114155555B (en) * 2021-12-02 2022-06-10 北京中科智易科技有限公司 Human behavior artificial intelligence judgment system and method

Also Published As

Publication number Publication date
CN111325281B (en) 2023-10-27

Similar Documents

Publication Publication Date Title
CN109034078B (en) Training method of age identification model, age identification method and related equipment
CN110705405A (en) Target labeling method and device
CN111680701B (en) Training method and device of image recognition model and image recognition method and device
CN112183295A (en) Pedestrian re-identification method and device, computer equipment and storage medium
EP3859479A1 (en) Method for determining distribution information, and control method and device for unmanned aerial vehicle
CN110473172B (en) Medical image anatomical centerline determination method, computer device and storage medium
CN111192278B (en) Semantic segmentation method, semantic segmentation device, computer equipment and computer readable storage medium
CN113222055B (en) Image classification method and device, electronic equipment and storage medium
CN111507298B (en) Face detection method, device, computer equipment and storage medium
CN111310800A (en) Image classification model generation method and device, computer equipment and storage medium
CN112241646A (en) Lane line recognition method and device, computer equipment and storage medium
CN111914814A (en) Wheat rust detection method and device and computer equipment
CN115578590A (en) Image identification method and device based on convolutional neural network model and terminal equipment
CN111325281A (en) Deep learning network training method and device, computer equipment and storage medium
CN112101114A (en) Video target detection method, device, equipment and storage medium
CN112819834A (en) Method and device for classifying pathological images of stomach based on artificial intelligence
CN112686125A (en) Vehicle type determination method and device, storage medium and electronic device
CN110929792B (en) Image labeling method, device, electronic equipment and storage medium
DE102019209562B4 (en) Device and method for training a neural network and device and method for validating a neural network
CN113766308A (en) Video cover recommendation method and device, computer equipment and storage medium
CN110751163A (en) Target positioning method and device, computer readable storage medium and electronic equipment
CN114821658A (en) Face recognition method, operation control device, electronic device, and storage medium
CN110766652B (en) Network training method, device, segmentation method, computer equipment and storage medium
CN110096607B (en) Method and device for acquiring label picture
CN114399657A (en) Vehicle detection model training method and device, vehicle detection method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant