CN112966786A - Automatic marking method for convolutional neural network training data

Info

Publication number: CN112966786A
Application number: CN202110405677.3A
Authority: CN (China)
Prior art keywords: image, neural network, convolutional neural network, training data
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: Li Jingya (李静雅), Wang Dongjie (王东杰), Guo Zhipeng (郭志鹏), Fan Hao (樊昊)
Current assignee: Ningbo Jiuhuan Shichuang Technology Co., Ltd. (the listed assignee may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Ningbo Jiuhuan Shichuang Technology Co., Ltd.
Priority date: 2021-04-15 (the priority date is an assumption and is not a legal conclusion)
Filing date: 2021-04-15
Publication date: 2021-06-15

Classifications

    • G06F 18/22 - Pattern recognition; Analysing; Matching criteria, e.g. proximity measures
    • G06F 18/214 - Pattern recognition; Design or setup of recognition systems or techniques; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 - Computing arrangements based on biological models; Neural networks; Architecture; Combinations of networks
    • G06N 3/08 - Computing arrangements based on biological models; Neural networks; Learning methods


Abstract

The invention provides an automatic labeling method for convolutional neural network training data, comprising the following steps. Step S1: extract basic defect features. Step S2: feature extraction and sample cropping, including: from the feature mask obtained in step S1, combined with an isolated-domain method, obtain the center position and shape size of each defect body; taking the points in this set as centers, apply basic transform operations to the image to increase the number of samples while cropping samples of a preset specification. Step S3: extract the feature contour of each sample obtained in step S2 and label its category. Step S4: parallel optimization based on the shared-memory parallel system OpenMP.

Description

Automatic marking method for convolutional neural network training data
Technical Field
The invention relates to the technical field of neural network training, in particular to an automatic marking method for convolutional neural network training data.
Background
Training the convolutional neural network models that dominate deep learning requires, in principle, a large volume of labeled images as input data (on the order of tens of thousands or more) to obtain reliable results. In current practice, the input image samples are generally labeled manually, which carries enormous labor and time costs and cannot keep pace with the machine's actual training throughput. This efficiency gap means that the turnaround of deep learning training results is often limited by the speed of manual labeling. Moreover, no quantifiable quality standard exists for manually labeled data: the data volume is huge, rechecking is expensive, and the variability between samples produced by different annotators leads to poor convergence in the actual training computation.
Disclosure of Invention
The object of the present invention is to address at least one of the technical drawbacks described above.
To this end, the invention proposes an automated labeling method for convolutional neural network training data.
In order to achieve the above object, an embodiment of the present invention provides an automatic labeling method for convolutional neural network training data, including the following steps:
step S1, extracting basic defect features;
step S2, feature extraction and sample cropping, including: according to the feature mask obtained in step S1, combined with an isolated-domain method, obtaining the center position and shape size of each defect body, and, taking the points in this set as centers, applying basic transform operations to the image to increase the number of samples while cropping samples of a preset specification;
step S3, extracting the feature contour of each sample obtained in step S2 and labeling its category;
step S4, parallel optimization based on the shared-memory parallel system OpenMP.
Further, the step S1 includes the following steps:
(1) reading in and compressing the image data;
(2) Gaussian filtering: applying Gaussian smoothing to the image data;
(3) image interpolation: interpolating the Gaussian-filtered image data;
(4) image enhancement: enhancing the interpolated image data;
(5) adaptive binarization of the image;
(6) image cleanup.
Further, in the step S2, the isolated-domain method employs skeleton extraction and the watershed method.
Further, in the step S3, the binary-map boundary of each sample obtained in step S2 is obtained directly at the cost of a single iteration, and the defect features are given a contour description and category label and written into the corresponding configuration file.
According to the automatic labeling method for convolutional neural network training data of the embodiments of the invention, conspicuous defect features are extracted automatically and in batches based on Gaussian filtering and the USM method; at the same time, uniform defect training samples are generated at high speed, establishing a fully automatic sample generation mechanism and matching sample labeling efficiency to training computation efficiency. Gaussian filtering is a common noise-smoothing operator; combined with unsharp-mask (USM) sharpening, it can push the contrast of local boundaries to the limit, a property the invention exploits to cleanly pick out conspicuous foreign bodies within an object. In addition, the invention builds on an OpenMP parallel model whose algorithmic core, using cascaded Gaussian filtering, unsharp-mask sharpening, and related methods, can process grayscale images with an average defect size of 3 pixels or more. Through repeated optimization of the data processing framework and the data storage layout, testing shows parallel efficiency scaling essentially linearly up to 16 cores; effective labeling efficiency improves by four orders of magnitude over traditional image labeling.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow diagram of an automated labeling method for convolutional neural network training data, in accordance with an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
As shown in fig. 1, the automatic labeling method for convolutional neural network training data according to the embodiment of the present invention includes the following steps:
step S1, extracting basic defect features, which includes the following steps:
(1) reading in and compressing the image data;
(2) Gaussian filtering: applying Gaussian smoothing to the image data;
(3) image interpolation: interpolating the Gaussian-filtered image data;
(4) image enhancement: enhancing the interpolated image data;
(5) adaptive binarization of the image;
(6) image cleanup.
Specifically, the enhancement is based on the unsharp mask (USM) and Gaussian methods; the basic model is as follows:
$$f(x, y) = A \exp\!\left( -\frac{(x - x_0)^2}{2\sigma_x^2} - \frac{(y - y_0)^2}{2\sigma_y^2} \right)$$

$$g(x_0, y_0) = \iint_{(x, y) \in \mathrm{Kernel}} f(x, y)\, O\big( (x_0, y_0) - (x, y) \big)\, dx\, dy$$

$$u(x_0, y_0) = \frac{O(x_0, y_0) - \mathit{weight} \cdot g(x_0, y_0)}{1 - \mathit{weight}}$$

wherein $A$ is the amplitude, $(x_0, y_0)$ is the center-point position, $(\sigma_x, \sigma_y)$ is the variance, $O(x_0, y_0)$ is the original image value at the center position, $g(x_0, y_0)$ is the Gaussian-filtered image value, $u(x_0, y_0)$ is the image value after enhancement, $\mathit{weight}$ is the enhancement ratio, and $\mathrm{Kernel}$ is the convolution kernel.
The convolution kernel radius $r$ is calculated as follows:

$$r_x = \sigma_x \cdot (\log \epsilon)^2 + 1$$
wherein ε = 0.01. In practice it is not necessary to worry about the size of the convolution kernel: a downscaling-and-interpolation scheme is used to obtain the best precision, i.e. the data is first reduced, and then Gaussian blurring and interpolation are applied in sequence. Since image noise generally depends on the environment and the machine itself, its size does not correlate positively with resolution; when the actual samples to be processed have a higher resolution, this scheme simultaneously yields a better denoising effect.
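Concretely, this downscale-filter-interpolate sequence might look as follows in C++ with OpenCV. This is a minimal sketch, not the patent's implementation: the 0.5 scale factor and the sigma default are assumptions, and the radius helper reads the log above as log10 (with ε = 0.01 this gives the familiar r ≈ 4σ + 1 rule; the base of the logarithm is not stated in the text).

    // Sketch of the downscale -> Gaussian -> interpolate sequence described
    // above. The scale factor and sigma are illustrative assumptions.
    #include <cmath>
    #include <opencv2/opencv.hpp>

    // Kernel radius from the formula r_x = sigma_x * (log eps)^2 + 1, reading
    // log as log10; with eps = 0.01, log10(eps) = -2, so r_x = 4*sigma_x + 1.
    int kernelRadius(double sigma, double eps = 0.01) {
        return static_cast<int>(sigma * std::pow(std::log10(eps), 2.0)) + 1;
    }

    cv::Mat smoothViaDownscale(const cv::Mat& gray, double sigma = 1.5) {
        // 1. Reduce the data first: shrink with area averaging, which is
        //    itself a mild denoiser.
        cv::Mat small;
        cv::resize(gray, small, cv::Size(), 0.5, 0.5, cv::INTER_AREA);

        // 2. Gaussian filtering on the smaller image: the kernel stays
        //    compact, so its size never becomes a practical concern.
        int r = kernelRadius(sigma);
        cv::Mat blurred;
        cv::GaussianBlur(small, blurred, cv::Size(2 * r + 1, 2 * r + 1), sigma);

        // 3. Interpolate back to the original resolution.
        cv::Mat restored;
        cv::resize(blurred, restored, gray.size(), 0, 0, cv::INTER_CUBIC);
        return restored;
    }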
As weight tends to 1, u(x_0, y_0) evidently tends to ∞. All emphasized features then concentrate at the image maxima, i.e. where the most visible defects are located. A mask for extracting the defect tissue can therefore be obtained by simple background segmentation, for example binarization methods such as iterative self-organizing analysis (the ImageJ IsoData classifier).
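As a sketch of this enhancement and segmentation step (parameter defaults are assumptions; Otsu's method stands in for the IsoData-style classifier named above, both being histogram-based global binarization schemes):

    // Sketch of the USM enhancement u = (O - weight*g) / (1 - weight)
    // followed by background segmentation into a defect mask.
    #include <opencv2/opencv.hpp>

    cv::Mat usmDefectMask(const cv::Mat& gray, double weight = 0.6,
                          double sigma = 1.5) {
        CV_Assert(weight < 1.0);  // weight -> 1 drives u toward infinity

        cv::Mat o, g;
        gray.convertTo(o, CV_32F);
        cv::GaussianBlur(o, g, cv::Size(0, 0), sigma);  // g(x0, y0)

        // u = (O - weight*g) / (1 - weight): addWeighted computes exactly
        // this linear combination of the original and the blurred image.
        cv::Mat u;
        cv::addWeighted(o, 1.0 / (1.0 - weight), g, -weight / (1.0 - weight),
                        0.0, u);

        // Emphasized features concentrate at the image maxima, so a simple
        // global threshold separates defect pixels from background.
        cv::Mat u8, mask;
        cv::normalize(u, u8, 0, 255, cv::NORM_MINMAX, CV_8U);
        cv::threshold(u8, mask, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
        return mask;
    }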
Step S2, feature extraction and sample cropping, includes: according to the feature mask acquired in step S1, combined with an isolated-domain method, the center position and shape size of each defect body are acquired; taking the points in this set as centers, basic transform operations are applied to the image to increase the number of samples while samples of a preset specification are cropped.
Specifically, most current mainstream convolutional neural networks require the parameters of the fully connected layers to be fixed, and therefore need a uniform input image size. When the size of the source images and the positions of the defects are not fixed, cropping the images by the usual manual methods costs a great deal of time.
In this step, with the feature mask successfully obtained in step S1, the isolated-domain method yields the center position and shape size of each defect body. Taking the points in this set as centers, basic transform operations can be applied to the image to increase the number of samples while the required sample size is cropped directly.
In step S2, the isolated-domain method employs skeleton extraction and the watershed method.
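A minimal sketch of this step follows, with connected-component analysis used as a simpler stand-in for the skeleton-extraction/watershed isolated-domain method, and an assumed 64x64 sample specification:

    // Sketch of step S2: isolate each defect body, then cut fixed-size
    // samples centered on it, applying a flip as one basic transform to
    // multiply the sample count.
    #include <vector>
    #include <opencv2/opencv.hpp>

    std::vector<cv::Mat> cropDefectSamples(const cv::Mat& gray,
                                           const cv::Mat& mask,
                                           int sample = 64) {
        cv::Mat labels, stats, centroids;
        int n = cv::connectedComponentsWithStats(mask, labels, stats, centroids);

        std::vector<cv::Mat> samples;
        for (int i = 1; i < n; ++i) {  // label 0 is the background
            // Center position of this defect body; its shape size (bounding
            // box and area) is available in the corresponding row of stats.
            cv::Point2d c(centroids.at<double>(i, 0), centroids.at<double>(i, 1));

            // Preset-specification window centered on the defect, clipped to
            // the image bounds so the crop is always valid.
            cv::Rect win(static_cast<int>(c.x) - sample / 2,
                         static_cast<int>(c.y) - sample / 2, sample, sample);
            win &= cv::Rect(0, 0, gray.cols, gray.rows);
            if (win.area() == 0) continue;

            cv::Mat crop = gray(win).clone();
            samples.push_back(crop);

            // Basic transform to increase the number of samples; rotations
            // or flips along other axes could be added the same way.
            cv::Mat flipped;
            cv::flip(crop, flipped, 1);  // horizontal flip
            samples.push_back(flipped);
        }
        return samples;
    }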
In step S3, feature contour extraction is performed on the samples obtained in step S2, and the category is labeled.
Specifically, to match the data-reading interface of an actual training network, the defect features usually need a contour description and a category label. From the computation result of step S2, the binary-map boundary is obtained directly at the cost of a single iteration and written into the corresponding configuration file, avoiding manual drawing work.
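A sketch of this single-pass contour extraction and labeling (the plain-text configuration file layout is an assumption, since the patent does not specify one):

    // Sketch of step S3: trace the binary-map boundary of a cropped sample
    // and write the contour plus category label to a sidecar file.
    #include <fstream>
    #include <string>
    #include <vector>
    #include <opencv2/opencv.hpp>

    void writeContourLabel(const cv::Mat& sampleMask, int category,
                           const std::string& cfgPath) {
        // findContours traces the binary-map boundary in one pass over the
        // image, matching the single-iteration cost described above.
        cv::Mat work = sampleMask.clone();  // findContours may modify input
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(work, contours, cv::RETR_EXTERNAL,
                         cv::CHAIN_APPROX_SIMPLE);

        std::ofstream cfg(cfgPath);
        cfg << "category " << category << "\n";
        for (const auto& contour : contours) {
            cfg << "contour";
            for (const auto& p : contour) cfg << " " << p.x << "," << p.y;
            cfg << "\n";
        }
    }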
Step S4, parallel optimization based on the shared-memory parallel system OpenMP.
When the defect data originate from the same data volume, the processing parameters are typically shared between images, and the main computational load concentrates in the repeated convolution calculations on each image. Since these per-image computations need no communication with one another, using OpenMP greatly reduces the difficulty and complexity of the actual programming.
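A minimal sketch of this arrangement (the parameter struct and the helper are illustrative placeholders for the steps above):

    // Sketch of step S4: shared read-only parameters, one independent task
    // per image, and no communication between tasks. processOneImage stands
    // for the S1-S3 pipeline sketched above. Compile with -fopenmp.
    #include <string>
    #include <vector>
    #include <opencv2/opencv.hpp>

    struct SharedParams {     // processing parameters shared across the volume
        double weight = 0.6;  // USM enhancement ratio
        double sigma  = 1.5;  // Gaussian standard deviation
    };

    void processOneImage(const std::string& path, const SharedParams& p) {
        cv::Mat img = cv::imread(path, cv::IMREAD_GRAYSCALE);
        if (img.empty()) return;
        cv::Mat blurred;
        cv::GaussianBlur(img, blurred, cv::Size(0, 0), p.sigma);
        // ... remaining S1-S3 steps (USM, binarization, cropping, labeling) ...
    }

    void labelDataset(const std::vector<std::string>& paths,
                      const SharedParams& params) {
        // Each iteration touches only its own image, so a plain parallel-for
        // suffices: no locks and no message passing between threads.
        #pragma omp parallel for schedule(dynamic)
        for (int i = 0; i < static_cast<int>(paths.size()); ++i) {
            processOneImage(paths[i], params);
        }
    }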
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (4)

1. An automated labeling method for convolutional neural network training data, comprising the steps of:
step S1, extracting basic defect features;
step S2, feature extraction and sample cropping, including: according to the feature mask obtained in step S1, combined with an isolated-domain method, obtaining the center position and shape size of each defect body, and, taking the points in this set as centers, applying basic transform operations to the image to increase the number of samples while cropping samples of a preset specification;
step S3, extracting the feature contour of each sample obtained in step S2 and labeling its category;
step S4, parallel optimization based on the shared-memory parallel system OpenMP.
2. The automated labeling method for convolutional neural network training data of claim 1, wherein said step S1 comprises the steps of:
(1) reading in and compressing the image data;
(2) Gaussian filtering: applying Gaussian smoothing to the image data;
(3) image interpolation: interpolating the Gaussian-filtered image data;
(4) image enhancement: enhancing the interpolated image data;
(5) adaptive binarization of the image;
(6) image cleanup.
3. The automated labeling method for convolutional neural network training data of claim 1, wherein in said step S2, said isolated-domain method employs skeleton extraction and the watershed method.
4. The automated labeling method for convolutional neural network training data of claim 1, wherein in said step S3, the binary-map boundary of each sample obtained in said step S2 is obtained directly at the cost of a single iteration, and the defect features are given a contour description and category label and written into the corresponding configuration file.
Priority and Publication

Application number: CN202110405677.3A
Title: Automatic marking method for convolutional neural network training data
Priority date / filing date: 2021-04-15
Publication: CN112966786A, published 2021-06-15 (status: pending)
Family ID: 76281455
Country: CN

Citations

* Cited by examiner, † Cited by third party

Patent citations (3):

CN106780491A * (priority 2017-01-23, published 2017-05-31), 天津大学 (Tianjin University): Initial contour generation method for GVF-based segmentation of CT pelvis images
WO2019169772A1 * (priority 2018-03-06, published 2019-09-12), 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.): Picture processing method, electronic apparatus, and storage medium
CN109118466A * (priority 2018-08-29, published 2019-01-01), 电子科技大学 (University of Electronic Science and Technology of China): Processing method for fusing infrared and visible-light images

Non-patent citations (3):

LOVEFIVE55 *: "OpenCV sharpening enhancement algorithm (USM)" (《Opencv-锐化增强算法(USM)》), https://blog.csdn.net/weixin_41709536/article/details/100889849
丑的睡不着 *: "USM sharpening in image processing" (《图像处理之USM锐化》), https://blog.csdn.net/weixin_42026802/article/details/80117403
野犬1998 *: "OpenCV study notes 7 (gradient operators, sharpening)" (《Opencv学习笔记七(梯度算子、锐化)》), https://blog.csdn.net/qq_42319367/article/details/97509807

Similar Documents

CN107341499B (en) Fabric defect detection and classification method based on unsupervised segmentation and ELM
US20200364842A1 (en) Surface defect identification method and apparatus
CN108562589B (en) Method for detecting surface defects of magnetic circuit material
Sammons et al. Segmenting delaminations in carbon fiber reinforced polymer composite CT using convolutional neural networks
CN106709421B (en) Cell image identification and classification method based on transform domain features and CNN
Bong et al. Vision-based inspection system for leather surface defect detection and classification
CN109241867B (en) Method and device for recognizing digital rock core image by adopting artificial intelligence algorithm
CN110827260A (en) Cloth defect classification method based on LBP (local binary pattern) features and convolutional neural network
CN111369526B (en) Multi-type old bridge crack identification method based on semi-supervised deep learning
CN111476794A (en) UNET-based cervical pathological tissue segmentation method
CN113673482B (en) Cell antinuclear antibody fluorescence recognition method and system based on dynamic label distribution
CN113516619B (en) Product surface flaw identification method based on image processing technology
CN116012291A (en) Industrial part image defect detection method and system, electronic equipment and storage medium
TW202202831A (en) A computer implemented process to enhance edge defect detection and other defects in ophthalmic lenses
CN111879972A (en) Workpiece surface defect detection method and system based on SSD network model
CN112200789B (en) Image recognition method and device, electronic equipment and storage medium
CN112381140B (en) Abrasive particle image machine learning identification method based on new characteristic parameters
CN112581483A (en) Self-learning-based plant leaf vein segmentation method and device
CN112966786A (en) Automatic marking method for convolutional neural network training data
CN116433978A (en) Automatic generation and automatic labeling method and device for high-quality flaw image
CN110889858A (en) Automobile part segmentation method and device based on point regression
CN110930369A (en) Pathological section identification method based on group equal variation neural network and conditional probability field
CN110264463A (en) A kind of material counting method based on matlab image procossing
CN115082416A (en) Lens flaw detection method, device, equipment and storage medium
Mathiyalagan et al. Image fusion using convolutional neural network with bilateral filtering

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Li Jingya

Inventor after: Wang Dongjie

Inventor after: Fan Hao

Inventor before: Li Jingya

Inventor before: Wang Dongjie

Inventor before: Guo Zhipeng

Inventor before: Fan Hao

CB03 Change of inventor or designer information
RJ01 Rejection of invention patent application after publication

Application publication date: 20210615

RJ01 Rejection of invention patent application after publication