CN108416382B - Web image training convolutional neural network method based on iterative sampling and one-to-many label correction - Google Patents

Info

Publication number: CN108416382B
Authority: CN (China)
Prior art keywords: web, training, model, label, data
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201810171017.1A
Other languages: Chinese (zh)
Other versions: CN108416382A (en)
Inventors: 杨巨峰, 程明明, 孙晓晓, 王恺
Current Assignee: Nankai University (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Nankai University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for training a convolutional neural network on Web images based on iterative sampling and one-to-many label correction. The method addresses the shortage of training data when deep learning is applied to computer vision tasks by gradually adding Web images to the training set. It can serve as an auxiliary processing step for a variety of computer vision tasks. Its key property is that, as training proceeds, the updated model predicts the label confidence of Web images more accurately; comparing these predictions with the Web labels then allows high-quality data to be added to the training set, which reduces the noise in the Web data and steadily improves the performance of the convolutional neural network. Because Web image content is complex and diverse, the method also adopts a one-to-many label correction strategy during the iterations to reduce the influence of hard labels on model training. Following these steps, the whole model is trained iteratively until the performance of the network stabilizes.

Description

Web image training convolutional neural network method based on iterative sampling and one-to-many label correction
Technical Field
The invention belongs to the technical field of computer vision and relates to a method for training a convolutional neural network with Web images, in particular to a method for training a convolutional neural network with Web images based on iterative sampling and one-to-many label correction.
Background
With the development of deep learning, many typical problems such as object detection, object recognition, tracking, and salient region detection have benefited. Training a convolutional neural network, however, requires a large amount of data, and manual labeling is time-consuming and laborious; for problems that require professional knowledge it is even harder to carry out. In recent years, learning from Web images has attracted growing attention as one way to address the shortage of data for training convolutional neural networks: it aims to train the network with images that are easy to obtain from the Web. Training convolutional neural networks with Web data faces two challenges: i) the data on the Web contain noise; ii) the distribution of the Web data differs from that of the standard data. If the label of a selected Web image does not accurately reflect its content, the training of the model suffers. Moreover, because Web images have complex content, their data distribution differs considerably from that of the target data set.
Some work on exploiting Web data has already been proposed. In 2016, Jonathan Krause et al., in the paper "The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition" (ECCV 2016, vol. 9907, pp. 301-320, Springer), showed that adding Web data to the training of a convolutional neural network can improve the model. To counter the influence of noisy Web data, Phong D. Vo et al., in work published in Computer Vision and Image Understanding in 2017, proposed removing noisy data based on the predictions of a reference model and then using the filtered Web data to assist the training of the convolutional neural network, which improves the classification performance of the model. In addition, Tong Xiao et al. proposed a method based on a probabilistic graphical model in "Learning from Massive Noisy Labeled Data for Image Classification" (CVPR 2015, pp. 2691-2699); by modeling the relationships among the Web image, the noisy label, and the true label, the model gains a degree of robustness to noisy data. These efforts, however, overlook one problem: the model judges what counts as noise on the basis of the standard data, and because the distribution of the standard data differs greatly from that of the Web images, a large amount of Web data can be deleted by mistake. If only data consistent with the standard data are kept, the model can hardly learn anything new, and adding Web data loses its value. Sampling of the Web data is therefore crucial.
In recent years, iterative learning strategies have been applied to many machine learning tasks in areas such as data mining, pattern recognition, and computer vision. Knowledge is naturally acquired by moving from simple to complex; in 2009, Y. Bengio et al. proposed "Curriculum Learning" (ICML 2009, pp. 41-48) to describe this way of learning and introduced it into machine learning. In 2010, M. P. Kumar et al., in "Self-Paced Learning for Latent Variable Models" (NIPS 2010, pp. 1189-1197), embedded the easy-versus-hard assessment of curriculum learning into the learning objective and called the result self-paced learning, that is, learning iteratively from easy samples to hard ones. This approach is widely used and matches human cognitive habits. For example, in 2017 Fan Ma et al. proposed a selection and optimization process based on self-paced learning to improve the traditional co-training algorithm, and Dong et al., in "Few-shot object detection" (arXiv:1706.08249), designed a new model based on the idea of iterative learning that improves object detection. Inspired by these efforts, we view each round of progressive iterative learning as an optimization of the model weights after new data have been added. Our method was designed on the basis of this analysis.
These recent results in the field inspired this work and provide a solid technical foundation for a method of training a convolutional neural network on Web images based on iterative sampling and one-to-many label correction.
Disclosure of Invention
The technical problem to be solved by the invention is how to train a convolutional neural network with Web images: how to judge the labels of Web pictures accurately, reduce the influence of noisy data, and use Web data to train the model more efficiently.
To achieve this, the invention adopts the following technical scheme:
a. a user inputs a Web image data set, and features are extracted with a reference model to obtain the label confidences of each image;
b. according to the currently predicted confidences, one-to-many correction and sampling are performed on the labels of the Web image data;
c. training continues with the Web images sampled above to obtain an updated model, and the steps are repeated until the performance of the model stabilizes or the sampled data essentially stop changing.
To perform the one-to-many correction in step b, a reference model is first trained on a standard data set, and the output of its softmax layer is then extracted as the confidence of each predicted label of a Web image.
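As a rough illustration of how softmax outputs can serve as label confidences, the sketch below turns a vector of raw class scores into probabilities and ranks the labels from high to low confidence. The function names `softmax` and `ranked_labels` and the example scores are assumptions made for illustration; the patent itself obtains these values from the softmax layer of the trained network.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ranked_labels(logits):
    """Return (label_index, confidence) pairs sorted from high to low confidence."""
    probs = softmax(logits)
    return sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)

# Hypothetical scores for a 4-class problem.
ranking = ranked_labels([2.0, 0.5, 1.8, -1.0])
```

In this example labels 0 and 2 receive close confidences, which is the situation the one-to-many strategy is meant to handle.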
The beneficial effects of the invention are as follows. The method transfers easily to any convolutional neural network model and suits tasks that lack data and must learn from Web images. To apply it to a different model, only the network structure and the training batch size (chosen according to the model and the available GPU memory) need to be changed. The performance of the finally trained model is greatly improved compared with a model trained only on standard data, and is also clearly improved compared with other methods that use Web images. Overall, the method provides a new scheme for training convolutional neural networks with Web images, and it is expected to apply well to many other computer vision classification tasks, helping them train convolutional neural networks more fully and obtain better models.
Drawings
FIG. 1 is a flowchart of a method for training a convolutional neural network based on iterative sampling and one-to-many label correction of a Web image.
FIG. 2 is a schematic diagram of a method for training a convolutional neural network based on iterative sampling and one-to-many label correction of a Web image.
Detailed Description
The invention is described in further detail below with reference to the figures and the detailed description.
Fig. 1 shows a flowchart of the method for training a convolutional neural network on Web images with iterative sampling and one-to-many label correction. The steps shown in the figure are as follows:
a. Train the neural network using the given labeled data set as reference data to obtain a reference model. Network training typically uses the Caffe framework running on an NVIDIA graphics card, whose high degree of parallelism allows training to finish quickly. In this step, the convolutional neural network model trained on the standard data set serves as the reference model, which ensures that the distribution of the sampled data does not change greatly, keeps the effect of distribution differences on training as small as possible, and draws the model closer to the target task.
b. Extract the softmax-layer features of each Web image to obtain its label confidences. Then rank the predicted labels of the Web image, compare them with its Web label, and correct the labels and sample the Web data using a one-to-many label correction strategy. Compared with traditional one-to-one label correction, this strategy makes full use of the richness and diversity of Web image content. Specifically, the confidences of the predicted labels of a Web image are obtained from the reference convolutional neural network, and the labels are ranked by confidence from high to low. When correcting labels and sampling data, the top-ranked predicted label is first compared with the Web label of the image; if they agree, the sample is adopted. If they do not agree, the top two labels are considered: provided that the confidences of the first and second labels differ only slightly, the second predicted label is compared with the Web label, and if both conditions hold, the image is sampled twice, once with each of the two labels. The procedure continues in the same way up to the top four labels: if the confidences are close and the Web label is among the predictions, the image is labeled with each of these labels and sampled once per label;
c. Add all sampled samples to the training data set and continue training to obtain a new model; then, using the new model as the reference model, perform a new round of data sampling on the whole Web image data set, and train iteratively until the model performance stabilizes or the data essentially stop changing.
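The one-to-many correction in step b can be sketched as below. This is one possible reading under stated assumptions, not the patented implementation: the text does not give a numeric threshold for "close" confidences, so `max_gap` and its default value are hypothetical, and closeness is measured here as the gap between the top-ranked confidence and the k-th confidence. The function name `one_to_many_sample` is likewise illustrative.

```python
def one_to_many_sample(ranking, web_label, max_k=4, max_gap=0.1):
    """Decide which labels to attach to one Web image.

    ranking:   list of (label, confidence) pairs sorted from high to low.
    web_label: the label the image carried on the Web.
    Returns the labels to sample the image with; an empty list means
    the image is not sampled in this round.
    """
    top_conf = ranking[0][1]
    for k in range(1, min(max_k, len(ranking)) + 1):
        label_k, conf_k = ranking[k - 1]
        # Assumption: "close" means within max_gap of the top confidence.
        if top_conf - conf_k > max_gap:
            break  # later labels are even further away, so stop here
        if label_k == web_label:
            # The top-k labels are close and include the Web label:
            # sample the image once with each of them.
            return [label for label, _ in ranking[:k]]
    return []

# Top prediction agrees with the Web label: one sample.
single = one_to_many_sample([(3, 0.70), (1, 0.20), (0, 0.10)], web_label=3)
# Second prediction agrees and is close to the first: two samples.
double = one_to_many_sample([(3, 0.45), (1, 0.40), (0, 0.15)], web_label=1)
# Second prediction agrees but is far from the first: discarded.
dropped = one_to_many_sample([(3, 0.70), (1, 0.20), (0, 0.10)], web_label=1)
```

A larger `max_gap` admits more images and more labels per image at the cost of more label noise, so in practice it would need tuning for the task at hand.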
In the invention, training a convolutional neural network on Web images alternates between iteratively training the convolutional neural network and sampling the Web images, yielding a Web image data set with less noise and a convolutional neural network with improved performance. The two steps reinforce each other and are optimized together, so that Web data are used efficiently.
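The alternation between training and sampling can be outlined as a simple loop. The sketch below is a hedged outline only: `train`, `predict`, and `select` are placeholder callables supplied by the caller (for example a Caffe training routine, a softmax ranking, and a label correction rule), and the stopping rule, which halts when the sampled set is identical to that of the previous round or after `max_rounds`, is a simplification of "until the performance stabilizes or the data essentially stop changing".

```python
def iterative_training(standard_data, web_data, train, predict, select, max_rounds=10):
    """Alternate between sampling Web images and retraining the model.

    standard_data: list of (image, label) pairs from the labeled data set.
    web_data:      list of (image, web_label) pairs collected from the Web.
    train(dataset)             -> model                (placeholder)
    predict(model, image)      -> ranked (label, confidence) pairs (placeholder)
    select(ranking, web_label) -> labels to sample the image with (placeholder)
    """
    model = train(standard_data)  # initial reference model
    previous = None
    for _ in range(max_rounds):
        # Re-sample the whole Web data set with the current reference model.
        sampled = [(image, label)
                   for image, web_label in web_data
                   for label in select(predict(model, image), web_label)]
        if sampled == previous:
            break  # assumption: an unchanged sample set counts as "stable"
        # Add the sampled Web data to the standard data and retrain.
        model = train(standard_data + sampled)
        previous = sampled
    return model, previous or []
```

Passing the three steps in as functions keeps the loop independent of any particular network or framework, which mirrors the claim that only the model structure and batch size need to change between tasks.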
Fig. 2 shows a schematic diagram of the method, which visually describes the core problem of each stage of the algorithm, the training process, and the system inputs and outputs. Fig. 2 conveys the same content as Fig. 1 at a different level of abstraction and mainly serves to help in understanding the parts of Fig. 1.

Claims (1)

1. A method for training a convolutional neural network on Web images based on iterative sampling and one-to-many label correction, characterized in that a Web image data set with less noise and a convolutional neural network with improved performance are obtained by iteratively training the convolutional neural network and sampling the Web images, the two steps reinforcing each other and being optimized together so that the Web data are used efficiently to train the neural network, the method specifically comprising the following steps:
a. training a convolutional neural network model on a standard data set as the initial reference model; using the convolutional neural network model trained on the standard data set as the reference model ensures that the distribution of the sampled data does not change greatly, keeps the effect of data distribution differences on the training process as small as possible, and draws the model closer to the target task during the iterations;
b. inputting a Web image into the reference model and extracting the features of the softmax layer as the confidences of the predicted labels of the Web image;
c. ranking the predicted labels according to the label prediction confidences of the Web image obtained in step b, then comparing the predicted labels with the Web label of the image, and performing label correction and sampling on the Web data using a one-to-many label correction strategy;
d. adding the sampled Web data obtained in step c to the standard training data set and continuing training to obtain a new model, then repeating steps b, c and d with the new model as the reference model until the performance of the trained model and the sampled data set both stabilize;
wherein ranking the predicted labels of the Web image, comparing them with the Web label of the Web image, and then performing label correction and sampling on the Web data using the one-to-many label correction strategy specifically comprises: obtaining the confidences of the predicted labels of the Web image from the reference convolutional neural network and ranking the labels by confidence from high to low; when correcting labels and sampling data, first comparing the top-ranked predicted label with the Web label of the Web image and, if they agree, adopting the sample; if they do not agree, considering the top two labels, such that, provided the confidences of the first and second labels differ only slightly, the second predicted label is compared with the Web label of the Web image and, if both conditions hold, the image is sampled twice, once with each of the two labels; continuing in the same way up to the top four labels; if none of the conditions is met, the image is not sampled; and if the confidences are close and the Web label is among the predictions, labeling the sample with each of these labels and sampling it once per label.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810171017.1A CN108416382B (en) 2018-03-01 2018-03-01 Web image training convolutional neural network method based on iterative sampling and one-to-many label correction

Publications (2)

Publication Number Publication Date
CN108416382A CN108416382A (en) 2018-08-17
CN108416382B CN108416382B (en) 2022-04-19

Family

ID=63129702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810171017.1A Active CN108416382B (en) 2018-03-01 2018-03-01 Web image training convolutional neural network method based on iterative sampling and one-to-many label correction

Country Status (1)

Country Link
CN (1) CN108416382B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345515B (en) * 2018-09-17 2021-08-17 代黎明 Sample label confidence coefficient calculation method, device and equipment and model training method
CN110060247B (en) * 2019-04-18 2022-11-25 深圳市深视创新科技有限公司 Robust deep neural network learning method for dealing with sample labeling errors
CN110135480A (en) * 2019-04-30 2019-08-16 南开大学 A kind of network data learning method for eliminating deviation based on unsupervised object detection
CN110110780B (en) * 2019-04-30 2023-04-07 南开大学 Image classification method based on antagonistic neural network and massive noise data
CN110991321B (en) * 2019-11-29 2023-05-02 北京航空航天大学 Video pedestrian re-identification method based on tag correction and weighting feature fusion
CN112488073A (en) * 2020-12-21 2021-03-12 苏州科达特种视讯有限公司 Target detection method, system, device and storage medium
CN113342799B (en) * 2021-08-09 2021-12-21 明品云(北京)数据科技有限公司 Data correction method and system
CN113688949B (en) * 2021-10-25 2022-02-15 南京码极客科技有限公司 Network image data set denoising method based on dual-network joint label correction
CN116012569B (en) * 2023-03-24 2023-08-15 广东工业大学 Multi-label image recognition method based on deep learning and under noisy data

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102193946A (en) * 2010-03-18 2011-09-21 株式会社理光 Method and system for adding tags into media file
CN103208008A (en) * 2013-03-21 2013-07-17 北京工业大学 Fast adaptation method for traffic video monitoring target detection based on machine vision
CN105046210A (en) * 2015-07-01 2015-11-11 西安理工大学 Multi-label diagnostic method for train air conditioning faults
CN106529564A (en) * 2016-09-26 2017-03-22 浙江工业大学 Food image automatic classification method based on convolutional neural networks
CN107316049A (en) * 2017-05-05 2017-11-03 华南理工大学 A kind of transfer learning sorting technique based on semi-supervised self-training
CN107563406A (en) * 2017-07-21 2018-01-09 浙江工业大学 A kind of image sophisticated category method of autonomous learning
CN107644235A (en) * 2017-10-24 2018-01-30 广西师范大学 Image automatic annotation method based on semi-supervised learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8896715B2 (en) * 2010-02-11 2014-11-25 Microsoft Corporation Generic platform video image stabilization

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
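Attend in groups: a weakly-supervised deep learning framework for learning from web data; Bohan Zhuang et al.; arXiv; 20161201; pp. 1-11 *
Querying Discriminative and Representative Samples for Batch Mode Active Learning; Zheng Wang et al.; ACM; 20131231; pp. 158-166 *
SSD: Single Shot MultiBox Detector; Wei Liu et al.; arXiv; 20161230; pp. 1-17 *
Research on multi-label image classification methods based on active learning; 焦阳; China Master's Theses Full-text Database, Information Science and Technology; 20160215; Vol. 2016 (No. 02); pp. I138-1456 *
Image annotation method using convolutional neural networks based on multi-label learning; 高耀东 et al.; Journal of Computer Applications; 20170110; Vol. 37 (No. 1); pp. 228-232 *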

Also Published As

Publication number Publication date
CN108416382A (en) 2018-08-17

Similar Documents

Publication Publication Date Title
CN108416382B (en) Web image training convolutional neural network method based on iterative sampling and one-to-many label correction
CN109741332B (en) Man-machine cooperative image segmentation and annotation method
CN113011427B (en) Remote sensing image semantic segmentation method based on self-supervision contrast learning
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
WO2022037233A1 (en) Small sample visual target identification method based on self-supervised knowledge transfer
CN108399428B (en) Triple loss function design method based on trace ratio criterion
Goodfellow et al. Multi-digit number recognition from street view imagery using deep convolutional neural networks
CN110796026A (en) Pedestrian re-identification method based on global feature stitching
CN108288014A (en) Intelligent road extracting method and device, extraction model construction method and hybrid navigation system
CN111079847B (en) Remote sensing image automatic labeling method based on deep learning
CN103793926B (en) Method for tracking target based on sample reselection procedure
CN113688665B (en) Remote sensing image target detection method and system based on semi-supervised iterative learning
CN113111716B (en) Remote sensing image semiautomatic labeling method and device based on deep learning
CN104484347B (en) A kind of stratification Visual Feature Retrieval Process method based on geography information
CN110969681A (en) Method for generating handwriting characters based on GAN network
CN114067233B (en) Cross-mode matching method and system
CN110347853A (en) A kind of image hash code generation method based on Recognition with Recurrent Neural Network
CN116386148B (en) Knowledge graph guide-based small sample action recognition method and system
CN111275646B (en) Edge-preserving image smoothing method based on deep learning knowledge distillation technology
CN110489348B (en) Software functional defect mining method based on migration learning
CN110442736B (en) Semantic enhancer spatial cross-media retrieval method based on secondary discriminant analysis
CN113313178B (en) Cross-domain image example level active labeling method
CN115830322A (en) Building semantic segmentation label expansion method based on weak supervision network
CN115331065A (en) Robust noise multi-label image learning method based on decoder iterative screening
CN112906763A (en) Digital image automatic labeling method utilizing cross-task information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant