CN113269226B - Picture selection labeling method based on local and global information - Google Patents

Picture selection labeling method based on local and global information

Info

Publication number
CN113269226B
Authority
CN
China
Prior art keywords
picture
model
objects
information
budget
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110399472.9A
Other languages
Chinese (zh)
Other versions
CN113269226A (en)
Inventor
王魏
李文韬
陈攀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202110399472.9A
Publication of CN113269226A
Application granted
Publication of CN113269226B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Abstract

The invention discloses a picture selection and labeling method based on local and global information, which lets the learning model automatically select part of the pictures to be labeled, so that a model that is as good as possible can be learned from as few labeled pictures as possible. To reduce the demand for picture labels, the method uses the feature extraction capability of a deep model to construct a feature representation space for the picture samples, and measures each sample's effect on the model update from its local information in this space. Meanwhile, the picture data space is divided into different regions based on the global information of the feature representation space, and the labeling budget is dynamically allocated according to the model's performance on these regions, so that the picture label information is used efficiently and the demand for picture labels is reduced.

Description

Picture selection and annotation method based on local and global information
Technical Field
The invention relates to a picture selection and labeling method based on local and global information, which uses the local and global information of a feature representation space to efficiently select the objects to be labeled in a picture database, so that a better picture classification model can be trained at a lower labeling cost. The invention belongs to the technical field of computer artificial-intelligence data analysis.
Background
With the continuous development of the internet, a large amount of picture data needs to be processed, such as face pictures in face recognition, road pictures in autonomous driving, and commodity pictures on e-commerce platforms. Because picture data have a complex structure, picture classification tasks are usually handled with deep models, but training a deep model requires a large number of labeled pictures, and labeling them consumes considerable manpower and material resources. To reduce the labeling cost and improve the utilization of labeled pictures, one solution is to let the model automatically select important pictures to be labeled and collect their labels for updating the model; this is the basic idea of selective labeling. Current selection methods mainly consider the uncertainty and the representativeness of the data. The lower the model's confidence in its prediction on a sample, the higher that sample's uncertainty; the norm of the sample's gradient can also be used to estimate uncertainty. Since uncertainty-based approaches consider only the uncertainty of individual samples, the model easily picks a batch of data that is highly uncertain but redundant. Considering the representativeness of the data alleviates this problem to some extent: a typical representativeness-based approach clusters the features of the data and selects the center point of each cluster as the representative of that cluster, so the distribution of the whole dataset can be described with only a small amount of data. However, because no information about the model guides this selection, the selected data do not necessarily help update the model.
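For concreteness, the two baseline criteria described above can be sketched as follows. This is an illustrative sketch only, not the invention's method; the function names and the specific choices of predictive entropy for uncertainty and nearest-to-center samples for representativeness are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def uncertainty_scores(probs: np.ndarray) -> np.ndarray:
    """Predictive entropy of each softmax row; higher means less confident."""
    eps = 1e-12
    return -(probs * np.log(probs + eps)).sum(axis=1)

def representative_indices(feats: np.ndarray, k: int) -> np.ndarray:
    """Cluster the pool and return the index of the sample nearest each center."""
    km = KMeans(n_clusters=k, n_init=10).fit(feats)
    picks = []
    for j in range(k):
        members = np.where(km.labels_ == j)[0]
        dists = np.linalg.norm(feats[members] - km.cluster_centers_[j], axis=1)
        picks.append(members[np.argmin(dists)])
    return np.asarray(picks)
```

The uncertainty scorer looks at each sample in isolation, which is exactly why a top-scoring batch can be redundant, while the representativeness picker ignores the model entirely; the invention combines the two views.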
Disclosure of Invention
The invention aims to address the problems and deficiencies in the prior art by providing a picture selection and labeling method based on local and global information. The method uses the local information of the picture feature representation space, combined with the model's predictions, to measure the information content of each picture, which avoids selecting similar or redundant pictures to a certain extent. Meanwhile, using the global information of the feature representation space, the picture data are divided into several clusters and the labeling budget is dynamically allocated according to the model's performance on the different clusters, further improving the utilization of picture labels and reducing the labeling cost. With the same number of labeled pictures, a model trained by this method performs better than one trained by a generic selection labeling method.
The technical scheme is as follows: a picture selection labeling method based on local and global information comprises the following contents:
First, the user creates a picture object library. A portion of the picture objects is then randomly selected from the library and their labels are obtained to form an initial training set. The user sets the structure of the deep model, the number of picture objects selected in each round, and the total number of iteration rounds.
Next, the deep learning model is trained on the training set and used to convert the picture objects in the library into feature representations, i.e., to extract the features of the pictures in the library. The output of the penultimate layer of the deep model is typically used as the feature representation of the corresponding picture object; the space composed of these feature representations is called the feature representation space.
Then, in the feature representation space, the information content of each object is estimated with the local information calculation method, and the labeling budget is allocated with the global information budget allocation method. Based on this budget, a batch of picture objects with high information content is selected and their labels are collected. The labeled and unlabeled picture object sets are updated; meanwhile, the deep model is retrained on the labeled picture object set, and the feature representations are re-extracted with the new model. These steps are iterated for the specified number of rounds, and the model of the last round is the final deep model.
Finally, in the prediction stage, the user inputs the picture object to be tested into the trained deep model, and the deep model returns the prediction result to the user.
Advantageous effects: compared with the prior art, the method combines the local and global information of the feature representation space. Considering the local information of each picture object avoids selecting redundant pictures, while allocating the budget on demand through the global information of the feature representation space improves the utilization of picture labels and reduces the labeling cost.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of a local information computation method in the present invention;
FIG. 3 is a flowchart of a global information budget allocation method according to the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are purely exemplary and are not intended to limit the scope of the invention; various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure, and these fall within the scope of the appended claims.
As shown in fig. 1, the method for selecting and labeling a picture based on local and global information includes the following steps:
Step 100: establish a picture object library as the data set, randomly select a small number of objects from the library, and obtain their labels to form an initial training set. The number of categories of data in the picture object library is denoted as C; L denotes the set of labeled picture objects, and U denotes the set of unlabeled picture objects;
Step 101: the user selects the deep model to be used, denoted f(·; Θ), where Θ = (W, θ) are the parameters of the model, W being the fully connected layer parameters and θ the other parameters of the model; the user also selects the number of samples B selected in each round and the total number of iteration rounds T;
Step 102: train the deep model with the labeled picture objects L, with the current round number t = 1;
Step 103: input the unlabeled picture objects into the deep model and extract each picture object's feature representation r_θ(x) and softmax-layer output f(x; Θ) from the model;
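As a minimal illustration of this step, the sketch below extracts penultimate-layer features and softmax outputs with PyTorch. The split of the network into a `backbone` module (everything up to the penultimate layer) and an `fc` module (the final fully connected layer), like all names here, is an assumption of the sketch rather than the patent's specification.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def extract_features_and_probs(backbone, fc, loader, device="cpu"):
    """Return r_theta(x) (penultimate-layer output) and f(x; Theta)
    (softmax output) for every unlabeled picture object in `loader`."""
    backbone.eval(); fc.eval()
    feats, probs = [], []
    for x in loader:
        x = x.to(device)
        r = backbone(x).flatten(1)      # feature representation r_theta(x)
        p = F.softmax(fc(r), dim=1)     # softmax prediction f(x; Theta)
        feats.append(r.cpu())
        probs.append(p.cpu())
    return torch.cat(feats), torch.cat(probs)
```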
Step 104: estimate the amount of information each object provides to the model with the local information calculation method. As shown in FIG. 2, the specific steps are as follows:
Step 1041: the user selects the range ε of the local neighborhood;
Step 1042: for an unlabeled picture object x, the softmax layer outputs f(x; Θ) = (p_1, ..., p_C), where p_j is the predicted probability of class j, and the label predicted by the model f(x; Θ) is ŷ = argmax_j p_j. To increase robustness, probability smoothing is performed to obtain g(x; Θ) = (g(x; Θ)_1, ..., g(x; Θ)_C);
Step 1043: for unlabeled picture objects x and x' in U, compute the information amount I(x, x') based on the smoothed probabilities;
Step 1044: denote the neighborhood of picture object x as N(x) = {x' ∈ U : ||r_θ(x) − r_θ(x')|| ≤ ε}, where r_θ(x) is the feature representation of picture object x; the information content of picture object x is then obtained by aggregating I(x, x') over the neighbors x' ∈ N(x);
Step 1045: compute the information content I(x) for all unlabeled picture objects x ∈ U and output the results.
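The smoothing and information formulas appear only as equation images in the source, so the sketch below substitutes plausible stand-ins: uniform-mixture smoothing for step 1042, and a cross-entropy pairwise term averaged over the ε-neighborhood for steps 1043 and 1044. Treat it as one possible instantiation of the local information calculation, not the patent's exact formulas.

```python
import numpy as np

def smooth(probs: np.ndarray, beta: float = 0.1) -> np.ndarray:
    """Step 1042 (assumed form): mix each softmax vector with uniform."""
    C = probs.shape[1]
    return (1.0 - beta) * probs + beta / C

def local_information(feats: np.ndarray, probs: np.ndarray, eps: float) -> np.ndarray:
    """Steps 1043-1045: score each object by averaging a pairwise
    information term over its epsilon-neighborhood in feature space."""
    g = smooth(probs)
    pair_info = -g @ np.log(g).T                # I(x, x') as cross-entropy (assumed)
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=2)
    scores = np.empty(len(feats))
    for i in range(len(feats)):
        nbrs = np.where(dists[i] <= eps)[0]     # N(x): epsilon-ball, includes x itself
        scores[i] = pair_info[i, nbrs].mean()
    return scores
```

Averaging over the neighborhood is what discourages redundancy: two nearby unlabeled objects share most of their neighbors, so picking one already accounts for much of the other's score.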
Step 105: with the global information budget allocation method, cluster the unlabeled data into C clusters in the feature representation space and allocate budgets (B_1, ..., B_C) across the clusters, where B_j is the labeling budget assigned to the j-th cluster. As shown in FIG. 3, the specific steps are as follows:
Step 1051: the user selects the temperature parameter τ of the Gibbs distribution;
Step 1052: cluster the feature representations of the unlabeled picture objects into C clusters with the kmeans++ method; the picture objects in the j-th cluster form the set U_j;
Step 1053: estimate the performance of the model on the different clusters, denoting the model's performance on the j-th cluster as γ_j;
Step 1054: from the γ_j, construct a Gibbs distribution over budgets α = (α_1, ..., α_C), where Σ_j α_j = 1 and τ is the temperature parameter used to adjust the smoothness of the Gibbs distribution;
Step 1055: sample B times from the Gibbs distribution α to obtain the budget (B_1, ..., B_C) allocated to each cluster and output it, where Σ_j B_j = B is the total labeling budget;
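A sketch of steps 1051 to 1055 follows. Because the source gives these formulas only as equation images, two choices below are assumptions: the Gibbs weights take the usual Boltzmann form α_j ∝ exp(−γ_j/τ), so clusters where the model performs worse receive more budget, and mean top-class probability stands in for the per-cluster performance estimator γ_j.

```python
import numpy as np
from sklearn.cluster import KMeans

def allocate_budget(feats, probs, B, C, tau, rng=None):
    """Steps 1051-1055: kmeans++ clustering, per-cluster performance,
    Gibbs weights, and B multinomial draws -> (B_1, ..., B_C)."""
    rng = rng or np.random.default_rng()
    labels = KMeans(n_clusters=C, init="k-means++", n_init=10).fit_predict(feats)
    # gamma_j: assumed performance proxy = mean top-class probability in cluster j
    gamma = np.array([probs[labels == j].max(axis=1).mean() for j in range(C)])
    weights = np.exp(-gamma / tau)        # lower performance -> larger weight
    alpha = weights / weights.sum()       # Gibbs distribution, sums to 1
    budgets = rng.multinomial(B, alpha)   # sample B times; sum_j B_j == B
    return labels, budgets
```

A large τ flattens α toward a uniform split of the budget, while a small τ concentrates the budget on the worst-performing clusters, which matches the stated role of τ as a smoothing control.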
Step 106: in each cluster j ∈ [C], according to the corresponding budget B_j, select the B_j picture objects with the highest information content, obtain their labels, and add them to the labeled object set L; update L and U, and retrain the deep model;
Step 107: if t < T, set t = t + 1 and go to step 103;
Step 108: take the model obtained in the T-th round of training as the final model; for an object to be tested, output the label predicted by the model.
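Tying the pieces together, the loop of steps 102 to 108 might look like the sketch below, reusing the helpers sketched earlier. The `pool`, `oracle`, and `train_model` objects and all attribute names are hypothetical scaffolding, not the patent's API.

```python
import numpy as np

def select_and_label(pool, oracle, train_model, B, T, C, eps, tau):
    """One possible driver for the whole procedure (steps 102-108)."""
    labeled, unlabeled = pool.initial_split()          # step 100
    model = train_model(labeled)                       # step 102
    for t in range(1, T + 1):                          # step 107: iterate T rounds
        feats, probs = extract_features_and_probs(     # step 103
            model.backbone, model.fc, unlabeled.loader())
        feats, probs = feats.numpy(), probs.numpy()
        scores = local_information(feats, probs, eps)  # step 104
        labels, budgets = allocate_budget(feats, probs, B, C, tau)  # step 105
        for j, B_j in enumerate(budgets):              # step 106: top-B_j per cluster
            members = np.where(labels == j)[0]
            top = members[np.argsort(scores[members])[::-1][:B_j]]
            labeled.add(unlabeled.take(top), oracle.annotate(top))
        model = train_model(labeled)                   # retrain on the enlarged set
    return model                                       # step 108: final model
```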

Claims (1)

1. A picture selection labeling method based on local and global information is characterized by comprising the following contents:
firstly, establishing a picture object library; then randomly selecting a part of the picture objects from the picture object library, obtaining their labels, and forming an initial training set; setting the structure of a deep model, the number of picture objects selected in each round, and the total number of iteration rounds;
secondly, training the deep learning model on the training set; converting the picture objects in the picture object library into feature representations with the deep model, namely extracting the features of the pictures in the picture object library; the space composed of these feature representations is called the feature representation space;
then, in the feature representation space, estimating the information content of each object according to a local information calculation method, and allocating a labeling budget according to a global information budget allocation method; based on the budget, selecting a batch of picture objects with high information content and collecting their labels; updating the labeled picture object set and the unlabeled picture object set; meanwhile, retraining the deep model with the labeled picture object set and re-extracting the feature representations of the picture objects with the new model; iterating these steps for the specified number of rounds; the model of the last round being the final deep model;
finally, in the prediction stage, the user inputs the picture object to be tested into the trained deep model, and the deep model returns the prediction result to the user;
recording the number of categories of data in the picture object library as C; L represents the set of labeled picture objects, and U represents the set of unlabeled picture objects; the selected deep model is denoted as f(·; Θ), where Θ = (W, θ) are the parameters of the model, W being the fully connected layer parameters and θ the other parameters of the model; the user selects the number of samples B selected in each round and the total number of iteration rounds T; the deep model is trained with the labeled picture objects L, with the current round number t = 1; the unlabeled picture objects are input into the deep model, and each picture object's feature representation r_θ(x) and softmax-layer output f(x; Θ) are extracted from the model;
the object information content is calculated using probability smoothing and local information, with the following specific steps:
step 1041, selecting the range ε of the local neighborhood;
step 1042, for an unlabeled picture object x, the softmax layer outputs f(x; Θ) = (p_1, ..., p_C), where p_j is the predicted probability of class j, and the label predicted by the model f(x; Θ) is ŷ = argmax_j p_j; probability smoothing is performed to obtain g(x; Θ) = (g(x; Θ)_1, ..., g(x; Θ)_C);
step 1043, for unlabeled picture objects x and x' in U, computing the information amount I(x, x') based on the smoothed probabilities;
step 1044, recording the neighborhood of picture object x as N(x) = {x' ∈ U : ||r_θ(x) − r_θ(x')|| ≤ ε}, where r_θ(x) is the feature representation of picture object x, the information content of picture object x being obtained by aggregating I(x, x') over the neighbors x' ∈ N(x);
step 1045, computing the information content I(x) for all unlabeled picture objects x ∈ U and outputting the results;
the unlabeled data are clustered into C clusters in the feature representation space according to the global information budget allocation method, and budgets (B_1, ..., B_C) are allocated across the clusters, where B_j is the labeling budget assigned to the j-th cluster; the specific steps are as follows:
step 1051, selecting the temperature parameter τ of the Gibbs distribution by the user;
step 1052, clustering the feature representations of the unlabeled picture objects into C clusters with the kmeans++ method, the picture objects in the j-th cluster forming the set U_j;
step 1053, estimating the performance of the model on the different clusters, the model's performance on the j-th cluster being denoted γ_j;
step 1054, constructing from γ_j a Gibbs distribution over budgets α = (α_1, ..., α_C), where Σ_j α_j = 1 and τ is the temperature parameter used to adjust the smoothness of the Gibbs distribution;
step 1055, sampling B times from the Gibbs distribution α to obtain the budget (B_1, ..., B_C) allocated to each cluster and outputting it, where Σ_j B_j = B is the total labeling budget.
CN202110399472.9A (priority date 2021-04-14, filing date 2021-04-14) Picture selection labeling method based on local and global information. Active. Granted as CN113269226B (en).

Priority Applications (1)

Application Number: CN202110399472.9A · Priority/Filing Date: 2021-04-14 · Title: Picture selection labeling method based on local and global information (granted as CN113269226B (en))

Applications Claiming Priority (1)

Application Number: CN202110399472.9A · Priority/Filing Date: 2021-04-14 · Title: Picture selection labeling method based on local and global information (granted as CN113269226B (en))

Publications (2)

Publication Number Publication Date
CN113269226A CN113269226A (en) 2021-08-17
CN113269226B (en) 2022-09-23

Family

ID=77229077

Family Applications (1)

Application Number: CN202110399472.9A · Status: Active · Title: Picture selection labeling method based on local and global information (granted as CN113269226B (en))

Country Status (1)

Country Link
CN (1) CN113269226B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8452087B2 (en) * 2009-09-30 2013-05-28 Microsoft Corporation Image selection techniques
CN106934055B (en) * 2017-03-20 2020-05-19 南京大学 Semi-supervised webpage automatic classification method based on insufficient modal information
US11003892B2 (en) * 2018-11-09 2021-05-11 Sap Se Landmark-free face attribute prediction
CN111177384B (en) * 2019-12-25 2023-01-20 南京理工大学 Multi-mark Chinese emotion marking method based on global and local mark correlation
CN112434736A (en) * 2020-11-24 2021-03-02 成都潜在人工智能科技有限公司 Deep active learning text classification method based on pre-training model

Also Published As

Publication number Publication date
CN113269226A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
CN111191732B (en) Target detection method based on full-automatic learning
CN108985334B (en) General object detection system and method for improving active learning based on self-supervision process
CN110245709B (en) 3D point cloud data semantic segmentation method based on deep learning and self-attention
CN114067160B (en) Small sample remote sensing image scene classification method based on embedded smooth graph neural network
CN112347970B (en) Remote sensing image ground object identification method based on graph convolution neural network
CN109902761B (en) Fishing situation prediction method based on marine environment factor fusion and deep learning
CN113223042B (en) Intelligent acquisition method and equipment for remote sensing image deep learning sample
CN110738132B (en) Target detection quality blind evaluation method with discriminant perception capability
CN113111716B (en) Remote sensing image semiautomatic labeling method and device based on deep learning
CN116403058B (en) Remote sensing cross-scene multispectral laser radar point cloud classification method
CN115292532B (en) Remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning
CN115471739A (en) Cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning
CN114863091A (en) Target detection training method based on pseudo label
CN111239137B (en) Grain quality detection method based on transfer learning and adaptive deep convolution neural network
CN110245723A (en) A kind of safe and reliable image classification semi-supervised learning method and device
JP2009259109A (en) Device, program and method for labeling, and recording medium recording labeling program
CN114579794A (en) Multi-scale fusion landmark image retrieval method and system based on feature consistency suggestion
CN117572457A (en) Cross-scene multispectral point cloud classification method based on pseudo tag learning
CN112528058B (en) Fine-grained image classification method based on image attribute active learning
CN113869418A (en) Small sample ship target identification method based on global attention relationship network
CN113034511A (en) Rural building identification algorithm based on high-resolution remote sensing image and deep learning
CN113269226B (en) Picture selection labeling method based on local and global information
Sun et al. Automatic building age prediction from street view images
CN111783788B (en) Multi-label classification method facing label noise
CN116012840B (en) Three-dimensional point cloud semantic segmentation labeling method based on active learning and semi-supervision

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant