CN113379037B - Partial multi-mark learning method based on complementary mark cooperative training - Google Patents

Partial multi-mark learning method based on complementary mark cooperative training

Info

Publication number
CN113379037B
CN113379037B (application CN202110717550.5A)
Authority
CN
China
Prior art keywords
mark
neural network
candidate
loss
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110717550.5A
Other languages
Chinese (zh)
Other versions
CN113379037A (en)
Inventor
张珍茹 (Zhang Zhenru)
张敏灵 (Zhang Min-Ling)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202110717550.5A priority Critical patent/CN113379037B/en
Publication of CN113379037A publication Critical patent/CN113379037A/en
Application granted granted Critical
Publication of CN113379037B publication Critical patent/CN113379037B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a partial multi-label learning method based on co-training with complementary labels, which addresses the problem that noisy labels exist in the training data of multi-label scenarios and improves the performance of the classification model. The method co-trains two neural networks: one network learns only from the candidate label set, while the other learns only from the non-candidate label set, i.e., the complementary labels. Specifically, within each batch of data, each network selects the samples with small loss to guide the parameter update of the other network, and finally the outputs of the two networks are combined by weight to assign confidences to the labels of the batch. By learning from the candidate labels and the non-candidate labels separately and considering the two networks jointly, the influence of noise in the candidate labels on model performance is reduced, yielding a robust classification model in the partial multi-label learning setting.

Description

Partial multi-mark learning method based on complementary mark cooperative training
Technical Field
The invention belongs to the technical field of computer application, and particularly relates to a partial multi-mark learning method based on complementary mark cooperative training.
Background
Partial multi-label learning is a novel weakly supervised learning framework in which each training sample is associated with a set of candidate labels, several of which are true labels while the remainder are false (noisy) labels; it differs from partial label learning in the number of true labels, since more than one candidate label may be true.
Current approaches to partial multi-label learning typically estimate the confidence that each candidate label is a true label. For example, the PML-fp and PML-lc methods initialize the label confidences, add them as weights to a ranking loss, and obtain the label relevance ranking by minimizing that loss with alternating optimization; however, if a candidate label's confidence is estimated incorrectly, the error propagates through the alternating optimization and degrades the model. The fPML method uses low-rank matrix approximation to exploit implicit dependencies between labels and features, identifying noisy labels and training a multi-label classifier. The PML-LRS method applies low-rank and sparse decomposition to split the candidate label matrix into a ground-truth label matrix and an irrelevant label matrix, thereby reducing the influence of noisy labels. Another family of methods adopts a two-stage strategy that decomposes the task into label-confidence estimation and predictive-model induction: the first stage estimates the confidence of each candidate label through iterative label propagation and selects high-confidence credible labels, and the second stage trains a multi-label model on those credible labels, so the label selection of the first stage strongly affects the subsequent model.
These methods rely on traditional machine learning and cannot be scaled to large datasets, which greatly limits their extension to new domains and iterative development; moreover, existing methods depend on the label confidences of a single model, so any estimation error accumulates over the iterations and degrades the performance of subsequent models; in addition, label noise in multi-label learning generally leads to reduced model performance.
Disclosure of Invention
In order to solve the above problems, the invention discloses a partial multi-label learning method based on co-training with complementary labels. Two neural networks are trained collaboratively so that the model gradually updates the confidences of the candidate labels during iteration, paying more attention to the true labels, reducing the influence of noisy labels, and improving the performance of the learned model.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
a partial multi-mark learning method based on complementary mark cooperative training comprises the following specific steps:
(1) The partial multi-mark training set is used as input;
(2) The neural network f calculates candidate marker loss, and selects a plurality of samples with the minimum loss as knowledge to be provided to the neural network g;
(3) The neural network g calculates the compensation mark loss, and selects a plurality of samples with the minimum loss as knowledge to be provided to the neural network f;
(4) Calculating the loss between the output of the sample selected by the neural network g in the neural network f and the target mark to update the parameters of the neural network f;
(5) Calculating the loss between the output of the sample selected by the neural network f in the neural network g and the target mark to update the parameters of the neural network g;
(6) Combining the outputs of the neural network f and the neural network g as a label confidence;
(7) Iteratively optimizing a neural network;
(8) Performing multi-mark prediction on the test data according to a threshold value;
(9) Multi-mark index evaluation;
(10) Submitting the result to manual sampling review;
(11) And (5) sending a training set, and iterating the process.
Further, step (1) prepares the partial multi-label training data, specifically as follows:
Training data are obtained from any multi-label application scenario, such as images, audio, or text: D = {(x_i, S_i) | 1 ≤ i ≤ m}. Let X = R^d denote the d-dimensional feature space and Y = {y_1, y_2, ..., y_q} denote the label space containing q labels; x_i ∈ X is a d-dimensional feature vector, S_i ⊆ Y is the candidate label set of x_i, and the complementary label set of x_i is its complement Y \ S_i. The true label set Ỹ_i is hidden in the candidate label set, i.e., Ỹ_i ⊆ S_i. The model f: X → 2^Y will be learned from D.
Further, step (2) computes the candidate-label loss.
Since the true labels are not known, the candidate labels are used directly to guide the learning of neural network f, with a confidence-weighted cross-entropy loss in which s_k denotes the confidence of candidate label y_k and p_k denotes the probability, computed with a softmax activation, that the model assigns the sample to class y_k. For candidate labels, network f prefers the output probability p_k to be as high as possible. After computing the loss for a batch of samples, the samples are sorted by loss from small to large and the subset with small loss, denoted D̂_f, is selected; the selected proportion is controlled by a schedule R(T), where T is the current epoch of the neural network, T_max is the preset maximum number of epochs, and η is the learning rate. Because noisy labels exist among the candidate labels, network f gradually overfits them as the number of epochs grows; at the same time, neural networks exhibit a memorization effect and learn simple, clean patterns first. The selected proportion R(T) therefore decreases as the epoch increases, i.e., the knowledge learned early by the model is more trustworthy.
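A plausible form of this loss and of the selection schedule, consistent with the symbols defined above (the exact expression of R(T) in particular is an assumption rather than the literal formula of the patent), is:

```latex
% Confidence-weighted cross-entropy over candidate labels (plausible reconstruction)
\ell_f(x_i) = -\sum_{k=1}^{q} s_k \log p_k ,
\qquad p_k = \mathrm{softmax}_k\bigl(f(x_i)\bigr)

% Assumed small-loss selection schedule, decreasing with the epoch T
R(T) = 1 - \eta \cdot \min\!\left(\frac{T}{T_{\max}},\, 1\right)
```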
Further, step (3) computes the complementary-label loss.
Neural network g learns only from the complementary labels. In its loss, s̄_k indicates whether y_k is a complementary label, taking the value 1 if it is and 0 otherwise. For complementary labels, network g prefers the output probability p_k to be as close to 0 as possible. Similarly, after computing the loss for a batch of samples, network g sorts them from small to large and selects the proportion R(T) with the smallest loss, denoted D̂_g.
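A plausible form of this loss, consistent with the description above (the output probability on complementary labels should be close to 0), is the following; it is an assumption rather than the literal formula of the patent:

```latex
% Plausible complementary-label loss of network g
\ell_g(x_i) = -\sum_{k=1}^{q} \bar{s}_k \log\bigl(1 - p_k\bigr),
\qquad
\bar{s}_k = \begin{cases} 1, & y_k \notin S_i \\ 0, & y_k \in S_i \end{cases}
```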
Further, steps (4) and (5) update the neural network parameters.
The samples D̂_f selected by neural network f help neural network g update its parameters, and likewise the samples D̂_g selected by neural network g help neural network f update its parameters; that is, each network computes the loss within its own network on the samples that the peer network considers reliable and uses it to adjust its weights.
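A plausible form of the resulting cross-update objectives, under the assumption that each network averages its own loss over the small-loss subset chosen by its peer, is:

```latex
% Plausible cross-update losses of the two networks
\mathcal{L}_f = \frac{1}{|\hat{D}_g|}\sum_{x_i \in \hat{D}_g} \ell_f(x_i),
\qquad
\mathcal{L}_g = \frac{1}{|\hat{D}_f|}\sum_{x_i \in \hat{D}_f} \ell_g(x_i)
```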
further, step (6) updates the label confidence
Sample candidate marker setThe initial confidence level for each marker in the pool is 1 and will vary with model training. Specifically, the output results of the neural networks f, g on each batch of samples are combined by weight and normalized to be the new label confidence. For each sample x in a batch i The label confidence update formula is as follows:
where α is a balance parameter controlling the proportion of information inherited from the neural networks f, g, respectively. In addition, the method only updates the confidence coefficient of the candidate mark, and the supplementary mark explicitly indicates that the sample does not belong to the sample, and the confidence coefficient is always 0.
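A plausible form of this update, assuming the weighted outputs are restricted to the candidate labels and renormalized over them, is:

```latex
% Plausible confidence update for candidate labels of sample x_i
s_k \leftarrow \frac{\bigl(\alpha\, p^{f}_k + (1-\alpha)\, p^{g}_k\bigr)\,\mathbb{1}[y_k \in S_i]}
{\sum_{j:\, y_j \in S_i} \bigl(\alpha\, p^{f}_j + (1-\alpha)\, p^{g}_j\bigr)}
```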
Further, step (8) performs multi-label prediction.
After model training is completed, multi-label prediction is performed on a test instance x_i* by thresholding the combined outputs of the two networks.
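A plausible form of the prediction rule, where the threshold τ is an assumed parameter, is:

```latex
% Plausible thresholded prediction for a test instance x_i^{*}
\hat{Y}_i = \bigl\{\, y_k \;\big|\; \alpha\, p^{f}_k(x_i^{*}) + (1-\alpha)\, p^{g}_k(x_i^{*}) \ge \tau \,\bigr\}
```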
The beneficial effects of the invention are as follows:
the partial multi-mark learning method based on the complementary mark collaborative training aims at the problems that the multi-mark description is carried out in training data under a machine learning scene and the mark space contains noise, and improves the performance of a learning model. The method uses two neural networks simultaneously for collaborative training, wherein one network only learns from a candidate mark set, and the other network only learns from a non-candidate mark set, namely a complementary mark, and the two networks mutually learn and update parameters in each batch. In addition, the method considers the output of the two models, so that the models can gradually update the confidence coefficient of the candidate mark in iteration, thereby giving higher attention to the real mark and reducing the influence of the noise mark.
Drawings
FIG. 1 is a system framework diagram of the present invention.
Detailed Description
The present invention is further illustrated in the following drawings and detailed description, which are to be understood as being merely illustrative of the invention and not limiting the scope of the invention.
As shown in the figure, the partial multi-label learning method based on co-training with complementary labels uses two neural networks that guide each other's learning. One neural network learns from the candidate label sets in the training data as its guiding information, but because noisy labels exist in the candidate sets, the model tends to overfit the noise and its performance degrades. The method therefore employs another neural network that learns from the non-candidate label sets, i.e., the complementary labels; the information provided by the complementary labels is unambiguous and can guide the learning of the other model. The label confidences are then updated, where a larger confidence indicates a higher likelihood that the label is a true label. The specific steps are as follows:
(1) Take the partial multi-label training set as input. Training data are obtained from any multi-label application scenario, such as images, audio, or text: D = {(x_i, S_i) | 1 ≤ i ≤ m}. Let X = R^d denote the d-dimensional feature space and Y = {y_1, y_2, ..., y_q} denote the label space containing q labels; x_i ∈ X is a d-dimensional feature vector, S_i ⊆ Y is the candidate label set of x_i, and the complementary label set of x_i is its complement Y \ S_i. The true label set Ỹ_i is hidden in the candidate label set, i.e., Ỹ_i ⊆ S_i. The model f: X → 2^Y will be learned from D.
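As a minimal illustration (not taken from the patent), a batch of such partial multi-label data can be represented with a feature matrix, a binary candidate-label mask whose complement gives the complementary labels, and per-label confidences initialized to 1 on the candidates; the dimensions below are hypothetical:

```python
import numpy as np

m, d, q = 3, 4, 5                                  # samples, feature dim, number of labels
X = np.random.randn(m, d).astype(np.float32)       # feature vectors x_i
Y_cand = np.array([[1, 1, 0, 1, 0],                # candidate label sets S_i (1 = candidate)
                   [0, 1, 1, 0, 0],
                   [1, 0, 0, 1, 1]], dtype=np.float32)
Y_comp = 1.0 - Y_cand                               # complementary (non-candidate) label sets
conf = Y_cand.copy()                                # initial confidences: 1 on candidates, 0 elsewhere

print(X.shape, Y_cand.sum(axis=1), Y_comp.sum(axis=1))
```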
(2) Neural network f computes the candidate-label loss and selects the samples with the smallest loss as knowledge to provide to neural network g.
Since the true labels are not known, the candidate labels are used directly to guide the learning of neural network f, with a confidence-weighted cross-entropy loss in which s_k denotes the confidence of candidate label y_k and p_k denotes the probability, computed with a softmax activation, that the model assigns the sample to class y_k. For candidate labels, network f prefers the output probability p_k to be as high as possible. After computing the loss for a batch of samples, the samples are sorted by loss from small to large and the subset with small loss, denoted D̂_f, is selected; the selected proportion is controlled by the schedule R(T), where T is the current epoch of the neural network, T_max is the preset maximum number of epochs, and η is the learning rate. Because noisy labels exist among the candidate labels, network f gradually overfits them as the number of epochs grows; at the same time, neural networks exhibit a memorization effect and learn simple, clean patterns first. The selected proportion R(T) therefore decreases as the epoch increases, i.e., the knowledge learned early by the model is more trustworthy.
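A hedged PyTorch sketch of this step is given below: a confidence-weighted cross-entropy over softmax outputs and a small-loss selection ratio that decays with the epoch. The linear decay and the role of `eta` are assumptions, since the exact schedule is not reproduced in this text:

```python
import torch
import torch.nn.functional as F

def candidate_loss(logits, conf):
    """Per-sample loss  -sum_k s_k * log p_k  with p = softmax(logits)."""
    log_p = F.log_softmax(logits, dim=1)
    return -(conf * log_p).sum(dim=1)                  # shape: (batch,)

def select_ratio(T, T_max, eta=0.5):
    """Assumed form of R(T): starts at 1 and decays linearly to 1 - eta."""
    return 1.0 - eta * min(T / T_max, 1.0)

# small-loss selection within a batch (toy tensors)
logits = torch.randn(8, 5)                             # outputs of network f
conf = torch.rand(8, 5)                                # candidate-label confidences s_k
losses = candidate_loss(logits, conf)
k = max(1, int(select_ratio(T=3, T_max=10) * len(losses)))
small_loss_idx = torch.argsort(losses)[:k]             # samples passed on to network g
```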
(3) Neural network g computes the complementary-label loss and selects the samples with the smallest loss as knowledge to provide to neural network f.
Neural network g learns only from the complementary labels. In its loss, s̄_k indicates whether y_k is a complementary label, taking the value 1 if it is and 0 otherwise. For complementary labels, network g prefers the output probability p_k to be as close to 0 as possible. Similarly, after computing the loss for a batch of samples, network g sorts them from small to large and selects the proportion R(T) with the smallest loss, denoted D̂_g.
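A hedged sketch of this complementary-label loss is shown below; the form -Σ_k s̄_k·log(1 − p_k) is an assumption consistent with the statement that p_k should be close to 0 on complementary labels, not the patent's literal formula:

```python
import torch
import torch.nn.functional as F

def complementary_loss(logits, comp_mask):
    """Per-sample loss penalising probability mass on complementary labels."""
    p = F.softmax(logits, dim=1)
    return -(comp_mask * torch.log(1.0 - p + 1e-12)).sum(dim=1)   # shape: (batch,)
```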
(4) Compute the loss between the outputs of neural network f on the samples selected by neural network g and the target labels, and update the parameters of neural network f.
(5) Compute the loss between the outputs of neural network g on the samples selected by neural network f and the target labels, and update the parameters of neural network g.
The samples D̂_f selected by neural network f help neural network g update its parameters, and likewise the samples D̂_g selected by neural network g help neural network f update its parameters; that is, each network computes the loss within its own network on the samples that the peer network considers reliable and uses it to adjust its weights.
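A hedged sketch of one co-training step follows, reusing the candidate_loss, complementary_loss and select_ratio helpers sketched above; the way the small-loss indices are exchanged between the two optimizers is an illustrative assumption about how the update can be organised:

```python
import torch

def cotrain_step(f, g, opt_f, opt_g, x, conf, comp_mask, T, T_max):
    # per-sample losses of each network on the whole batch
    loss_f = candidate_loss(f(x), conf)
    loss_g = complementary_loss(g(x), comp_mask)

    # each network selects its small-loss samples as "knowledge" for the peer
    k = max(1, int(select_ratio(T, T_max) * x.size(0)))
    idx_f = torch.argsort(loss_f)[:k]          # trusted by f -> used to train g
    idx_g = torch.argsort(loss_g)[:k]          # trusted by g -> used to train f

    # update f on the samples selected by g
    opt_f.zero_grad()
    candidate_loss(f(x[idx_g]), conf[idx_g]).mean().backward()
    opt_f.step()

    # update g on the samples selected by f
    opt_g.zero_grad()
    complementary_loss(g(x[idx_f]), comp_mask[idx_f]).mean().backward()
    opt_g.step()
```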
(6) Combine the outputs of neural network f and neural network g as the label confidences.
Each label in the candidate label set of a sample has an initial confidence of 1, which changes as the model is trained. Specifically, the outputs of neural networks f and g on each batch of samples are combined by weight and normalized to give the new label confidences. For each sample x_i in a batch, the confidences are updated with a balance parameter α that controls the proportion of information inherited from networks f and g respectively. In addition, the method only updates the confidences of candidate labels; a complementary label explicitly indicates that the label does not belong to the sample, so its confidence remains 0.
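A hedged sketch of the confidence update: combine the two softmax outputs with weight α, zero out non-candidate positions, and renormalise over the candidates. The exact combination rule is an assumption consistent with the description:

```python
import torch
import torch.nn.functional as F

def update_confidence(f_logits, g_logits, cand_mask, alpha=0.5):
    p_f = F.softmax(f_logits, dim=1)
    p_g = F.softmax(g_logits, dim=1)
    mixed = (alpha * p_f + (1.0 - alpha) * p_g) * cand_mask   # complementary labels stay 0
    return mixed / mixed.sum(dim=1, keepdim=True).clamp_min(1e-12)
```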
(7) Iteratively optimize the neural networks.
(8) Perform multi-label prediction on the test data according to a threshold.
After model training is completed, multi-label prediction is performed on a test instance x_i* by thresholding the combined outputs of the two networks.
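A hedged sketch of the prediction step, where the combination weight and the threshold value are assumed parameters rather than values fixed by the patent:

```python
import torch
import torch.nn.functional as F

def predict(f, g, x_test, alpha=0.5, threshold=0.5):
    with torch.no_grad():
        p = alpha * F.softmax(f(x_test), dim=1) + (1 - alpha) * F.softmax(g(x_test), dim=1)
    return (p >= threshold).int()          # 1 = label predicted relevant for the test instance
```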
(9) Evaluate with multi-label metrics.
(10) Submit the results for manual sampling review.
(11) Feed back into the training set and iterate the above process.
The invention provides a partial multi-label learning scheme that can learn directly from partial multi-label data, improving the performance of the classification model and reducing the cost of manually screening and cleaning the data.
In addition, the loss computation for the candidate labels and the complementary labels is based mainly on the cross-entropy loss function, but other loss formulations are possible. For the collaborative-training scheme, a divergence-based variant can also be considered: if the two models do not disagree on the prediction for a sample, the sample is given a higher weight; if they disagree strongly, the sample's weight is reduced so that it is learned later, or a third model is introduced to act as a "referee". Such modifications are intended to fall within the scope of the present invention.
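A hedged sketch of the agreement-based weighting just mentioned; the specific rule mapping disagreement to a weight is an assumption:

```python
import torch
import torch.nn.functional as F

def agreement_weights(f_logits, g_logits):
    """Larger weight for samples on which the two networks agree."""
    p_f = F.softmax(f_logits, dim=1)
    p_g = F.softmax(g_logits, dim=1)
    divergence = (p_f - p_g).abs().sum(dim=1)      # per-sample L1 disagreement
    return 1.0 / (1.0 + divergence)                 # weight in (0, 1], 1 when identical
```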

Claims (1)

1. A partial multi-label learning method based on co-training with complementary labels, characterized in that the method comprises the following specific steps:
(1) A partial multi-label training set is used as input, the training set being training data of images, audio, or text;
specifically:
training data are obtained from any partial multi-label application scenario: D = {(x_i, S_i) | 1 ≤ i ≤ m}; let X = R^d denote the d-dimensional feature space and Y = {y_1, y_2, ..., y_q} denote the label space containing q labels; x_i ∈ X is a d-dimensional feature vector, S_i ⊆ Y is the candidate label set of x_i, and the complementary label set of x_i is Y \ S_i; the true label set Ỹ_i is hidden in the candidate label set, i.e., Ỹ_i ⊆ S_i; the model f: X → 2^Y is learned from D;
(2) Neural network f computes the candidate-label loss and selects the samples with the smallest loss as knowledge to provide to neural network g;
specifically:
since the true labels are not known, the candidate labels are used directly to guide the learning of neural network f, with a confidence-weighted cross-entropy loss in which s_k denotes the confidence of candidate label y_k and p_k denotes the probability, computed with a softmax activation, that the model assigns the sample to class y_k; for candidate labels, network f prefers the output probability p_k to be as high as possible; after computing the loss for a batch of samples, the samples are sorted by loss from small to large and the subset with small loss, denoted D̂_f, is selected; the selected proportion is controlled by the schedule R(T), where T is the current epoch of the neural network, T_max is the preset maximum number of epochs, and η is the learning rate; because noisy labels exist among the candidate labels, network f gradually overfits them as the number of epochs grows; at the same time, neural networks exhibit a memorization effect and learn simple, clean patterns first; the selected proportion R(T) therefore decreases as the epoch increases, i.e., the knowledge learned early by the model is more trustworthy;
(3) Neural network g computes the complementary-label loss and selects the samples with the smallest loss as knowledge to provide to neural network f; the complementary labels are the set of non-candidate labels;
specifically:
neural network g learns only from the complementary labels; in its loss, s̄_k indicates whether y_k is a complementary label, taking the value 1 if it is and 0 otherwise; for complementary labels, network g prefers the output probability p_k to be as close to 0 as possible; similarly, after computing the loss for a batch of samples, network g sorts them from small to large and selects the proportion R(T) with the smallest loss, denoted D̂_g;
(4) Compute the loss between the outputs of neural network f on the samples selected by neural network g and the target labels, and update the parameters of neural network f;
the samples D̂_f selected by neural network f help neural network g update its parameters; the loss is computed within the own network on the samples that the peer network considers reliable and is used to adjust the parameter weights;
(5) Compute the loss between the outputs of neural network g on the samples selected by neural network f and the target labels, and update the parameters of neural network g;
the samples D̂_g selected by neural network g likewise help neural network f update its parameters; the loss is computed within the own network on the samples that the peer network considers reliable and is used to adjust the parameter weights;
(6) Combine the outputs of neural network f and neural network g as the label confidences;
specifically:
each label in the candidate label set of a sample has an initial confidence of 1, which changes as the model is trained; specifically, the outputs of neural networks f and g on each batch of samples are combined by weight and normalized to give the new label confidences; for each sample x_i in a batch, the confidences are updated with a balance parameter α that controls the proportion of information inherited from networks f and g respectively; in addition, the method only updates the confidences of candidate labels, and a complementary label explicitly indicates that the label does not belong to the sample, so its confidence remains 0;
(7) Iteratively optimize the neural networks;
(8) Perform multi-label prediction on the test data according to a threshold;
specifically:
after model training is completed, multi-label prediction is performed on a test instance x_i* by thresholding the combined outputs of the two networks;
(9) Evaluate with multi-label metrics;
(10) Submit the results for manual sampling review;
(11) Feed back into the training set and iterate the above process.
CN202110717550.5A 2021-06-28 2021-06-28 Partial multi-mark learning method based on complementary mark cooperative training Active CN113379037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110717550.5A CN113379037B (en) 2021-06-28 2021-06-28 Partial multi-mark learning method based on complementary mark cooperative training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110717550.5A CN113379037B (en) 2021-06-28 2021-06-28 Partial multi-mark learning method based on complementary mark cooperative training

Publications (2)

Publication Number Publication Date
CN113379037A CN113379037A (en) 2021-09-10
CN113379037B (en) 2023-11-10

Family

ID=77579543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110717550.5A Active CN113379037B (en) 2021-06-28 2021-06-28 Partial multi-mark learning method based on complementary mark cooperative training

Country Status (1)

Country Link
CN (1) CN113379037B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115331065B (en) * 2022-10-13 2023-03-24 南京航空航天大学 Robust noise multi-label image learning method based on decoder iterative screening

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580496A (en) * 2019-07-11 2019-12-17 南京邮电大学 Deep migration learning system and method based on entropy minimization
CN111581468A (en) * 2020-05-15 2020-08-25 北京交通大学 Multi-label learning method based on noise tolerance
CN111582506A (en) * 2020-05-15 2020-08-25 北京交通大学 Multi-label learning method based on global and local label relation
CN111581466A (en) * 2020-05-15 2020-08-25 北京交通大学 Multi-label learning method for characteristic information with noise
CN112465016A (en) * 2020-11-25 2021-03-09 上海海事大学 Partial multi-mark learning method based on optimal distance between two adjacent marks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11663280B2 (en) * 2019-10-15 2023-05-30 Home Depot Product Authority, Llc Search engine using joint learning for multi-label classification

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580496A (en) * 2019-07-11 2019-12-17 南京邮电大学 Deep migration learning system and method based on entropy minimization
CN111581468A (en) * 2020-05-15 2020-08-25 北京交通大学 Multi-label learning method based on noise tolerance
CN111582506A (en) * 2020-05-15 2020-08-25 北京交通大学 Multi-label learning method based on global and local label relation
CN111581466A (en) * 2020-05-15 2020-08-25 北京交通大学 Multi-label learning method for characteristic information with noise
CN112465016A (en) * 2020-11-25 2021-03-09 上海海事大学 Partial multi-mark learning method based on optimal distance between two adjacent marks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Addressing Imbalance in Multi-Label Classification Using Weighted Cross Entropy Loss Function; Mohammad Reza Rezaei-Dastjerdehei et al.; 2020 27th National and 5th International Iranian Conference on Biomedical Engineering (ICBME); pp. 333-338 *
A multi-label classification algorithm using association rule mining; Liu Junyu et al.; Journal of Software; pp. 2865-2878 *

Also Published As

Publication number Publication date
CN113379037A (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CN113255822B (en) Double knowledge distillation method for image retrieval
CN111079847B (en) Remote sensing image automatic labeling method based on deep learning
CN112001422B (en) Image mark estimation method based on deep Bayesian learning
US20230134531A1 (en) Method and system for rapid retrieval of target images based on artificial intelligence
CN113269239B (en) Relation network node classification method based on multichannel convolutional neural network
CN111325264A (en) Multi-label data classification method based on entropy
CN109214407A (en) Event detection model, calculates equipment and storage medium at method, apparatus
CN108596204B (en) Improved SCDAE-based semi-supervised modulation mode classification model method
CN113379037B (en) Partial multi-mark learning method based on complementary mark cooperative training
CN113392967A (en) Training method of domain confrontation neural network
CN111797935B (en) Semi-supervised depth network picture classification method based on group intelligence
CN117350330A (en) Semi-supervised entity alignment method based on hybrid teaching
CN114972959B (en) Remote sensing image retrieval method for sample generation and in-class sequencing loss in deep learning
CN112348108A (en) Sample labeling method based on crowdsourcing mode
CN107247996A (en) A kind of Active Learning Method applied to different distributed data environment
CN114495114B (en) Text sequence recognition model calibration method based on CTC decoder
CN116031879A (en) Hybrid intelligent feature selection method suitable for transient voltage stability evaluation of power system
CN111783788B (en) Multi-label classification method facing label noise
CN114997175A (en) Emotion analysis method based on field confrontation training
CN114595695A (en) Self-training model construction method for few-sample intention recognition system
CN114170461A (en) Teacher-student framework image classification method containing noise labels based on feature space reorganization
CN114220086A (en) Cost-efficient scene character detection method and system
CN114154582A (en) Deep reinforcement learning method based on environment dynamic decomposition model
CN112270334A (en) Few-sample image classification method and system based on abnormal point exposure
CN112161621B (en) Model-free auxiliary navigation adaptive area selection method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant