CN111860631B - Article identification method adopting error factor reinforcement mode to optimize loss function - Google Patents
- Publication number
- CN111860631B CN111860631B CN202010669159.8A CN202010669159A CN111860631B CN 111860631 B CN111860631 B CN 111860631B CN 202010669159 A CN202010669159 A CN 202010669159A CN 111860631 B CN111860631 B CN 111860631B
- Authority
- CN
- China
- Prior art keywords
- loss function
- training
- model
- correlation
- cross entropy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 33
- 230000002787 reinforcement Effects 0.000 title claims description 6
- 238000012549 training Methods 0.000 claims abstract description 88
- 238000004364 calculation method Methods 0.000 claims abstract description 29
- 230000008569 process Effects 0.000 claims abstract description 14
- 238000012360 testing method Methods 0.000 claims abstract description 9
- 230000007246 mechanism Effects 0.000 claims description 6
- 238000012544 monitoring process Methods 0.000 claims description 3
- 238000012935 Averaging Methods 0.000 claims 1
- 230000006870 function Effects 0.000 abstract description 59
- 238000013135 deep learning Methods 0.000 abstract description 3
- 238000013528 artificial neural network Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000005314 correlation function Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000005728 strengthening Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Algebra (AREA)
- Evolutionary Biology (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an article identification method that optimizes the loss function in an error factor reinforcement mode. The optimized loss function is named coross and is realized by adding a penalty term on the basis of the original cross entropy loss function, wherein the penalty term comprises the following three modules: the penalty-degree adjusting factor T, which adjusts how strongly the correlation influences the cross entropy loss function, and whose value can be set according to the actual situation during model training; the correlation between the classes of the dataset, obtained by testing the outputs of all article categories with a preliminary model and calculating with an information entropy formula; and the probability of the related categories, i.e. the probability of identifying the target article as one of its related article classes during training, which is not a constant value and is dynamically adjusted according to each training situation of the model. By adding the penalty, the accuracy of the model in identifying articles is improved, and the recognition accuracy of the deep learning network model can be improved.
Description
Technical Field
The invention relates to a method for optimizing a loss function, in particular to an article identification method that optimizes the loss function in an error factor reinforcement mode.
Background
When identifying an article's type, the article is often misidentified as an article with similar appearance and characteristics, making the identification result inaccurate. Existing loss functions do not consider the influence of the similarity between articles on model precision, so the model is difficult to improve after learning to a certain degree.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an article identification method with high identification accuracy that optimizes the loss function in an error factor reinforcement mode.
The technical scheme adopted for solving the technical problems is as follows:
the object identification method adopting error-based reinforcement mode to optimize the loss function is realized by adding a penalty term based on the original cross entropy loss function, wherein the penalty term comprises the following three modules:
the penalty-degree adjusting factor T, which is used to adjust how strongly the correlation influences the cross entropy loss function; when T=0 the penalty term has no influence on the cross entropy loss function and coross is then the cross entropy loss function, and the value of T can be set according to the actual situation during model training;
the correlation between the classes of the dataset, obtained by testing the outputs of all article categories with a preliminary model and calculating with an information entropy formula; and the probability of the related categories, i.e. the probability of identifying the target article as one of its related article classes during training, which is not constant and is dynamically adjusted according to each training situation of the model.
The method comprises the following steps:
step S1, performing preliminary training and obtaining the correlation of each category;
step S2, dynamically adding penalty terms according to the identification result;
step S3, constructing a new loss function;
step S4, setting an overflow mechanism;
step S5, training with coross.
The specific steps of step S1 are as follows: the model is preliminarily trained with a cross entropy loss function, and the preliminarily trained model is used to test the related items of each category and the correlation between the categories.
Step S2 comprises the following specific steps: a penalty term is dynamically added according to the identification result. The identification result of each picture is monitored, and the model's output for the related items is added, in the form of a probability score, to the calculation of the loss function as part of the penalty term. Meanwhile, an overflow mechanism protects the loss function calculation during training: once overflow occurs, the original cross entropy loss function is used instead.
The specific steps of step S3 are as follows: on the basis of the original cross entropy loss function, after a preliminary model has been trained with it, the related items of all categories, i.e. the classification error factors, are tested and introduced into the calculation of the loss function during formal training to construct a new loss function. The specific formula of coross is as follows:
wherein i refers to the correctly classified article category, j refers to an article category correlated with i, and d refers to the number of related article categories; T is an adjusting factor that controls the penalty degree in the loss calculation; the correlation term expresses the relevance between similar article categories by information entropy, where a larger information entropy indicates a larger correlation and a higher probability of a model identification error; and the probability term is the output probability of the related category during training;
wherein the specific formula of the correlation term is as follows:
wherein the average value of the outputs of the pictures of class i is used: the picture output values of each class are added position-wise and then averaged, to ensure that the output is at a normal level.
The value of d can be set according to the actual situation during model training; when d=0, other article categories have no influence on the identification of the target article's category, and coross is then the cross entropy loss function.
The specific steps of step S4 are as follows: while the penalty calculation does not overflow, the optimized loss function coross is used; once overflow occurs, the original cross entropy loss function is used for the calculation.
Step S5 includes the following training modes:
A. The first training mode is as follows: train twice. In the first training, a preliminary model is trained with the original cross entropy loss function, the correlation between the classes of the dataset is measured with the preliminary model, and the results are arranged into a correlation table. In the second training, formal training is performed with coross: the related items of each article category are added into coross in an indexed manner for calculation, and meanwhile, according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into coross for calculation;
B. The second training mode is as follows: train only once, for N epochs. The training comprises two stages. The first stage takes the model at epoch = int[kN], where 0 < k < 1, as the preliminary model; the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation.
The beneficial effects of the invention are as follows: by adding the penalty term, the accuracy of the model in identifying articles is improved, and the recognition accuracy of the deep learning network model can be improved.
Drawings
The invention will be further described with reference to the drawings and examples.
FIG. 1 is a flow chart of the steps of the present invention;
FIG. 2 is a flowchart of the steps of a first training mode;
fig. 3 is a flow chart of the steps of the second training mode.
Detailed Description
Referring to figs. 1 to 3, an article identification method that optimizes the loss function in an error factor reinforcement mode is disclosed. The optimized loss function is named coross and is realized by adding a penalty term on the basis of the original cross entropy loss function, wherein the penalty term comprises the following three modules:
the penalty-degree adjusting factor T, which is used to adjust how strongly the correlation influences the cross entropy loss function; when T=0 the penalty term has no influence on the cross entropy loss function and coross is then the cross entropy loss function, and the value of T can be set according to the actual situation during model training;
the correlation between the classes of the dataset, obtained by testing the outputs of all article categories with a preliminary model and calculating with an information entropy formula;
the probability of the related categories, i.e. the probability of identifying the target article as one of its related article classes during training, which is not constant and is dynamically adjusted according to each training situation of the model.
By adding the penalty, the accuracy of the model in identifying articles is improved, and the recognition accuracy of the deep learning network model can be improved.
The method comprises the following steps:
step S1, performing preliminary training and obtaining the correlation of each category;
step S2, dynamically adding penalty terms according to the identification result;
step S3, constructing a new loss function;
step S4, setting an overflow mechanism;
step S5, training with coross.
The specific steps of step S1 are as follows: the model is preliminarily trained with a cross entropy loss function, and the preliminarily trained model is used to test the related items of each category and the correlation between the categories.
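The correlation measurement in step S1 can be sketched as follows. The patent's information entropy formula is not reproduced in the text, so the entropy-style scoring below (each off-target probability p contributing -p*log(p)) and the function and parameter names are illustrative assumptions, not the patented formula itself:

```python
import math

def class_correlations(avg_outputs, top_d=3):
    """Build a correlation table from a preliminary model's averaged
    per-class outputs (step S1). Assumed scoring: each off-target
    probability p contributes its entropy term -p*log(p), so a class
    the model confuses more often scores as more correlated."""
    table = {}
    for i, probs in avg_outputs.items():
        scores = {}
        for j, p in probs.items():
            if j == i or p <= 0.0:
                continue  # skip the class itself and zero-probability classes
            scores[j] = -p * math.log(p)  # entropy term: larger means more related
        # keep only the d most related categories as the "related items"
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        table[i] = dict(ranked[:top_d])
    return table
```

Here `avg_outputs` would come from the preliminarily trained model: the position-wise average of the output scores of all test pictures of each class, as the description's averaging step specifies.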
Step S2 comprises the following specific steps: a penalty term is dynamically added according to the identification result. The identification result of each picture is monitored, and the model's output for the related items is added, in the form of a probability score, to the calculation of the loss function as part of the penalty term. Meanwhile, an overflow mechanism protects the loss function calculation during training: once overflow occurs, the original cross entropy loss function is used instead.
The specific steps of step S3 are as follows: on the basis of the original cross entropy loss function, after a preliminary model has been trained with it, the related items of all categories, i.e. the classification error factors, are tested and introduced into the calculation of the loss function during formal training to construct a new loss function. The specific formula of coross is as follows:
wherein i refers to the correctly classified article category, j refers to an article category correlated with i, and d refers to the number of related article categories; T is an adjusting factor that controls the penalty degree in the loss calculation; the correlation term expresses the relevance between similar article categories by information entropy, where a larger information entropy indicates a larger correlation and a higher probability of a model identification error; and the probability term is the output probability of the related category during training;
wherein the specific formula of the correlation term is as follows:
wherein the average value of the outputs of the pictures of class i is used: the picture output values of each class are added position-wise and then averaged, to ensure that the output is at a normal level.
The value of d can be set according to the actual situation during model training; when d=0, other article categories have no influence on the identification of the target article's category, and coross is then the cross entropy loss function.
The specific steps of step S4 are as follows: while the penalty calculation does not overflow, the optimized loss function coross is used; once overflow occurs, the original cross entropy loss function is used for the calculation.
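Steps S3 and S4 together can be sketched as a single loss routine. The patent's formula images are not reproduced here, so the penalty's exact form below (T times the sum of correlation times the current output probability of each related class) is an assumption consistent with the surrounding description, and the names `coross_loss` and `corr_table` are illustrative:

```python
import math

def coross_loss(probs, target, corr_table, T=1.0):
    """Hedged sketch of the 'coross' loss: original cross entropy on the
    target class plus an assumed penalty built from the correlation table
    and the current output probabilities of the related categories."""
    ce = -math.log(max(probs[target], 1e-12))  # original cross entropy term
    penalty = T * sum(a * probs.get(j, 0.0)
                      for j, a in corr_table.get(target, {}).items())
    loss = ce + penalty
    # overflow mechanism (step S4): once the calculation overflows,
    # fall back to the original cross entropy loss
    if not math.isfinite(loss):
        return ce
    return loss
```

Note that with T=0, or with an empty correlation table (the d=0 case), this reduces to the plain cross entropy loss, matching the text's statement that coross is then the cross entropy loss function.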
Step S5 includes the following training modes:
A. The first training mode is as follows: train twice. In the first training, a preliminary model is trained with the original cross entropy loss function, the correlation between the classes of the dataset is measured with the preliminary model, and the results are arranged into a correlation table. In the second training, formal training is performed with coross: the related items of each article category are added into coross in an indexed manner for calculation, and meanwhile, according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into coross for calculation;
B. The second training mode is as follows: train only once, for N epochs (one complete pass of the dataset through the neural network and back is called one epoch). The training comprises two stages. The first stage takes the model at epoch = int[kN], where 0 < k < 1, as the preliminary model; the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation.
As shown in fig. 2, the first training mode requires two trainings: the first is the preliminary training and the second is the formal training. The preliminary training uses the original cross entropy loss function and yields the related items between the categories and their correlations: a preliminary model is trained with the original cross entropy loss function, the preliminary model then measures the correlation between the classes of the dataset, and the measured correlations are arranged into a correlation table. The formal training uses coross to train the model: the related items of each article category are added into coross in an indexed manner for calculation, and at the same time, according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation. In other words, the correlation table is called to reconstruct the cross entropy loss function, the correlations are added to form the new loss function (coross), and the model is formally trained with coross.
As shown in fig. 3, the second training mode comprises two stages over N epochs (one complete pass of the dataset through the neural network and back is called one epoch). The first stage obtains the related items and correlations between the dataset classes: the model is preliminarily trained with the original cross entropy loss function, the model at epoch = int[kN], where 0 < k < 1, is taken as the preliminary model, and the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses softmax + coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding related items and correlations are looked up from the correlation table and added into the calculation of the loss function (the correlation table is called to reconstruct the cross entropy loss function, and the correlations are added into the coross calculation in an indexed manner).
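The single-run schedule of the second training mode can be sketched as follows; the function name `loss_schedule` is illustrative, and the two stage labels stand in for the actual loss computations:

```python
def loss_schedule(n_epochs, k=0.5):
    """Sketch of training mode B's single run: the original cross entropy
    loss is used up to epoch int(k*N), whose model serves as the
    preliminary model, and coross is used from epoch int(k*N)+1 onward
    (the breakpoint continued training)."""
    assert 0.0 < k < 1.0, "the method requires 0 < k < 1"
    switch = int(k * n_epochs)  # epoch at which the preliminary model is taken
    return ["cross_entropy" if e <= switch else "coross"
            for e in range(1, n_epochs + 1)]
```

A real training loop would, at the switch epoch, measure the inter-class correlations with the current model, build the correlation table, and then resume from the breakpoint with coross as the loss.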
The above embodiments do not limit the protection scope of the invention, and those skilled in the art can make equivalent modifications and variations without departing from the whole inventive concept, and they still fall within the scope of the invention.
Claims (1)
1. An article identification method that optimizes the loss function in an error factor reinforcement mode, characterized by comprising the following steps:
step S1, preliminary training is carried out, and the relevance of each category is obtained;
performing preliminary training on the model by adopting a loss function of cross entropy loss, wherein the model after the preliminary training is used for testing the correlation items of each category and the correlation among each category;
step S2, dynamically adding penalty items according to the identification result;
dynamically adding a penalty term according to the identification result: the identification result of each picture is monitored, and the model's output for the related items is added, in the form of a probability score, to the calculation of the loss function as part of the penalty term; meanwhile, an overflow mechanism protects the loss function calculation during training, and once overflow occurs, the original cross entropy loss function is used;
step S3, constructing a new loss function;
on the basis of the original cross entropy loss function, after a preliminary model has been trained with it, the related items of all categories, i.e. the classification error factors, are tested and introduced into the calculation of the loss function during formal training to construct a new loss function, obtaining the optimized loss function coross, whose specific formula is as follows:
;
wherein i refers to the correctly classified article category, j refers to an article category correlated with i, and d refers to the number of related article categories; T is an adjusting factor that controls the penalty degree in the loss calculation; the correlation term expresses the relevance between similar article categories by information entropy, where a larger information entropy indicates a larger correlation and a higher probability of a model identification error; and the probability term is the output probability of the related category during training;
wherein the specific formula of the correlation term is as follows:
;
wherein the average value of the outputs of the pictures of class i is used: the picture output values of each class are added position-wise and then averaged, to ensure that the output is at a normal level;
the value of d can be set according to the actual situation during model training; when d=0, other article categories have no influence on the identification of the target article's category, and the optimized loss function coross is then the cross entropy loss function;
step S4: setting an overflow mechanism;
while the penalty calculation does not overflow, the optimized loss function coross is used; once overflow occurs, the original cross entropy loss function is used for the calculation;
step S5: training by adopting an optimized loss function coross;
A. The first training mode is as follows: train twice. In the first training, a preliminary model is trained with the original cross entropy loss function, the correlation between the classes of the dataset is measured with the preliminary model, and the results are arranged into a correlation table. In the second training, formal training is performed with the optimized loss function coross: the related items of each article category are added into the optimized loss function coross in an indexed manner for calculation, and according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the optimized loss function coross;
B. The second training mode is as follows: train only once, for N epochs. The training comprises two stages. The first stage takes the model at epoch = int[kN], where 0 < k < 1, as the preliminary model; the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses the optimized loss function coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010669159.8A CN111860631B (en) | 2020-07-13 | 2020-07-13 | Article identification method adopting error factor reinforcement mode to optimize loss function |
PCT/CN2020/116176 WO2022011827A1 (en) | 2020-07-13 | 2020-09-18 | Method for optimizing loss function by means of error cause reinforcement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010669159.8A CN111860631B (en) | 2020-07-13 | 2020-07-13 | Article identification method adopting error factor reinforcement mode to optimize loss function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111860631A CN111860631A (en) | 2020-10-30 |
CN111860631B true CN111860631B (en) | 2023-08-22 |
Family
ID=72984762
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010669159.8A Active CN111860631B (en) | 2020-07-13 | 2020-07-13 | Article identification method adopting error factor reinforcement mode to optimize loss function |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111860631B (en) |
WO (1) | WO2022011827A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112580714B (en) * | 2020-12-15 | 2023-05-30 | 电子科技大学中山学院 | Article identification method for dynamically optimizing loss function in error-cause reinforcement mode |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9691395B1 (en) * | 2011-12-31 | 2017-06-27 | Reality Analytics, Inc. | System and method for taxonomically distinguishing unconstrained signal data segments |
CN108984539A (en) * | 2018-07-17 | 2018-12-11 | 苏州大学 | The neural machine translation method of translation information based on simulation future time instance |
CN110569338A (en) * | 2019-07-22 | 2019-12-13 | 中国科学院信息工程研究所 | Method for training decoder of generative dialogue system and decoding method |
CN111242245A (en) * | 2020-04-26 | 2020-06-05 | 杭州雄迈集成电路技术股份有限公司 | Design method of classification network model of multi-class center |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108344574B (en) * | 2018-04-28 | 2019-09-10 | 湖南科技大学 | A kind of Wind turbines Method for Bearing Fault Diagnosis based on depth joint adaptation network |
-
2020
- 2020-07-13 CN CN202010669159.8A patent/CN111860631B/en active Active
- 2020-09-18 WO PCT/CN2020/116176 patent/WO2022011827A1/en active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9691395B1 (en) * | 2011-12-31 | 2017-06-27 | Reality Analytics, Inc. | System and method for taxonomically distinguishing unconstrained signal data segments |
CN108984539A (en) * | 2018-07-17 | 2018-12-11 | 苏州大学 | The neural machine translation method of translation information based on simulation future time instance |
CN110569338A (en) * | 2019-07-22 | 2019-12-13 | 中国科学院信息工程研究所 | Method for training decoder of generative dialogue system and decoding method |
CN111242245A (en) * | 2020-04-26 | 2020-06-05 | 杭州雄迈集成电路技术股份有限公司 | Design method of classification network model of multi-class center |
Also Published As
Publication number | Publication date |
---|---|
WO2022011827A1 (en) | 2022-01-20 |
CN111860631A (en) | 2020-10-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110826618A (en) | Personal credit risk assessment method based on random forest | |
CN114281809B (en) | Multi-source heterogeneous data cleaning method and device | |
CN109359135B (en) | Time sequence similarity searching method based on segment weight | |
CN109816002B (en) | Single sparse self-encoder weak and small target detection method based on feature self-migration | |
CN112764809B (en) | SQL code plagiarism detection method and system based on coding characteristics | |
CN111860631B (en) | Article identification method adopting error factor reinforcement mode to optimize loss function | |
CN111338950A (en) | Software defect feature selection method based on spectral clustering | |
CN105701501A (en) | Trademark image identification method | |
CN110287302B (en) | Method and system for determining confidence of open source information in national defense science and technology field | |
CN111651477A (en) | Multi-source heterogeneous commodity consistency judging method and device based on semantic similarity | |
CN114611515B (en) | Method and system for identifying enterprise actual control person based on enterprise public opinion information | |
CN100555270C (en) | A kind of machine automatic testing method and system thereof | |
Wu et al. | Optimization and improvement based on K-Means Cluster algorithm | |
CN101145166A (en) | Syllable drive based transliterated entity name computer automatic identification method | |
CN112580714B (en) | Article identification method for dynamically optimizing loss function in error-cause reinforcement mode | |
CN111833856A (en) | Voice key information calibration method based on deep learning | |
Zhe et al. | An algorithm of detection duplicate information based on segment | |
CN117523324B (en) | Image processing method and image sample classification method, device and storage medium | |
CN110033025A (en) | A kind of construction method of strong classifier in AdaBoost algorithm | |
CN117725437B (en) | Machine learning-based data accurate matching analysis method | |
CN111325097B (en) | Enhanced single-stage decoupled time sequence action positioning method | |
CN107273915B (en) | A kind of target classification identification method that local message is merged with global information | |
CN113139106B (en) | Event auditing method and device for security check | |
CN107908654A (en) | A kind of recommendation method, system and device in knowledge based storehouse | |
CN111209743A (en) | Improved HIDFWL feature extraction method based on information entropy and word length information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |