CN111860631B - Article identification method adopting error factor reinforcement mode to optimize loss function - Google Patents
- Publication number
- CN111860631B CN111860631B CN202010669159.8A CN202010669159A CN111860631B CN 111860631 B CN111860631 B CN 111860631B CN 202010669159 A CN202010669159 A CN 202010669159A CN 111860631 B CN111860631 B CN 111860631B
- Authority
- CN
- China
- Prior art keywords
- loss function
- training
- model
- correlation
- cross entropy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 33
- 230000002787 reinforcement Effects 0.000 title claims description 6
- 238000012549 training Methods 0.000 claims abstract description 88
- 238000004364 calculation method Methods 0.000 claims abstract description 29
- 230000008569 process Effects 0.000 claims abstract description 14
- 238000012360 testing method Methods 0.000 claims abstract description 9
- 230000007246 mechanism Effects 0.000 claims description 6
- 238000012544 monitoring process Methods 0.000 claims description 3
- 238000012935 Averaging Methods 0.000 claims 1
- 230000006870 function Effects 0.000 abstract description 59
- 238000013135 deep learning Methods 0.000 abstract description 3
- 238000013528 artificial neural network Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000005314 correlation function Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000005728 strengthening Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Algebra (AREA)
- Evolutionary Biology (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an article identification method that optimizes the loss function in an error factor reinforcement mode. The optimized loss function is named coross and is realized by adding a penalty term on the basis of the original cross entropy loss function, wherein the penalty term comprises the following three modules: the penalty-degree adjusting factor T, which adjusts how strongly the correlation influences the cross entropy loss function, and whose value can be set according to the actual situation during model training; the correlation between the classes of the dataset, obtained by testing the outputs of all article categories with a preliminary model and calculating with an information entropy formula; and the probability of the related categories, i.e. the probability of identifying the target article as one of its related article classes during training, which is not a constant value and is dynamically adjusted according to each training situation of the model. By adding the penalty, the accuracy of the model in identifying articles is improved, and the recognition accuracy of the deep learning network model can be improved.
Description
Technical Field
The invention relates to a method for optimizing a loss function, in particular to an article identification method that optimizes the loss function in an error factor reinforcement mode.
Background
When identifying an article's type, the article is often misidentified as an article with similar appearance and characteristics, making the identification result inaccurate. Existing loss functions do not consider the influence of the similarity between articles on model precision, so the model is difficult to improve after learning to a certain degree.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an article identification method with high identification accuracy that optimizes the loss function in an error factor reinforcement mode.
The technical scheme adopted for solving the technical problems is as follows:
the object identification method adopting error-based reinforcement mode to optimize the loss function is realized by adding a penalty term based on the original cross entropy loss function, wherein the penalty term comprises the following three modules:
the penalty-degree adjusting factor T, which is used to adjust how strongly the correlation influences the cross entropy loss function; when T=0 the penalty term has no influence on the cross entropy loss function and coross is then the cross entropy loss function, and the value of T can be set according to the actual situation during model training;
the correlation between the classes of the dataset, obtained by testing the outputs of all article categories with a preliminary model and calculating with an information entropy formula; and the probability of the related categories, i.e. the probability of identifying the target article as one of its related article classes during training, which is not constant and is dynamically adjusted according to each training situation of the model.
The method comprises the following steps:
step S1, performing preliminary training and obtaining the correlation of each category;
step S2, dynamically adding penalty terms according to the identification result;
step S3, constructing a new loss function;
step S4, setting an overflow mechanism;
step S5, training with coross.
The specific steps of step S1 are as follows: the model is preliminarily trained with a cross entropy loss function, and the preliminarily trained model is used to test the related items of each category and the correlation between the categories.
Step S2 comprises the following specific steps: a penalty term is dynamically added according to the identification result. The identification result of each picture is monitored, and the model's output for the related items is added, in the form of a probability score, to the calculation of the loss function as part of the penalty term. Meanwhile, an overflow mechanism protects the loss function calculation during training: once overflow occurs, the original cross entropy loss function is used instead.
The specific steps of step S3 are as follows: on the basis of the original cross entropy loss function, after a preliminary model has been trained with it, the related items of all categories, i.e. the classification error factors, are tested and introduced into the calculation of the loss function during formal training to construct a new loss function. The specific formula of coross is as follows:
wherein i refers to the correctly classified article category, j refers to an article category correlated with i, and d refers to the number of related article categories; T is an adjusting factor that controls the penalty degree in the loss calculation; the correlation term expresses the relevance between similar article categories by information entropy, where a larger information entropy indicates a larger correlation and a higher probability of a model identification error; and the probability term is the output probability of the related category during training;
wherein the specific formula of the correlation term is as follows:
wherein the average value of the outputs of the pictures of class i is used: the picture output values of each class are added position-wise and then averaged, to ensure that the output is at a normal level.
The value of d can be set according to the actual situation during model training; when d=0, other article categories have no influence on the identification of the target article's category, and coross is then the cross entropy loss function.
The specific steps of step S4 are as follows: while the penalty calculation does not overflow, the optimized loss function coross is used; once overflow occurs, the original cross entropy loss function is used for the calculation.
Step S5 includes the following training modes:
A. The first training mode is as follows: train twice. In the first training, a preliminary model is trained with the original cross entropy loss function, the correlation between the classes of the dataset is measured with the preliminary model, and the results are arranged into a correlation table. In the second training, formal training is performed with coross: the related items of each article category are added into coross in an indexed manner for calculation, and meanwhile, according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into coross for calculation;
B. The second training mode is as follows: train only once, for N epochs. The training comprises two stages. The first stage takes the model at epoch = int[kN], where 0 < k < 1, as the preliminary model; the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation.
The beneficial effects of the invention are as follows: by adding the penalty term, the accuracy of the model in identifying articles is improved, and the recognition accuracy of the deep learning network model can be improved.
Drawings
The invention will be further described with reference to the drawings and examples.
FIG. 1 is a flow chart of the steps of the present invention;
FIG. 2 is a flowchart of the steps of a first training mode;
fig. 3 is a flow chart of the steps of the second training mode.
Detailed Description
Referring to figs. 1 to 3, an article identification method that optimizes the loss function in an error factor reinforcement mode is disclosed. The optimized loss function is named coross and is realized by adding a penalty term on the basis of the original cross entropy loss function, wherein the penalty term comprises the following three modules:
the penalty-degree adjusting factor T, which is used to adjust how strongly the correlation influences the cross entropy loss function; when T=0 the penalty term has no influence on the cross entropy loss function and coross is then the cross entropy loss function, and the value of T can be set according to the actual situation during model training;
the correlation between the classes of the dataset, obtained by testing the outputs of all article categories with a preliminary model and calculating with an information entropy formula;
the probability of the related categories, i.e. the probability of identifying the target article as one of its related article classes during training, which is not constant and is dynamically adjusted according to each training situation of the model.
By adding the penalty, the accuracy of the model in identifying articles is improved, and the recognition accuracy of the deep learning network model can be improved.
The method comprises the following steps:
step S1, performing preliminary training and obtaining the correlation of each category;
step S2, dynamically adding penalty terms according to the identification result;
step S3, constructing a new loss function;
step S4, setting an overflow mechanism;
step S5, training with coross.
The specific steps of step S1 are as follows: the model is preliminarily trained with a cross entropy loss function, and the preliminarily trained model is used to test the related items of each category and the correlation between the categories.
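The correlation measurement in step S1 can be sketched as follows. The patent's information entropy formula is not reproduced in the text, so the entropy-style scoring below (each off-target probability p contributing -p*log(p)) and the function and parameter names are illustrative assumptions, not the patented formula itself:

```python
import math

def class_correlations(avg_outputs, top_d=3):
    """Build a correlation table from a preliminary model's averaged
    per-class outputs (step S1). Assumed scoring: each off-target
    probability p contributes its entropy term -p*log(p), so a class
    the model confuses more often scores as more correlated."""
    table = {}
    for i, probs in avg_outputs.items():
        scores = {}
        for j, p in probs.items():
            if j == i or p <= 0.0:
                continue  # skip the class itself and zero-probability classes
            scores[j] = -p * math.log(p)  # entropy term: larger means more related
        # keep only the d most related categories as the "related items"
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        table[i] = dict(ranked[:top_d])
    return table
```

Here `avg_outputs` would come from the preliminarily trained model: the position-wise average of the output scores of all test pictures of each class, as the description's averaging step specifies.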
Step S2 comprises the following specific steps: a penalty term is dynamically added according to the identification result. The identification result of each picture is monitored, and the model's output for the related items is added, in the form of a probability score, to the calculation of the loss function as part of the penalty term. Meanwhile, an overflow mechanism protects the loss function calculation during training: once overflow occurs, the original cross entropy loss function is used instead.
The specific steps of step S3 are as follows: on the basis of the original cross entropy loss function, after a preliminary model has been trained with it, the related items of all categories, i.e. the classification error factors, are tested and introduced into the calculation of the loss function during formal training to construct a new loss function. The specific formula of coross is as follows:
wherein i refers to the correctly classified article category, j refers to an article category correlated with i, and d refers to the number of related article categories; T is an adjusting factor that controls the penalty degree in the loss calculation; the correlation term expresses the relevance between similar article categories by information entropy, where a larger information entropy indicates a larger correlation and a higher probability of a model identification error; and the probability term is the output probability of the related category during training;
wherein the specific formula of the correlation term is as follows:
wherein the average value of the outputs of the pictures of class i is used: the picture output values of each class are added position-wise and then averaged, to ensure that the output is at a normal level.
The value of d can be set according to the actual situation during model training; when d=0, other article categories have no influence on the identification of the target article's category, and coross is then the cross entropy loss function.
The specific steps of step S4 are as follows: while the penalty calculation does not overflow, the optimized loss function coross is used; once overflow occurs, the original cross entropy loss function is used for the calculation.
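Steps S3 and S4 together can be sketched as a single loss routine. The patent's formula images are not reproduced here, so the penalty's exact form below (T times the sum of correlation times the current output probability of each related class) is an assumption consistent with the surrounding description, and the names `coross_loss` and `corr_table` are illustrative:

```python
import math

def coross_loss(probs, target, corr_table, T=1.0):
    """Hedged sketch of the 'coross' loss: original cross entropy on the
    target class plus an assumed penalty built from the correlation table
    and the current output probabilities of the related categories."""
    ce = -math.log(max(probs[target], 1e-12))  # original cross entropy term
    penalty = T * sum(a * probs.get(j, 0.0)
                      for j, a in corr_table.get(target, {}).items())
    loss = ce + penalty
    # overflow mechanism (step S4): once the calculation overflows,
    # fall back to the original cross entropy loss
    if not math.isfinite(loss):
        return ce
    return loss
```

Note that with T=0, or with an empty correlation table (the d=0 case), this reduces to the plain cross entropy loss, matching the text's statement that coross is then the cross entropy loss function.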
Step S5 includes the following training modes:
A. The first training mode is as follows: train twice. In the first training, a preliminary model is trained with the original cross entropy loss function, the correlation between the classes of the dataset is measured with the preliminary model, and the results are arranged into a correlation table. In the second training, formal training is performed with coross: the related items of each article category are added into coross in an indexed manner for calculation, and meanwhile, according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into coross for calculation;
B. The second training mode is as follows: train only once, for N epochs (one complete pass of the dataset through the neural network and back is called one epoch). The training comprises two stages. The first stage takes the model at epoch = int[kN], where 0 < k < 1, as the preliminary model; the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation.
As shown in fig. 2, the first training mode requires two trainings: the first is the preliminary training and the second is the formal training. The preliminary training uses the original cross entropy loss function and yields the related items between the categories and their correlations: a preliminary model is trained with the original cross entropy loss function, the preliminary model then measures the correlation between the classes of the dataset, and the measured correlations are arranged into a correlation table. The formal training uses coross to train the model: the related items of each article category are added into coross in an indexed manner for calculation, and at the same time, according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation. In other words, the correlation table is called to reconstruct the cross entropy loss function, the correlations are added to form the new loss function (coross), and the model is formally trained with coross.
As shown in fig. 3, the second training mode comprises two stages over N epochs (one complete pass of the dataset through the neural network and back is called one epoch). The first stage obtains the related items and correlations between the dataset classes: the model is preliminarily trained with the original cross entropy loss function, the model at epoch = int[kN], where 0 < k < 1, is taken as the preliminary model, and the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses softmax + coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding related items and correlations are looked up from the correlation table and added into the calculation of the loss function (the correlation table is called to reconstruct the cross entropy loss function, and the correlations are added into the coross calculation in an indexed manner).
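The single-run schedule of the second training mode can be sketched as follows; the function name `loss_schedule` is illustrative, and the two stage labels stand in for the actual loss computations:

```python
def loss_schedule(n_epochs, k=0.5):
    """Sketch of training mode B's single run: the original cross entropy
    loss is used up to epoch int(k*N), whose model serves as the
    preliminary model, and coross is used from epoch int(k*N)+1 onward
    (the breakpoint continued training)."""
    assert 0.0 < k < 1.0, "the method requires 0 < k < 1"
    switch = int(k * n_epochs)  # epoch at which the preliminary model is taken
    return ["cross_entropy" if e <= switch else "coross"
            for e in range(1, n_epochs + 1)]
```

A real training loop would, at the switch epoch, measure the inter-class correlations with the current model, build the correlation table, and then resume from the breakpoint with coross as the loss.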
The above embodiments do not limit the protection scope of the invention, and those skilled in the art can make equivalent modifications and variations without departing from the whole inventive concept, and they still fall within the scope of the invention.
Claims (1)
1. An article identification method that optimizes the loss function in an error factor reinforcement mode, characterized by comprising the following steps:
step S1, preliminary training is carried out, and the relevance of each category is obtained;
performing preliminary training on the model by adopting a loss function of cross entropy loss, wherein the model after the preliminary training is used for testing the correlation items of each category and the correlation among each category;
step S2, dynamically adding penalty items according to the identification result;
dynamically adding a penalty term according to the identification result: the identification result of each picture is monitored, and the model's output for the related items is added, in the form of a probability score, to the calculation of the loss function as part of the penalty term; meanwhile, an overflow mechanism protects the loss function calculation during training, and once overflow occurs, the original cross entropy loss function is used;
step S3, constructing a new loss function;
on the basis of the original cross entropy loss function, after a preliminary model has been trained with it, the related items of all categories, i.e. the classification error factors, are tested and introduced into the calculation of the loss function during formal training to construct a new loss function, obtaining the optimized loss function coross, whose specific formula is as follows:
;
wherein i refers to the correctly classified article category, j refers to an article category correlated with i, and d refers to the number of related article categories; T is an adjusting factor that controls the penalty degree in the loss calculation; the correlation term expresses the relevance between similar article categories by information entropy, where a larger information entropy indicates a larger correlation and a higher probability of a model identification error; and the probability term is the output probability of the related category during training;
wherein the specific formula of the correlation term is as follows:
;
wherein the average value of the outputs of the pictures of class i is used: the picture output values of each class are added position-wise and then averaged, to ensure that the output is at a normal level;
the value of d can be set according to the actual situation during model training; when d=0, other article categories have no influence on the identification of the target article's category, and the optimized loss function coross is then the cross entropy loss function;
step S4: setting an overflow mechanism;
while the penalty calculation does not overflow, the optimized loss function coross is used; once overflow occurs, the original cross entropy loss function is used for the calculation;
step S5: training by adopting an optimized loss function coross;
A. The first training mode is as follows: train twice. In the first training, a preliminary model is trained with the original cross entropy loss function, the correlation between the classes of the dataset is measured with the preliminary model, and the results are arranged into a correlation table. In the second training, formal training is performed with the optimized loss function coross: the related items of each article category are added into the optimized loss function coross in an indexed manner for calculation, and according to the preliminary model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the optimized loss function coross;
B. The second training mode is as follows: train only once, for N epochs. The training comprises two stages. The first stage takes the model at epoch = int[kN], where 0 < k < 1, as the preliminary model; the correlation between the dataset classes is measured with the preliminary model and arranged into a correlation table. The second stage uses the optimized loss function coross to continue training from the breakpoint at epoch = int[kN] + 1; during the breakpoint continued training, according to the model's identification of each picture, the corresponding penalty term is looked up from the correlation table and added into the loss function for calculation.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010669159.8A CN111860631B (en) | 2020-07-13 | 2020-07-13 | Article identification method adopting error factor reinforcement mode to optimize loss function |
PCT/CN2020/116176 WO2022011827A1 (en) | 2020-07-13 | 2020-09-18 | Method for optimizing loss function by means of error cause reinforcement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010669159.8A CN111860631B (en) | 2020-07-13 | 2020-07-13 | Article identification method adopting error factor reinforcement mode to optimize loss function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111860631A CN111860631A (en) | 2020-10-30 |
CN111860631B true CN111860631B (en) | 2023-08-22 |
Family
ID=72984762
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010669159.8A Active CN111860631B (en) | 2020-07-13 | 2020-07-13 | Article identification method adopting error factor reinforcement mode to optimize loss function |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111860631B (en) |
WO (1) | WO2022011827A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112580714B (en) * | 2020-12-15 | 2023-05-30 | 电子科技大学中山学院 | Article identification method for dynamically optimizing loss function in error-cause reinforcement mode |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9691395B1 (en) * | 2011-12-31 | 2017-06-27 | Reality Analytics, Inc. | System and method for taxonomically distinguishing unconstrained signal data segments |
CN108984539A (en) * | 2018-07-17 | 2018-12-11 | 苏州大学 | The neural machine translation method of translation information based on simulation future time instance |
CN110569338A (en) * | 2019-07-22 | 2019-12-13 | 中国科学院信息工程研究所 | Method for training decoder of generative dialogue system and decoding method |
CN111242245A (en) * | 2020-04-26 | 2020-06-05 | 杭州雄迈集成电路技术股份有限公司 | Design method of classification network model of multi-class center |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108344574B (en) * | 2018-04-28 | 2019-09-10 | 湖南科技大学 | A kind of Wind turbines Method for Bearing Fault Diagnosis based on depth joint adaptation network |
-
2020
- 2020-07-13 CN CN202010669159.8A patent/CN111860631B/en active Active
- 2020-09-18 WO PCT/CN2020/116176 patent/WO2022011827A1/en active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9691395B1 (en) * | 2011-12-31 | 2017-06-27 | Reality Analytics, Inc. | System and method for taxonomically distinguishing unconstrained signal data segments |
CN108984539A (en) * | 2018-07-17 | 2018-12-11 | 苏州大学 | The neural machine translation method of translation information based on simulation future time instance |
CN110569338A (en) * | 2019-07-22 | 2019-12-13 | 中国科学院信息工程研究所 | Method for training decoder of generative dialogue system and decoding method |
CN111242245A (en) * | 2020-04-26 | 2020-06-05 | 杭州雄迈集成电路技术股份有限公司 | Design method of classification network model of multi-class center |
Also Published As
Publication number | Publication date |
---|---|
WO2022011827A1 (en) | 2022-01-20 |
CN111860631A (en) | 2020-10-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110826618A (en) | Personal credit risk assessment method based on random forest | |
CN114281809B (en) | Multi-source heterogeneous data cleaning method and device | |
CN109359135B (en) | Time sequence similarity searching method based on segment weight | |
CN109816002B (en) | Single sparse self-encoder weak and small target detection method based on feature self-migration | |
CN112764809B (en) | SQL code plagiarism detection method and system based on coding characteristics | |
CN111860631B (en) | Article identification method adopting error factor reinforcement mode to optimize loss function | |
CN111338950A (en) | Software defect feature selection method based on spectral clustering | |
CN105701501A (en) | Trademark image identification method | |
CN110287302B (en) | Method and system for determining confidence of open source information in national defense science and technology field | |
CN111651477A (en) | Multi-source heterogeneous commodity consistency judging method and device based on semantic similarity | |
CN114611515B (en) | Method and system for identifying enterprise actual control person based on enterprise public opinion information | |
CN100555270C (en) | A kind of machine automatic testing method and system thereof | |
Wu et al. | Optimization and improvement based on K-Means Cluster algorithm | |
CN101145166A (en) | Syllable drive based transliterated entity name computer automatic identification method | |
CN112580714B (en) | Article identification method for dynamically optimizing loss function in error-cause reinforcement mode | |
CN111833856A (en) | Voice key information calibration method based on deep learning | |
Zhe et al. | An algorithm of detection duplicate information based on segment | |
CN117523324B (en) | Image processing method and image sample classification method, device and storage medium | |
CN110033025A (en) | A kind of construction method of strong classifier in AdaBoost algorithm | |
CN117725437B (en) | Machine learning-based data accurate matching analysis method | |
CN111325097B (en) | Enhanced single-stage decoupled time sequence action positioning method | |
CN107273915B (en) | A kind of target classification identification method that local message is merged with global information | |
CN113139106B (en) | Event auditing method and device for security check | |
CN107908654A (en) | A kind of recommendation method, system and device in knowledge based storehouse | |
CN111209743A (en) | Improved HIDFWL feature extraction method based on information entropy and word length information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |