CN114462526A - Classification model training method and device, computer equipment and storage medium - Google Patents

Classification model training method and device, computer equipment and storage medium

Info

Publication number
CN114462526A
CN114462526A (application CN202210106215.6A)
Authority
CN
China
Prior art keywords
sample
label
classification
classification model
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210106215.6A
Other languages
Chinese (zh)
Inventor
楚选耕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210106215.6A
Publication of CN114462526A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G06N20/20 - Ensemble learning

Abstract

The embodiment of the application discloses a classification model training method and device, computer equipment and a storage medium. In the method and device, a plurality of sample data sets are acquired and a preset classification model is jointly trained on them, with a negative-aware loss (NA Loss) used to train the model during the joint training stage, so that the trained classification model can preliminarily achieve a good effect on all classification categories. The trained model is then used for alternating self-training, and the semi-supervised stage training further improves the performance of the classification model, yielding a final optimal model and improving the accuracy of the classification model in target detection.

Description

Classification model training method and device, computer equipment and storage medium
Technical Field
The application relates to the technical field of computers, in particular to a classification model training method and device, computer equipment and a storage medium.
Background
With the rapid development of computer technology, target detection algorithms have been widely applied in daily life. However, as new requirements arise and product types increase, the number of categories a target detection algorithm needs to distinguish also grows. For the class-incremental target detection task, one straightforward approach is to annotate a new data set containing all classes and train a target detector capable of detecting all of them. However, every time a new category is added, this approach requires adding labels for the new category to the old data set, and possibly collecting new data covering all categories. As the number of categories and the data volume grow, this consumes considerable manpower and material resources.
More common class-incremental methods today train the target detector by adding a data set that contains only the new classes, which greatly reduces the cost of data annotation. However, these methods either fine-tune the old model, or simply jointly train or fine-tune the model on the new and old data sets. In complex scenarios where many new classes must be added, such algorithms lead to poor performance of the final target detector and thus low detection accuracy.
Disclosure of Invention
The embodiment of the application provides a classification model training method and device, computer equipment and a storage medium, which can improve the accuracy of a classification model on target detection.
The embodiment of the application provides a classification model training method, which comprises the following steps:
obtaining a classification model to be trained and at least two sample sets, wherein labels of sample graphs of the sample sets comprise object class labels, and the object class labels indicate a target area in the sample graph and a classification class of a target object in the target area;
aiming at a sample graph in a sample set, extracting at least one feature graph of the sample graph through the classification model, and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model;
determining a first label corresponding to each feature map region in each feature map according to the object class label of the sample map;
if the label of the feature map region comprises the object class label, calculating a first loss of the feature map region based on the prediction probability of the feature map region under each classification class and the object class label;
if the label of the feature map region does not comprise the object class label, calculating a second loss of the feature map region based on the prediction probability of the feature map region under all classification classes corresponding to the sample set to which the sample map belongs and the label;
adjusting parameters of the classification model based on the first loss and the second loss of a sample graph in a sample set to complete training of the classification model based on the sample set.
Correspondingly, the embodiment of the present application further provides a classification model training device, including:
the device comprises an acquisition unit and a processing unit, wherein the acquisition unit is used for acquiring a classification model to be trained and at least two sample sets, labels of sample images of the sample sets comprise object class labels, and the object class labels indicate target areas in the sample images and classification classes of target objects in the target areas;
the first extraction unit is used for extracting at least one feature map of a sample map through the classification model aiming at the sample map in a sample set, and classifying the regions of the feature map in the feature map respectively to obtain the prediction probability of each feature map region under each classification category of the classification model;
a first determining unit, configured to determine, according to an object class label of the sample graph, a first label corresponding to each feature graph region in each feature graph;
a first calculating unit, configured to calculate a first loss of the feature map region based on the prediction probability of the feature map region in each classification category and the object category label if the label of the feature map region includes the object category label;
a second calculating unit, configured to calculate a second loss of the feature map region based on the prediction probabilities of the feature map region under all classification categories corresponding to the sample set to which the sample map belongs and the labels, if the labels of the feature map region do not include the object category label;
a first adjusting unit, configured to adjust parameters of the classification model based on the first loss and the second loss of a sample graph in a sample set, so as to complete training of the classification model based on the sample set.
In some embodiments, the obtaining unit comprises:
the first obtaining subunit is configured to obtain a historical sample set of a historical classification model, where the classification categories of the classification model to be trained at least include partial classification categories of the historical classification model;
the second obtaining subunit is used for obtaining a first sample set used for training the classification model to be trained on the basis of the historical sample set;
and the third obtaining subunit is configured to obtain a second sample set, where the classification category corresponding to the label in the second sample set includes a classification category other than the partial classification category in the classification categories of the classification model to be trained.
In some embodiments, the first determination unit comprises:
a first extraction subunit, configured to extract at least one label feature map of the label image through the classification model, and map the classification category label in the label image into the label feature map when extracting the label feature map;
and the first determining subunit is used for determining the label of the feature map area at the same position in the feature map based on the label of each feature map area in the label feature map with the same scale as that of each feature map.
In some embodiments, the first extraction unit comprises:
the second determining subunit is used for determining the feature vectors corresponding to the positions of the feature map areas in the feature map;
and the first classification subunit is used for classifying the feature vectors to obtain the prediction probability of the feature vectors under each classification type of the classification model.
In some embodiments, the apparatus further comprises:
the first execution unit is used for executing the training phase of the second mode after the training phase of the first mode based on the sample set is finished;
the classification unit is used for classifying the sample graphs in each sample set based on the classification model, determining a newly added pseudo object class label of the sample graph, and obtaining a plurality of pseudo sample sets, wherein the pseudo object class label indicates an interested region containing a pseudo object in the sample graph and a classification class of the pseudo object;
the second extraction unit is used for extracting at least one feature map of a sample map in a pseudo sample set through the classification model and classifying the regions of the feature maps in the feature map to obtain the prediction probability of each feature map region under each classification category of the classification model;
a second determining unit, configured to determine, according to the object class label and the pseudo object class label of the sample graph, a second label corresponding to each feature graph region in each feature graph;
a third calculating unit, configured to calculate a third loss of the feature map region based on the prediction probability of the feature map region in each classification category and the second label if the second label of the feature map region includes the object category label or a pseudo object category label;
a fourth calculating unit, configured to calculate a fourth loss of the feature map region based on the prediction probabilities of the feature map region under all classification categories corresponding to the pseudo sample set to which the sample map belongs and the second label if the label of the feature map region does not include the object category label and the pseudo object category label;
a second adjusting unit, configured to adjust parameters of the classification model based on the third loss and the fourth loss of a sample graph in a pseudo sample set, so as to complete the stage training of the second mode of the classification model based on the pseudo sample set.
In some embodiments, the classification unit comprises:
the second extraction subunit is used for extracting the features of the background area outside the target area in the sample image based on the label of the sample image to obtain the features of the background area;
and the second classification subunit is used for classifying the characteristics of the background area through the classification model to obtain a pseudo object class label corresponding to the background area.
In some embodiments, the apparatus further comprises:
and the first processing unit is used for carrying out first enhancement processing on the sample image in the sample set.
In some embodiments, the apparatus further comprises:
and the second processing unit is used for carrying out second enhancement processing on the sample images in the pseudo sample set, wherein the image enhancement intensity of the first enhancement processing is lower than that of the second enhancement processing.
In some embodiments, the apparatus further comprises:
the third processing unit is used for taking the obtained classification model as a new model to be trained after the stage training of the second mode of the classification model based on the pseudo sample set;
and a second execution unit, configured to return to performing the step of extracting, for a sample graph in a sample set, at least one feature map of the sample graph through the classification model and classifying each feature map region in the feature map to obtain the prediction probability of each feature map region under each classification category of the classification model, until a model convergence condition is satisfied.
Accordingly, embodiments of the present application further provide a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the classification model training method provided in any of the embodiments of the present application.
Correspondingly, the embodiment of the application also provides a storage medium, wherein the storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to execute the classification model training method.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a storage medium. The processor of the terminal reads the computer instructions from the storage medium, and executes the computer instructions, so that the terminal executes the classification model training method provided in various optional implementation manners of the above aspects.
According to the method and the device, a network model is built and the preset classification model is jointly trained on the plurality of sample data sets, with the negative-aware loss used during the joint training stage, so that the trained classification model can preliminarily achieve a good effect on all classification categories. The trained model is then used for alternating self-training, and the semi-supervised stage training further improves the performance of the classification model, yielding a final optimal model and improving the accuracy of the classification model in target detection.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a scene schematic diagram of a classification model training system provided in an embodiment of the present application.
Fig. 2 is a schematic flowchart of a classification model training method according to an embodiment of the present application.
Fig. 3 is a schematic view of an application scenario of a classification model training method according to an embodiment of the present application.
Fig. 4 is a schematic view of an application scenario of another classification model training method provided in the embodiment of the present application.
Fig. 5 is a schematic flowchart of another classification model training method according to an embodiment of the present application.
Fig. 6 is a schematic view of an application scenario of another classification model training method according to an embodiment of the present application.
Fig. 7 is a schematic view of an application scenario of another classification model training method according to an embodiment of the present application.
Fig. 8 is a block diagram of a classification model training apparatus according to an embodiment of the present application.
Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a classification model training method and device, computer equipment and a storage medium. Specifically, the embodiment of the application provides a classification model training device suitable for computer equipment. The computer device may be a terminal or a server, and the server may be an independent physical server, or a server cluster or distributed system formed by a plurality of physical servers. The terminal includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, an intelligent appliance, a vehicle-mounted terminal, an aircraft, and the like, and the embodiment of the present application is not limited herein.
Referring to fig. 1, fig. 1 is a schematic view of a scenario of a classification model training system according to an embodiment of the present application, including a server, where the server may be connected to a network, and the network includes network entities such as a router and a gateway.
The server can obtain a classification model to be trained and at least two sample sets, wherein labels of sample graphs of the sample sets comprise object class labels, and the object class labels indicate target areas in the sample graphs and classification classes of target objects in the target areas; aiming at a sample graph in a sample set, extracting at least one feature graph of the sample graph through a classification model, and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model; determining a first label corresponding to each feature map region in each feature map according to the object category label of the sample map; if the label of the feature map region comprises an object class label, calculating a first loss of the feature map region based on the prediction probability of the feature map region under each classification class and the object class label; if the label of the feature map region does not comprise the object class label, calculating a second loss of the feature map region based on the prediction probability and the label of the feature map region under all classification classes corresponding to the sample set to which the sample map belongs; and adjusting parameters of the classification model based on the first loss and the second loss of the sample graph in the sample set so as to finish training the classification model based on the sample set.
It should be noted that the scene schematic diagram of the classification model training system shown in fig. 1 is only an example, and the classification model training system and the scene described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not form a limitation on the technical solution provided in the embodiment of the present application.
Based on the above problems, embodiments of the present application provide a classification model training method, apparatus, computer device, and storage medium, which can improve the accuracy of a classification model in target detection. These are described in detail below. It should be noted that the order of the following description of the embodiments is not intended to limit the preferred order of the embodiments.
As artificial intelligence technology has been researched and developed, it has been applied in a growing number of fields, such as robotics and smart medicine. With the further development of the technology, artificial intelligence will be applied in even more fields and play an increasingly important role.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results, so that the machine has the functions of perception, reasoning and decision making.
Computer Vision (CV) technology uses cameras and computers, instead of human eyes, to identify and measure targets, and further performs image processing so that the processed images become more suitable for human observation or for transmission to instruments for detection; it attempts to establish artificial intelligence systems capable of acquiring information from images or multidimensional data. Computer vision technology generally includes technologies such as classification model training and image recognition, as well as common biometric identification technologies such as face recognition and fingerprint recognition.
Machine Learning (ML) is a multi-domain interdisciplinary discipline that studies how computers simulate or implement human Learning behaviors to acquire new knowledge or skills and reorganize existing knowledge structures to improve their performance. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and formal education learning.
In this scheme, computer vision technology is adopted to preprocess the sample graphs, and a training data set is determined based on the processed sample graphs. A classification network model is then trained on the training data set, and finally the class of the target object in an image is detected based on the trained classification network model, thereby improving the target detection effect of the classification model.
As shown in fig. 2, fig. 2 is a schematic flowchart of a classification model training method provided in the embodiment of the present application. The specific process of the classification model training method can be as follows:
101. Obtain a classification model to be trained and at least two sample sets.
In the embodiment of the present application, the classification model may be used for performing target detection, for example, the classification model may be used for performing classification and identification on an object included in an image to obtain a category to which the object belongs.
The sample set is used for training the classification model to be trained, so that the classification model can identify classification categories corresponding to the sample set, each sample set comprises at least one sample graph, the labels of the sample graphs of the sample set comprise object category labels, and the object category labels are used for indicating a target area in the sample graph and the classification category of a target object in the target area.
For example, please refer to fig. 3, which is a schematic view of an application scenario of a classification model training method according to an embodiment of the present application. The sample map shown in fig. 3 includes a target area, and the target area contains a target classification object whose classification category may be, for example, category A. The target area and the category A to which the target classification object belongs constitute the object class label of the sample map.
In some embodiments, in order to improve the detection efficiency of the target detector performing class increment, the step "obtaining a classification model to be trained, and at least two sample sets" may include the following operations:
acquiring a historical sample set of a historical classification model, wherein the classification category of the classification model to be trained at least comprises part of classification categories of the historical classification model;
acquiring a first sample set used for training a classification model to be trained based on a historical sample set;
a second set of samples is obtained.
The historical classification model is a classification model obtained by training based on a historical sample set, and the classification category of the historical classification model may be a classification category corresponding to an object category label labeled in the historical sample set, that is, a classification category corresponding to the historical sample set.
For example, the classification categories corresponding to the historical sample set may include: class a, class B, class C, and class D, etc., the classification classes of the history classification model may be: class a, class B, class C, and class D.
Specifically, the classification categories of the classification model to be trained may be obtained by extending the classification categories of the historical classification model, i.e., by class increment. The classification categories of the classification model to be trained may include some of the classification categories of the historical classification model, or all of them.
For example, the classification categories of the historical classification model may be: class a, class B, class C, and class D, the classification classes of the classification model to be trained may include: class a, class B, class C, and/or class D.
The first sample set refers to a historical sample set corresponding to the classification category of the historical classification model which is the same as the classification category of the classification model to be trained.
For example, the historical sample set may include a sample set a and a sample set b, where the classification categories corresponding to sample set a may include category A and category B, and the classification categories corresponding to sample set b may include category C and category D. If the classification categories of the classification model to be trained include category A and category B, sample set a may be obtained from the historical sample set as the first sample set.
Alternatively, if the classification categories of the classification model to be trained include category A, category B, category C and category D, both sample set a and sample set b may be obtained from the historical sample set as the first sample set.
The classification categories corresponding to the labels in the second sample set include the classification categories of the classification model to be trained other than the partial classification categories inherited from the historical classification model.
For example, the classification categories of the classification model to be trained may include category A, category B, category C, category D, category E and category F. The sample sets corresponding to category A, category B, category C and category D may be obtained from the historical sample set, and the sample sets corresponding to category E and category F are then obtained as the second sample set. Category E and category F may be classification categories newly added on top of the historical classification model, and the second sample set may be used together with the first sample set to train the classification model to be trained, thereby realizing class increment.
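As an illustration of how such sample sets might be assembled, the following Python sketch selects the first and second sample sets from per-set category lists; the function and data names are hypothetical and only mirror the example above, not the embodiment's implementation.

```python
def build_training_sets(history_sets, new_sets, target_categories):
    """Select sample sets for class-incremental training (illustrative sketch).

    history_sets / new_sets: dicts mapping a set name to its category list,
    e.g. {"a": ["A", "B"], "b": ["C", "D"]}.
    target_categories: the full category list of the model to be trained.
    """
    target = set(target_categories)
    # First sample set: historical sets whose categories the new model reuses.
    first = {name: cats for name, cats in history_sets.items()
             if set(cats) <= target}
    # Second sample set: sets covering the newly added categories.
    covered = {c for cats in first.values() for c in cats}
    second = {name: cats for name, cats in new_sets.items()
              if set(cats) & (target - covered)}
    return first, second

# Example matching the description above (categories E and F are newly added).
history = {"a": ["A", "B"], "b": ["C", "D"]}
new = {"c": ["E", "F"]}
first, second = build_training_sets(history, new, ["A", "B", "C", "D", "E", "F"])
print(first)   # {'a': ['A', 'B'], 'b': ['C', 'D']}
print(second)  # {'c': ['E', 'F']}
```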
102. For a sample graph in a sample set, extract at least one feature map of the sample graph through the classification model, and classify each feature map region in the feature map to obtain the prediction probability of each feature map region under each classification category of the classification model.
The feature map includes image features of the sample map, and different feature maps may represent features of the sample map in different dimensions. Specifically, feature extraction is performed on the sample image through the classification model, so that at least one feature map of the sample image is obtained.
In the embodiment of the application, in order to improve the classification prediction effect of the classification model, the feature map may be divided into a plurality of feature map regions, and then each feature map region is classified by the classification model, so as to obtain the prediction probability of each feature map region in each classification category.
The classification categories of the classification model include the classification categories corresponding to the at least two sample sets. For example, if the classification model is trained through a sample set a and a sample set b, where the classification categories corresponding to sample set a may include category A and category B, and the classification categories corresponding to sample set b may include category C and category D, then the classification categories of the classification model include category A, category B, category C and category D.
The prediction probability refers to a probability value that the feature map region belongs to the classification category.
Specifically, each feature map region of the feature map is classified by the classification model, and a probability value that the feature map region belongs to each classification category of the classification model is obtained.
For example, the feature map may include a first feature map region and a second feature map region, and the classification categories of the classification model may include category A, category B, category C and category D. Classifying the first feature map region through the classification model yields probability values under category A, category B, category C and category D of, for example, 0.1, 0.2, 0.1 and 0.4 respectively; classifying the second feature map region yields probability values of, for example, 0.2, 0.2, 0.3 and 0.1 respectively.
In some embodiments, in order to improve the efficiency of the classification processing of the classification model, the step "classifying the feature map regions in the feature map to obtain the prediction probabilities of the feature map regions under each classification category of the classification model" may include the following operations:
determining a feature vector corresponding to the position of each feature map area in the feature map;
and classifying the feature vectors to obtain the prediction probability of the feature vectors under each classification category of the classification model.
The feature vector is a numerical feature representation of the position of a feature map region in the feature map. The features at the position of each feature map region in the feature map are converted into numerical feature representations to obtain the feature vector corresponding to each feature map region.
Further, when the feature regions are classified by the classification model, the feature vectors corresponding to the feature map regions are specifically classified by the classification model to obtain the prediction probability of each feature vector under each classification category of the classification model, so that the prediction probability of each feature map region under each classification category of the classification model is obtained. The classification model is used for classifying the feature vectors corresponding to the feature map region, so that the classification processing efficiency can be improved.
103. Determine a first label corresponding to each feature map region in each feature map according to the object class labels of the sample map.
The object class labels of the sample graph can be labeled manually, and the step is mainly to map the object class labels of the sample graph to each feature graph area of the feature graph of the sample graph.
The label of the sample graph can be a label image, and the label image is obtained by labeling the classification type identifier on the target area on the sample graph.
In some embodiments, in order to improve the efficiency of model training, the step "determining the first label corresponding to each feature map region in each feature map" may include the following operations:
extracting at least one label feature map of the label image through a classification model, and mapping classification category labels in the label image into the label feature map when the label feature map is extracted;
and determining the label of the feature map area at the same position in the feature map based on the label of each feature map area in the label feature map with the same scale as that of each feature map.
Specifically, at least one label feature map of the label image is extracted through the classification model: feature extraction is performed on the label image through the classification model, and the features of the label image in the same dimension form one label feature map. At the same time, when features are extracted from the label image, the features of the classification category labels in the label image are also extracted, and when the label feature map is formed based on the features of the label image, the features of the classification category labels are added to it. In this way the classification category labels in the label image are mapped into the label feature map, which facilitates the subsequent labeling of the feature map regions of the feature map.
Further, matching the feature maps with the same scale with the label feature map, and then performing label mapping on the matched feature map and the label feature map.
For example, referring to fig. 4, fig. 4 is a schematic view of an application scenario of another classification model training method according to an embodiment of the present application. The feature map shown in fig. 4 has the same scale as the label feature map, and position 11 in the label feature map corresponds to position 1 in the feature map; the label at position 11 in the label feature map may therefore be mapped to position 1 in the feature map, giving the label of the feature map region at position 1.
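The mapping of object class labels onto feature map regions can be pictured with the following sketch, which marks the grid cells covered by an annotated box on a feature map of a given stride. This is only an illustration under simplified assumptions (a single scale and a coarse box-to-cell assignment), not the exact assignment rule of the classification model.

```python
import numpy as np

def map_labels_to_feature_grid(boxes, labels, image_size, stride):
    """Map annotated boxes onto a label grid of the same scale as a backbone
    feature map (illustrative sketch; names and the assignment rule are
    assumptions, not the embodiment's implementation)."""
    h, w = image_size[0] // stride, image_size[1] // stride
    # -1 marks background regions; otherwise store the class index of the box.
    grid = np.full((h, w), -1, dtype=np.int64)
    for (x1, y1, x2, y2), cls in zip(boxes, labels):
        gx1, gy1 = int(x1 // stride), int(y1 // stride)
        gx2, gy2 = int(np.ceil(x2 / stride)), int(np.ceil(y2 / stride))
        grid[gy1:gy2, gx1:gx2] = cls
    return grid

# One 64x64 box of class 2 on a 256x256 image, mapped to a stride-32 feature map.
grid = map_labels_to_feature_grid([(32, 32, 96, 96)], [2], (256, 256), 32)
print(grid)
```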
104. If the label of the feature map region includes an object class label, calculate a first loss of the feature map region based on the prediction probability of the feature map region under each classification category and the object class label.
Specifically, if the label of the feature map area includes the object type label, it indicates that the content of the feature map area belongs to the classification type of the object type label.
The first loss is a loss value calculated from a feature map region including an object class label in the feature map.
In the embodiment of the present application, a specified loss function is designed, and the form of the specified loss function may be as follows:
$$ L_{NA}(x, l) = \sum_{i=1}^{N} \sum_{f \in F_i} \left[ L_c(x_i, l_f) + \alpha \, L_c(x_o, l_{neg}) \right] $$
where N denotes the number of sample sets, D_i denotes a sample set, C_i denotes the classification categories corresponding to D_i, F_i denotes a feature map extracted from D_i, and f denotes a feature vector of F_i. l_f denotes the label annotated with the categories in C_i, and l_neg denotes the background label. x_i denotes the nodes of the classification detection head corresponding to C_i, and x_o denotes the other nodes of the classification detection head. L_c denotes the binary cross-entropy focal loss used for classification. α is 1 when l_f is a positive sample and 0 when l_f is a negative sample, where a positive sample means the label corresponding to the feature vector is an object class label, and a negative sample means the label corresponding to the feature vector is the background label.
An annotated label is a label manually marked in the sample image and contains an object to be detected. The background refers to the part of the sample graph that is not manually labeled; it can be regarded as the background part and corresponds to the background label.
The classification detection head refers to the structure in the target detector, that is, in the classification model, that maps feature vectors into the final predictions (positions and categories); it is generally a one-layer or multi-layer fully-connected network. A node refers to an output node of the fully-connected network.
For example, if the length of the network feature vector is 256 and the target detector handles 80 classes, the classification detection head maps the 256-dimensional feature vector to 80 dimensions, where each dimension represents the confidence that the region belongs to the corresponding class. A node is any one of the 80 output dimensions.
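A minimal sketch of such a classification detection head, using the example sizes from the text (256-dimensional feature vectors, 80 classes), is shown below; the PyTorch module is illustrative, not the exact structure used in the embodiment.

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Single fully-connected layer mapping a 256-dim feature vector to
    per-class confidences, one output node per class (illustrative sketch)."""
    def __init__(self, feat_dim=256, num_classes=80):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, feats):                 # feats: (num_regions, 256)
        return torch.sigmoid(self.fc(feats))  # per-class prediction probability

head = ClassificationHead()
probs = head(torch.randn(4, 256))  # 4 feature-map regions
print(probs.shape)                 # torch.Size([4, 80])
```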
Specifically, for a feature map region whose label includes an object class label, the loss value of the feature vector corresponding to that region may be calculated through L_c according to the prediction probability of the feature map region under each classification category together with the object class label, yielding the first loss.
For example, the feature map of the sample map may include a first feature map region, a second feature map region and so on, and the classification categories of the classification model may be category A, category B, category C and so on. If the label of the first feature map region includes an object class label, the loss value of the first feature map region may be calculated through L_c according to its prediction probabilities under category A, category B and category C together with the object class label, yielding the first loss.
105. If the label of the feature map region does not include an object class label, calculate a second loss of the feature map region based on the prediction probabilities of the feature map region under all classification categories corresponding to the sample set to which the sample map belongs, together with the label.
Specifically, for a feature map region whose label does not include an object class label, the loss value of the feature vector corresponding to that region may be calculated through L_c according to the prediction probabilities of the region under the classification categories corresponding to the sample set to which the sample map belongs, together with the (background) label, yielding the second loss.
For example, the feature map of the sample map may include a first feature map region, a second feature map region and so on, and the classification categories of the classification model may be category A, category B, category C and so on. If the label of the second feature map region does not include an object class label, and the classification categories corresponding to the sample set to which the sample map belongs are category A and category B, the loss value of the second feature map region may be calculated through L_c according to its prediction probabilities under category A and category B together with the background label, yielding the second loss.
106. Adjust parameters of the classification model based on the first loss and the second loss of the sample graphs in the sample sets, so as to complete training of the classification model based on the sample sets.
Further, according to the specified loss function of the embodiment of the present application:
$$ L_{NA}(x, l) = \sum_{i=1}^{N} \sum_{f \in F_i} \left[ L_c(x_i, l_f) + \alpha \, L_c(x_o, l_{neg}) \right] $$
after the loss of each feature map region in the feature map is calculated, that is, the first loss for feature map regions whose labels include an object class label and the second loss for feature map regions whose labels do not, the first and second losses of the regions belonging to the same feature map are summed to obtain the loss of each feature map; the losses of the feature maps belonging to the same sample set are summed to obtain the loss of that sample set; and the losses of all sample sets are summed to obtain the total loss, namely L_NA(x, l). The parameters of the classification model are then adjusted through the total loss until the classification model converges, yielding the trained classification model.
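For illustration, the following sketch computes a negative-aware style loss for the feature-map regions of one sample set, assuming plain binary cross-entropy in place of the focal variant and a multi-hot label layout; the tensor shapes and names are assumptions, not the embodiment's implementation.

```python
import torch
import torch.nn.functional as F

def na_loss(logits, labels, dataset_class_mask):
    """Negative-aware loss sketch for one sample set D_i.

    logits: (R, C) raw scores of R feature-map regions over all C classes.
    labels: (R, C) multi-hot targets; an all-zero row is a background region.
    dataset_class_mask: (C,) bool, True for the classes annotated in D_i.
    Plain binary cross-entropy stands in for the focal variant in the text.
    """
    is_positive = labels.sum(dim=1) > 0  # alpha = 1 for positives, 0 otherwise
    per_node = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    # Positive regions train every node (other classes get the background target 0).
    pos_loss = per_node[is_positive].sum()
    # Background regions only train the nodes of the classes annotated in D_i.
    neg_loss = per_node[~is_positive][:, dataset_class_mask].sum()
    return pos_loss + neg_loss

logits = torch.randn(6, 4)                       # 6 regions, 4 classes
labels = torch.zeros(6, 4); labels[0, 1] = 1.0   # one annotated region
mask = torch.tensor([True, True, False, False])  # D_i annotates classes 0 and 1
print(na_loss(logits, labels, mask))
```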
In some embodiments, to further enhance the classification effect of the classification model, the method further comprises the following steps:
after the training phase of the first mode based on the sample set is finished, executing the training phase of the second mode;
classifying the sample graphs in each sample set based on a classification model, determining a newly added pseudo object class label of the sample graph, and obtaining a plurality of pseudo sample sets, wherein the pseudo object class label indicates an interested region containing a pseudo object in the sample graph and a classification class of the pseudo object;
aiming at a sample graph in a pseudo sample set, extracting at least one feature graph of the sample graph through a classification model, and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model;
determining a second label corresponding to each feature map area in each feature map according to the object class label and the pseudo object class label of the sample map;
if the second label of the feature map region comprises an object class label or a pseudo object class label, calculating a third loss of the feature map region based on the prediction probability of the feature map region under each classification class and the second label;
if the label of the feature map region does not comprise the object class label and the pseudo object class label, calculating a fourth loss of the feature map region based on the prediction probability of the feature map region under all classification classes corresponding to the pseudo sample set to which the sample map belongs and the second label;
and adjusting parameters of the classification model based on the third loss and the fourth loss of the sample graph in the pseudo sample set so as to complete the stage training of the second mode of the classification model based on the pseudo sample set.
The training phase of the first mode based on the sample sets, namely the training of the classification model in steps 101-106 above, includes training the classification model with the specified loss function.
In some embodiments, in order to improve the utilization efficiency of the sample sets and to mine more labels that were not manually annotated, the step "classify the sample graphs in each sample set based on the classification model, and determine a new pseudo object class label added to the sample graph" may include the following operations:
based on the label of the sample image, extracting the characteristics of a background area outside the target area in the sample image to obtain the characteristics of the background area;
and classifying the characteristics of the background area through a classification model to obtain a pseudo object class label corresponding to the background area.
Specifically, a region outside a target region with a label in the sample image can be used as a background region, no label is manually marked in the background region, and then feature extraction is performed on the background region through a classification model to obtain a background region feature corresponding to the background region.
Further, the classification model classifies the background area features and identifies the category of the background area, thereby obtaining a pseudo object class label. The category of a pseudo object class label may or may not be one of the manually labeled object class categories.
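A possible sketch of this pseudo-label mining step is shown below; the detector interface (returning boxes, scores and classes), the confidence threshold and the IoU test are all assumptions made for illustration, not the embodiment's procedure.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def generate_pseudo_labels(model, image, gt_boxes, score_threshold=0.7):
    """Keep confident detections that fall outside the annotated target areas
    as candidate pseudo object class labels (illustrative sketch; `model` is a
    hypothetical detector returning boxes, scores and classes)."""
    boxes, scores, classes = model(image)
    keep = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < score_threshold:
            continue
        # Detections that do not overlap any manual box lie in the background area.
        if all(iou(box, gt) < 0.5 for gt in gt_boxes):
            keep.append((box, cls, score))
    return keep
```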
In some embodiments, in order to improve the accuracy of generating the pseudo object class labels, before the step "classifying the sample graphs in each sample set based on the classification model", the following steps may be further included:
carrying out first enhancement processing on a sample image in a sample set;
before the step "extracting at least one feature map of the sample map by the classification model for the sample map in a pseudo sample set", the following steps may be further included:
and carrying out second enhancement processing on the sample images in the pseudo sample set.
The image enhancement intensity of the first enhancement processing is lower than that of the second enhancement processing; for example, the first enhancement processing may be weak enhancement and the second enhancement processing may be strong enhancement. In particular, strong enhancement may include flipping, mosaic enhancement, mixing enhancement, and the like.
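As a rough illustration, weak and strong enhancement pipelines could look like the following torchvision-based sketch. The image size and jitter parameters are assumptions; colour jitter stands in for the mosaic and mixing operations mentioned above, and in a real detection pipeline the geometric transforms (flipping, mosaic) would also need the annotation boxes to be transformed accordingly.

```python
import torchvision.transforms as T

# Weak enhancement: only resizing and normalization (used when generating
# pseudo labels on images).
weak_transform = T.Compose([
    T.Resize((640, 640)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Strong enhancement: flipping plus colour jitter as a simplified stand-in
# for the flipping / mosaic / mixing operations described in the text.
strong_transform = T.Compose([
    T.Resize((640, 640)),
    T.RandomHorizontalFlip(p=0.5),
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])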
In some embodiments, to avoid repeating training the classification model for the same classification label, after generating the pseudo object class label, label deduplication may be performed according to the object class label and the pseudo object class label.
Specifically, all the pseudo object class labels and the artificial labels (that is, the object class labels) may be processed using Non-Maximum Suppression (NMS), removing the duplicated labels among the object class labels and pseudo object class labels to obtain the final labels. Since the quality of the artificial labels is higher, all artificial labels can be retained and pseudo labels are removed preferentially.
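A simplified stand-in for this de-duplication step is sketched below; it reuses the `iou` helper from the pseudo-label sketch above, and the overlap threshold and label tuple layout are assumptions, whereas a real implementation would use class-aware non-maximum suppression.

```python
def deduplicate_labels(manual, pseudo, iou_threshold=0.5):
    """Keep every manual label and drop any pseudo label that overlaps a kept
    label too much (illustrative sketch). Labels are (box, cls, score) tuples."""
    kept = list(manual)  # manual labels are always retained
    for p_box, p_cls, p_score in sorted(pseudo, key=lambda x: -x[2]):
        if all(iou(p_box, k_box) < iou_threshold for k_box, _, _ in kept):
            kept.append((p_box, p_cls, p_score))
    return kept
```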
For a specific training process of training the second mode of the classification model based on the pseudo sample set, reference may be made to the above description, which is not repeated herein.
In some embodiments, to further improve the classification effect of the classification model, the method may further include the steps of:
after the stage of the second mode of the classification model is trained based on the pseudo sample set, the obtained classification model is used as a new model to be trained;
and returning to execute the steps of extracting at least one feature map of the sample map through the classification model aiming at the sample map in a sample set, and classifying the feature map regions in the feature map respectively to obtain the prediction probability of each feature map region under each classification category of the classification model until the model convergence condition is met.
Specifically, the stage training of the second mode introduces new labels, but directly relying on pseudo labels often does not yield good results because of confirmation bias: when the model generates incorrect pseudo labels with high confidence, those incorrect pseudo labels further reinforce the incorrect predictions. To solve this problem, the classification model obtained after the second-mode stage is used as a new model to be trained, the stage training of the first mode is carried out again, and the two training modes alternate until the classification model converges. With the support of the first-mode stage training, the confirmation bias of the second-mode stage training can be corrected in time.
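The alternation between the two modes can be summarised by the following high-level sketch; the four callables are placeholders for the first-mode training, pseudo-label generation, second-mode training and convergence test described above, and are not part of the embodiment.

```python
def alternating_self_training(model, sample_sets, train_joint,
                              train_with_pseudo, predict_pseudo,
                              converged, max_rounds=10):
    """Interleave negative-aware joint training and pseudo-label training
    until the model converges (illustrative schedule only)."""
    for _ in range(max_rounds):
        model = train_joint(model, sample_sets)         # first mode: NA-loss joint training
        pseudo_sets = predict_pseudo(model, sample_sets)
        model = train_with_pseudo(model, pseudo_sets)   # second mode: pseudo-label training
        if converged(model):
            break
    return model
```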
The embodiment of the application discloses a classification model training method, which comprises the following steps: the method comprises the steps of obtaining a classification model to be trained and at least two sample sets, wherein labels of sample graphs of the sample sets comprise object class labels, and the object class labels indicate a target area in the sample graphs and a classification class of a target object in the target area; extracting at least one feature map of the sample map through a classification model aiming at the sample map in a sample set, and classifying each feature map region in the feature map to obtain the prediction probability of each feature map region under each classification category of the classification model; determining a first label corresponding to each feature map region in each feature map according to the object category label of the sample map; if the label of the feature map region comprises an object class label, calculating a first loss of the feature map region based on the prediction probability of the feature map region under each classification class and the object class label; if the label of the feature map region does not comprise the object class label, calculating a second loss of the feature map region based on the prediction probability and the label of the feature map region under all classification classes corresponding to the sample set to which the sample map belongs; and adjusting parameters of the classification model based on the first loss and the second loss of the sample graph in the sample set so as to finish training the classification model based on the sample set. Therefore, the accuracy of the classification model for target detection can be improved.
Based on the above description, the classification model training method of the present application will be further described below by way of example. Referring to fig. 5, fig. 5 is a schematic flowchart illustrating another classification model training method according to an embodiment of the present disclosure. The specific process can be as follows:
201. Acquire at least two data sets.
In the embodiment of the application, the data set includes a sample graph, the sample graph may include a label labeled manually, and the classification category corresponding to the label is a classification category to be identified by a training classification model.
For example, there may be N data sets {D_1, D_2, …, D_N}, where C_i denotes the set of categories labeled in data set D_i, and C_union = C_1 ∪ C_2 ∪ ... ∪ C_N. The goal of this scheme is to learn from {D_1, D_2, …, D_N} a target detector (i.e., a classification model) that can detect C_union. Each data set D_i is labeled only with C_i, and C_i ∩ C_j may or may not be an empty set; that is, different data sets may or may not contain the same categories.
202. Perform stage training of the first mode on a preset classification model through the at least two data sets to obtain a trained classification model.
The embodiment of the application trains the model in multiple stages. First, joint training (i.e., the stage training of the first mode) is performed; in this stage the model can be trained with NA Loss (negative-aware loss), so that the classification model preliminarily achieves a good effect on all classes C_union. The trained model is then used for alternating self-training (AST), i.e., the stage training of the second mode, which further improves the performance of the model and yields the final optimal model.
For example, please refer to fig. 6, and fig. 6 is a schematic view of an application scenario of another classification model training method according to an embodiment of the present application. Fig. 6 shows a flow of training the classification model alternately by the phase training of the first mode and the phase training of the second mode in the present scheme.
Specifically, in the joint training portion, each data set may be treated as one task, and all tasks are trained according to the multi-task training paradigm. For training an object detector with a classification detection head, NA Loss is proposed for this stage of training.
Wherein the form of NA Loss is as follows:
$$ L_{NA}(x, l) = \sum_{i=1}^{N} \sum_{f \in F_i} \left[ L_c(x_i, l_f) + \alpha \, L_c(x_o, l_{neg}) \right] $$
where N denotes the number of data sets, F_i denotes a feature map from D_i, and f denotes a feature vector of F_i. l_f denotes the label annotated with the categories in C_i, and l_neg denotes the background label. x_i denotes the nodes of the classification detection head corresponding to C_i, and x_o denotes the other nodes of the head. α is 1 for a positive sample and 0 for a negative sample. L_c denotes the binary cross-entropy focal loss used for classification.
NA Loss trains all data sets jointly by establishing associations between different data sets. For example, a positive sample annotated with a label in C_i, which is not assigned to any other label, can also be used to train all nodes in the detection head. However, since it is not known whether a background sample from D_i contains objects of C_union - C_i, the background sample still only trains the nodes corresponding to C_i. The negative-aware loss directly penalizes feature-confusing mappings by producing a high loss for the feature-confused state, so that stochastic gradient descent (SGD) gradually optimizes toward the low-loss mode, i.e., the feature-separable state. As the model converges, the confused features are eventually optimized to be separable, and the feature confusion problem is solved. In addition, NA Loss effectively adds extra background annotations, which also improves the performance of the model. After training with the negative-aware loss in the first-mode stage, the model performs well on C_union.
203. And classifying the sample images in the data set through the trained classification model to generate a pseudo label, and performing second-mode stage training on the trained classification model based on the pseudo label to obtain a new classification model.
Specifically, a special semi-supervised training process, namely AST (Alternating Self-Training), is designed in the embodiment of the present application.
The semi-supervised training process may include two parts: pseudo label training and alternate bias correction. In the pseudo label training stage, the trained classification model is first used to predict pseudo labels for the sample graphs; data enhancement is then performed on the data in the data set to which the pseudo labels have been added, and the model is trained with the enhanced data.
Specifically, each training iteration is decomposed into two steps. In the first step, a frozen EMA (exponential moving average) model is used to generate predicted pseudo labels on weakly enhanced images (only resizing and normalization); all pseudo labels and artificial labels are then processed by non-maximum suppression, and duplicate labels are removed to obtain the final labels. In the non-maximum suppression process, all artificial labels can be retained and pseudo labels are preferentially removed, since the quality of the artificial labels is obviously higher. In the second step, strong enhancement (flipping, mosaic enhancement, mixing enhancement, and the like) is applied to the images and the filtered labels generated in the first step. After training on the strongly enhanced images, EMA is used to update the model weights so as to obtain more stable performance. The weak-strong enhancement adds a consistency constraint to the pseudo label training and improves the effect of the semi-supervised training.
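A simplified sketch of one such iteration follows; the helper callables (weak_augment, strong_augment, nms_merge, loss_fn) and the models are hypothetical placeholders, and only the overall flow mirrors the description above:

```python
import torch

def ast_iteration(model, ema_model, images, manual_labels,
                  weak_augment, strong_augment, nms_merge,
                  loss_fn, optimizer, ema_decay=0.999):
    """One alternating-self-training iteration; all helpers are placeholders."""
    # Step 1: the frozen EMA model predicts pseudo labels on weakly enhanced
    # images (resizing and normalization only).
    with torch.no_grad():
        pseudo_labels = [ema_model(weak_augment(img)) for img in images]

    # Merge pseudo labels with artificial labels via non-maximum suppression;
    # artificial labels are always kept and duplicate pseudo labels are dropped.
    final_labels = [nms_merge(keep=manual, candidates=pseudo)
                    for manual, pseudo in zip(manual_labels, pseudo_labels)]

    # Step 2: strong enhancement (flipping, mosaic, mixing, ...) is applied to
    # the images together with the filtered labels, and the model is trained.
    strong_pairs = [strong_augment(img, lab) for img, lab in zip(images, final_labels)]

    optimizer.zero_grad()
    loss = sum(loss_fn(model(img), lab) for img, lab in strong_pairs) / len(strong_pairs)
    loss.backward()
    optimizer.step()

    # Update the EMA weights after training on the strongly enhanced images,
    # which gives more stable performance.
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.data.mul_(ema_decay).add_(p.data, alpha=1.0 - ema_decay)
    return loss.item()
```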
204. And alternately training the new classification model through the stage training of the first mode and the stage training of the second mode until the model converges, to obtain the target classification model.
Specifically, pseudo label training can alleviate the feature confusion problem in the classification model and can introduce new labels, but directly using pseudo labels often cannot achieve a good result, because pseudo label training introduces the confirmation bias problem: when the model produces incorrect pseudo labels with high confidence, these incorrect predictions are further reinforced by the incorrect pseudo labels, and the model itself cannot correct them. Based on this, it is proposed to correct the bias using the above-mentioned negative perception joint training method. After a period of the pseudo label semi-supervised training stage, training switches back to the negative perception joint training mode for a period of time, and then switches back to the pseudo label training mode. The two training modes alternate until the model converges. With the support of the negative perception joint training, the confirmation bias can be corrected in time.
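The alternation itself can be summarized with the following sketch; the stage lengths, function names, and convergence check are illustrative placeholders, not values specified by the application:

```python
def train_with_alternating_stages(model, datasets,
                                  joint_train, pseudo_label_train, has_converged,
                                  joint_epochs=4, pseudo_epochs=4, max_rounds=20):
    """Sketch of the alternating schedule; all callables are hypothetical.

    joint_train        runs the negative perception joint training stage (first mode)
    pseudo_label_train runs the pseudo label semi-supervised stage (second mode)
    has_converged      checks the model convergence condition
    """
    # The model is assumed to have already completed the initial joint
    # training stage (step 202) before this alternation begins.
    for _ in range(max_rounds):
        # A period of pseudo label semi-supervised training...
        pseudo_label_train(model, datasets, epochs=pseudo_epochs)
        # ...then switch back to negative perception joint training to correct
        # the confirmation bias accumulated from incorrect pseudo labels.
        joint_train(model, datasets, epochs=joint_epochs)
        if has_converged(model):
            break
    return model
```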
Furthermore, tests were performed on the COCO data set, with FCOS used as the target detection model. COCO was partitioned into multiple sub-data sets containing different classes to simulate the series of partially labeled data sets used in practical applications. When the data set is divided, images are not reused, so the more sub-data sets are divided, the fewer labels each contains. In the case of 20 sub-data sets each containing 4 classes, the method of the embodiment of the present application achieves an AP improvement of 8.6% over the baseline method of simple joint training. Moreover, since the structure of the model does not change at all, the inference speed of the model is not affected.
The following Table 1 shows an ablation experiment in which negative perception joint training and alternating self-training are added respectively:
[Table 1 — ablation experiment results, provided as images in the original publication.]
The following Table 2 shows a comparative experiment with more sub-data sets:
[Table 2 — comparative experiment results, provided as an image in the original publication.]
The division is carried out by category: COCO 2SUB, 4SUB, and 8SUB divide the 80 categories of COCO equally, and COCO 3SUB divides them into groups of 20, 20, and 40. Multi Model refers to training one model on each sub-data set and then using all models jointly for testing; Multi Task refers to multi-task training without NA Loss and AST; NA Loss refers to multi-task joint training using NA Loss; and NA Loss + AST refers to using the alternating self-training method after the negative perception joint training.
For example, referring to fig. 7, fig. 7 is a schematic view of an application scenario of another classification model training method according to an embodiment of the present application. As shown in fig. 7, experiments were performed on the VOC data set. The visualization results show that both NA Loss and AST can reduce the feature similarity between categories, so that the feature confusion problem is solved and a better effect is achieved.
Further, the method of the embodiment of the present application is compared with other semi-supervised methods. The results show that the AST of the embodiment of the present application significantly improves the model effect. The comparison of AST with other semi-supervised training methods is shown in Table 3 below:
[Table 3 — comparison of AST with other semi-supervised training methods, provided as an image in the original publication.]
Further, the method of the embodiment of the present application was tested on other types of detectors besides FCOS. The results show that the method also brings improvements on other one-stage and two-stage detectors, which further verifies the effectiveness of the method.
The embodiment of the application discloses a classification model training method, which comprises the following steps: the method comprises the steps of obtaining at least two data sets, carrying out first-mode stage training on a preset classification model through the at least two data sets to obtain a trained classification model, carrying out classification processing on a sample graph in the data sets through the trained classification model to generate a pseudo label, carrying out second-mode stage training on the trained classification model based on the pseudo label to obtain a new classification model, and alternately training the new classification model through the first-mode stage training and the second-mode stage training until the model converges to obtain a target classification model. Thus, the classification effect of the classification model can be improved.
In order to better implement the classification model training method provided by the embodiment of the present application, the embodiment of the present application further provides a classification model training device based on the classification model training method. The meanings of the terms are the same as those in the classification model training method described above, and for specific implementation details, reference can be made to the description in the method embodiment.
Referring to fig. 8, fig. 8 is a block diagram of a classification model training apparatus according to an embodiment of the present application, the apparatus including:
an obtaining unit 301, configured to obtain a classification model to be trained, and at least two sample sets, where labels of sample graphs of the sample sets include object class labels, and the object class labels indicate a target area in the sample graph and a classification class of a target object in the target area;
a first extraction unit 302, configured to extract, for a sample graph in a sample set, at least one feature graph of the sample graph through the classification model, and classify each feature graph region in the feature graph to obtain a prediction probability of each feature graph region under each classification category of the classification model;
a first determining unit 303, configured to determine, according to an object class label of the sample graph, a first label corresponding to each feature graph region in each feature graph;
a first calculating unit 304, configured to calculate a first loss of the feature map region based on the prediction probability of the feature map region in each classification category and the object category label if the label of the feature map region includes the object category label;
a second calculating unit 305, configured to calculate a second loss of the feature map region based on the prediction probabilities of the feature map region under all classification categories corresponding to the sample set to which the sample map belongs and the labels, if the labels of the feature map region do not include the object category label;
a first adjusting unit 306, configured to adjust parameters of the classification model based on the first loss and the second loss of a sample graph in a sample set, so as to complete training of the classification model based on the sample set.
In some embodiments, the obtaining unit 301 may include:
the first obtaining subunit is configured to obtain a historical sample set of a historical classification model, where the classification categories of the classification model to be trained at least include partial classification categories of the historical classification model;
the second obtaining subunit is used for obtaining a first sample set used for training the classification model to be trained on the basis of the historical sample set;
and the third obtaining subunit is configured to obtain a second sample set, where the classification category corresponding to the label in the second sample set includes a classification category other than the partial classification category in the classification categories of the classification model to be trained.
In some embodiments, the first determining unit 303 may include:
a first extraction subunit, configured to extract at least one label feature map of the label image through the classification model, and map the classification category label in the label image into the label feature map when extracting the label feature map;
and the first determining subunit is used for determining the label of the feature map area at the same position in the feature map based on the label of each feature map area in the label feature map with the same scale as that of each feature map.
In some embodiments, the first extraction unit 302 may include:
the second determining subunit is used for determining the feature vectors corresponding to the positions of the feature map areas in the feature map;
and the first classification subunit is used for classifying the feature vectors to obtain the prediction probability of the feature vectors under each classification type of the classification model.
In some embodiments, the apparatus may further comprise:
the first execution unit is used for executing the training phase of the second mode after the training phase of the first mode based on the sample set is finished;
the classification unit is used for classifying the sample graphs in each sample set based on the classification model, determining a newly added pseudo object class label of the sample graph, and obtaining a plurality of pseudo sample sets, wherein the pseudo object class label indicates an interested region containing a pseudo object in the sample graph and a classification class of the pseudo object;
the second extraction unit is used for extracting at least one feature map of a sample map in a pseudo sample set through the classification model and classifying the regions of the feature maps in the feature map to obtain the prediction probability of each feature map region under each classification category of the classification model;
a second determining unit, configured to determine, according to the object class label and the pseudo object class label of the sample graph, a second label corresponding to each feature graph region in each feature graph;
a third calculating unit, configured to calculate a third loss of the feature map region based on the prediction probability of the feature map region in each classification category and the second label if the second label of the feature map region includes the object category label or a pseudo object category label;
a fourth calculating unit, configured to calculate a fourth loss of the feature map region based on the prediction probabilities of the feature map region under all classification categories corresponding to the pseudo sample set to which the sample map belongs and the second label if the label of the feature map region does not include the object category label and the pseudo object category label;
a second adjusting unit, configured to adjust parameters of the classification model based on the third loss and the fourth loss of a sample graph in a pseudo sample set, so as to complete a second mode of phase training of the classification model based on the pseudo sample set.
In some embodiments, the classification unit may include:
the second extraction subunit is used for extracting the features of the background area outside the target area in the sample image based on the label of the sample image to obtain the features of the background area;
and the second classification subunit is used for classifying the characteristics of the background area through the classification model to obtain a pseudo object class label corresponding to the background area.
In some embodiments, the apparatus further comprises:
and the first processing unit is used for carrying out first enhancement processing on the sample image in the sample set.
In some embodiments, the apparatus may further comprise:
and the second processing unit is used for carrying out second enhancement processing on the sample images in the pseudo sample set, wherein the image enhancement intensity of the first enhancement processing is lower than that of the second enhancement processing.
In some embodiments, the apparatus may further comprise:
the third processing unit is used for taking the obtained classification model as a new model to be trained after the stage training of the second mode of the classification model based on the pseudo sample set;
and a second execution unit, configured to return to the step of extracting, for a sample graph in a sample set, at least one feature graph of the sample graph through the classification model and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model, until a model convergence condition is satisfied.
The embodiment of the application discloses a classification model training device. The obtaining unit 301 obtains a classification model to be trained and at least two sample sets, where the labels of the sample graphs in the sample sets include object class labels, and an object class label indicates a target region in a sample graph and the classification class of the target object in the target region. For the sample graphs in a sample set, the first extraction unit 302 extracts at least one feature map of each sample graph through the classification model, and classifies each feature map region in the feature map to obtain the prediction probability of each feature map region under each classification category of the classification model. The first determining unit 303 determines, according to the object class labels of the sample graph, a first label corresponding to each feature map region in each feature map. If the label of a feature map region includes the object class label, the first calculating unit 304 calculates a first loss of the feature map region based on the prediction probability of the feature map region in each classification category and the object category label; if the label of the feature map region does not include the object class label, the second calculating unit 305 calculates a second loss of the feature map region based on the prediction probabilities of the feature map region under all classification categories corresponding to the sample set to which the sample graph belongs and the label. The first adjusting unit 306 adjusts the parameters of the classification model based on the first loss and the second loss of the sample graphs in a sample set, so as to complete the training of the classification model based on the sample set. Therefore, the accuracy of the classification model for target detection can be improved.
The embodiment of the present application further provides a computer device, which may be a server, as shown in fig. 9, which shows a schematic structural diagram of the server according to the embodiment of the present application, and specifically:
the server may include components such as a processor 701 of one or more processing cores, memory 702 of one or more computer-readable storage media, a power supply 703, and an input unit 704. Those skilled in the art will appreciate that the server architecture shown in FIG. 9 does not constitute a limitation on the servers, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components. Wherein:
the processor 701 is a control center of the server, connects various parts of the entire server using various interfaces and lines, and performs various functions of the server and processes data by running or executing software programs and/or modules stored in the memory 702 and calling data stored in the memory 702. Optionally, processor 701 may include one or more processing cores; preferably, the processor 701 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 701.
The memory 702 may be used to store software programs and modules, and the processor 701 may execute various functional applications and classification model training by executing the software programs and modules stored in the memory 702. The memory 702 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to the use of the server, and the like. Further, the memory 702 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 702 may also include a memory controller to provide the processor 701 with access to the memory 702.
The server further includes a power source 703 for supplying power to each component, and preferably, the power source 703 may be logically connected to the processor 701 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The power supply 703 may also include any component including one or more of a dc or ac power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The server may also include an input unit 704, and the input unit 704 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the server may further include a display unit and the like, which will not be described in detail herein. Specifically, in this embodiment, the processor 701 in the server loads the executable file corresponding to the process of one or more application programs into the memory 702 according to the following instructions, and the processor 701 runs the application program stored in the memory 702, thereby implementing various functions as follows:
the method comprises the steps of obtaining a classification model to be trained and at least two sample sets, wherein labels of sample graphs of the sample sets comprise object class labels, and the object class labels indicate a target area in the sample graphs and a classification class of a target object in the target area;
aiming at a sample graph in a sample set, extracting at least one feature graph of the sample graph through a classification model, and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model;
determining a first label corresponding to each feature map region in each feature map according to the object category label of the sample map;
if the label of the feature map region comprises an object class label, calculating a first loss of the feature map region based on the prediction probability of the feature map region under each classification class and the object class label;
if the label of the feature map region does not comprise the object class label, calculating a second loss of the feature map region based on the prediction probability and the label of the feature map region under all classification classes corresponding to the sample set to which the sample map belongs;
and adjusting parameters of the classification model based on the first loss and the second loss of the sample graph in the sample set so as to finish training the classification model based on the sample set.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Therefore, the server of this embodiment can implement the steps of the classification model training method and improve the accuracy of the classification model for target detection.
It will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be performed by instructions or by instructions controlling associated hardware, which may be stored in a storage medium and loaded and executed by a processor.
To this end, the present application provides a storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any one of the classification model training methods provided in the present application. For example, the instructions may perform the steps of:
the method comprises the steps of obtaining a classification model to be trained and at least two sample sets, wherein labels of sample graphs of the sample sets comprise object class labels, and the object class labels indicate a target area in the sample graphs and a classification class of a target object in the target area; aiming at a sample graph in a sample set, extracting at least one feature graph of the sample graph through a classification model, and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model; determining a first label corresponding to each feature map region in each feature map according to the object category label of the sample map; if the label of the feature map region comprises an object class label, calculating a first loss of the feature map region based on the prediction probability of the feature map region under each classification class and the object class label; if the label of the feature map region does not comprise the object class label, calculating a second loss of the feature map region based on the prediction probability and the label of the feature map region under all classification classes corresponding to the sample set to which the sample map belongs; and adjusting parameters of the classification model based on the first loss and the second loss of the sample graph in the sample set so as to finish training the classification model based on the sample set.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the storage medium can execute the steps in any classification model training method provided in the embodiments of the present application, the beneficial effects that can be achieved by any classification model training method provided in the embodiments of the present application can be achieved, which are detailed in the foregoing embodiments and will not be described herein again.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the terminal reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the terminal executes the classification model training method provided in the various optional implementation modes of the above aspects.
The classification model training method, apparatus, computer device, and storage medium provided in the embodiments of the present application are introduced in detail, and a specific example is applied in the present application to explain the principle and the implementation of the present application, and the description of the embodiments is only used to help understanding the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (12)

1. A classification model training method, the method comprising:
obtaining a classification model to be trained and at least two sample sets, wherein labels of sample graphs of the sample sets comprise object class labels, and the object class labels indicate a target area in the sample graph and a classification class of a target object in the target area;
aiming at a sample graph in a sample set, extracting at least one feature graph of the sample graph through the classification model, and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model;
determining a first label corresponding to each feature map region in each feature map according to the object class label of the sample map;
if the label of the feature map region comprises the object class label, calculating a first loss of the feature map region based on the prediction probability of the feature map region under each classification class and the object class label;
if the label of the feature map region does not comprise the object class label, calculating a second loss of the feature map region based on the prediction probability of the feature map region under all classification classes corresponding to the sample set to which the sample map belongs and the label;
adjusting parameters of the classification model based on the first loss and the second loss of a sample graph in a sample set to complete training of the classification model based on the sample set.
2. The method of claim 1, wherein obtaining the classification model to be trained, and at least two sample sets comprises:
acquiring a historical sample set of a historical classification model, wherein the classification category of the classification model to be trained at least comprises part of classification categories of the historical classification model;
acquiring a first sample set used for training the classification model to be trained based on the historical sample set;
and acquiring a second sample set, wherein the classification categories corresponding to the labels in the second sample set comprise classification categories except the partial classification categories in the classification categories of the classification model to be trained.
3. The method according to claim 1, wherein the label is a label image obtained by labeling a classification category identifier on the sample map for the target area;
the determining a first label corresponding to each feature map region in each feature map according to the object class label of the sample map includes:
extracting at least one label feature map of the label image through a classification model, and mapping classification category labels in the label image into the label feature map when the label feature map is extracted;
and determining the label of the feature map area at the same position in the feature map based on the label of each feature map area in the label feature map with the same scale as that of each feature map.
4. The method according to claim 1, wherein the classifying the feature map regions in the feature map to obtain the prediction probability of the feature map regions under each classification category of the classification model comprises:
determining a feature vector corresponding to the position of each feature map area in the feature map;
and classifying the feature vectors to obtain the prediction probability of the feature vectors under each classification category of the classification model.
5. The method according to any one of claims 1-4, further comprising:
after the training phase of the first mode based on the sample set is finished, executing the training phase of the second mode;
classifying the sample graphs in each sample set based on the classification model, determining a newly added pseudo object class label of the sample graph, and obtaining a plurality of pseudo sample sets, wherein the pseudo object class label indicates an interested region containing a pseudo object in the sample graph and a classification class of the pseudo object;
aiming at a sample graph in a pseudo sample set, extracting at least one feature graph of the sample graph through the classification model, and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model;
determining a second label corresponding to each characteristic diagram region in each characteristic diagram according to the object class label and the pseudo object class label of the sample diagram;
if the second label of the feature map region comprises the object class label or the pseudo object class label, calculating a third loss of the feature map region based on the prediction probability of the feature map region under each classification class and the second label;
if the label of the feature map region does not include the object class label and the pseudo object class label, calculating a fourth loss of the feature map region based on the prediction probability of the feature map region under all classification classes corresponding to the pseudo sample set to which the sample map belongs and the second label;
adjusting parameters of the classification model based on the third and fourth losses of a sample graph in a set of pseudo samples to complete a second mode of phase training of the classification model based on the set of pseudo samples.
6. The method of claim 5, wherein the classifying the sample graphs in each sample set based on the classification model to determine the new pseudo object class labels of the sample graphs comprises:
based on the label of the sample image, extracting the characteristics of a background area outside the target area in the sample image to obtain the characteristics of the background area;
and classifying the characteristics of the background area through the classification model to obtain a pseudo object class label corresponding to the background area.
7. The method of claim 5, further comprising, prior to said classifying the sample graph in each sample set based on the classification model:
carrying out first enhancement processing on a sample image in a sample set;
before the extracting, by the classification model, at least one feature map of a sample map in a pseudo sample set, the method further includes:
and performing second enhancement processing on the sample images in the pseudo sample set, wherein the image enhancement intensity of the first enhancement processing is lower than that of the second enhancement processing.
8. The method of claim 5, further comprising:
after the stage of the second mode of the classification model is trained based on the pseudo sample set, taking the obtained classification model as a new model to be trained;
and returning to the step of, for a sample graph in a sample set, extracting at least one feature graph of the sample graph through the classification model and classifying each feature graph region in the feature graph to obtain the prediction probability of each feature graph region under each classification category of the classification model, until a model convergence condition is satisfied.
9. A classification model training apparatus, characterized in that the apparatus comprises:
the device comprises an acquisition unit and a training unit, wherein the acquisition unit is used for acquiring a classification model to be trained and at least two sample sets, labels of sample images of the sample sets comprise object class labels, and the object class labels indicate target areas in the sample images and classification classes of target objects in the target areas;
the first extraction unit is used for extracting at least one feature map of a sample map through the classification model aiming at the sample map in a sample set, and classifying the regions of the feature map in the feature map respectively to obtain the prediction probability of each feature map region under each classification category of the classification model;
a first determining unit, configured to determine, according to an object class label of the sample graph, a first label corresponding to each feature graph region in each feature graph;
a first calculating unit, configured to calculate a first loss of the feature map region based on the prediction probability of the feature map region in each classification category and the object category label if the label of the feature map region includes the object category label;
a second calculating unit, configured to calculate a second loss of the feature map region based on the prediction probabilities of the feature map region under all classification categories corresponding to the sample set to which the sample map belongs and the labels, if the labels of the feature map region do not include the object category label;
a first adjusting unit, configured to adjust parameters of the classification model based on the first loss and the second loss of a sample graph in a sample set, so as to complete training of the classification model based on the sample set.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the classification model training method according to any one of claims 1 to 8 when executing the program.
11. A storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor to perform the classification model training method according to any one of claims 1 to 8.
12. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the classification model training method of any one of claims 1 to 8.
CN202210106215.6A 2022-01-28 2022-01-28 Classification model training method and device, computer equipment and storage medium Pending CN114462526A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210106215.6A CN114462526A (en) 2022-01-28 2022-01-28 Classification model training method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210106215.6A CN114462526A (en) 2022-01-28 2022-01-28 Classification model training method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114462526A true CN114462526A (en) 2022-05-10

Family

ID=81410995

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210106215.6A Pending CN114462526A (en) 2022-01-28 2022-01-28 Classification model training method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114462526A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114792173A (en) * 2022-06-20 2022-07-26 支付宝(杭州)信息技术有限公司 Prediction model training method and device
CN114792173B (en) * 2022-06-20 2022-10-04 支付宝(杭州)信息技术有限公司 Prediction model training method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination