CN116071613A - Training method and device for image classification model, computer equipment and medium


Info

Publication number
CN116071613A
Authority
CN
China
Prior art keywords
image
classified
loss
category
determining
Prior art date
Legal status
Pending
Application number
CN202211728219.4A
Other languages
Chinese (zh)
Inventor
邢玲
王爱波
余晓填
王孝宇
Current Assignee
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202211728219.4A
Publication of CN116071613A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the field of image classification technologies, and in particular to a training method and apparatus for an image classification model, a computer device, and a medium. In the method, a class probability estimation vector and a class probability prediction vector are determined for each image to be classified by the image classification model, and the predicted image class of each image to be classified is determined according to the class probability prediction vector. The number of images in each image class is then determined, and a first loss weight parameter of each image to be classified is determined by combining this count with a preset image number threshold. The model loss of the image classification model is determined according to the predicted image class, the first loss weight parameter, the class probability estimation vector, and the class probability prediction vector of each image to be classified, and the image classification model is trained on this loss. Because the first loss weight parameter serves as the weighting basis for the similarity between the class probability estimation vector and the class probability prediction vector, the influence of the label imbalance problem on training is reduced and the accuracy of the image classification model is improved.

Description

Training method and device for image classification model, computer equipment and medium
Technical Field
The present invention relates to the field of image classification technologies, and in particular, to a training method and apparatus for an image classification model, a computer device, and a medium.
Background
In recent years, machine learning algorithms have relied on high-quality, high-volume supervised image datasets and have achieved remarkable results in the field of image classification. In practical image classification scenarios, however, label noise is widespread in image datasets. During training, an image classification model tends to fit clean image samples first and noisy image samples later; as the number of iterations increases, the model gradually memorizes more and more noise, so it cannot achieve high classification accuracy on low-quality image datasets.
Existing image classification models use regularization to prevent the model from memorizing noise during training and thereby improve classification accuracy. However, this approach is only suitable when the image data are evenly distributed across classes. When the label distribution of the image dataset is imbalanced, regularization prevents the model from effectively learning the images to be classified, yet it still cannot stop the model from memorizing noise during training, so the classification accuracy of the image classification model drops.
Therefore, how to improve the classification accuracy of an image classification model trained on an imbalanced, noisy image dataset is a problem to be solved.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a training method and apparatus for an image classification model, a computer device, and a medium, so as to solve the problem that existing training methods for image classification models yield low classification accuracy when faced with an imbalanced, noisy image dataset.
In a first aspect, an embodiment of the present invention provides a training method for an image classification model, where the training method for an image classification model includes:
acquiring images to be classified in a training set, inputting the images to be classified into an image classification model, and determining a category probability estimation vector and a category probability prediction vector of each image to be classified;
determining a predicted image category of each image to be classified according to the category probability prediction vector, determining the image number of each image category according to the predicted image category of each image to be classified, and determining a first loss weight parameter of each image to be classified according to the image number of each image category and a preset image number threshold;
determining a model loss of the image classification model according to the predicted image category, the first loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified;
and training the image classification model according to the model loss to obtain a trained image classification model.
In a second aspect, a second embodiment of the present invention provides a training method for an image classification model, where the training method for an image classification model includes:
acquiring images to be classified in a training set, inputting the images to be classified into an image classification model, and determining a category probability estimation vector and a category probability prediction vector of each image to be classified;
determining a predicted image category of each image to be classified according to the category probability prediction vector, determining the image number of each image category according to the predicted image category of each image to be classified, and determining a first loss weight parameter of each image to be classified according to the image number of each image category and a preset image number threshold;
acquiring an original image category of each image to be classified, and determining the image number of each original image category according to the images to be classified and the original image categories;
determining a second loss weight parameter of each image to be classified according to the image number of each original image category and the preset image number threshold;
determining a model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified;
and training the image classification model according to the model loss to obtain a trained image classification model.
In a third aspect, an embodiment of the present invention provides a training apparatus for an image classification model, where the training apparatus for an image classification model includes:
the probability prediction module is used for acquiring images to be classified in a training set, inputting the images to be classified into an image classification model, and determining a category probability estimation vector and a category probability prediction vector of each image to be classified;
the parameter determining module is used for determining the predicted image category of each image to be classified according to the category probability prediction vector, determining the image quantity of each image category according to the predicted image category of each image to be classified, and determining the first loss weight parameter of each image to be classified according to the image quantity of each image category and a preset image quantity threshold;
The loss calculation module is used for determining model loss of the image classification model according to the predicted image category of each image to be classified, the first loss weight parameter, the category probability estimation vector and the category probability prediction vector;
and the model training module is used for training the image classification model according to the model loss to obtain a trained image classification model.
Optionally, the training device of the image classification model further includes:
the image quantity determining module is used for obtaining the original image category of each image to be classified, and determining the image quantity of each original image category according to the image to be classified and the original image category;
and the second loss weight parameter determining module is used for determining the second loss weight parameter of each image to be classified according to the image quantity of each original image category and the preset image quantity threshold value.
Correspondingly, the loss calculation module is configured to determine a model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified.
In a fourth aspect, an embodiment of the present invention provides a computer device, the computer device including a processor, a memory, and a computer program stored in the memory and executable on the processor, the processor implementing the training method of the image classification model according to the first or second aspect when the computer program is executed.
In a fifth aspect, an embodiment of the present invention provides a computer readable storage medium storing a computer program, which when executed by a processor implements the training method of the image classification model according to the first or second aspect.
Compared with the prior art, the embodiment of the invention has the following beneficial effects. The images to be classified are input into the image classification model to determine a class probability estimation vector and a class probability prediction vector for each image to be classified. The predicted image class of each image to be classified is determined according to the class probability prediction vector, the number of images in each image class is determined according to these predicted classes, and a first loss weight parameter of each image to be classified is determined according to the image count of its class and a preset image number threshold; using the preset image number threshold as the reference for measuring the distribution of image counts across classes improves the reliability and accuracy of the first loss weight parameter. The model loss of the image classification model is then determined according to the predicted image class, the first loss weight parameter, the class probability estimation vector and the class probability prediction vector of each image to be classified, and the image classification model is trained according to the model loss to obtain a trained image classification model. Because the first loss weight parameter serves as the weighting basis for the similarity between the class probability estimation vector and the class probability prediction vector, the influence of the label imbalance problem on training is reduced and the accuracy of the image classification model is improved.
Compared with the prior art, the second embodiment of the invention has the following beneficial effects. The images to be classified are input into the image classification model to determine a class probability estimation vector and a class probability prediction vector for each image to be classified; the predicted image class of each image to be classified is determined according to the class probability prediction vector, the number of images in each image class is determined according to these predicted classes, and a first loss weight parameter of each image to be classified is determined according to the image count of its class and a preset image number threshold, with the preset image number threshold serving as the reference for measuring the class-wise image distribution and thereby improving the reliability and accuracy of the first loss weight parameter. The original image class of each image to be classified is acquired, the number of images in each original image class is determined according to the images to be classified and their original classes, and a second loss weight parameter of each image to be classified is determined according to the image count of its original class and the preset image number threshold, which likewise improves the reliability and accuracy of the second loss weight parameter. The model loss of the image classification model is then determined according to the predicted image class, the first loss weight parameter, the second loss weight parameter, the class probability estimation vector and the class probability prediction vector of each image to be classified, and the image classification model is trained according to the model loss to obtain a trained image classification model. The second loss weight parameter serves as the weighting basis for the similarity between the class probability estimation vector and the corresponding predicted image class, and the first loss weight parameter serves as the weighting basis for the similarity between the class probability estimation vector and the class probability prediction vector, which reduces the influence of the label imbalance problem on training and improves the accuracy of the image classification model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic view of an application environment of a training method of an image classification model according to an embodiment of the present invention;
FIG. 2 is a flowchart of a training method of an image classification model according to an embodiment of the present invention;
fig. 3 is a flow chart of a training method of an image classification model according to a second embodiment of the present invention;
fig. 4 is a schematic structural diagram of a training device for an image classification model according to a fourth embodiment of the present invention;
fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present description and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the invention. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The embodiments of the present invention can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
It should be understood that the sequence numbers of the steps in the following embodiments do not mean the order of execution, and the execution order of the processes should be determined by the functions and the internal logic, and should not be construed as limiting the implementation process of the embodiments of the present invention.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
The training method of the image classification model provided by the embodiment of the invention can be applied to an application environment as shown in fig. 1, in which a client communicates with a server. The clients include, but are not limited to, palmtop computers, desktop computers, notebook computers, ultra-mobile personal computers (UMPC), netbooks, cloud computing devices, personal digital assistants (PDA), and other computing devices. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.
Referring to fig. 2, a flowchart of a training method for an image classification model according to an embodiment of the present invention may be applied to the client in fig. 1, where the training method for an image classification model may include the following steps:
step S201, obtaining images to be classified in a training set, inputting the images to be classified into an image classification model, and determining a category probability estimation vector and a category probability prediction vector of each image to be classified.
The training set in this embodiment is an image dataset drawn from a real image classification scenario that exhibits both a label noise problem, i.e., the noisy class labels of some images to be classified are inconsistent with the actual image classes of those images, and a label distribution imbalance problem, i.e., the image classes follow a skewed distribution in which a few image classes contain a large number of samples while most image classes contain only very few samples.
In the training process of the image classification model, after the images to be classified in the training set are acquired, the images to be classified are input into the image classification model for feature extraction and feature analysis, and the category probability estimation vector and the category probability prediction vector of each image to be classified are determined.
The class probability estimation vector is composed of the probabilities that the image to be classified belongs to each of the image categories, and can be used to predict the image category of the corresponding image to be classified.
The class probability prediction vector obtained in the kth iterative training is obtained by fusing the class probability estimation vector obtained in the kth iterative training with the class probability prediction vector obtained in the (k-1)-th iterative training. It therefore combines the real-time class probability prediction information and the historical class probability prediction information of the image classification model during iterative training, and can be used to predict the image category of the image to be classified. Here k = 2, 3, ..., K, where K may be a preset target number of iterations.
In addition, the iterative training process analyzes and learns the images to be classified and their class labels and optimizes the parameters of the image classification model, so the accuracy of the model improves over time; correspondingly, the reliability of the image classification model is low in the early stage of iterative training. Therefore, in this embodiment, when determining the class probability prediction vector in the kth iterative training, the first fusion weight applied to the class probability estimation vector of the kth iteration is set to increase gradually with the number of iterations, while the second fusion weight applied to the class probability prediction vector of the (k-1)-th iteration decreases gradually, which improves the reliability and accuracy of the class probability prediction vector.
For example, the first fusion weight in the kth iterative training is denoted β_1k and the second fusion weight in the kth iterative training is denoted β_2k. As the number of iterations increases, the first fusion weight gradually increases and the second fusion weight gradually decreases. The first fusion weight β_1k is determined by x_1, the lower value limit of the first fusion weight, x_2, the upper value limit of the first fusion weight, K, the preset target number of iterations, and k, the current iteration number; the second fusion weight β_2k is determined by the same quantities.
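For illustration only, the sketch below implements one plausible schedule of this kind in Python; the linear ramp between x_1 and x_2, the default bounds, and all names are assumptions and are not taken from the patent.

```python
def fusion_weights(k: int, K: int, x1: float = 0.1, x2: float = 0.9) -> tuple:
    """Return (beta_1k, beta_2k) for the kth of K iterative trainings.

    Assumes a linear ramp of the first fusion weight from x1 to x2; the
    schedule and the default bounds are illustrative assumptions.
    """
    beta_1k = x1 + (x2 - x1) * (k - 1) / max(K - 1, 1)  # gradually increases with k
    beta_2k = 1.0 - beta_1k                             # gradually decreases with k
    return beta_1k, beta_2k
```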
The total number of images to be classified is denoted I. For the ith image to be classified (i = 1, 2, ..., I), the class probability estimation vector obtained in the kth iterative training is denoted P_i^k and the class probability prediction vector obtained in the (k-1)-th iterative training is denoted T_i^(k-1). The class probability prediction vector obtained in the kth iterative training is then:

T_i^k = β_1k · P_i^k + β_2k · T_i^(k-1)

where T_i^k is the class probability prediction vector of the ith image to be classified obtained in the kth iterative training, P_i^k is the class probability estimation vector of the ith image to be classified obtained in the kth iterative training, T_i^(k-1) is the class probability prediction vector of the ith image to be classified obtained in the (k-1)-th iterative training, β_1k is the first fusion weight, and β_2k is the second fusion weight.
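A minimal sketch of this fusion step, assuming the weighted-sum form given above; the function and variable names are illustrative.

```python
import numpy as np

def update_prediction_vector(p_est: np.ndarray, t_prev: np.ndarray,
                             beta_1k: float, beta_2k: float) -> np.ndarray:
    """Fuse the kth class probability estimation vector with the (k-1)-th
    class probability prediction vector: T_i^k = beta_1k*P_i^k + beta_2k*T_i^(k-1)."""
    return beta_1k * p_est + beta_2k * t_prev
```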
In this embodiment, the image classification task may be human body pose classification of the images to be classified in scenarios such as gait analysis, video surveillance, and sports science; correspondingly, the class labels are human body pose categories such as rest, walking, running, squatting, and jumping, and the image classification model is a human body pose classification model. The task may also be face attribute classification of the images to be classified in scenarios such as product audience analysis and population aging analysis; correspondingly, the class labels are face attribute categories such as male, female, young, middle-aged, and elderly, and the image classification model is a face attribute classification model. The task may also be person emotion classification of the images to be classified in scenarios such as teaching evaluation, product sales, and job interviews; correspondingly, the class labels are emotion categories such as happiness, nervousness, sadness, disgust, and boredom, and the image classification model is a person emotion classification model. The task may also be room style classification of the images to be classified in scenarios such as house renting, decoration, and house purchasing; correspondingly, the class labels are room style categories such as pastoral, minimalist, classical, new Chinese, Mediterranean, and Southeast Asian, and the image classification model is a room style classification model.
In this embodiment, taking human body pose classification of the images to be classified as an example, the images to be classified in the training set are human body pose images covering various poses, and the class labels are human body pose categories such as rest, walking, running, squatting, and jumping. Because labeling errors are difficult to avoid when annotating class labels, some human body pose images may be labeled with the wrong pose category, so the training set in this embodiment is a noisy human body pose image dataset.
During training, the human body posture images in the human body posture image dataset are acquired and input into the human body posture classification model for feature extraction and feature analysis, and the class probability estimation vector and class probability prediction vector of each human body posture image are determined.
Optionally, inputting the images to be classified into the image classification model, and determining the category probability estimation vector and the category probability prediction vector of each image to be classified includes:
inputting the images to be classified into an image classification model, and determining a class probability estimation vector of each image to be classified in a first iteration process;
determining a category probability prediction vector of each image to be classified in the first iteration process according to a preset category probability vector and a category probability estimation vector of each image to be classified in the first iteration process;
Determining a class probability estimation vector of each image to be classified in the kth iteration process and a class probability prediction vector in the kth-1 iteration process, wherein k=2, 3, …;
and determining the class probability prediction vector of each image to be classified in the kth iteration process according to the class probability estimation vector of each image to be classified in the kth iteration process and the class probability prediction vector in the kth-1 iteration process.
When the images to be classified are input into the image classification model and the first iterative training is performed, only the class probability estimation vector of the first iterative training is obtained; there is no class probability prediction vector from an earlier iterative training to fuse with it.
To unify the calculation of the class probability prediction vectors, this embodiment sets a preset class probability vector and fuses it with the class probability estimation vector obtained in the first iterative training to obtain the class probability prediction vector of the first iteration.
In an embodiment, in order not to affect the calculation result of the class probability estimation vector, the preset class probability vector may be set to a zero vector.
Then, from the second iterative training onward, the class probability prediction vector of the kth iteration is obtained by fusing the class probability estimation vector of the kth iteration with the class probability prediction vector of the (k-1)-th iteration.
For example, for the ith image to be classified, the class probability estimation vector obtained in the first iterative training is denoted P_i^1 and the preset class probability vector is denoted P_0. The class probability prediction vector obtained in the first iterative training is then:

T_i^1 = β_11 · P_i^1 + β_21 · P_0

where T_i^1 is the class probability prediction vector of the ith image to be classified in the first iterative training, β_11 is the first fusion weight in the first iterative training, β_21 is the second fusion weight in the first iterative training, P_i^1 is the class probability estimation vector of the ith image to be classified in the first iterative training, and P_0 is the preset class probability vector.
In this embodiment, the difference between the first iterative training and the subsequent iterative trainings is taken into account: a preset class probability vector is set and fused with the probability estimation vector in the first iterative training, so that the class probability prediction vectors of different iteration counts are fused and calculated adaptively, which improves the reliability and accuracy of the class probability prediction vectors.
In the above step, the images to be classified in the training set are acquired and input into the image classification model, and the class probability estimation vector and the class probability prediction vector of each image to be classified are determined. Feature extraction and feature analysis are performed on the imbalanced, noisy training set; the probability estimation vector used to predict the image category is determined for each image to be classified, and the real-time category prediction information is combined with the historical prediction information of the iterative training to obtain the class probability prediction vector used to predict the image category. By taking into account the relationship between the reliability of the image classification model and the number of iterations, the fusion ratio of the real-time and historical class probability prediction information is adjusted, which improves the reliability and accuracy of the class probability prediction vector.
Step S202, according to the category probability prediction vector, determining the predicted image category of each image to be classified, according to the predicted image category of each image to be classified, determining the image number of each image category, and according to the image number of each image category and the preset image number threshold, determining the first loss weight parameter of each image to be classified.
The class probability prediction vector is obtained by fusing probability estimation vectors, and the class probability estimation vector consists of the probabilities that the image to be classified belongs to each image category; therefore, the predicted image category of each image to be classified can be determined from the per-category probabilities in its class probability prediction vector.
By counting the predicted image categories of all the images to be classified, the number of images belonging to each image category can be determined. When the image counts of different categories differ greatly, the images are unevenly distributed across categories and the training set suffers from label imbalance. Correspondingly, categories with many images are regarded as dominant categories and categories with few images as long-tail categories. During training, the image classification model then learns the images belonging to the dominant categories first and the images belonging to the long-tail categories later, which breaks the assumption underlying regularization-based training that the model learns clean images first, and therefore degrades the training of the image classification model.
Therefore, to address the label imbalance problem, this embodiment adjusts the loss weight of each image category in the model loss according to the number of images in that category: the more images a category contains, the smaller the loss weight assigned to the images belonging to that category in the model loss. This reduces the influence of label imbalance on training and improves the accuracy of the image classification model.
Specifically, this embodiment sets a preset image number threshold as the reference for measuring the image count distribution, and calculates the ratio of the preset image number threshold to the image count of each image category. The image count distribution of each category is measured by this ratio, and the ratio of each image category is determined as the first loss weight parameter of every image to be classified that belongs to that category.
In an embodiment, the ratio of the total number of images to be classified in the training set to the total number of image categories may be calculated; this ratio is the average number of images per category, which is then used as the preset image number threshold. Using the average image count per category as the measurement reference improves the reliability of the first loss weight parameter.
In this embodiment, again taking human body pose classification of the images to be classified as an example, the predicted human body pose category of each human body pose image is determined according to its class probability prediction vector, and the number of images in each human body pose category is determined according to these predicted categories. A preset image number threshold is set as the reference for measuring the image count distribution, the ratio of the preset image number threshold to the image count of each human body pose category is calculated, the image count distribution of each pose category is measured by this ratio, and the ratio of each pose category is determined as the first loss weight parameter of every human body pose image belonging to that category.
Optionally, determining the predicted image category of each image to be classified according to the category probability prediction vector includes:
according to the category probability prediction vector of each image to be classified, determining a vector element value of each position in the category probability prediction vector;
and determining the maximum value of the vector element values, and determining the image category corresponding to the maximum value as the predicted image category of each image to be classified.
The vector element value of each position in the class probability prediction vector represents the probability that the corresponding image to be classified belongs to each image class, and then the image class corresponding to the maximum value in the vector elements can be determined as the predicted image class of each image to be classified.
In this embodiment, the image category corresponding to the maximum vector element value in the class probability prediction vector is determined as the predicted image category of each image to be classified, which improves the accuracy of the predicted image category.
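A minimal sketch of this argmax step, with illustrative names:

```python
import numpy as np

def predicted_category(t_pred: np.ndarray) -> int:
    """Return the index of the category with the largest element value in the
    class probability prediction vector."""
    return int(np.argmax(t_pred))
```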
Optionally, determining the first loss weight parameter of each image to be classified according to the number of images of each image category and the preset image number threshold value includes:
calculating the ratio of a preset image quantity threshold value to the image quantity of each image category, and determining the ratio as a first loss weight parameter of each image category;
And determining the first loss weight parameter of each image to be classified according to the predicted image category of each image to be classified and the first loss weight parameter of each image category.
The method comprises the steps of calculating the ratio of a preset image quantity threshold value to the image quantity of each image category, and determining the ratio as a first loss weight parameter of each image category, wherein the preset image quantity threshold value is a measurement standard of image quantity distribution.
For example, the preset image number threshold is denoted S_0 and the number of images of the jth image category in the kth iterative training is denoted S_j^k. The first loss weight parameter of the jth image category in the kth iterative training is then:

w_j^k = S_0 / S_j^k

where w_j^k is the first loss weight parameter of the jth image category in the kth iterative training, S_0 is the preset image number threshold, and S_j^k is the number of images of the jth image category in the kth iterative training.

The first loss weight parameter of each image to be classified can then be determined from the image category to which it belongs; correspondingly, the first loss weight parameter of the ith image to be classified in the kth iterative training is denoted w_i^k.
According to the embodiment, the preset image quantity threshold value is used as a measurement standard of image quantity distribution, the ratio of the preset image quantity threshold value to the image quantity of each image category is calculated, the first loss weight parameter of each image category is determined, and then the first loss weight parameter of each image to be classified is determined according to the image category to which each image to be classified belongs, so that the reliability and the accuracy of the first loss weight parameter are improved.
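The sketch below illustrates how the per-category counts, a preset image number threshold taken as the average count per category (one option mentioned above), and the resulting first loss weight parameters could be computed; all names are illustrative.

```python
from collections import Counter

def first_loss_weights(predicted_categories, num_categories):
    """Per-image first loss weight parameters.

    The preset image number threshold S0 is taken as the average image count
    per category; each image's weight is S0 divided by the image count of its
    predicted category.
    """
    counts = Counter(predicted_categories)
    s0 = len(predicted_categories) / num_categories         # preset image number threshold
    class_weight = {c: s0 / n for c, n in counts.items()}   # w_j^k = S0 / S_j^k
    return [class_weight[c] for c in predicted_categories]
```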
In the above step, the predicted image category of each image to be classified is determined according to the class probability prediction vector, the number of images in each image category is determined according to these predicted categories, and the first loss weight parameter of each image to be classified is determined according to the image count of its category and the preset image number threshold. Determining the predicted category as the category with the maximum vector element value in the class probability prediction vector improves the accuracy of the predicted category; using the preset image number threshold as the reference for measuring the image count distribution and calculating its ratio to the image count of each category improves the reliability and accuracy of the first loss weight parameter.
Step S203, determining model loss of the image classification model according to the predicted image category, the first loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified.
The class probability estimation vector and the class probability prediction vector both represent the probabilities that the corresponding image to be classified belongs to each image category. According to the similarity between the class probability estimation vector of each image to be classified and its predicted image category, and the similarity between its class probability estimation vector and its class probability prediction vector, a first sub-loss and a second sub-loss are calculated for each image to be classified to characterize its classification accuracy.
Because the first loss weight parameter is derived from the class probability prediction vectors, it is used as the loss weight of the second sub-loss, and the first sub-losses and second sub-losses of all the images to be classified are fused to obtain the model loss of the image classification model, so as to reduce the influence of the label imbalance problem on training and improve the accuracy of the image classification model.
For example, for the ith image to be classified in the kth iterative training, the similarity between the class probability estimation vector and the predicted image category is calculated to obtain the first sub-loss L_1,i^k, and the similarity between the class probability estimation vector and the class probability prediction vector is calculated to obtain the second sub-loss L_2,i^k. Taking the first loss weight parameter as the loss weight of the second sub-loss, the model loss of the ith image to be classified is obtained as:

L_i^k = L_1,i^k + w_i^k · L_2,i^k

where L_i^k is the model loss of the ith image to be classified in the kth iterative training, L_1,i^k is the first sub-loss of the ith image to be classified in the kth iterative training, L_2,i^k is the second sub-loss of the ith image to be classified in the kth iterative training, and w_i^k is the first loss weight parameter of the ith image to be classified in the kth iterative training.
Then, the model losses of all the images to be classified in the kth iterative training are added to obtain the model loss of the image classification model in the kth iterative training.
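A sketch of this combination and summation, assuming the additive form above; the similarity terms are passed in precomputed and all names are illustrative.

```python
def batch_model_loss(first_sub_losses, second_sub_losses, first_loss_weights):
    """Sum the per-image losses L_i^k = L_1,i^k + w_i^k * L_2,i^k over the batch."""
    return sum(l1 + w * l2
               for l1, l2, w in zip(first_sub_losses, second_sub_losses, first_loss_weights))
```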
In this embodiment, again taking human body posture classification of the images to be classified as an example, the predicted human body posture category represents the posture category to which the corresponding image to be classified belongs, and the class probability estimation vector and class probability prediction vector represent the probabilities that the corresponding human body posture image belongs to each posture category. The first sub-loss and second sub-loss of each human body posture image are therefore calculated from the similarity between its class probability estimation vector and its predicted posture category and the similarity between its class probability estimation vector and its class probability prediction vector, to characterize the classification accuracy of each posture image. The first loss weight parameter is then used as the loss weight of the second sub-loss, and the first and second sub-losses of all the human body posture images are fused to obtain the model loss of the human body posture classification model, which reduces the influence of the label imbalance problem on the training of the human body posture classification model and improves its accuracy.
Optionally, determining the model penalty of the image classification model according to the predicted image category, the first penalty weight parameter, the category probability estimation vector, and the category probability prediction vector of each image to be classified includes:
determining cross entropy loss of each image to be classified according to the class probability estimation vector and the predicted image class;
calculating the average value of the cross entropy loss of all the images to be classified, and determining the average value of the cross entropy loss as the third model sub-loss of the image classification model;
determining regularization loss of each image to be classified according to the category probability estimation vector and the category probability prediction vector;
determining a second sub-loss of each image to be classified according to the first loss weight parameter and the regularization loss;
calculating the average value of the second sub-losses of all the images to be classified, and determining the average value of the second sub-losses as the second model sub-loss of the image classification model;
and determining the model loss of the image classification model according to the third model sub-loss and the second model sub-loss of the image classification model.
In order to facilitate the similarity calculation, the class probability vector corresponding to the predicted image category is first determined for each image to be classified; the cross entropy loss between the class probability estimation vector and this class probability vector is then calculated, and the average of the cross entropy losses of all the images to be classified is determined as the third model sub-loss of the image classification model.
The regularization loss between the class probability estimation vector and the class probability prediction vector is calculated, and the first loss weight parameter is taken as the weight of the regularization loss to obtain the second sub-loss of each image to be classified; the average of the second sub-losses of all the images to be classified is then determined as the second model sub-loss of the image classification model.
In addition, during iterative training, as the number of iterations increases the reliability of the image classification model gradually increases and the third model sub-loss gradually decreases. Because the third model sub-loss is computed from the cross entropy, the logarithm gives it a gradient problem that becomes more severe as the iterations proceed, so the accuracy of the third model sub-loss gradually decreases. Therefore, to improve the reliability and accuracy of the model loss, in this embodiment the share of the third model sub-loss in the model loss becomes smaller and smaller while the share of the second model sub-loss becomes larger and larger; a preset loss adjustment parameter is used as the weight of the second model sub-loss to dynamically adjust its share and obtain the model loss of the image classification model.
For example, for the ith image to be classified in the kth iterative training, the class probability vector is denoted Y_i^k. The cross entropy loss between the class probability estimation vector P_i^k and the class probability vector Y_i^k is:

L_ce,i^k = - Σ_{n=1..N} Y_i,n^k · log(P_i,n^k)

where L_ce,i^k is the cross entropy loss of the ith image to be classified in the kth iterative training, N is the number of image categories, Y_i,n^k is the element value at the nth position of the class probability vector of the ith image to be classified in the kth iterative training, and P_i,n^k is the element value at the nth position of the class probability estimation vector of the ith image to be classified in the kth iterative training.

The average of the cross entropy losses of the I images to be classified in the kth iterative training is then calculated to obtain the third model sub-loss L_ce^k in the kth iterative training.
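A sketch of the cross entropy term and its average, assuming one-hot class probability vectors derived from the predicted categories; names are illustrative.

```python
import numpy as np

def third_model_sub_loss(p_est: np.ndarray, y_onehot: np.ndarray, eps: float = 1e-12) -> float:
    """Average cross entropy over the I images to be classified.

    p_est and y_onehot have shape (I, N): the class probability estimation
    vectors and the class probability vectors (assumed one-hot here).
    """
    per_image = -np.sum(y_onehot * np.log(p_est + eps), axis=1)  # cross entropy L_ce,i^k
    return float(np.mean(per_image))
```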
The regularization loss between the class probability estimation vector P_i^k and the class probability prediction vector T_i^k is then calculated. For the ith image to be classified in the kth iterative training it is denoted L_reg,i^k and is determined by P_i^k, T_i^k and a preset value α; in this embodiment the preset value α = 1 is taken.

Taking the first loss weight parameter w_i^k as the weight of the regularization loss L_reg,i^k gives the second sub-loss of each image to be classified, and the average of the second sub-losses of the I images to be classified in the kth iterative training gives the second model sub-loss in the kth iterative training:

L_reg^k = (1/I) · Σ_{i=1..I} w_i^k · L_reg,i^k

where L_reg^k is the second model sub-loss in the kth iterative training, w_i^k is the first loss weight parameter of the ith image to be classified in the kth iterative training, and L_reg,i^k is the regularization loss of the ith image to be classified in the kth iterative training.
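A sketch of the second model sub-loss; a KL-divergence-style distance between the class probability prediction and estimation vectors, scaled by α, is assumed here purely for illustration and is not taken from the patent.

```python
import numpy as np

def second_model_sub_loss(p_est: np.ndarray, t_pred: np.ndarray,
                          weights: np.ndarray, alpha: float = 1.0,
                          eps: float = 1e-12) -> float:
    """Average of the weighted per-image regularization losses.

    The KL-style form of the regularization term below is an illustrative
    assumption; p_est and t_pred have shape (I, N), weights has shape (I,).
    """
    reg = alpha * np.sum(t_pred * (np.log(t_pred + eps) - np.log(p_est + eps)), axis=1)
    return float(np.mean(weights * reg))
```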
Then, the preset loss adjustment parameter in the kth iterative training is denoted λ_k. Based on the third model sub-loss L_ce^k and the second model sub-loss L_reg^k in the kth iterative training, the model loss in the kth iterative training is obtained as:

L_k = L_ce^k + λ_k · L_reg^k

where L_k is the model loss in the kth iterative training, L_ce^k is the third model sub-loss in the kth iterative training, L_reg^k is the second model sub-loss in the kth iterative training, and λ_k is the preset loss adjustment parameter in the kth iterative training.
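Combining the two model sub-losses as described, reusing the helper sketches above; λ_k is supplied by the caller and all assumptions from those sketches carry over.

```python
def total_model_loss(p_est, y_onehot, t_pred, weights, lam_k, alpha=1.0):
    """Model loss for one iterative training:
    L_k = third model sub-loss + lambda_k * second model sub-loss."""
    return (third_model_sub_loss(p_est, y_onehot)
            + lam_k * second_model_sub_loss(p_est, t_pred, weights, alpha))
```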
In this embodiment, the cross entropy loss of each image to be classified is determined according to the class probability estimation vector and the predicted image category, and the average of the cross entropy losses of all the images to be classified is determined as the third model sub-loss of the image classification model; the regularization loss between the class probability estimation vector and the class probability prediction vector is calculated and weighted by the first loss weight parameter to obtain the second sub-loss of each image to be classified, and the average of the second sub-losses of all the images to be classified is determined as the second model sub-loss of the image classification model. The share of the second model sub-loss in the model loss is dynamically adjusted according to the preset loss adjustment parameter, which improves the accuracy of the model loss.
In the above step, the model loss of the image classification model is determined according to the predicted image category, the first loss weight parameter, the class probability estimation vector and the class probability prediction vector of each image to be classified. The classification accuracy of the image classification model is characterized by the similarity between the class probability estimation vector of each image to be classified and its corresponding predicted image category and by the similarity between its class probability estimation vector and its class probability prediction vector, and the first loss weight parameter is used as the weighting basis for the similarity between the class probability estimation vector and the class probability prediction vector, so as to reduce the influence of the label imbalance problem on training and improve the accuracy of the image classification model.
And step S204, training the image classification model according to the model loss to obtain a trained image classification model.
The model loss measures the classification accuracy of the image classification model in the corresponding iterative training: the smaller the model loss, the higher the classification accuracy of the image classification model in that iterative training. In the iterative training of the image classification model, the image classification model is trained based on the model loss until the model loss converges, so as to obtain the trained image classification model.
In the embodiment, taking the human body posture of an image to be classified as an example, in the iterative training of a human body posture classification model, the human body posture classification model is trained based on model loss until the model loss converges, and a trained human body posture classification model is obtained.
And training the image classification model according to the model loss to obtain a trained image classification model, wherein the model loss characterizes the classification accuracy of the image classification model in the corresponding iterative training, and the image classification model is trained on the basis of the model loss until the model loss converges, so that the classification accuracy of the image classification model is improved.
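A minimal training-loop sketch corresponding to the "train until the model loss converges" description above might look as follows. The optimizer, learning rate, convergence tolerance and the interface of the model-loss function are assumptions for illustration only.

```python
import torch

def train_until_convergence(model, loader, model_loss_fn, max_epochs=100, tol=1e-4, lr=1e-3):
    """model_loss_fn(model, images, labels, k) -> scalar model loss L^k (assumed interface)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    prev_loss = float("inf")
    for k in range(1, max_epochs + 1):            # k-th iterative training
        running = 0.0
        for images, labels in loader:
            optimizer.zero_grad()
            loss = model_loss_fn(model, images, labels, k)
            loss.backward()
            optimizer.step()
            running += loss.item()
        epoch_loss = running / max(len(loader), 1)
        if abs(prev_loss - epoch_loss) < tol:     # model loss has converged
            break
        prev_loss = epoch_loss
    return model
```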
According to the method, the device and the system, the class probability estimation vector and the class probability prediction vector of each image to be classified are determined by inputting the image to be classified into the image classification model, the predicted image class of each image to be classified is determined according to the class probability prediction vector, the image number of each image class is determined according to the predicted image class of each image to be classified, the first loss weight parameter of each image to be classified is determined according to the image number of each image class and the preset image number threshold, and the reliability and the accuracy of the first loss weight parameter are improved by taking the preset image number threshold as a measurement standard of the image number distribution of the image class; according to the predicted image category, the first loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified, determining the model loss of the image classification model, training the image classification model according to the model loss to obtain a trained image classification model, and taking the first loss weight parameter as the weight basis of the similarity between the category probability estimation vector and the category probability prediction vector to reduce the training influence of the tag imbalance problem on the image classification model and improve the accuracy of the image classification model.
Referring to fig. 3, a flowchart of a training method for an image classification model according to a second embodiment of the present invention is shown, where the training method for an image classification model may be applied to the client in fig. 1, and the training method for an image classification model may include the following steps:
step S301, obtaining images to be classified in a training set, inputting the images to be classified into an image classification model, and determining a category probability estimation vector and a category probability prediction vector of each image to be classified;
step S302, a predicted image category of each image to be classified is determined according to the category probability prediction vector, the image number of each image category is determined according to the predicted image category of each image to be classified, and a first loss weight parameter of each image to be classified is determined according to the image number of each image category and a preset image number threshold.
Step S303, obtaining the original image category of each image to be classified, and determining the image quantity of each original image category according to the image to be classified and the original image category.
Step S304, determining a second loss weight parameter of each image to be classified according to the image number of each original image category and a preset image number threshold.
Step S305, determining model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified.
And step S306, training the image classification model according to the model loss to obtain a trained image classification model.
Step S301, step S302, and step S306 are respectively consistent with step S201, step S202, and step S204 in the training method of the image classification model provided in the first embodiment of the present invention, and are not described herein, where step S303, step S304, and step S305 are specifically as follows:
step S303, obtaining the original image category of each image to be classified, and determining the image quantity of each original image category according to the image to be classified and the original image category.
The unbalanced noisy training set in this embodiment includes images to be classified and class labels thereof, and then the original image class of each image to be classified can be determined according to the class labels, and then the image number of each original image class can be obtained through statistics according to the original image classes of all the images to be classified.
In this embodiment, taking the classification of the human body pose of the image to be classified as an example, the original human body pose class of each human body pose image can be determined according to the class label, and then the image number of each original human body pose class can be obtained through statistics according to the original human body pose classes of all human body pose images.
The step of obtaining the original image category of each image to be classified, and determining the image number of each original image category according to the image to be classified and the original image category, wherein the image number of each original image category is determined by the category label in the training set, and the method can be used as a calculation basis for the distribution of the original image category number in the training set.
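Determining the number of images of each original image category from the class labels is a simple counting step; one possible sketch using torch.bincount is shown below, with illustrative tensor values.

```python
import torch

# Original image category index of each image to be classified, read from the
# class labels of the unbalanced noisy training set (illustrative values)
original_categories = torch.tensor([0, 2, 1, 0, 2, 2, 0, 1])

# Number of images of each original image category
images_per_original_category = torch.bincount(original_categories)
print(images_per_original_category)   # tensor([3, 2, 3])
```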
Step S304, determining a second loss weight parameter of each image to be classified according to the image number of each original image category and a preset image number threshold.
When the numbers of images of different original image categories differ greatly, the image numbers of the original image categories are unevenly distributed and the label imbalance problem arises in the training set. In this case, the condition relied upon by a regularization-based image classification model, namely that the clean images to be classified are learned first and the noisy images to be classified are fitted only later, is no longer well satisfied, so the training effect of the image classification model is poor.
Therefore, in this embodiment, the loss weight of each original image category in the model loss is adjusted according to the number of images of the original image category, and correspondingly, the larger the number of images of the original image category is, the smaller the loss weight of the image to be classified corresponding to the original image category in the model loss is set, so that the training influence of the original tag imbalance problem on the image classification model is reduced, and the accuracy of the image classification model is improved.
Specifically, calculating a ratio of a preset image quantity threshold to the image quantity of each original image category, measuring the image quantity distribution of each original image category according to the ratio, and determining the ratio corresponding to each original image category as a second loss weight parameter of each image to be classified belonging to the original image category.
For example, denoting the number of images of the jth original image category in the kth iterative training as N_j^k, the second loss weight parameter of the jth original image category in the kth iterative training is:

v_j^k = S_0 / N_j^k

where v_j^k is the second loss weight parameter of the jth original image category in the kth iterative training, S_0 is the preset image quantity threshold, and N_j^k is the number of images of the jth original image category in the kth iterative training.

The second loss weight parameter of each image to be classified can then be determined according to the original image category to which the image to be classified belongs; correspondingly, the second loss weight parameter of the ith image to be classified in the kth iterative training is recorded as v_i^k.
In this embodiment, taking the classification of the human body pose of the image to be classified as an example, calculating the ratio of the preset image number threshold to the image number of each original human body pose class, measuring the image number distribution of each original human body pose class according to the ratio, and determining the ratio corresponding to each original human body pose class as the second loss weight parameter of each human body pose image belonging to the original human body pose class.
The step of determining the second loss weight parameter of each image to be classified according to the image number of each original image category and the preset image number threshold value, wherein the preset image number threshold value is used as a measurement standard of image number distribution, and the second loss weight parameter of each image to be classified is determined by calculating the ratio of the preset image number threshold value to the image number of each original image category, so that the reliability and the accuracy of the second loss weight parameter are improved.
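A minimal sketch of the ratio computation described above is given below; it covers both the second loss weight parameter (counts taken over the original image categories) and, by swapping the inputs, the first loss weight parameter (counts taken over the predicted image categories). The threshold value and function name are illustrative.

```python
import torch

def loss_weight_parameters(category_counts, image_categories, s0=100.0):
    """category_counts:  (C,) number of images of each category
       image_categories: (I,) category index of each image to be classified
       s0: preset image quantity threshold (illustrative value)
       Returns the per-image loss weight parameter s0 / N_category."""
    per_category_weight = s0 / category_counts.clamp_min(1).float()
    return per_category_weight[image_categories]

# Second loss weight parameters: counts and indices come from the original image categories.
# First loss weight parameters: counts and indices come from the predicted image categories.
```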
Step S305, determining model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified.
The category probability estimation vector and the category probability prediction vector are used to represent the probabilities that the corresponding image to be classified belongs to each image category. According to the similarity between the category probability estimation vector of each image to be classified and the corresponding predicted image category, and the similarity between the category probability estimation vector and the category probability prediction vector, the first sub-loss and the second sub-loss of each image to be classified are calculated to characterize the classification accuracy for each image to be classified.
Because the first loss weight parameter is obtained based on the category probability prediction vector, the second loss weight parameter is obtained based on the original image category calculation, the category probability estimation vector and the category probability prediction vector do not influence the second loss weight parameter, the first loss weight parameter is used as the loss weight of the second sub-loss, the second loss weight parameter is used as the loss weight of the first sub-loss, the first sub-loss and the second sub-loss are subjected to fusion calculation according to the first loss weight parameter and the second loss weight parameter, and the model loss of the image classification model is obtained, so that the training influence of the tag imbalance problem on the image classification model is reduced, and the accuracy of the image classification model is improved.
In this embodiment, taking the classification of the human body pose of the image to be classified as an example, the predicted human body pose class is used to represent the human body pose class to which the corresponding human body pose image belongs, and the class probability estimation vector and the class probability prediction vector are used to represent the probabilities that the corresponding human body pose image respectively belongs to each human body pose class, then the classification accuracy of each human body pose image can be represented according to the similarity between the class probability estimation vector of each human body pose image and the corresponding predicted human body pose class and the similarity between the class probability estimation vector and the class probability prediction vector, and the first sub-loss and the second sub-loss of each human body pose image are calculated. And then taking the first loss weight parameter as the loss weight of the second sub-loss, taking the second loss weight parameter as the loss weight of the first sub-loss, carrying out fusion calculation on the first sub-loss and the second sub-loss according to the first loss weight parameter and the second loss weight parameter to obtain the model loss of the human body posture classification model, so as to reduce the training influence of the tag imbalance problem on the human body posture classification model and improve the accuracy of the human body posture classification model.
Optionally, determining the model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified includes:
determining cross entropy loss of each image to be classified according to the class probability estimation vector and the predicted image class;
determining a first sub-loss of each image to be classified according to the second loss weight parameter and the cross entropy loss;
calculating the average value of the first sub-losses of all the images to be classified, and determining the average value of the first sub-losses as the first model sub-loss of the image classification model;
determining regularization loss of each image to be classified according to the category probability estimation vector and the category probability prediction vector;
determining a second sub-loss of each image to be classified according to the first loss weight parameter and the regularization loss;
calculating the average value of the second sub-losses of all the images to be classified, and determining the average value of the second sub-losses as the second model sub-loss of the image classification model;
and determining the model loss of the image classification model according to the first model sub-loss and the second model sub-loss of the image classification model.
In order to facilitate similarity calculation, firstly, determining a corresponding category probability vector according to a predicted image category, then calculating cross entropy loss between the category probability estimation vector and the category probability vector, taking a second loss weight parameter as the weight of the cross entropy loss to obtain a first sub-loss of each image to be classified, and then determining the average value of the first sub-losses of all the images to be classified as a first model sub-loss of an image classification model.
And calculating regularization loss between the class probability estimation vector and the class probability prediction vector, taking the first loss weight parameter as the weight of the regularization loss to obtain second sub-loss of each image to be classified, and then determining the average value of the second sub-loss of all the images to be classified as the second model sub-loss of the image classification model.
Then, the preset loss adjustment parameter is taken as the weight of the second model sub-loss to dynamically adjust the proportion of the second model sub-loss, and the model loss of the image classification model is determined according to the first model sub-loss and the second model sub-loss.
For example, for the ith image to be classified in the kth iterative training, the cross entropy loss between the class probability estimation vector P_i^k and the class probability vector Y_i^k is denoted as H_i^k. Taking the second loss weight parameter v_i^k as the weight of the cross entropy loss H_i^k gives the first sub-loss of each image to be classified. Calculating the average of the first sub-losses of the I images to be classified in the kth iterative training, the first model sub-loss in the kth iterative training is obtained as:

L_1^k = (1/I) · Σ_{i=1}^{I} v_i^k · H_i^k

where L_1^k is the first model sub-loss in the kth iterative training, v_i^k is the second loss weight parameter of the ith image to be classified in the kth iterative training, and H_i^k is the cross entropy loss of the ith image to be classified in the kth iterative training.
Then, based on the preset loss adjustment parameter λ^k, the first model sub-loss L_1^k and the second model sub-loss L_2^k in the kth iterative training, the model loss in the kth iterative training is obtained as:

L^k = L_1^k + λ^k · L_2^k

where L^k is the model loss in the kth iterative training, L_1^k is the first model sub-loss in the kth iterative training, L_2^k is the second model sub-loss in the kth iterative training, and λ^k is the preset loss adjustment parameter in the kth iterative training.
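A hedged PyTorch-style sketch of the model loss of this second embodiment is given below; as before, the regularization loss is assumed to be a squared-error distance between the two probability vectors, and all names are illustrative rather than fixed by this embodiment.

```python
import torch
import torch.nn.functional as F

def model_loss_second_embodiment(P, T, w_first, v_second, lam):
    """P:        (I, C) class probability estimation vectors
       T:        (I, C) class probability prediction vectors
       w_first:  (I,) first loss weight parameters (from the predicted image categories)
       v_second: (I,) second loss weight parameters (from the original image categories)
       lam:      preset loss adjustment parameter lambda^k"""
    pred = T.argmax(dim=1)
    # Cross entropy between P and the predicted category, weighted by the second loss weight
    ce = F.nll_loss(torch.log(P.clamp_min(1e-12)), pred, reduction="none")
    first_model_sub_loss = (v_second * ce).mean()
    # Regularization loss between P and T, weighted by the first loss weight
    reg = ((P - T) ** 2).sum(dim=1)
    second_model_sub_loss = (w_first * reg).mean()
    # Model loss: first model sub-loss plus the lambda-weighted second model sub-loss
    return first_model_sub_loss + lam * second_model_sub_loss
```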
The step of determining the model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified, characterizes the classification accuracy of the image classification model through the similarity between the category probability estimation vector and the corresponding predicted image category of each image to be classified and the similarity between the category probability estimation vector and the category probability prediction vector, takes the second loss weight parameter as the weight basis of the similarity between the category probability estimation vector and the corresponding predicted image category, takes the first loss weight parameter as the weight basis of the similarity between the category probability estimation vector and the category probability prediction vector, reduces the training influence of the tag imbalance problem on the image classification model, and improves the accuracy of the image classification model.
According to the method, the device and the system, the class probability estimation vector and the class probability prediction vector of each image to be classified are determined by inputting the image to be classified into the image classification model, the predicted image class of each image to be classified is determined according to the class probability prediction vector, the image number of each image class is determined according to the predicted image class of each image to be classified, the first loss weight parameter of each image to be classified is determined according to the image number of each image class and the preset image number threshold, and the reliability and the accuracy of the first loss weight parameter are improved by taking the preset image number threshold as a measurement standard of the image number distribution of the image class; the method comprises the steps of obtaining original image categories of each image to be classified, determining the image quantity of each original image category according to the image to be classified and the original image categories, determining a second loss weight parameter of each image to be classified according to the image quantity of each original image category and a preset image quantity threshold value, and improving the reliability and accuracy of the second loss weight parameter by taking the preset image quantity threshold value as a measurement standard of image quantity distribution of the original image categories; according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified, determining the model loss of the image classification model, training the image classification model according to the model loss to obtain a trained image classification model, taking the second loss weight parameter as a weight basis of similarity between the category probability estimation vector and the corresponding predicted image category, and taking the first loss weight parameter as a weight basis of similarity between the category probability estimation vector and the category probability prediction vector, so that the training influence of the tag imbalance problem on the image classification model is reduced, and the accuracy of the image classification model is improved.
The third embodiment of the present invention provides an image classification method, which uses the trained image classification model in the first embodiment or the second embodiment of the present invention to perform image classification, and may include the following steps:
the method comprises the steps of obtaining an image to be classified in an image classification task, inputting the image to be classified into a trained image classification model, outputting a class probability estimation vector of the image to be classified, and determining the image class of the image to be classified according to the class probability estimation vector of the image to be classified.
The image classification task can be classifying the human body posture of the image to be classified in scenes such as gait analysis, video monitoring and sports science; correspondingly, the class labels are various human body posture classes such as stationary, walking, running, squatting and jumping, and the image classification model is a human body posture classification model. It can also be classifying the face attributes of the image to be classified in scenes such as commodity audience analysis and population aging analysis; correspondingly, the class labels are various face attribute classes such as male, female, young, middle-aged and elderly, and the image classification model is a face attribute classification model. It can also be classifying the emotion of the person in the image to be classified in scenes such as teaching evaluation, commodity sales and personnel interviews; correspondingly, the class labels are various emotion classes such as happiness, nervousness, sadness, disgust and boredom, and the image classification model is a person emotion classification model. It can also be classifying the room style of the image to be classified in scenes such as house renting, decorating and house purchasing; correspondingly, the class labels are various room style classes such as rural, minimalist, classical, new Chinese, Mediterranean and Southeast Asian, and the image classification model is a room style classification model.
In this embodiment, taking a human body posture of an image to be classified as an example, after a human body posture image to be classified is obtained, the human body posture image is input into a trained human body posture classification model to perform feature extraction and feature analysis, and a class probability estimation vector of the human body posture image is output, where the class probability estimation vector can represent the probability that the corresponding human body posture image belongs to each human body posture class. The human body posture category corresponding to the maximum probability value in the category probability estimation vector can be determined as the human body posture category of the image to be classified, and the human body posture classification task is completed.
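A minimal inference sketch corresponding to the classification step described above is given below; whether the class probability estimation vector is obtained by a softmax over the model output, as assumed here, depends on the concrete model, and the class names are illustrative.

```python
import torch

@torch.no_grad()
def classify(model, image_tensor, class_names):
    """image_tensor: (1, 3, H, W) preprocessed image to be classified."""
    model.eval()
    logits = model(image_tensor)
    # Class probability estimation vector of the image to be classified
    probs = torch.softmax(logits, dim=1).squeeze(0)
    # Image category corresponding to the maximum probability value
    return class_names[int(probs.argmax())], probs

# e.g. posture_names = ["stationary", "walking", "running", "squatting", "jumping"]
# posture, probs = classify(trained_model, image_tensor, posture_names)
```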
It will be appreciated that, in the specific embodiments of the present application, where related data such as facial images, human body images and room images are involved, user permission or consent is required when the embodiments of the present application are applied to specific products or technologies, and the collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
According to the embodiment, the trained image classification model in the first embodiment or the second embodiment of the invention is obtained, the feature extraction and the feature analysis are carried out on the images to be classified, the class probability estimation vector of the images to be classified is output, the image class of the images to be classified is determined, and the classification accuracy of the images to be classified is improved.
Corresponding to the training method of the image classification model of the above embodiments, fig. 4 is a block diagram of a training device for an image classification model according to a fourth embodiment of the present invention. For convenience of explanation, only the portions related to the embodiment of the present invention are shown.
Referring to fig. 4, the training apparatus of the image classification model includes:
the probability prediction module 41 is configured to obtain images to be classified in the training set, input the images to be classified into the image classification model, and determine a category probability estimation vector and a category probability prediction vector of each image to be classified;
a parameter determining module 42, configured to determine a predicted image class of each image to be classified according to the class probability prediction vector, determine the number of images of each image class according to the predicted image class of each image to be classified, and determine a first loss weight parameter of each image to be classified according to the number of images of each image class and a preset image number threshold;
a loss calculation module 43, configured to determine a model loss of the image classification model according to the predicted image category, the first loss weight parameter, the category probability estimation vector, and the category probability prediction vector of each image to be classified;
the model training module 44 is configured to train the image classification model according to the model loss, so as to obtain a trained image classification model.
Optionally, the training device of the image classification model further includes:
the image quantity determining module is used for acquiring the original image category of each image to be classified, and determining the image quantity of each original image category according to the image to be classified and the original image category;
the second loss weight parameter determining module is used for determining a second loss weight parameter of each image to be classified according to the image quantity of each original image category and a preset image quantity threshold value.
Correspondingly, the loss calculation module is used for determining model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified.
Optionally, the second loss weight parameter determining module includes:
the cross entropy loss calculation sub-module is used for determining the cross entropy loss of each image to be classified according to the category probability estimation vector and the predicted image category;
the first sub-loss calculation sub-module is used for determining the first sub-loss of each image to be classified according to the second loss weight parameter and the cross entropy loss;
the first model sub-loss calculation sub-module is used for calculating the average value of the first sub-losses of all the images to be classified, and determining the average value of the first sub-losses as the first model sub-loss of the image classification model;
The regularization loss calculation sub-module is used for determining regularization loss of each image to be classified according to the category probability estimation vector and the category probability prediction vector;
the second sub-loss calculation sub-module is used for determining the second sub-loss of each image to be classified according to the first loss weight parameter and the regularization loss;
the second model sub-loss calculation sub-module is used for calculating the average value of the second sub-losses of all the images to be classified, and determining the average value of the second sub-losses as the second model sub-loss of the image classification model;
the first model loss calculation sub-module is used for determining the model loss of the image classification model according to the first model sub-loss and the second model sub-loss of the image classification model.
Optionally, the probability prediction module 41 includes:
the first class probability prediction sub-module is used for inputting the images to be classified into the image classification model and determining class probability estimation vectors of each image to be classified in the first iteration process;
a second class probability prediction sub-module, for estimating a vector from the preset class probability vector and the class probability of each image to be classified in the first iteration, determining a class probability prediction vector of each image to be classified in the first iteration process;
A third class probability prediction sub-module, configured to determine a class probability estimation vector of each image to be classified in a kth iteration process and a class probability prediction vector in a kth-1 iteration process, where k=2, 3, …;
and the fourth category probability prediction sub-module is used for determining the category probability prediction vector of each image to be classified in the kth iteration process according to the category probability estimation vector of each image to be classified in the kth iteration process and the category probability prediction vector of each image to be classified in the kth-1 iteration process.
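The exact rule by which the second and fourth sub-modules combine the preset (or previous) class probability prediction vector with the current class probability estimation vector is not spelled out in this excerpt; an exponential-moving-average style update is one common choice and is assumed in the sketch below, with an illustrative momentum coefficient.

```python
import torch

def update_prediction_vectors(P_k, T_prev, momentum=0.9):
    """P_k:    (I, C) class probability estimation vectors in the k-th iteration
       T_prev: (I, C) class probability prediction vectors in the (k-1)-th iteration,
               or the preset class probability vectors when k = 1
       Returns the class probability prediction vectors for the k-th iteration."""
    # Assumed exponential-moving-average style fusion of the two vectors
    return momentum * T_prev + (1.0 - momentum) * P_k
```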
Optionally, the parameter determining module 42 includes:
the vector element value determining submodule is used for determining the vector element value of each position in the class probability prediction vector according to the class probability prediction vector of each image to be classified;
and the predicted image category determining sub-module is used for determining the maximum value of the vector element values and determining the image category corresponding to the maximum value as the predicted image category of each image to be classified.
Optionally, the parameter determining module 42 includes:
the first parameter determination submodule is used for calculating the ratio of a preset image quantity threshold value to the image quantity of each image category and determining the ratio as a first loss weight parameter of each image category;
The second parameter determining sub-module is used for determining the first loss weight parameter of each image to be classified according to the predicted image category of each image to be classified and the first loss weight parameter of each image category.
Optionally, the loss calculation module 43 includes:
the cross entropy loss calculation sub-module is used for determining the cross entropy loss of each image to be classified according to the category probability estimation vector and the predicted image category;
the third model sub-loss calculation sub-module is used for calculating the average value of the cross entropy loss of all the images to be classified, and determining the average value of the cross entropy loss as the third model sub-loss of the image classification model;
the regularization loss calculation sub-module is used for determining regularization loss of each image to be classified according to the category probability estimation vector and the category probability prediction vector;
the second sub-loss calculation sub-module is used for determining the second sub-loss of each image to be classified according to the first loss weight parameter and the regularization loss;
the second model sub-loss calculation sub-module is used for calculating the average value of the second sub-losses of all the images to be classified, and determining the average value of the second sub-losses as the second model sub-loss of the image classification model;
And the second model loss calculation sub-module is used for determining the model loss of the image classification model according to the third model sub-loss and the second model sub-loss of the image classification model.
Fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention. As shown in fig. 5, the computer device of this embodiment includes: at least one processor (only one shown in fig. 5), a memory, and a computer program stored in the memory and executable on the at least one processor, the processor executing the computer program to perform the steps of any of the various model training method embodiments described above.
The computer device may include, but is not limited to, a processor, a memory. It will be appreciated by those skilled in the art that fig. 5 is merely an example of a computer device and is not intended to limit the computer device, and that a computer device may include more or fewer components than shown, or may combine certain components, or different components, such as may also include a network interface, a display screen, an input device, and the like.
The processor may be a CPU, but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory includes a readable storage medium, an internal memory, etc., where the internal memory may be the memory of the computer device, the internal memory providing an environment for the execution of an operating system and computer-readable instructions in the readable storage medium. The readable storage medium may be a hard disk of a computer device, and in other embodiments may be an external storage device of the computer device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. that are provided on the computer device. Further, the memory may also include both internal storage units and external storage devices of the computer device. The memory is used to store an operating system, application programs, boot loader (BootLoader), data, and other programs such as program codes of computer programs, and the like. The memory may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated; in practical application, the above functions may be allocated to different functional units and modules as needed, i.e. the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiment may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit; the integrated units may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing them from each other and are not used to limit the protection scope of the present invention. For the specific working process of the units and modules in the above device, reference may be made to the corresponding process in the foregoing method embodiments, which is not described herein again.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present invention may implement all or part of the flow of the methods of the above embodiments by instructing the relevant hardware through a computer program, and the computer program may be stored in a computer readable storage medium; when executed by a processor, the computer program may implement the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc.

The computer readable medium may include at least: any entity or device capable of carrying computer program code, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a U-disk, a removable hard disk, a magnetic disk or an optical disk. In some jurisdictions, computer readable media may not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The present invention may also be implemented as a computer program product for implementing all or part of the steps of the method embodiments described above, when the computer program product is run on a computer device, causing the computer device to execute the steps of the method embodiments described above.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described or detailed in a particular embodiment, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/computer device and method may be implemented in other manners. For example, the apparatus/computer device embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. The training method of the image classification model is characterized by comprising the following steps of:
acquiring images to be classified in a training set, inputting the images to be classified into an image classification model, and determining a category probability estimation vector and a category probability prediction vector of each image to be classified;
Determining a predicted image category of each image to be classified according to the category probability prediction vector, determining the image number of each image category according to the predicted image category of each image to be classified, and determining a first loss weight parameter of each image to be classified according to the image number of each image category and a preset image number threshold;
determining model losses of the image classification model according to the predicted image category, the first loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified;
and training the image classification model according to the model loss to obtain a trained image classification model.
2. The method of training an image classification model according to claim 1, further comprising:
acquiring an original image category of each image to be classified, and determining the image quantity of each original image category according to the image to be classified and the original image category;
determining a second loss weight parameter of each image to be classified according to the image quantity of each original image category and the preset image quantity threshold;
Correspondingly, determining the model penalty of the image classification model according to the predicted image category, the first penalty weight parameter, the category probability estimation vector and the category probability prediction vector for each of the images to be classified comprises:
and determining model loss of the image classification model according to the predicted image category, the first loss weight parameter, the second loss weight parameter, the category probability estimation vector and the category probability prediction vector of each image to be classified.
3. The method of training an image classification model according to claim 2, wherein the determining a model penalty of the image classification model from the predicted image class, the first penalty weight parameter, the second penalty weight parameter, the class probability estimation vector, and the class probability prediction vector for each of the images to be classified comprises:
determining cross entropy loss of each image to be classified according to the class probability estimation vector and the predicted image class;
determining a first sub-loss of each image to be classified according to the second loss weight parameter and the cross entropy loss;
Calculating the average value of the first sub-losses of all the images to be classified, and determining the average value of the first sub-losses as the first model sub-loss of the image classification model;
determining regularization loss of each image to be classified according to the category probability estimation vector and the category probability prediction vector;
determining a second sub-loss of each image to be classified according to the first loss weight parameter and the regularization loss;
calculating the average value of the second sub-losses of all the images to be classified, and determining the average value of the second sub-losses as the second model sub-loss of the image classification model;
and determining the model loss of the image classification model according to the first model sub-loss and the second model sub-loss of the image classification model.
4. The method of training an image classification model according to claim 1, wherein the inputting the images to be classified into an image classification model, determining a class probability estimation vector and a class probability prediction vector for each of the images to be classified, comprises:
inputting the images to be classified into an image classification model, and determining a class probability estimation vector of each image to be classified in a first iteration process;
Determining a category probability prediction vector of each image to be classified in the first iteration process according to a preset category probability vector and a category probability estimation vector of each image to be classified in the first iteration process;
determining a class probability estimation vector of each image to be classified in the kth iteration process and a class probability prediction vector in the kth-1 iteration process, wherein k=2, 3, …;
and determining the class probability prediction vector of each image to be classified in the kth iteration process according to the class probability estimation vector of each image to be classified in the kth iteration process and the class probability prediction vector in the kth-1 iteration process.
5. The method of claim 1, wherein determining a predicted image class for each of the images to be classified based on the class probability prediction vector comprises:
according to the category probability prediction vector of each image to be classified, determining a vector element value of each position in the category probability prediction vector;
and determining the maximum value of the vector element values, and determining the image category corresponding to the maximum value as the predicted image category of each image to be classified.
6. The method according to claim 1, wherein determining the first loss weight parameter of each image to be classified according to the number of images of each image class and a preset image number threshold value comprises:
calculating the ratio of the preset image quantity threshold value to the image quantity of each image category, and determining the ratio as a first loss weight parameter of each image category;
and determining the first loss weight parameter of each image to be classified according to the predicted image category of each image to be classified and the first loss weight parameter of each image category.
7. The method according to claim 1, wherein determining the model penalty of the image classification model based on the predicted image class, the first penalty weight parameter, the class probability estimation vector, and the class probability prediction vector for each of the images to be classified comprises:
determining cross entropy loss of each image to be classified according to the class probability estimation vector and the predicted image class;
calculating the average value of the cross entropy loss of all the images to be classified, and determining the average value of the cross entropy loss as the third model sub-loss of the image classification model;
Determining regularization loss of each image to be classified according to the category probability estimation vector and the category probability prediction vector;
determining a second sub-loss of each image to be classified according to the first loss weight parameter and the regularization loss;
calculating the average value of the second sub-losses of all the images to be classified, and determining the average value of the second sub-losses as the second model sub-loss of the image classification model;
and determining the model loss of the image classification model according to the third model sub-loss and the second model sub-loss of the image classification model.
8. An image classification model training apparatus, wherein the image classification model training apparatus comprises:
the probability prediction module is used for acquiring images to be classified in a training set, inputting the images to be classified into an image classification model, and determining a category probability estimation vector and a category probability prediction vector of each image to be classified;
the parameter determining module is used for determining the predicted image category of each image to be classified according to the category probability prediction vector, determining the image quantity of each image category according to the predicted image category of each image to be classified, and determining the first loss weight parameter of each image to be classified according to the image quantity of each image category and a preset image quantity threshold;
The loss calculation module is used for determining model loss of the image classification model according to the predicted image category of each image to be classified, the first loss weight parameter, the category probability estimation vector and the category probability prediction vector;
and the model training module is used for training the image classification model according to the model loss to obtain a trained image classification model.
9. A computer device, characterized in that it comprises a processor, a memory and a computer program stored in the memory and executable on the processor, which processor implements the training method of the image classification model according to any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the training method of the image classification model according to any one of claims 1 to 7.
CN202211728219.4A 2022-12-29 2022-12-29 Training method and device for image classification model, computer equipment and medium Pending CN116071613A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211728219.4A CN116071613A (en) 2022-12-29 2022-12-29 Training method and device for image classification model, computer equipment and medium

Publications (1)

Publication Number Publication Date
CN116071613A true CN116071613A (en) 2023-05-05

Family

ID=86178010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211728219.4A Pending CN116071613A (en) 2022-12-29 2022-12-29 Training method and device for image classification model, computer equipment and medium

Country Status (1)

Country Link
CN (1) CN116071613A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination