WO2024016945A1 - Training method for image classification model, image classification method, and related device - Google Patents

Training method for an image classification model, image classification method, and related device

Info

Publication number
WO2024016945A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
classification
model
unlabeled
type
Prior art date
Application number
PCT/CN2023/102430
Other languages
English (en)
French (fr)
Inventor
吕永春
朱徽
周迅溢
蒋宁
吴海英
Original Assignee
马上消费金融股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 马上消费金融股份有限公司
Publication of WO2024016945A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • The present application relates to the field of artificial intelligence technology, and in particular to a training method for an image classification model, an image classification method, and related equipment.
  • SSL: Semi-Supervised Learning, a training paradigm that learns from both labeled and unlabeled data.
  • This application provides an image classification model training method, an image classification method, and related equipment, which are intended to solve the problem that existing image classification models train poorly, which in turn affects the final image classification accuracy and stability.
  • In a first aspect, embodiments of the present application provide a training method for an image classification model.
  • The image classification model includes a first image classification sub-model and a second image classification sub-model.
  • The method includes: obtaining an image set used to train the image classification model, where the image set includes labeled images, unlabeled images, and category labels corresponding to the labeled images; classifying and identifying the labeled images and the unlabeled images respectively through a target image classification sub-model in the image classification model, to obtain first classification reference information of the labeled images and first classification reference information of the unlabeled images, where the target image classification sub-model is the first image classification sub-model or the second image classification sub-model; determining the classification loss of the target image classification sub-model based on the first classification reference information of the labeled images, the category labels corresponding to the labeled images, and second classification reference information of the unlabeled images.
  • The second classification reference information of the unlabeled images is obtained by classifying and identifying the unlabeled images through the other image classification sub-model in the image classification model besides the target image classification sub-model.
  • The classification loss of the target image classification sub-model refers to the classification loss of the first image classification sub-model or the classification loss of the second image classification sub-model.
  • Model parameters of the image classification model are adjusted based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model.
  • Embodiments of the present application further provide an image classification method, which includes: classifying and identifying an image to be processed through an image classification model to obtain a classification reference information set of the image to be processed, where the classification reference information set includes first target classification reference information and second target classification reference information.
  • The image classification model includes a first image classification sub-model and a second image classification sub-model.
  • The first image classification sub-model is used to classify and identify the image to be processed to obtain the first target classification reference information, and the second image classification sub-model is used to classify and identify the image to be processed to obtain the second target classification reference information.
  • The image classification model is obtained by training based on the training method of the first aspect; the category to which the image to be processed belongs is then determined based on the classification reference information set of the image to be processed.
  • Embodiments of the present application further provide a training device for an image classification model, where the image classification model includes a first image classification sub-model and a second image classification sub-model.
  • The training device includes: an acquisition unit, used to acquire an image set for training the image classification model, where the image set includes labeled images, unlabeled images, and category labels corresponding to the labeled images; a classification unit, used to classify and identify the labeled images and the unlabeled images respectively through a target image classification sub-model in the image classification model, to obtain first classification reference information of the labeled images and first classification reference information of the unlabeled images, where the target image classification sub-model is the first image classification sub-model or the second image classification sub-model; a determining unit, used to determine the classification loss of the target image classification sub-model based on the first classification reference information of the labeled images, the category labels corresponding to the labeled images, and second classification reference information of the unlabeled images.
  • The second classification reference information of the unlabeled images is obtained by classifying and identifying the unlabeled images through the other image classification sub-model in the image classification model besides the target image classification sub-model; the classification loss of the target image classification sub-model refers to the classification loss of the first image classification sub-model or the classification loss of the second image classification sub-model; and an adjustment unit is used to adjust model parameters of the image classification model based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model.
  • Embodiments of the present application further provide an image classification device, including: a classification unit, configured to classify and identify an image to be processed through an image classification model to obtain a classification reference information set of the image to be processed, where the classification reference information set includes first target classification reference information and second target classification reference information.
  • The image classification model includes a first image classification sub-model and a second image classification sub-model; the first image classification sub-model is used to classify and identify the image to be processed to obtain the first target classification reference information, and the second image classification sub-model is used to classify and identify the image to be processed to obtain the second target classification reference information.
  • The image classification model is trained based on the above training method; a determination unit is used to determine the category to which the image to be processed belongs based on the classification reference information set of the image to be processed.
  • Embodiments of the present application further provide an electronic device, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to execute the instructions to implement the above method.
  • Embodiments of the present application further provide a computer-readable storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the above method.
  • Figure 1 is a schematic flowchart of a training method for an image classification model provided by an embodiment of the present application.
  • Figure 2 is a schematic flowchart of a training method for an image classification model provided by another embodiment of the present application.
  • Figure 3 is a schematic flow chart of an image classification method provided by an embodiment of the present application.
  • Figure 4 is a schematic structural diagram of a training device for an image classification model provided by an embodiment of the present application.
  • Figure 5 is a schematic structural diagram of an image classification device provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • The existing image classification model training method uses a pre-trained teacher network to generate pseudo-labels for sample images, and then the student network uses the sample images with pseudo-labels to conduct semi-supervised learning, forming the final image classification model.
  • Because the pseudo-labels are generated entirely from the information the teacher network has learned from the sample images, this is not conducive to the student network fully mining and utilizing the information contained in the sample images; especially in the early stage of training, the confidence of the pseudo-labels generated by the teacher network is not high, resulting in a poor training effect for the image classification model, which in turn affects the final image classification accuracy and stability.
  • In view of this, embodiments of the present application propose a training method for an image classification model.
  • The one-way teacher-student relationship between the image classification sub-models of the image classification model is improved into a mutual teacher-student relationship: the information learned by one image classification sub-model from the sample images is used to provide pseudo-labels for semi-supervised learning of the other image classification sub-model, so that the image classification sub-models learn from and teach each other.
  • In this way, the information contained in the sample images can be fully mined and utilized, thereby improving the training effect of the image classification model and obtaining a more accurate and reliable image classification model.
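As a rough illustration, the mutual teacher-student exchange described above can be sketched in plain Python. The `predict_*` functions are stand-ins for real sub-models, and all probability values are hypothetical:

```python
def predict_a(image):
    """Stand-in for the first image classification sub-model's
    classification reference information (class probabilities)."""
    return [0.90, 0.05, 0.05]

def predict_b(image):
    """Stand-in for the second image classification sub-model's output."""
    return [0.20, 0.70, 0.10]

def to_pseudo_label(probs):
    """Turn one sub-model's output into a pseudo-label (predicted class)."""
    return probs.index(max(probs))

image = "some unlabeled sample"
# Mutual teacher-student: each sub-model's prediction supervises the other.
pseudo_for_b = to_pseudo_label(predict_a(image))  # A teaches B -> class 0
pseudo_for_a = to_pseudo_label(predict_b(image))  # B teaches A -> class 1
```

Each sub-model would then compute its unsupervised loss against the pseudo-label provided by the other, rather than against labels from a fixed teacher.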
  • the embodiment of the present application also proposes an image classification method, which can accurately classify and identify images by using the image classification model obtained by training.
  • both the training method of the image classification model and the image classification method provided by the embodiments of the present application can be executed by electronic devices or software installed in the electronic devices.
  • The electronic devices here may include terminal devices, such as smartphones, tablets, laptops, desktop computers, intelligent voice interaction devices, smart home appliances, smart watches, vehicle-mounted terminals, aircraft, etc.; alternatively, the electronic devices may also include servers, such as an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services.
  • Figure 1 is a schematic flowchart of a training method for an image classification model provided by one embodiment of the present application.
  • The method may include the following steps:
  • S102: Obtain an image set used to train the image classification model; the image set includes labeled images, unlabeled images, and category labels corresponding to the labeled images.
  • Labeled images refer to images with corresponding category labels, and unlabeled images refer to images without corresponding category labels.
  • the image set can include multiple labeled images and multiple unlabeled images, and multiple labeled images can belong to different categories.
  • The category label of a labeled image is used to represent the real category to which the labeled image belongs; specifically, it can represent the real category to which the content presented in the labeled image belongs.
  • For example, the category to which a labeled image belongs can be people, animals, landscapes, etc.; alternatively, it can be a subcategory subdivided under a certain major category. For the major category of people, for instance, the category to which the labeled image belongs can be depressed, happy, angry, etc., or it can be a real face, a fake face, etc.; for the major category of animals, the category to which the labeled image belongs can be cats, dogs, horses, mules, etc.
  • The category label corresponding to a labeled image can take any appropriate form; for example, it can be obtained by one-hot encoding the real category to which the labeled image belongs, or by word embedding of that real category.
  • The embodiment of the present application does not limit the form of the category label.
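For instance, a one-hot category label can be constructed as follows. The category list is a hypothetical example drawn from the categories mentioned above; the application does not fix a specific set:

```python
# Hypothetical preset categories, following the examples in the text.
categories = ["person", "animal", "landscape"]

def one_hot(category, categories):
    """One-hot encode the real category of a labeled image."""
    vec = [0.0] * len(categories)
    vec[categories.index(category)] = 1.0
    return vec

label = one_hot("animal", categories)  # [0.0, 1.0, 0.0]
```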
  • S104: Classify and identify the labeled images and the unlabeled images respectively through the target image classification sub-model in the image classification model, and obtain the first classification reference information of the labeled images and the first classification reference information of the unlabeled images.
  • The image classification model in the embodiment of the present application may include a first image classification sub-model and a second image classification sub-model; both sub-models can classify and identify each image in the image set and obtain corresponding classification reference information.
  • The image classification model is trained using a semi-supervised learning method to obtain the final image classification model.
  • The first image classification sub-model and the second image classification sub-model may have the same network structure; alternatively, in order to simplify the model structure and achieve compression and acceleration of the image classification model, they may have different network structures, for example the second image classification sub-model adopting a more streamlined structure than the first image classification sub-model.
  • The target image classification sub-model is the first image classification sub-model or the second image classification sub-model; that is, the first image classification sub-model and the second image classification sub-model can each serve as the target image classification sub-model.
  • The classification reference information that the target image classification sub-model outputs for a labeled image is called the first classification reference information of the labeled image, and the classification reference information that the other image classification sub-model in the image classification model (besides the target image classification sub-model) outputs for the labeled image is called the second classification reference information of the labeled image.
  • Similarly, the classification reference information that the target image classification sub-model outputs for an unlabeled image is called the first classification reference information of the unlabeled image, and the classification reference information that the other image classification sub-model outputs for the unlabeled image is called the second classification reference information of the unlabeled image.
  • The classification reference information of a labeled image may include at least one of the following: the probability that the labeled image is identified as belonging to each of a plurality of preset categories, the category to which the labeled image belongs, and so on; similarly, the classification reference information of an unlabeled image may include at least one of the following: the probability that the unlabeled image is identified as belonging to each of the plurality of preset categories, the category to which the unlabeled image belongs, and so on.
  • For example, if the multiple preset categories include cats, dogs, horses, and mules, the classification reference information of an image may include the probability that the image is recognized as belonging to a cat, a dog, a horse, and a mule respectively, and the category to which the image belongs may be the category corresponding to the highest of these probabilities.
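Reading the category out of such classification reference information can be sketched as follows; the probability values are made up for illustration:

```python
# Classification reference information for one image: the probability of
# belonging to each preset category (example categories from the text).
categories = ["cat", "dog", "horse", "mule"]
probs = [0.70, 0.10, 0.15, 0.05]

# The category the image is taken to belong to is the one with the
# highest probability.
predicted_category = categories[probs.index(max(probs))]
```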
  • The images in the image set can be obtained by data enhancement processing of initial images; that is to say, before the above S102, the training method provided by the embodiment of this application may also include: performing data enhancement processing on the images in an initial image set to obtain the image set used for training the image classification model, so that the images in the obtained image set contain disturbance information.
  • The initial image set includes initial unlabeled images and initial labeled images.
  • Data enhancement processing of various enhancement levels can be performed on an initial unlabeled image to obtain multiple unlabeled images.
  • Accordingly, the labeled images can be classified and identified through the target image classification sub-model to obtain the first classification reference information of the labeled images, and each of the multiple unlabeled images can be classified and identified through the target image classification sub-model to obtain the first classification reference information of each unlabeled image.
  • Performing data enhancement processing of various enhancement levels on the initial unlabeled image can be implemented as follows: perform weakly-augmented processing on the initial unlabeled image to obtain a first type of unlabeled image, and perform strongly-augmented processing on the initial unlabeled image to obtain a second type of unlabeled image. That is, the unlabeled images include the first type of unlabeled images and the second type of unlabeled images, and the enhancement degree corresponding to the first type of unlabeled images is smaller than that corresponding to the second type of unlabeled images.
  • Weak enhancement processing may include but is not limited to at least one of the following: translation, flipping, etc.; strong enhancement processing may include but is not limited to at least one of the following: occlusion, color transformation, random erasure (Random Erase), etc.
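On a toy grayscale image represented as a 2D list, the two enhancement levels could be sketched as below, using flipping as the weak enhancement and random erasure as the strong one. A real pipeline would operate on image tensors; this is only a structural sketch:

```python
import random

def weak_augment(img):
    """Weak enhancement: a horizontal flip, a small label-preserving disturbance."""
    return [row[::-1] for row in img]

def strong_augment(img, erase_size=2, seed=0):
    """Strong enhancement: random erasure blanks out a square patch,
    a much larger disturbance that may partially distort the image."""
    rng = random.Random(seed)
    h, w = len(img), len(img[0])
    top = rng.randrange(h - erase_size + 1)
    left = rng.randrange(w - erase_size + 1)
    out = [row[:] for row in img]  # copy so the initial image is untouched
    for i in range(top, top + erase_size):
        for j in range(left, left + erase_size):
            out[i][j] = 0
    return out

initial = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
first_type = weak_augment(initial)     # first type of unlabeled image
second_type = strong_augment(initial)  # second type of unlabeled image
```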
  • Because the enhancement degree of the weak enhancement processing is small, that is, the disturbance introduced into the initial unlabeled image is small, the obtained first type of unlabeled image is not distorted, so the target image classification sub-model can learn the noise in the first type of unlabeled images on the basis of obtaining accurate first classification reference information, which is conducive to improving the learning effect of the target image classification sub-model.
  • In addition, considering that using only weakly enhanced images may cause the target image classification sub-model to fall into an overfitting state and fail to extract the essential features of the first type of unlabeled images, strong enhancement processing is also applied; the disturbance it introduces is relatively large and may distort the image, but the second type of unlabeled image still retains features sufficient to identify the category.
  • Inputting unlabeled images with different enhancement levels into the target image classification sub-model and having it learn from them is beneficial to improving its learning effect and enhancing its expressive ability.
  • In addition, the training method provided by the embodiment of the present application may also include: performing weak enhancement processing on the initial labeled images in the initial image set to obtain the labeled images.
  • S106: Determine the classification loss of the target image classification sub-model based on the first classification reference information of the labeled image, the category label corresponding to the labeled image, and the second classification reference information of the unlabeled image.
  • the second classification reference information of the unlabeled image is obtained by classifying and identifying the unlabeled image by other image classification sub-models in the image classification model except the target image classification sub-model.
  • Specifically, for the first image classification sub-model, its classification loss can be determined based on the classification reference information that the first image classification sub-model outputs for the labeled images in the image set, the category labels corresponding to the labeled images, and the classification reference information that the second image classification sub-model outputs for the unlabeled images in the image set; for the second image classification sub-model, its classification loss can be determined based on the classification reference information that the second image classification sub-model outputs for the labeled images, the category labels corresponding to the labeled images, and the classification reference information that the first image classification sub-model outputs for the unlabeled images.
  • In this way, the first image classification sub-model and the second image classification sub-model can use the information they have each learned to provide guidance to each other, so that the one-way teacher-student relationship between them becomes a mutual teacher-student relationship, which is conducive to complementary learning and teaching between the image classification sub-models; the information contained in the images in the image set can thus be fully explored and utilized, which is conducive to improving the training effect of the image classification model.
  • The classification loss of an image classification sub-model is used to indicate the deviation between the classification reference information obtained by the image classification sub-model when classifying and identifying an input image and the label (or pseudo-label) information corresponding to that input image.
  • The learning task performed by each image classification sub-model on the input image set is a semi-supervised learning task, which combines a supervised learning task based on the labeled images and their corresponding category labels with an unsupervised learning task based on the unlabeled images, and each learning task may produce a certain classification loss.
  • Therefore, the classification loss of the target image classification sub-model can include the supervised loss and the unsupervised loss of the target image classification sub-model.
  • The supervised loss of the target image classification sub-model is used to represent the classification loss caused by the target image classification sub-model performing the supervised learning task, and its unsupervised loss is used to represent the classification loss caused by performing the unsupervised learning task.
  • The supervised loss of the target image classification sub-model can be determined based on the first classification reference information of the labeled images in the image set and the category labels corresponding to the labeled images; the unsupervised loss of the target image classification sub-model can be determined based on the first classification reference information and the second classification reference information of the unlabeled images in the image set.
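Taking cross-entropy as the preset loss, the combination of the two losses can be sketched like this. The probability values, label indices, and the weighting factor `lambda_u` are all illustrative assumptions, not values fixed by the application:

```python
import math

def cross_entropy(probs, target_index):
    """Cross-entropy between predicted probabilities and a hard label."""
    return -math.log(probs[target_index])

# Supervised loss: first classification reference information of a labeled
# image versus its true category label (index 0).
supervised_loss = cross_entropy([0.8, 0.1, 0.1], 0)

# Unsupervised loss: first classification reference information of an
# unlabeled image versus the pseudo-label (index 2) derived from the
# other sub-model's second classification reference information.
unsupervised_loss = cross_entropy([0.2, 0.1, 0.7], 2)

# Classification loss of the target sub-model; lambda_u balances the two.
lambda_u = 1.0
classification_loss = supervised_loss + lambda_u * unsupervised_loss
```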
  • As mentioned above, the unlabeled images may be obtained by subjecting the initial unlabeled images to data enhancement processing of various enhancement levels before being input into the target image classification sub-model; in this case, the resulting first classification reference information of the unlabeled images includes the first classification reference information of each enhanced unlabeled image.
  • Accordingly, the supervised loss of the target image classification sub-model can be determined based on the first classification reference information of the labeled images in the image set and the category labels corresponding to the labeled images, and the unsupervised loss of the target image classification sub-model can be determined based on the first classification reference information and the second classification reference information of the unlabeled images.
  • As mentioned above, the unlabeled images include the first type of unlabeled images and the second type of unlabeled images, and the enhancement degree corresponding to the first type of unlabeled images is smaller than that corresponding to the second type. Accordingly, the above S106 may specifically include the following steps:
  • First, generate the first type of pseudo-label corresponding to a first type of unlabeled image based on the second classification reference information of the first type of unlabeled image. This is equivalent to artificially labeling the first type of unlabeled image to indicate the predicted category to which it belongs, thus providing guidance for the unsupervised learning task of the target image classification sub-model.
  • The first type of pseudo-label corresponding to the first type of unlabeled image may be used to indicate the predicted category to which the first type of unlabeled image belongs.
  • Alternatively, the first type of pseudo-label can also be used to indicate a first target object area in the first type of unlabeled image and the predicted category to which the first target object area belongs, where the first target object area refers to the area where the target object in the first type of unlabeled image, as recognized by the target image classification sub-model, is located; for example, the first target object area may refer to the face area in the first type of unlabeled image.
  • In some embodiments, the first classification reference information of a first type of unlabeled image includes the probability that the first type of unlabeled image is identified as belonging to each of the plurality of preset categories.
  • In this case, the above S141 can be specifically implemented as: based on the second classification reference information of the first type of unlabeled image, determine the preset category corresponding to the maximum probability among the multiple preset categories; if the probability corresponding to that preset category is greater than a preset probability threshold, generate the first type of pseudo-label of the first type of unlabeled image based on that preset category.
  • Specifically, based on the classification reference information that the second image classification sub-model outputs for the first type of unlabeled image, the preset category corresponding to the maximum probability is determined from the multiple preset categories; if its probability is greater than the preset probability threshold, the pseudo-label of the first type of unlabeled image corresponding to the first image classification sub-model is generated based on that preset category.
  • Likewise, based on the classification reference information that the first image classification sub-model outputs for the first type of unlabeled image, the preset category corresponding to the maximum probability is determined from the multiple preset categories; if its probability is greater than the preset probability threshold, the pseudo-label of the first type of unlabeled image corresponding to the second image classification sub-model is generated based on that preset category.
  • For example, the pseudo-label of the first type of unlabeled image corresponding to the first image classification sub-model can be determined by the following formula (1), and the pseudo-label of the first type of unlabeled image corresponding to the second image classification sub-model can be determined by the following formula (2):
  • the degree of enhancement corresponding to the first type of unlabeled image is small, that is, the disturbance introduced to the initial unlabeled image is small, the first type of unlabeled image will not be distorted.
  • the corresponding degree of the first type of unlabeled image is When the maximum probability in the classification reference information is greater than the preset probability threshold, the corresponding pseudo label is generated based on the preset category corresponding to the maximum probability, which can greatly reduce the possibility of introducing noise or errors in the pseudo label, thereby ensuring that each The image classification sub-model learns the noise in the first type of unlabeled images on the basis of obtaining accurate classification and recognition results, which is beneficial to improving the learning effect of each image classification sub-model.
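The max-probability-plus-threshold rule can be sketched as follows; the 0.95 threshold is a hypothetical choice, since the application only requires some preset probability threshold:

```python
def first_type_pseudo_label(probs, threshold=0.95):
    """Generate a pseudo-label from the other sub-model's classification
    reference information for a first type (weakly enhanced) unlabeled
    image: keep the max-probability category only if it clears the
    threshold, otherwise produce no pseudo-label."""
    p_max = max(probs)
    if p_max > threshold:
        return probs.index(p_max)
    return None  # too uncertain; the sample contributes no pseudo-label

confident = first_type_pseudo_label([0.97, 0.02, 0.01])  # class 0
uncertain = first_type_pseudo_label([0.50, 0.30, 0.20])  # no pseudo-label
```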
  • Specifically, the unsupervised loss of the first image classification sub-model may be determined based on the classification reference information that the first image classification sub-model outputs for the second type of unlabeled images in the image set and the pseudo-labels of the corresponding unlabeled images for the first image classification sub-model; likewise, the unsupervised loss of the second image classification sub-model may be determined based on the classification reference information that the second image classification sub-model outputs for the second type of unlabeled images in the image set and the pseudo-labels of the corresponding unlabeled images for the second image classification sub-model.
• The unsupervised loss of the target image classification sub-model can be determined based on the first classification reference information of each second type unlabeled image in the image set, the first type pseudo-label of each unlabeled image, and a preset loss function.
  • the preset loss function can be set according to actual needs, including but not limited to at least one of a cross-entropy loss function, a classification loss function, and a bounding box regression loss function.
• Specifically, the unsupervised sub-loss corresponding to a first type unlabeled image can be determined based on the first classification reference information of the corresponding second type unlabeled image, the first type pseudo-label of the first type unlabeled image, and the preset loss function; further, the weighted sum of the unsupervised sub-losses corresponding to the first type unlabeled images in the image set is determined as the unsupervised loss of the target image classification sub-model.
• Further, loss weights can be set for first type unlabeled images based on the confidence of their first type pseudo-labels, for example, giving a higher loss weight to a first type unlabeled image whose pseudo-label has high confidence and a lower loss weight to a first type unlabeled image whose pseudo-label has low confidence. The lower loss weight can counteract the noise in the first type pseudo-labels to a certain extent, which is beneficial to improving the training effect of the image classification model.
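• As a sketch of the confidence-weighted unsupervised loss described above, the following computes a weighted sum of per-image sub-losses. The helper name, the use of cross-entropy as the preset loss function, and the skipping of images without a confident pseudo-label are illustrative assumptions, not details of the embodiment.

```python
import math

def unsupervised_loss(probs_strong, pseudo_labels, weights):
    """Weighted sum of per-image cross-entropy sub-losses.

    probs_strong: per-image probability vectors predicted for the strongly
    enhanced (second type) views; pseudo_labels: category indices generated
    from the weakly enhanced views (None = no confident pseudo-label);
    weights: per-image loss weights reflecting pseudo-label confidence.
    """
    total = 0.0
    for p, y, w in zip(probs_strong, pseudo_labels, weights):
        if y is None:  # skip images without a confident pseudo-label
            continue
        total += w * (-math.log(max(p[y], 1e-12)))  # cross-entropy term
    return total

# A perfectly predicted pseudo-label contributes (essentially) zero loss.
assert abs(unsupervised_loss([[0.0, 1.0]], [1], [1.0])) < 1e-9
```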
• The training method of the image classification model provided by the embodiments of the present application may also include: determining the loss weight corresponding to the first type unlabeled image based on the first type pseudo-label corresponding to the first type unlabeled image and the second type pseudo-label corresponding to the first type unlabeled image, where the second type pseudo-label corresponding to the first type unlabeled image is generated based on the first classification reference information of the first type unlabeled image.
• The specific generation method is similar to that used to generate the first type pseudo-label of the first type unlabeled image.
• Then, the unsupervised sub-loss corresponding to the first type unlabeled image is determined, and the unsupervised loss of the target image classification sub-model is determined based on the loss weight corresponding to each first type unlabeled image and the corresponding unsupervised sub-loss. For example, the unsupervised sub-losses corresponding to the first type unlabeled images in the image set can be weighted and summed to obtain the unsupervised loss of the target image classification sub-model.
  • the classification reference information obtained after predicting the same image by different image classification sub-models should theoretically be the same, and the pseudo labels corresponding to the same image corresponding to different image classification sub-models should also be the same.
• Therefore, the prediction category indicated by the first type pseudo-label of a first type unlabeled image can be compared with the prediction category indicated by its second type pseudo-label. If the two are inconsistent, it can be determined that the confidence of the two pseudo-labels is low, and a lower loss weight (i.e., the first preset weight) can be assigned to the first type unlabeled image; if the two are consistent, it can be determined that the confidence of the two pseudo-labels is high, and a higher loss weight (i.e., the second preset weight) can be assigned to the first type unlabeled image.
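• The agreement check just described reduces to a small rule. The concrete weight values below stand in for the first and second preset weights and are our assumptions.

```python
def agreement_loss_weight(label_a, label_b, low=0.5, high=1.0):
    """Assign a loss weight from the agreement of the two pseudo-labels.

    label_a / label_b: prediction categories indicated by the first type
    and second type pseudo-labels of the same first type unlabeled image.
    low / high correspond to the first and second preset weights; the
    concrete values 0.5 and 1.0 are illustrative only.
    """
    return high if label_a == label_b else low

assert agreement_loss_weight(3, 3) == 1.0  # consistent -> second preset weight
assert agreement_loss_weight(3, 7) == 0.5  # inconsistent -> first preset weight
```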
• The first type pseudo-label corresponding to the first type unlabeled image can be used to indicate the first target object region in the first type unlabeled image and the prediction category to which the first target object region belongs; the second type pseudo-label corresponding to the first type unlabeled image can be used to indicate the second target object region in the first type unlabeled image and the prediction category to which the second target object region belongs.
• In this case, the intersection-over-union ratio between the first target object region and the second target object region can be determined, and the prediction category to which the first target object region belongs can be compared with the prediction category to which the second target object region belongs to obtain a comparison result; further, the loss weight corresponding to the first type unlabeled image is determined based on the intersection-over-union ratio and the comparison result.
• Specifically, if the intersection-over-union ratio is less than or equal to a preset ratio, or the comparison result indicates that the prediction category to which the first target object region belongs is inconsistent with the prediction category to which the second target object region belongs, it can be determined that the confidence of the first type pseudo-label and the second type pseudo-label corresponding to the first type unlabeled image is low, and the first preset weight can be assigned to the first type unlabeled image; if the intersection-over-union ratio is greater than the preset ratio and the comparison result indicates that the two prediction categories are consistent, it can be determined that the confidence of the first type pseudo-label and the second type pseudo-label corresponding to the first type unlabeled image is high, and the second preset weight can be assigned to the first type unlabeled image, where the second preset weight is greater than the first preset weight.
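• For the object detection case, the weight rule above can be sketched as follows; the IoU threshold, the two weight values, and the (x1, y1, x2, y2) box convention are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def detection_loss_weight(box_a, cat_a, box_b, cat_b,
                          iou_threshold=0.5, low=0.5, high=1.0):
    """Second preset weight (high) only when the two pseudo-labeled regions
    overlap enough AND their prediction categories agree; otherwise the
    first preset weight (low). Threshold and weights are illustrative."""
    if iou(box_a, box_b) > iou_threshold and cat_a == cat_b:
        return high
    return low

assert detection_loss_weight((0, 0, 2, 2), 1, (0, 0, 2, 2), 1) == 1.0
assert detection_loss_weight((0, 0, 2, 2), 1, (0, 0, 2, 2), 2) == 0.5
```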
  • the unsupervised loss of the target image classification sub-model can be determined through the following formula (3).
• N_u represents the number of first type unlabeled images in the image set;
• B represents the image set;
• b ∈ B_h indicates that the confidence of the first type pseudo-label and the second type pseudo-label corresponding to the b-th first type unlabeled image is high;
• b ∈ B\B_h indicates that the confidence of the first type pseudo-label and the second type pseudo-label corresponding to the b-th first type unlabeled image is low;
• one symbol in formula (3) represents the classification loss function, and another represents the bounding box regression loss function;
• a further symbol represents the loss weight corresponding to first type unlabeled images whose pseudo-labels have higher confidence.
• Since the confidence of the pseudo-labels generated in the early stage of training is usually not high, they can easily lead to poor training results for the image classification model; meanwhile, the pseudo-labels generated after classifying and identifying the same image through different image classification sub-models should, in theory, be the same. Based on this, the confidence of the pseudo-labels can be judged from the pseudo-labels of the first type unlabeled images corresponding to each image classification sub-model, and corresponding loss weights can then be set for the first type unlabeled images, which counteracts the noise in the pseudo-labels to a certain extent and is beneficial to improving the training effect of the image classification model.
  • the supervised loss of the target image classification submodel can be determined through the following formula (4):
• N_l represents the number of labeled images; one symbol in formula (4) represents the category label corresponding to the labeled image, and another represents the classification loss function.
  • the classification loss of the target image classification sub-model is determined as follows:
• Each image classification sub-model in the image classification model performs a semi-supervised learning task based on the image set, which combines supervised learning based on the labeled images and their corresponding category labels with unsupervised learning based on the unlabeled images and their corresponding pseudo-labels, and a certain classification loss may be produced under each learning task.
• The supervised loss of an image classification sub-model is determined based on the classification reference information output by that sub-model for the labeled images and the category labels corresponding to the labeled images, so that the supervised loss accurately reflects the classification loss produced by the sub-model when performing the supervised learning task. In theory, the classification reference information obtained by inputting differently augmented versions of the same image into the same image classification sub-model should follow the same rules. Accordingly, based on the classification reference information of the first type unlabeled image (with the smaller enhancement intensity) corresponding to one image classification sub-model, a pseudo-label corresponding to the other image classification sub-model is generated for that first type unlabeled image; the unsupervised loss of each image classification sub-model is then determined using the pseudo-labels of the first type unlabeled images corresponding to each sub-model and the classification reference information of the second type unlabeled images (with the greater enhancement intensity) corresponding to each sub-model. This not only allows the unsupervised loss to accurately reflect the classification loss produced by the corresponding sub-model when performing the unsupervised learning task, but also enables each sub-model, during unsupervised learning, to use the classification reference information of the weakly enhanced first type unlabeled images to supervise the classification reference information of the strongly enhanced second type unlabeled images, which is beneficial to improving the classification accuracy of each image classification sub-model.
• It should be noted that the above is only one specific implementation of determining the classification loss of the target image classification sub-model; in practice, the classification loss of the target image classification sub-model can also be determined in other ways, which is not limited in the embodiments of the present application.
  • S108 Adjust the model parameters of the image classification model based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model.
  • the above S108 may include the following steps:
  • the classification loss of the image classification model is used to represent the difference between the classification reference information obtained by the image classification model for classifying and identifying the input image and the real category to which the input image belongs.
  • the classification loss of the image classification model can be determined by the following formula (6):
• λ_u represents the loss weight corresponding to the unsupervised loss.
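• A plausible reading of how the losses combine into the classification loss of the image classification model is sketched below; the per-sub-model weights and the placement of λ_u are assumptions rather than details taken from formulas (5) and (6).

```python
def model_classification_loss(sup_losses, unsup_losses, lambda_u=1.0,
                              submodel_weights=(0.5, 0.5)):
    """Weighted sum of the two sub-models' classification losses.

    Each sub-model's classification loss is taken as its supervised loss
    plus lambda_u times its unsupervised loss; the two sub-model losses
    are then weighted and summed. All weight values are illustrative.
    """
    per_model = [s + lambda_u * u for s, u in zip(sup_losses, unsup_losses)]
    return sum(w * l for w, l in zip(submodel_weights, per_model))

# Two sub-models with supervised losses 1.0 / 2.0 and unsupervised 0.4 / 0.6:
assert model_classification_loss([1.0, 2.0], [0.4, 0.6], lambda_u=1.0) == 2.0
```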
  • the model parameters of the image classification model may include model parameters of the first image classification sub-model and model parameters of the second image classification sub-model.
• Model parameters may include, but are not limited to, the number of neurons in each network layer of an image classification sub-model, the connection relationships between neurons in different network layers and the weights of the connecting edges, the biases corresponding to the neurons in each network layer, and so on.
• The classification loss of the image classification model reflects the difference between the classification reference information output by the image classification model when classifying and identifying the input image and the real category to which the input image belongs.
• Therefore, the back propagation algorithm can be used to adjust the respective model parameters of the first image classification sub-model and the second image classification sub-model based on the classification loss of the image classification model.
• Specifically, based on the classification loss of the image classification model and the current model parameters of the first image classification sub-model and the second image classification sub-model, the back propagation algorithm is used to determine the prediction losses contributed by each network layer of the first image classification sub-model and the second image classification sub-model; then, with the goal of reducing the classification loss of the image classification model, the relevant parameters of each network layer in the first image classification sub-model and in the second image classification sub-model are adjusted layer by layer.
• It should be noted that the above process is only a single model parameter adjustment; in practice, the image classification model may need to be adjusted many times. Therefore, the above steps S102 to S108 can be repeated until a preset training stop condition is met, thereby obtaining the final image classification model.
  • the preset training stop condition may be that the classification loss of the image classification model is less than the preset loss threshold, or it may be that the number of adjustments reaches the preset number, etc. This is not limited in the embodiments of the present application.
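• The repeated application of steps S102 to S108 until a preset training stop condition is met can be driven by a loop of the following shape; the callable interface and the concrete stop thresholds are our assumptions, not details of the embodiment.

```python
def train(model_step, max_rounds=100, loss_threshold=0.01):
    """Repeat steps S102-S108 until a preset stop condition is met.

    model_step: a callable performing one pass of S102-S108 (forward pass,
    loss computation, parameter adjustment) and returning the classification
    loss of the image classification model. Training stops when the loss
    drops below loss_threshold or max_rounds adjustments have been made.
    """
    loss, rounds = float("inf"), 0
    while rounds < max_rounds and loss >= loss_threshold:
        loss = model_step()  # one S102-S108 adjustment
        rounds += 1
    return loss, rounds

# A toy "model" whose loss halves on every adjustment reaches the
# threshold after 7 rounds (0.5 ** 7 = 0.0078125 < 0.01).
state = {"loss": 1.0}
def halve():
    state["loss"] *= 0.5
    return state["loss"]

final_loss, rounds = train(halve)
assert final_loss < 0.01 and rounds == 7
```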
• The classification loss produced by each image classification sub-model affects the classification accuracy of the image classification model. For this reason, the weighted sum of the classification losses of the image classification sub-models is used as the classification loss of the image classification model, so that the classification loss of the image classification model more accurately reflects the classification deviation of the model; adjusting the model parameters of the image classification model based on this classification loss is then beneficial to improving its classification accuracy.
• In the training method of the image classification model provided by the embodiments of the present application, under a semi-supervised learning framework, each image classification sub-model in the image classification model separately classifies and identifies each image in the image set, obtaining multiple pieces of classification reference information for each image, with each piece of classification reference information corresponding to one image classification sub-model. Then, with each image classification sub-model taken in turn as the target image classification sub-model, the classification loss of the target image classification sub-model is determined based on the classification reference information of the labeled images corresponding to the target image classification sub-model, the category labels corresponding to the labeled images, and the classification reference information of the unlabeled images corresponding to the other image classification sub-model. In other words, the information learned from the image set by one image classification sub-model guides the other, changing the one-way teacher-student relationship between the sub-models of the image classification model into a mutual teacher-student relationship. Further, adjusting the model parameters of the image classification model based on the classification loss of each target image classification sub-model makes full use of this mutual teacher-student relationship, allowing the sub-models to learn from and teach each other, so that the information contained in the image set is fully exploited and utilized, thereby improving the training effect and yielding a more accurate and reliable image classification model.
• The above embodiments introduce the training method of the image classification model. Image classification models for different application scenarios can be trained with this method, and the image set used for training, together with the labels of the images it contains, can be selected according to the application scenario.
  • Application scenarios applicable to the training method provided by the embodiments of the present application may include, but are not limited to, target detection, facial expression classification, natural animal classification, handwritten digit recognition and other scenarios.
• The category label corresponding to a labeled image is used to mark the target object contained in the labeled image and the category to which the target object belongs, such as cat, dog, or horse. The image classification model trained by the training method provided by the above embodiments of the present application can detect the region in which the target object is located in the image to be processed and identify the category to which the target object belongs.
  • the trained image classification model can be applied to any scene that requires classification and recognition of images.
• The application process of the image classification model is explained in detail below.
  • Embodiments of the present application also provide an image classification method using an image classification model, which can classify and identify images to be processed based on the image classification model trained by the above training method.
  • Figure 3 is a schematic flow chart of an image classification method according to an embodiment of the present application.
  • the method may include the following steps:
  • S302 Classify and identify the image to be processed through the image classification model, and obtain a classification reference information set of the image to be processed.
  • the classification reference information set of the image to be processed includes the first target classification reference information of the image to be processed and the second target classification reference information of the image to be processed.
  • the image classification model includes a first image classification sub-model and a second image classification sub-model. The first image classification sub-model is used to classify and identify the image to be processed, and obtain the first target classification reference information of the image to be processed; the second image classification sub-model is used to classify and identify the image to be processed, and obtain the second target of the image to be processed. Classification reference information.
• The category to which the image to be processed belongs can be determined based on the classification reference information of the image to be processed corresponding to any one of the image classification sub-models.
• For example, the category corresponding to the maximum probability in the first target classification reference information of the image to be processed can be determined as the category to which the image to be processed belongs, or the category corresponding to the maximum probability in the second target classification reference information of the image to be processed can be so determined.
• Alternatively, the multiple pieces of classification reference information of the image to be processed can be combined to determine the category to which the image to be processed belongs. For example, if the category corresponding to the maximum classification probability in the first target classification reference information of the image to be processed is consistent with the category corresponding to the maximum classification probability in the second target classification reference information, that category can be determined as the category to which the image to be processed belongs. For another example, the category to which the image to be processed belongs can be determined based on the intersection between a first target category set in the first target classification reference information and a second target category set in the second target classification reference information, where the first target category set includes the categories whose probabilities in the first target classification reference information exceed the preset probability threshold, and the second target category set includes the categories whose probabilities in the second target classification reference information exceed the preset probability threshold, and so on.
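• The two combination strategies just described (argmax agreement, falling back to the intersection of above-threshold category sets) can be sketched as follows; the fallback behavior when the intersection is empty or ambiguous is an assumption.

```python
def combine_predictions(probs_a, probs_b, threshold=0.5):
    """Combine the two sub-models' classification reference information.

    If the maximum-probability categories agree, return that category;
    otherwise fall back to the intersection of the categories whose
    probabilities exceed the preset probability threshold in each output,
    returning a category only when that intersection is unambiguous.
    """
    top_a = max(range(len(probs_a)), key=lambda i: probs_a[i])
    top_b = max(range(len(probs_b)), key=lambda i: probs_b[i])
    if top_a == top_b:
        return top_a
    set_a = {i for i, p in enumerate(probs_a) if p > threshold}
    set_b = {i for i, p in enumerate(probs_b) if p > threshold}
    common = set_a & set_b
    return common.pop() if len(common) == 1 else None

assert combine_predictions([0.1, 0.9], [0.2, 0.8]) == 1    # argmax agreement
assert combine_predictions([0.6, 0.4], [0.3, 0.7]) is None  # no agreement
```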
• The image classification method provided by the embodiments of the present application is based on semi-supervised learning and exploits the mutual teacher-student relationship between the image classification sub-models: the image classification model is obtained through complementary learning and teaching between the sub-models, so the model has high accuracy and reliability. Further, using this image classification model to classify and identify images to be processed helps improve the accuracy and reliability of the image classification results.
  • FIG. 4 is a schematic structural diagram of an image classification model training device 400 provided in an embodiment of the present application.
• The device 400 includes: an acquisition unit 410, configured to acquire an image set used to train the image classification model, where the image set includes labeled images, unlabeled images, and category labels corresponding to the labeled images;
• a classification unit 420, configured to classify and identify the labeled images and the unlabeled images respectively through a target image classification sub-model in the image classification model, to obtain the first classification reference information of the labeled images and the first classification reference information of the unlabeled images, where the target image classification sub-model is the first image classification sub-model or the second image classification sub-model;
• a determining unit 430, configured to determine the classification loss of the target image classification sub-model based on the first classification reference information of the labeled images, the category labels corresponding to the labeled images, and the second classification reference information of the unlabeled images, where the second classification reference information of the unlabeled images is obtained by classifying and identifying the unlabeled images through a non-target image classification sub-model in the image classification model, the non-target image classification sub-model is an image classification sub-model in the image classification model other than the target image classification sub-model, and the classification loss of the target image classification sub-model refers to the classification loss of the first image classification sub-model or the classification loss of the second image classification sub-model; and
• an adjustment unit 440, configured to adjust the model parameters of the image classification model based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model.
  • the acquisition unit is also used to acquire an initial unlabeled image.
  • the training device 400 further includes: an enhancement unit, configured to perform data enhancement processing of multiple enhancement degrees on the initial unlabeled image to obtain the unlabeled image.
  • the unlabeled images include a first type of unlabeled image and a second type of unlabeled image, and the corresponding enhancement degree of the first type of unlabeled image is smaller than the enhancement degree of the second type of unlabeled image.
• The determining unit determines the classification loss of the target image classification sub-model based on the first classification reference information of the labeled images, the category labels corresponding to the labeled images, and the second classification reference information of the unlabeled images, including: generating a first type pseudo-label corresponding to the first type unlabeled image based on the second classification reference information of the first type unlabeled image; determining the unsupervised loss of the target image classification sub-model based on the first classification reference information of the second type unlabeled image and the first type pseudo-label corresponding to the first type unlabeled image; determining the supervised loss of the target image classification sub-model based on the first classification reference information of the labeled images and the category labels corresponding to the labeled images; and determining the classification loss of the target image classification sub-model based on the unsupervised loss of the target image classification sub-model and the supervised loss of the target image classification sub-model.
• The determining unit is further configured to: before determining the unsupervised loss of the target image classification sub-model based on the first classification reference information of the second type unlabeled image and the first type pseudo-label corresponding to the first type unlabeled image, determine the loss weight corresponding to the first type unlabeled image based on the first type pseudo-label corresponding to the first type unlabeled image and the second type pseudo-label corresponding to the first type unlabeled image, where the second type pseudo-label corresponding to the first type unlabeled image is generated based on the first classification reference information of the first type unlabeled image.
  • the determination unit determines the unsupervised loss of the target image classification submodel based on the first classification reference information of the second type of unlabeled image and the first type of pseudo label corresponding to the first type of unlabeled image, including : Based on the first classification reference information of the second type of unlabeled image and the first type of pseudo label corresponding to the first type of unlabeled image, determine the unsupervised sub-loss corresponding to the first type of unlabeled image; based on The loss weight corresponding to the first type of unlabeled image and the unsupervised sub-loss corresponding to the first type of unlabeled image determine the unsupervised loss of the target image classification sub-model.
• The first type pseudo-label corresponding to the first type unlabeled image is used to indicate the first target object region in the first type unlabeled image and the prediction category to which the first target object region belongs; the second type pseudo-label corresponding to the first type unlabeled image is used to indicate the second target object region in the first type unlabeled image and the prediction category to which the second target object region belongs.
• The determining unit determines the loss weight corresponding to the first type unlabeled image based on the first type pseudo-label corresponding to the first type unlabeled image and the second type pseudo-label corresponding to the first type unlabeled image, including: determining the intersection-over-union ratio between the first target object region and the second target object region, and comparing the prediction category to which the first target object region belongs with the prediction category to which the second target object region belongs to obtain a comparison result; and determining the loss weight corresponding to the first type unlabeled image based on the intersection-over-union ratio and the comparison result.
  • the determination unit determines the loss weight corresponding to the first type of unlabeled image based on the intersection and union ratio and the comparison result, including: if the intersection and union ratio is less than or equal to a preset ratio or If the comparison result shows that the prediction category to which the first target object region belongs is inconsistent with the prediction category to which the second target object region belongs, then it is determined that the loss weight corresponding to the first type of unlabeled image is the first preset Weight; if the intersection ratio is greater than the preset ratio and the comparison result shows that the prediction category to which the first target object region belongs is consistent with the prediction category to which the second target object region belongs, then determine The loss weight corresponding to the first type of unlabeled image is a second preset weight, wherein the second preset weight is greater than the first preset weight.
• The first classification reference information of the first type unlabeled image and the second classification reference information of the first type unlabeled image each include the probability that the first type unlabeled image is identified as belonging to each of multiple preset categories. The determining unit generates the first type pseudo-label corresponding to the first type unlabeled image based on the second classification reference information of the first type unlabeled image, including: determining, from the multiple preset categories, the preset category corresponding to the maximum probability based on the second classification reference information of the first type unlabeled image; and, if the maximum probability is greater than the preset probability threshold, generating the first type pseudo-label of the first type unlabeled image based on the preset category corresponding to the maximum probability.
• The adjustment unit adjusts the model parameters of the image classification model based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model, including: performing a weighted summation of the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model to obtain the classification loss of the image classification model; and adjusting the model parameters of the image classification model through the back propagation algorithm based on the classification loss of the image classification model.
  • the image classification model training device provided by the embodiment of the present application can be used as the execution subject of the image classification model training method shown in Figure 1.
• Specifically, step S102 in the training method of the image classification model shown in Figure 1 can be executed by the acquisition unit in the training device of the image classification model shown in Figure 4; step S104 can be executed by the classification unit in the training device of the image classification model; step S106 can be executed by the determining unit in the training device of the image classification model; and step S108 can be executed by the adjustment unit in the training device of the image classification model.
• Each unit in the training device of the image classification model shown in Figure 4 can be separately or entirely combined into one or several other units, or one (or some) of the units can be further divided into multiple functionally smaller units; this achieves the same operation without affecting the realization of the technical effects of the embodiments of the present application.
  • the above units are divided based on logical functions.
  • the function of one unit can also be realized by multiple units, or the functions of multiple units can be realized by one unit.
  • The training device of the image classification model may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and multiple units may cooperate to implement them.
  • The training device of the image classification model shown in Figure 4 can be constructed by running a computer program (including program code) capable of executing the steps of the corresponding method shown in Figure 1 on a general-purpose computing device, such as a computer including processing elements such as a central processing unit (Central Processing Unit, CPU) and storage elements such as random access memory (Random Access Memory, RAM) and read-only memory (Read-Only Memory, ROM), thereby implementing the training method of the image classification model of the embodiments of the present application.
  • The computer program can be recorded on, for example, a computer-readable storage medium, loaded into an electronic device through the computer-readable storage medium, and run therein.
  • The image classification model training device uses each image classification sub-model in the image classification model to classify and recognize each image in the image set, obtaining multiple pieces of classification reference information for each image, with each piece of classification reference information corresponding to one image classification sub-model.
  • Then, taking each image classification sub-model in turn as the target image classification sub-model, the classification loss of the target image classification sub-model is determined based on the classification reference information of the labeled images corresponding to the target sub-model, the category labels corresponding to the labeled images, and the classification reference information of the unlabeled images corresponding to the other image classification sub-model. That is, the information one sub-model has learned from the image set provides guidance for the other, changing the one-way teacher-student relationship between the sub-models of the image classification model into a mutual teacher-student relationship.
  • Further, adjusting the model parameters of the image classification model based on the classification loss of each sub-model makes full use of this mutual teacher-student relationship, allowing the sub-models to learn complementarily from and teach each other, so that the information contained in the image set is fully mined and exploited, thereby improving the training effect and yielding a more accurate and reliable image classification model.
  • Embodiments of the present application also provide an image classification device.
  • Figure 5 is a schematic structural diagram of an image classification device 500 provided in an embodiment of the present application.
  • The device 500 includes: a classification unit 510, used to classify and recognize the image to be processed through an image classification model to obtain a classification reference information set of the image to be processed.
  • The classification reference information set includes first target classification reference information and second target classification reference information.
  • The image classification model includes a first image classification sub-model and a second image classification sub-model; the first image classification sub-model is used to classify and recognize the image to be processed to obtain the first target classification reference information, and the second image classification sub-model is used to classify and recognize the image to be processed to obtain the second target classification reference information.
  • The image classification model is trained by the training method described in the embodiments of this application; the determination unit 520 is used to determine the category to which the image to be processed belongs based on the classification reference information set of the image to be processed.
  • Step S302 of the image classification method shown in Figure 3 can be executed by the classification unit of the image classification device shown in Figure 5, and step S304 can be executed by the determination unit of the image classification device.
  • Each unit of the image classification device shown in Figure 5 can be separately or entirely combined into one or several additional units, or some unit(s) can be further divided into multiple functionally smaller units, which can achieve the same operation without affecting the realization of the technical effects of the embodiments of the present application.
  • the above units are divided based on logical functions.
  • the function of one unit can also be realized by multiple units, or the functions of multiple units can be realized by one unit.
  • The image classification device may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and multiple units may cooperate to implement them.
  • The image classification device shown in Figure 5 can be constructed by running a computer program capable of executing the steps of the corresponding method shown in Figure 3 on a general-purpose computing device, such as a computer including processing elements such as a central processing unit (Central Processing Unit, CPU) and storage elements such as random access memory (Random Access Memory, RAM) and read-only memory (Read-Only Memory, ROM), thereby implementing the image classification method of the embodiments of the present application.
  • The computer program may be recorded on, for example, a computer-readable storage medium, loaded into an electronic device through the computer-readable storage medium, and run therein.
  • Because the image classification model used by the image classification device is trained through a semi-supervised learning method that exploits the mutual teacher-student relationship between the image classification sub-models and achieves complementary learning and teaching between them, the image classification model has high accuracy and reliability; further, using this image classification model to classify and recognize the images to be processed helps improve the accuracy and reliability of the image classification results.
  • Figure 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • the electronic device includes a processor and optionally an internal bus, a network interface, and a memory.
  • The memory may include internal memory, such as high-speed random access memory (Random-Access Memory, RAM), and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
  • The electronic device may also include other hardware required by the business.
  • The processor, network interface, and memory can be connected to each other through an internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, etc.
  • The bus can be divided into an address bus, a data bus, a control bus, etc. For ease of presentation, only one bidirectional arrow is used in Figure 6, but this does not mean that there is only one bus or one type of bus.
  • The memory is used to store programs.
  • A program may include program code, and the program code includes computer operating instructions.
  • The memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
  • the processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it, forming a training device for the image classification model at the logical level.
  • The processor executes the program stored in the memory, and is specifically used to perform the following operations: obtain an image set for training the image classification model, the image set including labeled images, unlabeled images, and category labels corresponding to the labeled images; through the target image classification sub-model in the image classification model, classify and recognize the labeled images and the unlabeled images respectively to obtain the first classification reference information of the labeled images and the first classification reference information of the unlabeled images, the target image classification sub-model being the first image classification sub-model or the second image classification sub-model; based on the first classification reference information corresponding to the labeled images, the category labels corresponding to the labeled images, and the second classification reference information of the unlabeled images, determine the classification loss of the target image classification sub-model, where the second classification reference information of the unlabeled images is obtained by classifying and recognizing the unlabeled images through the other image classification sub-model in the image classification model besides the target image classification sub-model, and the classification loss of the target image classification sub-model refers to the classification loss of the first image classification sub-model or the classification loss of the second image classification sub-model; and, based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model, adjust the model parameters of the image classification model.
  • the processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to form an image classification device at the logical level.
  • The processor executes the program stored in the memory, and is specifically used to perform the following operations: classify and recognize the image to be processed through the image classification model to obtain a classification reference information set of the image to be processed, where the classification reference information set includes first target classification reference information and second target classification reference information; the image classification model includes a first image classification sub-model, used to classify and recognize the image to be processed to obtain the first target classification reference information, and a second image classification sub-model, used to classify and recognize the image to be processed to obtain the second target classification reference information, and the image classification model is trained by the training method of the image classification model described in the embodiments of this application; and, based on the classification reference information set of the image to be processed, determine the category to which the image to be processed belongs.
  • The method performed by the image classification model training device disclosed in the embodiment shown in Figure 1 of the present application, or the method performed by the image classification device disclosed in the embodiment shown in Figure 3 of the present application, can be applied to the processor or implemented by the processor.
  • the processor may be an integrated circuit chip that has signal processing capabilities.
  • Each step of the above method can be completed by instructions in the form of hardware integrated logic circuits in the processor, or by instructions in the form of software.
  • The above-mentioned processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (Network Processor, NP), etc.; it can also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the embodiments of the present application can be directly implemented by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other mature storage media in this field.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • The electronic device can also perform the method of Figure 1 and implement the functions of the image classification model training device in the embodiments shown in Figures 1 and 2.
  • The electronic device can also perform the method of Figure 3 and implement the functions of the image classification device in the embodiment shown in Figure 3, which will not be described again in the embodiments of this application.
  • The electronic device of this application does not exclude other implementations, such as logic devices or a combination of software and hardware. That is to say, the execution subject of the following processing flows is not limited to logical units; it can also be hardware or logic devices.
  • Embodiments of the present application also provide a computer-readable storage medium that stores one or more programs, the one or more programs including instructions.
  • When the instructions are executed by a portable electronic device including multiple application programs, the portable electronic device can be caused to perform the method of the embodiment shown in Figure 1, and specifically to perform the following operations: obtain an image set for training the image classification model, the image set including labeled images, unlabeled images, and category labels corresponding to the labeled images; through the target image classification sub-model in the image classification model, classify and recognize the labeled images and the unlabeled images respectively to obtain the first classification reference information of the labeled images and the first classification reference information of the unlabeled images; based on the first classification reference information corresponding to the labeled images, the category labels corresponding to the labeled images, and the second classification reference information of the unlabeled images, determine the classification loss of the target image classification sub-model, where the second classification reference information is obtained by classifying and recognizing the unlabeled images through the other image classification sub-model in the image classification model besides the target image classification sub-model, and the classification loss of the target image classification sub-model refers to the classification loss of the first image classification sub-model or the classification loss of the second image classification sub-model; and, based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model, adjust the model parameters of the image classification model.
  • The computer-readable storage medium stores one or more programs, the one or more programs including instructions that, when executed by a portable electronic device including multiple application programs, enable the portable electronic device to perform the method of the embodiment shown in Figure 3, and specifically to perform the following operations: classify and recognize the image to be processed through the image classification model to obtain a classification reference information set of the image to be processed, where the classification reference information set includes first target classification reference information and second target classification reference information; the image classification model includes a first image classification sub-model, used to classify and recognize the image to be processed to obtain the first target classification reference information, and a second image classification sub-model, used to classify and recognize the image to be processed to obtain the second target classification reference information, and the image classification model is trained by the training method described in the embodiments of this application; and, based on the classification reference information set of the image to be processed, determine the category to which the image to be processed belongs.
  • The systems, devices, modules, or units described in the above embodiments may be implemented by computer chips or entities, or by products with certain functions.
  • A typical implementation device is a computer.
  • The computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • Computer-readable media include persistent and non-persistent, removable and non-removable media, in which information storage can be implemented by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a training method for an image classification model, an image classification method, and related devices. The training method includes: acquiring an image set; classifying and recognizing labeled images and unlabeled images through a target image classification sub-model in the image classification model to obtain first classification reference information of the labeled images and first classification reference information of the unlabeled images; classifying and recognizing the unlabeled images through a non-target image classification sub-model in the image classification model to obtain second classification reference information of the unlabeled images; determining the classification loss of the target image classification sub-model based on the first classification reference information of the labeled images, the category labels corresponding to the labeled images, and the second classification reference information of the unlabeled images; and adjusting the model parameters of the image classification model based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model.

Description

Training method for an image classification model, image classification method, and related devices
Cross-reference
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on July 19, 2022, with application number 202210872051.8 and invention title "Training method for an image classification model, image classification method, and related devices", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a training method for an image classification model, an image classification method, and related devices.
Background
Semi-supervised learning (Semi-Supervised Learning, SSL), a key research topic in pattern recognition and machine learning, is a learning paradigm that combines supervised learning and unsupervised learning. In recent years, semi-supervised learning has been widely applied in fields such as image classification.
Summary
This application provides a training method for an image classification model, an image classification method, and related devices, to solve the problem that the poor training effect of existing image classification models impairs the accuracy and stability of the final image classification.
The embodiments of this application adopt the following technical solutions:
In one aspect, an embodiment of this application provides a training method for an image classification model, the image classification model including a first image classification sub-model and a second image classification sub-model, the method including: acquiring an image set for training the image classification model, the image set including labeled images, unlabeled images, and category labels corresponding to the labeled images; classifying and recognizing the labeled images and the unlabeled images respectively through a target image classification sub-model in the image classification model to obtain first classification reference information of the labeled images and first classification reference information of the unlabeled images, the target image classification sub-model being the first image classification sub-model or the second image classification sub-model; determining the classification loss of the target image classification sub-model based on the first classification reference information corresponding to the labeled images, the category labels corresponding to the labeled images, and second classification reference information of the unlabeled images, where the second classification reference information of the unlabeled images is obtained by classifying and recognizing the unlabeled images through the other image classification sub-model in the image classification model besides the target image classification sub-model, and the classification loss of the target image classification sub-model refers to the classification loss of the first image classification sub-model or the classification loss of the second image classification sub-model; and adjusting the model parameters of the image classification model based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model.
In one aspect, an embodiment of this application provides an image classification method, including: classifying and recognizing an image to be processed through an image classification model to obtain a classification reference information set of the image to be processed, where the classification reference information set includes first target classification reference information and second target classification reference information, the image classification model includes a first image classification sub-model and a second image classification sub-model, the first image classification sub-model is used to classify and recognize the image to be processed to obtain the first target classification reference information, the second image classification sub-model is used to classify and recognize the image to be processed to obtain the second target classification reference information, and the image classification model is trained by the training method of the first aspect; and determining, based on the classification reference information set of the image to be processed, the category to which the image to be processed belongs.
In one aspect, an embodiment of this application provides a training device for an image classification model, the image classification model including a first image classification sub-model and a second image classification sub-model, the training device including: an acquisition unit, used to acquire an image set for training the image classification model, the image set including labeled images, unlabeled images, and category labels corresponding to the labeled images; a classification unit, used to classify and recognize the labeled images and the unlabeled images respectively through a target image classification sub-model in the image classification model to obtain first classification reference information of the labeled images and first classification reference information of the unlabeled images, the target image classification sub-model being the first image classification sub-model or the second image classification sub-model; a determination unit, used to determine the classification loss of the target image classification sub-model based on the first classification reference information corresponding to the labeled images, the category labels corresponding to the labeled images, and second classification reference information of the unlabeled images, where the second classification reference information of the unlabeled images is obtained by classifying and recognizing the unlabeled images through the other image classification sub-model besides the target image classification sub-model, and the classification loss of the target image classification sub-model refers to the classification loss of the first image classification sub-model or the classification loss of the second image classification sub-model; and an adjustment unit, used to adjust the model parameters of the image classification model based on the classification loss of the first image classification sub-model and the classification loss of the second image classification sub-model.
In one aspect, an embodiment of this application provides an image classification device, including: a classification unit, used to classify and recognize an image to be processed through an image classification model to obtain a classification reference information set of the image to be processed, where the classification reference information set includes first target classification reference information and second target classification reference information, the image classification model includes a first image classification sub-model, used to classify and recognize the image to be processed to obtain the first target classification reference information, and a second image classification sub-model, used to classify and recognize the image to be processed to obtain the second target classification reference information, and the image classification model is trained by the above training method; and a determination unit, used to determine, based on the classification reference information set of the image to be processed, the category to which the image to be processed belongs.
In a fifth aspect, an embodiment of this application provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to execute the instructions to implement the above method.
In one aspect, an embodiment of this application provides a computer-readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the above method.
Brief Description of the Drawings
The drawings described here are used to provide a further understanding of this application and constitute a part of this application; the illustrative embodiments of this application and their descriptions are used to describe this application and do not constitute an undue limitation of this application. In the drawings:
Figure 1 is a schematic flowchart of a training method for an image classification model provided by an embodiment of this application;
Figure 2 is a schematic flowchart of a training method for an image classification model provided by another embodiment of this application;
Figure 3 is a schematic flowchart of an image classification method provided by an embodiment of this application;
Figure 4 is a schematic structural diagram of a training device for an image classification model provided by an embodiment of this application;
Figure 5 is a schematic structural diagram of an image classification device provided by an embodiment of this application;
Figure 6 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed Description
To help those skilled in the art better understand the technical solutions in the embodiments of this application, the technical solutions will be described clearly and completely below with reference to specific embodiments of this application and the corresponding drawings. Obviously, the described embodiments are only part of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The terms "first", "second", etc. in this specification and the claims are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of this application can be implemented in orders other than those illustrated or described here. In addition, "and/or" in this specification and the claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
In existing training methods for image classification models, a pre-trained teacher network generates pseudo-labels for sample images, and a student network then performs semi-supervised learning using the pseudo-labeled sample images to form the final image classification model. However, because the pseudo-labels are produced entirely by converting the information the teacher network has learned from the sample images, the student network cannot fully mine and exploit the information contained in the sample images. Especially in the early stage of training, the confidence of the pseudo-labels generated by the teacher network is low, resulting in a poor training effect of the image classification model, which in turn affects the accuracy and stability of the final image classification.
In view of this, an embodiment of this application proposes a training method for an image classification model. Under a semi-supervised learning framework, the one-way teacher-student relationship between the image classification sub-models of the image classification model is improved into a mutual teacher-student relationship: the information that one image classification sub-model learns from the sample images provides pseudo-labels for the semi-supervised learning of another image classification sub-model, so that the sub-models learn complementarily from and teach each other. The information contained in the sample images can thus be fully mined and exploited, improving the training effect of the image classification model and yielding a more accurate and reliable image classification model.
An embodiment of this application also proposes an image classification method that uses the trained image classification model to classify and recognize images accurately.
It should be understood that both the training method for the image classification model and the image classification method provided by the embodiments of this application can be executed by an electronic device or by software installed in an electronic device. The electronic device here may include a terminal device, such as a smartphone, tablet computer, laptop, desktop computer, intelligent voice interaction device, smart home appliance, smart watch, vehicle-mounted terminal, or aircraft; or it may include a server, such as an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing cloud computing services.
The technical solutions provided by the embodiments of this application are described in detail below with reference to the drawings.
Please refer to Figure 1, a schematic flowchart of a training method for an image classification model provided by an embodiment of this application. The method may include the following steps:
S102: Acquire an image set for training the image classification model.
The image set includes labeled images, unlabeled images, and category labels corresponding to the labeled images.
A labeled image is an image that has a corresponding category label; an unlabeled image is an image that does not. In practice, to further improve the classification accuracy of the image classification model, the image set may include multiple labeled images and multiple unlabeled images, and the multiple labeled images may belong to different categories.
The category label of a labeled image indicates the true category to which the labeled image belongs; specifically, it may indicate the true category of the content presented in the labeled image. For example, the category may be people, animals, scenery, and so on; it may also be a sub-category subdivided under a broad category — for the broad category of people, the category may be depressed, happy, angry, etc., or real face, forged face, etc.; for the broad category of animals, the category may be cat, dog, horse, mule, etc. In practice, the category label may take any appropriate form; for instance, it may be obtained by one-hot encoding the true category of the labeled image, or by word embedding of that true category. The embodiments of this application do not limit the form of the category label.
S104: Classify and recognize the labeled images and the unlabeled images respectively through the target image classification sub-model in the image classification model, obtaining the first classification reference information of the labeled images and the first classification reference information of the unlabeled images.
To train an image classification model with high accuracy and high reliability even when the number of labeled images is limited, as shown in Figure 2, the image classification model of the embodiments of this application may include a first image classification sub-model and a second image classification sub-model, each of which can classify and recognize every image in the image set to obtain corresponding classification reference information. On this basis, the image classification model is trained in a semi-supervised manner to obtain the final image classification model. In practice, the first and second sub-models may have the same network structure; alternatively, to simplify the model structure and achieve compression and acceleration of the image classification model, they may have different network structures — for example, the second sub-model may adopt a more streamlined structure than the first.
In the embodiments of this application, the target image classification sub-model is the first image classification sub-model or the second image classification sub-model. That is, the first and second sub-models can each be taken as the target sub-model in turn, and through the above S104 the classification reference information of the labeled images corresponding to each sub-model and the classification reference information of the unlabeled images corresponding to each sub-model are recognized.
For ease of distinction, the classification reference information of a labeled image corresponding to the target sub-model is called the first classification reference information of the labeled image, and the classification reference information of a labeled image corresponding to the other image classification sub-model in the image classification model besides the target sub-model is called the second classification reference information of the labeled image. Likewise, the classification reference information of an unlabeled image corresponding to the target sub-model is called the first classification reference information of the unlabeled image, and that corresponding to the other sub-model is called the second classification reference information of the unlabeled image.
The classification reference information of a labeled image may include at least one of the following: the probability that the labeled image is recognized as belonging to each of multiple preset categories, the category to which the labeled image belongs, etc.; likewise, the classification reference information of an unlabeled image may include at least one of: the probability that the unlabeled image is recognized as belonging to each of the multiple preset categories, the category to which the unlabeled image belongs, etc. For example, if the multiple preset categories are cat, dog, horse, and mule, the classification reference information of each image may include the probabilities that the image is recognized as belonging to cat, dog, horse, and mule respectively, and the category of the image may be the preset category with the maximum probability.
To enable each image classification sub-model in the image classification model to fully understand and learn from the images in the image set, thereby improving the expressive power of the image classification model, the images in the image set may be obtained by applying data augmentation to initial images. That is, before the above S102, the training method provided by the embodiments of this application may further include: applying data augmentation to the images in an initial image set to obtain the image set used for training the image classification model, so that the images of the resulting image set contain perturbation information. The initial image set includes initial unlabeled images and initial labeled images.
For an initial unlabeled image, data augmentation of multiple augmentation strengths may be applied to obtain unlabeled images, where there are multiple unlabeled images and each corresponds to one augmentation strength. Accordingly, in the above S104, the target sub-model may classify and recognize the labeled images to obtain their first classification reference information, and classify and recognize each of the multiple unlabeled images to obtain the first classification reference information of each unlabeled image.
Applying data augmentation of multiple strengths to an initial unlabeled image may specifically be implemented as: applying weakly-augmented processing to the initial unlabeled image to obtain a first-type unlabeled image, and applying strongly-augmented processing to obtain a second-type unlabeled image. That is, the unlabeled images include first-type and second-type unlabeled images, and the augmentation strength corresponding to the first type is smaller than that corresponding to the second type.
Weak augmentation may include, but is not limited to, at least one of the following: translation, flipping, etc.; strong augmentation may include, but is not limited to, at least one of the following: occlusion, color transformation, random erasing (Random Erase), etc.
It can be understood that because the strength of weak augmentation is small — that is, the perturbation introduced to the initial unlabeled image is small — the resulting first-type unlabeled image is not distorted, so the target sub-model learns the noise in the first-type unlabeled image while still producing accurate first classification reference information, which helps improve the learning effect of the target sub-model. Moreover, using only weakly augmented images may cause the target sub-model to overfit and fail to extract the essential features of the first-type unlabeled images, while strong augmentation introduces larger perturbations that may distort the first-type unlabeled image but still retain features sufficient to recognize the category. By applying weak and strong augmentation to the initial unlabeled image respectively and inputting the results into the target sub-model so that it learns from unlabeled images of different augmentation strengths, the learning effect of the target sub-model is improved and its expressive power is enhanced.
Optionally, to further improve the expressive power of the target image classification sub-model, before the above S104, the training method provided by the embodiments of this application may further include: applying weak augmentation to the initial labeled images in the initial image set to obtain the labeled images.
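As a toy illustration of the weak/strong split above, the sketch below applies a flip (weak) and a random-erase-style occlusion (strong) to an image represented as a 2-D list of pixel values. Real pipelines would operate on image tensors, and the particular transforms here are only two of the options the text lists:

```python
import random

def weak_augment(img, rng):
    """Weak augmentation: horizontal flip with probability 0.5
    (flipping/translation are the weak transforms named in the text)."""
    if rng.random() < 0.5:
        return [row[::-1] for row in img]
    return [row[:] for row in img]

def strong_augment(img, rng):
    """Strong augmentation sketch: erase (zero out) one randomly chosen
    pixel, mimicking occlusion / random erasing."""
    out = [row[:] for row in img]
    r = rng.randrange(len(out))
    c = rng.randrange(len(out[0]))
    out[r][c] = 0
    return out
```

Each initial unlabeled image would be passed through both functions, producing the first-type (weak) and second-type (strong) versions fed to the sub-models.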
S106: Determine the classification loss of the target image classification sub-model based on the first classification reference information of the labeled images, the category labels corresponding to the labeled images, and the second classification reference information of the unlabeled images.
The second classification reference information of the unlabeled images is obtained by classifying and recognizing the unlabeled images through the other image classification sub-model in the image classification model besides the target sub-model.
In other words, in the above S106, for the first sub-model, its classification loss can be determined based on the classification reference information of the labeled images corresponding to the first sub-model, the category labels of the labeled images, and the classification reference information of the unlabeled images corresponding to the second sub-model; for the second sub-model, its classification loss can be determined based on the classification reference information of the labeled images corresponding to the second sub-model, the category labels of the labeled images, and the classification reference information of the unlabeled images corresponding to the first sub-model.
In this way, the first and second sub-models can each use the information they have learned to guide the other, turning the one-way teacher-student relationship between them into a mutual teacher-student relationship. This favors complementary learning and mutual teaching between the sub-models, so that the information contained in the images of the image set is fully mined and exploited, which helps improve the training effect of the image classification model.
In the embodiments of this application, for each sub-model, its classification loss represents the difference between the classification reference information it obtains by classifying and recognizing an input image and the category represented by the category label corresponding to that image.
Considering that the learning task each sub-model performs on the input image set is a semi-supervised learning task — combining a supervised learning task based on the labeled images and their category labels with an unsupervised learning task based on the unlabeled images — each kind of task may produce a certain classification loss. Therefore, the classification loss of the target sub-model may include its supervised loss and its unsupervised loss, where the supervised loss represents the classification loss produced by the target sub-model in the supervised learning task and the unsupervised loss represents the classification loss produced in the unsupervised learning task.
In one optional implementation, the supervised loss of the target sub-model can be determined based on the first classification reference information of the labeled images in the image set and the category labels corresponding to the labeled images, and the unsupervised loss can be determined based on the first and second classification reference information of the unlabeled images in the image set.
In another optional implementation, to enable the target sub-model to fully understand and learn from the input images and improve its expressive power, the unlabeled images are input into the target sub-model only after data augmentation of multiple strengths has been applied to the initial unlabeled image, so the obtained first classification reference information of the unlabeled images includes the first classification reference information of each unlabeled image. Accordingly, the supervised loss of the target sub-model can be determined based on the first classification reference information of the labeled images and their category labels, and the unsupervised loss can be determined based on the first classification reference information and the second classification reference information of the unlabeled images.
The above unlabeled images include first-type and second-type unlabeled images, with the augmentation strength of the first type smaller than that of the second type. Accordingly, the above S106 may specifically include the following steps:
S161: Generate the first-type pseudo-label corresponding to the first-type unlabeled image based on the second classification reference information of the first-type unlabeled image.
Since the first-type unlabeled image itself has no corresponding category label, generating its first-type pseudo-label based on its second classification reference information is equivalent to attaching an artificial label to it that represents its predicted category, thereby providing guidance for the unsupervised learning task of the target sub-model. In practice, the first-type pseudo-label may indicate the predicted category of the first-type unlabeled image. It may also indicate a first target object region in the first-type unlabeled image and the predicted category of that region, where the first target object region is the region, recognized by the target sub-model, where the target object in the first-type unlabeled image is located. For example, in a face classification and recognition scenario, the first target object region is the face region in the first-type unlabeled image.
The first classification reference information of the first-type unlabeled image includes the probabilities that the image is recognized as belonging to each of multiple preset categories. In this case, as an optional solution, the above S161 may be implemented as: based on the second classification reference information of the first-type unlabeled image, determine the preset category corresponding to the maximum probability among the multiple preset categories; if the probability corresponding to that category is greater than a preset probability threshold, generate the first-type pseudo-label of the first-type unlabeled image based on the preset category corresponding to the maximum probability.
As shown in Figure 2, based on the classification reference information output by the second sub-model for the first-type unlabeled image, the preset category with the maximum probability is determined from the multiple preset categories; if that probability is greater than the preset probability threshold, the pseudo-label of the first-type unlabeled image corresponding to the first sub-model is generated based on that category. Likewise, based on the classification reference information output by the first sub-model for the first-type unlabeled image, the preset category with the maximum probability is determined; if that probability is greater than the preset probability threshold, the pseudo-label of the first-type unlabeled image corresponding to the second sub-model is generated based on that category.
For example, the pseudo-label of a first-type unlabeled image corresponding to the first sub-model can be determined by the following formula (1), and that corresponding to the second sub-model by the following formula (2):

\(\hat{y}_{\zeta}^{1} = \mathrm{ONE\_HOT}\big(\arg\max q_{2}(u_{\zeta})\big)\), if \(\max q_{2}(u_{\zeta}) > \gamma\)  (1)

\(\hat{y}_{\zeta}^{2} = \mathrm{ONE\_HOT}\big(\arg\max q_{1}(u_{\zeta})\big)\), if \(\max q_{1}(u_{\zeta}) > \gamma\)  (2)

where \(\hat{y}_{\zeta}^{1}\) denotes the pseudo-label of the ζ-th first-type unlabeled image \(u_{\zeta}\) in the image set corresponding to the first sub-model, ONE_HOT denotes one-hot encoding, \(q_{2}\) denotes the second sub-model, \(q_{2}(u_{\zeta})\) denotes the classification reference information output by the second sub-model for the first-type unlabeled image, \(\max q_{2}(u_{\zeta})\) denotes the maximum probability in that classification reference information, \(\arg\max q_{2}(u_{\zeta})\) denotes the preset category corresponding to that maximum probability, and \(\gamma\) denotes the preset probability threshold; \(\hat{y}_{\zeta}^{2}\) denotes the pseudo-label of the first-type unlabeled image corresponding to the second sub-model, \(q_{1}\) denotes the first sub-model, and the remaining symbols are analogous.
Because the augmentation strength of the first-type unlabeled image is small — that is, the perturbation introduced to the initial unlabeled image is small — the first-type unlabeled image is not distorted. In addition, a pseudo-label is generated only when the maximum probability in the corresponding classification reference information exceeds the preset probability threshold, which greatly reduces the possibility of introducing noise or errors into the pseudo-labels. This ensures that each sub-model learns the noise in the first-type unlabeled images on the basis of accurate classification and recognition results, which helps improve the learning effect of each sub-model.
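A minimal sketch of this thresholded pseudo-labelling follows; the threshold value 0.95 is a hypothetical default, since the text only calls it a preset probability threshold γ:

```python
def make_pseudo_label(peer_probs, threshold=0.95):
    """Generate a one-hot pseudo-label from the peer sub-model's predicted
    class probabilities, but only when the maximum probability exceeds the
    preset threshold; otherwise return None and skip the sample."""
    max_p = max(peer_probs)
    if max_p <= threshold:
        return None
    k = peer_probs.index(max_p)
    return [1 if i == k else 0 for i in range(len(peer_probs))]
```

Each sub-model's pseudo-labels come from the other sub-model's probabilities, which is what makes the teacher-student relationship mutual.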
S162: Determine the unsupervised loss of the target image classification sub-model based on the first classification reference information of the second-type unlabeled image and the first-type pseudo-label corresponding to the first-type unlabeled image.
The unsupervised loss of the first sub-model can be determined based on the classification reference information output by the first sub-model for the second-type unlabeled images in the image set and the pseudo-labels of the unlabeled images corresponding to the first sub-model; likewise, the unsupervised loss of the second sub-model can be determined based on the classification reference information output by the second sub-model for the second-type unlabeled images corresponding to the unlabeled images in the image set and the pseudo-labels corresponding to the second sub-model.
In the above S162, the unsupervised loss of the target sub-model can be determined based on the first classification reference information of each second-type unlabeled image in the image set, the first-type pseudo-label of each unlabeled image, and a preset loss function. In practice, the preset loss function can be set as needed, for example including but not limited to at least one of a cross-entropy loss function, a classification loss function, and a bounding-box regression loss function.
For example, for each first-type unlabeled image in the image set, the unsupervised sub-loss corresponding to the first-type unlabeled image can be determined based on the first classification reference information of the second-type unlabeled image, the first-type pseudo-label of the first-type unlabeled image, and the preset loss function; further, the weighted sum of the unsupervised sub-losses of all first-type unlabeled images in the image set is determined as the unsupervised loss of the target sub-model.
Optionally, considering that the confidence of the pseudo-labels generated in the early stage of training is usually low, which easily leads to a poor training effect of the image classification model, a loss weight can be set for each first-type unlabeled image based on the confidence of its pseudo-label — for example, assigning a higher loss weight to first-type unlabeled images with high-confidence pseudo-labels and a lower loss weight to those with low-confidence first-type pseudo-labels — so as to counter the noise in the first-type pseudo-labels to some extent and improve the training effect of the image classification model.
Before the above S162, the training method provided by the embodiments of this application may further include: determining the loss weight of the first-type unlabeled image based on its first-type pseudo-label and its second-type pseudo-label, where the second-type pseudo-label is generated based on the first classification reference information of the first-type unlabeled image in a way similar to the generation of the first-type pseudo-label. Accordingly, in the above S162, the unsupervised sub-loss of the first-type unlabeled image is determined based on the first classification reference information of the second-type unlabeled image and the first-type pseudo-label of the first-type unlabeled image, and the unsupervised loss of the target sub-model is determined based on the loss weight and the unsupervised sub-loss of each first-type unlabeled image. For example, the unsupervised sub-losses of all first-type unlabeled images in the image set can be weighted and summed according to their loss weights to obtain the unsupervised loss of the target sub-model.
For example, the classification reference information obtained for the same image by different sub-models should theoretically be identical, and so should the pseudo-labels of the same image corresponding to different sub-models. Accordingly, the predicted category indicated by the first-type pseudo-label of the first-type unlabeled image can be compared with that indicated by its second-type pseudo-label: if they are inconsistent, the confidence of the two pseudo-labels can be judged to be low, and the image can be assigned a lower loss weight (the first preset weight); if they are consistent, the confidence of the two pseudo-labels is high, and the image can be assigned a higher loss weight (the second preset weight).
For example, the first-type pseudo-label of the first-type unlabeled image may indicate a first target object region in the image and the predicted category of that region, and the second-type pseudo-label may indicate a second target object region and the predicted category of that region. To ensure that the loss weight assigned to the first-type unlabeled image better matches the confidence of its two pseudo-labels, the intersection-over-union (IoU) between the first and second target object regions can be determined, and the predicted category of the first target object region can be compared with that of the second to obtain a comparison result; further, the loss weight of the first-type unlabeled image is determined based on the IoU and the comparison result.
For example, if the IoU is less than or equal to a preset ratio, or the comparison result shows that the predicted categories of the two target object regions are inconsistent, the confidence of both pseudo-labels of the first-type unlabeled image can be judged to be low, and the image can be assigned the first preset weight; if the IoU is greater than the preset ratio and the comparison result shows that the predicted categories of the two regions are consistent, the confidence of both pseudo-labels can be judged to be high, and the image can be assigned the second preset weight, which is greater than the first preset weight.
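The region/category agreement check above can be sketched as follows; the IoU ratio of 0.5 and the two preset weights are hypothetical values, since the text leaves them unspecified:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def loss_weight(box_1, cls_1, box_2, cls_2,
                iou_ratio=0.5, first_weight=0.5, second_weight=2.0):
    """Assign the greater (second) preset weight only when the two pseudo-
    labels agree on both the region (IoU above the preset ratio) and the
    predicted category; otherwise assign the first preset weight."""
    if iou(box_1, box_2) > iou_ratio and cls_1 == cls_2:
        return second_weight
    return first_weight
```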
Accordingly, in the above S162, the unsupervised loss of the target sub-model can be determined by the following formula (3):

\(\mathcal{L}_{u} = \dfrac{1}{N_{u}} \Big( \sum_{b \in h} \delta \big[ \ell_{cls}(q(u_{b}), \hat{y}_{b}) + \ell_{reg}(q(u_{b}), \hat{y}_{b}) \big] + \sum_{b \in B \setminus h} \big[ \ell_{cls}(q(u_{b}), \hat{y}_{b}) + \ell_{reg}(q(u_{b}), \hat{y}_{b}) \big] \Big)\)  (3)

where \(\mathcal{L}_{u}\) denotes the unsupervised loss of the target sub-model, \(N_{u}\) denotes the number of first-type unlabeled images in the image set, \(B\) denotes the image set, \(u_{b}\) denotes the b-th first-type unlabeled image in the image set, \(b \in h\) indicates that the confidence of both the first-type and second-type pseudo-labels of the b-th first-type unlabeled image is high, \(b \in B \setminus h\) indicates that the confidence of both pseudo-labels of the b-th first-type unlabeled image is low, \(\ell_{cls}\) denotes the classification loss function, \(\ell_{reg}\) denotes the bounding-box regression loss function, \(\hat{y}_{b}\) denotes the first-type pseudo-label of the first-type unlabeled image, and \(\delta\) denotes the loss weight of first-type unlabeled images with high-confidence pseudo-labels.
It can be understood that because the confidence of the pseudo-labels generated in the early stage of training is usually low, which easily leads to a poor training effect of the image classification model, and because the pseudo-labels generated for the same image by different sub-models should theoretically be identical, the confidence of the pseudo-labels can be judged from the pseudo-labels of the first-type unlabeled image corresponding to each sub-model, and corresponding loss weights can then be set for the first-type unlabeled images. This counters the noise in the pseudo-labels to some extent and helps improve the training effect of the image classification model.
S163: Determine the supervised loss of the target image classification sub-model based on the first classification reference information of the labeled images and the category labels corresponding to the labeled images.
For example, the supervised loss of the target sub-model can be determined by the following formula (4):

\(\mathcal{L}_{s} = \dfrac{1}{N_{l}} \sum_{l=1}^{N_{l}} \big[ \ell_{cls}(q(x_{l}), y_{l}) + \ell_{reg}(q(x_{l}), y_{l}) \big]\)  (4)

where \(\mathcal{L}_{s}\) denotes the supervised loss of the target sub-model, \(N_{l}\) denotes the number of labeled images in the image set, \(x_{l}\) denotes the l-th labeled image in the image set, \(y_{l}\) denotes the category label corresponding to the labeled image, \(\ell_{cls}\) denotes the classification loss function, and \(\ell_{reg}\) denotes the bounding-box regression loss function.
S164: Determine the classification loss of the target image classification sub-model based on its unsupervised loss and supervised loss.
For example, the classification loss of the target sub-model can be determined by the following formula (5):

\(\mathcal{L} = \mathcal{L}_{s} + \lambda_{u} \mathcal{L}_{u}\)  (5)

where \(\mathcal{L}\) denotes the classification loss of the target sub-model, \(\mathcal{L}_{s}\) denotes its supervised loss, \(\mathcal{L}_{u}\) denotes its unsupervised loss, and \(\lambda_{u}\) denotes the loss weight corresponding to the unsupervised loss.
It can be understood that every sub-model of the image classification model performs a semi-supervised learning task on the image set, combining supervised learning based on the labeled images and their category labels with unsupervised learning based on the unlabeled images and their pseudo-labels, and each kind of learning task may produce a certain classification loss. Therefore, for each sub-model, the supervised loss is determined based on the classification reference information it outputs for the labeled images and the corresponding category labels, so that the supervised loss accurately reflects the classification loss produced in the supervised learning task. Using the principle that the classification reference information obtained by inputting differently augmented versions of the same image into the same sub-model should theoretically be identical, the pseudo-label corresponding to the other sub-model is generated for the first-type (weakly augmented) unlabeled image based on its classification reference information for a given sub-model; the unsupervised loss of each sub-model is then determined using the pseudo-labels of the first-type unlabeled images corresponding to each sub-model and the classification reference information of the second-type (strongly augmented) unlabeled images corresponding to each sub-model. This not only makes the unsupervised loss accurately reflect the classification loss produced by the corresponding sub-model in the unsupervised learning task, but also lets each sub-model, during unsupervised learning, use the classification reference information of the weakly augmented images to supervise that of the strongly augmented images, which helps improve the classification accuracy of each sub-model.
This embodiment shows one specific implementation for determining the classification loss of the target image classification sub-model. Of course, it should be understood that the classification loss may also be determined in other ways, which the embodiments of this application do not limit.
S108,基于第一图像分类子模型的分类损失和第二图像分类子模型的分类损失,调整图像分类模型的模型参数。
在一种可选的实现方式中,如图2所示,上述S108可以包括如下步骤:
S181,对第一图像分类子模型的分类损失和第二图像分类子模型的分类损失进行加权求和,得到图像分类模型的分类损失。
其中,图像分类模型的分类损失用于表示图像分类模型对输入的图像进行分类识别所得到的分类参考信息与输入的图像所属的真实类别之间的差异。示例地,图像分类模型的分类损失可通过如下公式(6)确定:
$$\mathcal{L}_{total}=\big(\mathcal{L}_{s}^{(1)}+\lambda_{u}\mathcal{L}_{u}^{(1)}\big)+\big(\mathcal{L}_{s}^{(2)}+\lambda_{u}\mathcal{L}_{u}^{(2)}\big)\tag{6}$$

其中，$\mathcal{L}_{total}$表示图像分类模型的分类损失，$\mathcal{L}_{s}^{(1)}+\lambda_{u}\mathcal{L}_{u}^{(1)}$表示第一图像分类子模型的分类损失，$\mathcal{L}_{s}^{(2)}+\lambda_{u}\mathcal{L}_{u}^{(2)}$表示第二图像分类子模型的分类损失，$\mathcal{L}_{s}^{(1)}$表示第一图像分类子模型的有监督损失，$\mathcal{L}_{u}^{(1)}$表示第一图像分类子模型的无监督损失，$\mathcal{L}_{s}^{(2)}$表示第二图像分类子模型的有监督损失，$\mathcal{L}_{u}^{(2)}$表示第二图像分类子模型的无监督损失，$\lambda_{u}$表示无监督损失对应的损失权重。
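公式(5)与公式(6)合起来的计算流程可示意为（lambda_u 的取值以及各损失均为示例性输入）：

```python
def submodel_loss(sup_loss, unsup_loss, lambda_u=1.0):
    """公式(5)的示意: 单个图像分类子模型的分类损失 = 有监督损失 + lambda_u * 无监督损失。"""
    return sup_loss + lambda_u * unsup_loss

def model_loss(sup1, unsup1, sup2, unsup2, lambda_u=1.0):
    """公式(6)的示意: 图像分类模型的分类损失为两个子模型分类损失的加权求和(此处权重均取1)。"""
    return submodel_loss(sup1, unsup1, lambda_u) + submodel_loss(sup2, unsup2, lambda_u)
```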
S182,通过反向传播算法,基于图像分类模型的分类损失,调整图像分类模型的模型参数。
其中,图像分类模型的模型参数可以包括第一图像分类子模型的模型参数和第二图像分类子模型的模型参数。对于各个图像分类子模型而言,以神经网络为例,其模型参数可以包括但不限于该图像分类子模型中各网络层的神经元数量、不同网络层中的神经元之间的连接关系以及连接边权重、各网络层中的神经元对应的偏置等。
由于图像分类模型的分类损失能够反映图像分类模型对输入的图像进行分类识别所输出的分类参考信息与输入的图像所属的真实类别之间的差异，为得到高准确率的图像分类模型，可采用反向传播算法，基于图像分类模型的分类损失对第一图像分类子模型和第二图像分类子模型各自的模型参数进行调整。
在采用反向传播算法调整第一图像分类子模型和第二图像分类子模型各自的模型参数时,可基于图像分类模型的分类损失、第一图像分类子模型当前的模型参数和第二图像分类子模型当前的模型参数,采用反向传播算法确定第一图像分类子模型和第二图像分类子模型各自的各网络层引起的预测损失;然后,以使图像分类模型的分类损失下降为目标,逐层调整第一图像分类子模型中各网络层的相关参数以及第二图像分类子模型中各网络层的相关参数。
本申请实施例在此示出了上述S182的一种具体实现方式。当然,应理解,上述S182也可以采用其它的方式实现,本申请实施例对此不作限制。
需要说明的是,上述过程仅为一次模型参数调整过程,实际应用中,可能需要对图像分类模型进行多次模型参数调整,因而可重复执行上述步骤S102至S108多次,直到满足预设训练停止条件,由此得到最终的图像分类模型。其中,预设训练停止条件可以是图像分类模型的分类损失小于预设损失阈值,或者,也可以是调整次数达到预设次数等,本申请实施例对此不作限定。
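上述"重复执行步骤S102至S108直到满足预设训练停止条件"的外层流程可示意如下（compute_loss、update_params 为从上文抽象出来的假设接口，损失阈值与预设次数的取值均为示例）：

```python
def train_until_stop(compute_loss, update_params, loss_threshold=0.01, max_steps=100):
    """重复进行模型参数调整, 直到分类损失小于预设损失阈值或调整次数达到预设次数。
    返回最终损失与实际调整次数。"""
    step = 0
    loss = compute_loss()
    while loss >= loss_threshold and step < max_steps:
        update_params(loss)   # 一次反向传播式的参数调整(此处抽象为回调)
        loss = compute_loss()
        step += 1
    return loss, step
```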
由于每个图像分类子模型针对输入的图像得到的分类参考信息与输入的图像所属的真实类别之间都可能存在一定的差异,因而各个图像分类子模型产生的分类损失都会影响图像分类模型的分类准确率,为此,通过对各个图像分类子模型的分类损失进行加权求和后的结果作为图像分类模型的分类损失,使得图像分类模型的分类损失能够更准确地反映图像分类模型的分类偏差,进而利用图像分类模型的分类损失对图像分类模型的模型参数进行调整,有利于提高图像分类模型的分类准确率。
本申请实施例提供的图像分类模型的训练方法,在半监督学习框架下,通过图像分类模型中的各个图像分类子模型,分别对图像集中的每个图像进行分类识别,得到每个图像的多个分类参考信息,且一个分类参考信息对应一个图像分类子模型;然后,分别将每个图像分类子模型作为目标图像分类子模型,基于图像集中的有标签图像对应于该目标图像分类子模型的分类参考信息、有标签图像对应的类别标签以及图像集中的无标签图像对应于另一个图像分类子模型的分类参考信息,确定目标图像分类子模型的分类损失,也即利用一个目标图像分类子模型从图像集中学习的信息,为另一个目标图像分类子模型提供指导,使得图像分类模型的各目标图像分类子模型之间由单向的师生关系变为互为师生关系;进一步,基于图像分类模型中每个目标图像分类子模型的分类损失,调整图像分类模型的模型参数,可以充分利用各个目标图像分类子模型之间的相互师生关系,使得各个目标图像分类子模型之间互补学习、教学相长,进而使得图像集中包含的信息得到充分挖掘和利用,从而提高图像分类模型的训练效果,得到更准确、更可靠的图像分类模型。
上述实施例介绍了图像分类模型的训练方法,通过上述训练方法,可训练针对不同应用场景的图像分类模型。针对不同的应用场景,训练图像分类模型所采用的图像集及其中包含的每个图像的标签可根据应用场景进行选择。本申请实施例提供的上述训练方法所适用的应用场景可以例如包括但不限于目标检测、人脸表情分类、自然界动物分类、手写数字识别等场景。以自然界动物分类这一应用场景为例,有标签图像对应的类别标签用于标记有标签图像包含的目标对象以及目标对象所属的类别,比如猫、狗、马等,通过上述本申请实施例提供的训练方法训练得到的图像分类模型能够检测出待处理图像中的目标对象所在的区域,并识别出目标对象所属的类别。
基于本申请上述实施例所示的图像分类模型的训练方法，训练得到的图像分类模型可应用于任意需要对图像进行分类识别的场景。下面对基于图像分类模型的应用过程进行详细说明。
本申请实施例还提供一种图像分类模型的图像分类方法,能够基于上述训练方法训练出的图像分类模型,对待处理图像进行分类识别。
请参考图3,为本申请的一个实施例提供的一种图像分类方法的流程示意图,该方法可以包括如下步骤:
S302,通过图像分类模型对待处理图像进行分类识别,得到待处理图像的分类参考信息集。
其中,待处理图像的分类参考信息集包括待处理图像的第一目标分类参考信息和待处理图像的第二目标分类参考信息。所述图像分类模型包括第一图像分类子模型和第二图像分类子模型。第一图像分类子模型用于对待处理图像进行分类识别,得到待处理图像的第一目标分类参考信息;第二图像分类子模型用于对待处理图像进行分类识别,得到待处理图像的第二目标分类参考信息。
S304,基于待处理图像的分类参考信息集,确定待处理图像所属的类别。
可选地,可以基于待处理图像对应于任一图像分类子模型的分类参考信息,确定待处理图像所属的类别。例如,可将待处理图像的第一目标分类参考信息中最大概率对应的类别,确定为待处理图像所属的类别,或者,也可将待处理图像的第二目标分类参考信息中最大概率对应的类别,确定为待处理图像所属的类别。
可选地，还可综合待处理图像的上述多个分类参考信息，确定待处理图像所属的类别。例如，若待处理图像的第一目标分类参考信息中最大分类概率对应的类别与待处理图像的第二目标分类参考信息中最大分类概率对应的类别一致，则可将该类别确定为待处理图像所属的类别；又如，可基于待处理图像的第一目标分类参考信息中的第一目标类别集与待处理图像的第二目标分类参考信息中的第二目标类别集之间的交集，确定待处理图像所属的类别，其中，第一目标类别集包括第一目标分类参考信息中超过预设概率阈值的概率对应的类别，第二目标类别集包括第二目标分类参考信息中超过预设概率阈值的概率对应的类别，等等。
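上述综合两个子模型的分类参考信息确定类别的两种方式可示意如下（纯 Python 草图，分类参考信息以"类别到概率"的字典表示，prob_threshold 取值为示例性假设）：

```python
def predict_class(probs1, probs2, prob_threshold=0.5):
    """若两个子模型最大概率对应的类别一致, 直接返回该类别;
    否则在两者超过阈值的类别集合的交集中, 取概率之和最大的类别。
    交集为空时返回 None, 表示无法确定类别。"""
    top1 = max(probs1, key=probs1.get)
    top2 = max(probs2, key=probs2.get)
    if top1 == top2:
        return top1
    common = ({c for c, p in probs1.items() if p > prob_threshold}
              & {c for c, p in probs2.items() if p > prob_threshold})
    if not common:
        return None
    return max(common, key=lambda c: probs1[c] + probs2[c])
```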
本申请实施例提供的图像分类方法,由于图像分类模型是在半监督学习方式的基础上,利用各个图像分类子模型之间的相互师生关系,通过各个图像分类子模型之间互补学习、教学相长而训练得到的,因而图像分类模型具有较高的准确性和可靠性;进一步,利用图像分类模型对待处理图像进行分类识别,有助于提高图像分类结果的准确性和可靠性。
上述对本说明书特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下,在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外,在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中,多任务处理和并行处理也是可以的或者可能是有利的。
此外,与上述图1所示的图像分类模型的训练方法相对应地,本申请实施例还提供一种图像分类模型的训练装置。请参见图4,为本申请的一个实施例提供的一种图像分类模型的训练装置400的结构示意图,该装置400包括:获取单元410,用于获取用于对所述图像分类模型进行训练的图像集,所述图像集中包括有标签图像、无标签图像以及所述有标签图像对应的类别标签;分类单元420,用于通过所述图像分类模型中的目标图像分类子模型,分别对所述有标签图像和所述无标签图像进行分类识别,得到所述有标签图像的第一分类参考信息和所述无标签图像的第一分类参考信息;所述目标图像分类子模型为所述第一图像分类子模型或所述第二图像分类子模型;确定单元430,用于基于所述有标签图像对应的第一分类参考信息、所述有标签图像对应的类别标签以及所述无标签图像的第二分类参考信息,确定所述目标图像分类子模型的分类损失;通过所述图像分类模型中的非目标图像分类子模型对所述无标签图像进行分类识别,得到所述无标签图像的第二分类参考信息,所述非目标图像分类子模型为所述图像分类模型中除所述目标图像 分类子模型之外的其他图像分类子模型;所述目标图像分类子模型的分类损失是指所述第一图像分类子模型的分类损失或所述第二图像分类子模型的分类损失;调整单元440,用于基于所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失,调整所述图像分类模型的模型参数。
可选地,所述获取单元,还用于获取初始无标签图像。
所述训练装置400还包括:增强单元,用于对所述初始无标签图像进行多种增强程度的数据增强处理,得到所述无标签图像,所述无标签图像的数量为多个,每个无标签图像与一种增强程度对应。
可选地,所述无标签图像包括第一类无标签图像和第二类无标签图像,第一类无标签图像对应的增强程度小于所述第二类无标签图像的增强程度。
所述确定单元基于所述有标签图像对应的第一分类参考信息、所述有标签图像对应的类别标签以及所述无标签图像的第二分类参考信息,确定所述目标图像分类子模型的分类损失,包括:基于所述第一类无标签图像的第二分类参考信息,生成所述第一类无标签图像对应的第一类伪标签;基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签,确定所述目标图像分类子模型的无监督损失;基于所述有标签图像的第一分类参考信息和所述有标签图像对应的类别标签,确定所述目标图像分类子模型的有监督损失;基于所述目标图像分类子模型的无监督损失和目标图像分类子模型的有监督损失,确定所述目标图像分类子模型的分类损失。
可选地,所述确定单元,还用于在基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签确定所述目标图像分类子模型的无监督损失之前,基于所述第一类无标签图像对应的第一类伪标签和所述第一类无标签图像对应的第二类伪标签,确定所述第一类无标签图像对应的损失权重,其中,所述第一类无标签图像对应的第二类伪标签为基于所述第一类无标签图像的第一分类参考信息生成的。
所述确定单元基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签,确定所述目标图像分类子模型的无监督损失,包括:基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签,确定所述第一类无标签图像对应的无监督子损失;基于所述第一类无标签图像对应的损失权重和所述第一类无标签图像对应的无监督子损失,确定所述目标图像分类子模型的无监督损失。
可选地,所述第一类无标签图像对应的第一类伪标签用于指示所述第一类无标签图像中的第一目标对象区域以及所述第一目标对象区域所属的预测类别,所述第一类无标签图像对应的第二类伪标签用于指示所述第一类无标签图像中的第二目标对象区域以及所述第二目标对象区域所属的预测类别。
所述确定单元基于所述第一类无标签图像对应的第一类伪标签和所述第一类无标签图像对应的第二类伪标签,确定所述第一类无标签图像对应的损失权重,包括:确定所述第一目标对象区域与所述第二目标对象区域之间的交并比,以及将所述第一目标对象区域所属的预测类别与所述第二目标对象区域所属的预测类别进行比对,得到比对结果;基于所述交并比和所述比对结果,确定所述第一类无标签图像对应的损失权重。
可选地,所述确定单元基于所述交并比和所述比对结果,确定所述第一类无标签图像对应的损失权重,包括:若所述交并比小于或等于预设比值或者所述比对结果表明所述第一目标对象区域所属的预测类别与所述第二目标对象区域所属的预测类别不一致,则确定所述第一类无标签图像对应的损失权重为第一预设权重;若所述交并比大于所述预设比值、且所述比对结果表明所述第一目标对象区域所属的预测类别与所述第二目标对象区域所属的预测类别一致,则确定所述第一类无标签图像对应的损失权重为第二预设权重,其中,所述第二预设权重大于所述第一预设权重。
可选地，所述第一类无标签图像的第一分类参考信息和所述第一类无标签图像的第二分类参考信息均包括所述第一类无标签图像被识别为属于多个预设类别中每个预设类别的概率。
所述确定单元基于所述第一类无标签图像的第二分类参考信息,生成所述第一类无标签图像对应的第一类伪标签,包括:基于所述第一类无标签图像的第二分类参考信息,从所述多个预设类别中确定最大概率对应的预设类别;若所述最大概率大于预设概率阈值,则基于所述最大概率对应的预设类别,生成所述第一类无标签图像的第一类伪标签。
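上述"最大概率超过预设概率阈值时才生成第一类伪标签"的判断可示意为（prob_threshold 取值为示例性假设）：

```python
def make_pseudo_label(class_probs, prob_threshold=0.9):
    """class_probs: {预设类别: 概率}。
    从多个预设类别中确定最大概率对应的类别, 仅当最大概率大于阈值时生成伪标签,
    否则返回 None, 表示该样本不参与无监督损失计算。"""
    best = max(class_probs, key=class_probs.get)
    if class_probs[best] > prob_threshold:
        return best
    return None
```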
可选地,所述调整单元基于所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失,调整所述图像分类模型的模型参数,包括:对所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失进行加权求和,得到所述图像分类模型的分类损失;通过反向传播算法,基于所述图像分类模型的分类损失,调整所述图像分类模型的模型参数。
显然,本申请实施例提供的图像分类模型的训练装置能够作为图1所示的图像分类模型的训练方法的执行主体,例如,图1所示的图像分类模型的训练方法中步骤S102可由图4所示的图像分类模型的训练装置中的获取单元执行,步骤S104可由图像分类模型的训练装置中的分类单元执行,步骤S106可由图像分类模型的训练装置中的确定单元执行,步骤S108可由图像分类模型的训练装置中的调整单元执行。
根据本申请的另一个实施例,图4所示的图像分类模型的训练装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其他实施例中,图像分类模型的训练装置也可以包括其他单元,在实际应用中,这些功能也可以由其他单元协助实现,并且可以由多个单元协作实现。
根据本申请的另一个实施例,可以通过在包括中央处理单元(Central  Processing Unit,CPU)、随机存取存储介质(Random Access Memory,RAM)、只读存储介质(Read-Only Memory,ROM)等处理元件和存储元件的例如计算机的通用计算设备上,运行能够执行如图1所示的相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造如图4中所示的图像分类模型的训练装置,以及来实现本申请实施例的图像分类模型的训练方法。所述计算机程序可以记载于例如计算机可读存储介质上,并通过计算机可读存储介质转载于电子设备中,并在其中运行。
本申请实施例提供的图像分类模型的训练装置,在半监督学习框架下,通过图像分类模型中的各个图像分类子模型,分别对图像集中的每个图像进行分类识别,得到每个图像的多个分类参考信息,且一个分类参考信息对应一个图像分类子模型;然后,分别将每个图像分类子模型作为目标图像分类子模型,基于图像集中的有标签图像对应于该目标图像分类子模型的分类参考信息、有标签图像对应的类别标签以及图像集中的无标签图像对应于另一个图像分类子模型的分类参考信息,确定目标图像分类子模型的分类损失,也即利用一个目标图像分类子模型从图像集中学习的信息,为另一个目标图像分类子模型提供指导,使得图像分类模型的各目标图像分类子模型之间由单向的师生关系变为互为师生关系;进一步,基于图像分类模型中每个目标图像分类子模型的分类损失,调整图像分类模型的模型参数,可以充分利用各个目标图像分类子模型之间的相互师生关系,使得各个目标图像分类子模型之间互补学习、教学相长,进而使得图像集中包含的信息得到充分挖掘和利用,从而提高图像分类模型的训练效果,得到更准确、更可靠的图像分类模型。
此外,与上述图3所示的图像分类方法相对应地,本申请实施例还提供一种图像分类装置。请参见图5,为本申请的一个实施例提供的一种图像分类装置500的结构示意图,该装置500包括:分类单元510,用于通过图像分类模型对待处理图像进行分类识别,得到所述待处理图像的分类参考信息 集;其中,所述分类参考信息集包括第一目标分类参考信息和第二目标分类参考信息,所述图像分类模型包括第一图像分类子模型和第二图像分类子模型,所述第一图像分类子模型用于对所述待处理图像进行分类识别得到所述第一目标分类参考信息,所述第二图像分类子模型用于对所述待处理图像进行分类识别得到所述第二目标分类参考信息,所述图像分类模型为基于本申请实施例所述的训练方法训练得到;确定单元520,用于基于所述待处理图像的分类参考信息集,确定所述待处理图像所属的类别。
显然,本申请实施例提供的图像分类装置能够作为图3所示的图像分类方法的执行主体,例如,图3所示的图像分类方法中步骤S302可由图5所示的图像分类装置中的分类单元执行,步骤S304可由图像分类装置中的确定单元执行。
根据本申请的另一个实施例,图5所示的图像分类装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其他实施例中,图像分类装置也可以包括其他单元,在实际应用中,这些功能也可以由其他单元协助实现,并且可以由多个单元协作实现。
根据本申请的另一个实施例,可以通过在包括中央处理单元(Central Processing Unit,CPU)、随机存取存储介质(Random Access Memory,RAM)、只读存储介质(Read-Only Memory,ROM)等处理元件和存储元件的例如计算机的通用计算设备上,运行能够执行如图3所示的相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造如图5中所示的图像分类装置,以及来实现本申请实施例的图像分类方法。所述计算机程序可以记载于例如计算机可读存储介质上,并通过计算机可读存储介质转载于电子设备中,并在 其中运行。
本申请实施例提供的图像分类装置,由于图像分类模型是在半监督学习方式的基础上,利用各个图像分类子模型之间的相互师生关系,通过各个图像分类子模型之间互补学习、教学相长而训练得到的,因而图像分类模型具有较高的准确性和可靠性;进一步,利用图像分类模型对待处理图像进行分类识别,有助于提高图像分类结果的准确性和可靠性。
图6是本申请的一个实施例电子设备的结构示意图。请参考图6,在硬件层面,该电子设备包括处理器,可选地还包括内部总线、网络接口、存储器。其中,存储器可能包含内存,例如高速随机存取存储器(Random-Access Memory,RAM),也可能还包括非易失性存储器(non-volatile memory),例如至少1个磁盘存储器等。当然,该电子设备还可能包括其他业务所需要的硬件。
处理器、网络接口和存储器可以通过内部总线相互连接,该内部总线可以是ISA(Industry Standard Architecture,工业标准体系结构)总线、PCI(Peripheral Component Interconnect,外设部件互连标准)总线或EISA(Extended Industry Standard Architecture,扩展工业标准结构)总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图6中仅用一个双向箭头表示,但并不表示仅有一根总线或一种类型的总线。
存储器,用于存放程序。具体地,程序可以包括程序代码,所述程序代码包括计算机操作指令。存储器可以包括内存和非易失性存储器,并向处理器提供指令和数据。
处理器从非易失性存储器中读取对应的计算机程序到内存中然后运行,在逻辑层面上形成图像分类模型的训练装置。处理器,执行存储器所存放的程序,并具体用于执行以下操作:获取用于对所述图像分类模型进行训练的图像集,所述图像集中包括有标签图像、无标签图像以及所述有标签图像对应的类别标签;通过所述图像分类模型中的目标图像分类子模型,分别对所 述有标签图像和所述无标签图像进行分类识别,得到所述有标签图像的第一分类参考信息和所述无标签图像的第一分类参考信息;所述目标图像分类子模型为所述第一图像分类子模型或所述第二图像分类子模型;基于所述有标签图像对应的第一分类参考信息、所述有标签图像对应的类别标签以及所述无标签图像的第二分类参考信息,确定所述目标图像分类子模型的分类损失;所述无标签图像的第二分类参考信息为通过所述图像分类模型中除所述目标图像分类子模型之外的其他图像分类子模型对所述无标签图像进行分类识别得到;所述目标图像分类子模型的分类损失是指所述第一图像分类子模型的分类损失或所述第二图像分类子模型的分类损失;基于所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失,调整所述图像分类模型的模型参数。
或者,处理器从非易失性存储器中读取对应的计算机程序到内存中然后运行,在逻辑层面上形成图像分类装置。处理器,执行存储器所存放的程序,并具体用于执行以下操作:通过图像分类模型对待处理图像进行分类识别,得到所述待处理图像的分类参考信息集;其中,所述分类参考信息集包括第一目标分类参考信息和第二目标分类参考信息,所述图像分类模型包括第一图像分类子模型和第二图像分类子模型,所述第一图像分类子模型用于对所述待处理图像进行分类识别得到所述第一目标分类参考信息,所述第二图像分类子模型用于对所述待处理图像进行分类识别得到所述第二目标分类参考信息,所述图像分类模型为基于本申请实施例所述的图像分类模型的训练方法训练得到;基于所述待处理图像的分类参考信息集,确定所述待处理图像所属的类别。
上述如本申请图1所示实施例揭示的图像分类模型的训练装置执行的方法或者上述如本申请图3所示实施例揭示的图像分类装置执行的方法可以应用于处理器中,或者由处理器实现。处理器可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器中的硬 件的集成逻辑电路或者软件形式的指令完成。上述的处理器可以是通用处理器,包括中央处理器(Central Processing Unit,CPU)、网络处理器(Network Processor,NP)等;还可以是数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
该电子设备还可执行图1的方法,并实现图像分类模型的训练装置在图1、图2所示实施例的功能,或者,该电子设备还可执行图3的方法,并实现图像分类装置在图3所示实施例的功能,本申请实施例在此不再赘述。
当然,除了软件实现方式之外,本申请的电子设备并不排除其他实现方式,比如逻辑器件抑或软硬件结合的方式等等,也就是说以下处理流程的执行主体并不限定于各个逻辑单元,也可以是硬件或逻辑器件。
本申请实施例还提出了一种计算机可读存储介质,该计算机可读存储介质存储一个或多个程序,该一个或多个程序包括指令,该指令当被包括多个应用程序的便携式电子设备执行时,能够使该便携式电子设备执行图1所示实施例的方法,并具体用于执行以下操作:获取用于对所述图像分类模型进行训练的图像集,所述图像集中包括有标签图像、无标签图像以及所述有标签图像对应的类别标签;通过所述图像分类模型中的目标图像分类子模型,分别对所述有标签图像和所述无标签图像进行分类识别,得到所述有标签图 像的第一分类参考信息和所述无标签图像的第一分类参考信息;所述目标图像分类子模型为所述第一图像分类子模型或所述第二图像分类子模型;基于所述有标签图像对应的第一分类参考信息、所述有标签图像对应的类别标签以及所述无标签图像的第二分类参考信息,确定所述目标图像分类子模型的分类损失;所述无标签图像的第二分类参考信息为通过所述图像分类模型中除所述目标图像分类子模型之外的其他图像分类子模型对所述无标签图像进行分类识别得到;所述目标图像分类子模型的分类损失是指所述第一图像分类子模型的分类损失或所述第二图像分类子模型的分类损失;基于所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失,调整所述图像分类模型的模型参数。
或者,该计算机可读存储介质存储一个或多个程序,该一个或多个程序包括指令,该指令当被包括多个应用程序的便携式电子设备执行时,能够使该便携式电子设备执行图3所示实施例的方法,并具体用于执行以下操作:通过图像分类模型对待处理图像进行分类识别,得到所述待处理图像的分类参考信息集;其中,所述分类参考信息集包括第一目标分类参考信息和第二目标分类参考信息,所述图像分类模型包括第一图像分类子模型和第二图像分类子模型,所述第一图像分类子模型用于对所述待处理图像进行分类识别得到所述第一目标分类参考信息,所述第二图像分类子模型用于对所述待处理图像进行分类识别得到所述第二目标分类参考信息,所述图像分类模型为基于本申请实施例所述的图像分类模型的训练方法训练得到;基于所述待处理图像的分类参考信息集,确定所述待处理图像所属的类别。
总之,以上所述仅为本申请的较佳实施例而已,并非用于限定本申请的保护范围。凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。
上述实施例阐明的系统、装置、模块或单元，具体可以由计算机芯片或实体实现，或者由具有某种功能的产品来实现。一种典型的实现设备为计算机。具体的，计算机例如可以为个人计算机、膝上型计算机、蜂窝电话、相机电话、智能电话、个人数字助理、媒体播放器、导航设备、电子邮件设备、游戏控制台、平板计算机、可穿戴设备或者这些设备中的任何设备的组合。
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。
还需要说明的是,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、商品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、商品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、商品或者设备中还存在另外的相同要素。
本说明书中的各个实施例均采用递进的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于系统实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。

Claims (13)

  1. 一种图像分类模型的训练方法,所述方法包括:
    获取用于对所述图像分类模型进行训练的图像集,所述图像集中包括有标签图像、无标签图像以及所述有标签图像对应的类别标签;
    通过所述图像分类模型中的目标图像分类子模型，对所述有标签图像和所述无标签图像进行分类识别，得到所述有标签图像的第一分类参考信息和所述无标签图像的第一分类参考信息，其中，所述图像分类模型包括第一图像分类子模型和第二图像分类子模型，所述目标图像分类子模型为所述第一图像分类子模型或所述第二图像分类子模型；
    通过所述图像分类模型中的非目标图像分类子模型对所述无标签图像进行分类识别,得到所述无标签图像的第二分类参考信息;
    基于所述有标签图像的第一分类参考信息、所述有标签图像对应的类别标签以及所述无标签图像的第二分类参考信息，确定所述目标图像分类子模型的分类损失，所述目标图像分类子模型的分类损失是指所述第一图像分类子模型的分类损失或所述第二图像分类子模型的分类损失；
    基于所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失,调整所述图像分类模型的模型参数。
  2. 根据权利要求1所述的方法,其中,所述方法还包括:
    获取初始无标签图像;
    对所述初始无标签图像进行多种增强程度的数据增强处理,得到所述无标签图像,所述无标签图像的数量为多个,每个无标签图像与一种增强程度对应。
  3. 根据权利要求2所述的方法,其中,所述无标签图像包括第一类无标签图像和第二类无标签图像,第一类无标签图像对应的增强程度小于所述第二类无标签图像的增强程度;
    所述基于所述有标签图像对应的第一分类参考信息、所述有标签图像对应的类别标签以及所述无标签图像的第二分类参考信息，确定所述目标图像分类子模型的分类损失，包括：
    基于所述第一类无标签图像的第二分类参考信息,生成所述第一类无标签图像对应的第一类伪标签;
    基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签,确定所述目标图像分类子模型的无监督损失;
    基于所述有标签图像的第一分类参考信息和所述有标签图像对应的类别标签,确定所述目标图像分类子模型的有监督损失;
    基于所述目标图像分类子模型的无监督损失和目标图像分类子模型的有监督损失,确定所述目标图像分类子模型的分类损失。
  4. 根据权利要求3所述的方法,其中,在基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签确定所述目标图像分类子模型的无监督损失之前,所述方法还包括:
    基于所述第一类无标签图像对应的第一类伪标签和所述第一类无标签图像对应的第二类伪标签,确定所述第一类无标签图像对应的损失权重,其中,所述第一类无标签图像对应的第二类伪标签为基于所述第一类无标签图像的第一分类参考信息生成的;
    所述基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签,确定所述目标图像分类子模型的无监督损失,包括:
    基于所述第二类无标签图像的第一分类参考信息和所述第一类无标签图像对应的第一类伪标签,确定所述第一类无标签图像对应的无监督子损失;
    基于所述第一类无标签图像对应的损失权重和所述第一类无标签图像对应的无监督子损失,确定所述目标图像分类子模型的无监督损失。
  5. 根据权利要求4所述的方法,其中,所述第一类无标签图像对应的第一类伪标签用于指示所述第一类无标签图像中的第一目标对象区域以及所述第一目标对象区域所属的预测类别,所述第一类无标签图像对应的第二类伪标签用于指示所述第一类无标签图像中的第二目标对象区域以及所述第二目标对象区域所属的预测类别;
    所述基于所述第一类无标签图像对应的第一类伪标签和所述第一类无标签图像对应的第二类伪标签,确定所述第一类无标签图像对应的损失权重,包括:
    确定所述第一目标对象区域与所述第二目标对象区域之间的交并比,以及将所述第一目标对象区域所属的预测类别与所述第二目标对象区域所属的预测类别进行比对,得到比对结果;
    基于所述交并比和所述比对结果,确定所述第一类无标签图像对应的损失权重。
  6. 根据权利要求5所述的方法,其中,所述基于所述交并比和所述比对结果,确定所述第一类无标签图像对应的损失权重,包括:
    若所述交并比小于或等于预设比值或者所述比对结果表明所述第一目标对象区域所属的预测类别与所述第二目标对象区域所属的预测类别不一致,则确定所述第一类无标签图像对应的损失权重为第一预设权重;
    若所述交并比大于所述预设比值、且所述比对结果表明所述第一目标对象区域所属的预测类别与所述第二目标对象区域所属的预测类别一致,则确定所述第一类无标签图像对应的损失权重为第二预设权重,其中,所述第二预设权重大于所述第一预设权重。
  7. 根据权利要求3所述的方法，其中，所述第一类无标签图像的第一分类参考信息和所述第一类无标签图像的第二分类参考信息均包括所述第一类无标签图像被识别为属于多个预设类别中每个预设类别的概率；
    基于所述第一类无标签图像的第二分类参考信息,生成所述第一类无标签图像对应的第一类伪标签,包括:
    基于所述第一类无标签图像的第二分类参考信息,从所述多个预设类别中确定最大概率对应的预设类别;
    若所述最大概率大于预设概率阈值,则基于所述最大概率对应的预设类别,生成所述第一类无标签图像的第一类伪标签。
  8. 根据权利要求1至7中任一项所述的方法,其中,所述基于所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失,调整所述图像分类模型的模型参数,包括:
    对所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失进行加权求和,得到所述图像分类模型的分类损失;
    通过反向传播算法,基于所述图像分类模型的分类损失,调整所述图像分类模型的模型参数。
  9. 一种图像分类方法,包括:
    通过图像分类模型对待处理图像进行分类识别,得到所述待处理图像的分类参考信息集;其中,所述分类参考信息集包括第一目标分类参考信息和第二目标分类参考信息,所述图像分类模型包括第一图像分类子模型和第二图像分类子模型,所述第一图像分类子模型用于对所述待处理图像进行分类识别得到所述第一目标分类参考信息,所述第二图像分类子模型用于对所述待处理图像进行分类识别得到所述第二目标分类参考信息,所述图像分类模型为基于权利要求1至8中任一项所述的训练方法训练得到;
    基于所述待处理图像的分类参考信息集,确定所述待处理图像所属的类别。
  10. 一种图像分类模型的训练装置,所述训练装置包括:
    获取单元,用于获取用于对所述图像分类模型进行训练的图像集,所述图像集中包括有标签图像、无标签图像以及所述有标签图像对应的类别标签;
    分类单元,用于通过所述图像分类模型中的目标图像分类子模型,对所述有标签图像和所述无标签图像进行分类识别,得到所述有标签图像的第一分类参考信息和所述无标签图像的第一分类参考信息;通过所述图像分类模型中的非目标图像分类子模型对所述无标签图像进行分类识别,得到所述无标签图像的第二分类参考信息;
    确定单元,用于基于所述有标签图像对应的第一分类参考信息、所述有标签图像对应的类别标签以及所述无标签图像的第二分类参考信息,确定所述目标图像分类子模型的分类损失,所述目标图像分类子模型的分类损失包括所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失;调整单元,用于基于所述第一图像分类子模型的分类损失和所述第二图像分类子模型的分类损失,调整所述图像分类模型的模型参数。
  11. 一种图像分类装置,包括:
    分类单元,用于通过图像分类模型对待处理图像进行分类识别,得到所述待处理图像的分类参考信息集;其中,所述分类参考信息集包括第一目标分类参考信息和第二目标分类参考信息,所述图像分类模型包括第一图像分类子模型和第二图像分类子模型,所述第一图像分类子模型用于对所述待处理图像进行分类识别得到所述第一目标分类参考信息,所述第二图像分类子模型用于对所述待处理图像进行分类识别得到所述第二目标分类参考信息,所述图像分类模型为基于权利要求1至8中任一项所述的训练方法训练得到;
    确定单元,用于基于所述待处理图像的分类参考信息集,确定所述待处理图像所属的类别。
  12. 一种电子设备,包括:
    处理器;
    用于存储所述处理器可执行指令的存储器;
    其中,所述处理器被配置为执行所述指令,以实现如权利要求1至8中任一项所述的方法;或者,所述处理器被配置为执行所述指令,以实现如权利要求9所述的方法。
  13. 一种计算机可读存储介质，当所述存储介质中的指令由电子设备的处理器执行时，使得电子设备能够执行如权利要求1至8中任一项所述的方法；或者，使得电子设备能够执行如权利要求9所述的方法。
PCT/CN2023/102430 2022-07-19 2023-06-26 图像分类模型的训练方法、图像分类方法及相关设备 WO2024016945A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210872051.8 2022-07-19
CN202210872051.8A CN117456219A (zh) 2022-07-19 2022-07-19 图像分类模型的训练方法、图像分类方法及相关设备

Publications (1)

Publication Number Publication Date
WO2024016945A1 (zh)

Family

ID=89595374

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/102430 WO2024016945A1 (zh) 2022-07-19 2023-06-26 图像分类模型的训练方法、图像分类方法及相关设备

Country Status (2)

Country Link
CN (1) CN117456219A (zh)
WO (1) WO2024016945A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170132528A1 (en) * 2015-11-06 2017-05-11 Microsoft Technology Licensing, Llc Joint model training
CN112149733A (zh) * 2020-09-23 2020-12-29 北京金山云网络技术有限公司 模型训练、质量确定方法、装置、电子设备及存储介质
CN113240655A (zh) * 2021-05-21 2021-08-10 深圳大学 一种自动检测眼底图像类型的方法、存储介质及装置
CN114722958A (zh) * 2022-04-22 2022-07-08 商汤集团有限公司 网络训练及目标检测方法、装置、电子设备和存储介质
CN114742119A (zh) * 2021-12-30 2022-07-12 浙江大华技术股份有限公司 交叉监督的模型训练方法、图像分割方法及相关设备
CN115375706A (zh) * 2022-08-16 2022-11-22 中国科学院深圳先进技术研究院 图像分割模型训练方法、装置、设备及存储介质


Also Published As

Publication number Publication date
CN117456219A (zh) 2024-01-26


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23842011

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023842011

Country of ref document: EP

Effective date: 20240313