CN111814913A - Training method and device for image classification model, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111814913A
Authority
CN
China
Prior art keywords
image
trained
classification
sample
image classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010844672.6A
Other languages
Chinese (zh)
Inventor
罗茂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Shenzhen Huantai Technology Co Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Shenzhen Huantai Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd, Shenzhen Huantai Technology Co Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010844672.6A priority Critical patent/CN111814913A/en
Publication of CN111814913A publication Critical patent/CN111814913A/en
Withdrawn legal-status Critical Current

Classifications

    • G06F18/24: Pattern recognition > Analysing > Classification techniques
    • G06F18/214: Pattern recognition > Design or setup of recognition systems or techniques > Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/507: Arrangements for image or video recognition or understanding > Extraction of image or video features > Summing image-intensity values; Histogram projection analysis
    • G06V10/467: Arrangements for image or video recognition or understanding > Extraction of image or video features > Encoded features or binary features, e.g. local binary patterns [LBP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a training method and apparatus for an image classification model, an electronic device, and a storage medium. The image classification model comprises a plurality of image classification modules, each image classification module corresponding to a different image classification scene. The method comprises: obtaining a training data set comprising a plurality of image samples and a sample classification label corresponding to each image sample; performing feature extraction on each image sample to obtain a feature image corresponding to each image sample; training the plurality of image classification modules respectively, with the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data, to obtain a plurality of trained image classification modules; and obtaining a trained image classification model based on the plurality of trained image classification modules. Because each trained image classification module is dedicated to one image classification scene, each image classification module in the trained image classification model can classify the images of its corresponding scene accurately and efficiently.

Description

Training method and device for image classification model, electronic equipment and storage medium
Technical Field
The present application relates to the field of image classification technologies, and in particular, to a method and an apparatus for training an image classification model, an electronic device, and a storage medium.
Background
When a user shoots a large number of pictures, the pictures need to be classified for convenient management. A common approach is for the user to classify the pictures manually, one by one, which makes picture classification inefficient and inaccurate.
Disclosure of Invention
In view of the foregoing problems, the present application provides a training method and apparatus for an image classification model, an electronic device, and a storage medium, which can improve image classification efficiency and classification accuracy.
In a first aspect, an embodiment of the present application provides a method for training an image classification model, where the image classification model includes a plurality of image classification modules, and each image classification module corresponds to a different image classification scene, and the method includes: acquiring a training data set, wherein the training data set comprises a plurality of image samples and a sample classification label corresponding to each image sample; performing feature extraction on each image sample to obtain a feature image corresponding to each image sample; respectively training the plurality of image classification modules by taking the characteristic image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data to obtain a plurality of trained image classification modules; obtaining a trained image classification model based on the plurality of trained image classification modules.
In a second aspect, an embodiment of the present application provides a training apparatus for an image classification model, including: the acquisition module is used for acquiring a training data set, wherein the training data set comprises a plurality of image samples and sample classification labels corresponding to the image samples; the characteristic extraction module is used for extracting the characteristics of each image sample to obtain a characteristic image corresponding to each image sample; the training module is used for training the plurality of image classification modules respectively by taking the characteristic image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data to obtain a plurality of trained image classification modules; and the image classification model obtaining module is used for obtaining a trained image classification model based on the trained image classification modules.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the method described above.
In a fourth aspect, the present application provides a computer-readable storage medium, in which program codes are stored, and the program codes can be called by a processor to execute the method as described above.
According to the training method and apparatus for an image classification model, the electronic device, and the storage medium provided by the embodiments of the present application, the image classification model comprises a plurality of image classification modules, each image classification module corresponding to a different image classification scene. A training data set is obtained, wherein the training data set comprises a plurality of image samples and a sample classification label corresponding to each image sample; feature extraction is performed on each image sample to obtain a feature image corresponding to each image sample; the plurality of image classification modules are trained respectively, with the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data, to obtain a plurality of trained image classification modules; and a trained image classification model is obtained based on the plurality of trained image classification modules. Because each trained image classification module is dedicated to one image classification scene, each image classification module in the trained image classification model can classify the images of its corresponding scene accurately and efficiently.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 illustrates a multi-scene multi-label classification diagram;
FIG. 2 shows a schematic diagram of an image classification model;
FIG. 3 illustrates a model structure diagram of an image classification model;
FIG. 4 shows a schematic diagram of an image classification model of the present application;
FIG. 5 illustrates a model structure diagram of an image classification model of the present application;
FIG. 6 illustrates a flow diagram of a method of training an image classification model according to one embodiment of the present application;
FIG. 7 shows a flowchart of a method of training an image classification model according to another embodiment of the present application;
FIG. 8 shows a flowchart of a method of training an image classification model according to another embodiment of the present application;
FIG. 9 shows a flowchart of a method of training an image classification model according to another embodiment of the present application;
FIG. 10 shows a flowchart of a method of training an image classification model according to another embodiment of the present application;
FIG. 11 shows a flowchart of substep S560;
FIG. 12 shows a block diagram of a training apparatus for an image classification model according to an embodiment of the present application;
FIG. 13 is a block diagram of an electronic device for performing a training method of an image classification model according to an embodiment of the present application;
FIG. 14 is a storage unit for storing or carrying program code for implementing a training method of an image classification model according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
When a user takes a large number of pictures, the pictures need to be classified for convenient management. In actual classification, owing to factors such as rich scenes and complex labels, one picture may in most cases belong not to a single class but to multiple classes (multi-label classification), that is, so-called multi-scene multi-label classification. Fig. 1 is a schematic diagram illustrating multi-scene multi-label classification. Referring to fig. 1, scene 1 corresponds to a plurality of classifications, such as classification 1, classification 2, and so on; likewise, scene 2 corresponds to a plurality of classifications, such as classification 1, classification 2, and so on.
For multi-scene multi-label classification with a huge number of pictures, the inventor finds that an image classification model for picture classification can be adopted to automatically classify the pictures, as shown in fig. 2, fig. 2 shows a schematic diagram of the image classification model, and the image classification model can comprise a feature extraction module and an image classification module; the feature extraction module is used for extracting features of the input image to obtain a feature image. And inputting the characteristic images into an image classification module, and outputting classification labels by the image classification module to finish automatic classification of the input images.
For example, when an image in scene 1 is input into the image classification model shown in fig. 2, the image classification model outputs the classification label corresponding to that image; when an image in scene 2 is input into the image classification model shown in fig. 2, the image classification model outputs the classification label corresponding to that image.
Fig. 3 shows a model structure diagram of this image classification model.
However, the inventor has found that the image classification model shown in fig. 2 classifies all images from all scenes uniformly. Because the scenes are rich and the images corresponding to each scene are numerous, classifying all images with a single image classification module may lead to problems such as inaccurate classification results and low classification efficiency.
In view of the above problems, through long-term research the inventor proposes the training method and apparatus for an image classification model, the electronic device, and the storage medium of the embodiments of the present application. The specific improvement is that a training data set is obtained, the image samples in the training data set are used to train an image classification module for each of the different image classification scenes to which the samples correspond, and the images of each scene are then classified by the image classification module for that scene, which improves classification accuracy and efficiency. The specific training method of the image classification model is described in detail in the following embodiments.
Fig. 4 is a schematic diagram of an image classification model of the present application, please refer to fig. 4, where the image classification model includes a plurality of image classification modules, and each image classification module corresponds to a different image classification scene.
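The structure of fig. 4 can be pictured as a shared feature extractor feeding one classification module per scene. The sketch below is purely illustrative: the histogram "feature extractor" and the linear scoring heads are stand-ins, not the patent's implementation.

```python
import numpy as np

class MultiSceneClassifier:
    """Illustrative sketch of fig. 4: one shared feature extractor
    feeding a separate image classification module per scene."""

    def __init__(self, scene_heads):
        # scene_heads: dict mapping scene name -> (weight matrix, label names)
        self.scene_heads = scene_heads

    def extract_features(self, image):
        # Placeholder feature extractor: a normalized 16-bin intensity histogram.
        hist, _ = np.histogram(image, bins=16, range=(0, 256))
        return hist / max(hist.sum(), 1)

    def classify(self, image, scene):
        # Dispatch the shared feature to the module for the requested scene.
        weights, labels = self.scene_heads[scene]
        scores = weights @ self.extract_features(image)
        return labels[int(np.argmax(scores))]
```

A real system would replace the histogram with a trained feature extraction module and the linear heads with the trained per-scene classification modules.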
Fig. 5 shows a model structure diagram of the image classification model of the present application.
Fig. 6 is a flowchart illustrating a method for training an image classification model according to an embodiment of the present application. Referring to fig. 6, in a specific embodiment, the training method is applied to the apparatus 500 for training an image classification model shown in fig. 12 and to the electronic device 600 (fig. 13) equipped with the apparatus 500. The specific process of this embodiment is described below taking an electronic device as an example; it is understood that the electronic device applied in this embodiment may be a mobile terminal, a smart phone, a tablet computer, a wearable electronic device, and the like, which is not limited herein. As detailed with respect to the flow shown in fig. 6, the method for training the image classification model may specifically include the following steps:
step S110, a training data set is obtained, wherein the training data set comprises a plurality of image samples and sample classification labels corresponding to the image samples.
In some embodiments, the plurality of image samples in the training dataset may be preview images acquired by a camera of the electronic device; the photos can also be photos shot by a camera of the electronic equipment and stored in a local photo album, for example, photos shot by a camera of a mobile phone; the images may be downloaded from a network and stored in an album, and the like, which is not limited herein.
Alternatively, the training data set may comprise a plurality of sub-training data sets, e.g. the training data set may comprise a first sub-training data set, a second sub-training data set and a third sub-training data set.
When the image classification scene is a garbage classification scene, and the image classification model trained in the embodiment is used for classifying garbage, the first sub-training data set includes a plurality of garbage image samples and sample classification labels corresponding to the garbage image samples. In this embodiment, the acquired image sample may be an image sample of garbage. The sample classification label corresponding to each image sample may be a classification of garbage. For example, when the image sample of the garbage is kitchen garbage, the corresponding sample classification label is wet garbage.
When the image classification scene is a commodity classification scene, the image classification model trained in this embodiment is further used for classifying commodities of the shopping platform, the second sub-training data set includes an image sample of the food material and a sample classification label corresponding to the image sample of the food material, the acquired image sample may be the image sample of the food material, and the sample classification label corresponding to each image sample may be a type of the food material. For example, when the image sample of the food material is beef, the corresponding sample classification label thereof is meat.
When the image classification scene is a user photo cleaning scene, the image classification model trained in this embodiment is further used for classifying images shot with a smart phone. The third sub-training data set includes image samples shot by the smart phone and the sample classification labels corresponding to those image samples; the acquired image samples may be the image samples shot by the smart phone, and the sample classification label corresponding to each image sample may be "clear" or "blurred".
In some embodiments, the acquired image to be detected may be a static image or a dynamic image, which is not limited herein.
And step S120, performing feature extraction on each image sample to obtain a feature image corresponding to each image sample.
In a possible implementation manner, feature extraction is performed on each image sample through a feature extraction algorithm to obtain a feature image corresponding to each image sample, and optionally, the feature extraction algorithm may be, but is not limited to, a Histogram of Oriented Gradients (HOG) algorithm or a Local Binary Pattern (LBP) algorithm.
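As a concrete illustration of one of the algorithms named above, a simplified 8-neighbour local binary pattern (LBP) feature can be computed as follows; this is a minimal variant for illustration, not the patent's implementation.

```python
import numpy as np

def lbp_histogram(image):
    """Simplified 8-neighbour LBP: each interior pixel is encoded by
    comparing it with its 8 neighbours, and a normalized 256-bin
    histogram of the codes is returned as the feature vector."""
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.int32)
    # Clockwise neighbour offsets starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[1:h-1, 1:w-1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1+dy:h-1+dy, 1+dx:w-1+dx]
        codes |= (neighbour >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256)
    return hist / codes.size
```

On a constant image every neighbour comparison succeeds, so all codes equal 255 and the histogram places all mass in that bin.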
In another possible implementation, feature extraction may be performed on each image sample by a trained feature extraction module.
Step S130, training the plurality of image classification modules respectively by taking the characteristic image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data to obtain a plurality of trained image classification modules.
The plurality of image classification modules are trained respectively through a machine learning algorithm, with the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data, to obtain a plurality of trained image classification modules. The machine learning algorithm comprises the algorithms corresponding to the plurality of image classification modules.
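As an illustration of this per-module training step (the patent does not fix a particular machine learning algorithm), each image classification module can be trained independently on its own scene's feature/label pairs; here a simple nearest-centroid classifier stands in for each module.

```python
import numpy as np

def train_module(features, labels):
    """Train one image classification module as a nearest-centroid
    classifier: one mean feature vector per sample classification label."""
    centroids = {lab: np.mean([f for f, l in zip(features, labels) if l == lab], axis=0)
                 for lab in set(labels)}
    def predict(feature):
        # Return the label whose centroid is closest to the feature.
        return min(centroids, key=lambda lab: np.linalg.norm(feature - centroids[lab]))
    return predict

def train_all_modules(per_scene_data):
    # per_scene_data: {scene: (features, labels)} -> one trained module per scene.
    return {scene: train_module(f, l) for scene, (f, l) in per_scene_data.items()}
```

Each module only ever sees its own scene's samples, mirroring the independence between modules described in this embodiment.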
In some embodiments, the plurality of trained image classification modules may be stored locally on the electronic device after pre-training is completed.
In some embodiments, the plurality of trained image classification modules may also be stored, after being trained in advance, in a server in communication with the electronic device.
Step S140, obtaining a trained image classification model based on a plurality of trained image classification modules.
The obtained trained image classification model is shown in fig. 4.
In the training method for an image classification model provided by this embodiment, the image classification model includes a plurality of image classification modules, each image classification module corresponding to a different image classification scene. A training data set is obtained, where the training data set includes a plurality of image samples and a sample classification label corresponding to each image sample; feature extraction is performed on each image sample to obtain a feature image corresponding to each image sample; the plurality of image classification modules are trained respectively, with the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data, to obtain a plurality of trained image classification modules; and a trained image classification model is obtained based on the plurality of trained image classification modules. Because each trained image classification module is dedicated to one image classification scene, each image classification module in the trained image classification model can classify the images of its corresponding scene accurately and efficiently.
On the basis of the foregoing embodiment, as shown in fig. 4, the image classification model further includes a feature extraction module, through which feature extraction can be performed on the image samples. This embodiment provides a corresponding training method for the image classification model. Fig. 7 shows a flowchart of a training method of an image classification model according to another embodiment of the present application. Referring to fig. 7, the training method may further include the following steps:
step S210, a training data set is obtained, where the training data set includes a plurality of image samples and a sample classification label corresponding to each image sample.
For detailed description of step S210, please refer to step S110, which is not described herein again.
Step S220, inputting each image sample into the trained feature extraction module, and obtaining a feature image corresponding to each image sample output by the trained feature extraction module.
In some implementations, the trained feature extraction module may be stored locally on the electronic device after pre-training is completed.
In some embodiments, the trained feature extraction module may also be stored in a server communicatively coupled to the electronic device after pre-training.
Step S230, training the plurality of image classification modules respectively by using the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data, so as to obtain a plurality of trained image classification modules.
For detailed description of step S230, please refer to step S130, which is not described herein again.
Step S240, obtaining a trained image classification model based on the trained feature extraction module and the trained image classification modules.
In this embodiment, as shown in fig. 4, after the feature extraction module and the plurality of image classification modules in fig. 4 are both trained, the trained feature extraction module and the plurality of trained image classification modules obtain a trained image classification model.
Before using the feature image corresponding to each image sample output by the trained feature extraction module, the feature extraction module needs to be trained, fig. 8 shows a flowchart of a training method of an image classification model according to another embodiment of the present application, please refer to fig. 8, where the training method of the image classification model may specifically include the following steps:
step S310, a training data set is obtained, wherein the training data set comprises a plurality of image samples, a sample classification label corresponding to each image sample, and a sample characteristic image corresponding to each image sample.
For detailed description of step S310, please refer to step S110, which is not described herein again.
Step S320, taking each image sample as input data and the feature image corresponding to each image sample as output data, training the feature extraction module to obtain the trained feature extraction module.
In this embodiment, each image sample is used as input data, a feature image corresponding to each image sample is used as output data, and the feature extraction module is trained through a machine learning algorithm to obtain a trained feature extraction module. The machine learning algorithm further comprises an algorithm corresponding to the feature extraction module.
And step S330, inputting each image sample into the trained feature extraction module to obtain a feature image corresponding to each image sample output by the trained feature extraction module.
For the detailed description of step S330, please refer to step S220, which is not described herein again.
Step S340, taking the feature image corresponding to each image sample as input data, taking the sample classification label corresponding to each image sample as output data, and respectively training the plurality of image classification modules to obtain a plurality of trained image classification modules.
For detailed description of step S340, please refer to step S130, which is not described herein again.
And step S350, obtaining a trained image classification model based on the trained feature extraction module and the trained image classification modules.
For the detailed description of step S350, please refer to step S240, which is not described herein again.
Because pictures involve rich scenes, image classification may cover a plurality of scenes. In the image classification model, one image classification module is trained for each scene, and each image classification module is used to accurately classify the images corresponding to its scene. This embodiment provides a training method for training each classification module in the image classification model. Fig. 9 shows a flowchart of a training method of an image classification model according to another embodiment of the present application. Referring to fig. 9, the training method may specifically include the following steps:
step S410, a training data set is obtained, wherein the training data set comprises a plurality of image samples and sample classification labels corresponding to the image samples.
For detailed description of step S410, please refer to step S110, which is not described herein again.
And step S420, performing feature extraction on each image sample to obtain a feature image corresponding to each image sample.
For detailed description of step S420, please refer to step S120, which is not described herein.
Step S430, performing scene classification on the feature images corresponding to each image sample to obtain a plurality of scene categories, where each scene category includes at least one feature image.
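Step S430 can be pictured as grouping the (feature image, sample label) pairs by scene category. The sketch below assumes each sample's scene category is already known per sample, which the patent does not mandate; it is one possible realization.

```python
from collections import defaultdict

def group_by_scene(samples):
    """samples: iterable of (feature_image, sample_label, scene) triples.
    Returns {scene_category: [(feature_image, sample_label), ...]},
    so each scene category holds at least one feature image."""
    scenes = defaultdict(list)
    for feature, label, scene in samples:
        scenes[scene].append((feature, label))
    return dict(scenes)
```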
Following the above embodiments, the feature images corresponding to the image samples are subjected to scene classification, and the resulting scene categories may be a garbage classification scene, a commodity classification scene, and a user photo cleaning scene.
The scene types are not limited to the three scene types described above, and may be other scene types according to actual situations.
Step S440, determining any scene category from the multiple scene categories as a target scene category, and acquiring a feature image included in the target scene category as a target feature image.
For example, a garbage classification scene is used as a target scene category, and an image of garbage corresponding to the garbage classification scene is used as a target feature image.
Step S450, taking the target characteristic image as input data, taking the target sample classification label as output data, training the image classification module corresponding to the target scene category, and obtaining the trained target image classification module, wherein the target sample classification label is the sample classification label corresponding to the target characteristic image sample.
In some embodiments, training the target image classification module comprises: inputting the target feature images (for example, n1 target feature images) as input data into an initial image classification module to obtain predicted output data, which may be an n1 × 1 matrix in which each target feature image corresponds to one element; taking the target sample classification labels as the real output data; calculating a loss function from the predicted output data and the real output data, where the loss function measures the difference between the two: the larger the loss function, the larger the difference between the predicted output data and the real output data; conversely, the smaller the loss function, the smaller the difference and the closer the predicted output data is to the real output data; and adjusting the parameters of the initial image classification module according to the loss function to obtain the trained target image classification module. The goal of training the target image classification module is to minimize the loss function, so that the predicted output data of the trained target image classification module is closer to the real output data.
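The loss-driven training loop described above can be sketched with a linear softmax module and cross-entropy loss as stand-ins; the patent does not specify the module architecture, loss function, or optimizer, so everything below is an illustrative assumption.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_target_module(X, y, n_classes, lr=0.5, epochs=200):
    """Minimize the cross-entropy between predicted output data and
    real output data by gradient descent on a single weight matrix."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]                      # one-hot real output data
    for _ in range(epochs):
        P = softmax(X @ W)                        # predicted output data
        loss = -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))
        W -= lr * X.T @ (P - Y) / len(X)          # step toward smaller loss
    return W, loss
```

On linearly separable toy features the loss shrinks over the epochs and the module's predictions match the real labels.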
When the garbage classification scene is selected as the target scene category, the garbage images corresponding to the garbage classification scene are used as target feature images. The target feature images comprise wet garbage images, dry garbage images, recyclable garbage images, and harmful garbage images, whose corresponding target sample classification labels are the wet garbage, dry garbage, recyclable garbage, and harmful garbage classification labels, respectively. With these images as input data and the corresponding classification labels as output data, the image classification module corresponding to the garbage classification scene is trained to obtain the trained target image classification module.
When the image classification module corresponding to the garbage classification scene has been trained, the image classification modules corresponding to the commodity classification scene and the user photo cleaning scene are trained in the same manner.
In this embodiment, the scene category corresponding to each image sample is fully utilized: the image classification modules obtained for the different scenes do not affect one another, each image classification module learns the image samples of its own scene more effectively, and accurate classification by each image classification module is ensured.
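The per-scene training described in this embodiment can be outlined as follows. The dict-based scene layout and the `train_classification_fn` callback are assumptions made for illustration; the point is only that each module is trained on its own scene's samples, independently of the others.

```python
def train_all_scene_modules(scene_data, train_classification_fn):
    # scene_data: {scene_name: (feature_images, sample_labels)}
    # Each module is trained only on its own scene's samples, so the
    # trained modules do not affect one another.
    return {
        scene: train_classification_fn(features, labels)
        for scene, (features, labels) in scene_data.items()
    }
```

For example, the garbage classification, commodity classification, and user photo cleaning scenes would each appear as one entry in `scene_data`, and the returned dict would hold one trained module per scene.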
In other embodiments, training the target image classification module with the full amount of data (i.e., all of the image samples in the training data set, e.g., N samples) includes: labeling the sample classification labels of all images other than the target characteristic images (e.g., N1 images) as an "other" label; inputting the N image samples into the initial image classification module as input data, and obtaining predicted output data output by the initial image classification module, wherein the predicted output data may be an (N1+1) × 1 matrix in which the target characteristic images correspond to the first N1 elements and the remaining image samples (N − N1 images) correspond to the last element; selecting the target sample classification labels and the "other" label as real output data; calculating a loss function from the predicted output data and the real output data, wherein a larger loss function indicates a larger difference between the predicted output data and the real output data, and a smaller loss function indicates that the predicted output data is closer to the real output data; and adjusting parameters of the initial image classification module according to the loss function to obtain the trained target image classification module.
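The relabeling step of this full-data variant can be sketched as follows: every label outside the target scene collapses into a single "other" class, giving the module N1 + 1 output classes. The function name and the list-based label representation are assumptions for illustration.

```python
def relabel_with_other(sample_labels, target_labels):
    # target_labels: the N1 labels belonging to the target scene, in a
    # fixed order; they map to class indices 0..N1-1. Every other label
    # maps to the single "other" class with index N1, for N1 + 1 classes.
    index = {label: i for i, label in enumerate(target_labels)}
    other_class = len(target_labels)
    return [index.get(label, other_class) for label in sample_labels]
```

After this step, all N image samples can be fed to the same training routine used for the single-scene case, with the class count increased by one.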
In this embodiment, the target image classification module is trained with the full amount of data, so that the module is exposed to both the target characteristic images and the other characteristic images in the training data set, ensuring the classification accuracy of the target image classification module.
Step S460, obtaining a trained image classification model based on the trained image classification modules.
For the detailed description of step S460, please refer to step S140, which is not described herein again.
The above embodiments provide a training method for an image classification model, and the trained image classification model can be used to classify an image to be classified. Fig. 10 shows a flowchart of a training method of an image classification model according to another embodiment of the present application. Referring to fig. 10, the training method of an image classification model may further include the following steps:
step S510, a training data set is obtained, where the training data set includes a plurality of image samples and a sample classification label corresponding to each image sample.
And step S520, performing feature extraction on each image sample to obtain a feature image corresponding to each image sample.
Step S530, training the plurality of image classification modules respectively by taking the characteristic image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data to obtain a plurality of trained image classification modules.
And S540, obtaining a trained image classification model based on a plurality of trained image classification modules.
For detailed descriptions of steps S510 to S540, refer to steps S110 to S140, which are not described herein again.
And step S550, acquiring an image to be classified.
The image to be classified may be a preview image collected by a camera of the electronic device, a photograph taken by the camera of the electronic device and stored in an album, an image downloaded from a network and stored in the album, or the like, which is not limited herein.
And step S560, inputting the image to be classified into the trained image classification model, and obtaining the classification label corresponding to the image to be classified output by the trained image classification model.
In some embodiments, the trained image classification model may be stored locally on the electronic device after pre-training is completed. Based on this, after the electronic device acquires the image to be classified, it can invoke the trained image classification model directly from local storage: for example, it can send an instruction to the trained image classification model instructing it to read the image to be classified from a target storage area, or it can directly input the image to be classified into the locally stored trained image classification model. This avoids the slowdown that network factors would otherwise cause when feeding the image to be classified into the trained image classification model, improving the user experience.
In some embodiments, the trained image classification model may instead be stored, after pre-training is completed, in a server communicatively coupled to the electronic device. Based on this, after the electronic device acquires the image to be classified, it can send an instruction over the network to the trained image classification model stored in the server, instructing the model to read the image to be classified acquired by the electronic device, or the electronic device can send the image to be classified to the trained image classification model in the server over the network. Storing the trained image classification model in the server reduces the occupation of the electronic device's storage space and reduces the impact on its normal operation.
For example, when the image to be classified is a wet garbage image, the wet garbage image is input into the trained image classification model, and the model outputs the corresponding classification label, namely the wet garbage label. This output makes it convenient for the user to sort garbage correctly.
For another example, when the image to be classified is a commodity image, the commodity image is input into the trained image classification model, and the model outputs the classification label corresponding to the commodity image. A shopping platform can then categorize the commodity according to this classification label.
As another example, when the image to be classified is a photo from the user's mobile phone, the photo is input into the trained image classification model, and the model outputs the corresponding classification label. Here the classification label may be "clear image" or "blurry image"; photos labeled blurry can then be deleted in a batch, allowing the user to remove all blurry photos from the phone at once and saving the time spent sorting photos.
The trained image classification model includes a trained feature extraction module and a plurality of trained image classification modules. Fig. 11 shows a flowchart of the substeps of step S560. Referring to fig. 11, step S560 includes the following substeps:
Substep S561, inputting the image to be classified into the trained feature extraction module, and obtaining the feature image corresponding to the image to be classified output by the trained feature extraction module.
Substep S562, inputting the feature image corresponding to the image to be classified into the plurality of trained image classification modules respectively, and obtaining the classification labels corresponding to the image to be classified output by the trained image classification modules respectively.
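Substeps S561 and S562 can be sketched together: the image passes once through the shared feature extraction module, and the resulting feature image is then fed to each trained image classification module. The class name and the use of plain callables here are illustrative assumptions, not the application's actual interface.

```python
class TrainedImageClassificationModel:
    def __init__(self, feature_extractor, classification_modules):
        # feature_extractor: callable, image -> feature image (substep S561)
        # classification_modules: {scene_name: callable, feature -> label}
        self.feature_extractor = feature_extractor
        self.classification_modules = classification_modules

    def classify(self, image):
        feature = self.feature_extractor(image)          # substep S561
        return {scene: module(feature)                   # substep S562
                for scene, module in self.classification_modules.items()}
```

Because the feature image is computed once and shared, adding a new scene only requires plugging in one more classification module.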
To implement the above method embodiments, this embodiment provides an apparatus for training an image classification model. Fig. 12 shows a block diagram of an apparatus for training an image classification model according to an embodiment of the present application. Referring to fig. 12, the apparatus 600 for training an image classification model includes: an acquisition module 610, a feature extraction module 620, a training module 630, and an image classification model obtaining module 640.
The obtaining module 610 is configured to obtain a training data set, where the training data set includes a plurality of image samples and a sample classification label corresponding to each image sample.
And the feature extraction module 620 is configured to perform feature extraction on each image sample to obtain a feature image corresponding to each image sample.
The training module 630 is configured to train the plurality of image classification modules respectively by using the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data, so as to obtain a plurality of trained image classification modules.
An image classification model obtaining module 640, configured to obtain a trained image classification model based on a plurality of trained image classification modules.
Optionally, the image classification model obtaining module 640 includes: a first sub-image classification model obtaining module. And the first sub-image classification model obtaining module is used for obtaining a trained image classification model based on the trained feature extraction module and the trained image classification modules.
Optionally, the feature extraction module 620 includes: and a sub-feature extraction module. And the sub-feature extraction module is used for inputting each image sample into the trained feature extraction module to obtain a feature image corresponding to each image sample output by the trained feature extraction module.
Optionally, the training data set further includes a sample feature image corresponding to each image sample, and the training apparatus 600 for image classification model further includes: and a feature extraction training module. And the feature extraction training module is used for training the feature extraction module by taking each image sample as input data and the feature image corresponding to each image sample as output data to obtain the trained feature extraction module.
Optionally, the training apparatus 600 for image classification models further includes: the device comprises a scene classification module and a determination module. The training module 630 includes: and a sub-training module.
And the scene classification module is used for carrying out scene classification on the characteristic images corresponding to the image samples to obtain a plurality of scene categories, wherein each scene category comprises at least one characteristic image.
The determining module is used for determining any scene category from the plurality of scene categories as a target scene category and acquiring a characteristic image included in the target scene category as a target characteristic image.
And the sub-training module is used for training the image classification module corresponding to the target scene category by taking the target characteristic image as input data and the target sample classification label as output data to obtain the trained target image classification module, wherein the target sample classification label is the sample classification label corresponding to the target characteristic image sample.
Optionally, the training apparatus 600 for image classification models further includes: the device comprises an image to be classified acquisition module and an output module.
And the image to be classified acquiring module is used for acquiring the image to be classified.
And the output module is used for inputting the image to be classified into the trained image classification model and obtaining the classification label corresponding to the image to be classified output by the trained image classification model.
Further, the trained image classification model comprises a trained feature extraction module and a plurality of trained image classification modules, and the output module comprises: a characteristic image output module and a classification label output module.
The characteristic image output module is used for inputting the images to be classified into the trained characteristic extraction module to obtain characteristic images corresponding to the images to be classified output by the trained characteristic extraction module;
and the classification label output module is used for respectively inputting the characteristic images corresponding to the images to be classified into the trained image classification modules and respectively acquiring the classification labels corresponding to the images to be classified output by the trained image classification modules.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or other type of coupling.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Fig. 13 shows a block diagram of an electronic device 700 for performing a training method of an image classification model according to an embodiment of the present application. The electronic device 700 may be a smart phone, a tablet computer, an electronic book reader, or another electronic device capable of running applications. The electronic device 700 in the present application may include one or more of the following components: a processor 710, a memory 720, and one or more applications, wherein the one or more applications may be stored in the memory 720 and configured to be executed by the one or more processors 710, the one or more programs being configured to perform the methods described in the foregoing method embodiments.
Processor 710 may include one or more processing cores. The processor 710 connects the various parts of the electronic device 700 using various interfaces and lines, and performs the various functions of the electronic device 700 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 720 and by invoking data stored in the memory 720. Optionally, the processor 710 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 710 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like, where the CPU mainly handles the operating system, user interface, and application programs; the GPU renders and draws display content; and the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 710 and may instead be implemented by a separate communication chip.
The memory 720 may include Random Access Memory (RAM) or Read-Only Memory (ROM). The memory 720 may be used to store instructions, programs, code sets, or instruction sets. The memory 720 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image display function), instructions for implementing the various method embodiments described herein, and the like. The data storage area may store data created by the electronic device 700 during use (e.g., phone book, audio and video data, chat log data), and the like.
Fig. 14 shows a block diagram of a computer-readable storage medium provided by an embodiment of the present application for storing or carrying program code that implements a training method of an image classification model. The computer-readable medium 800 stores program code that can be invoked by a processor to execute the methods described in the above method embodiments.
The computer-readable storage medium 800 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 800 includes a non-volatile computer-readable storage medium. The computer readable storage medium 800 has storage space for program code 810 to perform any of the method steps of the method described above. The program code can be read from or written to one or more computer program products. The program code 810 may be compressed, for example, in a suitable form.
To sum up, the image classification model provided by the present application includes a plurality of image classification modules, each corresponding to a different image classification scene. The training method acquires a training data set, where the training data set includes a plurality of image samples and a sample classification label corresponding to each image sample; performs feature extraction on each image sample to obtain a feature image corresponding to each image sample; trains the plurality of image classification modules respectively, taking the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data, to obtain a plurality of trained image classification modules; and obtains a trained image classification model based on the plurality of trained image classification modules. Because each trained image classification module handles a single image classification scene, each image classification module in the trained image classification model can accurately and efficiently classify the images corresponding to its scene.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A training method for an image classification model, wherein the image classification model comprises a plurality of image classification modules, each image classification module corresponds to a different image classification scene, and the method comprises the following steps:
acquiring a training data set, wherein the training data set comprises a plurality of image samples and a sample classification label corresponding to each image sample;
performing feature extraction on each image sample to obtain a feature image corresponding to each image sample;
respectively training the plurality of image classification modules by taking the characteristic image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data to obtain a plurality of trained image classification modules;
obtaining a trained image classification model based on the plurality of trained image classification modules.
2. The method of claim 1, wherein the trained image classification model further comprises a trained feature extraction module, and wherein obtaining the trained image classification model based on the plurality of trained image classification modules comprises:
obtaining the trained image classification model based on the trained feature extraction module and the plurality of trained image classification modules.
3. The method according to claim 2, wherein the performing feature extraction on each image sample to obtain a feature image corresponding to each image sample comprises:
and inputting each image sample into the trained feature extraction module to obtain a feature image corresponding to each image sample output by the trained feature extraction module.
4. The method according to claim 3, wherein the training data set further includes a sample feature image corresponding to each of the image samples, and before inputting each of the image samples into the trained feature extraction module and obtaining the feature image corresponding to each of the image samples output by the trained feature extraction module, further includes:
and taking each image sample as input data, taking a characteristic image corresponding to each image sample as output data, and training a characteristic extraction module to obtain the trained characteristic extraction module.
5. The method according to claim 1, wherein after performing feature extraction on each of the image samples to obtain a feature image corresponding to each of the image samples, the method further comprises:
and carrying out scene classification on the characteristic images corresponding to the image samples to obtain a plurality of scene categories, wherein each scene category comprises at least one characteristic image.
Determining any scene category from the plurality of scene categories as a target scene category, and acquiring a characteristic image included in the target scene category as a target characteristic image;
the training the plurality of image classification modules by using the feature image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data to obtain a plurality of trained image classification modules comprises:
and taking the target characteristic image as input data, taking a target sample classification label as output data, training an image classification module corresponding to the target scene class, and obtaining a trained target image classification module, wherein the target sample classification label is a sample classification label corresponding to the target characteristic image sample.
6. The method according to any one of claims 1-5, wherein after obtaining the trained image classification model based on the plurality of trained image classification modules, further comprising:
acquiring an image to be classified;
and inputting the image to be classified into the trained image classification model, and obtaining a classification label corresponding to the image to be classified output by the trained image classification model.
7. The method according to claim 6, wherein the trained image classification model includes a trained feature extraction module and a plurality of trained image classification modules, and the inputting the image to be classified into the trained image classification model and obtaining the classification label corresponding to the image to be classified output by the trained image classification model comprises:
inputting the image to be classified into the trained feature extraction module to obtain a feature image corresponding to the image to be classified output by the trained feature extraction module;
and respectively inputting the characteristic images corresponding to the images to be classified into the trained image classification modules, and respectively acquiring the classification labels corresponding to the images to be classified output by the trained image classification modules.
8. An apparatus for training an image classification model, comprising:
the acquisition module is used for acquiring a training data set, wherein the training data set comprises a plurality of image samples and sample classification labels corresponding to the image samples;
the characteristic extraction module is used for extracting the characteristics of each image sample to obtain a characteristic image corresponding to each image sample;
the training module is used for training the plurality of image classification modules respectively by taking the characteristic image corresponding to each image sample as input data and the sample classification label corresponding to each image sample as output data to obtain a plurality of trained image classification modules;
and the image classification model obtaining module is used for obtaining a trained image classification model based on the trained image classification modules.
9. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-7.
10. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 7.
CN202010844672.6A 2020-08-20 2020-08-20 Training method and device for image classification model, electronic equipment and storage medium Withdrawn CN111814913A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010844672.6A CN111814913A (en) 2020-08-20 2020-08-20 Training method and device for image classification model, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010844672.6A CN111814913A (en) 2020-08-20 2020-08-20 Training method and device for image classification model, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111814913A true CN111814913A (en) 2020-10-23

Family

ID=72859548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010844672.6A Withdrawn CN111814913A (en) 2020-08-20 2020-08-20 Training method and device for image classification model, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111814913A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304936A (en) * 2017-07-12 2018-07-20 腾讯科技(深圳)有限公司 Machine learning model training method and device, facial expression image sorting technique and device
CN109934249A (en) * 2018-12-14 2019-06-25 网易(杭州)网络有限公司 Data processing method, device, medium and calculating equipment
US20200242415A1 (en) * 2019-01-30 2020-07-30 Coretronic Corporation Training method of neural network and classification method based on neural network and device thereof
CN111340131A (en) * 2020-03-09 2020-06-26 北京字节跳动网络技术有限公司 Image annotation method and device, readable medium and electronic equipment

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204990A (en) * 2021-03-22 2021-08-03 深圳市众凌汇科技有限公司 Machine learning method and device based on intelligent fishing rod
CN113204990B (en) * 2021-03-22 2022-01-14 深圳市众凌汇科技有限公司 Machine learning method and device based on intelligent fishing rod
CN113222043A (en) * 2021-05-25 2021-08-06 北京有竹居网络技术有限公司 Image classification method, device, equipment and storage medium
CN113240032A (en) * 2021-05-25 2021-08-10 北京有竹居网络技术有限公司 Image classification method, device, equipment and storage medium
CN113240032B (en) * 2021-05-25 2024-01-30 北京有竹居网络技术有限公司 Image classification method, device, equipment and storage medium
CN113222043B (en) * 2021-05-25 2024-02-02 北京有竹居网络技术有限公司 Image classification method, device, equipment and storage medium
CN115797732A (en) * 2023-02-15 2023-03-14 杭州实在智能科技有限公司 Image retrieval model training method and system used in open category scene


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201023