WO2023280229A1

WO2023280229A1 - Image processing method, electronic device, and storage medium

Info

Publication number: WO2023280229A1
Application number: PCT/CN2022/104184
Authority: WO
Inventors: 张连文; 谢伟雁; 林志; 董冠方; 李小慧
Original assignee: 华为技术有限公司
Priority date: 2021-07-07
Filing date: 2022-07-06
Publication date: 2023-01-12
Also published as: CN115661502A

Abstract

The present application relates to the field of image processing, and in particular to an image processing method, an electronic device, and a storage medium. The method comprises: obtaining a dataset and a model pool, the dataset comprising a plurality of sample images, the model pool comprising a plurality of pretrained first image classification models; respectively inputting the dataset into each of the first image classification models of the model pool to obtain predicted distribution results respectively corresponding to the plurality of first image classification models, each predicted distribution result being used for indicating probability distribution of each sample image on a plurality of image tags corresponding to the dataset; and according to the plurality of predicted distribution results, determining sample classification difficulty corresponding to at least one sample image in the dataset, the sample classification difficulty being used for indicating the difficulty of classification of the sample image by the model. The method provided by embodiments of the present application can automatically evaluate sample classification difficulty without manual participation, can be applied to large-scale data, and provides the basic capability for subsequent applications such as classification design and data verification.

Description

Image processing method, electronic device and storage medium

This application claims priority to Chinese patent application No. 202110767141.6, entitled "Image processing method, electronic device and storage medium" and filed with the China Patent Office on July 7, 2021, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of image processing, and in particular to an image processing method, an electronic device, and a storage medium.

Background

With the development of artificial intelligence technology and image processing technology, image classification processing is increasingly based on artificial intelligence models. In order to realize image classification, related technologies generally train an image classification model through a large number of sample images, so as to call the trained image classification model to perform image classification processing.

However, different sample images contain different amounts of information, so different sample images are learned by the image classification model with different degrees of difficulty. In the related art, the sample images are not differentiated, and the image classification model is trained directly on samples with unbalanced training difficulty. As a result, the knowledge learned by the image classification model is limited and its final classification ability is poor.

Summary of the invention

In view of this, the embodiments of the present application propose an image processing method, an electronic device, and a storage medium, which can solve the problem in the related art that the unbalanced training difficulty of sample images degrades the training effect of an image classification model.

In a first aspect, an embodiment of the present application provides an image processing method, the method comprising:

Obtain a data set and a model pool, the data set includes a plurality of sample images, and the model pool includes a plurality of pre-trained first image classification models;

Input the data set into each first image classification model in the model pool respectively, to obtain prediction distribution results respectively corresponding to the plurality of first image classification models, each prediction distribution result being used to indicate the probability distribution of each sample image over a plurality of image labels corresponding to the data set;

A sample classification difficulty corresponding to at least one sample image in the data set is determined according to the plurality of prediction distribution results, and the sample classification difficulty is used to indicate the degree of difficulty for the sample image to be classified by a model.

In this implementation, a data set and a model pool are obtained, the data set is input into each first image classification model in the model pool to obtain prediction distribution results respectively corresponding to the plurality of first image classification models, and the sample classification difficulty corresponding to at least one sample image in the data set is determined according to the plurality of prediction distribution results. An image classification model can subsequently be trained based on the sample classification difficulty corresponding to the sample images, so as to solve the problem of poor model training effect caused by the unbalanced training difficulty of the sample images. The method can be applied to large-scale data and provides basic capabilities for subsequent applications such as the design of new classification methods and data verification.

In a possible implementation manner, the determining the sample classification difficulty corresponding to at least one sample image in the data set according to the multiple prediction distribution results includes:

For each of the at least one sample image, determining the confusion degree (perplexity) of the sample image based on the plurality of prediction distribution results and the number of first image classification models in the model pool, the confusion degree being used to indicate the sample classification difficulty corresponding to the sample image.

In this implementation, the confusion degree is introduced as a measure of sample classification difficulty, so that sample classification difficulty can be evaluated on both labeled and unlabeled data, which further supports subsequent applications such as the design of new classification methods and data verification.

In another possible implementation manner, the confusion degree of the sample image is positively correlated with the entropy of the prediction distribution result of the sample image corresponding to each of the first image classification models, and is negatively correlated with the number of first image classification models. The entropy of the prediction distribution result corresponding to a first image classification model is determined according to the probabilities, output by that first image classification model, of the sample image on each of the image labels.

In this implementation, a possible implementation is provided for calculating the perplexity of sample images, which further ensures the feasibility and reliability of automatically evaluating the difficulty of sample classification.
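The relationship described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the confusion degree is the sum of the per-model Shannon entropies divided by the number of models; the text only states the direction of the two correlations, so this specific formula, the function names `prediction_entropy` and `sample_confusion`, and the array layout `(n_models, n_samples, n_labels)` are all assumptions rather than the formula of the application.

```python
import numpy as np


def prediction_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) along the last axis of a probability array."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0) for one-hot predictions
    return -np.sum(probs * np.log(probs), axis=-1)


def sample_confusion(pool_probs):
    """Assumed confusion degree: per-sample entropy summed over models,
    divided by the number of models.

    pool_probs: array of shape (n_models, n_samples, n_labels), where
    pool_probs[m, s] is the distribution model m predicts for sample s.
    Returns an array of shape (n_samples,).
    """
    pool_probs = np.asarray(pool_probs, dtype=float)
    n_models = pool_probs.shape[0]
    entropies = prediction_entropy(pool_probs)  # shape (n_models, n_samples)
    return entropies.sum(axis=0) / n_models
```

Under this assumed form, a sample on which every model outputs a near one-hot distribution scores close to zero, while a sample on which every model outputs a uniform distribution scores the logarithm of the number of labels. No ground-truth label is needed, which is consistent with the statement that the evaluation can run on unlabeled data.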

In another possible implementation, the method further includes:

According to the plurality of prediction distribution results, determine the label discrimination difficulty between any two image labels among the plurality of image labels, the label discrimination difficulty being used to indicate how difficult it is for a model to distinguish the two image labels.

In this implementation, the label discrimination difficulty between any two image labels among the plurality of image labels is determined according to the plurality of prediction distribution results. The label discrimination difficulty measures how difficult it is for a model to distinguish the two image labels. An image classification model can subsequently be trained based on the label discrimination difficulty, which improves the classification effect of the image classification model. Moreover, this method can be applied to large-scale data, providing basic capabilities for subsequent applications such as the design of new classification methods and data verification.

In another possible implementation manner, the determining the label discrimination difficulty between any two image labels among the multiple image labels according to the multiple prediction distribution results includes:

For any two image labels among the plurality of image labels, determining a label confusion index between the two image labels according to the plurality of prediction distribution results and the number of first image classification models in the model pool, the label confusion index being used to indicate the label discrimination difficulty between the two image labels.

In this implementation, the label confusion index is introduced as a measure of label discrimination difficulty, which evaluates the label discrimination difficulty effectively and efficiently and further supports subsequent applications such as the design of new classification methods and data verification.

In another possible implementation manner, the label confusion index between any two image labels is determined according to the similarity of the probability distribution of the data set on the two image labels.

In this implementation, a possible implementation is provided for calculating the label confusion index between any two image labels, which further ensures the feasibility and reliability of automatically evaluating the difficulty of label distinction.
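As a hedged illustration of a similarity-based label confusion index, the sketch below compares, for each model, the vector of probabilities that the data set receives on one label with the vector it receives on the other label, and averages the result over the models. The use of cosine similarity, the function name `label_confusion_index`, and the array layout `(n_models, n_samples, n_labels)` are assumptions; the text only says that the index depends on the similarity of the two probability distributions and on the number of models.

```python
import numpy as np


def label_confusion_index(pool_probs, label_a, label_b):
    """Assumed label confusion index: cosine similarity between the
    per-sample probabilities of two labels, averaged over the model pool.

    pool_probs: array of shape (n_models, n_samples, n_labels).
    Returns a float in [0, 1] for non-negative probabilities.
    """
    pool_probs = np.asarray(pool_probs, dtype=float)
    a = pool_probs[:, :, label_a]  # shape (n_models, n_samples)
    b = pool_probs[:, :, label_b]
    dot = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    # A model that never assigns mass to one of the labels contributes 0.
    per_model = np.divide(dot, norms, out=np.zeros_like(dot), where=norms > 0)
    return float(per_model.mean())
```

With this choice, two labels that always receive the same probability on every sample score 1, and two labels that never receive probability mass on the same sample score 0.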

In another possible implementation, before the data set and the model pool are obtained, the method further includes:

training the original second image classification model according to the training set to obtain the first image classification model, the training set including a plurality of training images;

Wherein, at least two of the first image classification models in the model pool have different model training parameters, and the model training parameters include at least one of the training set, the type of the second image classification model, and the model training duration.

In this implementation, the model pool is constructed as the basis for difficulty assessment, which avoids the need for human subjects as in the related art and enables large-scale automated assessment. At least two first image classification models in the model pool have different model training parameters, which makes the model pool more representative; the distributions obtained using the model pool avoid the bias introduced by any single model, ensuring that the final calculation result is not affected by model selection bias.

In another possible implementation, the method further includes:

Filtering the data set according to the sample classification difficulty corresponding to each of the plurality of sample images in the data set and a preset sample classification difficulty threshold;

For each sample image in the filtered data set, the image label of the sample image is corrected according to the target image label output by the largest proportion of the models in the model pool.

In this implementation, the data set is filtered according to the sample classification difficulty corresponding to each of the plurality of sample images in the data set and the preset sample classification difficulty threshold; for each sample image in the filtered data set, the image label of the sample image is corrected according to the target image label output by the largest proportion of the models in the model pool. In this way, after the sample classification difficulty is automatically evaluated, mislabeled data in the data set can be verified and corrected according to the evaluated sample classification difficulty.
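The filtering and correction steps can be sketched as follows. This is only an illustration under stated assumptions: the text does not say on which side of the threshold samples are kept, so the sketch assumes that samples whose difficulty is at or below the threshold are retained; the function name `correct_labels` and the array layout `(n_models, n_samples, n_labels)` are likewise hypothetical.

```python
import numpy as np


def correct_labels(pool_probs, labels, difficulty, threshold):
    """Filter samples by difficulty, then relabel the kept samples with the
    label predicted by the largest share of models (majority vote).

    Assumption: samples with difficulty <= threshold are kept.
    Returns (corrected_labels, keep_mask).
    """
    pool_probs = np.asarray(pool_probs, dtype=float)
    labels = np.asarray(labels)
    difficulty = np.asarray(difficulty, dtype=float)
    n_models, n_samples, n_labels = pool_probs.shape

    votes = pool_probs.argmax(axis=2)  # top-1 label per model and sample
    counts = np.zeros((n_labels, n_samples), dtype=int)
    for model_votes in votes:
        counts[model_votes, np.arange(n_samples)] += 1
    majority = counts.argmax(axis=0)  # most frequently predicted label

    keep = difficulty <= threshold
    corrected = labels.copy()
    corrected[keep] = majority[keep]
    return corrected, keep
```

Ties in the vote are resolved here in favor of the lowest label index, which is a side effect of `argmax` and not something the text specifies.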

In another possible implementation, the method further includes:

Constructing a third image classification model according to the label discrimination difficulty between any two of the image tags in the data set, the third image classification model is a multi-level classification model;

training the third image classification model according to the training set to obtain a fourth image classification model;

When the target image to be classified is obtained, the fourth image classification model is invoked to perform classification processing on the target image to obtain a classification result of the target image.

In this implementation, after the label discrimination difficulty is automatically evaluated, a third image classification model can be constructed according to the label discrimination difficulty, and the third image classification model can be trained according to the training set to obtain a fourth image classification model with higher accuracy. When the fourth image classification model is called to perform inference, the average model complexity can be reduced compared with the original complex large model, so inference can be accelerated and deployment on terminals is facilitated.

In a second aspect, the embodiments of the present application provide an image processing apparatus. The apparatus includes at least one unit, and the at least one unit is configured to implement the image processing method provided in the first aspect or any possible implementation of the first aspect.
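One way a multi-level classifier could be derived from pairwise label discrimination difficulty is to group labels that are hard to tell apart, so that a first level separates the groups and a second level separates the labels inside each group. The sketch below shows only such a grouping step and is an assumption throughout: the text does not describe how the third image classification model is constructed, and the function name `group_confusable_labels`, the threshold, and the union-find grouping are hypothetical choices.

```python
def group_confusable_labels(confusion, threshold):
    """Group labels whose pairwise confusion index is >= threshold.

    confusion: square matrix (list of lists) where confusion[a][b] is the
    label confusion index between labels a and b.
    Returns a sorted list of label groups (connected components).
    """
    n = len(confusion)
    parent = list(range(n))

    def find(i):
        # Find the group representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(n):
        for b in range(a + 1, n):
            if confusion[a][b] >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for label in range(n):
        groups.setdefault(find(label), []).append(label)
    return sorted(groups.values())
```

For example, with labels ordered as guitar, violin, cello and a high index only between violin and cello, the grouping yields one singleton group and one two-label group.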

In a third aspect, an embodiment of the present application provides an electronic device, and the electronic device includes:

processor;

memory for storing processor-executable instructions;

Wherein, the processor is configured to implement the image processing method provided in the first aspect or any possible implementation manner of the first aspect when executing the instructions.

In a fourth aspect, the embodiments of the present application provide a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code. When the computer-readable code runs in an electronic device, a processor in the electronic device executes the image processing method provided in the first aspect or any possible implementation manner of the first aspect.

In a fifth aspect, the embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored. When the computer program instructions are executed by a processor, the image processing method provided in the first aspect or any possible implementation of the first aspect is implemented.

Description of drawings

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the application and, together with the specification, serve to explain the principles of the application.

Fig. 1 shows a schematic diagram of samples and labels of different difficulties in the related art.

Fig. 2 shows a schematic structural diagram of an image processing system provided by an exemplary embodiment of the present application.

Fig. 3 shows a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present application.

Fig. 4 shows a schematic structural diagram of an electronic device provided by another exemplary embodiment of the present application.

Fig. 5 shows a flowchart of an image processing method provided by an exemplary embodiment of the present application.

Fig. 6 shows a schematic diagram of the principle of the calculation method of the label confusion index provided by an exemplary embodiment of the present application.

Fig. 7 shows a schematic diagram of the distribution of confusion and perplexity of the ImageNet data set provided by an exemplary embodiment of the present application.

Fig. 8 shows a schematic diagram of calculation results of some sample images obtained by an image processing method provided in an exemplary embodiment of the present application.

Fig. 9 shows a schematic diagram of the distribution of the label confusion index of the ImageNet data set provided by an exemplary embodiment of the present application.

Fig. 10 shows a schematic diagram of calculation results of some label pairs obtained by an image processing method provided in an exemplary embodiment of the present application.

Fig. 11 shows a flowchart of an image processing method provided by another exemplary embodiment of the present application.

Fig. 12 shows a schematic diagram of the principle of correcting incorrectly labeled data provided by another exemplary embodiment of the present application.

Fig. 13 shows a flowchart of an image processing method provided by another exemplary embodiment of the present application.

Fig. 14 shows a schematic diagram of the principle of a multi-level image classification method provided by another exemplary embodiment of the present application.

Fig. 15 shows a flowchart of an image processing method provided by another exemplary embodiment of the present application.

Fig. 16 shows a block diagram of an image processing apparatus provided by another exemplary embodiment of the present application.

Detailed description

Various exemplary embodiments, features, and aspects of the present application will be described in detail below with reference to the accompanying drawings. The same reference numbers in the figures indicate functionally identical or similar elements. While various aspects of the embodiments are shown in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as superior or better than other embodiments.

In addition, in order to better illustrate the present application, numerous specific details are given in the following specific implementation manners. It will be understood by those skilled in the art that the present application may be practiced without certain of the specific details. In some instances, methods, means, components and circuits well known to those skilled in the art have not been described in detail in order to highlight the gist of the present application.

At present, in more and more application scenarios, the performance of artificial intelligence technology is comparable to that of human beings. However, there are still some large differences between the current learning process of artificial intelligence and the human learning process. As shown in Figure 1, the hourglass is an easy sample, and the racket is a difficult sample; the label "guitar" and the label "violin" are easy to distinguish, whereas the label "violin" and the label "cello" are hard to distinguish. In current artificial intelligence research, although both samples and labels vary in difficulty, the general tendency is to train a black-box model that "treats all samples equally" and "treats all labels equally". In addition, some current interpretation methods for artificial intelligence models do not take into account the difficulty of sample classification and label distinction. That is to say, the explanation has the same complexity and level of detail for easy samples and difficult samples alike, and there are few comparisons between different labels, which does not meet the needs of humans trying to understand the models.

The difficulty of sample classification and the difficulty of label distinction are basic characteristics of data. A lack of exploration of these two characteristics may leave model developers unable to "teach students in accordance with their aptitude", and thus unable to develop more efficient and accurate models. Therefore, the embodiments of the present application provide a solution for automatically evaluating the difficulty of sample classification and label distinction, which aims to provide basic capabilities for exploring artificial intelligence classification methods and artificial intelligence model interpretation methods that conform to the human cognitive process, so as to provide a new direction for the subsequent development of the artificial intelligence industry. In addition, this solution can also be applied to label verification and cleaning of data.

In the related art, the time a subject takes to visually search for a specified picture can be measured, and the response time can be used as a measure of the sample classification difficulty of the sample image in the visual search task. That is, the longer the subject's response time, the harder the sample image is to find, and the higher the sample classification difficulty of the sample image. However, this method requires human participation, cannot be applied to large-scale data, and is easily affected by the state of the subject, such as the subject's mental state, resulting in statistical deviation.

In addition, in related technologies, the performance index of the top model can also be used as a measure of the difficulty of the data set. Optionally, multiple models are selected for each data set for training to obtain the accuracy performance of each model. The higher the accuracy of the model with the best performance, the lower the classification difficulty of the data set. However, this method is aimed at evaluating the classification difficulty of the entire data set, and cannot evaluate the classification difficulty of a single sample image, nor can it evaluate the difficulty of distinguishing image labels.

The embodiments of the present application propose an image processing method, an electronic device, and a storage medium. A data set and a model pool are obtained, and the data set is input into each first image classification model in the model pool to obtain prediction distribution results respectively corresponding to the plurality of first image classification models. The sample classification difficulty corresponding to at least one sample image in the data set is then determined according to the plurality of prediction distribution results. The sample classification difficulty measures how difficult it is for a model to classify the sample image, and an image classification model can subsequently be trained based on the sample classification difficulty corresponding to the sample images, so as to solve the problem of poor model training effect caused by the unbalanced training difficulty of the sample images. In addition, the embodiments of the present application also determine the label discrimination difficulty between any two image labels among the plurality of image labels according to the plurality of prediction distribution results. The label discrimination difficulty measures how difficult it is for a model to distinguish the two image labels, and an image classification model can subsequently be trained based on the label discrimination difficulty between any two image labels among the plurality of image labels. This scheme can be applied to large-scale data and provides basic capabilities for subsequent applications such as the design of new classification methods and data verification.

First, the application scenarios involved in this application are introduced.

Please refer to FIG. 2, which shows a schematic structural diagram of an image processing system provided by an exemplary embodiment of the present application.

The image processing system includes a database 220, an image analysis system 240, an image verification system 260 and an image visualization system 280.

The database 220 is used to provide the image analysis system 240 with a data set, the data set includes a plurality of sample images.

The image analysis system 240 includes a difficulty assessment device 242, which is configured to obtain a data set from a database, and perform sample classification difficulty calculation on at least one sample image in the data set.

Optionally, the difficulty evaluation device 242 is also used to calculate the difficulty of label distinction between any two image labels among the plurality of image labels corresponding to the data set.

The difficulty evaluation device 242 is also used to provide calculation results to the subsequent image verification system 260 and image visualization system 280.

The image verification system 260 is used for verifying and correcting the incorrectly labeled data in the data set according to the calculation result of the difficulty evaluation device 242.

The image visualization system 280 is used to display the content related to the calculation result according to the calculation result of the difficulty evaluation device 242.

The execution subject of the image processing method provided in the embodiment of the present application is an electronic device. Please refer to FIG. 3, which shows a schematic structural diagram of an electronic device provided by an exemplary embodiment of the present application.

The electronic device may be a terminal or a server. The terminal includes a mobile terminal or a fixed terminal, such as a mobile phone, a tablet computer, a laptop computer, a desktop computer, and the like. The server can be one server, or a server cluster composed of several servers, or a cloud computing service center.

As shown in FIG. 3, the electronic device includes a processor 310, a memory 320 and a communication interface 330. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device, which may include more or fewer components than shown in the figure, combine some components, or use a different arrangement of components. Specifically:

The processor 310 is the control center of the electronic device. It uses various interfaces and lines to connect the parts of the entire electronic device, and performs the various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 320 and calling data stored in the memory 320, thereby controlling the electronic device as a whole. The processor 310 may be implemented by a central processing unit (Central Processing Unit, CPU), and may also be implemented by a graphics processing unit (Graphics Processing Unit, GPU).

The memory 320 can be used to store software programs and modules. The processor 310 executes various functional applications and data processing by running the software programs and modules stored in the memory 320. The memory 320 can mainly include a program storage area and a data storage area, wherein the program storage area can store an operating system 321, an acquisition module 322, an input module 323, a determination module 324 and an application program 325 required by at least one function (such as neural network model training); the data storage area can store data created according to the use of the electronic device, and the like. The memory 320 can be realized by any type of volatile or nonvolatile memory device or a combination thereof, such as Static Random Access Memory (Static Random Access Memory, SRAM), Electrically Erasable Programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (Read Only Memory, ROM), magnetic memory, flash memory, magnetic disk or optical disk. Correspondingly, the memory 320 may further include a memory controller to provide the processor 310 with access to the memory 320.

Wherein, the processor 310 performs the following function by running the acquisition module 322: acquiring a data set and a model pool, the data set including a plurality of sample images, and the model pool including a plurality of pre-trained first image classification models. The processor 310 performs the following function by running the input module 323: inputting the data set into each first image classification model in the model pool respectively, to obtain prediction distribution results respectively corresponding to the plurality of first image classification models, each prediction distribution result being used to indicate the probability distribution of each sample image over a plurality of image labels corresponding to the data set. The processor 310 performs the following function by running the determination module 324: determining, according to the plurality of prediction distribution results, the sample classification difficulty corresponding to at least one sample image in the data set, the sample classification difficulty being used to indicate how difficult it is for a model to classify the sample image.

In a possible implementation, as shown in Figure 4, the electronic device can be a server 40, and the product implementation form of this solution is a neural network model and program code that are included in a big data analysis platform and deployed on the hardware of the server 40. The neural network model and program code involved in the embodiment of the present application are deployed in the image analysis system 240. The image analysis system 240 includes a difficulty evaluation device 242, and the difficulty evaluation device 242 is also used to provide calculation results to the subsequent image verification system 260 and/or image visualization system 280. During operation, the program code provided by the embodiment of the present application runs on the host memory 41 (including the database 220), the CPU 42, and the acceleration hardware GPU 43 of the server 40. The image analysis system 240 can also be called a cloud server image analysis platform, the image verification system 260 can also be called a cloud service image verification platform, and the image visualization system 280 may also be called a cloud service image visualization platform. It should be noted that, for the introduction of the database 220, the image analysis system 240, the image verification system 260 and the image visualization system 280, reference may be made to the relevant descriptions in the above embodiments, which will not be repeated here.

In the following, several exemplary embodiments are used to introduce the image processing method provided by the embodiment of the present application.

Please refer to FIG. 5, which shows a flowchart of an image processing method provided by an exemplary embodiment of the present application. This embodiment is illustrated by applying the method to the electronic device shown in FIG. 3 or FIG. 4. The method includes the following steps.

Step 501, build a model pool, the model pool includes a plurality of first image classification models.

Optionally, the electronic device builds the model pool by multi-type stratified sampling, and the model pool includes multiple first image classification models with different classification effects. Stratified sampling divides a population into several strata according to a certain characteristic and selects a certain number of samples from each stratum. In a sampling survey of sample classification difficulty conducted with human subjects, the characteristics that need to be considered include people's knowledge level (such as educational background), life experience (such as age) and so on. Similarly, when constructing a model pool, the characteristics that need to be considered include the model structure, the number of training images, and the duration of model training, so that the model pool is more representative and the deviation in the subsequent calculation of sample classification difficulty is as small as possible.

In a possible implementation manner, the electronic device selects multiple types of original second image classification models and trains them on training sets of different sizes. During each training process, models are collected at different stages of training (early stage, middle stage, and near convergence), yielding multiple first image classification models with different classification performance.

Optionally, the model pool contains at least two first image classification models whose model training parameters differ, where the model training parameters include at least one of the training set, the type of the second image classification model, and the model training duration.

Schematically, the type of the second image classification model includes but is not limited to at least one of the model types shown in Table 1.

Table 1

Model	Number of parameters (millions)	Memory (MB)
VGG16	138	500
ResNet50	25	98
ResNet101	44	174
InceptionV3	24	92
Xception	23	88
DenseNet121	8	33
DenseNet169	14	57
DenseNet201	20	80
EfficientNetB0	5.3	20
EfficientNetB2	9	36

Schematically, the electronic device acquires a target data set (such as the ImageNet data set) and, by random sampling from the target data set, obtains sub-data sets whose sizes are 25%, 50% and 75% of the target data set, which serve as three training sets of different sizes. This embodiment of the present application does not limit this.
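The stratified construction of the model pool can be pictured as a grid over the three strata named in this step: model type, training-set size, and training stage. The following is a minimal sketch under stated assumptions, not the embodiment's implementation: the names `PoolEntry`, `build_pool_specs` and `sample_subset` are hypothetical, only three of the model types from Table 1 are listed, and the actual training is omitted.

```python
import itertools
import random
from dataclasses import dataclass

# Strata taken from the text; the concrete lists are illustrative only.
ARCHITECTURES = ["VGG16", "ResNet50", "DenseNet121"]
TRAIN_FRACTIONS = [0.25, 0.50, 0.75]
STAGES = ["early", "middle", "near_convergence"]


@dataclass(frozen=True)
class PoolEntry:
    """Hypothetical description of one first image classification model."""
    architecture: str
    train_fraction: float
    stage: str


def sample_subset(indices, fraction, seed=0):
    """Randomly sample a sub-data set holding `fraction` of the indices."""
    rng = random.Random(seed)
    population = list(indices)
    k = int(len(population) * fraction)
    return sorted(rng.sample(population, k))


def build_pool_specs():
    """Enumerate one entry per (architecture, fraction, stage) stratum."""
    return [
        PoolEntry(a, f, s)
        for a, f, s in itertools.product(ARCHITECTURES, TRAIN_FRACTIONS, STAGES)
    ]


specs = build_pool_specs()
subset = sample_subset(range(1000), 0.25)
print(len(specs), len(subset))
```

Each entry would then be trained (or checkpointed) separately; drawing one model per stratum is what keeps the pool from being dominated by a single kind of classifier.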

The electronic device constructs a model pool, and stores the constructed model pool in the electronic device.

Step 502, input the data set into each first image classification model in the model pool respectively to obtain the prediction distribution results corresponding to the multiple first image classification models, where each prediction distribution result is used to indicate the probability distribution of each sample image over the multiple image labels corresponding to the data set.

The electronic device acquires a data set and a model pool, where the data set includes a plurality of sample images and the model pool includes a plurality of pre-trained first image classification models. The electronic device inputs the data set into each first image classification model in the model pool respectively and obtains the prediction distribution result corresponding to each of the multiple first image classification models.
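One way to picture the prediction distribution results is as a single array of shape (N, number of sample images, number of image labels). The sketch below assumes, hypothetically, that each model is a callable returning raw scores (logits) for a batch of images; `softmax` and `predict_pool` are illustrative names, and the random linear classifiers merely stand in for pre-trained models.

```python
import numpy as np


def softmax(logits):
    """Convert raw scores to probabilities along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def predict_pool(models, images):
    """Return an array P with P[i, s, y] = P_i(y | x_s)."""
    return np.stack([softmax(model(images)) for model in models])


# Toy stand-ins for pre-trained models: fixed random linear classifiers.
rng = np.random.default_rng(0)
num_features, num_labels = 8, 4
weights = [rng.normal(size=(num_features, num_labels)) for _ in range(3)]
models = [lambda x, w=w: x @ w for w in weights]

images = rng.normal(size=(5, num_features))  # 5 flattened "sample images"
P = predict_pool(models, images)
print(P.shape)
```

Storing the results this way makes both views of the data available: `P[i, s, :]` is the distribution of one sample image over the labels, and `P[i, :, y]` is the distribution of one label over the sample images.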

Step 503 , according to multiple prediction distribution results, determine the sample classification difficulty corresponding to at least one sample image in the data set, and determine the label discrimination difficulty between any two image labels among the multiple image labels.

Here, the sample classification difficulty is used to indicate how difficult it is for the model to classify the sample image. The label discrimination difficulty is used to indicate how difficult it is for the model to distinguish between any two image labels.

In a possible implementation, a metric called confusion perplexity (C-Perplexity) is used to evaluate the sample classification difficulty. The confusion perplexity is obtained by calculating the average uncertainty of the multiple prediction distributions for a single sample image, and it can be applied to both labeled and unlabeled sample images. For a sample image with low classification difficulty, most models should produce a prediction with low uncertainty; for a sample image with high classification difficulty, most models should produce a prediction with high uncertainty.

Optionally, for each sample image in the at least one sample image, the electronic device determines the confusion perplexity of the sample image according to the multiple prediction distribution results and the number of first image classification models in the model pool, where the confusion perplexity is used to indicate the sample classification difficulty corresponding to the sample image.

Optionally, the confusion perplexity of the sample image is positively correlated with the sample classification difficulty of the sample image; that is, the greater the confusion perplexity of the sample image, the higher its sample classification difficulty, and the harder it is for the model to classify the sample image.

Optionally, the confusion perplexity of the sample image is positively correlated with the entropy of the prediction distribution result of the sample image corresponding to each first image classification model, and is negatively correlated with the number of first image classification models. The entropy of the prediction distribution result is determined according to the probabilities, output by the first image classification model, of the sample image on each image label.

Schematically, the electronic device determines the confusion perplexity $\varphi_c(x)$ of the sample image by the following formula according to the multiple prediction distribution results and the number of first image classification models in the model pool:

where $H_i(x) = -\sum_{y} P_i(y \mid x)\,\log_2 P_i(y \mid x)$;

Here, x is the sample image, y is an image label, N is the number of first image classification models in the model pool, $H_i(x)$ is the entropy of the prediction distribution result corresponding to the i-th first image classification model in the model pool, $P_i(y \mid x)$ is the probability, output by the i-th first image classification model, of the sample image x on the image label y, and both i and N are positive integers.
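The definitions above fix every ingredient of the confusion perplexity: the per-model entropies $H_i(x)$ and an average over the N models. One form consistent with them is $\varphi_c(x) = 2^{\frac{1}{N}\sum_{i=1}^{N} H_i(x)}$. The base-2 exponentiation is an assumption inferred from the base-2 logarithm in $H_i(x)$ and from the observation elsewhere in this description that easy samples have a confusion perplexity close to 1. The following minimal sketch computes that assumed form; `entropy_bits` and `confusion_perplexity` are hypothetical names.

```python
import numpy as np


def entropy_bits(p, eps=1e-12):
    """H_i(x) = -sum_y P_i(y|x) log2 P_i(y|x), computed along the last axis."""
    p = np.clip(p, eps, 1.0)  # avoid log2(0); the effect on H is negligible
    return -(p * np.log2(p)).sum(axis=-1)


def confusion_perplexity(P):
    """P has shape (N, S, Y); returns one value per sample image, shape (S,).

    Assumed form: 2 ** (mean over the N models of H_i(x)).
    """
    return 2.0 ** entropy_bits(P).mean(axis=0)


# Two models, two sample images, two labels (made-up probabilities).
P = np.array([
    [[1.0, 0.0], [0.5, 0.5]],
    [[1.0, 0.0], [0.5, 0.5]],
])
cp = confusion_perplexity(P)
print(cp)
```

In this toy input the first sample image is predicted with full confidence by both models and scores approximately 1, while the second is maximally uncertain over two labels and scores 2. No ground-truth label is needed, which matches the statement that the measure applies to unlabeled sample images.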

In a possible implementation, a metric called the label confusion index (Confusion Index, CI) is used to evaluate the difficulty of distinguishing between two image labels. The label confusion index measures the degree to which the first image classification models in the model pool confuse any two image labels in the data set, and is obtained by calculating the similarity of the prediction distributions of the entire data set on the two image labels. If the probabilities of the two image labels tend to be high or low at the same time, the two image labels are relatively difficult to distinguish. If the probability of one image label is high while that of the other is low, the two image labels are relatively easy to distinguish.

Viewed by sample image, the prediction distribution result corresponding to each first image classification model includes the probability distribution of each sample image over the multiple image labels. Viewed by image label, the prediction distribution result corresponding to each first image classification model includes a prediction distribution corresponding to each image label, which indicates the probabilities of the multiple sample images on that image label.

For any two image labels (i.e., a label pair) among the multiple image labels, the electronic device determines the label confusion index between the two image labels according to the multiple prediction distribution results and the number of first image classification models in the model pool, where the label confusion index is used to indicate the label discrimination difficulty between the two image labels.

Optionally, the label confusion index between any two image labels is positively correlated with the label discrimination difficulty between the two image labels; that is, the greater the label confusion index between the two image labels, the higher the label discrimination difficulty, and the harder it is for the model to distinguish between the two image labels.

Optionally, the label confusion index between any two image labels is determined according to the similarity of the probability distributions of the data set on the two image labels.

Schematically, the electronic device determines the label confusion index $\varphi_c(y_1, y_2)$ between any two image labels by the following formula according to the multiple prediction distribution results and the number of first image classification models in the model pool:

Here, x is the sample image, $y_1$ and $y_2$ are any two image labels, N is the number of first image classification models in the model pool, $P_i(y_1 \mid x)$ is the probability, output by the i-th first image classification model, of the sample image x on the image label $y_1$, $P_i(y_2 \mid x)$ is the probability, output by the i-th first image classification model, of the sample image x on the image label $y_2$, $E_x$ denotes the average taken over the multiple sample images in the data set, and both i and N are positive integers.

In a schematic example, as shown in FIG. 6, the electronic device inputs the data set 61 into each first image classification model in the model pool 62 to obtain the prediction distribution corresponding to image label 1 and the prediction distribution corresponding to image label 2, calculates the similarity between the two prediction distributions, and thereby determines the label confusion index between the two image labels.
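The data flow of FIG. 6 can be illustrated with a stand-in similarity measure. The sketch below is not the label confusion index of the embodiment: it substitutes cosine similarity, chosen here purely as an assumption, between the two per-label prediction distributions of each model, and averages the result over the N models. The name `label_confusion_sketch` is hypothetical.

```python
import numpy as np


def label_confusion_sketch(P, y1, y2, eps=1e-12):
    """P has shape (N, S, Y). Returns a value between 0 and 1.

    Stand-in measure (assumption): mean over the models of the cosine
    similarity between the probabilities of all sample images on label
    y1 and the probabilities of all sample images on label y2.
    """
    a = P[:, :, y1]  # shape (N, S): prediction distribution of label y1
    b = P[:, :, y2]  # shape (N, S): prediction distribution of label y2
    dot = (a * b).sum(axis=1)
    norm = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float((dot / (norm + eps)).mean())


# One model, four sample images, three labels (made-up probabilities).
# Labels 0 and 1 are high on the same images; label 2 is high elsewhere.
P = np.array([[
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]])
hard_pair = label_confusion_sketch(P, 0, 1)
easy_pair = label_confusion_sketch(P, 0, 2)
print(hard_pair, easy_pair)
```

Any measure with the same qualitative behavior would serve the illustration: labels whose probabilities rise and fall together score high (hard to distinguish), and labels whose probabilities are high on disjoint images score low.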

Step 504, display the sample classification difficulty corresponding to the at least one sample image and/or the label discrimination difficulty between any two image labels among the multiple image labels.

Optionally, the electronic device displays the overall distribution of sample classification difficulty and label discrimination difficulty in a preset display form according to the above calculation results. For example, the preset display form is in the form of a data dashboard.

Optionally, the electronic device displays in an interactive form the difficulty of classifying a single sample image in the data set and/or the difficulty of distinguishing labels between two image labels. This embodiment of the present application does not limit it.

Step 505: Perform difficulty analysis according to the difficulty of classifying samples corresponding to at least one sample image and/or the difficulty of label distinction between any two image labels among the plurality of image labels, and output cause information.

Optionally, the electronic device analyzes the difficulty of sample classification and label discrimination based on the above-mentioned data dashboard, and outputs information on the reasons for the high difficulty of sample classification and label discrimination.

It should be noted that the above step 504 and step 505 are optional steps, that is, both the above step 504 and step 505 may be performed, or neither may be performed, or one may be performed, which is not limited in this embodiment of the present application.

To sum up, the image processing method provided by the embodiment of the present application can automatically evaluate sample classification difficulty and label discrimination difficulty without human participation, so it can be applied to large-scale data: for a data set with a large number of samples, the sample classification difficulty of the sample images can be calculated; for a data set with a large number of labels, the label discrimination difficulty between any two image labels among the multiple image labels corresponding to the data set can be calculated.

The embodiment of the present application also introduces the confusion perplexity as a measure of sample classification difficulty and the label confusion index as a measure of label discrimination difficulty. On the one hand, sample classification difficulty can be evaluated on both labeled and unlabeled data; on the other hand, label discrimination difficulty is evaluated effectively and efficiently, which provides basic capabilities for applications such as the design of new classification methods and data verification.

The embodiment of the present application also builds a model pool as the basis for difficulty evaluation, which avoids the participation of human subjects required in the related art and enables large-scale automated evaluation. The models are selected according to the principle of stratified sampling, so the distribution obtained with the model pool constructed by this scheme avoids the bias introduced by any particular model, ensuring that the final calculation result is not affected by model selection bias.

In an illustrative example, the data set is the ImageNet data set, and the model pool includes 500 first image classification models with different classification performance. The model pool is constructed in a manner similar to stratified sampling in a sample survey. For the ImageNet data set, the confusion perplexity is calculated by the method proposed in the embodiment of the present application, and the resulting overall distribution is shown in FIG. 7. To show the difference caused by the choice of model pool, a model pool containing only strong classifiers is used for comparison. The abscissa is the confusion perplexity, and the ordinate is the density, that is, the proportion of samples per unit of confusion perplexity at the current position. In mathematical form, the density is δratio(CP)/δCP, where ratio(CP) refers to the ratio of the number of samples whose confusion perplexity is CP to the total number of samples. It can be seen from FIG. 7 that the distribution obtained with the model pool constructed according to the embodiment of the present application is more uniform than that obtained with the model pool containing only strong classifiers. Moreover, most of the sample images in the data set are easy samples (confusion perplexity close to 1).
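The density on the ordinate of FIG. 7 can be approximated by a normalized histogram, in which each bin holds the fraction of samples falling into it divided by the bin width. The sketch below is a minimal illustration using NumPy; the confusion perplexity values are made up, and `perplexity_density` is a hypothetical name.

```python
import numpy as np


def perplexity_density(cp_values, bins=10):
    """Approximate d ratio(CP) / d CP by a normalized histogram."""
    density, edges = np.histogram(cp_values, bins=bins, density=True)
    return density, edges


# Made-up confusion perplexity values, mostly close to 1 (easy samples).
cp_values = np.array([1.0, 1.05, 1.1, 1.2, 2.5, 4.0])
density, edges = perplexity_density(cp_values, bins=5)

# A density integrates to 1 over the range it covers.
area = float((density * np.diff(edges)).sum())
print(area)
```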

In another schematic example, the calculation results of some sample images obtained by using the image processing method provided by the embodiment of the present application are shown in Figure 8, which also shows the error perplexity (Misclassification Perplexity, X-Perplexity) for comparison. The error perplexity is the rate at which the models misclassify the sample image, so it can only evaluate the classification difficulty of labeled data; in real scenarios, however, labeled data is costly to acquire and unlabeled data is easier to obtain. It can be seen from Figure 8 that, on the one hand, images with high confusion perplexity often contain many objects and are difficult to classify, while images with low confusion perplexity often contain a single object and are easy to classify, which matches human intuition. On the other hand, on this data set, the confusion perplexity proposed by this scheme matches the error perplexity of the related art in most cases. It is worth noting, however, that some data have a high error perplexity but a relatively low confusion perplexity; this is often because the label itself is wrong, and this property can subsequently be applied to label data verification.

With the same experimental settings, the label confusion index is calculated as in the above examples; the overall distribution is shown in Figure 9, and the calculation results of some label pairs are shown in Figure 10. It can be seen from Figure 10 that label pairs that are easily confused are usually relatively similar, and label pairs that are rarely confused are usually quite different, which matches human intuition.

In the process of investigating the reasons for high sample classification difficulty, it is found that when the error perplexity of a sample is extremely high and its confusion perplexity is low, it is very likely that the image label was assigned incorrectly during manual labeling. Without the confusion perplexity measure, such samples cannot be located quickly. Therefore, the image processing method provided by the embodiment of the present application can be applied to the verification and correction of incorrectly labeled data in the data set. In a possible implementation, the above image processing method further includes, but is not limited to, the following steps, as shown in Figure 11:

Step 1101, filter the data set according to the sample classification difficulty corresponding to each of the plurality of sample images in the data set and the preset sample classification difficulty threshold.

Wherein, the sample classification difficulty of each sample image in the filtered data set is less than a preset sample classification difficulty threshold.

Optionally, the sample classification difficulty threshold is set by default, or is set by a user. This embodiment of the present application does not limit it.

Optionally, the confusion perplexity is used as the measure of sample classification difficulty, and the sample classification difficulty threshold is a confusion perplexity threshold. The data set is filtered according to the confusion perplexities corresponding to the multiple sample images in the data set and the preset confusion perplexity threshold, and the confusion perplexity of each sample image in the filtered data set is less than the preset confusion perplexity threshold.

Optionally, the data set is filtered according to the confusion perplexities and error perplexities corresponding to the multiple sample images in the data set, together with the preset confusion perplexity threshold and error perplexity threshold. The confusion perplexity of each sample image in the filtered data set is less than the preset confusion perplexity threshold, and the error perplexity of each sample image is greater than the preset error perplexity threshold.

Schematically, the error perplexity XP(x) of the sample image can be determined by the following formula:

Here, x is any input sample image, N is the number of first image classification models in the model pool, $C_i(x)$ is the label predicted by the i-th first image classification model for the sample image x (usually taken as the label with the highest probability), $y_{gt}$ is the true label of the sample image x, and $I(\Omega)$ is the indicator function, which equals 1 if the condition Ω holds and 0 otherwise.
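Read together with the description of the error perplexity as the misclassification rate of a sample image, these definitions suggest the form $XP(x) = \frac{1}{N}\sum_{i=1}^{N} I\big(C_i(x) \neq y_{gt}\big)$, where the indicator is 1 when its condition holds and 0 otherwise. That form is an assumption inferred from the surrounding text. The following minimal sketch computes it from an array of prediction distributions; `error_perplexity` is a hypothetical name and the probabilities are made up.

```python
import numpy as np


def error_perplexity(P, true_labels):
    """P has shape (N, S, Y); true_labels has shape (S,).

    Returns, per sample image, the fraction of models whose highest
    probability label differs from the true label.
    """
    predicted = P.argmax(axis=-1)               # C_i(x), shape (N, S)
    wrong = predicted != true_labels[None, :]   # I(C_i(x) != y_gt)
    return wrong.mean(axis=0)


# Three models, two sample images, two labels.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.8, 0.2], [0.6, 0.4]],
    [[0.7, 0.3], [0.3, 0.7]],
])
true_labels = np.array([0, 0])
xp = error_perplexity(P, true_labels)
print(xp)
```

Here every model agrees with the label of the first sample image, giving an error perplexity of 0, while two of the three models disagree with the label of the second, giving 2/3. Unlike the confusion perplexity, this quantity requires a ground-truth label for every sample image.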

Optionally, the confusion perplexity threshold and/or the error perplexity threshold are set by default, or are custom-set. This embodiment of the present application does not limit this.

Step 1102, for each sample image in the filtered data set, correct the image label of the sample image according to the target image label output by the largest proportion of models in the model pool.

For each sample image in the filtered data set, the sample image is input into each first image classification model of the model pool to obtain the highest probability label corresponding to each of the multiple first image classification models, where the highest probability label of a first image classification model is the image label with the highest probability in that model's prediction distribution result for the sample image.

Optionally, for each sample image in the filtered data set, the label that occurs most often among the highest probability labels of the multiple first image classification models is determined as the target image label output by the largest proportion of models in the model pool, and the image label of the sample image is corrected to this target image label.

Schematically, the model pool includes three first image classification models, and the sample image is input into each of them to obtain the three corresponding highest probability labels, namely ("dog", "cat", "dog"). Since "dog" appears twice and "cat" appears once, the target image label output by the largest proportion of models in the model pool is "dog".
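The vote in this example can be sketched with the standard library. `target_label` is a hypothetical name; how ties are broken is not specified in the text, so the behavior of `Counter.most_common` (the label encountered first wins a tie) is an assumption of this sketch.

```python
from collections import Counter


def target_label(top_labels):
    """Return the label output by the largest share of the models."""
    return Counter(top_labels).most_common(1)[0][0]


vote = target_label(["dog", "cat", "dog"])
print(vote)  # dog
```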

In a schematic example, as shown in Figure 12, the error perplexity threshold is preset as $\theta_X$ and the confusion perplexity threshold as $\theta_C$. The data set is filtered according to the filter condition "$S = \{XP_i > \theta_X\} \cap \{CP_i < \theta_C\}$ for $i \in T$" (T is the complete data set) to obtain the filtered data set. For each sample image in the filtered data set, the original image label of the sample image is corrected as follows: "$L_i = TVL_i$ for $i \in S$", where $L_i$ is the corrected label and $TVL_i$ is the target image label output by the largest proportion of models in the model pool.
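The filter condition and the correction rule can be sketched with boolean masks. This is a minimal illustration rather than the embodiment's implementation: `correct_labels` is a hypothetical name, and the labels, perplexity values and thresholds below are made up.

```python
import numpy as np


def correct_labels(labels, xp, cp, tvl, theta_x, theta_c):
    """Set L_i = TVL_i for every i with XP_i > theta_x and CP_i < theta_c.

    Returns the corrected labels and the indices of the selected set S.
    """
    corrected = np.asarray(labels, dtype=object).copy()
    selected = (np.asarray(xp) > theta_x) & (np.asarray(cp) < theta_c)
    corrected[selected] = np.asarray(tvl, dtype=object)[selected]
    return corrected, np.flatnonzero(selected)


labels = ["cat", "dog", "cat", "fox"]        # original (possibly wrong) labels
tvl = ["dog", "dog", "cat", "wolf"]          # majority labels of the pool
xp = [0.95, 0.05, 0.10, 0.90]                # error perplexity per image
cp = [1.10, 1.05, 1.20, 3.50]                # confusion perplexity per image

corrected, idx = correct_labels(labels, xp, cp, tvl, theta_x=0.8, theta_c=1.5)
print(list(corrected), idx.tolist())
```

Only the first image is relabeled: the models confidently agree with each other (low confusion perplexity) yet disagree with its label (high error perplexity). The last image also has a high error perplexity, but its high confusion perplexity marks it as genuinely hard rather than mislabeled, so its label is left unchanged.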

Image classification models tend to confuse some classes with others. For example, a cello is easily confused with a violin, and an acoustic guitar, an electric guitar, and a banjo are easily confused with one another. At the same time, some categories, such as "hourglass", are easy to distinguish. Based on this, the embodiment of the present application provides a multi-level image classification method. In a possible implementation, the above image processing method further includes, but is not limited to, the following steps, as shown in Figure 13:

Step 1301, construct a third image classification model according to the label discrimination difficulty between any two image labels in the data set, where the third image classification model is a multi-level classification model; train the third image classification model according to the training set to obtain a fourth image classification model.

Optionally, a multi-level classification method is obtained according to the label discrimination difficulty between image labels. First, the labels are grouped into clusters according to the label discrimination difficulty, so that labels in different clusters are easy to distinguish and labels in the same cluster are difficult to distinguish. Then, a relatively simple model (which can be called a cluster model) is constructed to distinguish between different clusters, and a moderately complex model (which can be called a class model) is constructed to distinguish between labels within a cluster, yielding a multi-level classification model, that is, the third image classification model. Finally, the third image classification model is trained according to the training set (the training method is consistent with common deep learning methods, except that the loss function corresponding to the cluster model needs to be added to the loss function), and the fourth image classification model is obtained through training. When the fourth image classification model is called for inference, the average complexity of the model is lower than that of the original large and complex model, so inference can be accelerated, which is helpful for terminal deployment. The fourth image classification model is the classification model obtained after training the third image classification model; that is, the fourth image classification model is also a multi-level classification model.
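The first step, grouping labels into clusters by label discrimination difficulty, can be sketched as follows. The clustering algorithm is not specified in the text; this sketch assumes a simple rule in which two labels join the same cluster whenever their confusion index exceeds a threshold, and clusters are the connected components of the resulting graph. The name `cluster_labels`, the threshold and the confusion values are all hypothetical.

```python
def cluster_labels(labels, confusion, threshold):
    """Group labels whose pairwise confusion index exceeds `threshold`.

    `confusion` maps frozenset({a, b}) to a confusion index; pairs that
    are absent are treated as easy to distinguish. Labels are merged
    with union-find, so the result is the set of connected components.
    """
    parent = {label: label for label in labels}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pair, value in confusion.items():
        if value > threshold:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    clusters = {}
    for label in labels:
        clusters.setdefault(find(label), []).append(label)
    return sorted(sorted(members) for members in clusters.values())


labels = ["violin", "cello", "acoustic guitar", "electric guitar",
          "banjo", "hourglass"]
confusion = {  # made-up confusion indices
    frozenset({"violin", "cello"}): 0.8,
    frozenset({"acoustic guitar", "electric guitar"}): 0.7,
    frozenset({"electric guitar", "banjo"}): 0.6,
    frozenset({"violin", "hourglass"}): 0.05,
}
clusters = cluster_labels(labels, confusion, threshold=0.5)
print(clusters)
```

With these values the string instruments played with a bow, the plucked instruments, and "hourglass" end up in three separate clusters; a cluster model would then choose among the three clusters and a class model would separate the labels inside each one.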

Optionally, both the third image classification model and the fourth image classification model include a first neural network and a second neural network. The first neural network is used to perform first classification processing on the input target image to obtain a first classification result, where the first classification result corresponds to a plurality of second classification results. The second neural network is used to perform second classification processing on the target image based on the first classification result to obtain a second classification result.

Wherein, the first classification result is a coarse-grained classification result, and the second classification result is a fine-grained classification result, that is, multiple second classification results corresponding to the first classification result belong to the first classification result. Optionally, the first classification result is used to indicate the image type, and the second classification result is used to indicate the image label. For example, if the first classification result is violin, the first classification result corresponds to two possible second classification results: violin and cello. For another example, the first classification result is guitar, and the first classification result corresponds to three possible second classification results: acoustic guitar, electric guitar, and banjo.

Step 1302, when the target image to be classified is obtained, call the fourth image classification model to classify the target image, and obtain a classification result of the target image.

When the target image to be classified is obtained, the target image is input into the fourth image classification model to output a classification result of the target image, and the classification result is used to indicate the image label of the target image.

In a schematic example, as shown in FIG. 14, the target image 141 is input into the first neural network 142 of the fourth image classification model to obtain an intermediate result, that is, the first classification result, which can be the violin class, the guitar class, or another class. Based on the first classification result, the target image is input into the second neural network 143 of the fourth image classification model to output the final result, that is, the second classification result. For example, the second classification result corresponding to the first classification result "violin class" is "violin" or "cello", and the second classification result corresponding to the first classification result "guitar class" is "acoustic guitar", "electric guitar", or "banjo". This embodiment of the present application does not limit this.
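The two-stage flow of FIG. 14 can be sketched as a dispatch from a coarse model to the fine model of the chosen class. The sketch below is only an illustration of the control flow: the "networks" are hand-written rules over a made-up feature dictionary, and `classify_two_level`, `cluster_model` and `class_models` are hypothetical names.

```python
def classify_two_level(image, cluster_model, class_models):
    """Run the coarse model, then the fine model of the chosen cluster."""
    first_result = cluster_model(image)                # coarse-grained class
    second_result = class_models[first_result](image)  # fine-grained label
    return first_result, second_result


# Toy stand-ins for the first and second neural networks. An "image" is
# represented by a dict of hand-made features, purely for illustration.
def cluster_model(image):
    return "guitar class" if image["has_frets"] else "violin class"


class_models = {
    "violin class": lambda image: (
        "cello" if image["length_cm"] > 100 else "violin"
    ),
    "guitar class": lambda image: (
        "banjo" if image["round_body"] else "acoustic guitar"
    ),
}

image = {"has_frets": False, "length_cm": 120, "round_body": False}
result = classify_two_level(image, cluster_model, class_models)
print(result)
```

Only one fine model runs per image, which is the source of the reduced average complexity at inference time: an image routed to a cluster with a single label would need no second stage at all.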

To sum up, compared with a general neural network method, the image processing method provided by this embodiment has the following advantages. On the one hand, it conforms to the step-by-step cognitive process of human beings, solving easy problems first and difficult problems afterwards, so that a model with higher accuracy can be learned. On the other hand, because the average complexity of the model is reduced, inference can be accelerated, which is helpful for deployment on electronic devices (such as terminals).

Please refer to FIG. 15, which shows a flowchart of an image processing method provided by an exemplary embodiment of the present application. This embodiment is described using an example in which the method is applied to the electronic device shown in FIG. 3 or FIG. 4. The method includes the following steps.

Step 1501, acquire a data set and a model pool, the data set includes a plurality of sample images, and the model pool includes a plurality of pre-trained first image classification models.

Step 1502, input the data set into each first image classification model in the model pool respectively to obtain the prediction distribution results corresponding to the multiple first image classification models, where each prediction distribution result is used to indicate the probability distribution of each sample image over the multiple image labels corresponding to the data set.

Step 1503: Determine the sample classification difficulty corresponding to at least one sample image in the data set according to multiple prediction distribution results, where the sample classification difficulty is used to indicate the degree of difficulty for the sample image to be classified by the model.

It should be noted that, for implementation details of each step in this embodiment, reference may be made to relevant descriptions in the foregoing embodiments, and details are not repeated here.

Please refer to FIG. 16 , which shows a block diagram of an image processing apparatus provided by another exemplary embodiment of the present application. The device can be implemented as all or part of the electronic equipment through software, hardware or a combination of the two. The apparatus may include: an acquisition unit 1610 , an input unit 1620 and a determination unit 1630 .

An acquisition unit 1610, configured to acquire a data set and a model pool, the data set includes a plurality of sample images, and the model pool includes a plurality of pre-trained first image classification models;

The input unit 1620 is configured to input the data set into each first image classification model in the model pool respectively to obtain the prediction distribution results corresponding to the multiple first image classification models, where each prediction distribution result is used to indicate the probability distribution of each sample image over the multiple image labels corresponding to the data set;

The determining unit 1630 is configured to determine the sample classification difficulty corresponding to at least one sample image in the data set according to the multiple prediction distribution results, where the sample classification difficulty is used to indicate the degree of difficulty of the sample image being classified by the model.

In a possible implementation manner, the determining unit 1630 is further configured to:

For each sample image in the at least one sample image, determine the confusion perplexity of the sample image according to the multiple prediction distribution results and the number of first image classification models in the model pool, where the confusion perplexity is used to indicate the sample classification difficulty corresponding to the sample image.

In another possible implementation, the confusion perplexity of the sample image is positively correlated with the entropy of the prediction distribution result of the sample image corresponding to each first image classification model, and is negatively correlated with the number of first image classification models. The entropy of the prediction distribution result corresponding to each first image classification model is determined according to the probabilities, output by that first image classification model, of the sample image on each image label.

In another possible implementation, the determining unit 1630 is further configured to determine the label discrimination difficulty between any two image labels among the multiple image labels according to the multiple prediction distribution results, where the label discrimination difficulty is used to indicate how difficult it is for the model to distinguish between the two image labels.

In another possible implementation manner, the determining unit 1630 is further configured to:

For any two image labels in multiple image labels, according to the multiple prediction distribution results and the number of the first image classification model in the model pool, determine the label confusion index between any two image labels, and the label confusion index is used to indicate Label discrimination difficulty between any two image labels.

In another possible implementation manner, the device further includes: a training unit;

The training unit is used to train the original second image classification model according to the training set to obtain the first image classification model, and the training set includes a plurality of training images;

Wherein, there are at least two first image classification models in the model pool, and the corresponding model training parameters are different, and the model training parameters include at least one of the training set, the type of the second image classification model, and the duration of model training.

In another possible implementation manner, the device further includes: a correction unit; the correction unit is configured to:

Screening the data set according to the sample classification difficulty corresponding to each of the multiple sample images in the data set and a preset sample classification difficulty threshold;

For each sample image in the screened data set, correcting the image label of the sample image according to the target image label output by the largest proportion of first image classification models in the model pool.
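The two steps above can be sketched as a threshold screen followed by a majority vote over the pool. The text does not say whether screening keeps samples above or below the threshold; this sketch assumes samples whose difficulty exceeds the threshold are the ones reviewed, and the comparison can be flipped if the opposite is intended. The function name `correct_labels` and the data layout are hypothetical.

```python
from collections import Counter


def correct_labels(labels, difficulty, pool_predictions, threshold):
    """Return a copy of labels with screened samples relabeled by majority vote.

    labels[n]: current image label of sample n
    difficulty[n]: sample classification difficulty of sample n
    pool_predictions[m][n]: label predicted by model m for sample n
    """
    corrected = list(labels)
    for n, d in enumerate(difficulty):
        if d <= threshold:
            continue  # assumption: only samples above the threshold are screened in
        votes = Counter(preds[n] for preds in pool_predictions)
        # Label output by the largest share of models in the pool.
        corrected[n] = votes.most_common(1)[0][0]
    return corrected


labels = ["cat", "dog", "cat"]
difficulty = [0.1, 0.9, 0.8]
pool_predictions = [
    ["cat", "cat", "cat"],
    ["cat", "cat", "dog"],
    ["cat", "dog", "dog"],
]
print(correct_labels(labels, difficulty, pool_predictions, threshold=0.5))
# ['cat', 'cat', 'dog']
```

The original list is left untouched, so corrected and uncorrected labels can be compared afterwards.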

In another possible implementation manner, the device further includes: a classification unit; the classification unit is used for:

Constructing a third image classification model according to the label discrimination difficulty between any two image labels in the data set, the third image classification model being a multi-level classification model;

Training the third image classification model according to the training set to obtain a fourth image classification model;

When the target image to be classified is acquired, invoking the fourth image classification model to classify the target image and obtain a classification result of the target image.
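How the label discrimination difficulty shapes the multi-level model is not spelled out in this passage. One plausible reading, shown here purely as an assumption, is to merge labels that are hard to tell apart into coarse groups, so that a first level chooses a group and a second level separates the labels inside it. The sketch below builds such groups with a small union-find; `group_confusable_labels`, the threshold, and the input layout are hypothetical.

```python
def group_confusable_labels(num_labels, confusion, threshold):
    """Merge labels whose pairwise confusion index exceeds the threshold.

    confusion: dict mapping (i, j) with i < j to a label confusion index.
    Returns a sorted list of label groups (each a list of label ids).
    """
    parent = list(range(num_labels))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), index in confusion.items():
        if index > threshold:
            parent[find(i)] = find(j)  # treat i and j as one coarse class

    groups = {}
    for label in range(num_labels):
        groups.setdefault(find(label), []).append(label)
    return sorted(groups.values())


# Hypothetical confusion indices for four labels.
confusion = {
    (0, 1): 0.90, (0, 2): 0.10, (0, 3): 0.05,
    (1, 2): 0.20, (1, 3): 0.10, (2, 3): 0.80,
}
print(group_confusable_labels(4, confusion, threshold=0.5))  # [[0, 1], [2, 3]]
```

Each resulting group could then back one second-level classifier, with a first-level classifier trained on the group ids.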

It should be noted that, when the device provided by the above embodiments implements its functions, the division into the above functional modules is used only as an example. In practical applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the device embodiments and the method embodiments provided above belong to the same concept; for the specific implementation process of the device, refer to the method embodiments, which is not repeated here.

An embodiment of the present application provides an image processing apparatus, which includes: a processor; a memory for storing instructions executable by the processor; wherein, the processor is configured to implement the above-mentioned method performed by the electronic device when executing the instructions.

An embodiment of the present application provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method executed by the electronic device.

An embodiment of the present application provides a non-volatile computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing method executed by the electronic device is realized.

A computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. A computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random-access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital video disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punched cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the foregoing.

Computer-readable program instructions or code described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.

Computer program instructions for performing the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the “C” language or similar programming languages. Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In cases involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), can execute the computer-readable program instructions, thereby realizing various aspects of the present application.

Aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for realizing the functions/actions specified in one or more blocks in the flowchart and/or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium, and these instructions cause computers, programmable data processing devices and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions includes an article of manufacture comprising instructions for implementing various aspects of the functions/actions specified in one or more blocks in the flowcharts and/or block diagrams.

It is also possible to load computer-readable program instructions into a computer, other programmable data processing device, or other equipment, so that a series of operational steps are performed on the computer, other programmable data processing device, or other equipment to produce a computer-implemented process, so that instructions executed on computers, other programmable data processing devices, or other devices implement the functions/actions specified in one or more blocks in the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the figures show the architecture, functions and operations of possible implementations of apparatuses, systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which includes one or more executable instructions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.

It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with hardware (such as circuits or an application-specific integrated circuit (ASIC)), or can be implemented with a combination of hardware and software, such as firmware.

Although the present application has been described here in conjunction with various embodiments, in the process of implementing the claimed application, those skilled in the art can understand and implement other variations of the disclosed embodiments. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that these measures cannot be combined to advantage.

Various embodiments of the present application have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and alterations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen to best explain the principles of each embodiment, its practical application, or its improvement over technology in the market, or to enable others of ordinary skill in the art to understand each embodiment disclosed herein.

Claims

An image processing method, characterized in that the method comprises:

obtaining a data set and a model pool, the data set comprising a plurality of sample images, and the model pool comprising a plurality of pre-trained first image classification models;

inputting the data set into each first image classification model of the model pool respectively to obtain prediction distribution results respectively corresponding to the plurality of first image classification models, each prediction distribution result being used to indicate the probability distribution of each sample image over a plurality of image labels corresponding to the data set;

determining, according to the plurality of prediction distribution results, a sample classification difficulty corresponding to at least one sample image in the data set, the sample classification difficulty being used to indicate the degree of difficulty for the sample image to be classified by a model.
The method according to claim 1, wherein the determining, according to the plurality of prediction distribution results, of the sample classification difficulty corresponding to at least one sample image in the data set comprises:

for each of the at least one sample image, determining the perplexity of the sample image according to the plurality of prediction distribution results and the number of first image classification models in the model pool, the perplexity being used to indicate the sample classification difficulty corresponding to the sample image.
The method according to claim 2, wherein the perplexity of the sample image is positively correlated with the entropy of the predicted distribution result of the sample image corresponding to each of the first image classification models, and is negatively correlated with the number of the first image classification models, and the entropy of the predicted distribution result corresponding to a first image classification model is determined according to the probabilities that the first image classification model outputs for the sample image on each of the image labels.
The method according to any one of claims 1 to 3, wherein the method further comprises:

determining, according to the plurality of prediction distribution results, the label discrimination difficulty between any two image labels in the plurality of image labels, the label discrimination difficulty being used to indicate the degree of difficulty for a model to distinguish between the two image labels.
The method according to claim 4, wherein the determining, according to the plurality of prediction distribution results, of the label discrimination difficulty between any two image labels in the plurality of image labels comprises:

for any two image labels in the plurality of image labels, determining the label confusion index between the two image labels according to the plurality of prediction distribution results and the number of first image classification models in the model pool, the label confusion index being used to indicate the label discrimination difficulty between the two image labels.
The method according to claim 5, wherein the label confusion index between any two image labels is determined according to the similarity of the probability distribution of the data set on the two image labels.
The method according to any one of claims 1 to 6, wherein, before the obtaining of the data set and the model pool, the method further comprises:

training the original second image classification model according to the training set to obtain the first image classification model, the training set including a plurality of training images;

wherein at least two of the first image classification models in the model pool have different model training parameters, and the model training parameters include at least one of the training set, the type of the second image classification model, and the model training duration.
The method according to any one of claims 1 to 6, wherein the method further comprises:

Filtering the data set according to the sample classification difficulty corresponding to each of the plurality of sample images in the data set and a preset sample classification difficulty threshold;

for each sample image in the filtered data set, correcting the image label of the sample image according to the target image label output by the largest proportion of first image classification models in the model pool.
The method according to any one of claims 4 to 6, wherein the method further comprises:

constructing a third image classification model according to the label discrimination difficulty between any two of the image labels in the data set, the third image classification model being a multi-level classification model;

training the third image classification model according to the training set to obtain a fourth image classification model;

When the target image to be classified is obtained, the fourth image classification model is invoked to perform classification processing on the target image to obtain a classification result of the target image.
An electronic device, characterized in that the electronic device comprises:

processor;

memory for storing processor-executable instructions;

Wherein, the processor is configured to implement the method of any one of claims 1-9 when executing the instructions.
A non-volatile computer-readable storage medium, on which computer program instructions are stored, wherein, when the computer program instructions are executed by a processor, the method according to any one of claims 1-9 is implemented.