WO2022105336A1

WO2022105336A1 - Image classification method and electronic device

Info

Publication number: WO2022105336A1
Application number: PCT/CN2021/114146
Authority: WO
Inventors: 申世伟; 李家宏; 李思则; 李岩
Original assignee: 北京达佳互联信息技术有限公司
Priority date: 2020-11-23
Filing date: 2021-08-23
Publication date: 2022-05-27
Also published as: CN112434178A

Abstract

Provided are an image classification method and an electronic device. The image classification method comprises: determining a first probability of a target image on the basis of a first image feature extracted from the target image; then, where the first probability is greater than a first probability threshold, determining a second probability of the target image on the basis of a second image feature extracted from the target image and an object feature of a target object associated with the target image; and finally, where the second probability is greater than a second probability threshold, determining that the target object belongs to a target category.

Description

Image classification method and electronic device

This application is based on the Chinese patent application with the application number of 202011325685.9 and the filing date of November 23, 2020, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is incorporated herein by reference.

Technical field

The present application relates to the field of image detection, and in particular, to an image classification method, apparatus, electronic device and storage medium.

Background art

With the development of computer vision technology, image content understanding and analysis are becoming more and more intelligent. The classification task based on image information is an important application of computer vision.

With the increase of image information, how to efficiently decompose and classify images is particularly important in special scenarios such as security audit and abnormal behavior detection.

SUMMARY OF THE INVENTION

In a first aspect, the embodiments of the present application provide an image classification method, including:

performing feature extraction on the target image to obtain the first image feature of the target image;

determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to represent the possibility that the target image belongs to a target category;

In response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;

determining a second probability of the target image based on the second image feature and the object feature, where the second probability is used to represent the possibility that the target image belongs to the target category;

In response to the second probability being greater than a second probability threshold, it is determined that the target object belongs to the target category.

In some embodiments, the method further includes:

In response to the first probability being less than or equal to the first probability threshold, it is determined that the target image does not belong to the target category.

In some embodiments, the method further includes:

in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set for storing images requiring manual processing;

In response to the second probability being less than or equal to the third probability threshold, it is determined that the target image does not belong to the target category.
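As an illustration only, the threshold logic of the above embodiments can be sketched as follows. The function name `classify`, the string labels, and the `second_stage` callable (standing in for the second-stage computation) are hypothetical and do not appear in the application; the sketch also assumes that the third probability threshold is lower than the second.

```python
from typing import Callable


def classify(
    first_probability: float,
    second_stage: Callable[[], float],
    t1: float,
    t2: float,
    t3: float,
) -> str:
    """Route one image through the two-stage threshold logic.

    second_stage is only called when the first probability exceeds t1,
    mirroring the "in response to" wording of the method.
    """
    if not t3 < t2:
        raise ValueError("assumed ordering: third threshold < second threshold")
    if first_probability <= t1:
        return "not_target"
    second_probability = second_stage()
    if second_probability > t2:
        return "target"
    if second_probability > t3:
        return "manual_review"  # the specified task set for manual processing
    return "not_target"
```

Passing the second stage as a callable keeps the more expensive second feature extraction from running for images already rejected in the first stage.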

In some embodiments, the third probability threshold is determined based on a precision-recall variation curve, a recall rate, and an accuracy rate, where the precision-recall variation curve describes the relationship among the recall rate, the accuracy rate, and the third probability threshold.
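The application does not state how the threshold is read off the curve. The following is a minimal sketch under two assumptions: the accuracy rate is computed as precision over the images scored above the threshold, and the chosen threshold is the lowest candidate whose precision meets a required floor. The names `pr_curve`, `pick_threshold`, and `min_precision` are hypothetical.

```python
def pr_curve(scores, labels, thresholds):
    """Tabulate (threshold, precision, recall) for each candidate threshold.

    scores: second-stage probabilities; labels: 1 for target, 0 otherwise.
    """
    total_positive = sum(labels)
    points = []
    for t in thresholds:
        predicted = [s > t for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / total_positive if total_positive else 0.0
        points.append((t, precision, recall))
    return points


def pick_threshold(points, min_precision):
    """Assumed rule: lowest threshold whose precision meets the floor."""
    eligible = [p for p in points if p[1] >= min_precision]
    return min(eligible, key=lambda p: p[0])[0] if eligible else None
```

Choosing the lowest eligible threshold keeps recall as high as the precision floor allows; other selection rules would fit the same table.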

In a second aspect, an embodiment of the present application provides a method for training an image classification model, where the image classification model includes a first model and a second model, and the method includes:

acquiring a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

Remove sample images that do not belong to the target category and that have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set;

The first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;

Classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each sample image, where the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;

Constructing a second sample set based on sample images whose probability corresponding to the first sample set is greater than a specified probability threshold;

The second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.

In some embodiments, samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;

The method also includes:

In response to the need to improve the recall rate of the positive samples, setting the loss weight of the first model to a value greater than 1;

In response to the need to improve the recall rate of the negative samples, the loss weight of the first model is set to a value less than 1.

In some embodiments, the removing from the first sample set sample images that do not belong to the target category and have a similarity with the target category greater than a specified similarity, including:

An intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;

Classify and identify each sample image in the first sample set based on the intermediate model, to obtain the similarity between each sample image and the target category, where the similarity is used to indicate the possibility that the sample image belongs to the target category;

For any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.

In some embodiments, the constructing the second sample set based on the sample images belonging to the first sample set with a corresponding probability greater than a specified probability threshold includes:

adding the sample images whose corresponding probability in the first sample set is greater than the specified probability threshold to the third sample set;

Each sample image in the third sample set is cropped for multiple times to obtain multiple cropped sample images;

The second sample set is obtained by adding a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set.

In some embodiments, the method further includes:

For any sample image in the second sample set, perform feature recognition on the object of interest in the sample image to obtain image features of the sample image, where the image features are used to indicate at least one feature of the object of interest;

Based on the image feature, obtain the object identifier of the sample object associated with any of the sample images;

Based on the object identification, an object feature of the sample object is obtained.

In some embodiments, performing feature identification on the object of interest in any sample image to obtain image features of any sample image, including:

In response to the plurality of objects of interest being included in any one of the sample images, sequentially acquiring at least one feature of at least one of the objects of interest from the any one of the sample images according to the size order of the plurality of objects of interest , to obtain the image features of any of the sample images.
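A minimal sketch of the size-ordered feature acquisition follows. The application does not specify how size is measured or in which direction the order runs; the sketch assumes size is bounding-box area and that larger objects come first. `DetectedObject`, `image_features`, and `max_objects` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DetectedObject:
    label: str
    width: int
    height: int
    features: List[float]  # features already recognized for this object


def image_features(objects: List[DetectedObject], max_objects: int = 1) -> List[float]:
    """Collect features from objects of interest, largest area first (assumed order)."""
    ordered = sorted(objects, key=lambda o: o.width * o.height, reverse=True)
    collected: List[float] = []
    for obj in ordered[:max_objects]:
        collected.extend(obj.features)
    return collected
```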

In a third aspect, the present application also provides an image classification device, the device comprising:

a first feature extraction module, configured to perform feature extraction on the target image to obtain a first image feature of the target image;

a first probability determination module, configured to determine a first probability of the target image based on a first image feature of the target image, where the first probability is used to indicate a possibility that the target image belongs to a target category;

a first probability judgment module, configured to extract a second image feature from the target image in response to the first probability being greater than a first probability threshold, and obtain an object feature of a target object associated with the target image;

The first probability judgment module is further configured to determine a second probability of the target image based on the second image feature and the object feature, where the second probability is used to indicate the likelihood that the target image belongs to the target category;

The second probability judgment module is configured to determine that the target object belongs to the target category in response to the second probability being greater than a second probability threshold.

In some embodiments, the first probability determination module is further configured to:

In some embodiments, the first probability judgment module is further configured to:

In a fourth aspect, the present application also provides a training device for an image classification model, the image classification model includes a first model and a second model, and the device includes:

a first acquisition module, configured to acquire a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

a target sample set acquisition module, configured to remove from the first sample set sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;

A first training module, configured to obtain the first model by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;

A probability acquisition module, configured to classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each of the sample images, where the probability is used to indicate the likelihood that the corresponding sample image belongs to the target category;

a second acquisition module, configured to construct a second sample set based on sample images whose probability corresponding to the first sample set is greater than a specified probability threshold;

The second training module is configured to obtain the second model by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.

The first training module is further configured to set the loss weight of the first model to a value greater than 1 in response to the need to improve the recall rate of the positive samples, and to set the loss weight of the first model to a value less than 1 in response to the need to improve the recall rate of the negative samples.

In some embodiments, the target sample set acquisition module includes:

a first training unit, configured to obtain an intermediate model by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;

A similarity obtaining unit, configured to classify and identify each sample image in the first sample set based on the intermediate model, to obtain the similarity between each of the sample images and the target category, where the similarity is used to represent the likelihood that the sample image belongs to the target category;

The filtering unit is configured to, for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.

In some embodiments, the second obtaining module includes:

a third sample set obtaining unit, configured to add sample images whose corresponding probability in the first sample set is greater than a specified probability threshold to the third sample set;

a cropping processing unit, configured to crop each sample image in the third sample set for multiple times to obtain a plurality of cropped sample images;

The second sample set acquiring unit is configured to add a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.

In some embodiments, the apparatus further includes: a feature extraction module;

The feature extraction module includes:

A feature recognition unit, configured to perform feature recognition on an object of interest in any sample image in the second sample set, to obtain image features of any sample image, and the image features are used to indicate at least one feature of the object of interest;

an obtaining unit, configured to obtain the object identifier of the sample object associated with any one of the sample images based on the image feature;

The portrait acquisition unit is configured to acquire the object feature of the sample object based on the object identifier.

In some embodiments, the feature identification unit is configured to, in response to any one of the sample images including a plurality of objects of interest, sequentially obtain at least one feature of at least one object of interest from that sample image according to the size order of the plurality of objects of interest, to obtain the image features of the sample image.

In a fifth aspect, another embodiment of the present application further provides an electronic device, comprising at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to implement the steps of:

In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:

In a sixth aspect, another embodiment of the present application further provides an electronic device, comprising at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to implement the steps of:

The instructions executed by the at least one processor are also used to implement the following steps:

In response to the plurality of objects of interest being included in the any sample image, sequentially acquiring at least one feature of at least one object of interest from the any sample image according to the size order of the plurality of objects of interest , to obtain the image features of any of the sample images.

In a seventh aspect, another embodiment of the present application further provides a non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to cause a computer to implement the following steps:

In an eighth aspect, another embodiment of the present application further provides a non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to cause a computer to implement the following steps:

In a ninth aspect, another embodiment of the present application further provides a computer program product, comprising computer instructions, wherein the computer instructions implement the following steps when executed by a processor:

In a tenth aspect, another embodiment of the present application further provides a computer program product, comprising computer instructions, wherein the computer instructions implement the following steps when executed by a processor:

In the embodiments of the present application, the first model and the second model are combined to classify the target image, wherein the first model improves the recall rate and the second model improves the accuracy rate, thereby improving the overall performance of the image classification method.

Description of drawings

FIG. 1 is an application scenario diagram of the image classification method provided by an embodiment of the present application;

FIG. 2 is a flowchart of a training method for an image classification model provided by an embodiment of the present application;

FIG. 3 is a flowchart of training a first model provided by an embodiment of the present application;

FIG. 4 is a flowchart of acquiring a target sample set provided by an embodiment of the present application;

FIG. 5 is a flowchart of training a second model provided by an embodiment of the present application;

FIG. 6 is a flowchart of constructing a second sample set provided by an embodiment of the present application;

FIG. 7 is a flowchart of extracting image features and acquiring object features provided by an embodiment of the present application;

FIG. 8 is a flowchart of an image classification method provided by an embodiment of the present application;

FIG. 9 is a schematic structural diagram of an image classification model provided by an embodiment of the present application;

FIG. 10 is a device diagram of an image classification device provided by an embodiment of the present application;

FIG. 11 is a device diagram of a training device for an image classification model provided by an embodiment of the present application;

FIG. 12 is a diagram of an electronic device of an image classification method provided by an embodiment of the present application.

Detailed description

With the development of computer vision technology, image content understanding and analysis are becoming more and more intelligent. The classification task based on image information is an important application of computer vision; with the increase of image information, how to efficiently decompose and classify images is particularly important in special scenarios such as security audit and abnormal behavior detection. In these special scenes, the natural occurrence rate of a certain class of images is very low (a few parts per 10,000). For example, only a few target images may appear in ten thousand pictures.

This application proposes an image classification method that uses two stages to complete image classification. The first stage is used to achieve guaranteed recall, and the second stage further analyzes the output of the first stage to achieve guaranteed classification accuracy.

In the embodiment of the present application, the first stage uses the first model to perform feature extraction on the target image, and the images that cannot be accurately processed by the first model are further analyzed by the second model in the second stage. The second model in the second stage is a decision tree-based model. The second model analyzes the features of multiple dimensions based on the method of fusing multiple features to ensure the accuracy of the classification results. The two models in this application perform their respective functions, which improves the overall recognition effect and performance of the model.

The embodiments of the present application process images of different categories in the same way. For ease of understanding, the embodiments of the present application take whether the images contain illegal content as an example for description, and the illegal content includes but is not limited to political content, violent content, terrorist content, and the like.

The image classification method in the embodiments of the present application will be described in detail below with reference to the accompanying drawings.

For ease of understanding, the following describes the technical solutions provided by the embodiments of the present application by taking classifying and identifying illegal images as an example. It should be understood that the image classification method provided in the embodiment of the present application can also be applied to other classification tasks, which is not limited in the embodiment of the present application.

In some embodiments, as shown in FIG. 1 , FIG. 1 is an application scenario diagram of the image classification method provided by the embodiment of the present application. The application scenario includes: terminal device 101, server 102, network 103, and storage 104;

The terminal device 101 uploads a picture, which is stored in the storage 104 through the server 102, and the trained model is deployed on the server 102; during application, the server 102 obtains the picture from the storage 104 and classifies it based on the deployed model.

In some embodiments, the server 102 can not only obtain the target image through the picture uploaded by the terminal 101, but also can obtain the target image from the short video, which is not limited in this application.

In the image classification method provided in the embodiment of the present application, first, feature extraction is performed on the target image based on the trained first model, and the first probability of the target image is determined based on the extracted first image feature; then, the second image feature is extracted from the target image, the object feature of the target object associated with the target image is acquired, and the second probability of the target image is determined based on the second image feature and the object feature.

For ease of understanding, the embodiments of the present application describe the image classification method provided by the embodiments of the present application based on two parts, model training and model use.

First, the training of image classification model

In some embodiments, the samples include positive samples and negative samples: samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples. Taking the application scenario of detecting whether a sample image violates the rules as an example, when the goal is to detect the images that include violating content, the images including violating content are set as positive samples and the loss weight is set to a value greater than 1. The model is trained with labeled sample images, that is, sample images labeled as to whether they belong to the target category. The training method is as follows:

FIG. 2 is a flowchart of the training method of the image classification model provided by the embodiment of the present application, that is, FIG. 2 shows the training process of the image classification model. Taking execution by a server, and an image classification model that includes the first model and the second model, as an example, the training method of the image classification model includes the following steps:

In step 201, a first sample set is obtained, the first sample set includes a plurality of sample images, and each sample image is associated with a pre-marked category;

In step 202, remove from the first sample set sample images that do not belong to the target category and whose similarity with the target category is greater than the specified similarity, to obtain a target sample set;

In step 203, the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;

In step 204, classify and identify each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, and the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;

In step 205, a second sample set is constructed based on the sample images whose probability corresponding to the first sample set is greater than the specified probability threshold;

In step 206, the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
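Steps 201 to 206 can be sketched as the following data flow. This is a simplified illustration, not the applicant's implementation: `train` is a hypothetical callable that takes labeled samples and returns a scoring function, the same `train` stands in for both the image model and the second model, and `featurize` is a hypothetical stand-in for combining image features with object features.

```python
def train_pipeline(first_set, train, specified_similarity,
                   specified_probability, featurize):
    """first_set: list of (image, is_target) pairs (step 201)."""
    # Step 202: score every sample with an intermediate model and drop
    # non-target samples that look too similar to the target category.
    intermediate = train(first_set)
    target_set = [
        (image, is_target)
        for image, is_target in first_set
        if is_target or intermediate(image) <= specified_similarity
    ]
    # Step 203: train the first model on the cleaned target sample set.
    first_model = train(target_set)
    # Step 204: score the full first sample set with the first model.
    probabilities = [first_model(image) for image, _ in first_set]
    # Step 205: keep samples scoring above the specified probability threshold.
    second_set = [
        sample
        for sample, p in zip(first_set, probabilities)
        if p > specified_probability
    ]
    # Step 206: train the second model on the derived features.
    second_model = train([(featurize(image), y) for image, y in second_set])
    return first_model, second_model
```

Note that step 204 scores the original first sample set rather than the cleaned target sample set, so confusable negatives removed in step 202 can still reach the second model.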

For ease of understanding, the training processes of the first model and the second model are respectively described below.

1. Train the first model

As shown in FIG. 3 , the flowchart of training the first model provided by the embodiment of the present application, taking execution by a server as an example, includes the following steps:

In step 301: a first sample set is obtained, the first sample set includes a plurality of sample images, and each sample image is associated with a pre-marked category.

The category associated with the sample image may be marked manually or automatically by the server, which is not limited in this application.

For example, the first sample set includes multiple manually annotated illegal images and multiple manually annotated normal images.

In step 302: remove sample images that do not belong to the target category and that have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set.

In one embodiment, as shown in FIG. 4 , which is a flowchart of obtaining a target sample set provided by an embodiment of the present application, in order to obtain the first model through training, this step is implemented based on the following steps 401 to 403 .

In step 401: an intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;

The intermediate model is a model obtained by training based on the first sample set, and the server obtains the above-mentioned first model by training on the basis of the intermediate model. For any sample image in the first sample set, the server uses the sample image as the input of the model, and the category associated with the sample image as the expected output, that is, the label in supervised learning, to train the model.

In step 402: classify and identify each sample image in the first sample set based on the intermediate model, to obtain the similarity between each sample image and the target category, and the similarity is used to indicate the possibility that the sample image belongs to the target category;

For any sample image in the first sample set, the server can input the sample image into the intermediate model, and the intermediate model classifies and identifies the sample image to obtain its probability, which indicates the possibility that the sample image belongs to the target category. The server uses this probability as the similarity between the sample image and the target category.

In some embodiments, the server can also calculate the similarity between the sample image and the target category by using the similarity calculation formula. Other methods for calculating the similarity are also applicable in this application, which is not limited in this application.

In step 403: for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.

For example: a sample image does not belong to the target category, but the similarity with the target category is 90%, and the specified similarity is 50%, at this time, the sample image is removed from the first sample set.
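The filtering rule of steps 401 to 403 can be sketched as follows, assuming the similarity has already been produced by the intermediate model. `Sample` and `build_target_sample_set` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    name: str
    is_target: bool    # pre-marked category: belongs to the target category or not
    similarity: float  # probability output by the intermediate model


def build_target_sample_set(first_sample_set: List[Sample],
                            specified_similarity: float) -> List[Sample]:
    """Drop non-target samples whose similarity exceeds the specified similarity."""
    return [
        s for s in first_sample_set
        if s.is_target or s.similarity <= specified_similarity
    ]
```

With a specified similarity of 50%, a non-target sample scored at 90% is removed, while target samples are always kept regardless of their score.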

In some embodiments, samples that belong to the target category are positive samples, and samples that do not belong to the target category are negative samples; the server can set the loss weight of the first model according to requirements, and the setting method is as follows:

In response to the need to improve the recall rate of positive samples, set the loss weight of the first model to a value greater than 1;

In response to the need to improve the recall rate of negative samples, the loss weight of the first model is set to a value less than 1.

For example, a loss weight of 0.5 means that more attention is paid to the accuracy of identifying negative samples, in the hope that the model can identify negative samples accurately. In that case, in response to the model predicting that a sample has a low probability of being a negative sample, the server determines the sample to be a positive sample, so the recall of positive samples is guaranteed. A loss weight of 2 means that more attention is paid to the accuracy of identifying positive samples.
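The application does not give the loss formula to which the loss weight is applied. One common realization, shown here purely as an assumption, scales the positive-class term of binary cross-entropy; `weighted_bce` is a hypothetical name.

```python
import math


def weighted_bce(p: float, y: int, loss_weight: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy with the positive-class term scaled by loss_weight.

    p: predicted probability of the target category; y: 1 for a positive
    sample, 0 for a negative sample. This form is assumed, not taken from
    the application.
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(loss_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
```

Under this form, a weight greater than 1 makes a misclassified positive sample cost more than a misclassified negative sample, and a weight less than 1 does the reverse.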

In step 303: a first model is obtained by training based on the sample images in the target sample set and the categories associated with the sample images in the target sample set.

Here, the server can use the sample images in the target sample set as the input of the model, and use the category associated with the sample images as the expected output of the model, that is, the labels in supervised learning, to train the model until the model converges, thereby obtaining the above-mentioned first model.

For example, the server inputs the labeled sample images into a deep learning image recognizer such as ResNet-50, Inception-v3, or EfficientNet-B3, sets the learning rate to 0.001, and iterates 80 times based on the optimizer to obtain an intermediate model, denoted M-stage1-v0. The server then classifies and identifies each sample image in the first sample set based on the intermediate model, cleans the first sample set based on the recognition results by removing negative samples that are easily confused with positive samples, and obtains the target sample set. Training on the target sample set yields the above-mentioned first model, denoted M-stage1-v1.

In some embodiments, the judgment condition for model convergence is that the loss of the model no longer decreases, or the number of training times reaches a specified number of training times. It should be noted that the conditions for judging that the second model is trained to convergence and the first model is trained to convergence are the same, which will not be repeated in the following.
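The two convergence conditions can be checked as in the following sketch. The application only says that the loss "no longer decreases"; the `patience` window and `min_delta` tolerance used to make that testable are assumptions, as is the name `has_converged`.

```python
def has_converged(loss_history, max_epochs, patience=1, min_delta=0.0):
    """Return True when training should stop.

    Stops when the specified number of training iterations is reached, or
    when the best loss in the last `patience` entries is no better than the
    best loss seen before them (by more than min_delta).
    """
    if len(loss_history) >= max_epochs:
        return True
    if len(loss_history) <= patience:
        return False  # not enough history to compare yet
    best_before = min(loss_history[:-patience])
    recent_best = min(loss_history[-patience:])
    return recent_best >= best_before - min_delta
```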

In some embodiments, after obtaining the first model through training, the server classifies and identifies each sample image in the first sample set based on the first model, and obtains a probability corresponding to each sample image, where the probability is used to indicate the likelihood that the corresponding sample image belongs to the target category. The server constructs a second sample set based on the probability corresponding to each sample image and the first sample set, and the second sample set is used to train the second model.

2. Train the second model

FIG. 5 is a flowchart of training the second model provided by an embodiment of the present application. Taking execution by a server as an example, the training includes the following steps:

In step 501: construct a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;

In one embodiment, as shown in FIG. 6 , FIG. 6 is a flowchart of constructing a second sample set provided by an embodiment of the present application, and this step is implemented based on the following steps 601 to 603 .

In step 601: add sample images whose corresponding probability in the first sample set is greater than a specified probability threshold to the third sample set.

The third sample set is empty, or includes at least one sample image, which is not limited in this embodiment of the present application.

In step 602: each sample image in the third sample set is cropped multiple times to obtain multiple cropped sample images;

Because the number of sample images added to the third sample set by the server is small, the server can expand the third sample set by cropping the sample images in it, which can also make the second model more general.

For example, the server crops a sample image into 5 images, which can be used as new sample images for model training.

In step 603 : adding the plurality of cropped sample images and the categories associated with the plurality of cropped sample images to a third sample set to obtain a second sample set.
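Steps 601 to 603 can be sketched as follows. The application only states that a sample image is cropped into 5 images; the specific scheme used here (four corner crops plus a center crop), the nested-list image representation, and the function names `five_crop` and `build_second_sample_set` are assumptions for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # toy grayscale image as rows of pixel values


def five_crop(image: Image, crop_h: int, crop_w: int) -> List[Image]:
    """Return the four corner crops and the center crop of an image."""
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop size exceeds image size")
    tops_lefts = [
        (0, 0),
        (0, width - crop_w),
        (height - crop_h, 0),
        (height - crop_h, width - crop_w),
        ((height - crop_h) // 2, (width - crop_w) // 2),
    ]
    return [
        [row[left:left + crop_w] for row in image[top:top + crop_h]]
        for top, left in tops_lefts
    ]


def build_second_sample_set(
    third_sample_set: List[Tuple[Image, int]], crop_h: int, crop_w: int
) -> List[Tuple[Image, int]]:
    """Add every crop, with its parent's category, to the third sample set."""
    second = list(third_sample_set)
    for image, category in third_sample_set:
        second.extend((crop, category) for crop in five_crop(image, crop_h, crop_w))
    return second


image = [[r * 4 + c for c in range(4)] for r in range(4)]
second_sample_set = build_second_sample_set([(image, 1)], 2, 2)
print(len(second_sample_set))
```

Each cropped image inherits the category of the image it was cut from, so one labeled image in the third sample set becomes six labeled samples in the second sample set.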

In some embodiments, after obtaining the second sample set, the server can further extract image features of each sample image in the second sample set and object features of sample objects associated with each sample image. As shown in FIG. 7 , FIG. 7 is a flowchart of extracting image features and acquiring object features according to an embodiment of the present application. The steps of extracting image features and acquiring object features are implemented based on the following steps 701 to 703 .

In step 701: for any sample image in the second sample set, perform feature recognition on the object of interest in the sample image to obtain the image feature of the sample image, where the image feature is used to indicate at least one characteristic of the object of interest;

Wherein, in response to the sample image including multiple objects of interest, the server sequentially acquires at least one feature of at least one object of interest from the sample image according to the size order of the multiple objects of interest, and obtains the image features of the sample image.

For example, taking a face as the object of interest, the server extracts the features of the faces from the sample image in the order of face size, and obtains the image features. The image features can be age, gender, etc. The server obtains the features of the faces of no more than three persons in the sample image.

In step 702: based on the image feature, obtain the object identifier of the sample object associated with the sample image;

In step 703: based on the object identifier, obtain the object feature of the sample object.

The object features of the sample object include at least one of the following: violations in the last 7 days, user age, gender, city, and historical browsing records. The content included in the object features is determined according to the application scenario, which is not limited in this application.
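The rule in step 701 of taking features from at most three faces in order of size can be sketched as follows. Several details are assumptions not stated in the application: that "size order" means largest first, that a detected face is a dictionary with `width`, `height`, `age` and `gender` keys, and that missing faces are padded with -1.0 to keep the vector length fixed.

```python
from typing import Dict, List


def image_features_from_faces(faces: List[Dict], max_faces: int = 3) -> List[float]:
    """Flatten (age, gender) of the largest faces into a fixed-length vector."""
    # Assumption: faces are ranked by bounding-box area, largest first.
    ranked = sorted(faces, key=lambda f: f["width"] * f["height"], reverse=True)
    features: List[float] = []
    for face in ranked[:max_faces]:
        features.extend([float(face["age"]), float(face["gender"])])
    # Pad with -1.0 so that images with fewer faces still yield a fixed length.
    features.extend([-1.0] * (2 * max_faces - len(features)))
    return features


faces = [
    {"width": 10, "height": 10, "age": 30, "gender": 1},
    {"width": 20, "height": 20, "age": 25, "gender": 0},
]
print(image_features_from_faces(faces))
```

A fixed-length vector is convenient because the image features are later combined with the object features as input to the second model.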

In step 502, a second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.

The second model obtained by the server training can accurately extract the features in the images, and classify the images according to the extracted features to obtain sample images belonging to the target category.

For example, the server uses a machine learning model such as XGBoost (eXtreme Gradient Boosting) for training, with the learning rate set to 0.03, the maximum tree depth set to 6, and the parameter regularization coefficient set to 2. The parameters are adjusted based on the categorical cross-entropy loss, and training yields the second model, denoted M-stage2-v1.

By combining the first model with the second model, the recall rate of the first model is improved and the accuracy rate of the second model is improved, and the two models each perform their own duties, thereby improving the overall performance of the image classification model provided by the embodiment of the present application.
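One possible mapping of the hyperparameters in this example onto XGBoost's scikit-learn-style parameter names is shown below. It is a sketch only: the application does not say which regularization term the coefficient 2 refers to (the L2 term `reg_lambda` is assumed here), and the binary objective is an assumption based on there being a single target category.

```python
# Hypothetical mapping of the quoted hyperparameters onto XGBoost parameter names.
xgb_params = {
    "learning_rate": 0.03,  # learning rate from the example
    "max_depth": 6,  # maximum tree depth from the example
    "reg_lambda": 2,  # assumption: the regularization coefficient is the L2 term
    "objective": "binary:logistic",  # assumption: target category vs. not
    "eval_metric": "logloss",  # cross-entropy in the two-class case
}

# With the xgboost package installed, these could be passed as:
#   model = xgboost.XGBClassifier(**xgb_params)
#   model.fit(fused_features, labels)
# where fused_features and labels are placeholders for the second sample set.
print(xgb_params)
```

For more than two categories, the objective `multi:softprob` with `eval_metric` set to `mlogloss` would be the corresponding multi-class choice.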

Second, the use of image classification models

FIG. 8 is a flowchart of the image classification method provided by this application. The image classification method is implemented by steps 801 to 805:

In step 801: perform feature extraction on the target image to obtain a first image feature of the target image.

The server can obtain the target image first, and then perform feature extraction on the target image. The server can acquire the target image through real-time collection, or can obtain a previously collected target image from a database, which is not limited in this embodiment of the present application.

In step 802: based on the first image feature of the target image, determine a first probability of the target image, where the first probability is used to indicate the possibility that the target image belongs to the target category.

Wherein, the larger the value of the first probability, the higher the possibility that the target image belongs to the target category.

In step 803: in response to the first probability being greater than the first probability threshold, extract the second image feature from the target image, and obtain the object feature of the target object associated with the target image;

In some embodiments, it is determined that the target image does not belong to the target category in response to the first probability being less than or equal to the first probability threshold.

In step 804, a second probability of the target image is determined based on the second image feature and the object feature, where the second probability is used to indicate the possibility that the target image belongs to the target category.

The server can fuse the second image feature and the object feature, and then determine the second probability of the target image based on the fused feature.

In step 805: in response to the second probability being greater than the second probability threshold, it is determined that the target object belongs to the target category.

In some embodiments, in response to the second probability being greater than the third probability threshold and less than or equal to the second probability threshold, the target image is assigned to a specified task set for storing images requiring manual processing.

For example, take a second probability of 80%, a second probability threshold of 90%, and a third probability threshold of 70%. Since the second probability is smaller than the second probability threshold, the server cannot determine that the target image belongs to the target category; however, since the second probability is greater than the third probability threshold, the similarity between the target image and the target category is high. To improve the accuracy of the determination, the server assigns the target image to the specified task set. The specified task set is a task queue to be processed in the manual processing stage, so that the server can screen out difficult images for manual review.

In some embodiments, the target image is determined not to belong to the target category in response to the second probability being less than or equal to the third probability threshold.
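The decision logic of steps 801 to 805, together with the manual-review and rejection branches, can be sketched as follows. The function name `classify_image`, the string labels, and the first threshold value 0.5 in the usage lines are assumptions for illustration; the values 0.9 and 0.7 follow the example above. The second model is passed as a callable so that it only runs when the first probability exceeds the first probability threshold.

```python
from typing import Callable


def classify_image(
    first_probability: float,
    second_probability_fn: Callable[[], float],
    first_threshold: float,
    second_threshold: float,
    third_threshold: float,
) -> str:
    """Two-stage decision: cheap first stage, then a stricter second stage."""
    if third_threshold >= second_threshold:
        raise ValueError("third threshold must be below the second threshold")
    if first_probability <= first_threshold:
        return "not_target"  # rejected by the first model
    second_probability = second_probability_fn()  # evaluated only when needed
    if second_probability > second_threshold:
        return "target"
    if second_probability > third_threshold:
        return "manual_review"  # assigned to the specified task set
    return "not_target"


# Usage with the thresholds from the example (0.5 is an assumed first threshold).
print(classify_image(0.8, lambda: 0.80, 0.5, 0.9, 0.7))
print(classify_image(0.8, lambda: 0.95, 0.5, 0.9, 0.7))
```

With a second probability of 80% the image falls between the third and second thresholds and is routed to manual review; with 95% it is classified as the target category.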

According to the image classification method provided by the embodiment of the present application, when the image needs to be classified, the image to be tested only needs to be input into the image classification model, and the second probability threshold is set as required to effectively classify the image.

In some embodiments, when using the image classification model, the server can set the third probability threshold based on a precision-recall variation curve. The precision-recall variation curve describes the relationship among a recall parameter used to describe the recall rate, a precision parameter used to describe the accuracy rate of determining the target category, and the third probability threshold. That is to say, the precision-recall variation curve is a three-dimensional correspondence among the recall rate, the accuracy rate and the third probability threshold. When there is a clear requirement for the recall rate and the accuracy rate, the server determines the corresponding third probability threshold based on the precision-recall variation curve and the required recall rate and accuracy rate, so that different third probability thresholds can be selected according to different business requirements.
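Selecting a threshold from such a curve can be sketched as follows. This is an illustration under assumptions: the curve is represented as a list of (threshold, precision, recall) points computed on a labeled validation set, the toy scores and labels are invented, and choosing the lowest qualifying threshold is one possible policy, not one stated in the application.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]  # (threshold, precision, recall)


def precision_recall_points(
    scores: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Point]:
    """Compute precision and recall at each candidate threshold."""
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s <= t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        points.append((t, precision, recall))
    return points


def pick_threshold(
    points: Sequence[Point], min_precision: float, min_recall: float
) -> Optional[float]:
    """Return the lowest threshold meeting both requirements, or None."""
    for t, precision, recall in sorted(points):
        if precision >= min_precision and recall >= min_recall:
            return t
    return None


# Invented validation scores and labels, for illustration only.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.30]
labels = [1, 1, 0, 1, 0, 0]
points = precision_recall_points(scores, labels, [0.5, 0.6, 0.7, 0.8])
print(pick_threshold(points, min_precision=0.7, min_recall=0.9))
```

Different business requirements simply change `min_precision` and `min_recall`; when no point on the curve satisfies both, the function returns `None` and the requirements have to be relaxed.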

For ease of understanding, the following describes the structure of the image classification model provided by the embodiment of the present application. FIG. 9 is a schematic structural diagram of the image classification model provided by the embodiment of the present application, and execution by a server is taken as an example. The target image is input into the first model 810. The first model performs feature extraction on the target image to extract the first image feature, and determines a first probability of the target image, where the first probability is used to represent the likelihood that the target image belongs to the target category. Since the first model has the function of guaranteeing recall, that is, the first model can classify images suspected of belonging to the target category into the target category, in response to the first probability output by the first model being less than or equal to the first probability threshold, the server determines that the target image does not belong to the target category; in response to the first probability being greater than the first probability threshold, the server determines that the target image belongs to the target category. In some embodiments, since the first model is designed to guarantee recall, the accuracy with which the server determines the target image to be of the target category may not meet the business requirements, and the server can classify and identify the target image again based on the second model to ensure the accuracy of the classification result. That is, as shown in FIG. 9 , when the second probability that the target image belongs to the target category is greater than the second probability threshold, it is determined that the target object belongs to the target category.

It should be noted that, all the embodiments of the present application can be implemented independently or in combination with other embodiments, which are regarded as the protection scope of the present application.

FIG. 10 is an apparatus diagram of an image classification apparatus provided by an embodiment of the present application. As shown in FIG. 10, an image classification apparatus 900 is proposed, including:

The first feature extraction module 901 is configured to perform feature extraction on the target image to obtain the first image feature of the target image;

The first probability determination module 902 is configured to determine the first probability of the target image based on the first image feature of the target image, where the first probability is used to indicate the possibility that the target image belongs to the target category;

The first probability judgment module 903 is configured to, in response to the first probability being greater than the first probability threshold, extract the second image feature from the target image, and obtain the object feature of the target object associated with the target image;

The first probability judgment module 903 is further configured to determine a second probability of the target image based on the second image feature and the object feature, where the second probability is used to indicate the possibility that the target image belongs to the target category;

The second probability judgment module 904 is configured to determine that the target object belongs to the target category in response to the second probability being greater than the second probability threshold.

In response to the second probability being greater than the third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set, where the specified task set is used to store images requiring manual processing;

In some embodiments, the third probability threshold is determined based on a precision-recall variation curve, a recall rate, and an accuracy rate, and the precision-recall variation curve is used to describe the relationship among the recall rate, the accuracy rate, and the third probability threshold.

FIG. 11 is an apparatus diagram of an apparatus for training an image classification model provided by an embodiment of the present application. As shown in FIG. 11 , an apparatus 1000 for training an image classification model is proposed, including:

The first acquisition module 1001 is configured to acquire a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

The target sample set acquisition module 1002 is configured to remove sample images that do not belong to the target category and have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set;

The first training module 1003 is configured to obtain the first model by training based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;

The probability acquisition module 1004 is configured to classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each sample image, where the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;

The second acquisition module 1005 is configured to construct a second sample set based on the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold;

The second training module 1006 is configured to obtain the second model by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.

The first training module is further configured to set the loss weight of the first model to a value greater than 1 in response to the need to improve the recall rate of the positive samples, and to set the loss weight of the first model to a value less than 1 in response to the need to improve the recall rate of the negative samples.

In some embodiments, the target sample set acquisition module includes:

The similarity obtaining unit is configured to classify and identify each sample image in the first sample set based on the intermediate model, and obtain the similarity between each sample image and the target category, where the similarity is used to indicate the likelihood that the sample image belongs to the target category;

The filtering unit is configured to, for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.

In some embodiments, the second obtaining module includes:

The third sample set obtaining unit is configured to add sample images whose corresponding probability in the first sample set is greater than the specified probability threshold to the third sample set;

The second sample set obtaining unit is configured to add a plurality of the cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.

In some embodiments, the apparatus further includes: a feature extraction module; the feature extraction module includes:

A feature identification unit, configured to perform feature identification on an object of interest in any sample image in the second sample set, to obtain an image feature of the any sample image, and the image feature is used to indicate at least one characteristic of the object of interest;

an acquisition unit, configured to acquire the object identifier of the sample object associated with the any sample image based on the image feature;

In some embodiments, the feature identification unit is configured to, in response to the sample image including a plurality of objects of interest, sequentially acquire at least one feature of at least one object of interest from the sample image according to the size order of the plurality of objects of interest, to obtain the image feature of the sample image.

An electronic device according to another exemplary embodiment of the present application is described below.

In some embodiments, the electronic device of the present application includes at least one processor and at least one memory. Wherein, the memory stores program code, and when the program code is executed by the processor, the processor can execute the following steps:

Perform feature extraction on the target image to obtain the first image feature of the target image;

determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to represent the possibility that the target image belongs to the target category;

In response to the first probability being greater than the first probability threshold, extracting a second image feature from the target image, and acquiring the object feature of the target object associated with the target image;

Based on the second image feature and the object feature, determine a second probability of the target image, where the second probability is used to represent the possibility that the target image belongs to the target category;

In some embodiments, the electronic device in the embodiments of the present application includes at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to implement the following steps:

obtaining a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

Remove sample images from the first sample set that do not belong to the target category and have a similarity with the target category greater than the specified similarity to obtain a target sample set;

Based on the first model, classify and identify each sample image in the first sample set, to obtain a probability corresponding to each sample image, and the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;

Constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold;

In response to the need to improve the recall rate of the positive sample, set the loss weight of the first model to a value greater than 1;

Based on the intermediate model, classify and identify each sample image in the first sample set, and obtain the similarity between each sample image and the target category, and the similarity is used to indicate the possibility that the sample image belongs to the target category;

For any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.

adding sample images whose corresponding probability in the first sample set is greater than the specified probability threshold to the third sample set;

Each sample image in the third sample set is cropped multiple times to obtain a plurality of cropped sample images;

The second sample set is obtained by adding a plurality of the cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set.

For any sample image in the second sample set, perform feature recognition on the object of interest in the sample image to obtain an image feature of the sample image, where the image feature is used to indicate at least one feature of the object of interest;

Based on the image feature, obtain the object identifier of the sample object associated with any sample image;

Based on the object identification, the object characteristics of the sample object are obtained.

In response to the sample image including a plurality of objects of interest, sequentially acquire at least one feature of at least one object of interest from the sample image according to the size order of the plurality of objects of interest, to obtain the image features of the sample image.

The electronic device 130 according to this embodiment of the present application is described below with reference to FIG. 12 .

As shown in FIG. 12, the electronic device 130 takes the form of a general electronic device. Components of the electronic device 130 may include, but are not limited to: the above-mentioned at least one processor 131 , the above-mentioned at least one memory 132 , and a bus 133 connecting different system components (including the memory 132 and the processor 131 ).

Bus 133 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus structures.

Memory 132 may include readable media in the form of volatile memory, such as random access memory (RAM) 1321 and/or cache memory 1322 , and may further include read only memory (ROM) 1323 .

The memory 132 may also include a program/utility 1325 having a set (at least one) of program modules 1324 including, but not limited to, an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.

Electronic device 130 may also communicate with one or more external devices 134 (e.g., keyboards, pointing devices, etc.), with one or more devices that enable a user to interact with electronic device 130, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 130 to communicate with one or more other electronic devices. Such communication may take place through input/output (I/O) interface 135. Also, the electronic device 130 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 136. As shown, network adapter 136 communicates with other modules of electronic device 130 via bus 133. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with electronic device 130, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives and data backup storage systems.

In some embodiments, an image classification method provided by the present application is implemented in the form of a computer program product, the computer program product includes computer instructions, and the computer instructions are executed by a processor to implement the following steps:

In some embodiments, the training method of an image classification model provided by the present application is implemented in the form of a computer program product, and the computer program product includes computer instructions, and the computer instructions are executed by a processor to implement the following steps:

In some embodiments, a computer program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), an optical fiber, portable compact disk read only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

The program product for image classification of embodiments of the present application may employ a portable compact disk read only memory (CD-ROM) and include program code, and may be executed on an electronic device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.

A readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, carrying readable program code therein. Such propagated data signals may take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing. A readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for performing the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's electronic device, partly on the user's electronic device, as a stand-alone software package, partly on the user's electronic device and partly on a remote electronic device, or entirely on the remote electronic device or server. In the case of a remote electronic device, the remote electronic device may be connected to the user's electronic device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external electronic device (e.g., via the Internet using an Internet service provider).

In some embodiments, a non-volatile computer-readable storage medium is provided, and the non-volatile computer-readable storage medium stores a computer program for causing a computer to implement the following steps:

In some embodiments, a non-volatile computer-readable storage medium is provided, and the non-volatile computer-readable storage medium stores a computer program for causing a computer to implement the steps of: obtaining a first sample set, where the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

Exemplarily, the embodiment of the present application provides an image classification method, including:

Get the target image;

Using the first image feature of the target image to determine the first probability that the target image belongs to the target category;

When the first probability is higher than the first probability threshold, extract the second image feature from the target image, and obtain the object feature of the target object associated with the target image; use a decision tree to perform fusion processing on the second image feature and the object feature to obtain the second probability that the target image belongs to the target category;

When the second probability is higher than the second probability threshold, it is determined that the target object belongs to the target category.

In some embodiments, after determining the first probability that the target image belongs to the target category, the method further includes:

When the first probability is less than or equal to the first probability threshold, it is determined that the target image does not belong to the target category.

In some embodiments, after determining the second probability that the target image belongs to the target category, the method further includes:

When the second probability is higher than the third probability threshold and less than the second probability threshold, assign the target object to the specified task set;

When the second probability is smaller than the third probability threshold, it is determined that the target image does not belong to the target category.

In some embodiments, a precision-recall variation curve is pre-stored, and the precision-recall variation curve is used to describe the relationship among a recall parameter used to describe the recall rate, a precision parameter used to describe the accuracy rate of determining the target category, and the third probability threshold;

The third probability threshold is set according to the recall index and the accuracy index.

In some embodiments, feature extraction is performed on the target image using a pre-trained first model, and a first probability that the target image belongs to the target category is determined, wherein the first model is trained according to the following method:

Filter out the sample images in the first sample set that do not belong to the target category and the similarity with the target category is higher than the specified similarity to obtain the target sample set;

The sample image in the target sample set is used as the input of the first model, and the category of the sample image is used as the expected output of the first model, and the first model is trained until the training of the first model converges.

In some embodiments, a second model based on a decision tree is used to fuse the second image feature and the object feature to obtain a second probability that the target image belongs to the target category, wherein the second model is trained according to the following method:

Obtain sample images whose probability of belonging to the target category is greater than the specified probability threshold to construct a second sample set;

For any sample image in the second sample set, extract image features from the sample image, and obtain object features of the target object associated with the sample image;

The second model is trained using the image features and the object features until the training of the second model converges.
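The idea of a tree-based model operating on fused features can be illustrated with a deliberately tiny stand-in. The sketch below is not the second model itself: it assumes that fusion means concatenating the two feature vectors, and it fits a single-split decision stump (the simplest decision tree) by exhaustive search. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, float, float]  # (feature index, split, p_left, p_right)


def fuse(image_feat: Sequence[float], object_feat: Sequence[float]) -> List[float]:
    """Fuse by concatenation (an assumption; the text only says 'fuse')."""
    return list(image_feat) + list(object_feat)


def train_stump(rows: List[List[float]], labels: List[int]) -> Stump:
    """Fit a one-split tree that minimises squared error on 0/1 labels.

    p_left and p_right are the fractions of positive labels on each side
    of the split and serve as the predicted probabilities.
    """
    best = None
    for j in range(len(rows[0])):
        for split in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= split]
            right = [y for r, y in zip(rows, labels) if r[j] > split]
            if not left or not right:
                continue  # a split must separate the data into two groups
            p_left = sum(left) / len(left)
            p_right = sum(right) / len(right)
            err = (sum((y - p_left) ** 2 for y in left)
                   + sum((y - p_right) ** 2 for y in right))
            if best is None or err < best[0]:
                best = (err, j, split, p_left, p_right)
    if best is None:
        raise ValueError("no valid split: all rows are identical")
    return best[1:]


def predict_proba(stump: Stump, row: Sequence[float]) -> float:
    """Probability of the target category for one fused feature row."""
    j, split, p_left, p_right = stump
    return p_left if row[j] <= split else p_right
```

Because the stump searches over every column of the fused vector, it can pick an object feature as the split when the image features alone do not separate the classes, which is the motivation for fusing the two sources. A practical system would use a full tree ensemble rather than a single split.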

In some embodiments, the samples of the target category are positive samples, and the samples that do not belong to the target category are negative samples;

When it is necessary to improve the recall rate of the positive sample, set the loss weight of the first model to a value greater than 1;

When the negative sample recall rate needs to be improved, the loss weight of the first model is set to a value less than 1.
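The effect of the loss weight can be seen in a weighted binary cross-entropy. The sketch below assumes that the weight multiplies the positive-sample term of the loss, which is one common convention; the text does not give the loss formula, and `weighted_bce` and `pos_weight` are hypothetical names.

```python
import math
from typing import Sequence


def weighted_bce(probs: Sequence[float], labels: Sequence[int],
                 pos_weight: float = 1.0, eps: float = 1e-12) -> float:
    """Mean binary cross-entropy with a weight on the positive-class term.

    pos_weight > 1 makes a missed positive cost more than a missed negative,
    which pushes training toward higher positive recall; pos_weight < 1 does
    the opposite and favours recall of negatives.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)
```

With `pos_weight=2.0`, a positive sample predicted at 0.5 contributes twice the loss of a negative sample predicted at 0.5.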

In some embodiments, filtering out sample images in the first sample set that do not belong to the target category and have a similarity with the target category higher than a specified similarity includes:

Before the first model is trained, use the sample image in the first sample set as the input of the first model, use the category of the sample image as the expected output of the first model, and train the first model until the training of the first model converges;

Input each sample image in the first sample set into the first model, and obtain the probability that the sample image belongs to the target category as the similarity with the target category;

If the sample image does not belong to the target category and the similarity with the target category is higher than the specified similarity, the sample image is filtered out.
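The filtering steps above can be sketched as follows. The callable `similarity_fn` stands in for the initially trained model that returns the probability of the target category; it, the `Sample` alias, and the function name are hypothetical, and labels are assumed to be 1 for the target category and 0 otherwise.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, int]  # (image identifier, label); label 1 = target category


def filter_confusable_negatives(samples: List[Sample],
                                similarity_fn: Callable[[str], float],
                                specified_similarity: float) -> List[Sample]:
    """Drop negatives that the initial model scores as close to the target.

    Positive samples are always kept; a negative sample is removed only
    when its similarity is strictly greater than specified_similarity.
    """
    kept = []
    for image_id, label in samples:
        similarity = similarity_fn(image_id)
        if label == 0 and similarity > specified_similarity:
            continue  # confusable negative: leave it out of the target sample set
        kept.append((image_id, label))
    return kept
```

The result is the target sample set on which the first model is then trained.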

In some embodiments, the acquisition of sample images whose probability of belonging to the target category is greater than a specified probability threshold to construct a second sample set includes:

Perform multiple cropping processing on each sample image in the second sample set to obtain multiple cropped sample images;

The respective categories of the sample images in the second sample set and the multiple cropped sample images are acquired, and the third sample set composed of the sample images and corresponding categories is constructed.
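Multiple cropping can be sketched with plain nested lists standing in for images. The choice of four corner crops plus one centre crop is an assumption made for illustration, as is letting each crop inherit the category of its source image; the text does not specify the crop positions or how the categories of cropped images are acquired. All names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # row-major grid of pixel values


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def multi_crop(image: Image, height: int, width: int) -> List[Image]:
    """Four corner crops followed by a centre crop."""
    h, w = len(image), len(image[0])
    if height > h or width > w:
        raise ValueError("crop size exceeds image size")
    offsets = [(0, 0), (0, w - width), (h - height, 0), (h - height, w - width),
               ((h - height) // 2, (w - width) // 2)]
    return [crop(image, t, l, height, width) for t, l in offsets]


def augment(samples: List[Tuple[Image, int]], height: int,
            width: int) -> List[Tuple[Image, int]]:
    """Return each original followed by its crops, all with the source category."""
    out = []
    for image, category in samples:
        out.append((image, category))
        out.extend((c, category) for c in multi_crop(image, height, width))
    return out
```

Each input image yields six samples in the output: the original and its five crops.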

In some embodiments, extracting image features from the sample image and obtaining object features of the target object associated with the sample image include:

Perform feature recognition on the object of interest in the sample image, and obtain feature information of the object of interest from the sample image;

Obtain the object identifier of the target object associated with the sample image;

The object feature of the target object is acquired according to the object identifier.

In some embodiments, performing feature identification on the object of interest in the sample image, and obtaining feature information of the object of interest from the sample image, includes:

When the sample image includes a plurality of objects of interest, the characteristic information of at least one object of interest is sequentially acquired from the sample image according to the size order of the objects of interest.
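Acquiring features in size order can be sketched as below. The sketch assumes that size means bounding-box area, that the order is largest first, and that the output is zero-padded to a fixed length so it can feed a model with a fixed input size; none of these details is stated in the text. `DetectedObject` and `features_by_size` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DetectedObject:
    x: int
    y: int
    width: int
    height: int
    features: List[float]

    @property
    def area(self) -> int:
        return self.width * self.height


def features_by_size(objects: List[DetectedObject], max_objects: int,
                     feature_len: int) -> List[float]:
    """Concatenate the features of the largest objects, largest first.

    Keeps at most max_objects objects and pads with zeros so the result
    always has max_objects * feature_len entries.
    """
    ordered = sorted(objects, key=lambda o: o.area, reverse=True)[:max_objects]
    out: List[float] = []
    for obj in ordered:
        if len(obj.features) != feature_len:
            raise ValueError("every object must carry feature_len features")
        out.extend(obj.features)
    out.extend([0.0] * (max_objects * feature_len - len(out)))
    return out
```

Python's `sorted` is stable, so objects with equal area keep their original detection order.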

Exemplarily, the embodiment of the present application provides a training method for an image classification model, including:

Using the sample image in the target sample set as the input of the first model, and using the category of the sample image as the expected output of the first model, train the first model until the training of the first model converges;

Use the trained first model to classify and identify each sample image in the first sample set, and obtain the probability of each sample image belonging to the target category;

When the positive sample recall rate needs to be improved, the loss weight of the first model is set to a value greater than 1;

The respective categories of the sample images in the second sample set and the plurality of cropped sample images are acquired, and the third sample set composed of the sample images and corresponding categories is constructed.

In some embodiments, the image features are extracted from the sample image and the object features of the target object associated with the sample image are obtained, including:

Exemplarily, the present application also provides an image classification device, the device comprising:

an image acquisition module, configured to acquire a target image;

a first probability determination module, configured to use the first image feature of the target image to determine the first probability that the target image belongs to the target category;

The first probability judgment module is configured to extract the second image feature from the target image when the first probability is higher than the first probability threshold, obtain the object feature of the target object associated with the target image, and fuse the second image feature and the object feature to obtain a second probability that the target image belongs to the target category;

The second probability judgment module is configured to determine that the target object belongs to the target category when the second probability is higher than the second probability threshold.
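The way these modules chain together can be sketched as a two-stage cascade. The four callables are hypothetical placeholders for the first model, the second feature extractor, the object feature lookup, and the fusing second model; rejecting the image when the first probability does not exceed the first threshold follows the wording of the claims.

```python
from typing import Any, Callable, Sequence


def cascade_classify(image: Any,
                     first_prob_fn: Callable[[Any], float],
                     second_feature_fn: Callable[[Any], Sequence[float]],
                     object_feature_fn: Callable[[Any], Sequence[float]],
                     second_prob_fn: Callable[[Sequence[float], Sequence[float]], float],
                     first_threshold: float,
                     second_threshold: float) -> bool:
    """Return True when the image passes both probability thresholds.

    The second stage (feature extraction, object feature lookup, fusion)
    runs only for images whose first probability exceeds first_threshold.
    """
    if first_prob_fn(image) <= first_threshold:
        return False  # rejected by the first stage; second stage is skipped
    image_feat = second_feature_fn(image)
    object_feat = object_feature_fn(image)
    return second_prob_fn(image_feat, object_feat) > second_threshold
```

The benefit of this arrangement is that the more expensive second stage is only evaluated for images the first stage could not rule out.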

In some embodiments, a pre-trained first model is used to perform feature extraction on the target image, and a first probability that the target image belongs to the target category is determined, wherein the first model is trained according to the following modules:

a first sample set obtaining module, configured to obtain a first sample set, the first sample set includes a plurality of sample images, and each sample image is associated with a pre-marked category;

The filtering module is configured to filter out the sample images in the first sample set that do not belong to the target category and whose similarity with the target category is higher than the specified similarity, to obtain the target sample set;

A first model training module, configured to use the sample image in the target sample set as the input of the first model, use the category of the sample image as the expected output of the first model, and train the first model until the training of the first model converges.

In some embodiments, a second model based on a decision tree is used to fuse the second image feature and the object feature to obtain a second probability that the target image belongs to the target category, wherein the second model is trained by the following modules:

The second sample set obtaining module is configured to obtain sample images whose probability of belonging to the target category is greater than the specified probability threshold to construct a second sample set;

A feature extraction module, configured to extract image features from the sample image for any sample image in the second sample set, and obtain object features of the target object associated with the sample image;

The second model training module is configured to use the image feature and the object feature to train the second model until the second model training converges.

In some embodiments, the filtering module includes:

an initial training unit, configured to use the sample image in the first sample set as the input of the first model and the category of the sample image as the expected output of the first model before the first model is trained, training the first model until the training of the first model converges;

a similarity obtaining unit, configured to input each sample image in the first sample set into the first model, and obtain the probability that the sample image belongs to the target category as the similarity with the target category;

The filtering unit is configured to filter out the sample image if the sample image does not belong to the target category and the similarity with the target category is higher than the specified similarity.

In some embodiments, the second sample set acquisition module includes:

a cropping unit, configured to perform multiple cropping processes on each sample image in the second sample set, to obtain multiple cropped sample images;

The third sample set obtaining unit is configured to obtain the respective categories of the sample images in the second sample set and the plurality of cropped sample images, and construct the third sample set consisting of the sample images and the corresponding categories.

In some embodiments, the feature extraction module includes:

a feature information acquisition unit, configured to perform feature recognition on the object of interest in the sample image, and obtain feature information of the object of interest from the sample image;

an object identification obtaining unit, configured to obtain the object identification of the target object associated with the sample image;

The object feature obtaining unit is configured to obtain the object feature of the target object according to the object identifier.

In some embodiments, the feature information acquisition unit includes:

Exemplarily, the present application also provides an apparatus for training an image classification model, the apparatus comprising:

The target sample set acquisition module is configured to filter out the sample images in the first sample set that do not belong to the target category and whose similarity with the target category is higher than the specified similarity, to obtain the target sample set;

a first training module, configured to use the sample image in the target sample set as the input of the first model, use the category of the sample image as the expected output of the first model, and train the first model until the training of the first model converges;

a probability acquisition module, configured to use the trained first model to classify and identify each sample image in the first sample set, and obtain the probability of each sample image belonging to the target category;

A second acquisition module, configured to acquire sample images whose probability of belonging to the target category is greater than a specified probability threshold to construct a second sample set;

The second training module is configured to train the second model using the image feature and the object feature until the training of the second model converges.

In some embodiments, the target sample set acquisition module includes:

A first training unit, configured to, before the first model is trained, take the sample image in the first sample set as the input of the first model, take the category of the sample image as the expected output of the first model, and train the first model until the training of the first model converges;

In some embodiments, the second obtaining module includes:

a cropping processing unit, configured to perform multiple cropping processes on each sample image in the second sample set, respectively, to obtain a plurality of cropped sample images;

In some embodiments, the feature extraction module includes:

a feature recognition unit, configured to perform feature recognition on the object of interest in the sample image, and obtain feature information of the object of interest from the sample image;

an acquisition unit, configured to acquire the object identifier of the target object associated with the sample image;

The portrait acquisition unit is configured to acquire the object feature of the target object according to the object identifier.

All the embodiments of the present disclosure can be implemented independently or in combination with other embodiments, which are all regarded as the protection scope required by the present disclosure.

Claims

An image classification method, wherein the method comprises:

performing feature extraction on the target image to obtain the first image feature of the target image;

determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to indicate the possibility that the target image belongs to a target category;

In response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;

determining a second probability of the target image based on the second image feature and the object feature, where the second probability is used to represent the possibility that the target image belongs to the target category;

In response to the second probability being greater than a second probability threshold, it is determined that the target object belongs to the target category.
The method of claim 1, wherein the method further comprises:

In response to the first probability being less than or equal to the first probability threshold, it is determined that the target image does not belong to the target category.
The method of claim 1, wherein the method further comprises:

in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set for storing images requiring manual processing;

In response to the second probability being less than or equal to the third probability threshold, it is determined that the target image does not belong to the target category.
The method according to claim 3, wherein the third probability threshold is determined based on a quasi-call variation curve, a recall rate and an accuracy rate, and the quasi-call variation curve is used to describe the relationship among the recall rate, the accuracy rate and the third probability threshold.
A training method for an image classification model, wherein the image classification model includes a first model and a second model, and the method includes:

acquiring a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

Remove sample images that do not belong to the target category and that have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set;

The first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;

Classify and identify each sample image in the first sample set based on the first model, and obtain a probability corresponding to each sample image, where the probability is used to indicate the likelihood that the corresponding sample image belongs to the target category;

Constructing a second sample set based on sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;

The second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
The method according to claim 5, wherein the samples belonging to the target category are positive samples, and the samples not belonging to the target category are negative samples;

The method also includes:

In response to the need to improve the recall rate of the positive samples, setting the loss weight of the first model to a value greater than 1;

In response to the need to improve the recall rate of the negative samples, the loss weight of the first model is set to a value less than 1.
The method according to claim 5, wherein the removing from the first sample set the sample images that do not belong to the target category and have a similarity with the target category greater than a specified similarity, comprises:

An intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;

Classify and identify each sample image in the first sample set based on the intermediate model, and obtain the similarity between each sample image and the target category, where the similarity is used to indicate the likelihood that the sample image belongs to the target category;

For any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.
The method according to claim 5, wherein the constructing the second sample set based on the sample images belonging to the first sample set with a corresponding probability greater than a specified probability threshold comprises:

adding the sample images whose corresponding probability in the first sample set is greater than the specified probability threshold to the third sample set;

Each sample image in the third sample set is cropped for multiple times to obtain multiple cropped sample images;

The second sample set is obtained by adding a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set.
The method of claim 5, wherein the method further comprises:

For any sample image in the second sample set, perform feature recognition on the object of interest in the sample image to obtain image features of the sample image, where the image features are used to indicate at least one feature of the object of interest;

Based on the image feature, obtain the object identifier of the sample object associated with any of the sample images;

Based on the object identification, an object feature of the sample object is obtained.
The method according to claim 9, wherein the performing feature identification on the object of interest in any of the sample images to obtain the image features of the any of the sample images, comprising:

In response to any one of the sample images including a plurality of objects of interest, sequentially acquiring at least one feature of at least one of the objects of interest from the sample image according to the size order of the plurality of objects of interest, to obtain the image features of the sample image.
An image classification device, wherein the device comprises:

a first feature extraction module, configured to perform feature extraction on the target image to obtain a first image feature of the target image;

a first probability determination module, configured to determine a first probability of the target image based on a first image feature of the target image, where the first probability is used to indicate a possibility that the target image belongs to a target category;

a first probability judgment module, configured to extract a second image feature from the target image in response to the first probability being greater than a first probability threshold, and obtain an object feature of a target object associated with the target image;

The first probability judgment module is further configured to determine a second probability of the target image based on the second image feature and the object feature, where the second probability is used to indicate the likelihood that the target image belongs to the target category;

The second probability judgment module is configured to determine that the target object belongs to the target category in response to the second probability being greater than a second probability threshold.
The apparatus of claim 11, wherein the first probability determination module is further configured to determine that the target image does not belong to the target category in response to the first probability being less than or equal to the first probability threshold.
The apparatus according to claim 11, wherein the first probability judgment module is further configured to:

in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set for storing images requiring manual processing;

In response to the second probability being less than or equal to the third probability threshold, it is determined that the target image does not belong to the target category.
The apparatus according to claim 13, wherein the third probability threshold is determined based on a quasi-call variation curve, a recall rate and an accuracy rate, and the quasi-call variation curve is used to describe the relationship among the recall rate, the accuracy rate and the third probability threshold.
An apparatus for training an image classification model, wherein the image classification model includes a first model and a second model, and the apparatus includes:

a first acquisition module, configured to acquire a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

a target sample set acquisition module, configured to remove from the first sample set sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;

a first training module, configured to obtain the first model by training based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;

A probability acquisition module, configured to classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each of the sample images, where the probability is used to indicate the likelihood that the corresponding sample image belongs to the target category;

a second acquisition module, configured to construct a second sample set based on sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;

The second training module is configured to obtain the second model by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
The apparatus according to claim 15, wherein the samples belonging to the target category are positive samples, and the samples not belonging to the target category are negative samples;

The first training module is further configured to set the loss weight of the first model to a value greater than 1 in response to the need to improve the recall rate of the positive samples, and to set the loss weight of the first model to a value less than 1 in response to the need to improve the recall rate of the negative samples.
The apparatus according to claim 15, wherein the target sample set acquisition module comprises:

a first training unit, configured to obtain an intermediate model by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;

A similarity obtaining unit, configured to classify and identify each sample image in the first sample set based on the intermediate model, to obtain the similarity between each of the sample images and the target category, where the similarity is used to indicate the likelihood that the sample image belongs to the target category;

The filtering unit is configured to, for any sample image in the first sample set, remove the sample image from the first sample set in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity.
The apparatus according to claim 15, wherein the second obtaining module comprises:

a third sample set obtaining unit, configured to add sample images whose corresponding probability in the first sample set is greater than a specified probability threshold to the third sample set;

a cropping processing unit, configured to crop each sample image in the third sample set for multiple times to obtain a plurality of cropped sample images;

The second sample set acquiring unit is configured to add a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.
The apparatus of claim 15, wherein the apparatus further comprises: a feature extraction module;

The feature extraction module includes:

A feature recognition unit, configured to perform feature recognition on an object of interest in any sample image in the second sample set, to obtain image features of any sample image, and the image features are used to indicate at least one feature of the object of interest;

an obtaining unit, configured to obtain the object identifier of the sample object associated with any one of the sample images based on the image feature;

The portrait acquisition unit is configured to acquire the object feature of the sample object based on the object identifier.
The apparatus according to claim 19, wherein the feature recognition unit is configured to, in response to any one of the sample images including a plurality of objects of interest, sequentially obtain at least one feature of at least one of the objects of interest from the sample image according to the size order of the plurality of objects of interest, to obtain the image features of the sample image.
An electronic device comprising at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to implement the following steps:

performing feature extraction on the target image to obtain the first image feature of the target image;

determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to represent the possibility that the target image belongs to a target category;

In response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;

determining a second probability of the target image based on the second image feature and the object feature, where the second probability is used to represent the possibility that the target image belongs to the target category;

In response to the second probability being greater than a second probability threshold, it is determined that the target object belongs to the target category.
The electronic device according to claim 21, wherein the instructions executed by the at least one processor are further used to implement the following steps:

In response to the first probability being less than or equal to the first probability threshold, it is determined that the target image does not belong to the target category.
The electronic device according to claim 21, wherein the instructions executed by the at least one processor are further used to implement the following steps:

in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set for storing images requiring manual processing;

In response to the second probability being less than or equal to the third probability threshold, it is determined that the target image does not belong to the target category.
The electronic device according to claim 23, wherein the third probability threshold is determined based on a quasi-call variation curve, a recall rate and an accuracy rate, the quasi-call variation curve being used to describe the relationship among the recall rate, the accuracy rate and the third probability threshold.
An electronic device comprising at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to implement the following steps:

acquiring a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

Remove sample images that do not belong to the target category and that have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set;

The first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;

Classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each sample image, where the probability is used to indicate the likelihood that the corresponding sample image belongs to the target category;

Constructing a second sample set based on sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;

The second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
The electronic device according to claim 25, wherein the samples belonging to the target category are positive samples, and the samples not belonging to the target category are negative samples;

The instructions executed by the at least one processor are also used to implement the following steps:

In response to the need to improve the recall rate of the positive samples, setting the loss weight of the first model to a value greater than 1;

In response to the need to improve the recall rate of the negative samples, the loss weight of the first model is set to a value less than 1.
The electronic device according to claim 25, wherein the instructions executed by the at least one processor are further used to implement the following steps:

An intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;

Classify and identify each sample image in the first sample set based on the intermediate model, to obtain the similarity between each sample image and the target category, where the similarity is used to indicate the likelihood that the sample image belongs to the target category;

For any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.
The electronic device of claim 25, wherein the instructions executed by the at least one processor are further used to implement the following steps:

adding the sample images whose corresponding probability in the first sample set is greater than the specified probability threshold to the third sample set;

Each sample image in the third sample set is cropped for multiple times to obtain multiple cropped sample images;

The second sample set is obtained by adding a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set.
The electronic device according to claim 25, wherein the instructions executed by the at least one processor are further used to implement the following steps:

For any sample image in the second sample set, performing feature recognition on the object of interest in the sample image to obtain image features of the sample image, where the image features are used to indicate at least one feature of the object of interest;

Based on the image features, obtaining the object identifier of the sample object associated with the sample image;

Based on the object identifier, obtaining an object feature of the sample object.
The electronic device according to claim 29, wherein the instructions executed by the at least one processor are further used to implement the following steps:

In response to a plurality of objects of interest being included in any one of the sample images, sequentially acquiring at least one feature of at least one of the objects of interest from the sample image according to the size order of the plurality of objects of interest, to obtain the image features of the sample image.
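The size-ordered feature acquisition can be sketched as below. The claim does not say how size is measured or whether the order is ascending or descending; the sketch assumes bounding-box area, largest first. The dictionary keys and the `max_objects` parameter are hypothetical.

```python
def image_features_by_size(objects, max_objects=None):
    """Concatenate per-object features, visiting larger objects first.

    objects: list of dicts with "width", "height" and "features" keys
        (assumed layout for detected objects of interest).
    max_objects: optional cap on how many objects contribute features.
    """
    # Assumption: size means bounding-box area, ordered largest first.
    ordered = sorted(objects,
                     key=lambda obj: obj["width"] * obj["height"],
                     reverse=True)
    if max_objects is not None:
        ordered = ordered[:max_objects]
    features = []
    for obj in ordered:
        features.extend(obj["features"])
    return features
```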
A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to make a computer realize the following steps:

performing feature extraction on the target image to obtain the first image feature of the target image;

determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to represent the possibility that the target image belongs to a target category;

In response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;

determining a second probability of the target image based on the second image feature and the object feature, where the second probability is used to represent the possibility that the target image belongs to the target category;

In response to the second probability being greater than a second probability threshold, it is determined that the target object belongs to the target category.
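The two-stage decision described in the steps above can be sketched as a cascade. The models and feature extractors are passed in as callables because the claim does not fix their form; every parameter name here is hypothetical.

```python
def classify_image(image, extract_first, predict_first,
                   extract_second, get_object_feature, predict_second,
                   first_threshold, second_threshold):
    """Return True only if both stages exceed their probability thresholds."""
    # Stage 1: first image feature -> first probability.
    first_feature = extract_first(image)
    first_probability = predict_first(first_feature)
    if first_probability <= first_threshold:
        return False  # second stage is skipped entirely

    # Stage 2: runs only for images that pass the first threshold, and
    # combines a second image feature with the associated object's feature.
    second_feature = extract_second(image)
    object_feature = get_object_feature(image)
    second_probability = predict_second(second_feature, object_feature)
    return second_probability > second_threshold
```

Because the second feature extraction and the object-feature lookup run only after the first threshold is passed, most images are rejected at the cost of the first stage alone.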
A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to make a computer realize the following steps:

acquiring a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

Remove sample images that do not belong to the target category and that have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set;

The first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;

Classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each sample image, where the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;

Constructing a second sample set based on sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;

The second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
A computer program product comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the following steps:

performing feature extraction on the target image to obtain the first image feature of the target image;

determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to represent the possibility that the target image belongs to a target category;

In response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;

Based on the second image feature and the object feature, a second probability of the target image is determined, and the second probability is used to represent the possibility that the target image belongs to the target category;

In response to the second probability being greater than a second probability threshold, it is determined that the target object belongs to the target category.
A computer program product comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the following steps:

acquiring a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;

Remove sample images that do not belong to the target category and that have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set;

The first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;

Classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each sample image, where the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;

Constructing a second sample set based on sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;

The second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.