WO2022105336A1 - Image classification method and electronic device - Google Patents

Image classification method and electronic device

Info

Publication number
WO2022105336A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
image
target
probability
sample set
Application number
PCT/CN2021/114146
Other languages
French (fr)
Chinese (zh)
Inventor
申世伟
李家宏
李思则
李岩
Original Assignee
北京达佳互联信息技术有限公司
Application filed by 北京达佳互联信息技术有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Definitions

  • the present application relates to the field of image detection, and in particular, to an image classification method, apparatus, electronic device and storage medium.
  • the embodiments of the present application provide an image classification method, including:
  • the second probability is used to represent the possibility that the target image belongs to the target category
  • the method further includes:
  • the method further includes:
  • the third probability threshold is determined based on a precision-recall variation curve, a recall rate, and an accuracy rate, and the precision-recall variation curve is used to describe the relationship among the recall rate, the accuracy rate, and the third probability threshold.
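The threshold selection described above can be sketched as follows. This is a minimal illustration, not the applicant's procedure: the function names `pr_curve` and `pick_threshold`, the use of precision for the "accuracy rate", and the selection rule (highest recall among thresholds that meet a required precision) are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def pr_curve(
    probs: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each candidate threshold."""
    curve = []
    for t in thresholds:
        tp = sum(1 for p, y in zip(probs, labels) if p > t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p > t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p <= t and y == 1)
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        curve.append((t, precision, recall))
    return curve


def pick_threshold(
    curve: Sequence[Tuple[float, float, float]], min_precision: float
) -> Optional[float]:
    """Hypothetical rule: highest recall among thresholds meeting min_precision."""
    candidates = [(recall, t) for t, precision, recall in curve if precision >= min_precision]
    if not candidates:
        return None
    # Ties on recall are broken in favour of the larger threshold.
    return max(candidates)[1]
```

In practice the required precision (or required recall) would come from the business requirement, and the curve would be computed on a held-out labeled set.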
  • an embodiment of the present application provides a method for training an image classification model, where the image classification model includes a first model and a second model, and the method includes:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
  • the method also includes:
  • the loss weight of the first model is set to a value less than 1.
  • removing, from the first sample set, sample images that do not belong to the target category and whose similarity with the target category is greater than a specified similarity includes:
  • An intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
  • the any sample image is removed from the first sample set.
  • the constructing the second sample set based on the sample images belonging to the first sample set with a corresponding probability greater than a specified probability threshold includes:
  • each sample image in the third sample set is cropped multiple times to obtain multiple cropped sample images
  • the second sample set is obtained by adding a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set.
  • the method further includes:
  • for any sample image in the second sample set, feature recognition is performed on the object of interest in the sample image to obtain image features of the sample image, where the image features are used to indicate at least one feature of the object of interest;
  • an object feature of the sample object is obtained.
  • performing feature identification on the object of interest in any sample image to obtain image features of any sample image including:
  • the present application also provides an image classification device, the device comprising:
  • a first feature extraction module configured to perform feature extraction on the target image to obtain a first image feature of the target image
  • a first probability determination module configured to determine a first probability of the target image based on a first image feature of the target image, where the first probability is used to indicate a possibility that the target image belongs to a target category;
  • a first probability judgment module configured to extract a second image feature from the target image in response to the first probability being greater than a first probability threshold, and obtain an object feature of a target object associated with the target image
  • the first probability judgment module is further configured to determine a second probability of the target image based on the second image feature and the object feature, where the second probability is used to indicate the likelihood that the target image belongs to the target category;
  • the second probability judgment module is configured to determine that the target object belongs to the target category in response to the second probability being greater than a second probability threshold.
  • the first probability determination module is further configured to:
  • the first probability judgment module is further configured to:
  • the third probability threshold is determined based on a precision-recall variation curve, a recall rate, and an accuracy rate, and the precision-recall variation curve is used to describe the relationship among the recall rate, the accuracy rate, and the third probability threshold.
  • the present application also provides a training device for an image classification model, the image classification model includes a first model and a second model, and the device includes:
  • a first acquisition module configured to acquire a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • a target sample set acquisition module configured to remove from the first sample set sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set
  • a first training module configured to obtain the first model by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • a probability acquisition module configured to classify and identify each sample image in the first sample set based on the first model, to obtain a probability corresponding to each of the sample images, where the probability is used to indicate the likelihood that the corresponding sample image belongs to the target category;
  • a second acquisition module configured to construct a second sample set based on sample images whose probability corresponding to the first sample set is greater than a specified probability threshold
  • the second training module is configured to obtain the second model by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
  • the first training module is further configured to set the loss weight of the first model to a value greater than 1 in response to a need to improve the recall rate of the positive samples, and to set the loss weight of the first model to a value less than 1 in response to a need to improve the recall rate of the negative samples.
  • the target sample set acquisition module includes:
  • a first training unit configured to obtain an intermediate model by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
  • a similarity obtaining unit configured to classify and identify each sample image in the first sample set based on the intermediate model, to obtain the similarity between each of the sample images and the target category, where the similarity is used to represent the likelihood that the sample image belongs to the target category;
  • the filtering unit is configured to, for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.
  • the second obtaining module includes:
  • a third sample set obtaining unit configured to add sample images whose corresponding probability in the first sample set is greater than a specified probability threshold to the third sample set;
  • a cropping processing unit configured to crop each sample image in the third sample set multiple times to obtain a plurality of cropped sample images
  • the second sample set acquiring unit is configured to add a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.
  • the apparatus further includes: a feature extraction module
  • the feature extraction module includes:
  • a feature recognition unit configured to perform feature recognition on an object of interest in any sample image in the second sample set, to obtain image features of any sample image, and the image features are used to indicate at least one feature of the object of interest;
  • an obtaining unit configured to obtain the object identifier of the sample object associated with any one of the sample images based on the image feature
  • the portrait acquisition unit is configured to acquire the object feature of the sample object based on the object identifier.
  • the feature identification unit is configured to, in response to the any sample image including a plurality of objects of interest, sequentially obtain at least one feature of at least one object of interest from the any sample image in order of the size of the plurality of objects of interest, to obtain the image feature of the any sample image.
  • another embodiment of the present application further provides an electronic device, comprising at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to implement the steps of:
  • the second probability is used to represent the possibility that the target image belongs to the target category
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • the third probability threshold is determined based on a precision-recall variation curve, a recall rate, and an accuracy rate, and the precision-recall variation curve is used to describe the relationship among the recall rate, the accuracy rate, and the third probability threshold.
  • another embodiment of the present application further provides an electronic device, comprising at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to implement the steps of:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
  • the instructions executed by the at least one processor are also used to implement the following steps:
  • the loss weight of the first model is set to a value less than 1.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • An intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
  • the any sample image is removed from the first sample set.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • each sample image in the third sample set is cropped multiple times to obtain multiple cropped sample images
  • the second sample set is obtained by adding a plurality of the cropped sample images and a category associated with the plurality of cropped sample images to the third sample set.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • for any sample image in the second sample set, feature recognition is performed on the object of interest in the sample image to obtain image features of the sample image, where the image features are used to indicate at least one feature of the object of interest;
  • an object feature of the sample object is obtained.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • another embodiment of the present application further provides a non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to cause a computer to implement the following steps:
  • the second probability is used to represent the possibility that the target image belongs to the target category
  • another embodiment of the present application further provides a non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to cause a computer to implement the following steps:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • another embodiment of the present application further provides a computer program product, comprising computer instructions, wherein the computer instructions implement the following steps when executed by a processor:
  • the second probability is used to represent the possibility that the target image belongs to the target category
  • another embodiment of the present application further provides a computer program product, comprising computer instructions, wherein the computer instructions implement the following steps when executed by a processor:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • the target image is classified by combining the first model and the second model, wherein the first model improves the recall rate and the second model improves the accuracy rate, which improves the overall performance of the image classification method.
  • FIG. 1 is an application scenario diagram of the image classification method provided by an embodiment of the present application
  • FIG. 2 is a flowchart of a training method for an image classification model provided by an embodiment of the present application
  • FIG. 3 is a flowchart of training a first model provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of acquiring a target sample set provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of training a second model provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of constructing a second sample set provided by an embodiment of the present application.
  • FIG. 7 is a flowchart of extracting image features and acquiring object features provided by an embodiment of the present application.
  • FIG. 8 is a flowchart of an image classification method provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an image classification model provided by an embodiment of the present application.
  • FIG. 10 is a device diagram of an image classification device provided by an embodiment of the present application.
  • FIG. 11 is a device diagram of a training device for an image classification model provided by an embodiment of the application.
  • FIG. 12 is a diagram of an electronic device of an image classification method provided by an embodiment of the present application.
  • This application proposes an image classification method that uses two stages to complete image classification.
  • the first stage is used to achieve guaranteed recall, and the second stage further analyzes the output of the first stage to achieve guaranteed classification accuracy.
  • the first stage uses the first model to perform feature extraction on the target image, and the images that cannot be accurately processed by the first model are further analyzed by the second model in the second stage.
  • the second model in the second stage is a decision tree-based model.
  • the second model analyzes the features of multiple dimensions based on the method of fusing multiple features to ensure the accuracy of the classification results.
  • the two models in this application perform their respective functions, which improves the overall recognition effect and performance of the model.
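The two-stage flow described above can be sketched as a simple cascade. This is a minimal sketch under stated assumptions: `classify_two_stage`, `first_model`, `second_model`, and `get_object_feature` are hypothetical names, and each model is treated as a callable that returns a probability, with feature extraction hidden inside it.

```python
from typing import Any, Callable


def classify_two_stage(
    image: Any,
    first_model: Callable[[Any], float],
    second_model: Callable[[Any, Any], float],
    get_object_feature: Callable[[Any], Any],
    first_threshold: float,
    second_threshold: float,
) -> bool:
    """Return True if the image is judged to belong to the target category."""
    # Stage 1: recall-oriented screening based on the image alone.
    first_probability = first_model(image)
    if first_probability <= first_threshold:
        return False
    # Stage 2: only images that pass stage 1 are re-scored using the image
    # together with the feature of the object associated with the image.
    object_feature = get_object_feature(image)
    second_probability = second_model(image, object_feature)
    return second_probability > second_threshold
```

Because the second model runs only on images that pass the first threshold, the more expensive multi-feature analysis is limited to the candidates the first stage could not settle.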
  • the embodiments of the present application process images of different categories in the same way.
  • the embodiments of the present application take whether the images contain illegal content as an example for description, and the illegal content includes but is not limited to political content, violent content, terrorist content, and the like.
  • FIG. 1 is an application scenario diagram of the image classification method provided by the embodiment of the present application.
  • the application scenario includes: terminal device 101, server 102, network 103, and storage 104;
  • the terminal device 101 uploads a picture, which is stored in the storage 104 through the server 102, and the trained model is deployed on the server 102; in use, the server 102 obtains the picture from the storage 104 and classifies it based on the deployed model.
  • the server 102 can obtain the target image not only from a picture uploaded by the terminal device 101 but also from a short video, which is not limited in this application.
  • the second image feature extracted from the target image and the object feature of the target object associated with the target image are used to determine the second probability of the target image.
  • the embodiments of the present application describe the image classification method provided by the embodiments of the present application based on two parts, model training and model use.
  • the samples include positive samples and negative samples, the samples belonging to the target category are positive samples, and the samples not belonging to the target category are negative samples.
  • the image including the violation content is set as a positive sample, and the loss weight is set to be greater than 1.
  • the labeled sample images refer to whether the sample images are labeled as belonging to the target category.
  • FIG. 2 shows the training process of the image classification model in an implementation, the process being executed by the server.
  • the training method of the image classification model includes the following steps:
  • a first sample set is obtained, the first sample set includes a plurality of sample images, and each sample image is associated with a pre-marked category;
  • In step 202, sample images that do not belong to the target category and whose similarity with the target category is greater than the specified similarity are removed from the first sample set to obtain a target sample set;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • In step 204, each sample image in the first sample set is classified and identified based on the first model to obtain a probability corresponding to each sample image, where the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;
  • a second sample set is constructed based on the sample images whose probability corresponding to the first sample set is greater than the specified probability threshold
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • the flowchart of training the first model provided by the embodiment of the present application includes the following steps:
  • In step 301, a first sample set is obtained; the first sample set includes a plurality of sample images, and each sample image is associated with a pre-marked category.
  • the category associated with the sample image may be marked manually or automatically by the server, which is not limited in this application.
  • the first sample set includes multiple manually annotated illegal images and multiple manually annotated normal images.
  • In step 302, sample images that do not belong to the target category and whose similarity with the target category is greater than a specified similarity are removed from the first sample set to obtain a target sample set.
  • FIG. 4 is a flowchart of obtaining a target sample set provided by an embodiment of the present application. To obtain the first model through training, this step is implemented based on the following steps 401 to 403.
  • In step 401, an intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
  • the intermediate model is a model obtained by training based on the first sample set
  • the server obtains the above-mentioned first model by training on the basis of the intermediate model.
  • the server uses the sample image as the input of the model and the category associated with the sample image as the expected output, that is, the label in supervised learning, to train the model.
  • In step 402, each sample image in the first sample set is classified and identified based on the intermediate model to obtain the similarity between each sample image and the target category, where the similarity is used to indicate the possibility that the sample image belongs to the target category;
  • for any sample image, the server can input the sample image into the intermediate model, which classifies and identifies the sample image to obtain its probability; the probability indicates the possibility that the sample image belongs to the target category, and the server uses the probability as the similarity between the sample image and the target category.
  • the server can also calculate the similarity between the sample image and the target category by using the similarity calculation formula.
  • Other methods for calculating the similarity are also applicable in this application, which is not limited in this application.
  • In step 403, for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, the sample image is removed from the first sample set.
  • for example, if a sample image does not belong to the target category but its similarity with the target category is 90% and the specified similarity is 50%, the sample image is removed from the first sample set.
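The cleaning rule of steps 402 and 403 can be sketched as a filter over the first sample set. This is an illustrative sketch only: the tuple layout `(image_id, is_target)` and the names `build_target_sample_set` and `similarity_of` are assumptions, and `similarity_of` stands in for the intermediate model's predicted probability.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, bool]  # (image_id, belongs_to_target_category); hypothetical layout


def build_target_sample_set(
    first_sample_set: List[Sample],
    similarity_of: Callable[[str], float],
    specified_similarity: float,
) -> List[Sample]:
    """Drop negatives that the intermediate model scores as too similar to the target."""
    target_sample_set = []
    for image_id, is_target in first_sample_set:
        confusing_negative = (not is_target) and similarity_of(image_id) > specified_similarity
        if not confusing_negative:
            target_sample_set.append((image_id, is_target))
    return target_sample_set


# Example mirroring the text: a negative with similarity 0.9 is removed at threshold 0.5.
scores = {"a": 0.95, "b": 0.9, "c": 0.1}
cleaned = build_target_sample_set(
    [("a", True), ("b", False), ("c", False)], scores.__getitem__, 0.5
)
```

Positive samples are always kept regardless of their score; only negatives above the specified similarity are removed.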
  • samples that belong to the target category are positive samples, and samples that do not belong to the target category are negative samples;
  • the server can set the loss weight of the first model according to requirements, and the setting method is as follows:
  • the loss weight of the first model is set to a value less than 1.
  • a loss weight of 0.5 means that more attention is paid to the accuracy of identifying negative samples, in the hope that the model identifies negative samples accurately.
  • the server determines the sample as a positive sample. At this time, the recall of positive samples is guaranteed.
  • loss weight 2 means that more attention is paid to the accuracy of positive sample recognition.
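One common way a loss weight like this enters training is as a multiplier on the positive-sample term of a binary cross-entropy loss. The source does not state the loss function, so the sketch below is an assumption: `weighted_bce` is a hypothetical name, and the weighted cross-entropy form is a stand-in for whatever loss the first model actually uses.

```python
import math


def weighted_bce(predicted_probability: float, label: int, loss_weight: float) -> float:
    """Binary cross-entropy with loss_weight applied to the positive-sample term.

    loss_weight > 1 penalises missed positives more (favours positive recall);
    loss_weight < 1 makes errors on negatives relatively more costly.
    """
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(predicted_probability, eps), 1.0 - eps)
    return -(loss_weight * label * math.log(p) + (1 - label) * math.log(1.0 - p))
```

With a weight of 2, a positive sample predicted at 0.5 contributes twice the loss it would at a weight of 1, while the loss for negative samples is unchanged.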
  • In step 303, a first model is obtained by training based on the sample images in the target sample set and the categories associated with the sample images in the target sample set.
  • the server can use the sample images in the target sample set as the input of the model, and use the category associated with the sample images as the expected output of the model, that is, the labels in supervised learning, to train the model until the model converges, and the above-mentioned first model is obtained.
  • the server inputs the labeled sample images into a deep learning image recognizer such as resnet50, inception-v3, or efficient-b3, sets the learning rate to 0.001, and iterates 80 times based on the optimizer to obtain an intermediate model, denoted M-stage1-v0. The server then classifies and identifies each sample image in the first sample set based on the intermediate model, cleans the first sample set based on the recognition results by removing negative samples that are easily confused with positive samples, and obtains the target sample set. Training on the target sample set yields the above-mentioned first model, denoted M-stage1-v1.
  • the judgment condition for model convergence is that the loss of the model no longer decreases, or the number of training times reaches a specified number of training times. It should be noted that the conditions for judging that the second model is trained to convergence and the first model is trained to convergence are the same, which will not be repeated in the following.
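The convergence condition above (the loss no longer decreases, or the number of training iterations reaches a specified number) can be sketched as a stopping check. The function name `should_stop` and the `patience` and `min_delta` parameters are assumptions added for illustration; the source does not say how "no longer decreases" is measured.

```python
from typing import Sequence


def should_stop(
    loss_history: Sequence[float],
    max_iterations: int,
    patience: int = 1,
    min_delta: float = 0.0,
) -> bool:
    """Stop when the iteration budget is used up or the loss has stopped decreasing."""
    # Condition 1: the specified number of training iterations has been reached.
    if len(loss_history) >= max_iterations:
        return True
    # Not enough history yet to compare recent losses with earlier ones.
    if len(loss_history) <= patience:
        return False
    # Condition 2: the best loss in the last `patience` iterations does not
    # improve on the best earlier loss by more than `min_delta`.
    best_before = min(loss_history[:-patience])
    best_recent = min(loss_history[-patience:])
    return best_recent >= best_before - min_delta
```

The same check can be reused for both the first model and the second model, since the text states that their convergence conditions are the same.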
  • the server classifies and identifies each sample image in the first sample set based on the first model and obtains a probability corresponding to each sample image, where the probability is used to indicate the likelihood that the corresponding sample image belongs to the target category.
  • the server constructs a second sample set based on the probability corresponding to each sample image and the first sample set, and the second sample set is used to train the second model.
  • the flowchart of training the second model provided by the embodiment of the present application includes the following steps:
  • In step 501, a second sample set is constructed based on sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
  • FIG. 6 is a flowchart of constructing a second sample set provided by an embodiment of the present application, and this step is implemented based on the following steps 601 to 603 .
  • step 601 add sample images whose corresponding probability in the first sample set is greater than a specified probability threshold to the third sample set.
  • the third sample set is empty, or includes at least one sample image, which is not limited in this embodiment of the present application.
  • each sample image in the third sample set is cropped for multiple times to obtain multiple cropped sample images
  • when the number of sample images added to the third sample set is small, the server can expand the third sample set by cropping its sample images, which also helps the second model generalize better.
  • the server crops a sample image into 5 images, which can be used as new sample images for model training.
  • step 603 adding the plurality of cropped sample images and the categories associated with the plurality of cropped sample images to a third sample set to obtain a second sample set.
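  • Steps 601 to 603 can be sketched as follows. The source says only that one sample image is cropped into 5 images; the four-corners-plus-center layout, the toy list-of-rows image type, the initially empty third sample set, and the names `five_crop` and `build_second_sample_set` are assumptions for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # toy grayscale image stored as rows of pixel values


def five_crop(image: Image, crop_h: int, crop_w: int) -> List[Image]:
    """Return four corner crops and one center crop, each crop_h x crop_w."""
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop size exceeds image size")
    origins = [
        (0, 0),
        (0, width - crop_w),
        (height - crop_h, 0),
        (height - crop_h, width - crop_w),
        ((height - crop_h) // 2, (width - crop_w) // 2),
    ]
    return [
        [row[left:left + crop_w] for row in image[top:top + crop_h]]
        for top, left in origins
    ]


def build_second_sample_set(
    first_set: List[Tuple[Image, int]],
    probabilities: List[float],
    threshold: float,
    crop_h: int,
    crop_w: int,
) -> List[Tuple[Image, int]]:
    # Step 601: keep samples whose probability exceeds the threshold.
    third_set = [s for s, p in zip(first_set, probabilities) if p > threshold]
    # Steps 602-603: add the crops, each inheriting the category of its source image.
    second_set = list(third_set)
    for image, label in third_set:
        second_set.extend((crop, label) for crop in five_crop(image, crop_h, crop_w))
    return second_set


image = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 image, values 0..15
second = build_second_sample_set([(image, 1), (image, 0)], [0.9, 0.2], 0.5, 2, 2)
print(len(second))
```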
  • the server can further extract image features of each sample image in the second sample set and object features of sample objects associated with each sample image.
  • FIG. 7 is a flowchart of extracting image features and acquiring object features according to an embodiment of the present application. The steps of extracting image features and acquiring object features are implemented based on the following steps 701 to 703 .
  • step 701, for any sample image in the second sample set, perform feature recognition on the object of interest in that sample image to obtain the image feature of that sample image, where the image feature indicates at least one feature of the object of interest;
  • in response to the sample image including multiple objects of interest, the server sequentially acquires at least one feature of at least one object of interest from the sample image according to the size order of the multiple objects of interest, and obtains the image features of the sample image.
  • for example, when the objects of interest are faces, the server extracts the facial features from the sample image in order of face size and obtains the image features.
  • the image features can be age, gender, etc.
  • the server obtains the features of faces of no more than three persons in the sample image.
  • step 702 based on the image feature, obtain the object identifier of the sample object associated with any sample image;
  • step 703 based on the object identifier, obtain the object feature of the sample object.
  • the object characteristics of the sample object include at least one of the following: violations in the last 7 days, user age, gender, city, and historical browsing records.
  • the content included in the object feature is determined according to the application scenario, which is not limited in this application.
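  • The size-ordered selection of objects of interest can be sketched as below. The detection record layout (`width`, `height`, `age`, `gender`) and the largest-first ordering are assumptions; the source states only that features are taken in size order and that faces of no more than three persons are used.

```python
from typing import Dict, List


def select_image_features(detections: List[Dict], max_objects: int = 3) -> List[Dict]:
    """Keep the attributes of the largest detected faces, largest first."""
    ranked = sorted(detections, key=lambda d: d["width"] * d["height"], reverse=True)
    return [{"age": d["age"], "gender": d["gender"]} for d in ranked[:max_objects]]


# Hypothetical face detections for one sample image.
faces = [
    {"width": 20, "height": 20, "age": 31, "gender": "f"},
    {"width": 60, "height": 80, "age": 25, "gender": "m"},
    {"width": 10, "height": 12, "age": 40, "gender": "m"},
    {"width": 40, "height": 50, "age": 19, "gender": "f"},
]
image_features = select_image_features(faces)
print(image_features)
```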
  • a second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • the second model obtained by the server training can accurately extract the features in the images, and classify the images according to the extracted features to obtain sample images belonging to the target category.
  • the server trains a machine learning model such as XGBoost (eXtreme Gradient Boosting) with the learning rate set to 0.03, the maximum tree depth set to 6, and the parameter regularization coefficient set to 2; the parameters are adjusted based on the categorical cross-entropy loss, and training yields the second model, denoted M-stage2-v1.
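  • Expressed as an XGBoost parameter dictionary, these settings might look like the sketch below. Two mappings are assumptions: the "parameter regularization coefficient" is taken to be the L2 term `lambda`, and the cross-entropy loss is taken to be the binary objective `binary:logistic` (target category versus not); a multi-class setup would need a different objective.

```python
# Hedged sketch of the stage-two training parameters, using XGBoost's
# native parameter names. Nothing here is confirmed by the source beyond
# the three numeric values.
xgb_params = {
    "eta": 0.03,                     # learning rate
    "max_depth": 6,                  # maximum tree depth
    "lambda": 2,                     # assumed: L2 regularization coefficient
    "objective": "binary:logistic",  # assumed: binary cross-entropy objective
}

print(xgb_params)
```

  • Such a dictionary is the kind of object passed as the first argument to `xgboost.train`, together with a `DMatrix` built from the fused image and object features.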
  • the first model improves the recall rate and the second model improves the accuracy rate; each model performs its own role, which improves the overall performance of the image classification model provided by the embodiment of the present application.
  • FIG. 8 is a flowchart of the image classification method provided by this application; the method is implemented by steps 801 to 805:
  • step 801 perform feature extraction on the target image to obtain a first image feature of the target image.
  • the server can obtain the target image first, and then perform feature extraction on the target image.
  • the server can acquire the target image through real-time acquisition, and can also acquire the acquired target image from a database, which is not limited in this embodiment of the present application.
  • step 802 based on the first image feature of the target image, determine a first probability of the target image, where the first probability is used to indicate the possibility that the target image belongs to the target category.
  • step 803 in response to the first probability being greater than the first probability threshold, extract the second image feature from the target image, and obtain the object feature of the target object associated with the target image;
  • a second probability of the target image is determined based on the second image feature and the object feature, where the second probability is used to indicate the possibility that the target image belongs to the target category.
  • the server can fuse the second image feature and the object feature, and then determine the second probability of the target image based on the fused feature.
  • step 805 in response to the second probability being greater than the second probability threshold, it is determined that the target object belongs to the target category.
  • the target image in response to the second probability being greater than the third probability threshold and less than or equal to the second probability threshold, is assigned to a specified task set for storing images requiring manual processing.
  • taking a second probability of 80%, a second probability threshold of 90%, and a third probability threshold of 70% as an example: because the second probability is smaller than the second probability threshold, the server cannot determine that the target image belongs to the target category; but because the second probability is greater than the third probability threshold, the similarity between the target image and the target category is high.
  • the server therefore assigns the target image to the specified task set.
  • the specified task set is a task queue that needs to be processed in the manual processing link, so that the server can screen out difficult images for manual review.
  • the target image is determined not to belong to the target category in response to the second probability being less than or equal to the third probability threshold.
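  • The three-way decision described in steps 801 to 805 can be sketched as follows. The function name `classify`, the string labels, and the first-threshold value 0.5 used in the example are assumptions; the 80% / 90% / 70% figures are the ones from the example above.

```python
from typing import Callable


def classify(
    first_probability: float,
    second_stage: Callable[[], float],
    t1: float,
    t2: float,
    t3: float,
) -> str:
    """Two-stage cascade: the second model runs only if the first stage passes."""
    if first_probability <= t1:
        return "not_target"
    second_probability = second_stage()  # evaluated lazily, after the first gate
    if second_probability > t2:
        return "target"
    if second_probability > t3:
        return "manual_review"  # goes to the specified task set
    return "not_target"


decision = classify(0.95, lambda: 0.80, t1=0.5, t2=0.90, t3=0.70)
print(decision)
```

  • Passing the second stage as a callable keeps the more expensive feature extraction and object-feature lookup off the path of images rejected by the first model.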
  • when an image needs to be classified, the image to be tested only needs to be input into the image classification model, and the second probability threshold is set as required to classify the image effectively.
  • when using the image classification model, the server can set the third probability threshold based on the precision-recall variation curve.
  • the precision-recall variation curve describes the relationship among a recall parameter describing the recall rate, a precision parameter describing the accuracy with which the target category is determined, and the third probability threshold. That is, the curve is a three-dimensional correspondence relating the recall rate, the accuracy rate, and the third probability threshold.
  • the server determines the third probability threshold from the precision-recall variation curve according to the required recall rate and accuracy rate, so that different third probability thresholds can be selected for different business requirements.
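  • One way to read a threshold off such a curve is sketched below. The selection rule (the lowest threshold that meets a required precision, which gives the highest recall among qualifying thresholds), the candidate threshold grid, and the toy labels and scores are assumptions; the source does not specify how the curve is consulted.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]  # (threshold, precision, recall)


def precision_recall_points(
    labels: List[int], scores: List[float], thresholds: List[float]
) -> List[Point]:
    """Compute (threshold, precision, recall) triples on a labeled validation set."""
    points = []
    for t in thresholds:
        predicted = [s > t for s in scores]
        tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
        fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
        fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        points.append((t, precision, recall))
    return points


def pick_threshold(points: List[Point], min_precision: float) -> Optional[float]:
    """Return the lowest threshold whose precision meets the requirement."""
    for t, precision, _recall in sorted(points):
        if precision >= min_precision:
            return t
    return None


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.75, 0.6, 0.4, 0.2]
curve = precision_recall_points(labels, scores, [0.3, 0.5, 0.7, 0.85])
print(pick_threshold(curve, min_precision=0.7))
```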
  • FIG. 9 is a schematic structural diagram of the image classification model provided by the embodiment of the present application.
  • the target image is input into the first model 810; the first model performs feature extraction on the target image to extract the first image features, and determines a first probability of the target image, where the first probability represents the likelihood that the target image belongs to the target category.
  • the server determines that the target image does not belong to the target category; in response to the first probability being greater than the first probability threshold, the server determines that the target image is of the target category.
  • the accuracy of the server's determination of the target image as the target category may not meet the business requirements, and the server can classify and identify the target image again based on the second model, to ensure the accuracy of the classification results. That is, as shown in FIG. 9 , when the second probability of the target image belonging to the target category is greater than the second probability threshold, it is determined that the target object belongs to the target category.
  • FIG. 10 is an apparatus diagram of an image classification apparatus provided by an embodiment of the present application. As shown in FIG. 10, an image classification apparatus 900 is proposed, including:
  • the first feature extraction module 901 is configured to perform feature extraction on the target image to obtain the first image feature of the target image
  • the first probability determination module 902 is configured to determine the first probability of the target image based on the first image feature of the target image, where the first probability is used to indicate the possibility that the target image belongs to the target category;
  • the first probability judgment module 903 is configured to, in response to the first probability being greater than the first probability threshold, extract the second image feature from the target image, and obtain the object feature of the target object associated with the target image;
  • the first probability judgment module 903 is further configured to determine a second probability of the target image based on the second image feature and the object feature, where the second probability is used to indicate the possibility that the target image belongs to the target category;
  • the second probability judgment module 904 is configured to determine that the target object belongs to the target category in response to the second probability being greater than the second probability threshold.
  • the first probability determination module is further configured to:
  • the first probability judgment module is further configured to:
  • the third probability threshold is determined based on a precision-recall variation curve, a recall rate, and an accuracy rate, and the precision-recall variation curve describes the relationship among the recall rate, the accuracy rate, and the third probability threshold.
  • FIG. 11 is an apparatus diagram of an apparatus for training an image classification model provided by an embodiment of the present application. As shown in FIG. 11 , an apparatus 1000 for training an image classification model is proposed, including:
  • the first acquisition module 1001 is configured to acquire a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the target sample set acquisition module 1002 is configured to remove sample images that do not belong to the target category and have a similarity with the target category greater than a specified similarity from the first sample set to obtain a target sample set;
  • the first training module 1003 is configured to obtain the first model by training based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
  • the probability acquisition module 1004 is configured to classify each sample image in the first sample set based on the first model to obtain a probability for each sample image, where the probability indicates the likelihood that the corresponding sample image belongs to the target category;
  • the second acquisition module 1005 is configured to construct a second sample set based on the sample images whose probability corresponding to the first sample set is greater than the specified probability threshold;
  • the second training module 1006 is configured to obtain the second model by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
  • the first training module is further configured to set the loss weight of the first model to a value greater than 1 in response to a need to improve the recall rate of the positive samples, and to set the loss weight of the first model to a value less than 1 in response to a need to improve the recall rate of the negative samples.
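  • The effect of the loss weight can be illustrated with a weighted binary cross-entropy. Treating the loss weight as a multiplier on the positive-sample term is an assumption of this sketch (the source does not say where the weight is applied), as are the function name `weighted_bce` and the clamping constant.

```python
import math


def weighted_bce(label: int, probability: float, positive_weight: float = 1.0) -> float:
    """Binary cross-entropy with the positive-class term scaled by positive_weight."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(probability, eps), 1.0 - eps)
    if label == 1:
        return -positive_weight * math.log(p)
    return -math.log(1.0 - p)


# A weight above 1 makes a missed positive cost more than a false alarm,
# pushing the model toward higher positive recall; below 1 does the reverse.
print(weighted_bce(1, 0.5, positive_weight=2.0))
print(weighted_bce(0, 0.5, positive_weight=2.0))
```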
  • the target sample set acquisition module includes:
  • a first training unit configured to obtain an intermediate model by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
  • the similarity obtaining unit is configured to classify each sample image in the first sample set based on the intermediate model and obtain the similarity between each sample image and the target category, where the similarity indicates the likelihood that the sample image belongs to the target category;
  • the filtering unit is configured to, for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.
  • the second obtaining module includes:
  • the third sample set obtaining unit is configured to add sample images whose corresponding probability in the first sample set is greater than the specified probability threshold to the third sample set;
  • a cropping processing unit configured to crop each sample image in the third sample set for multiple times to obtain a plurality of cropped sample images
  • the second sample set obtaining unit is configured to add a plurality of the cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.
  • the apparatus further includes: a feature extraction module; the feature extraction module includes:
  • a feature identification unit configured to perform feature identification on an object of interest in any sample image in the second sample set, to obtain an image feature of the any sample image, and the image feature is used to indicate at least one characteristic of the object of interest;
  • an acquisition unit configured to acquire the object identifier of the sample object associated with the any sample image based on the image feature
  • the portrait acquisition unit is configured to acquire the object feature of the sample object based on the object identifier.
  • the feature identification unit is configured to, in response to the sample image including a plurality of objects of interest, sequentially acquire at least one feature of at least one object of interest from the sample image according to the size order of the plurality of objects of interest, to obtain an image feature of the sample image.
  • the electronic device of the present application includes at least one processor and at least one memory.
  • the memory stores program code, and when the program code is executed by the processor, the processor can execute the following steps:
  • the target image determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to represent the possibility that the target image belongs to the target category;
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • the third probability threshold is determined based on a precision-recall variation curve, a recall rate, and an accuracy rate, and the precision-recall variation curve describes the relationship among the recall rate, the accuracy rate, and the third probability threshold.
  • the electronic device in the embodiments of the present application includes at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the memory The instructions are executed by the at least one processor to enable the at least one processor to implement the following steps:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
  • the instructions executed by the at least one processor are also used to implement the following steps:
  • the loss weight of the first model is set to a value less than 1.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • An intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
  • for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity with the target category being greater than the specified similarity, remove the sample image from the first sample set.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • Each sample image in the third sample set is cropped multiple times to obtain a plurality of cropped sample images
  • the second sample set is obtained by adding a plurality of the cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • for any sample image in the second sample set, perform feature recognition on the object of interest in the sample image to obtain an image feature of the sample image, where the image feature indicates at least one feature of the object of interest;
  • the object characteristics of the sample object are obtained.
  • the instructions executed by the at least one processor are further used to implement the following steps:
  • the electronic device 130 according to this embodiment of the present application is described below with reference to FIG. 12 .
  • the electronic device 130 takes the form of a general electronic device.
  • Components of the electronic device 130 may include, but are not limited to: the above-mentioned at least one processor 131 , the above-mentioned at least one memory 132 , and a bus 133 connecting different system components (including the memory 132 and the processor 131 ).
  • Bus 133 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus structures.
  • Memory 132 may include readable media in the form of volatile memory, such as random access memory (RAM) 1321 and/or cache memory 1322 , and may further include read only memory (ROM) 1323 .
  • the memory 132 may also include a program/utility 1325 having a set (at least one) of program modules 1324, including, but not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • Electronic device 130 may also communicate with one or more external devices 134 (e.g., keyboards, pointing devices, etc.), with one or more devices that enable a user to interact with electronic device 130, and/or with any device (e.g., router, modem, etc.) that enables electronic device 130 to communicate with one or more other electronic devices. Such communication may take place through input/output (I/O) interface 135. Also, the electronic device 130 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 136. As shown, network adapter 136 communicates with other modules of electronic device 130 via bus 133. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with electronic device 130, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • an image classification method provided by the present application is implemented in the form of a computer program product, the computer program product includes computer instructions, and the computer instructions are executed by a processor to implement the following steps:
  • the second probability is used to represent the possibility that the target image belongs to the target category
  • the training method of an image classification model provided by the present application is implemented in the form of a computer program product, and the computer program product includes computer instructions, and the computer instructions are executed by a processor to implement the following steps:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • a computer program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • More specific examples of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the program product for image classification of embodiments of the present application may employ a portable compact disk read only memory (CD-ROM) and include program code, and may be executed on an electronic device.
  • the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, carrying readable program code therein. Such propagated data signals may take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's electronic device, partly on the user's device, as a stand-alone software package, partly on the user's electronic device and partly on a remote electronic device, or entirely on the remote electronic device or server.
  • the remote electronic device may be connected to the user's electronic device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external electronic device (e.g., via the Internet using an Internet service provider).
  • a non-volatile computer-readable storage medium stores a computer program for causing a computer to implement the following steps:
  • the target image determining a first probability of the target image based on the first image feature of the target image, where the first probability is used to represent the possibility that the target image belongs to the target category;
  • a non-volatile computer-readable storage medium stores a computer program for causing a computer to implement the steps of: obtaining a first sample set, where the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
  • the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
  • an image classification method including:
  • if the second probability is higher than the second probability threshold, it is determined that the target object belongs to the target category.
  • the method further includes:
  • the target image does not belong to the target category.
  • the method further includes:
  • if the second probability is smaller than the third probability threshold, it is determined that the target image does not belong to the target category.
  • a precision-recall variation curve is pre-stored, and the curve describes the relationship among a recall parameter describing the recall rate, a precision parameter describing the accuracy with which the target category is determined, and the third probability threshold;
  • the third probability threshold is set according to the third recall index and accuracy index.
  • feature extraction is performed on the target image using a pre-trained first model, and a first probability that the target image belongs to the target category is determined, wherein the first model is trained according to the following method:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the sample image in the target sample set is used as the input of the first model, and the category of the sample image is used as the expected output of the first model, and the first model is trained until the training of the first model converges.
  • a second model based on a decision tree is used to fuse the second image feature and the object feature to obtain a second probability that the target image belongs to the target category, wherein the second model is based on the following method of training:
  • the second model is trained using the image features and the object features until the training of the second model converges.
  • the samples of the target category are positive samples, and the samples that do not belong to the target category are negative samples;
  • when it is necessary to improve the recall rate of the positive samples, set the loss weight of the first model to a value greater than 1;
  • the loss weight of the first model is set to a value less than 1.
  • filtering out sample images in the first sample set that do not belong to the target category and have a similarity with the target category higher than a specified similarity including:
  • before the first model is trained, use the sample images in the first sample set as the input of the first model and the categories of the sample images as the expected output of the first model, and train the first model until the training of the first model converges;
  • if a sample image does not belong to the target category and its similarity with the target category is higher than the specified similarity, the sample image is filtered out.
  • the acquisition of sample images whose probability of belonging to the target category is greater than a specified probability threshold to construct a second sample set includes:
  • the respective categories of the sample images in the second sample set and the multiple cropped sample images are acquired, and the third sample set composed of the sample images and corresponding categories is constructed.
  • extracting image features from the sample image and obtaining object features of the target object associated with the sample image include:
  • the object feature of the target object is acquired according to the object identifier.
  • performing feature identification on the object of interest in the sample image, and obtaining feature information of the object of interest from the sample image including:
  • the characteristic information of at least one object of interest is sequentially acquired from the sample image according to the size order of the objects of interest.
  • the embodiment of the present application provides a training method for an image classification model, including:
  • the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the second model is trained using the image features and the object features until the training of the second model converges.
  • the samples of the target category are positive samples, and the samples that do not belong to the target category are negative samples;
  • the loss weight of the first model is set to a value greater than 1;
  • the loss weight of the first model is set to a value less than 1.
  • filtering out sample images in the first sample set that do not belong to the target category and have a similarity with the target category higher than a specified similarity including:
  • before the first model is trained, use the sample images in the first sample set as the input of the first model and the categories of the sample images as the expected output of the first model, and train the first model until training of the first model converges;
  • if a sample image does not belong to the target category and its similarity with the target category is higher than the specified similarity, the sample image is filtered out.
  • the acquisition of sample images whose probability of belonging to the target category is greater than a specified probability threshold to construct a second sample set includes:
  • the respective categories of the sample images in the second sample set and the plurality of cropped sample images are acquired, and the third sample set composed of the sample images and corresponding categories is constructed.
  • the image features are extracted from the sample image and the object features of the target object associated with the sample image are obtained, including:
  • the object feature of the target object is acquired according to the object identifier.
  • performing feature identification on the object of interest in the sample image, and obtaining feature information of the object of interest from the sample image including:
  • the characteristic information of at least one object of interest is sequentially acquired from the sample image according to the size order of the objects of interest.
  • the present application also provides an image classification device, the device comprising:
  • an image acquisition module configured to acquire a target image
  • a first feature extraction module configured to perform feature extraction on the target image to obtain a first image feature of the target image
  • a first probability determination module configured to use the first image feature of the target image to determine the first probability that the target image belongs to the target category
  • the first probability judgment module is configured to extract the second image feature from the target image when the first probability is higher than the first probability threshold, and obtain the object feature of the target object associated with the target image;
  • the second image feature and the object feature are fused to obtain a second probability that the target image belongs to the target category;
  • the second probability judgment module is configured to determine that the target object belongs to the target category when the second probability is higher than the second probability threshold.
  • the first probability determination module is further configured to:
  • if the first probability is not higher than the first probability threshold, determine that the target image does not belong to the target category.
  • the first probability judgment module is further configured to:
  • if the second probability is smaller than the third probability threshold, it is determined that the target image does not belong to the target category.
  • a precision-recall curve is pre-stored, and the precision-recall curve describes the relationship among a recall parameter describing the recall rate, a precision parameter describing the accuracy of determining the target category, and the third probability threshold;
  • the third probability threshold is set according to the third recall index and accuracy index.
  • a pre-trained first model is used to perform feature extraction on the target image, and a first probability that the target image belongs to the target category is determined, wherein the first model is trained according to the following modules:
  • a first sample set obtaining module configured to obtain a first sample set, the first sample set includes a plurality of sample images, and each sample image is associated with a pre-marked category;
  • the filtering module is configured to filter out the sample images in the first sample set that do not belong to the target category and whose similarity with the target category is higher than the specified similarity, to obtain the target sample set;
  • a first model training module configured to use the sample images in the target sample set as the input of the first model, use the categories of the sample images as the expected output of the first model, and train the first model until training of the first model converges.
  • a second model based on a decision tree is used to fuse the second image feature and the object feature to obtain a second probability that the target image belongs to the target category, wherein the second model is trained by the following modules:
  • the second sample set obtaining module is configured to obtain sample images whose probability of belonging to the target category is greater than the specified probability threshold to construct a second sample set;
  • a feature extraction module configured to extract image features from the sample image for any sample image in the second sample set, and obtain object features of the target object associated with the sample image;
  • the second model training module is configured to use the image feature and the object feature to train the second model until the second model training converges.
  • the samples of the target category are positive samples, and the samples that do not belong to the target category are negative samples;
  • the loss weight of the first model is set to a value greater than 1;
  • the loss weight of the first model is set to a value less than 1.
  • the filtering module includes:
  • an initial training unit configured to use the sample image in the first sample set as the input of the first model and the category of the sample image as the expected output of the first model before the first model is trained, training the first model until the training of the first model converges;
  • a similarity obtaining unit configured to input each sample image in the first sample set into the first model, and obtain the probability that the sample image belongs to the target category as the similarity with the target category;
  • the filtering unit is configured to filter out the sample image if the sample image does not belong to the target category and the similarity with the target category is higher than the specified similarity.
  • the second sample set acquisition module includes:
  • a cropping unit configured to perform multiple cropping processes on each sample image in the second sample set, to obtain multiple cropped sample images
  • the third sample set obtaining unit is configured to obtain the respective categories of the sample images in the second sample set and the plurality of cropped sample images, and construct the third sample set consisting of the sample images and the corresponding categories.
  • the feature extraction module includes:
  • a feature information acquisition unit configured to perform feature recognition on the object of interest in the sample image, and obtain feature information of the object of interest from the sample image;
  • an object identification obtaining unit configured to obtain the object identification of the target object associated with the sample image
  • the object feature obtaining unit is configured to obtain the object feature of the target object according to the object identifier.
  • the feature information acquisition unit includes:
  • the characteristic information of at least one object of interest is sequentially acquired from the sample image according to the size order of the objects of interest.
  • the present application also provides an apparatus for training an image classification model, the apparatus comprising:
  • a first acquisition module configured to acquire a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
  • the target sample set acquisition module is configured to filter out the sample images in the first sample set that do not belong to the target category and whose similarity with the target category is higher than the specified similarity, to obtain the target sample set;
  • a first training module configured to use the sample images in the target sample set as the input of the first model, use the categories of the sample images as the expected output of the first model, and train the first model until training of the first model converges;
  • a probability acquisition module configured to use the trained first model to classify and identify each sample image in the first sample set, and obtain the probability of each sample image belonging to the target category;
  • a second acquisition module configured to acquire sample images whose probability of belonging to the target category is greater than a specified probability threshold to construct a second sample set
  • a feature extraction module configured to extract image features from the sample image for any sample image in the second sample set, and obtain object features of the target object associated with the sample image;
  • the second training module is configured to train the second model using the image feature and the object feature until the training of the second model converges.
  • the samples of the target category are positive samples, and the samples that do not belong to the target category are negative samples;
  • the loss weight of the first model is set to a value greater than 1;
  • the loss weight of the first model is set to a value less than 1.
  • the target sample set acquisition module includes:
  • a first training unit configured to, before the first model is trained, take the sample images in the first sample set as the input of the first model, take the categories of the sample images as the expected output of the first model, and train the first model until training of the first model converges;
  • a similarity obtaining unit configured to input each sample image in the first sample set into the first model, and obtain the probability that the sample image belongs to the target category as the similarity with the target category;
  • the filtering unit is configured to filter out the sample image if the sample image does not belong to the target category and the similarity with the target category is higher than the specified similarity.
  • the second obtaining module includes:
  • a cropping processing unit configured to perform multiple cropping processes on each sample image in the second sample set, respectively, to obtain a plurality of cropped sample images
  • the third sample set obtaining unit is configured to obtain the respective categories of the sample images in the second sample set and the plurality of cropped sample images, and construct the third sample set consisting of the sample images and the corresponding categories.
  • the feature extraction module includes:
  • a feature recognition unit configured to perform feature recognition on the object of interest in the sample image, and obtain feature information of the object of interest from the sample image;
  • an acquisition unit configured to acquire the object identifier of the target object associated with the sample image
  • the portrait acquisition unit is configured to acquire the object feature of the target object according to the object identifier.
  • performing feature identification on the object of interest in the sample image, and obtaining feature information of the object of interest from the sample image including:
  • the characteristic information of at least one object of interest is sequentially acquired from the sample image according to the size order of the objects of interest.

Abstract

Provided are an image classification method and an electronic device. The image classification method comprises: determining a first probability of a target image on the basis of a first image feature extracted from the target image; then, where the first probability is greater than a first probability threshold, determining a second probability of the target image on the basis of a second image feature extracted from the target image and an object feature of a target object associated with the target image; and finally, where the second probability is greater than a second probability threshold, determining that the target object belongs to a target category.

Description

Image classification method and electronic device
This application is based on, and claims priority to, Chinese patent application No. 202011325685.9, filed on November 23, 2020, the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the field of image detection, and in particular to an image classification method, apparatus, electronic device and storage medium.
Background
With the development of computer vision technology, image content understanding and analysis are becoming increasingly intelligent. Classification based on image information is an important application of computer vision.
As the volume of image information grows, classifying images efficiently is particularly important in special scenarios such as security review and abnormal behavior detection.
Summary of the Invention
In a first aspect, an embodiment of the present application provides an image classification method, including:
performing feature extraction on a target image to obtain a first image feature of the target image;
determining a first probability of the target image based on the first image feature of the target image, where the first probability represents the likelihood that the target image belongs to a target category;
in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
determining a second probability of the target image based on the second image feature and the object feature, where the second probability represents the likelihood that the target image belongs to the target category;
in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
In some embodiments, the method further includes:
in response to the first probability being less than or equal to the first probability threshold, determining that the target image does not belong to the target category.
In some embodiments, the method further includes:
in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set, where the specified task set is used to store images that require manual processing;
in response to the second probability being less than or equal to the third probability threshold, determining that the target image does not belong to the target category.
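The two-stage decision described in the first aspect can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `cascade_decision`, the result labels, and passing the second stage as a callable are all assumptions, and the sketch assumes the third threshold is below the second.

```python
from typing import Callable


def cascade_decision(
    first_probability: float,
    second_stage: Callable[[], float],
    t1: float,
    t2: float,
    t3: float,
) -> str:
    """Return 'target', 'manual_review' or 'not_target' (hypothetical labels).

    t1, t2, t3 stand for the first, second and third probability thresholds;
    the sketch assumes t3 < t2.
    """
    # Stage 1: a low first probability rejects the image immediately,
    # so the second-stage features are never computed.
    if first_probability <= t1:
        return "not_target"

    # Stage 2: second probability from the second image feature and the
    # object feature (computed lazily by the caller-supplied callable).
    second_probability = second_stage()
    if second_probability > t2:
        return "target"
    if second_probability > t3:
        # Between the third and second thresholds: queue for manual handling.
        return "manual_review"
    return "not_target"
```

Passing the second stage as a callable keeps the extra feature extraction out of the path for images that the first stage already rejects.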
In some embodiments, the third probability threshold is determined based on a precision-recall curve, a recall rate and a precision rate, where the precision-recall curve describes the relationship among the recall rate, the precision rate and the third probability threshold.
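One way to read this embodiment is that candidate thresholds are evaluated on labeled validation data and a threshold meeting the required recall and precision is chosen. The sketch below illustrates that reading; the selection rule (take the largest feasible candidate) and the names `precision_recall_at` and `pick_threshold` are assumptions, since the text does not specify how the curve is consulted.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when 'score > threshold' is predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # no predictions: vacuous
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(scores, labels, min_recall, min_precision, candidates):
    """Largest candidate threshold meeting both targets, or None (assumed rule)."""
    feasible = []
    for t in candidates:
        precision, recall = precision_recall_at(scores, labels, t)
        if recall >= min_recall and precision >= min_precision:
            feasible.append(t)
    return max(feasible) if feasible else None
```

For example, `pick_threshold(scores, labels, 1.0, 0.7, [0.0, 0.2, 0.3, 0.5])` scans the four candidates and returns the highest one that still keeps every positive sample above the threshold.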
In a second aspect, an embodiment of the present application provides a method for training an image classification model, where the image classification model includes a first model and a second model, and the method includes:
acquiring a first sample set, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category;
removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
training the first model based on the sample images in the target sample set and the categories associated with those sample images;
classifying each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the likelihood that the corresponding sample image belongs to the target category;
constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
training the second model based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
In some embodiments, samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
the method further includes:
in response to a need to improve the recall rate of the positive samples, setting the loss weight of the first model to a value greater than 1;
in response to a need to improve the recall rate of the negative samples, setting the loss weight of the first model to a value less than 1.
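The effect of the loss weight can be illustrated with a weighted binary cross-entropy. The text does not say which loss the first model uses or which term the weight applies to, so the sketch below assumes the weight scales the positive-sample term only; `weighted_bce` and `pos_weight` are illustrative names.

```python
import math


def weighted_bce(p: float, y: int, pos_weight: float) -> float:
    """Binary cross-entropy for one sample, with the positive term scaled.

    p is the predicted probability of the target category, y is 1 for a
    positive sample and 0 for a negative one. Applying the weight to the
    positive term is an assumption made for this sketch.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
```

Under this assumption, a weight above 1 makes missed positives cost more than false alarms, which pushes the model toward higher positive recall, while a weight below 1 makes negatives dominate the loss and favors negative recall.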
In some embodiments, removing, from the first sample set, sample images that do not belong to the target category and whose similarity to the target category is greater than the specified similarity includes:
training an intermediate model based on the sample images in the first sample set and the categories associated with those sample images;
classifying each sample image in the first sample set based on the intermediate model to obtain the similarity between each sample image and the target category, where the similarity represents the likelihood that the sample image belongs to the target category;
for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity, removing the sample image from the first sample set.
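The removal step above amounts to dropping negatives that the intermediate model scores as close to the target category. A minimal sketch, assuming samples are `(image, label)` pairs and that `similarity` is a hypothetical callable wrapping the intermediate model's target-category probability:

```python
def filter_confusing_negatives(samples, similarity, target_label, max_similarity):
    """Build the target sample set from the first sample set.

    A sample is kept if it belongs to the target category, or if its
    similarity to the target category does not exceed max_similarity.
    """
    return [
        (image, label)
        for image, label in samples
        if label == target_label or similarity(image) <= max_similarity
    ]
```

Positive samples are always kept, whatever their score; only negatives above the specified similarity are removed.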
In some embodiments, constructing the second sample set based on the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold includes:
adding the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold to a third sample set;
cropping each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
adding the plurality of cropped sample images and the categories associated with them to the third sample set to obtain the second sample set.
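The cropping-based expansion can be sketched as below. The text does not say how crop positions or sizes are chosen, so uniformly random fixed-size crops are an assumption here, as are the helper names; images are represented as nested lists of pixel values to keep the example dependency-free.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position (assumed strategy)."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)   # randint is inclusive at both ends
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def build_second_sample_set(third_set, crops_per_image, crop_h, crop_w, seed=0):
    """Return the third sample set plus its crops; crops inherit the label."""
    rng = random.Random(seed)
    second_set = list(third_set)  # keep the original (image, label) pairs
    for image, label in third_set:
        for _ in range(crops_per_image):
            second_set.append((random_crop(image, crop_h, crop_w, rng), label))
    return second_set
```

With `n` images in the third sample set and `k` crops per image, the resulting second sample set holds `n * (k + 1)` labeled samples.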
In some embodiments, the method further includes:
for any sample image in the second sample set, performing feature recognition on an object of interest in the sample image to obtain an image feature of the sample image, where the image feature indicates at least one feature of the object of interest;
acquiring, based on the image feature, an object identifier of the sample object associated with the sample image;
acquiring an object feature of the sample object based on the object identifier.
In some embodiments, performing feature recognition on the object of interest in the sample image to obtain the image feature of the sample image includes:
in response to the sample image including a plurality of objects of interest, acquiring at least one feature of at least one of the objects of interest from the sample image in order of the sizes of the objects of interest, to obtain the image feature of the sample image.
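The size-ordered feature collection can be illustrated as follows. Several details are assumptions: objects are ranked largest first by bounding-box area, only the top `max_objects` are used, and each detected object is a dictionary with hypothetical `width`, `height` and `features` keys.

```python
def image_features_by_object_size(objects, max_objects):
    """Concatenate per-object features, visiting objects from largest to smallest.

    Each object is a dict with 'width', 'height' and 'features' keys
    (a hypothetical detector output format used only for this sketch).
    """
    ranked = sorted(
        objects,
        key=lambda obj: obj["width"] * obj["height"],
        reverse=True,  # largest bounding-box area first (assumed order)
    )
    features = []
    for obj in ranked[:max_objects]:
        features.extend(obj["features"])
    return features
```

Capping the number of objects keeps the length of the image feature bounded when an image contains many detections.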
In a third aspect, the present application further provides an image classification apparatus, the apparatus including:
a first feature extraction module, configured to perform feature extraction on the target image to obtain a first image feature of the target image;
a first probability determination module, configured to determine a first probability of the target image based on the first image feature of the target image, where the first probability represents the likelihood that the target image belongs to a target category;
a first probability judgment module, configured to, in response to the first probability being greater than a first probability threshold, extract a second image feature from the target image and acquire an object feature of a target object associated with the target image;
the first probability judgment module being further configured to determine a second probability of the target image based on the second image feature and the object feature, where the second probability represents the likelihood that the target image belongs to the target category;
a second probability judgment module, configured to, in response to the second probability being greater than a second probability threshold, determine that the target object belongs to the target category.
In some embodiments, the first probability determination module is further configured to:
in response to the first probability being less than or equal to the first probability threshold, determine that the target image does not belong to the target category.
In some embodiments, the first probability judgment module is further configured to:
in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assign the target image to a specified task set, where the specified task set is used to store images that require manual processing;
in response to the second probability being less than or equal to the third probability threshold, determine that the target image does not belong to the target category.
In some embodiments, the third probability threshold is determined based on a precision-recall curve, a recall rate and a precision rate, where the precision-recall curve describes the relationship among the recall rate, the precision rate and the third probability threshold.
In a fourth aspect, the present application further provides an apparatus for training an image classification model, where the image classification model includes a first model and a second model, and the apparatus includes:
a first acquisition module, configured to acquire a first sample set, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category;
a target sample set acquisition module, configured to remove, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
a first training module, configured to train the first model based on the sample images in the target sample set and the categories associated with those sample images;
a probability acquisition module, configured to classify each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the likelihood that the corresponding sample object belongs to the target category;
a second acquisition module, configured to construct a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
a second training module, configured to train the second model based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
In some embodiments, samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
the first training module is further configured to set the loss weight of the first model to a value greater than 1 in response to a need to improve the recall rate of the positive samples, and to set the loss weight of the first model to a value less than 1 in response to a need to improve the recall rate of the negative samples.
In some embodiments, the target sample set acquisition module includes:
a first training unit, configured to train an intermediate model based on the sample images in the first sample set and the categories associated with those sample images;
a similarity acquisition unit, configured to classify each sample image in the first sample set based on the intermediate model to obtain the similarity between each sample image and the target category, where the similarity represents the likelihood that the sample image belongs to the target category;
a filtering unit, configured to, for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity, remove the sample image from the first sample set.
In some embodiments, the second acquisition module includes:
a third sample set acquisition unit, configured to add the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold to a third sample set;
a cropping processing unit, configured to crop each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
a second sample set acquisition unit, configured to add the plurality of cropped sample images and the categories associated with them to the third sample set to obtain the second sample set.
In some embodiments, the apparatus further includes a feature extraction module;
the feature extraction module includes:
a feature recognition unit, configured to, for any sample image in the second sample set, perform feature recognition on an object of interest in the sample image to obtain an image feature of the sample image, where the image feature indicates at least one feature of the object of interest;
an acquisition unit, configured to acquire, based on the image feature, an object identifier of the sample object associated with the sample image;
a portrait acquisition unit, configured to acquire an object feature of the sample object based on the object identifier.
In some embodiments, the feature recognition unit is configured to, in response to the sample image including a plurality of objects of interest, acquire at least one feature of at least one of the objects of interest from the sample image in order of the sizes of the objects of interest, to obtain the image feature of the sample image.
In a fifth aspect, another embodiment of the present application further provides an electronic device, including at least one processor and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the following steps:
performing feature extraction on a target image to obtain a first image feature of the target image;
determining a first probability of the target image based on the first image feature of the target image, where the first probability represents the possibility that the target image belongs to a target category;
in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
determining a second probability of the target image based on the second image feature and the object feature, where the second probability represents the possibility that the target image belongs to the target category;
in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following step:
in response to the first probability being less than or equal to the first probability threshold, determining that the target image does not belong to the target category.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a designated task set, where the designated task set stores images that require manual processing;
in response to the second probability being less than or equal to the third probability threshold, determining that the target image does not belong to the target category.
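The threshold logic of the fifth aspect can be sketched as follows. This is a minimal illustration rather than the implementation described in the application: the function name `classify_two_stage`, the outcome labels, and the convention that the two models and the object-feature lookup are passed in as callables are all assumptions made for the example, and extraction of the second image feature is assumed to happen inside the second-model callable.

```python
from typing import Callable, Sequence

# Hypothetical outcome labels; the source describes the three outcomes only in prose.
TARGET, NOT_TARGET, MANUAL_REVIEW = "target", "not_target", "manual_review"


def classify_two_stage(
    image: object,
    first_model: Callable[[object], float],
    second_model: Callable[[object, Sequence[float]], float],
    get_object_features: Callable[[object], Sequence[float]],
    t1: float,
    t2: float,
    t3: float,
) -> str:
    """Return the outcome for one image under thresholds t1, t2 and t3 (t3 < t2)."""
    # Stage 1: first probability from the first model.
    p1 = first_model(image)
    if p1 <= t1:
        return NOT_TARGET  # rejected by the first stage, second model never runs

    # Stage 2: second probability from image content plus object features.
    p2 = second_model(image, get_object_features(image))
    if p2 > t2:
        return TARGET
    if p2 > t3:
        return MANUAL_REVIEW  # t3 < p2 <= t2: goes to the designated task set
    return NOT_TARGET
```

Because the second model is only called when the first probability exceeds the first threshold, the more expensive multi-feature analysis runs on a small fraction of images when the target category is rare.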
In some embodiments, the third probability threshold is determined based on a precision-recall curve, a recall rate, and a precision rate, where the precision-recall curve describes the relationship among the recall rate, the precision rate, and the third probability threshold.
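One way to read a threshold off such a curve is sketched below. The application does not state the selection rule, so the rule used here (take the highest candidate threshold that still meets a required recall) is an assumption, as are the function names `pr_points` and `pick_threshold` and the use of a labeled validation set.

```python
from typing import List, Optional, Sequence, Tuple


def pr_points(
    scores: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each candidate threshold.

    A sample is predicted positive when its score is strictly greater than
    the threshold, matching the "greater than" wording of the text.
    """
    total_pos = sum(labels)
    points = []
    for t in thresholds:
        predicted = [s > t for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        n_pred = sum(predicted)
        precision = tp / n_pred if n_pred else 1.0
        recall = tp / total_pos if total_pos else 0.0
        points.append((t, precision, recall))
    return points


def pick_threshold(
    points: Sequence[Tuple[float, float, float]], min_recall: float
) -> Optional[float]:
    """Assumed rule: highest threshold whose recall is at least min_recall."""
    ok = [p for p in points if p[2] >= min_recall]
    if not ok:
        return None
    return max(ok, key=lambda p: p[0])[0]
```

A stricter recall requirement pushes the chosen threshold down, which sends more images to manual processing; a looser one raises it.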
In a sixth aspect, another embodiment of the present application further provides an electronic device, including at least one processor and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the following steps:
acquiring a first sample set, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category;
removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
training to obtain the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
classifying and recognizing each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the possibility that the corresponding sample image belongs to the target category;
constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
training to obtain the second model based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
In some embodiments, samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
the instructions executed by the at least one processor are further used to implement the following steps:
in response to a need to improve the recall rate of the positive samples, setting the loss weight of the first model to a value greater than 1;
in response to a need to improve the recall rate of the negative samples, setting the loss weight of the first model to a value less than 1.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
training to obtain an intermediate model based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
classifying and recognizing each sample image in the first sample set based on the intermediate model to obtain a similarity between each sample image and the target category, where the similarity represents the possibility that the sample image belongs to the target category;
for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity, removing the sample image from the first sample set.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
adding the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold to a third sample set;
cropping each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
adding the plurality of cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set, to obtain the second sample set.
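The three steps above can be sketched as follows, under stated assumptions: an image is represented as a list of pixel rows, crops are square and placed at random positions, and each crop inherits the category of the image it was cut from. The application does not specify crop size, crop count, or crop placement, so `crops_per_image`, `crop_size`, and `seed` are hypothetical parameters.

```python
import random
from typing import List, Sequence, Tuple

Image = List[List[int]]  # assumed representation: a list of pixel rows


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def build_second_sample_set(
    first_set: Sequence[Tuple[Image, str]],
    probabilities: Sequence[float],
    threshold: float,
    crops_per_image: int = 3,
    crop_size: int = 8,
    seed: int = 0,
) -> List[Tuple[Image, str]]:
    rng = random.Random(seed)
    # Third sample set: images the first model scored above the threshold.
    third = [s for s, p in zip(first_set, probabilities) if p > threshold]
    second = list(third)
    for image, category in third:
        h, w = len(image), len(image[0])
        ch, cw = min(crop_size, h), min(crop_size, w)
        for _ in range(crops_per_image):
            top = rng.randint(0, h - ch)
            left = rng.randint(0, w - cw)
            # Each crop keeps the category of its source image (an assumption).
            second.append((crop(image, top, left, ch, cw), category))
    return second
```

The second sample set therefore contains every image of the third sample set plus `crops_per_image` crops of each.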
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
for any sample image in the second sample set, performing feature recognition on an object of interest in the sample image to obtain an image feature of the sample image, where the image feature indicates at least one feature of the object of interest;
acquiring, based on the image feature, an object identifier of a sample object associated with the sample image;
acquiring an object feature of the sample object based on the object identifier.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following step:
in response to the sample image including a plurality of objects of interest, acquiring at least one feature of at least one of the objects of interest from the sample image sequentially, in order of the sizes of the plurality of objects of interest, to obtain the image feature of the sample image.
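A minimal sketch of the size-ordered feature collection follows. The application says only that features are taken "in order of size"; the choices made here (largest bounding-box area first, a cap of `max_objects` objects, and concatenation of the per-object feature vectors) are assumptions, and `DetectedObject` is a hypothetical container.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DetectedObject:
    """Hypothetical detection result for one object of interest."""
    width: int
    height: int
    features: Sequence[float]


def image_features(objects: Sequence[DetectedObject], max_objects: int = 2) -> List[float]:
    """Concatenate the features of the largest objects, largest first."""
    ordered = sorted(objects, key=lambda o: o.width * o.height, reverse=True)
    collected: List[float] = []
    for obj in ordered[:max_objects]:
        collected.extend(obj.features)
    return collected
```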
In a seventh aspect, another embodiment of the present application further provides a non-volatile computer-readable storage medium, where the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to cause a computer to implement the following steps:
performing feature extraction on a target image to obtain a first image feature of the target image;
determining a first probability of the target image based on the first image feature of the target image, where the first probability represents the possibility that the target image belongs to a target category;
in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
determining a second probability of the target image based on the second image feature and the object feature, where the second probability represents the possibility that the target image belongs to the target category;
in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
In an eighth aspect, another embodiment of the present application further provides a non-volatile computer-readable storage medium, where the non-volatile computer-readable storage medium stores a computer program, and the computer program is used to cause a computer to implement the following steps:
acquiring a first sample set, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category;
removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
training to obtain the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
classifying and recognizing each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the possibility that the corresponding sample image belongs to the target category;
constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
training to obtain the second model based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
In a ninth aspect, another embodiment of the present application further provides a computer program product, including computer instructions, where the computer instructions, when executed by a processor, implement the following steps:
performing feature extraction on a target image to obtain a first image feature of the target image;
determining a first probability of the target image based on the first image feature of the target image, where the first probability represents the possibility that the target image belongs to a target category;
in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
determining a second probability of the target image based on the second image feature and the object feature, where the second probability represents the possibility that the target image belongs to the target category;
in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
In a tenth aspect, another embodiment of the present application further provides a computer program product, including computer instructions, where the computer instructions, when executed by a processor, implement the following steps:
acquiring a first sample set, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category;
removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
training to obtain the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
classifying and recognizing each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the possibility that the corresponding sample image belongs to the target category;
constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
training to obtain the second model based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
In the embodiments of the present application, the target image is classified by combining the first model and the second model: the first model improves the recall rate and the second model improves the accuracy rate. Each model performs its own function, which improves the overall performance of the image classification method.
Brief Description of the Drawings
FIG. 1 is an application scenario diagram of the image classification method provided by an embodiment of the present application;
FIG. 2 is a flowchart of the training method for the image classification model provided by an embodiment of the present application;
FIG. 3 is a flowchart of training the first model provided by an embodiment of the present application;
FIG. 4 is a flowchart of acquiring the target sample set provided by an embodiment of the present application;
FIG. 5 is a flowchart of training the second model provided by an embodiment of the present application;
FIG. 6 is a flowchart of constructing the second sample set provided by an embodiment of the present application;
FIG. 7 is a flowchart of extracting image features and acquiring object features provided by an embodiment of the present application;
FIG. 8 is a flowchart of the image classification method provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of the image classification model provided by an embodiment of the present application;
FIG. 10 is a diagram of the image classification apparatus provided by an embodiment of the present application;
FIG. 11 is a diagram of the training apparatus for the image classification model provided by an embodiment of the present application;
FIG. 12 is a diagram of the electronic device for the image classification method provided by an embodiment of the present application.
Detailed Description
With the development of computer vision technology, image content understanding and analysis are becoming increasingly intelligent. Classification based on image information is an important application of computer vision. As the volume of image information grows, classifying images efficiently is particularly important in special scenarios such as security review and abnormal behavior detection. In these scenarios, the natural occurrence rate of a given class of images is very low (a few in ten thousand); for example, only a few target images may appear among ten thousand pictures.
The present application proposes an image classification method that completes image classification in two stages. The first stage is used to guarantee the recall rate; the second stage further analyzes the output of the first stage and is used to guarantee the accuracy of the classification.
In the embodiments of the present application, the first stage uses a first model to perform feature extraction on the target image, and images that the first model cannot handle accurately are further analyzed by a second model in the second stage. The second model is a decision-tree-based model that fuses multiple features and analyzes features of multiple dimensions to guarantee the accuracy of the classification result. The two models each perform their own function, which improves the overall recognition effect and performance.
The embodiments of the present application process images of different categories in the same way. For ease of understanding, the embodiments take whether a picture contains violating content as an example; the violating content includes, but is not limited to, political content, violent content, and terrorist content.
The image classification method in the embodiments of the present application is described in detail below with reference to the accompanying drawings.
For ease of understanding, the technical solutions provided by the embodiments of the present application are described below by taking the classification and recognition of violating images as an example. It should be understood that the image classification method provided by the embodiments of the present application can also be applied to other classification tasks, which is not limited in the embodiments of the present application.
In some embodiments, FIG. 1 is an application scenario diagram of the image classification method provided by an embodiment of the present application. The application scenario includes a terminal device 101, a server 102, a network 103, and a memory 104.
The terminal device 101 uploads pictures, which are stored in the memory 104 through the server 102; the trained model is deployed on the server 102. In use, the server 102 obtains a picture from the memory 104 and classifies it based on the deployed model.
In some embodiments, the server 102 can obtain the target image not only from pictures uploaded by the terminal device 101 but also from short videos, which is not limited in the present application.
In the image classification method provided by the embodiments of the present application, feature extraction is first performed on the target image based on the trained first model, and the first probability of the target image is determined based on the extracted first image feature; then, based on the second model, the second probability of the target image is determined from the second image feature extracted from the target image and the object feature.
For ease of understanding, the embodiments of the present application describe the image classification method in two parts: model training and model use.
I. Training of the image classification model
In some embodiments, the samples include positive samples and negative samples: samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples. Taking the application scenario of detecting whether a sample image is in violation as an example, when the aim is to detect the images that include violating content, images including violating content are set as positive samples and the loss weight is set to be greater than 1. The model is trained using labeled sample images, that is, sample images labeled with whether they belong to the target category. The training method is as follows:
FIG. 2 is a flowchart of the training method for the image classification model provided by an embodiment of the present application; that is, FIG. 2 shows the training process of the image classification model in implementation. Taking as an example that the method is executed by a server and that the image classification model includes a first model and a second model, the training method includes the following steps:
In step 201, a first sample set is acquired, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category;
In step 202, sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity are removed from the first sample set, to obtain a target sample set;
In step 203, the first model is obtained by training based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
In step 204, each sample image in the first sample set is classified and recognized based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the possibility that the corresponding sample image belongs to the target category;
In step 205, a second sample set is constructed based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
In step 206, the second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
For ease of understanding, the training processes of the first model and the second model are described separately below.
1. Training the first model
FIG. 3 is a flowchart of training the first model provided by an embodiment of the present application. Taking execution by a server as an example, the process includes the following steps:
In step 301: a first sample set is acquired, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category.
The category associated with a sample image may be labeled manually or labeled automatically by the server, which is not limited in the present application.
For example, the first sample set includes a plurality of manually labeled violating images and a plurality of manually labeled normal images.
In step 302: sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity are removed from the first sample set, to obtain a target sample set.
In one embodiment, FIG. 4 is a flowchart of acquiring the target sample set provided by an embodiment of the present application. So that the first model can be obtained by training, this step is implemented through the following steps 401 to 403.
In step 401: an intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
The intermediate model is a model trained on the first sample set, and the server obtains the first model by further training on the basis of the intermediate model. For any sample image in the first sample set, the server uses the sample image as the model input and the category associated with the sample image as the expected output, that is, the label in supervised learning, to train the model.
In step 402: each sample image in the first sample set is classified and recognized based on the intermediate model to obtain a similarity between each sample image and the target category, where the similarity represents the possibility that the sample image belongs to the target category;
For any sample image in the first sample set, the server can input the sample image into the intermediate model, which classifies and recognizes the sample image to obtain a probability for the sample image. The probability represents the possibility that the sample image belongs to the target category, and the server uses the probability as the similarity between the sample image and the target category.
In some embodiments, the server can also calculate the similarity between a sample image and the target category using a similarity calculation formula. Other methods of calculating similarity are also applicable, which is not limited in the present application.
In step 403: for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity, the sample image is removed from the first sample set.
For example, if a sample image does not belong to the target category but its similarity to the target category is 90% while the specified similarity is 50%, the sample image is removed from the first sample set.
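The removal rule of step 403 can be sketched as a simple filter. The function name `clean_first_sample_set` and the representation of a sample as an `(image_id, category)` pair are assumptions for the example; the similarity values stand in for the probabilities produced by the intermediate model in step 402.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[str, str]  # assumed form: (image_id, labeled category)


def clean_first_sample_set(
    samples: Sequence[Sample],
    similarities: Sequence[float],
    target_category: str,
    specified_similarity: float,
) -> List[Sample]:
    """Drop negatives that the intermediate model finds too close to the target."""
    kept = []
    for sample, similarity in zip(samples, similarities):
        is_negative = sample[1] != target_category
        if is_negative and similarity > specified_similarity:
            continue  # a negative that is easily confused with positives
        kept.append(sample)
    return kept


samples = [("a", "violating"), ("b", "normal"), ("c", "normal")]
# "b" is labeled normal but scores 0.9 against a specified similarity of 0.5.
target_set = clean_first_sample_set(samples, [0.95, 0.9, 0.1], "violating", 0.5)
```

Positive samples are never removed by this rule, whatever their similarity, so the filter cannot reduce the number of target-category examples available to the first model.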
In some embodiments, samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples. The server can set the loss weight of the first model as required, as follows:
in response to a need to improve the recall rate of positive samples, the loss weight of the first model is set to a value greater than 1;
in response to a need to improve the recall rate of negative samples, the loss weight of the first model is set to a value less than 1.
For example, a loss weight of 0.5 means that more attention is paid to the accuracy of identifying negative samples, in the hope that the model identifies negative samples accurately. When the model predicts a low probability that a sample is negative, the server judges the sample to be positive, which preserves the recall of positive samples. A loss weight of 2 means that more attention is paid to the accuracy of identifying positive samples.
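The application does not give the loss function, so the following is only one plausible reading: a binary cross-entropy in which the loss weight multiplies the positive-sample term. Under that assumption, a weight above 1 makes a missed positive cost more than a missed negative, and a weight below 1 does the opposite. The function name `weighted_bce` is hypothetical.

```python
import math


def weighted_bce(p: float, y: int, loss_weight: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy with the positive term scaled by loss_weight.

    p is the predicted probability of the target category, y is 1 for a
    positive sample and 0 for a negative sample.
    """
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    return -(loss_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
```

With `loss_weight=2`, an uncertain prediction on a positive sample is penalized twice as heavily as the same prediction on a negative sample, which pushes the model toward recalling positives.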
在步骤303中:基于目标样本集中的样本图像以及该目标样本集中样本图像关联的类别,训练得到第一模型。In step 303: a first model is obtained by training based on the sample images in the target sample set and the categories associated with the sample images in the target sample set.
其中,服务器能够将目标样本集中的样本图像作为模型的输入,将样本图像关联的类别作为模型的期望输出,也即有监督学习中的标签,来训练模型,直至模型收敛,得到上述第一模型。Among them, the server can use the sample images in the target sample set as the input of the model, and use the category associated with the sample images as the expected output of the model, that is, the labels in supervised learning, to train the model until the model converges, and the above-mentioned first model is obtained. .
例如,服务器将标注后的样本图像输入到resnet50或inception-v3或efficient-b3等深度学习图像识别器中,设置学习率为0.001,基于优化器迭代80次,得到中间模型,该中间模型表示为M-stage1-v0。然后基于该中间模型对第一样本集中的各样本图像进行分类识别,基于识别结果对第一数据集进行清洗,移除容易与正样本混淆的负样本,得到目标数据集,在基于目标数据集训练得到上述第一模型,该第一模型记为M-stage1-v1。For example, the server inputs the labeled sample images into a deep learning image recognizer such as resnet50 or inception-v3 or efficient-b3, sets the learning rate to 0.001, and iterates 80 times based on the optimizer to obtain an intermediate model, which is expressed as M-stage1-v0. Then classify and identify each sample image in the first sample set based on the intermediate model, clean the first data set based on the recognition results, remove negative samples that are easily confused with positive samples, and obtain the target data set. Set training to obtain the above-mentioned first model, the first model is recorded as M-stage1-v1.
In some embodiments, the model is judged to have converged when its loss no longer decreases or when the number of training iterations reaches a specified number. The same convergence condition applies to the second model described later and is not repeated below.
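One possible reading of this convergence condition is sketched below. The `patience` and `min_delta` parameters are assumptions introduced to make "the loss no longer decreases" testable; they do not come from the text.

```python
from typing import Sequence

def has_converged(loss_history: Sequence[float],
                  max_iterations: int,
                  patience: int = 3,
                  min_delta: float = 1e-4) -> bool:
    """Return True when training should stop.

    Stops when the specified number of iterations is reached, or when the best
    loss of the last `patience` iterations improves on the earlier best by less
    than `min_delta` (both parameters are illustrative assumptions).
    """
    if len(loss_history) >= max_iterations:
        return True
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    recent_best = min(loss_history[-patience:])
    return best_before - recent_best < min_delta
```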
In some embodiments, after obtaining the first model through training, the server classifies each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the likelihood that the corresponding sample image belongs to the target category. The server constructs a second sample set based on the probability corresponding to each sample image and the first sample set, and the second sample set is used to train the second model.
2. Training the second model
FIG. 5 is a flowchart of training the second model provided by an embodiment of the present application. Taking execution by the server as an example, the process includes the following steps:
In step 501, a second sample set is constructed based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold.
In one embodiment, as shown in FIG. 6, which is a flowchart of constructing the second sample set provided by an embodiment of the present application, this step is implemented by the following steps 601 to 603.
In step 601, the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold are added to a third sample set.
The third sample set is either empty or includes at least one sample image, which is not limited in the embodiments of the present application.
In step 602, each sample image in the third sample set is cropped multiple times to obtain a plurality of cropped sample images.
Because the number of sample images that the server adds to the third sample set is small, cropping the sample images in the third sample set expands the number of sample images and also makes the second model generalize better.
For example, the server crops one sample image into 5 pictures, and these 5 pictures can be used as new sample images for model training.
In step 603, the plurality of cropped sample images and the categories associated with them are added to the third sample set to obtain the second sample set.
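Steps 601 to 603 can be sketched as follows. The text only says that one image is cropped into 5 pictures; taking the four corners plus the centre is an assumption, and the dictionary keys and function names are hypothetical. Images are represented by their dimensions and crop boxes rather than pixel data to keep the sketch self-contained.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)

def five_crop_boxes(width: int, height: int, crop_w: int, crop_h: int) -> List[Box]:
    """Return five crop boxes: four corners plus the centre (assumed layout)."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop size exceeds image size")
    cx, cy = (width - crop_w) // 2, (height - crop_h) // 2
    origins = [(0, 0), (width - crop_w, 0), (0, height - crop_h),
               (width - crop_w, height - crop_h), (cx, cy)]
    return [(x, y, x + crop_w, y + crop_h) for x, y in origins]

def build_second_sample_set(first_set: List[Dict], probability_threshold: float,
                            crop_w: int, crop_h: int) -> List[Dict]:
    # Step 601: keep samples whose first-model probability exceeds the threshold.
    third_set = [s for s in first_set if s["probability"] > probability_threshold]
    # Step 602: crop every kept sample several times.
    crops = []
    for s in third_set:
        for box in five_crop_boxes(s["width"], s["height"], crop_w, crop_h):
            # Each crop inherits the category of the image it was cut from.
            crops.append({"id": s["id"], "box": box, "category": s["category"]})
    # Step 603: add the crops and their categories to the third sample set.
    return third_set + crops
```

With this layout every retained sample contributes six entries to the second sample set: the original image and its five crops.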
In some embodiments, after obtaining the second sample set, the server can further extract the image feature of each sample image in the second sample set and the object feature of the sample object associated with each sample image. FIG. 7 is a flowchart of extracting image features and acquiring object features provided by an embodiment of the present application. Extracting the image features and acquiring the object features is implemented by the following steps 701 to 703.
In step 701, for any sample image in the second sample set, feature recognition is performed on the object of interest in the sample image to obtain the image feature of the sample image, where the image feature indicates at least one feature of the object of interest.
In response to the sample image including a plurality of objects of interest, the server acquires at least one feature of at least one object of interest from the sample image, proceeding in order of the sizes of the objects of interest, to obtain the image feature of the sample image.
For example, taking a human face as the object of interest, the server extracts face features from the sample image in order of face size to obtain the image feature, which may be age, gender, and so on. The server acquires the features of the faces of no more than three people in the sample image.
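The size-ordered selection in this example can be sketched as below. Processing the largest faces first is an assumption, since the text only says "in order of size"; the face records and the function name are hypothetical.

```python
from typing import Dict, List

def select_largest_faces(faces: List[Dict], max_faces: int = 3) -> List[Dict]:
    """Order detected faces by bounding-box area, largest first (assumed
    direction), and keep at most `max_faces` of them."""
    ranked = sorted(faces, key=lambda f: f["width"] * f["height"], reverse=True)
    return ranked[:max_faces]
```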
In step 702, the object identifier of the sample object associated with the sample image is acquired based on the image feature.
In step 703, the object feature of the sample object is acquired based on the object identifier.
The object feature of the sample object includes at least one of the following: violations in the last 7 days, user age, gender, city, historical browsing records, and the like. The content of the object feature is determined according to the application scenario, which is not limited in the present application.
In step 502, a second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
The second model trained by the server can accurately extract the features in an image and classify the image according to the extracted features to obtain the sample images belonging to the target category.
For example, the server trains a machine learning model such as XGBoost (eXtreme Gradient Boosting), with the learning rate set to 0.03, the maximum tree depth set to 6, and the parameter regularization coefficient set to 2, adjusts the parameters based on the categorical cross-entropy loss, and obtains the second model, denoted M-stage2-v1.
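The hyperparameters in this example might map onto an XGBoost parameter dictionary as sketched below. Two points are assumptions rather than statements from the text: the "parameter regularization coefficient" is read as the L2 term `lambda`, and the objective is taken to be `binary:logistic` because the model scores a single target category.

```python
# Hypothetical mapping of the example's settings onto XGBoost parameter names.
xgb_params = {
    "objective": "binary:logistic",  # assumed: one target category vs. the rest
    "eval_metric": "logloss",        # cross-entropy-style evaluation metric
    "eta": 0.03,                     # learning rate from the example
    "max_depth": 6,                  # maximum tree depth from the example
    "lambda": 2,                     # assumed: L2 regularization coefficient
}
```

A dictionary of this shape is what the `params` argument of `xgboost.train` accepts; the number of boosting rounds is not given in the text and is left out here.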
Combining the first model with the second model lets the first model raise the recall rate and the second model raise the accuracy rate. With each model handling its own role, the overall performance of the image classification model provided by the embodiments of the present application improves.
II. Use of the image classification model
FIG. 8 is a flowchart of the image classification method provided by the present application. The image classification method is implemented by steps 801 to 805:
In step 801, feature extraction is performed on the target image to obtain a first image feature of the target image.
The server can first acquire the target image and then perform feature extraction on it. The server can acquire the target image by real-time collection or can acquire an already collected target image from a database, which is not limited in the embodiments of the present application.
In step 802, a first probability of the target image is determined based on the first image feature of the target image, where the first probability represents the likelihood that the target image belongs to the target category.
The larger the value of the first probability, the more likely the target image belongs to the target category.
In step 803, in response to the first probability being greater than a first probability threshold, a second image feature is extracted from the target image, and the object feature of the target object associated with the target image is acquired.
In some embodiments, in response to the first probability being less than or equal to the first probability threshold, it is determined that the target image does not belong to the target category.
In step 804, a second probability of the target image is determined based on the second image feature and the object feature, where the second probability represents the likelihood that the target image belongs to the target category.
The server can fuse the second image feature and the object feature and then determine the second probability of the target image based on the fused feature.
In step 805, in response to the second probability being greater than a second probability threshold, it is determined that the target object belongs to the target category.
In some embodiments, in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, the target image is assigned to a specified task set, which stores images requiring manual processing.
For example, take a second probability of 80%, a second probability threshold of 90%, and a third probability threshold of 70%. Because the second probability is less than the second probability threshold, the server cannot determine that the target image belongs to the target category. However, because the second probability is greater than the third probability threshold, the target image is highly similar to the target category, so to improve the accuracy of the determination the server assigns the target object to the specified task set. The specified task set is the task queue handled in the manual processing stage. In this way, the server filters out difficult images for manual review.
In some embodiments, in response to the second probability being less than or equal to the third probability threshold, it is determined that the target image does not belong to the target category.
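The decision logic of steps 802 to 805, together with the manual-review branch, can be sketched as a two-stage cascade. The names `Decision` and `classify` are hypothetical, and passing the second stage as a callable is a design assumption made so that it runs only when the first probability exceeds the first threshold.

```python
from enum import Enum
from typing import Callable

class Decision(Enum):
    TARGET = "target"
    MANUAL_REVIEW = "manual_review"
    NOT_TARGET = "not_target"

def classify(first_probability: float,
             second_probability_fn: Callable[[], float],
             t1: float, t2: float, t3: float) -> Decision:
    """Two-stage decision with thresholds t1 (first), t2 (second), t3 (third)."""
    if first_probability <= t1:
        return Decision.NOT_TARGET        # rejected by the first model
    p2 = second_probability_fn()          # second model runs only when needed
    if p2 > t2:
        return Decision.TARGET            # step 805
    if p2 > t3:
        return Decision.MANUAL_REVIEW     # t3 < p2 <= t2: queue for manual review
    return Decision.NOT_TARGET            # p2 <= t3
```

With the figures from the example above (second probability 0.80, second threshold 0.90, third threshold 0.70), the function returns `Decision.MANUAL_REVIEW`, provided the first probability exceeds the first threshold.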
According to the image classification method provided by the embodiments of the present application, when an image needs to be classified, it is only necessary to input the image to be tested into the image classification model and set the second probability threshold as required in order to classify the image effectively.
In some embodiments, when using the image classification model, the server can set the third probability threshold according to an accuracy-recall variation curve. The curve describes the relationship among a recall parameter describing the recall rate, an accuracy parameter describing the accuracy rate of determining the target category, and the third probability threshold. In other words, the curve is a three-way correspondence among the recall rate, the accuracy rate, and the third probability threshold. When there are explicit recall and accuracy requirements, the server determines the corresponding third probability threshold from the curve, the required recall rate, and the required accuracy rate, so that different third probability thresholds can be selected for different business requirements.
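One way to read a threshold off such a curve is sketched below. It assumes the accuracy rate is measured as precision on a labelled validation set and that the lowest candidate threshold meeting both requirements is chosen; the function names and the candidate grid are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

def precision_recall_at(scores: Sequence[float], labels: Sequence[int],
                        threshold: float) -> Tuple[float, float]:
    """Precision and recall when samples scoring above `threshold` are positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   candidates: List[float],
                   min_precision: float, min_recall: float) -> Optional[float]:
    """Return the lowest candidate threshold meeting both requirements, or None."""
    for t in sorted(candidates):
        p, r = precision_recall_at(scores, labels, t)
        if p >= min_precision and r >= min_recall:
            return t
    return None
```

Returning `None` signals that no candidate threshold meets both requirements at once, in which case one of the two requirements would have to be relaxed.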
For ease of understanding, the structure of the image classification model proposed by the embodiments of the present application is described below. FIG. 9 is a schematic structural diagram of the image classification model provided by an embodiment of the present application. Execution by the server is taken as an example. The target image is input into the first model 810, which performs feature extraction on the target image to extract the first image feature. The first model then determines the first probability of the target image, which represents the likelihood that the target image belongs to the target category. Because the first model is designed to preserve recall, that is, it classifies images suspected of belonging to the target category into the target category, the server determines that the target image does not belong to the target category in response to the first probability output by the first model being less than or equal to the first probability threshold, and determines that the target image is of the target category in response to the first probability being greater than the first probability threshold. In some embodiments, because the first model preserves recall, the accuracy with which the server determines the target image to be of the target category may not meet business requirements, so the server can classify the target image again based on the second model to ensure the accuracy of the classification result. That is, as shown in FIG. 9, when the second probability that the target image belongs to the target category is greater than the second probability threshold, it is determined that the target object belongs to the target category.
It should be noted that all the embodiments of the present application can be implemented independently or in combination with other embodiments, all of which fall within the protection scope claimed by the present application.
FIG. 10 is a diagram of an image classification apparatus provided by an embodiment of the present application. As shown in FIG. 10, an image classification apparatus 900 is proposed, including:
a first feature extraction module 901, configured to perform feature extraction on a target image to obtain a first image feature of the target image;
a first probability determination module 902, configured to determine a first probability of the target image based on the first image feature of the target image, where the first probability represents the likelihood that the target image belongs to the target category;
a first probability judgment module 903, configured to, in response to the first probability being greater than a first probability threshold, extract a second image feature from the target image and acquire the object feature of the target object associated with the target image;
the first probability judgment module 903 being further configured to determine a second probability of the target image based on the second image feature and the object feature, where the second probability represents the likelihood that the target image belongs to the target category;
a second probability judgment module 904, configured to determine that the target object belongs to the target category in response to the second probability being greater than a second probability threshold.
In some embodiments, the first probability determination module is further configured to:
in response to the first probability being less than or equal to the first probability threshold, determine that the target image does not belong to the target category.
In some embodiments, the first probability judgment module is further configured to:
in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assign the target image to a specified task set, where the specified task set stores images requiring manual processing;
in response to the second probability being less than or equal to the third probability threshold, determine that the target image does not belong to the target category.
In some embodiments, the third probability threshold is determined based on an accuracy-recall variation curve, a recall rate, and an accuracy rate, where the curve describes the relationship among the recall rate, the accuracy rate, and the third probability threshold.
FIG. 11 is a diagram of an apparatus for training an image classification model provided by an embodiment of the present application. As shown in FIG. 11, an apparatus 1000 for training an image classification model is proposed, including:
a first acquisition module 1001, configured to acquire a first sample set, where the first sample set includes a plurality of sample images and each sample image is associated with a pre-annotated category;
a target sample set acquisition module 1002, configured to remove from the first sample set the sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
a first training module 1003, configured to obtain the first model by training based on the sample images in the target sample set and the categories associated with those sample images;
a probability acquisition module 1004, configured to classify each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, where the probability indicates the likelihood that the corresponding sample object belongs to the target category;
a second acquisition module 1005, configured to construct a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
a second training module 1006, configured to obtain the second model by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
In some embodiments, samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
the first training module is further configured to set the loss weight of the first model to a value greater than 1 in response to a need to improve the recall rate of the positive samples, and to set the loss weight of the first model to a value less than 1 in response to a need to improve the recall rate of the negative samples.
In some embodiments, the target sample set acquisition module includes:
a first training unit, configured to obtain an intermediate model by training based on the sample images in the first sample set and the categories associated with those sample images;
a similarity acquisition unit, configured to classify each sample image in the first sample set based on the intermediate model to obtain the similarity between each sample image and the target category, where the similarity represents the likelihood that the sample image belongs to the target category;
a filtering unit, configured to, for any sample image in the first sample set, remove the sample image from the first sample set in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity.
In some embodiments, the second acquisition module includes:
a third sample set acquisition unit, configured to add the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold to a third sample set;
a cropping processing unit, configured to crop each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
a second sample set acquisition unit, configured to add the plurality of cropped sample images and the categories associated with them to the third sample set to obtain the second sample set.
In some embodiments, the apparatus further includes a feature extraction module, and the feature extraction module includes:
a feature recognition unit, configured to, for any sample image in the second sample set, perform feature recognition on the object of interest in the sample image to obtain the image feature of the sample image, where the image feature indicates at least one feature of the object of interest;
an acquisition unit, configured to acquire, based on the image feature, the object identifier of the sample object associated with the sample image;
a portrait acquisition unit, configured to acquire the object feature of the sample object based on the object identifier.
In some embodiments, the feature recognition unit is configured to, in response to the sample image including a plurality of objects of interest, acquire at least one feature of at least one of the objects of interest from the sample image, proceeding in order of the sizes of the objects of interest, to obtain the image feature of the sample image.
An electronic device according to another exemplary embodiment of the present application is described below.
In some embodiments, the electronic device of the present application includes at least one processor and at least one memory. The memory stores program code which, when executed by the processor, causes the processor to implement the following steps:
performing feature extraction on a target image to obtain a first image feature of the target image;
determining a first probability of the target image based on the first image feature of the target image, where the first probability represents the likelihood that the target image belongs to the target category;
in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image and acquiring the object feature of the target object associated with the target image;
determining a second probability of the target image based on the second image feature and the object feature, where the second probability represents the likelihood that the target image belongs to the target category;
in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following step:
in response to the first probability being less than or equal to the first probability threshold, determining that the target image does not belong to the target category.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set, where the specified task set stores images requiring manual processing;
in response to the second probability being less than or equal to the third probability threshold, determining that the target image does not belong to the target category.
In some embodiments, the third probability threshold is determined based on an accuracy-recall variation curve, a recall rate, and an accuracy rate, where the curve describes the relationship among the recall rate, the accuracy rate, and the third probability threshold.
在一些实施例中,本申请实施例中的电子设备包括至少一个处理器;以及与该至少一个处理器通信连接的存储器;其中,该存储器存储有可被该至少一个处理器执行的指令,该指 令被该至少一个处理器执行,以使该至少一个处理器能够实现下述步骤:In some embodiments, the electronic device in the embodiments of the present application includes at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the memory The instructions are executed by the at least one processor to enable the at least one processor to implement the following steps:
获取第一样本集,该第一样本集中包括多张样本图像,各该样本图像关联有预先标注的类别;obtaining a first sample set, the first sample set includes a plurality of sample images, and each of the sample images is associated with a pre-marked category;
从该第一样本集中移除不属于目标类别且与该目标类别的相似度大于指定相似度的样本图像,得到目标样本集;Remove sample images from the first sample set that do not belong to the target category and have a similarity with the target category greater than the specified similarity to obtain a target sample set;
基于该目标样本集中的样本图像以及该目标样本集中样本图像关联的类别,训练得到该第一模型;The first model is obtained by training based on the sample images in the target sample set and the category associated with the sample images in the target sample set;
基于该第一模型分别对该第一样本集中的各样本图像进行分类识别,得到各该样本图像对应的概率,该概率用于指示对应的样本图像属于该目标类别的可能性;Based on the first model, classify and identify each sample image in the first sample set, to obtain a probability corresponding to each sample image, and the probability is used to indicate the possibility that the corresponding sample image belongs to the target category;
基于该第一样本集中对应的概率大于指定概率阈值的样本图像,构建第二样本集;Constructing a second sample set based on sample images whose probability corresponding to the first sample set is greater than the specified probability threshold;
基于该第二样本集中样本图像的图像特征以及样本图像关联的样本对象的对象特征,训练得到该第二模型。The second model is obtained by training based on the image features of the sample images in the second sample set and the object features of the sample objects associated with the sample images.
在一些实施例中,属于该目标类别的样本为正样本,不属于该目标类别的样本为负样本;In some embodiments, samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
该至少一个处理器执行的指令,还用于实现下述步骤:The instructions executed by the at least one processor are also used to implement the following steps:
响应于需要提高该正样本的召回率,设置该第一模型的损失权重为大于1的值;In response to the need to improve the recall rate of the positive sample, set the loss weight of the first model to a value greater than 1;
响应于需要提高该负样本的召回率,设置该第一模型的损失权重为小于1的值。In response to the need to improve the recall rate of the negative samples, the loss weight of the first model is set to a value less than 1.
在一些实施例中,该至少一个处理器执行的指令,还用于实现下述步骤:In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
基于该第一样本集中的样本图像以及该第一样本集中样本图像关联的类别,训练得到中间模型;An intermediate model is obtained by training based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
基于该中间模型分别对该第一样本集中的各样本图像进行分类识别,得到各该样本图像与该目标类别的相似度,该相似度用于表示样本图像属于该目标类别的可能性;Based on the intermediate model, classify and identify each sample image in the first sample set, and obtain the similarity between each sample image and the target category, and the similarity is used to indicate the possibility that the sample image belongs to the target category;
对于该第一样本集中的任一样本图像,响应于该任一样本图像不属于该目标类别且与该目标类别的相似度大于指定相似度,则从该第一样本集中移除该任一样本图像。For any sample image in the first sample set, in response to the any sample image not belonging to the target category and the similarity with the target category is greater than the specified similarity, remove the any sample image from the first sample set a sample image.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
adding the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold to a third sample set;
cropping each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
adding the plurality of cropped sample images and the categories associated with them to the third sample set to obtain the second sample set.
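A minimal sketch of this construction is given below. The text does not specify how the crops are chosen or how their categories are obtained; the sketch assumes uniformly random crop positions and assumes that each crop inherits the category of the image it was cut from. Images are represented as plain nested lists, and all function names are hypothetical.

```python
import random

def random_crops(image, crop_h, crop_w, n_crops, rng):
    """Cut n_crops sub-images of size crop_h x crop_w from image (a list of rows)."""
    h, w = len(image), len(image[0])
    crops = []
    for _ in range(n_crops):
        top = rng.randint(0, h - crop_h)    # randint is inclusive at both ends
        left = rng.randint(0, w - crop_w)
        crops.append([row[left:left + crop_w] for row in image[top:top + crop_h]])
    return crops

def build_second_sample_set(first_set, probs, threshold, crop_h, crop_w, n_crops, seed=0):
    """first_set: list of (image, label); probs: first-model probability per image."""
    rng = random.Random(seed)
    # Third sample set: images whose probability exceeds the specified threshold.
    third_set = [(img, label) for (img, label), p in zip(first_set, probs) if p > threshold]
    second_set = list(third_set)
    for img, label in third_set:
        for crop in random_crops(img, crop_h, crop_w, n_crops, rng):
            second_set.append((crop, label))  # assumption: crop inherits the label
    return second_set
```

With `n_crops` crops per image, the second sample set holds `(n_crops + 1)` entries for every image that passed the threshold.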
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
for any sample image in the second sample set, performing feature recognition on the object of interest in the sample image to obtain an image feature of the sample image, the image feature indicating at least one feature of the object of interest;
obtaining, based on the image feature, the object identifier of the sample object associated with the sample image;
obtaining, based on the object identifier, the object feature of the sample object.
In some embodiments, the instructions executed by the at least one processor are further used to implement the following steps:
in response to the sample image including a plurality of objects of interest, acquiring, in order of the sizes of the objects of interest, at least one feature of at least one of the objects of interest from the sample image to obtain the image feature of the sample image.
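The size-ordered acquisition can be sketched as below. The text states only that the objects are processed in size order; the sketch assumes that size means bounding-box area, that the largest object comes first, and that each detected object is a dictionary with hypothetical keys `box` and `feature`.

```python
def features_by_object_size(objects, max_objects):
    """Return the features of up to max_objects objects, largest bounding box first.

    objects: list of dicts with 'box' = (x0, y0, x1, y1) and 'feature' (assumed layout).
    """
    def area(obj):
        x0, y0, x1, y1 = obj["box"]
        return max(0, x1 - x0) * max(0, y1 - y0)

    ranked = sorted(objects, key=area, reverse=True)  # stable sort keeps ties in input order
    return [obj["feature"] for obj in ranked[:max_objects]]

detections = [
    {"box": (0, 0, 2, 2), "feature": "small"},
    {"box": (0, 0, 10, 10), "feature": "large"},
    {"box": (0, 0, 5, 4), "feature": "medium"},
]
top_two = features_by_object_size(detections, 2)
```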
The electronic device 130 according to this embodiment of the present application is described below with reference to FIG. 12.
As shown in FIG. 12, the electronic device 130 takes the form of a general-purpose electronic device. Components of the electronic device 130 may include, but are not limited to: the at least one processor 131, the at least one memory 132, and a bus 133 connecting different system components (including the memory 132 and the processor 131).
The bus 133 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a processor or local bus using any of a variety of bus structures.
The memory 132 may include readable media in the form of volatile memory, such as a random access memory (RAM) 1321 and/or a cache memory 1322, and may further include a read-only memory (ROM) 1323.
The memory 132 may also include a program/utility 1325 having a set of (at least one) program modules 1324. Such program modules 1324 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The electronic device 130 may also communicate with one or more external devices 134 (for example, a keyboard or a pointing device), with one or more devices that enable a user to interact with the electronic device 130, and/or with any device (for example, a router or a modem) that enables the electronic device 130 to communicate with one or more other electronic devices. Such communication may take place through an input/output (I/O) interface 135. The electronic device 130 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 136. As shown in the figure, the network adapter 136 communicates with the other modules of the electronic device 130 through the bus 133. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 130, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
In some embodiments, the image classification method provided by the present application is implemented in the form of a computer program product, the computer program product including computer instructions which, when executed by a processor, implement the following steps:
performing feature extraction on a target image to obtain a first image feature of the target image;
determining a first probability of the target image based on the first image feature of the target image, the first probability representing the possibility that the target image belongs to a target category;
in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
determining a second probability of the target image based on the second image feature and the object feature, the second probability representing the possibility that the target image belongs to the target category;
in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
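The control flow of these steps can be sketched as a two-stage cascade. The models and feature extractors are passed in as plain callables; their names and signatures are assumptions made for illustration, and the return value (a decision plus both probabilities) is likewise a choice of this sketch.

```python
def classify_image(image, first_model, second_model,
                   extract_second_feature, get_object_feature, t1, t2):
    """Two-stage classification.

    first_model(image) -> first probability (assumed callable).
    second_model(image_feature, object_feature) -> second probability (assumed callable).
    t1, t2: first and second probability thresholds.
    Returns (belongs_to_target, first_probability, second_probability_or_None).
    """
    p1 = first_model(image)
    if p1 <= t1:
        # The second stage runs only when the first probability exceeds t1.
        return False, p1, None
    image_feature = extract_second_feature(image)
    object_feature = get_object_feature(image)
    p2 = second_model(image_feature, object_feature)
    return p2 > t2, p1, p2

result = classify_image(
    "img",
    first_model=lambda im: 0.7,
    second_model=lambda feat, obj: 0.9,
    extract_second_feature=lambda im: "image-feature",
    get_object_feature=lambda im: "object-feature",
    t1=0.5,
    t2=0.8,
)
```

Because the second stage is skipped for images that fail the first threshold, the object feature lookup and the second model are evaluated only for the candidates that the first model lets through.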
In some embodiments, the training method of an image classification model provided by the present application is implemented in the form of a computer program product, the computer program product including computer instructions which, when executed by a processor, implement the following steps:
acquiring a first sample set, the first sample set including a plurality of sample images, each sample image being associated with a pre-labeled category;
removing from the first sample set the sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
training the first model based on the sample images in the target sample set and the categories associated with those sample images;
classifying each sample image in the first sample set with the first model to obtain a probability corresponding to each sample image, the probability indicating the possibility that the corresponding sample image belongs to the target category;
constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
training the second model based on the image features of the sample images in the second sample set and the object features of the sample objects associated with those sample images.
In some embodiments, the computer program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Examples of readable storage media further include any of the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The program product for image classification according to the embodiments of the present application may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on an electronic device. However, the program product of the present application is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in conjunction with an instruction execution system, apparatus, or device.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code carried therein. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, and the like, or any suitable combination of the above. The program code for performing the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's electronic device, partly on the user's device, as a stand-alone software package, partly on the user's electronic device and partly on a remote electronic device, or entirely on the remote electronic device or a server. Where a remote electronic device is involved, the remote electronic device may be connected to the user's electronic device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external electronic device (for example, through the Internet using an Internet service provider).
In some embodiments, a non-volatile computer-readable storage medium is provided, the non-volatile computer-readable storage medium storing a computer program for causing a computer to implement the following steps:
performing feature extraction on a target image to obtain a first image feature of the target image;
determining a first probability of the target image based on the first image feature of the target image, the first probability representing the possibility that the target image belongs to a target category;
in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
determining a second probability of the target image based on the second image feature and the object feature, the second probability representing the possibility that the target image belongs to the target category;
in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
In some embodiments, a non-volatile computer-readable storage medium is provided, the non-volatile computer-readable storage medium storing a computer program for causing a computer to implement the following steps:
acquiring a first sample set, the first sample set including a plurality of sample images, each sample image being associated with a pre-labeled category;
removing from the first sample set the sample images that do not belong to the target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
training the first model based on the sample images in the target sample set and the categories associated with those sample images;
classifying each sample image in the first sample set with the first model to obtain a probability corresponding to each sample image, the probability indicating the possibility that the corresponding sample image belongs to the target category;
constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
training the second model based on the image features of the sample images in the second sample set and the object features of the sample objects associated with those sample images.
Exemplarily, an embodiment of the present application provides an image classification method, including:
acquiring a target image;
performing feature extraction on the target image to obtain a first image feature of the target image;
determining, using the first image feature of the target image, a first probability that the target image belongs to a target category;
when the first probability is higher than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image; fusing the second image feature and the object feature using a decision tree to obtain a second probability that the target image belongs to the target category;
when the second probability is higher than a second probability threshold, determining that the target object belongs to the target category.
In some embodiments, after determining the first probability that the target image belongs to the target category, the method further includes:
when the first probability is less than or equal to the first probability threshold, determining that the target image does not belong to the target category.
In some embodiments, after determining the second probability that the target image belongs to the target category, the method further includes:
when the second probability is higher than a third probability threshold and less than the second probability threshold, assigning the target object to a specified task set;
when the second probability is less than the third probability threshold, determining that the target image does not belong to the target category.
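Together with the second probability threshold, these two conditions give a three-way routing of the second probability, sketched below. The text does not say what happens when the second probability exactly equals a threshold; the boundary handling here (equality with the second threshold goes to the task set, equality with the third threshold counts as not belonging) is an assumption, as are the returned strings. The sketch also assumes that the third threshold is smaller than the second.

```python
def route_by_second_probability(p2, t2, t3):
    """Map a second probability to one of three outcomes.

    t2: second probability threshold; t3: third probability threshold (t3 < t2 assumed).
    """
    if t3 >= t2:
        raise ValueError("expected third threshold < second threshold")
    if p2 > t2:
        return "target"        # the target object belongs to the target category
    if p2 > t3:
        return "task_set"      # assigned to the specified task set
    return "not_target"        # the target image does not belong to the target category

outcomes = [route_by_second_probability(p, t2=0.8, t3=0.4) for p in (0.9, 0.6, 0.1)]
```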
In some embodiments, a precision-recall variation curve is pre-stored, the precision-recall variation curve describing the association among a recall parameter describing the recall rate, an accuracy parameter describing the determination accuracy for the target category, and the third probability threshold;
the third probability threshold is set according to the third recall rate index and accuracy index.
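One way such a curve could be used to choose a threshold is sketched below. The text does not describe the selection rule; the sketch assumes a set of labeled validation probabilities, evaluates precision and recall at each candidate threshold, and picks the candidate with the highest precision among those that meet a required recall. All names are hypothetical.

```python
def precision_recall_at(threshold, probs, labels):
    """Precision and recall when samples with probability > threshold are predicted positive."""
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if p > threshold and y == 1)
    fp = sum(1 for p, y in pairs if p > threshold and y == 0)
    fn = sum(1 for p, y in pairs if p <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # no positive predictions: no false alarms
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def pick_threshold(probs, labels, min_recall, candidates):
    """Return the candidate with the highest precision whose recall is at least min_recall."""
    best = None
    for t in candidates:
        precision, recall = precision_recall_at(t, probs, labels)
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (t, precision)
    return None if best is None else best[0]

probs = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
chosen = pick_threshold(probs, labels, min_recall=1.0, candidates=[0.1, 0.3, 0.5, 0.7])
```

Tabulating `precision_recall_at` over the candidates gives the stored curve itself; the selection rule is then a lookup against whatever recall and accuracy targets are required.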
In some embodiments, a pre-trained first model is used to perform feature extraction on the target image and to determine the first probability that the target image belongs to the target category, wherein the first model is trained according to the following method:
acquiring a first sample set, the first sample set including a plurality of sample images, each sample image being associated with a pre-labeled category;
filtering out the sample images in the first sample set that do not belong to the target category and whose similarity to the target category is higher than a specified similarity, to obtain a target sample set;
taking the sample images in the target sample set as the input of the first model and the categories of the sample images as the expected output of the first model, training the first model until the training of the first model converges.
In some embodiments, a second model based on a decision tree is used to fuse the second image feature and the object feature to obtain the second probability that the target image belongs to the target category, wherein the second model is trained according to the following method:
acquiring the sample images whose probability of belonging to the target category is greater than a specified probability threshold to construct a second sample set;
for any sample image in the second sample set, extracting an image feature from the sample image, and acquiring an object feature of the target object associated with the sample image;
training the second model using the image feature and the object feature until the training of the second model converges.
In some embodiments, samples of the target category are positive samples, and samples not belonging to the target category are negative samples;
when the recall rate of the positive samples needs to be improved, the loss weight of the first model is set to a value greater than 1;
when the recall rate of the negative samples needs to be improved, the loss weight of the first model is set to a value less than 1.
In some embodiments, filtering out the sample images in the first sample set that do not belong to the target category and whose similarity to the target category is higher than the specified similarity includes:
before the first model is trained, taking the sample images in the first sample set as the input of the first model and the categories of the sample images as the expected output of the first model, training the first model until the training of the first model converges;
inputting each sample image in the first sample set into the first model, and taking the resulting probability that the sample image belongs to the target category as its similarity to the target category;
if the sample image does not belong to the target category and its similarity to the target category is higher than the specified similarity, filtering out the sample image.
In some embodiments, acquiring the sample images whose probability of belonging to the target category is greater than the specified probability threshold to construct the second sample set includes:
cropping each sample image in the second sample set multiple times to obtain a plurality of cropped sample images;
acquiring the respective categories of the sample images in the second sample set and of the plurality of cropped sample images, and constructing the third sample set composed of the sample images and their corresponding categories.
In some embodiments, extracting the image feature from the sample image and acquiring the object feature of the target object associated with the sample image includes:
performing feature recognition on the object of interest in the sample image, and acquiring feature information of the object of interest from the sample image;
acquiring the object identifier of the target object associated with the sample image;
acquiring the object feature of the target object according to the object identifier.
In some embodiments, performing feature recognition on the object of interest in the sample image and acquiring the feature information of the object of interest from the sample image includes:
when the sample image includes a plurality of objects of interest, acquiring, in order of the sizes of the objects of interest, the feature information of at least one of the objects of interest from the sample image.
Exemplarily, an embodiment of the present application provides a training method for an image classification model, including:
acquiring a first sample set, the first sample set including a plurality of sample images, each sample image being associated with a pre-labeled category;
filtering out the sample images in the first sample set that do not belong to the target category and whose similarity to the target category is higher than a specified similarity, to obtain a target sample set;
taking the sample images in the target sample set as the input of the first model and the categories of the sample images as the expected output of the first model, training the first model until the training of the first model converges;
classifying each sample image in the first sample set with the trained first model to obtain the probability that each sample image belongs to the target category;
acquiring the sample images whose probability of belonging to the target category is greater than a specified probability threshold to construct a second sample set;
for any sample image in the second sample set, extracting an image feature from the sample image, and acquiring an object feature of the target object associated with the sample image;
training the second model using the image feature and the object feature until the training of the second model converges.
In some embodiments, samples of the target category are positive samples, and samples not belonging to the target category are negative samples;
when the recall rate of the positive samples needs to be improved, the loss weight of the first model is set to a value greater than 1;
when the recall rate of the negative samples needs to be improved, the loss weight of the first model is set to a value less than 1.
In some embodiments, filtering out the sample images in the first sample set that do not belong to the target category and whose similarity to the target category is higher than the specified similarity includes:
before the first model is trained, taking the sample images in the first sample set as the input of the first model and the categories of the sample images as the expected output of the first model, training the first model until the training of the first model converges;
inputting each sample image in the first sample set into the first model, and taking the resulting probability that the sample image belongs to the target category as its similarity to the target category;
if the sample image does not belong to the target category and its similarity to the target category is higher than the specified similarity, filtering out the sample image.
In some embodiments, acquiring the sample images whose probability of belonging to the target category is greater than the specified probability threshold to construct the second sample set includes:
cropping each sample image in the second sample set multiple times to obtain a plurality of cropped sample images;
acquiring the respective categories of the sample images in the second sample set and of the plurality of cropped sample images, and constructing the third sample set composed of the sample images and their corresponding categories.
In some embodiments, extracting the image feature from the sample image and acquiring the object feature of the target object associated with the sample image includes:
performing feature recognition on the object of interest in the sample image, and acquiring feature information of the object of interest from the sample image;
acquiring the object identifier of the target object associated with the sample image;
acquiring the object feature of the target object according to the object identifier.
In some embodiments, performing feature recognition on the object of interest in the sample image and acquiring the feature information of the object of interest from the sample image includes:
when the sample image includes a plurality of objects of interest, acquiring, in order of the sizes of the objects of interest, the feature information of at least one of the objects of interest from the sample image.
Exemplarily, the present application further provides an image classification apparatus, the apparatus including:
an image acquisition module, configured to acquire a target image;
a first feature extraction module, configured to perform feature extraction on the target image to obtain a first image feature of the target image;
a first probability determination module, configured to determine, using the first image feature of the target image, a first probability that the target image belongs to a target category;
a first probability judgment module, configured to, when the first probability is higher than a first probability threshold, extract a second image feature from the target image and acquire an object feature of a target object associated with the target image, and to fuse the second image feature and the object feature using a decision tree to obtain a second probability that the target image belongs to the target category;
a second probability judgment module, configured to determine that the target object belongs to the target category when the second probability is higher than a second probability threshold.
In some embodiments, the first probability determination module is further configured to:
when the first probability is less than or equal to the first probability threshold, determine that the target image does not belong to the target category.
In some embodiments, the first probability judgment module is further configured to:
when the second probability is higher than a third probability threshold and less than the second probability threshold, assign the target object to a specified task set;
when the second probability is less than the third probability threshold, determine that the target image does not belong to the target category.
In some embodiments, a precision-recall variation curve is pre-stored, the precision-recall variation curve describing the association among a recall parameter describing the recall rate, an accuracy parameter describing the determination accuracy for the target category, and the third probability threshold;
the third probability threshold is set according to the third recall rate index and accuracy index.
In some embodiments, a pre-trained first model is used to perform feature extraction on the target image and to determine the first probability that the target image belongs to the target category, wherein the first model is trained by the following modules:
a first sample set acquisition module, configured to acquire a first sample set, the first sample set including a plurality of sample images, each sample image being associated with a pre-labeled category;
a filtering module, configured to filter out the sample images in the first sample set that do not belong to the target category and whose similarity to the target category is higher than a specified similarity, to obtain a target sample set;
a first model training module, configured to take the sample images in the target sample set as the input of the first model and the categories of the sample images as the expected output of the first model, and to train the first model until the training of the first model converges.
在一些实施例中,采用基于决策树的第二模型对该第二图像特征和该对象特征进行融合处理,得到该目标图像属于该目标类别的第二概率,其中,该第二模型是根据以下模块训练的:In some embodiments, a second model based on a decision tree is used to fuse the second image feature and the object feature to obtain a second probability that the target image belongs to the target category, wherein the second model is based on the following Module trained:
第二样本集获取模块,被配置为获取属于该目标类别的概率大于指定概率阈值的样本图像构建第二样本集;The second sample set obtaining module is configured to obtain sample images whose probability of belonging to the target category is greater than the specified probability threshold to construct a second sample set;
特征提取模块,被配置为针对该第二样本集中的任一样本图像,从该样本图像中提取图像特征,并获取与该样本图像关联的目标对象的对象特征;A feature extraction module, configured to extract image features from the sample image for any sample image in the second sample set, and obtain object features of the target object associated with the sample image;
第二模型训练模块,被配置为采用该图像特征和该对象特征训练该第二模型,直至该第二模型训练收敛。The second model training module is configured to use the image feature and the object feature to train the second model until the second model training converges.
In some embodiments, samples of the target category are positive samples, and samples that do not belong to the target category are negative samples;
when the recall rate of the positive samples needs to be improved, the loss weight of the first model is set to a value greater than 1;
when the recall rate of the negative samples needs to be improved, the loss weight of the first model is set to a value less than 1.
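The effect of such a loss weight can be illustrated with a weighted binary cross-entropy. This is a sketch under the assumption that the weight multiplies the positive-sample term of the loss; the text does not name the loss function or say which term the weight applies to, and `weighted_bce` and `pos_weight` are hypothetical names.

```python
import math


def weighted_bce(prob: float, label: int, pos_weight: float = 1.0) -> float:
    """Binary cross-entropy for one sample with a weight on the positive term.

    pos_weight > 1 makes a missed positive cost more, which tends to raise
    positive recall; pos_weight < 1 does the opposite, which tends to raise
    negative recall.
    """
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    if label == 1:
        return -pos_weight * math.log(prob)
    return -math.log(1.0 - prob)
```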
In some embodiments, the filtering module includes:
an initial training unit, configured to, before the first model is trained, train the first model until it converges, using the sample images in the first sample set as the input of the first model and the categories of the sample images as the expected output of the first model;
a similarity acquisition unit, configured to input each sample image in the first sample set into the first model and take the resulting probability that the sample image belongs to the target category as its similarity to the target category;
a filtering unit, configured to filter out a sample image if the sample image does not belong to the target category and its similarity to the target category is higher than the specified similarity.
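The filtering step can be sketched as below. The function name `filter_confusing_negatives` and the `similarity_fn` callback (standing in for the initially trained model's target-category probability) are assumptions made for illustration.

```python
from typing import Callable, Hashable, List, Tuple, TypeVar

Image = TypeVar("Image")


def filter_confusing_negatives(
        samples: List[Tuple[Image, Hashable]],
        target_label: Hashable,
        similarity_fn: Callable[[Image], float],
        specified_similarity: float) -> List[Tuple[Image, Hashable]]:
    """Drop samples that are not labeled as the target category but whose
    similarity to the target category exceeds specified_similarity."""
    kept = []
    for image, label in samples:
        if label != target_label and similarity_fn(image) > specified_similarity:
            continue  # filtered out: non-target sample that looks like the target
        kept.append((image, label))
    return kept
```

Samples labeled as the target category are always kept, whatever their similarity.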
In some embodiments, the second sample set acquisition module includes:
a cropping unit, configured to crop each sample image in the second sample set multiple times to obtain a plurality of cropped sample images;
a third sample set acquisition unit, configured to acquire the respective categories of the sample images in the second sample set and of the cropped sample images, and to construct the third sample set from the sample images and their corresponding categories.
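A minimal sketch of the multi-crop expansion follows. It assumes images are plain 2D lists, that crops are taken at random positions, and that each crop inherits the category of its source image; none of these details is stated in the text, and `random_crops` and `expand_with_crops` are hypothetical names.

```python
import random
from typing import List, Tuple

Image = List[List[int]]


def random_crops(image: Image, crop_h: int, crop_w: int,
                 n: int, rng: random.Random) -> List[Image]:
    """Take n crops of size crop_h x crop_w at random positions."""
    h, w = len(image), len(image[0])
    crops = []
    for _ in range(n):
        top = rng.randint(0, h - crop_h)    # randint is inclusive at both ends
        left = rng.randint(0, w - crop_w)
        crops.append([row[left:left + crop_w]
                      for row in image[top:top + crop_h]])
    return crops


def expand_with_crops(samples: List[Tuple[Image, str]], crop_h: int,
                      crop_w: int, n: int, seed: int = 0
                      ) -> List[Tuple[Image, str]]:
    """Keep each original sample and append n labeled crops of it."""
    rng = random.Random(seed)
    expanded = []
    for image, label in samples:
        expanded.append((image, label))
        for crop in random_crops(image, crop_h, crop_w, n, rng):
            expanded.append((crop, label))  # assumption: crop keeps the label
    return expanded
```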
In some embodiments, the feature extraction module includes:
a feature information acquisition unit, configured to perform feature recognition on an object of interest in the sample image and acquire feature information of the object of interest from the sample image;
an object identifier acquisition unit, configured to acquire the object identifier of the target object associated with the sample image;
an object feature acquisition unit, configured to acquire the object features of the target object according to the object identifier.
In some embodiments, the feature information acquisition unit includes:
when the sample image includes a plurality of objects of interest, acquiring feature information of at least one of the objects of interest from the sample image in order of the sizes of the objects of interest.
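Selecting objects of interest by size might look like the sketch below. It assumes each object is described by a bounding box `(x, y, width, height)`, that size means box area, and that the order is largest first; the text only says "in order of size", so all three are assumptions, as is the name `largest_objects`.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def largest_objects(boxes: List[Box], k: int = 1) -> List[Box]:
    """Return up to k boxes, ordered from largest area to smallest."""
    return sorted(boxes, key=lambda b: b[2] * b[3], reverse=True)[:k]
```

Feature information would then be extracted from the returned boxes in that order.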
Exemplarily, the present application further provides an apparatus for training an image classification model, the apparatus including:
a first acquisition module, configured to acquire a first sample set, the first sample set including a plurality of sample images, each sample image being associated with a pre-labeled category;
a target sample set acquisition module, configured to filter out, from the first sample set, sample images that do not belong to the target category and whose similarity to the target category is higher than a specified similarity, to obtain a target sample set;
a first training module, configured to train the first model until it converges, using the sample images in the target sample set as the input of the first model and the categories of the sample images as the expected output of the first model;
a probability acquisition module, configured to classify each sample image in the first sample set using the trained first model, to obtain the probability that each sample image belongs to the target category;
a second acquisition module, configured to acquire sample images whose probability of belonging to the target category is greater than a specified probability threshold, to construct a second sample set;
a feature extraction module, configured to, for any sample image in the second sample set, extract image features from the sample image and acquire object features of the target object associated with the sample image;
a second training module, configured to train the second model using the image features and the object features until the second model converges.
In some embodiments, samples of the target category are positive samples, and samples that do not belong to the target category are negative samples;
when the recall rate of the positive samples needs to be improved, the loss weight of the first model is set to a value greater than 1;
when the recall rate of the negative samples needs to be improved, the loss weight of the first model is set to a value less than 1.
In some embodiments, the target sample set acquisition module includes:
a first training unit, configured to, before the first model is trained, train the first model until it converges, using the sample images in the first sample set as the input of the first model and the categories of the sample images as the expected output of the first model;
a similarity acquisition unit, configured to input each sample image in the first sample set into the first model and take the resulting probability that the sample image belongs to the target category as its similarity to the target category;
a filtering unit, configured to filter out a sample image if the sample image does not belong to the target category and its similarity to the target category is higher than the specified similarity.
In some embodiments, the second acquisition module includes:
a cropping processing unit, configured to crop each sample image in the second sample set multiple times to obtain a plurality of cropped sample images;
a third sample set acquisition unit, configured to acquire the respective categories of the sample images in the second sample set and of the cropped sample images, and to construct the third sample set from the sample images and their corresponding categories.
In some embodiments, the feature extraction module includes:
a feature recognition unit, configured to perform feature recognition on an object of interest in the sample image and acquire feature information of the object of interest from the sample image;
an acquisition unit, configured to acquire the object identifier of the target object associated with the sample image;
a portrait acquisition unit, configured to acquire the object features of the target object according to the object identifier.
In some embodiments, performing feature recognition on the object of interest in the sample image and acquiring feature information of the object of interest from the sample image includes:
when the sample image includes a plurality of objects of interest, acquiring feature information of at least one of the objects of interest from the sample image in order of the sizes of the objects of interest.
All embodiments of the present disclosure may be implemented alone or in combination with other embodiments, and all such implementations fall within the scope of protection claimed by the present disclosure.

Claims (34)

  1. An image classification method, wherein the method comprises:
    performing feature extraction on a target image to obtain a first image feature of the target image;
    determining a first probability of the target image based on the first image feature of the target image, the first probability indicating the likelihood that the target image belongs to a target category;
    in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image and acquiring an object feature of a target object associated with the target image;
    determining a second probability of the target image based on the second image feature and the object feature, the second probability indicating the likelihood that the target image belongs to the target category;
    in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
  2. The method according to claim 1, wherein the method further comprises:
    in response to the first probability being less than or equal to the first probability threshold, determining that the target image does not belong to the target category.
  3. The method according to claim 1, wherein the method further comprises:
    in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set, the specified task set being used to store images that require manual processing;
    in response to the second probability being less than or equal to the third probability threshold, determining that the target image does not belong to the target category.
  4. The method according to claim 3, wherein the third probability threshold is determined based on a precision-recall curve, a recall rate, and a precision rate, the precision-recall curve describing the relationship among the recall rate, the precision rate, and the third probability threshold.
  5. A training method for an image classification model, wherein the image classification model includes a first model and a second model, and the method comprises:
    acquiring a first sample set, the first sample set including a plurality of sample images, each of the sample images being associated with a pre-labeled category;
    removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
    training to obtain the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
    classifying each sample image in the first sample set based on the first model to obtain a probability corresponding to each of the sample images, the probability indicating the likelihood that the corresponding sample image belongs to the target category;
    constructing a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
    training to obtain the second model based on image features of the sample images in the second sample set and object features of sample objects associated with the sample images.
  6. The method according to claim 5, wherein samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
    the method further comprises:
    in response to a need to improve the recall rate of the positive samples, setting the loss weight of the first model to a value greater than 1;
    in response to a need to improve the recall rate of the negative samples, setting the loss weight of the first model to a value less than 1.
  7. The method according to claim 5, wherein removing, from the first sample set, sample images that do not belong to the target category and whose similarity to the target category is greater than the specified similarity comprises:
    training to obtain an intermediate model based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
    classifying each sample image in the first sample set based on the intermediate model to obtain the similarity between each of the sample images and the target category, the similarity indicating the likelihood that the sample image belongs to the target category;
    for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity, removing the sample image from the first sample set.
  8. The method according to claim 5, wherein constructing the second sample set based on the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold comprises:
    adding the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold to a third sample set;
    cropping each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
    adding the plurality of cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.
  9. The method according to claim 5, wherein the method further comprises:
    for any sample image in the second sample set, performing feature recognition on an object of interest in the sample image to obtain an image feature of the sample image, the image feature indicating at least one feature of the object of interest;
    acquiring, based on the image feature, an object identifier of the sample object associated with the sample image;
    acquiring an object feature of the sample object based on the object identifier.
  10. The method according to claim 9, wherein performing feature recognition on the object of interest in the sample image to obtain the image feature of the sample image comprises:
    in response to the sample image including a plurality of objects of interest, acquiring at least one feature of at least one of the objects of interest from the sample image in order of the sizes of the plurality of objects of interest, to obtain the image feature of the sample image.
  11. An image classification apparatus, wherein the apparatus comprises:
    a first feature extraction module, configured to perform feature extraction on the target image to obtain a first image feature of the target image;
    a first probability determination module, configured to determine a first probability of the target image based on the first image feature of the target image, the first probability indicating the likelihood that the target image belongs to a target category;
    a first probability judgment module, configured to, in response to the first probability being greater than a first probability threshold, extract a second image feature from the target image and acquire an object feature of a target object associated with the target image;
    the first probability judgment module being further configured to determine a second probability of the target image based on the second image feature and the object feature, the second probability indicating the likelihood that the target image belongs to the target category;
    a second probability judgment module, configured to, in response to the second probability being greater than a second probability threshold, determine that the target object belongs to the target category.
  12. The apparatus according to claim 11, wherein the first probability determination module is further configured to: in response to the first probability being less than or equal to the first probability threshold, determine that the target image does not belong to the target category.
  13. The apparatus according to claim 11, wherein the first probability judgment module is further configured to:
    in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assign the target image to a specified task set, the specified task set being used to store images that require manual processing;
    in response to the second probability being less than or equal to the third probability threshold, determine that the target image does not belong to the target category.
  14. The apparatus according to claim 13, wherein the third probability threshold is determined based on a precision-recall curve, a recall rate, and a precision rate, the precision-recall curve describing the relationship among the recall rate, the precision rate, and the third probability threshold.
  15. An apparatus for training an image classification model, wherein the image classification model includes a first model and a second model, and the apparatus comprises:
    a first acquisition module, configured to acquire a first sample set, the first sample set including a plurality of sample images, each of the sample images being associated with a pre-labeled category;
    a target sample set acquisition module, configured to remove, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
    a first training module, configured to train to obtain the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
    a probability acquisition module, configured to classify each sample image in the first sample set based on the first model to obtain a probability corresponding to each of the sample images, the probability indicating the likelihood that the corresponding sample object belongs to the target category;
    a second acquisition module, configured to construct a second sample set based on the sample images in the first sample set whose corresponding probability is greater than a specified probability threshold;
    a second training module, configured to train to obtain the second model based on image features of the sample images in the second sample set and object features of sample objects associated with the sample images.
  16. The apparatus according to claim 15, wherein samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
    the first training module is further configured to, in response to a need to improve the recall rate of the positive samples, set the loss weight of the first model to a value greater than 1; and, in response to a need to improve the recall rate of the negative samples, set the loss weight of the first model to a value less than 1.
  17. The apparatus according to claim 15, wherein the target sample set acquisition module comprises:
    a first training unit, configured to train to obtain an intermediate model based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
    a similarity acquisition unit, configured to classify each sample image in the first sample set based on the intermediate model to obtain the similarity between each of the sample images and the target category, the similarity indicating the likelihood that the sample image belongs to the target category;
    a filtering unit, configured to, for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity, remove the sample image from the first sample set.
  18. The apparatus according to claim 15, wherein the second acquisition module comprises:
    a third sample set acquisition unit, configured to add the sample images in the first sample set whose corresponding probability is greater than the specified probability threshold to a third sample set;
    a cropping processing unit, configured to crop each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
    a second sample set acquisition unit, configured to add the plurality of cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.
  19. The apparatus according to claim 15, wherein the apparatus further comprises a feature extraction module;
    the feature extraction module comprises:
    a feature recognition unit, configured to, for any sample image in the second sample set, perform feature recognition on an object of interest in the sample image to obtain an image feature of the sample image, the image feature indicating at least one feature of the object of interest;
    an acquisition unit, configured to acquire, based on the image feature, an object identifier of the sample object associated with the sample image;
    a portrait acquisition unit, configured to acquire an object feature of the sample object based on the object identifier.
  20. The apparatus according to claim 19, wherein the feature recognition unit is configured to, in response to the sample image including a plurality of objects of interest, acquire at least one feature of at least one of the objects of interest from the sample image in order of the sizes of the plurality of objects of interest, to obtain the image feature of the sample image.
  21. An electronic device, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the following steps:
    performing feature extraction on a target image to obtain a first image feature of the target image;
    determining a first probability of the target image based on the first image feature of the target image, the first probability indicating the likelihood that the target image belongs to a target category;
    in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image and acquiring an object feature of a target object associated with the target image;
    determining a second probability of the target image based on the second image feature and the object feature, the second probability indicating the likelihood that the target image belongs to the target category;
    in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
  22. The electronic device according to claim 21, wherein the instructions executed by the at least one processor are further used to implement the following step:
    in response to the first probability being less than or equal to the first probability threshold, determining that the target image does not belong to the target category.
  23. The electronic device according to claim 21, wherein the instructions executed by the at least one processor are further used to implement the following steps:
    in response to the second probability being greater than a third probability threshold and less than or equal to the second probability threshold, assigning the target image to a specified task set, the specified task set being used to store images that require manual processing;
    in response to the second probability being less than or equal to the third probability threshold, determining that the target image does not belong to the target category.
  24. The electronic device according to claim 23, wherein the third probability threshold is determined based on a precision-recall curve, a recall rate, and a precision rate, the precision-recall curve describing the relationship among the recall rate, the precision rate, and the third probability threshold.
  25. An electronic device, comprising at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to implement the following steps:
    acquiring a first sample set, the first sample set comprising a plurality of sample images, each sample image being associated with a pre-labeled category;
    removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
    training the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
    classifying each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, the probability indicating the likelihood that the corresponding sample image belongs to the target category;
    constructing a second sample set based on the sample images in the first sample set whose corresponding probabilities are greater than a specified probability threshold;
    training the second model based on image features of the sample images in the second sample set and object features of sample objects associated with the sample images.
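The step that builds the second sample set from the first model's outputs can be sketched as a simple filter. This is an illustrative sketch only: `Sample`, `build_second_sample_set` and the `predict_target_prob` callable (standing in for the trained first model) are hypothetical names, and training itself is out of scope here.

```python
from typing import Callable, List, NamedTuple


class Sample(NamedTuple):
    """Hypothetical record: an image reference plus its pre-labeled category."""
    image_id: str
    label: str


def build_second_sample_set(
    first_sample_set: List[Sample],
    predict_target_prob: Callable[[Sample], float],
    prob_threshold: float,
) -> List[Sample]:
    """Keep the samples the first model scores above the specified threshold.

    The first model is applied to the whole first sample set, including the
    samples that were removed before training it, as the claim describes.
    """
    return [s for s in first_sample_set if predict_target_prob(s) > prob_threshold]
```

Because the filter keeps every sample the first model scores highly, regardless of its label, the second sample set contains both true positives and the negatives that the first model finds hardest to separate, which is what the second model is then trained on.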
  26. The electronic device according to claim 25, wherein samples belonging to the target category are positive samples, and samples not belonging to the target category are negative samples;
    the instructions executed by the at least one processor further implement the following steps:
    in response to a need to improve the recall rate of the positive samples, setting the loss weight of the first model to a value greater than 1;
    in response to a need to improve the recall rate of the negative samples, setting the loss weight of the first model to a value less than 1.
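Claim 26 does not state how the loss weight enters the loss function. A common convention, assumed here, is a binary cross-entropy in which the weight scales only the positive-sample term, so that a value above 1 penalizes missed positives more heavily and a value below 1 shifts the emphasis to negatives. The function name `weighted_bce` is hypothetical.

```python
import math


def weighted_bce(prob, label, pos_weight):
    """Binary cross-entropy for one sample with a weight on the positive term.

    prob: predicted probability of the target category, in (0, 1).
    label: 1 for a positive sample, 0 for a negative sample.
    pos_weight: assumed role of the claim's "loss weight"; 1 means unweighted.
    """
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # guard against log(0)
    positive_term = pos_weight * label * math.log(prob)
    negative_term = (1 - label) * math.log(1.0 - prob)
    return -(positive_term + negative_term)
```

With this convention the loss on a negative sample is unaffected by the weight, while the loss on a positive sample scales linearly with it.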
  27. The electronic device according to claim 25, wherein the instructions executed by the at least one processor further implement the following steps:
    training an intermediate model based on the sample images in the first sample set and the categories associated with the sample images in the first sample set;
    classifying each sample image in the first sample set based on the intermediate model to obtain a similarity between each sample image and the target category, the similarity representing the likelihood that the sample image belongs to the target category;
    for any sample image in the first sample set, in response to the sample image not belonging to the target category and its similarity to the target category being greater than the specified similarity, removing the sample image from the first sample set.
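The removal rule of claim 27 can be sketched as follows. The sketch assumes samples are `(image_id, label)` pairs and that `similarity_to_target` is a callable wrapping the intermediate model; both are illustrative choices, not details taken from the application.

```python
def remove_confusable_negatives(samples, similarity_to_target, target_category, max_similarity):
    """Drop negatives that the intermediate model finds too similar to the target.

    samples: iterable of (image_id, label) pairs (hypothetical layout).
    similarity_to_target: callable returning the intermediate model's score.
    Returns the target sample set as a new list; the input is not modified.
    """
    kept = []
    for image_id, label in samples:
        is_negative = label != target_category
        if is_negative and similarity_to_target(image_id) > max_similarity:
            continue  # confusable negative: excluded from training the first model
        kept.append((image_id, label))
    return kept
```

Positive samples are always kept, whatever their score, since the rule applies only to samples that do not belong to the target category.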
  28. The electronic device according to claim 25, wherein the instructions executed by the at least one processor further implement the following steps:
    adding the sample images in the first sample set whose corresponding probabilities are greater than the specified probability threshold to a third sample set;
    cropping each sample image in the third sample set multiple times to obtain a plurality of cropped sample images;
    adding the plurality of cropped sample images and the categories associated with the plurality of cropped sample images to the third sample set to obtain the second sample set.
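The cropping step of claim 28 can be illustrated without an imaging library by computing crop boxes only. The claim does not say which crops are taken; the four-corners-plus-center layout below is an assumption, and the dictionary layout and function names are hypothetical.

```python
def five_crop_boxes(width, height, crop_w, crop_h):
    """Return (left, top, right, bottom) boxes: four corners, then the center."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop size must not exceed image size")
    origins = [
        (0, 0),
        (width - crop_w, 0),
        (0, height - crop_h),
        (width - crop_w, height - crop_h),
        ((width - crop_w) // 2, (height - crop_h) // 2),
    ]
    return [(x, y, x + crop_w, y + crop_h) for x, y in origins]


def expand_with_crops(third_sample_set, crop_w, crop_h):
    """Build the second sample set: the originals plus labeled crops of each one.

    Each sample is a dict with "image_id", "size" (width, height) and "label".
    Each crop inherits the label of the image it was cut from.
    """
    second_sample_set = [dict(s, box=None) for s in third_sample_set]
    for s in third_sample_set:
        width, height = s["size"]
        for box in five_crop_boxes(width, height, crop_w, crop_h):
            second_sample_set.append(dict(s, box=box))
    return second_sample_set
```

Applying a box to pixel data would then be a single crop call in whichever imaging library is in use.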
  29. The electronic device according to claim 25, wherein the instructions executed by the at least one processor further implement the following steps:
    for any sample image in the second sample set, performing feature recognition on an object of interest in the sample image to obtain an image feature of the sample image, the image feature indicating at least one feature of the object of interest;
    acquiring, based on the image feature, an object identifier of the sample object associated with the sample image;
    acquiring an object feature of the sample object based on the object identifier.
  30. The electronic device according to claim 29, wherein the instructions executed by the at least one processor further implement the following step:
    in response to the sample image including a plurality of objects of interest, acquiring at least one feature of at least one of the objects of interest from the sample image sequentially, in order of the sizes of the plurality of objects of interest, to obtain the image feature of the sample image.
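The size-ordered feature collection of claim 30 might look like the sketch below. The claim does not say whether larger or smaller objects come first, nor how size is measured; largest bounding-box area first is assumed here, and the dictionary keys and the `max_objects` cap are hypothetical.

```python
def features_by_object_size(objects, max_objects):
    """Concatenate features of the objects of interest, largest area first.

    objects: list of dicts with "width", "height" and "features" (a list).
    max_objects: how many objects to draw features from (at least one).
    """
    if max_objects < 1:
        raise ValueError("max_objects must be at least 1")
    ordered = sorted(objects, key=lambda o: o["width"] * o["height"], reverse=True)
    image_feature = []
    for obj in ordered[:max_objects]:
        image_feature.extend(obj["features"])
    return image_feature
```

Since Python's `sorted` is stable, objects of equal area keep their original detection order.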
  31. A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program causes a computer to implement the following steps:
    performing feature extraction on a target image to obtain a first image feature of the target image;
    determining a first probability of the target image based on the first image feature of the target image, the first probability representing the likelihood that the target image belongs to a target category;
    in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
    determining a second probability of the target image based on the second image feature and the object feature, the second probability representing the likelihood that the target image belongs to the target category;
    in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
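The two-stage flow recited in this claim can be sketched as a cascade in which the second stage runs only for images that pass the first threshold. The callables `first_stage`, `second_stage` and `lookup_object_features` are hypothetical stand-ins for the first model, the second model and the object-feature lookup; feature extraction is assumed to happen inside them.

```python
def cascade_classify(image, first_stage, second_stage, lookup_object_features,
                     first_threshold, second_threshold):
    """Run the two-stage decision and return (is_target, p1, p2).

    first_stage(image) -> first probability from image features alone.
    second_stage(image, object_features) -> second probability from image
    features combined with features of the associated object.
    p2 is None when the first stage already rejects the image.
    """
    p1 = first_stage(image)
    if p1 <= first_threshold:
        # Early exit: the second stage and the object lookup are skipped.
        return False, p1, None
    object_features = lookup_object_features(image)
    p2 = second_stage(image, object_features)
    return p2 > second_threshold, p1, p2
```

The early exit is what makes the cascade cheaper than always running both stages: images the first stage rejects never trigger the object lookup or the second model.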
  32. A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program causes a computer to implement the following steps:
    acquiring a first sample set, the first sample set comprising a plurality of sample images, each sample image being associated with a pre-labeled category;
    removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
    training the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
    classifying each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, the probability indicating the likelihood that the corresponding sample image belongs to the target category;
    constructing a second sample set based on the sample images in the first sample set whose corresponding probabilities are greater than a specified probability threshold;
    training the second model based on image features of the sample images in the second sample set and object features of sample objects associated with the sample images.
  33. A computer program product, comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the following steps:
    performing feature extraction on a target image to obtain a first image feature of the target image;
    determining a first probability of the target image based on the first image feature of the target image, the first probability representing the likelihood that the target image belongs to a target category;
    in response to the first probability being greater than a first probability threshold, extracting a second image feature from the target image, and acquiring an object feature of a target object associated with the target image;
    determining a second probability of the target image based on the second image feature and the object feature, the second probability representing the likelihood that the target image belongs to the target category;
    in response to the second probability being greater than a second probability threshold, determining that the target object belongs to the target category.
  34. A computer program product, comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the following steps:
    acquiring a first sample set, the first sample set comprising a plurality of sample images, each sample image being associated with a pre-labeled category;
    removing, from the first sample set, sample images that do not belong to a target category and whose similarity to the target category is greater than a specified similarity, to obtain a target sample set;
    training the first model based on the sample images in the target sample set and the categories associated with the sample images in the target sample set;
    classifying each sample image in the first sample set based on the first model to obtain a probability corresponding to each sample image, the probability indicating the likelihood that the corresponding sample image belongs to the target category;
    constructing a second sample set based on the sample images in the first sample set whose corresponding probabilities are greater than a specified probability threshold;
    training the second model based on image features of the sample images in the second sample set and object features of sample objects associated with the sample images.
PCT/CN2021/114146 2020-11-23 2021-08-23 Image classification method and electronic device WO2022105336A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011325685.9A CN112434178A (en) 2020-11-23 2020-11-23 Image classification method and device, electronic equipment and storage medium
CN202011325685.9 2020-11-23

Publications (1)

Publication Number Publication Date
WO2022105336A1 true WO2022105336A1 (en) 2022-05-27

Family

ID=74693868

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/114146 WO2022105336A1 (en) 2020-11-23 2021-08-23 Image classification method and electronic device

Country Status (2)

Country Link
CN (1) CN112434178A (en)
WO (1) WO2022105336A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116205601A (en) * 2023-02-27 2023-06-02 开元数智工程咨询集团有限公司 Internet-based engineering list rechecking and data statistics method and system

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434178A (en) * 2020-11-23 2021-03-02 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium
CN114255389A (en) * 2021-11-15 2022-03-29 浙江时空道宇科技有限公司 Target object detection method, device, equipment and storage medium
CN117292174B (en) * 2023-09-06 2024-04-19 中化现代农业有限公司 Apple disease identification method, apple disease identification device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102449661A (en) * 2009-06-01 2012-05-09 惠普发展公司,有限责任合伙企业 Determining detection certainty in a cascade classifier
CN107844785A (en) * 2017-12-08 2018-03-27 浙江捷尚视觉科技股份有限公司 A kind of method for detecting human face based on size estimation
CN110619350A (en) * 2019-08-12 2019-12-27 北京达佳互联信息技术有限公司 Image detection method, device and storage medium
CN111125422A (en) * 2019-12-13 2020-05-08 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium
US20200279109A1 (en) * 2014-08-28 2020-09-03 Retailmenot, Inc. Reducing the search space for recognition of objects in an image based on wireless signals
CN112434178A (en) * 2020-11-23 2021-03-02 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145862B (en) * 2017-05-05 2020-06-05 山东大学 Multi-feature matching multi-target tracking method based on Hough forest
CN108446723B (en) * 2018-03-08 2021-06-15 哈尔滨工业大学 Multi-scale space spectrum collaborative classification method for hyperspectral image
CN111444334B (en) * 2019-01-16 2023-04-25 阿里巴巴集团控股有限公司 Data processing method, text recognition device and computer equipment
CN110473192B (en) * 2019-04-10 2021-05-14 腾讯医疗健康(深圳)有限公司 Digestive tract endoscope image recognition model training and recognition method, device and system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116205601A (en) * 2023-02-27 2023-06-02 开元数智工程咨询集团有限公司 Internet-based engineering list rechecking and data statistics method and system
CN116205601B (en) * 2023-02-27 2024-04-05 开元数智工程咨询集团有限公司 Internet-based engineering list rechecking and data statistics method and system

Also Published As

Publication number Publication date
CN112434178A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
WO2022105336A1 (en) Image classification method and electronic device
US11238310B2 (en) Training data acquisition method and device, server and storage medium
WO2019119505A1 (en) Face recognition method and device, computer device and storage medium
CN109189767B (en) Data processing method and device, electronic equipment and storage medium
CN111343161B (en) Abnormal information processing node analysis method, abnormal information processing node analysis device, abnormal information processing node analysis medium and electronic equipment
CN112541458B (en) Domain self-adaptive face recognition method, system and device based on meta learning
WO2020087774A1 (en) Concept-tree-based intention recognition method and apparatus, and computer device
CN111339813B (en) Face attribute recognition method and device, electronic equipment and storage medium
CN110717470A (en) Scene recognition method and device, computer equipment and storage medium
CN115086004A (en) Security event identification method and system based on heterogeneous graph
CN114049508B (en) Fraud website identification method and system based on picture clustering and manual research and judgment
US11423262B2 (en) Automatically filtering out objects based on user preferences
CN111476102A (en) Safety protection method, central control equipment and computer storage medium
CN114663871A (en) Image recognition method, training method, device, system and storage medium
CN113919361A (en) Text classification method and device
CN110704650B (en) OTA picture tag identification method, electronic equipment and medium
CN116756688A (en) Public opinion risk discovery method based on multi-mode fusion algorithm
CN110889717A (en) Method and device for filtering advertisement content in text, electronic equipment and storage medium
TW201435627A (en) System and method for optimizing search results
CN112949777B (en) Similar image determining method and device, electronic equipment and storage medium
CN116011810A (en) Regional risk identification method, device, equipment and storage medium
CN112966762B (en) Wild animal detection method and device, storage medium and electronic equipment
CN114638304A (en) Training method of image recognition model, image recognition method and device
CN113051911B (en) Method, apparatus, device, medium and program product for extracting sensitive words
CN105786929A (en) Information monitoring method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21893488

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 30.08.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21893488

Country of ref document: EP

Kind code of ref document: A1