CN111353330A - Image processing method, image processing device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111353330A
Authority
CN
China
Prior art keywords
image, pixel, category, classification, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811564402.9A
Other languages
Chinese (zh)
Inventor
申世伟 (Shen Shiwei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Reach Best Technology Co Ltd
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Reach Best Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Reach Best Technology Co Ltd filed Critical Reach Best Technology Co Ltd
Priority to CN201811564402.9A
Publication of CN111353330A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques

Abstract

The disclosure relates to an image processing method, an image processing apparatus, an electronic device, and a storage medium, and belongs to the technical field of image processing. The method includes: classifying an image to obtain a category of the image; when the category is included in a preset category set, determining a target region in the image whose pixel weights are greater than a preset threshold, where a pixel weight represents the contribution of each pixel in the target region to the image being classified into the category; and performing occlusion processing on the target region in the image. The method can occlude the image automatically, without user participation, which avoids the problem of complicated operation and improves the intelligence of image processing.

Description

Image processing method, image processing device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, an electronic device, and a storage medium.
Background
With the development of the internet, more and more users publish videos or pictures on the network. Inappropriate content may appear in these videos or pictures, and during network supervision such videos or pictures often need to be processed before being displayed to other users.
In the related art, a reviewer typically inspects a video or picture uploaded by a user. If the video or picture is found to contain inappropriate content, it is deemed to require occlusion, and the reviewer uses editing software installed on an electronic device to draw the region that needs to be occluded on the picture or on a certain frame of the video. For a picture, after the electronic device obtains the region, it can occlude the region in the picture. For a video, after the electronic device obtains the region, it can automatically track the object corresponding to the region across the frames of the video through a video tracking algorithm and occlude the region where the object is located, for example by applying a mosaic.
This technique relies on interaction with a user: the user must judge whether an image needs to be occluded and must draw the region to be occluded in the image. The operation is complicated and the intelligence is poor.
Disclosure of Invention
The present disclosure provides an image processing method, an image processing apparatus, an electronic device, and a storage medium, which can at least overcome the problems of complicated operation and poor intelligence described above.
According to a first aspect of embodiments of the present disclosure, there is provided an image processing method, including:
classifying an image to obtain a category of the image;
when the category is included in a preset category set, determining a target region in the image whose pixel weights are greater than a preset threshold, wherein the pixel weight is used to represent the contribution of each pixel in the target region to the image being classified into the category;
and performing occlusion processing on the target region in the image.
In one possible implementation, the classifying the image to obtain the category of the image includes:
inputting the image into an image classification model, and outputting a plurality of classes of the image and the classification probability of each class;
correspondingly, when the category is contained in a preset category set, determining a target area with the pixel weight larger than a preset threshold value in the image comprises:
when a target category in the multiple categories is contained in the preset category set and the classification probability of the target category is greater than a probability threshold, determining a target area with the pixel weight greater than a preset threshold in the image, wherein the target category is the category with the highest classification probability in the multiple categories.
In one possible implementation, the determining a target region in the image where the pixel weight is greater than a preset threshold includes:
acquiring a plurality of feature maps of the image based on the image classification model, wherein the feature maps are feature maps output by a convolution layer of the image classification model;
acquiring the pixel weight of each pixel in the image according to the classification probability of the target category, the pixel value of each pixel in each feature map in the plurality of feature maps and the number of the pixels in each feature map;
determining each target pixel in the image according to the pixel weight of each pixel in the image, wherein the pixel weight of each target pixel is greater than the preset threshold value;
and determining the target area according to each target pixel in the image, wherein the target area comprises each target pixel.
In one possible implementation manner, the obtaining a pixel weight of each pixel in the image according to the classification probability of the target class, the pixel value of each pixel in each feature map of the plurality of feature maps, and the number of pixels in each feature map includes:
acquiring the weight of each feature map in the plurality of feature maps for the target category according to the classification probability of the target category, the pixel values of the pixels of each feature map in the plurality of feature maps, and the number of pixels of each feature map;
weighting the plurality of feature maps according to the plurality of feature maps and the weight of each feature map for the target category to obtain a weighted feature map;
resizing the weighted feature map to obtain a heat map of the same size as the image;
and normalizing the heat map, wherein the pixel value of each pixel in the normalized heat map represents the pixel weight of the pixel at the corresponding position in the image.
In one possible implementation manner, before the classifying the image and obtaining the category of the image, the method further includes:
acquiring a first training data set, wherein the first training data set comprises a plurality of first sample images corresponding to each category in the preset category set;
and training based on the first training data set to obtain an image classification model.
In one possible implementation, the first training data set further includes classes other than the preset class set and a corresponding plurality of sample images.
In one possible implementation manner, the training based on the first training data set to obtain the image classification model includes:
training based on the first training data set to obtain a first image classification model;
when the classification accuracy of the first image classification model is less than an accuracy threshold, acquiring a second training data set, wherein the second training data set comprises a plurality of second sample images and a category of each second sample image;
training based on the second training data set and the first training data set to obtain a second image classification model;
and when the classification accuracy of the second image classification model is smaller than the accuracy threshold, continuing to acquire training data sets, and training based on the acquired training data sets until the classification accuracy of the trained image classification model is equal to or larger than the accuracy threshold.
In one possible implementation, the method further includes:
carrying out image sampling on a video to obtain a multi-frame image of the video;
and for each frame of image in the multi-frame images, performing classification, determining a target area and shielding processing.
According to a second aspect of the embodiments of the present disclosure, there is provided an image processing apparatus including:
a classification module configured to perform classification of an image to obtain a category of the image;
a determining module configured to determine a target region in the image, where a pixel weight is greater than a preset threshold when the category is included in a preset category set, the pixel weight being used to represent a contribution size of each pixel in the target region to the classification of the image into the category;
a processing module configured to perform occlusion processing on the target region in the image.
In one possible implementation, the classification module is configured to perform:
inputting the image into an image classification model, and outputting a plurality of classes of the image and the classification probability of each class;
correspondingly, when the category is contained in a preset category set, determining a target area with the pixel weight larger than a preset threshold value in the image comprises:
when a target category in the multiple categories is contained in the preset category set and the classification probability of the target category is greater than a probability threshold, determining a target area with the pixel weight greater than a preset threshold in the image, wherein the target category is the category with the highest classification probability in the multiple categories.
In one possible implementation, the determining module is configured to perform:
acquiring a plurality of feature maps of the image based on the image classification model, wherein the feature maps are feature maps output by a convolution layer of the image classification model;
acquiring the pixel weight of each pixel in the image according to the classification probability of the target category, the pixel value of each pixel in each feature map in the plurality of feature maps and the number of the pixels in each feature map;
determining each target pixel in the image according to the pixel weight of each pixel in the image, wherein the pixel weight of each target pixel is greater than the preset threshold value;
and determining the target area according to each target pixel in the image, wherein the target area comprises each target pixel.
In one possible implementation, the determining module is configured to perform:
acquiring the weight of each feature map in the plurality of feature maps for the target category according to the classification probability of the target category, the pixel values of the pixels of each feature map in the plurality of feature maps, and the number of pixels of each feature map;
weighting the plurality of feature maps according to the plurality of feature maps and the weight of each feature map for the target category to obtain a weighted feature map;
resizing the weighted feature map to obtain a heat map of the same size as the image;
and normalizing the heat map, wherein the pixel value of each pixel in the normalized heat map represents the pixel weight of the pixel at the corresponding position in the image.
In one possible implementation, the apparatus further includes:
an obtaining module configured to perform obtaining a first training data set, where the first training data set includes a plurality of first sample images corresponding to each category in the preset category set;
and the training module is configured to perform training based on the first training data set to obtain an image classification model.
In one possible implementation, the first training data set further includes classes other than the preset class set and a corresponding plurality of sample images.
In one possible implementation, the training module is configured to perform:
training based on the first training data set to obtain a first image classification model;
when the classification accuracy of the first image classification model is less than an accuracy threshold, acquiring a second training data set, wherein the second training data set comprises a plurality of second sample images and a category of each second sample image;
training based on the second training data set and the first training data set to obtain a second image classification model;
and when the classification accuracy of the second image classification model is smaller than the accuracy threshold, continuing to acquire training data sets, and training based on the acquired training data sets until the classification accuracy of the trained image classification model is equal to or larger than the accuracy threshold.
In one possible implementation, the apparatus further includes:
the sampling module is configured to perform image sampling on a video to obtain a plurality of frame images of the video;
the classification module is further configured to perform a classification step for each of the plurality of frames of images;
the determining module is further configured to perform the step of determining a target area for each frame of image;
the processing module is further configured to perform the step of occlusion processing for each frame of image.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
classifying an image to obtain a category of the image;
when the category is included in a preset category set, determining a target region in the image whose pixel weights are greater than a preset threshold, wherein the pixel weight is used to represent the contribution of each pixel in the target region to the image being classified into the category;
and performing occlusion processing on the target region in the image.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having instructions therein, which when executed by a processor of an electronic device, enable the electronic device to perform an image processing method, the method comprising:
classifying an image to obtain a category of the image;
when the category is included in a preset category set, determining a target region in the image whose pixel weights are greater than a preset threshold, wherein the pixel weight is used to represent the contribution of each pixel in the target region to the image being classified into the category;
and performing occlusion processing on the target region in the image.
According to a fifth aspect of embodiments of the present disclosure, there is provided an application program product, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform a method of image processing, the method comprising:
classifying an image to obtain a category of the image;
when the category is included in a preset category set, determining a target region in the image whose pixel weights are greater than a preset threshold, wherein the pixel weight is used to represent the contribution of each pixel in the target region to the image being classified into the category;
and performing occlusion processing on the target region in the image.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
By classifying the image, if the resulting category is included in the preset category set, it can be determined that the image needs to be occluded; a target region in the image whose pixel weights are greater than a preset threshold can then be determined, and the target region is occluded. The whole process requires no user participation and can occlude the image automatically, which avoids the problem of complicated operation and improves the intelligence of image processing.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow diagram illustrating an image processing method according to an exemplary embodiment.
FIG. 2 is a flow diagram illustrating an image processing method according to an exemplary embodiment.
Fig. 3 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment.
Fig. 4 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment.
Fig. 5 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment.
Fig. 6 is a block diagram illustrating an electronic device 600 according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment, which is used in an electronic device, as shown in fig. 1, and includes the steps of:
In step S11, an image is classified to obtain a category of the image.
In step S12, when the category is included in the preset category set, a target region in the image whose pixel weights are greater than a preset threshold is determined, where the pixel weight indicates the contribution of each pixel in the target region to the image being classified into the category.
In step S13, occlusion processing is performed on the target region in the image.
According to the method provided by the embodiments of the present disclosure, by classifying the image, if the resulting category is included in the preset category set, it can be determined that the image needs to be occluded; a target region in the image whose pixel weights are greater than a preset threshold can then be determined and occluded. Since a larger pixel weight means a larger contribution of the pixel to the classification result, using the target region as the region to be occluded ensures the accuracy of the occlusion processing. The whole process requires no user participation and can occlude the image automatically, which avoids the problem of complicated operation and improves the intelligence of image processing.
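The three steps S11 to S13 can be sketched as follows. This is a minimal illustration that assumes a trained classifier and a per-pixel weight function are already available; the helper names (`classify`, `pixel_weights`), the preset categories, and the mosaic block size are hypothetical stand-ins, not names from the disclosure.

```python
import numpy as np

PRESET_CATEGORIES = {"sensitive", "advertisement"}  # illustrative category set

def occlude_if_needed(image, classify, pixel_weights, threshold=0.5, block=8):
    """Classify the image; if its category is in the preset set, mosaic every
    tile that contains a pixel whose weight exceeds the threshold."""
    category = classify(image)                       # step S11: classification
    if category not in PRESET_CATEGORIES:
        return image                                 # no occlusion needed
    weights = pixel_weights(image, category)         # step S12: pixel weights
    mask = weights > threshold
    out = image.copy()
    h, w = image.shape[:2]
    for y in range(0, h, block):                     # step S13: coarse mosaic
        for x in range(0, w, block):
            tile = (slice(y, min(y + block, h)), slice(x, min(x + block, w)))
            if mask[tile].any():
                out[tile] = image[tile].mean()       # flatten the tile
    return out
```

Images whose category falls outside the preset set pass through unchanged, which matches the disclosure's condition that occlusion only happens for supervised categories.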
In one possible implementation, the classifying the image to obtain the category of the image includes:
inputting the image into an image classification model, and outputting a plurality of classes of the image and the classification probability of each class;
correspondingly, when the category is included in the preset category set, determining a target area in the image with the pixel weight greater than a preset threshold value includes:
when a target class in the multiple classes is included in the preset class set and the classification probability of the target class is greater than a probability threshold, determining a target area with the pixel weight greater than the preset threshold in the image, wherein the target class is the class with the highest classification probability in the multiple classes.
In one possible implementation, the determining a target region in the image where the pixel weight is greater than a preset threshold includes:
acquiring a plurality of feature maps of the image based on the image classification model, wherein the feature maps are feature maps output by a convolution layer of the image classification model;
acquiring the pixel weight of each pixel in the image according to the classification probability of the target class, the pixel value of each pixel in each feature map in the plurality of feature maps and the number of the pixels in each feature map;
determining each target pixel in the image according to the pixel weight of each pixel in the image, wherein the pixel weight of each target pixel is greater than the preset threshold value;
and determining the target area according to each target pixel in the image, wherein the target area comprises each target pixel.
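As a concrete illustration of the last two sub-steps, the sketch below thresholds a pixel-weight map and returns the smallest rectangle containing every target pixel. Using a bounding box is an assumption made for illustration; the disclosure only requires that the target region contain each target pixel.

```python
import numpy as np

def target_region(weights, threshold):
    """Return (top, left, bottom, right) of the smallest rectangle covering
    all pixels whose weight exceeds the threshold, or None if there are none."""
    ys, xs = np.nonzero(weights > threshold)   # coordinates of target pixels
    if ys.size == 0:
        return None                            # nothing exceeds the threshold
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1
```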
In a possible implementation manner, the obtaining a pixel weight of each pixel in the image according to the classification probability of the target class, the pixel value of each pixel in each feature map of the plurality of feature maps, and the number of pixels in each feature map includes:
acquiring the weight of each feature map in the plurality of feature maps for the target category according to the classification probability of the target category, the pixel values of the pixels of each feature map in the plurality of feature maps, and the number of pixels of each feature map;
weighting the plurality of feature maps according to the plurality of feature maps and the weight of each feature map for the target category to obtain a weighted feature map;
resizing the weighted feature map to obtain a heat map of the same size as the image;
and normalizing the heat map, wherein the pixel value of each pixel in the normalized heat map represents the pixel weight of the pixel at the corresponding position in the image.
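These four sub-steps resemble a class-activation-map computation. The numpy sketch below assumes the per-map weight is obtained by averaging caller-supplied gradients of the class score over each feature map's pixels, and uses nearest-neighbour resizing; both choices are illustrative assumptions rather than details fixed by the disclosure.

```python
import numpy as np

def heat_map(feature_maps, map_grads, out_h, out_w):
    """feature_maps, map_grads: arrays of shape (k, h, w). Returns a
    normalized (out_h, out_w) heat map whose values act as pixel weights."""
    k, h, w = feature_maps.shape
    # weight of each feature map: its gradient averaged over the h*w pixels
    alphas = map_grads.reshape(k, -1).mean(axis=1)
    # weighted sum of the feature maps, clipped at zero (ReLU)
    cam = np.maximum((alphas[:, None, None] * feature_maps).sum(axis=0), 0.0)
    # nearest-neighbour resize to the input image size
    ys = np.arange(out_h) * h // out_h
    xs = np.arange(out_w) * w // out_w
    cam = cam[np.ix_(ys, xs)]
    # min-max normalisation so values can be read directly as pixel weights
    rng = cam.max() - cam.min()
    return (cam - cam.min()) / rng if rng > 0 else np.zeros_like(cam)
```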
In one possible implementation, before the classifying the image and obtaining the category of the image, the method further includes:
acquiring a first training data set, wherein the first training data set comprises a plurality of first sample images corresponding to each category in the preset category set;
and training based on the first training data set to obtain an image classification model.
In one possible implementation, the first training data set further includes classes other than the preset class set and corresponding sample images.
In one possible implementation, the training based on the first training data set to obtain the image classification model includes:
training based on the first training data set to obtain a first image classification model;
when the classification accuracy of the first image classification model is smaller than an accuracy threshold, acquiring a second training data set, wherein the second training data set comprises a plurality of second sample images and the category of each second sample image;
training based on the second training data set and the first training data set to obtain a second image classification model;
and when the classification accuracy of the second image classification model is smaller than the accuracy threshold, continuing to acquire training data sets, and training based on the acquired training data sets until the classification accuracy of the trained image classification model is equal to or larger than the accuracy threshold.
In one possible implementation, the method further comprises:
sampling images from a video to obtain multiple frames of the video;
and performing, for each of the multiple frames, the steps of classification, target-region determination, and occlusion processing.
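The video path can be sketched as a sampling wrapper around the single-image steps. The fixed sampling stride and the `process_frame` hook are placeholders for whatever sampling policy and per-frame pipeline are actually used.

```python
def process_video(frames, process_frame, stride=5):
    """frames: iterable of images. Sample one frame every `stride` frames and
    apply the classify / locate / occlude pipeline to each sampled frame."""
    sampled = [f for i, f in enumerate(frames) if i % stride == 0]
    return [process_frame(f) for f in sampled]
```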
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 2 is a flowchart illustrating an image processing method according to an exemplary embodiment, the image processing method being used in an electronic device as illustrated in fig. 2, including the steps of:
in step S21, an image classification model for outputting a category from an input image is acquired.
In the embodiment of the present disclosure, the image classification model may be obtained by training an electronic device, or may be sent to the electronic device after being obtained by training other devices, so that the electronic device obtains the image classification model.
In one possible implementation, the training process of the image classification model may include the following steps a to b:
step a, obtaining a first training data set, wherein the first training data set comprises a plurality of first sample images corresponding to each category in a preset category set.
The preset category set may be the categories that a content sharing platform (such as a short-video sharing platform or a live-streaming platform) needs to supervise. For example, the preset category set may include categories representing sensitive content (such as gore, vulgarity, and other content not suitable for display to users) and categories representing advertisements, such as various advertisement logos, where each advertisement logo may be regarded as a separate category.
In the training process, the preset category set and a plurality of first sample images corresponding to each category in the preset category set are collected, and these first sample images are used as the first training data set. In one possible implementation, the first training data set may also include, in addition to the preset category set, categories outside the preset category set and their corresponding sample images. Adding sample images of other categories during training allows the model to learn and recognize more categories of images, reducing the risk of misclassification.
And b, training based on the first training data set to obtain the image classification model.
The electronic device may perform model training based on the first training data set using a preset training algorithm to obtain the image classification model. The preset training algorithm may be a neural network algorithm: the electronic device trains an initial neural network model on the first training data set and uses the trained neural network model as the image classification model. During training, the image classification model learns how to classify the sample images of the multiple categories and acquires the ability to classify images. For example, an image may be classified into multiple categories, each with a classification probability; the higher the classification probability of a category, the more likely the image belongs to that category.
For example, the neural network model may be a VGG network model, and accordingly, the trained image classification model may be the VGG network model. The embodiment of the present disclosure does not specifically limit the type of the image classification model.
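A full VGG implementation is out of scope here, but the train-then-classify contract (labelled sample images in, per-category classification probabilities out) can be illustrated with a minimal numpy softmax classifier on flattened images. This is an illustrative stand-in, not the network the disclosure describes, and all names are hypothetical.

```python
import numpy as np

def train_classifier(images, labels, n_classes, lr=0.05, epochs=300):
    """Batch-gradient-descent softmax classifier over flattened pixels
    (a bias feature is appended to each image)."""
    X = np.stack([np.append(im.ravel(), 1.0) for im in images])  # (n, d+1)
    Y = np.eye(n_classes)[labels]                                # one-hot
    W = np.zeros((X.shape[1], n_classes))
    for _ in range(epochs):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)                        # softmax
        W -= lr * X.T @ (p - Y) / len(X)                         # gradient step
    return W

def classify(W, image):
    """Return the classification probability of each category."""
    logits = np.append(image.ravel(), 1.0) @ W
    p = np.exp(logits - logits.max())
    return p / p.sum()
```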
In one possible implementation, the step b may include: training based on the first training data set to obtain a first image classification model; when the classification accuracy of the first image classification model is smaller than an accuracy threshold, acquiring a second training data set, wherein the second training data set comprises a plurality of second sample images and the category of each second sample image; training based on the second training data set and the first training data set to obtain a second image classification model; and when the classification accuracy of the second image classification model is smaller than the accuracy threshold, continuing to acquire training data sets, and training based on the acquired training data sets until the classification accuracy of the trained image classification model is equal to or larger than the accuracy threshold.
An image classification model obtained by the electronic equipment through training based on the first training data set is recorded as a first image classification model. In order to ensure that the image classification model with higher classification accuracy is obtained through training, after the electronic device obtains the first image classification model, the classification accuracy of the first image classification model can be further determined, and whether the classification accuracy reaches the accuracy threshold value is judged. For example, the accuracy threshold may be 90%, and the accuracy threshold is not particularly limited by the embodiments of the present disclosure.
In one possible implementation, the process of the electronic device determining the classification accuracy may include: acquiring a test data set, wherein the test data set can comprise a plurality of sample images and correct types marked on the sample images; for each sample image in the plurality of sample images, inputting the sample image into an image classification model obtained by training based on a first training data set, and outputting the category of the sample image; and determining the proportion of the sample image of the correct category output by the image classification model to the number of the plurality of sample images in the test data set as the classification accuracy of the image classification model.
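The accuracy computation described above reduces to the fraction of test images for which the model outputs the labelled category; `predict` below is a placeholder for running the trained image classification model on a sample image.

```python
def classification_accuracy(test_set, predict):
    """test_set: iterable of (image, true_category) pairs. Returns the
    proportion of samples whose predicted category matches the label."""
    results = [predict(img) == true for img, true in test_set]
    return sum(results) / len(results)
```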
If the classification accuracy of the first image classification model is equal to or greater than the accuracy threshold, the first image classification model is deemed available and the electronic device may use the first image classification model for a subsequent image classification process. If the classification accuracy of the first image classification model is less than the accuracy threshold, the electronic device may continue to acquire more training data sets, such as a second training data set, and continue training based on the newly acquired second training data set and the previously acquired first training data set to obtain a second image classification model. The process of acquiring the second training data set is the same as the process of acquiring the first training data set, and the process of training the second image classification model is the same as the process of training the first image classification model, and is not repeated.
Further, considering that the accuracy of the second image classification model may still not be high enough, the electronic device may compare the accuracy of the second image classification model with the accuracy threshold after acquiring the second image classification model. If the classification accuracy of the second image classification model is equal to or greater than the accuracy threshold, the second image classification model is deemed available and may be used by the electronic device for a subsequent image classification process. When the classification accuracy of the second image classification model is smaller than the accuracy threshold, the electronic device may continue to acquire the training data sets and perform training based on the acquired training data sets until the classification accuracy of the trained image classification model is equal to or greater than the accuracy threshold. By increasing the training data and the iteration times, the image classification model with higher classification accuracy can be continuously obtained.
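The iterative "acquire more data and retrain until the threshold is reached" loop above can be sketched as below; the callables are hypothetical placeholders for the training, data-acquisition, and evaluation steps described in the text:

```python
def train_until_accurate(train_fn, fetch_dataset_fn, eval_fn, threshold, max_rounds=10):
    """Retrain on an ever-growing pool of training data until accuracy reaches threshold."""
    datasets = [fetch_dataset_fn()]            # first training data set
    model = train_fn(datasets)                 # first image classification model
    rounds = 1
    while eval_fn(model) < threshold and rounds < max_rounds:
        datasets.append(fetch_dataset_fn())    # acquire another training data set
        model = train_fn(datasets)             # retrain on all data acquired so far
        rounds += 1
    return model

# Toy stand-ins: the "model" is just the total number of samples seen,
# and accuracy grows with the amount of training data.
model = train_until_accurate(
    train_fn=lambda ds: sum(len(d) for d in ds),
    fetch_dataset_fn=lambda: [0] * 10,
    eval_fn=lambda m: m / 30,                  # reaches 1.0 after 3 data sets
    threshold=0.9,
)
```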
It should be noted that step S21 is optional: it needs to be executed before images are classified, but not every time an image is classified, as long as the image classification model has already been acquired when classification is performed. For example, the electronic device may train the image classification model in advance and store it, or acquire the model from another device and store it, and then directly apply the stored model whenever image classification is needed.
In step S22, the image is input to the image classification model, and a plurality of classes of the image and a classification probability for each class are output.
The image may be any image to be processed, such as a picture or an image obtained by sampling a video. For example, in a content sharing platform supervision scenario, when any user uploads a picture, the electronic device may perform step S22 and the subsequent steps S23 and S24 on the picture. When any user uploads a video, the electronic device may sample the video to obtain multiple frames of images, and perform step S22 and the subsequent steps S23 and S24 on each of these frames. The electronic device may sample the video in a preset sampling manner. The preset sampling manner may be to sample once every preset time interval; for example, the preset time interval may be 1s, 0.5s, or 0s (i.e., every frame is sampled). The preset sampling manner may also be to sample once every preset number of frames; for example, the preset number of frames may be 1 frame or 0 frames (i.e., no frames are skipped). The preset sampling manner is not specifically limited in the embodiments of the present disclosure.
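The two preset sampling manners (once every fixed time interval, or once every fixed number of skipped frames) can be sketched as follows; the function and its parameter names are illustrative, not part of the disclosure:

```python
def sample_frames(num_frames, fps, interval_s=None, every_n=None):
    """Return the indices of frames sampled from a video.

    interval_s: sample once every interval_s seconds (0 means every frame);
    every_n:    sample one frame, then skip every_n frames (0 means every frame).
    """
    if interval_s is not None:
        step = max(1, round(interval_s * fps))  # at least one frame per step
    else:
        step = every_n + 1
    return list(range(0, num_frames, step))

idx = sample_frames(num_frames=10, fps=2, interval_s=1)  # every 2nd frame of a 2 fps clip
```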
In the embodiment of the disclosure, after the electronic device inputs the image into the image classification model, the image classification model may output a plurality of categories of the image and a classification probability of each category, where a larger classification probability indicates a higher probability that the image is the category. The plurality of classes may be classes used in training the image classification model, such as the classes comprised in the first set of training data in step S21.
Note that, this step S22 is one possible implementation of classifying the image to obtain the category of the image. The image classification model is used for classifying the image, so that a plurality of classes of the image and the classification probability of each class can be obtained, and the higher the classification accuracy of the image classification model is, the more accurate the classification result of the image is.
In step S23, when a target category of the multiple categories is included in the preset category set and the classification probability of the target category is greater than a probability threshold, a target area of the image with a pixel weight greater than a preset threshold is determined, where the target category is a category of the multiple categories with the highest classification probability, and the pixel weight is used to indicate the contribution of each pixel in the target area to the classification of the image into the category.
In the embodiment of the disclosure, after the electronic device classifies the image to obtain a plurality of classes through the image classification model, a target class with the highest classification probability in the plurality of classes can be determined according to the classification probabilities of the plurality of classes, and the target class can be used as a final classification result.
Furthermore, the electronic device may determine, according to the target category, whether the image needs to be occluded. The target category has the maximum classification probability, which indicates that the image is most likely to belong to it, and the greater the classification probability, the higher the reliability and accuracy of the classification. Therefore, when the classification probability of the target category is greater than the probability threshold, the electronic device may determine that the image belongs to the target category. If the target category is also included in the preset category set, the target category is a category to be supervised, so the electronic device may determine that the image needs occlusion processing. To this end, the electronic device may determine a target area in the image that needs occlusion processing, that is, the area where the pixels that have a relatively large influence on the final classification result are located; the electronic device may then perform occlusion processing on the target area through the subsequent step S24.
In one possible implementation, the electronic device determining the target region in the image may include the following steps a to d:
step a, acquiring a plurality of feature maps of the image based on the image classification model, wherein the feature maps are feature maps output by the convolution layer of the image classification model.
The electronic device may input the image into the image classification model and obtain a plurality of feature maps (feature_maps) output by the last convolutional layer of the image classification model, where the k-th feature map of the plurality of feature maps may be denoted as A^k. For example, the image classification model may be a VGG model whose last convolutional layer is "Conv5_3", that is, the 3rd convolutional layer in the 5th convolutional block. "Conv5_3" outputs a plurality of feature maps with output dimensions (7 × 7 × 512), that is, 512 feature maps may be output, each of size 7 × 7.
And b, acquiring the pixel weight of each pixel in the image according to the classification probability of the target class, the pixel value of each pixel in each feature map in the plurality of feature maps and the number of the pixels in each feature map.
In one possible implementation, the step b may include the following steps b1 to b4:
b1, obtaining the weight of each feature map in the feature maps to the target class according to the classification probability of the target class, the pixel value of the pixel of each feature map in the feature maps and the number of the pixels of each feature map.
The electronic device may apply the following formula (1) to obtain the weight of each feature map for a target class (hereinafter denoted class c):

α_k^c = (1/Z) Σ_i Σ_j ∂y^c/∂A_{ij}^k        (1)

where α_k^c is the weight of the k-th feature map for class c, k is a positive integer, Z is the number of pixels in the feature map, y^c is the classification probability of class c output by the model (e.g., the value output by the softmax layer), and A_{ij}^k is the pixel value of the pixel at position (i, j) in the k-th feature map.
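Assuming the standard Grad-CAM definition, the weight in formula (1) is simply the spatial average of the gradients of the class score with respect to each feature map; a minimal NumPy sketch (the gradient values themselves would come from the model's backward pass, which is not shown):

```python
import numpy as np

def gradcam_weights(grads):
    """alpha_k^c = (1/Z) * sum_{i,j} dy^c/dA_{ij}^k: average each gradient map.

    grads: array of shape (K, H, W), the gradient of the class score y^c
    with respect to each of the K feature maps A^k.
    """
    return grads.mean(axis=(1, 2))  # one scalar weight per feature map

grads = np.array([[[1.0, 3.0], [5.0, 7.0]],
                  [[0.0, 0.0], [2.0, 2.0]]])  # K=2 gradient maps of size 2x2
w = gradcam_weights(grads)  # -> [4.0, 1.0]
```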
b2, weighting the plurality of feature maps according to the plurality of feature maps and the weight of each feature map to the target category to obtain weighted feature maps.
The electronic device can obtain the weights of all feature maps for the target class through the above formula (1); further, the electronic device can apply the following formula (2) to perform weighting processing on the feature maps to obtain a weighted feature map, also called a thermodynamic diagram (heat map).

Formula (2) can be expressed as:

L_{Grad-CAM}^c = ReLU(Σ_k α_k^c A^k)        (2)

where L_{Grad-CAM}^c is the thermodynamic diagram obtained for class c using the Grad-CAM (Gradient-weighted Class Activation Mapping) algorithm, ReLU (Rectified Linear Unit) is used as the activation function, and Σ_k α_k^c A^k is the weighted sum of the feature maps. The Grad-CAM algorithm applies a ReLU to the final weighted sum so that only pixels with a positive influence on the target class are retained and pixels belonging to other classes are excluded, which ensures the effectiveness of using the Grad-CAM algorithm to interpret the classification result.
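Given the per-map weights from formula (1), formula (2) reduces to a weighted sum of the feature maps followed by a ReLU; a minimal NumPy sketch:

```python
import numpy as np

def gradcam_heatmap(feature_maps, weights):
    """L^c = ReLU(sum_k alpha_k^c * A^k).

    feature_maps: (K, H, W); weights: (K,). Negative evidence is clipped
    by the ReLU so only pixels supporting the target class remain.
    """
    combined = np.tensordot(weights, feature_maps, axes=1)  # (H, W) weighted sum
    return np.maximum(combined, 0.0)                        # ReLU

maps = np.array([[[1.0, -1.0]],
                 [[0.0,  2.0]]])                    # K=2 maps of size 1x2
heat = gradcam_heatmap(maps, np.array([1.0, 0.5]))  # 1*[1,-1] + 0.5*[0,2] -> [1, 0]
```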
b3, adjusting the size of the weighted characteristic diagram to obtain the thermodynamic diagram with the same size as the image.
After the electronic device obtains the weighted feature map, its size is smaller than that of the original image in step S22. The electronic device may therefore resize the feature map to the size of the original image, obtaining a thermodynamic diagram of the same size as the original image, so that pixels in the thermodynamic diagram and in the original image correspond to each other by position.
b4, carrying out normalization processing on the thermodynamic diagram, wherein the pixel value of each pixel in the normalized thermodynamic diagram represents the pixel weight of the pixel at the corresponding position in the image.
The normalization processing mode can include: and for each pixel in the thermodynamic diagram, subtracting the minimum pixel value of the thermodynamic diagram from the pixel value of the pixel, and dividing the difference by the difference between the maximum pixel value and the minimum pixel value of the thermodynamic diagram. The maximum pixel value refers to the largest pixel value among the pixel values of all the pixels in the thermodynamic diagram, and the minimum pixel value refers to the smallest pixel value among the pixel values of all the pixels in the thermodynamic diagram. By performing the normalization processing on the thermodynamic diagram, the range of the pixel values of the pixels in the normalized thermodynamic diagram is [0,1], which can be regarded as a probability diagram, wherein each value represents the pixel weight of the pixel at the corresponding position of the original image, i.e. the contribution of the pixel to the final classification result (target class).
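The min-max normalization described above can be sketched as follows (mapping a flat heat map to all zeros is an illustrative choice to avoid dividing by zero; the disclosure does not specify that edge case):

```python
import numpy as np

def normalize_heatmap(heatmap):
    """(x - min) / (max - min): map pixel values into [0, 1]."""
    lo, hi = heatmap.min(), heatmap.max()
    if hi == lo:                        # flat map: avoid division by zero
        return np.zeros_like(heatmap)
    return (heatmap - lo) / (hi - lo)

h = normalize_heatmap(np.array([[2.0, 4.0], [6.0, 10.0]]))  # min 2, max 10
```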
And c, determining each target pixel in the image according to the pixel weight of each pixel in the image, wherein the pixel weight of each target pixel is greater than the preset threshold value.
After the pixel weight of each pixel in the image is obtained through the step b, the electronic device can find out each target pixel of which the pixel weight is greater than the preset threshold value in the image according to the preset threshold value. For example, the preset threshold may be 0.6, and the preset threshold is not specifically limited in the embodiments of the present disclosure.
And d, determining the target area according to each target pixel in the image, wherein the target area comprises each target pixel.
The electronic device may use a region composed of each target pixel in the image as the target region, for example, the electronic device may determine a minimum bounding rectangle containing each target pixel according to the position of each target pixel in the image, and use the minimum bounding rectangle as the target region. For the preset category set comprising categories representing sensitive content and categories representing advertisements, the target area may be a sensitive area in video surveillance, a logo portion in advertisement video, and the like.
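Steps c and d above — collecting the target pixels and taking their minimum bounding rectangle — can be sketched as follows; the (top, left, bottom, right) return convention is an illustrative choice:

```python
import numpy as np

def target_region(pixel_weights, threshold):
    """Minimum bounding rectangle (top, left, bottom, right), inclusive, of all
    pixels whose weight exceeds the threshold; None if no pixel qualifies."""
    rows, cols = np.where(pixel_weights > threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

weights = np.array([[0.1, 0.7, 0.2],
                    [0.3, 0.9, 0.8],
                    [0.1, 0.2, 0.1]])
box = target_region(weights, 0.6)  # target pixels at (0,1), (1,1), (1,2)
```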
It should be noted that step S23 is one possible implementation of determining, when the category is included in the preset category set, the target area of the image whose pixel weight is greater than the preset threshold. By judging on the basis of the classification whether the image needs occlusion and then locating the target area to be occluded in the image, for example by determining the target area with the Grad-CAM algorithm, this step avoids the need in the related art for a user to judge whether an image requires occlusion and to manually draw the area to be occluded, improving the intelligence of image processing.
In step S24, the occlusion processing is performed on the target area in the image.
In the embodiment of the present disclosure, after determining the target area that needs to be occluded in the image, the electronic device may occlude the target area in a preset occlusion manner. The preset occlusion manner includes, but is not limited to, superimposing an opaque blocking patch on the target area; the blocking patch may be rendered in a target color or in mosaic form, as long as it performs the occluding function.
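Superimposing an opaque block over the target area can be sketched as below; the single-channel image and the inclusive box convention are illustrative assumptions:

```python
import numpy as np

def occlude(image, box, color=0):
    """Overlay an opaque block of the given color over box=(top, left, bottom, right)."""
    top, left, bottom, right = box
    out = image.copy()                        # leave the original image untouched
    out[top:bottom + 1, left:right + 1] = color
    return out

img = np.full((3, 3), 255, dtype=np.uint8)    # all-white single-channel image
masked = occlude(img, (0, 1, 1, 2))           # black out rows 0-1, cols 1-2
```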
According to the method provided by the embodiment of the disclosure, by classifying the image, if the resulting category is included in the preset category set, it can be determined that the image needs occlusion processing; at this time, a target area whose pixel weight is greater than a preset threshold can be determined in the image, and the target area can then be occluded. Since a larger pixel weight means a larger contribution of the pixel to the classification result, taking the target area as the area to be occluded ensures the accuracy of the occlusion processing. The whole process requires no user participation and can occlude the image automatically, which avoids cumbersome operations and improves the intelligence of image processing.
Fig. 3 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment. Referring to fig. 3, the apparatus includes a classification module 301, a determination module 302, and a processing module 303.
The classification module 301 is configured to perform classification of an image, resulting in a category of the image;
the determining module 302 is configured to perform, when the category is included in a preset category set, determining a target region in the image, where a pixel weight is greater than a preset threshold, where the pixel weight is used to represent a contribution of each pixel in the target region to the classification of the image into the category;
the processing module 303 is configured to perform occlusion processing of the target region in the image.
In one possible implementation, the classification module 301 is configured to perform:
inputting the image into an image classification model, and outputting a plurality of classes of the image and the classification probability of each class;
correspondingly, when the category is included in the preset category set, determining a target area in the image with the pixel weight greater than a preset threshold value includes:
when a target class in the multiple classes is included in the preset class set and the classification probability of the target class is greater than a probability threshold, determining a target area with the pixel weight greater than the preset threshold in the image, wherein the target class is the class with the highest classification probability in the multiple classes.
In one possible implementation, the determining module 302 is configured to perform:
acquiring a plurality of feature maps of the image based on the image classification model, wherein the feature maps are feature maps output by a convolution layer of the image classification model;
acquiring the pixel weight of each pixel in the image according to the classification probability of the target class, the pixel value of each pixel in each feature map in the plurality of feature maps and the number of the pixels in each feature map;
determining each target pixel in the image according to the pixel weight of each pixel in the image, wherein the pixel weight of each target pixel is greater than the preset threshold value;
and determining the target area according to each target pixel in the image, wherein the target area comprises each target pixel.
In one possible implementation, the determining module 302 is configured to perform:
acquiring the weight of each feature map in the feature maps to the target class according to the classification probability of the target class, the pixel value of the pixel of each feature map in the feature maps and the number of the pixels of each feature map;
weighting the plurality of feature maps according to the plurality of feature maps and the weight of each feature map to the target category to obtain weighted feature maps;
carrying out size adjustment on the weighted feature map to obtain a thermodynamic diagram with the same size as the image;
and carrying out normalization processing on the thermodynamic diagram, wherein the pixel value of each pixel in the normalized thermodynamic diagram represents the pixel weight of the pixel at the corresponding position in the image.
In one possible implementation, referring to fig. 4, the apparatus further includes:
an obtaining module 304 configured to perform obtaining a first training data set, where the first training data set includes a plurality of first sample images corresponding to each category in the preset category set;
a training module 305 configured to perform training based on the first set of training data, resulting in an image classification model.
In one possible implementation, the first training data set further includes classes other than the preset class set and corresponding sample images.
In one possible implementation, the training module 305 is configured to perform:
training based on the first training data set to obtain a first image classification model;
when the classification accuracy of the first image classification model is smaller than an accuracy threshold, acquiring a second training data set, wherein the second training data set comprises a plurality of second sample images and the category of each second sample image;
training based on the second training data set and the first training data set to obtain a second image classification model;
and when the classification accuracy of the second image classification model is smaller than the accuracy threshold, continuing to acquire training data sets, and training based on the acquired training data sets until the classification accuracy of the trained image classification model is equal to or larger than the accuracy threshold.
In one possible implementation, referring to fig. 5, the apparatus further includes:
a sampling module 306 configured to perform image sampling on a video to obtain a plurality of frame images of the video;
the classification module 301 is further configured to perform a classification step for each of the plurality of frames of images;
the determining module 302 is further configured to perform the step of determining a target area for each frame of image;
the processing module 303 is further configured to perform the step of occlusion processing for each frame of image.
According to the apparatus provided by the embodiment of the disclosure, by classifying the image, if the resulting category is included in the preset category set, it can be determined that the image needs occlusion processing; at this time, a target area whose pixel weight is greater than a preset threshold can be determined in the image, and the target area can then be occluded. The whole process requires no user participation and can occlude the image automatically, which avoids cumbersome operations and improves the intelligence of image processing.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 6 is a block diagram of an electronic device 600 according to an exemplary embodiment. The electronic device 600 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 601 and one or more memories 602, where the memory 602 stores at least one instruction that is loaded and executed by the processor 601 to implement the methods provided by the above method embodiments. Of course, the electronic device may further include components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and may further include other components for implementing the functions of the device, which are not described here again.
In an exemplary embodiment, there is also provided a non-transitory computer readable storage medium having instructions therein, which when executed by a processor of an electronic device, enable the electronic device to perform a method of image processing, the method comprising:
classifying the image to obtain the category of the image;
when the category is contained in a preset category set, determining a target area with pixel weight larger than a preset threshold value in the image, wherein the pixel weight is used for representing the contribution of each pixel in the target area to the classification of the image into the category;
and carrying out shielding processing on the target area in the image.
For example, the non-transitory computer readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, there is also provided an application program product, instructions of which, when executed by a processor of an electronic device, enable the electronic device to perform a method of image processing, the method comprising:
classifying the image to obtain the category of the image;
when the category is contained in a preset category set, determining a target area with pixel weight larger than a preset threshold value in the image, wherein the pixel weight is used for representing the contribution of each pixel in the target area to the classification of the image into the category;
and carrying out shielding processing on the target area in the image.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An image processing method, comprising:
classifying the images to obtain the categories of the images;
when the category is contained in a preset category set, determining a target area with pixel weight larger than a preset threshold value in the image, wherein the pixel weight is used for representing the contribution of each pixel in the target area to the classification of the image into the category;
and carrying out shielding processing on the target area in the image.
2. The method according to claim 1, wherein the classifying the image to obtain the category of the image comprises:
inputting the image into an image classification model, and outputting a plurality of classes of the image and the classification probability of each class;
correspondingly, when the category is contained in a preset category set, determining a target area with the pixel weight larger than a preset threshold value in the image comprises:
when a target category in the multiple categories is contained in the preset category set and the classification probability of the target category is greater than a probability threshold, determining a target area with the pixel weight greater than a preset threshold in the image, wherein the target category is the category with the highest classification probability in the multiple categories.
3. The method according to claim 2, wherein the determining the target area in the image where the pixel weight is greater than the preset threshold value comprises:
acquiring a plurality of feature maps of the image based on the image classification model, wherein the feature maps are feature maps output by a convolution layer of the image classification model;
acquiring the pixel weight of each pixel in the image according to the classification probability of the target category, the pixel value of each pixel in each feature map in the plurality of feature maps and the number of the pixels in each feature map;
determining each target pixel in the image according to the pixel weight of each pixel in the image, wherein the pixel weight of each target pixel is greater than the preset threshold value;
and determining the target area according to each target pixel in the image, wherein the target area comprises each target pixel.
4. The image processing method according to claim 3, wherein the obtaining the pixel weight of each pixel in the image according to the classification probability of the target class, the pixel value of each pixel in each feature map of the plurality of feature maps, and the number of pixels in each feature map comprises:
acquiring the weight of each feature map in the plurality of feature maps to the target category according to the classification probability of the target category, the pixel value of the pixel of each feature map in the plurality of feature maps and the number of the pixels of each feature map;
weighting the plurality of feature maps according to the weight of the plurality of feature maps and each feature map to the target category to obtain weighted feature maps;
carrying out size adjustment on the weighted feature map to obtain a thermodynamic diagram with the same size as the image;
and carrying out normalization processing on the thermodynamic diagram, wherein the pixel value of each pixel in the normalized thermodynamic diagram represents the pixel weight of the pixel at the corresponding position in the image.
5. The method of claim 1, wherein before the classifying the image into the category of the image, the method further comprises:
acquiring a first training data set, wherein the first training data set comprises a plurality of first sample images corresponding to each category in the preset category set;
and training based on the first training data set to obtain an image classification model.
6. The image processing method of claim 5, wherein the first training data set further comprises classes other than the preset class set and a corresponding plurality of sample images.
7. The image processing method of claim 5, wherein the training based on the first training data set to obtain the image classification model comprises:
training based on the first training data set to obtain a first image classification model;
when the classification accuracy of the first image classification model is less than an accuracy threshold, acquiring a second training data set, wherein the second training data set comprises a plurality of second sample images and a category of each second sample image;
training based on the second training data set and the first training data set to obtain a second image classification model;
and when the classification accuracy of the second image classification model is smaller than the accuracy threshold, continuing to acquire training data sets, and training based on the acquired training data sets until the classification accuracy of the trained image classification model is equal to or larger than the accuracy threshold.
8. An image processing apparatus characterized by comprising:
a classification module configured to perform classification of an image to obtain a category of the image;
a determining module configured to determine a target region in the image, where a pixel weight is greater than a preset threshold when the category is included in a preset category set, the pixel weight being used to represent a contribution size of each pixel in the target region to the classification of the image into the category;
a processing module configured to perform occlusion processing on the target region in the image.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
classifying the images to obtain the categories of the images;
when the category is contained in a preset category set, determining a target area with pixel weight larger than a preset threshold value in the image, wherein the pixel weight is used for representing the contribution of each pixel in the target area to the classification of the image into the category;
and carrying out shielding processing on the target area in the image.
10. A non-transitory computer-readable storage medium having instructions therein, which when executed by a processor of an electronic device, enable the electronic device to perform a method of image processing, the method comprising:
classifying an image to obtain a category of the image;
when the category is contained in a preset category set, determining a target area in the image whose pixel weight is greater than a preset threshold, wherein the pixel weight represents the contribution of each pixel in the target area to the classification of the image into the category;
and performing occlusion processing on the target area in the image.
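The method of claims 8–10 — classify an image, threshold a per-pixel contribution map (for example a Grad-CAM heat map, per the non-patent citation), and occlude the resulting target region — can be sketched as follows. The function and parameter names are illustrative, not from the patent, and computing the weight map itself is assumed to be given:

```python
import numpy as np

def occlude_by_pixel_weight(image, pixel_weights, category,
                            category_set, weight_threshold=0.5,
                            fill_value=0):
    """If `category` is in the preset category set, occlude every
    pixel whose contribution weight exceeds the threshold.

    `pixel_weights` is an HxW map of each pixel's contribution to the
    predicted category; how it is computed (e.g. Grad-CAM) is outside
    this sketch.
    """
    if category not in category_set:
        return image                           # category not flagged: no change
    mask = pixel_weights > weight_threshold    # target region
    occluded = image.copy()
    occluded[mask] = fill_value                # occlusion processing
    return occluded

# toy example: a 4x4 grayscale "image" whose top-left 2x2 block
# contributes most to a flagged category
img = np.arange(1, 17, dtype=float).reshape(4, 4)
weights = np.zeros((4, 4))
weights[:2, :2] = 0.9
out = occlude_by_pixel_weight(img, weights, category="violence",
                              category_set={"violence", "gore"})
```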
CN201811564402.9A 2018-12-20 2018-12-20 Image processing method, image processing device, electronic equipment and storage medium Pending CN111353330A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811564402.9A CN111353330A (en) 2018-12-20 2018-12-20 Image processing method, image processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111353330A true CN111353330A (en) 2020-06-30

Family

ID=71192050

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811564402.9A Pending CN111353330A (en) 2018-12-20 2018-12-20 Image processing method, image processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111353330A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108899087A (en) * 2018-06-22 2018-11-27 中山仰视科技有限公司 X-ray intelligent diagnosing method based on deep learning
CN109040824A (en) * 2018-08-28 2018-12-18 百度在线网络技术(北京)有限公司 Video processing method and apparatus, electronic device, and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RAMPRASAATH R. SELVARAJU et al.: "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization" *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450401A (en) * 2021-07-19 2021-09-28 北京航空航天大学杭州创新研究院 Trash can fullness degree determining method, device and equipment and trash can
CN115249281A (en) * 2022-01-29 2022-10-28 北京百度网讯科技有限公司 Image occlusion and model training method, device, equipment and storage medium
CN115249281B (en) * 2022-01-29 2023-11-24 北京百度网讯科技有限公司 Image occlusion and model training method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109325933B (en) Method and device for recognizing copied image
CN110276767B (en) Image processing method and device, electronic equipment and computer readable storage medium
EP3579147A1 (en) Image processing method and electronic device
CN111193923B (en) Video quality evaluation method and device, electronic equipment and computer storage medium
WO2019233263A1 (en) Method for video processing, electronic device and computer-readable storage medium
US20230030267A1 (en) Method and apparatus for selecting face image, device, and storage medium
US20230085605A1 (en) Face image processing method, apparatus, device, and storage medium
US9305208B2 (en) System and method for recognizing offensive images
WO2021179471A1 (en) Face blur detection method and apparatus, computer device and storage medium
CN112818737B (en) Video identification method, device, storage medium and terminal
CN111935479B (en) Target image determination method and device, computer equipment and storage medium
CN111754267B (en) Data processing method and system based on block chain
US20190294863A9 (en) Method and apparatus for face classification
CN109284673B (en) Object tracking method and device, electronic equipment and storage medium
US11087140B2 (en) Information generating method and apparatus applied to terminal device
CN112135188A (en) Video clipping method, electronic device and computer-readable storage medium
CN112492297B (en) Video processing method and related equipment
CN111079864A (en) Short video classification method and system based on optimized video key frame extraction
US11804032B2 (en) Method and system for face detection
CN110599514B (en) Image segmentation method and device, electronic equipment and storage medium
CN111353330A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113743378B (en) Fire monitoring method and device based on video
US11527091B2 (en) Analyzing apparatus, control method, and program
CN116612355A (en) Training method and device for face fake recognition model, face recognition method and device
CN115798005A (en) Reference photo processing method and device, processor and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination