CN112699902A - Fine-grained sensitive image detection method based on bilinear attention pooling mechanism - Google Patents
- Publication number
- CN112699902A (application number CN202110031134.XA)
- Authority
- CN
- China
- Prior art keywords
- attention
- sensitive image
- feature
- fine
- pooling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06F18/2132—Feature extraction, e.g. by transforming the feature space, based on discrimination criteria, e.g. discriminant analysis
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The invention relates to a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism, comprising the following steps. Step S1: acquire sensitive images and perform data cleaning on the collected image set to obtain an NSFW sensitive image training data set. Step S2: input the NSFW sensitive image training data set into a fine-grained sensitive-image intelligent auditing network model, perform feature extraction, and generate a feature map and an attention map. Step S3: perform attention-based data enhancement on the NSFW training set according to the obtained attention maps, applying attention cropping and attention dropping to the images. Step S4: aggregate the feature map and the attention maps through a bilinear attention pooling mechanism to generate part feature maps, extract local features through convolution and pooling, and combine all local features into a final feature. Step S5: predict the sensitive image category from the final feature. The method can effectively improve detection accuracy on sensitive images in hard-sample scenes.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism.
Background
In recent years, deep learning image classification networks based on convolutional neural networks have been applied to the intelligent auditing of sensitive images; a typical example is Yahoo's sensitive-image auditing model Open_NSFW. However, this model performs poorly in the intelligent auditing of domestic sensitive images, mainly because its training data differ from the practical application: the training set was collected largely from Western sources and consists mostly of images of white subjects, so such a deep learning image classification network generalizes poorly when applied to the domestic sensitive-image auditing task.
Disclosure of Invention
In view of this, the present invention provides a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism, which can effectively improve the detection accuracy of sensitive images in hard-sample scenes.
In order to achieve the purpose, the invention adopts the following technical scheme:
a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism comprises the following steps:
step S1: acquiring a sensitive image, and performing data cleaning on the acquired sensitive image set to obtain an NSFW sensitive image training data set;
step S2: constructing a fine-grained sensitive image intelligent auditing network model, inputting an NSFW sensitive image training data set into the fine-grained sensitive image intelligent auditing network model, extracting features, and generating a feature map and an attention map;
step S3: performing attention-based data enhancement on the NSFW training set according to the obtained attention maps, and applying attention cropping and attention dropping to the images while preserving the salient discriminative regions in the sensitive images;
step S4: aggregating the feature map and the attention maps through a bilinear attention pooling mechanism to generate part feature maps, extracting local features through convolution and pooling, and combining all the local features into a final feature;
step S5: and predicting the sensitive image category according to the final characteristics.
Further, the step S1 is specifically:
step S11: acquiring images of the five categories Drawings, Neutral, Sexy, Hentai, and Porn in batches through URL addresses;
step S12: classifying the images of the Sexy and Porn categories by using Yahoo's open-source Open_NSFW pornographic-content recognition model, and adjusting, screening, and filtering out sample images that do not belong to the corresponding category or are unavailable;
step S13: dividing the cleaned sample set into a training set and a test set at a ratio of 8:1 to construct the NSFW data set.
Further, the step S2 is specifically:
step S21: fine-tuning the pre-trained BiT-M model on the obtained NSFW sensitive image training data set, and extracting features with a ResNet50 backbone network to obtain a feature map F;
step S22: performing a 1 × 1 convolution on the feature map F to obtain an attention map A;
step S23: employing an attention regularization loss to weakly supervise the attention learning process.
Further, the step S23 is specifically: the attention regularization loss balances the variance of local features belonging to the same object part, so that each local feature f_k approaches its global feature center c_k ∈ R^{1×N} and each attention map A_k is activated on the same k-th object part. The loss is applied only to the original image and is denoted L_A:

L_A = Σ_{k=1}^{M} ||f_k - c_k||₂²

c_k ← c_k + β(f_k - c_k)

wherein L_A is the loss function, M is the number of attention maps, f_k is the k-th local feature, c_k is its global feature center, initialized to zero and updated with a moving average, and β controls the update rate of c_k.
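Steps S21 and S22 above can be sketched as follows. A 1 × 1 convolution over the feature map F is equivalent to a per-pixel linear projection from the C feature channels to M attention channels; the channel counts and the ReLU nonlinearity below are illustrative assumptions, not values fixed by the text:

```python
import numpy as np

# Sketch of steps S21-S22: a 1x1 convolution is a pointwise linear map
# from C feature channels to M attention channels. Shapes are assumed.
def attention_maps(F, W):
    # F: (H, W, C) feature map, W: (C, M) 1x1-conv weights
    A = np.einsum("hwc,cm->hwm", F, W)  # pointwise channel projection
    return np.maximum(A, 0.0)           # ReLU keeps attention non-negative

rng = np.random.default_rng(0)
F = rng.standard_normal((14, 14, 2048))  # e.g. ResNet50 final-stage feature map
W = rng.standard_normal((2048, 32))      # M = 32 attention maps (assumed)
A = attention_maps(F, W)
print(A.shape)  # (14, 14, 32)
```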
Further, the step S3 is specifically:
step S31: for each training image in the NSFW sensitive image data set, an attention map A_k of the image is randomly selected to guide the data enhancement process and is normalized to obtain the k-th augmentation map A*_k:

A*_k = (A_k - min(A_k)) / (max(A_k) - min(A_k))

wherein A_k is the selected attention map, A*_k is the normalized augmentation map, and min(A_k) and max(A_k) are its minimum and maximum values;
step S32: magnify the salient feature region through the augmentation map A*_k and extract local features;
step S33: set a bounding box B_k covering the entire positive region of the crop mask, and use the image obtained by enlarging this region from the original image as the input data for data enhancement;
step S34: attention dropping, supervised by the attention regularization loss. In each attention map A_k ∈ R^{H×W}, which represents a part of the same k-th object, the elements of A*_k larger than the threshold θ_d ∈ [0,1] are set to 0 and the other elements to 1:

D_k(i,j) = 0 if A*_k(i,j) > θ_d, otherwise 1

wherein A*_k(i,j) is the normalized attention value at position (i,j), D_k(i,j) is the drop mask at position (i,j), and θ_d is the threshold.
Further, the step S32 is specifically: the crop mask C_k is obtained from A*_k by setting each element larger than the threshold θ_c ∈ [0,1] to 1 and the others to 0:

C_k(i,j) = 1 if A*_k(i,j) > θ_c, otherwise 0

wherein A*_k(i,j) is the normalized attention value at position (i,j), C_k(i,j) is the crop mask at position (i,j), and θ_c is the threshold.
Further, the step S4 is specifically:
step S41: each attention map represents one part of the object; each attention map is multiplied element-wise with the feature map to generate a part feature map, and discriminative local features are then further extracted by an additional feature extraction function to obtain the k-th attention saliency feature;
step S42: the local features f_k are stacked to form the object feature, represented by a part feature matrix P ∈ R^{M×N}; Γ(A, F) denotes the bilinear attention pooling of the attention map A and the feature map F:

P = Γ(A, F) = (g(a_1 ∘ F), g(a_2 ∘ F), ..., g(a_M ∘ F)) = (f_1, f_2, ..., f_M)

wherein Γ(A, F) is the bilinear attention pooling function, g(·) is a feature extraction function, ∘ denotes element-wise multiplication, a_1, ..., a_M are the local attention maps, F is the feature map, and f_1, ..., f_M are the local features;
step S43: local features are extracted from the partial feature map generated in step S41 by convolution or pooling operation, and a final feature matrix is composed of all the partial features.
A fine-grained sensitive image detection system based on a bilinear attention pooling scheme, comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, which when executed by the processor, implement the method steps of any of claims 1-7.
A computer-readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions, when executed by the processor, performing the method steps of any of claims 1-7.
Compared with the prior art, the invention has the following beneficial effects:
1. The fine-grained sensitive image detection method based on the bilinear attention pooling mechanism can effectively solve the problem of extracting discriminative features from sensitive images in hard-sample scenes. It fully exploits the feature-extraction strength of deep learning: simple features are first learned from a large data set and progressively more complex, abstract deep features are then learned, without relying on manual feature engineering, completing intelligent auditing of fine-grained sensitive images under refined classification;
2. according to the method, the data cleaning based on the deep learning Open _ Nsfw model is carried out on the sensitive image data set collected from the Internet, the invalid data samples in the collected sensitive image are effectively screened and filtered, and compared with manual screening, a large amount of time and labor cost are saved;
3. Random data enhancement of sensitive image data sets is inefficient and, especially when the target is small, introduces a high proportion of background noise. The invention therefore provides an attention-based sensitive-image data enhancement method that performs attention cropping and attention dropping while preserving the salient discriminative regions in the sensitive image, effectively improving the usefulness of the data enhancement;
4. the invention provides a fine-grained sensitive image content intelligent auditing model based on a bilinear attention pooling mechanism, aiming at the problems of few thinning categories, poor performance on difficult sample sensitive images and the like of the conventional sensitive image auditing model. The model firstly generates a target feature map and an attention map representing the salient features of the target through weak supervised learning, then generates a local feature map from the aggregate feature map and the attention map through a bilinear attention pooling mechanism, extracts local features through convolution and pooling, and finally combines all the local features into a final feature to improve the discrimination of the model. The discrimination capability of the hard sample sensitive image is effectively enhanced, and the accuracy of intelligent examination of the content of the sensitive image is finally improved.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to fig. 1, the present invention provides a fine-grained sensitive image detection method based on bilinear attention pooling, which includes the following steps:
step S1: acquiring a sensitive image, and performing data cleaning on the acquired sensitive image set to obtain an NSFW sensitive image training data set;
step S2: constructing a fine-grained sensitive image intelligent auditing network model, inputting an NSFW sensitive image training data set into the fine-grained sensitive image intelligent auditing network model, extracting features, and generating a feature map and an attention map;
step S3: performing attention-based data enhancement on the NSFW training set according to the obtained attention maps, and applying attention cropping and attention dropping to the images while preserving the salient discriminative regions in the sensitive images;
step S4: aggregating the feature map and the attention maps through a bilinear attention pooling mechanism to generate part feature maps, extracting local features through convolution and pooling, and combining all the local features into a final feature;
step S5: and predicting the sensitive image category according to the final characteristics.
In this embodiment, the step S1 specifically includes:
step S11: acquiring 63486 images of the five categories Drawings, Neutral, Sexy, Hentai, and Porn in batches through URL addresses;
step S12: classifying the images of the Sexy and Porn categories by using Yahoo's open-source Open_NSFW pornographic-content recognition model, and adjusting, screening, and filtering out sample images that do not belong to the corresponding category or are unavailable;
step S13: and dividing the sample set subjected to data cleaning into a training set and a testing set according to the proportion of 8:1, and constructing an NSFW data set.
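The 8:1 split of step S13 can be sketched as follows; the shuffle seed and the file-name list are illustrative assumptions:

```python
import random

# Sketch of step S13: split the cleaned sample set into a training set and
# a test set at an 8:1 ratio. Seed and sample names are illustrative only.
def split_dataset(samples, ratio=(8, 1), seed=0):
    rng = random.Random(seed)
    samples = samples[:]          # copy so the caller's list is untouched
    rng.shuffle(samples)
    n_train = len(samples) * ratio[0] // sum(ratio)
    return samples[:n_train], samples[n_train:]

train, test = split_dataset([f"img_{i}.jpg" for i in range(9000)])
print(len(train), len(test))  # 8000 1000
```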
In this embodiment, the step S2 specifically includes:
step S21: fine-tuning the pre-trained BiT-M model on the obtained NSFW sensitive image training data set, and extracting features with a ResNet50 backbone network to obtain a feature map F;
step S22: performing a 1 × 1 convolution on the feature map F to obtain an attention map A;
step S23: employing an attention regularization loss to weakly supervise the attention learning process.
The attention regularization loss balances the variance of local features belonging to the same object part: each local feature f_k approaches its global feature center c_k ∈ R^{1×N}, and each attention map A_k is activated on the same k-th object part. The loss is applied only to the original image and is denoted L_A:

L_A = Σ_{k=1}^{M} ||f_k - c_k||₂²

c_k ← c_k + β(f_k - c_k)

wherein L_A is the loss function, M is the number of attention maps, f_k is the k-th local feature, c_k is its global feature center, initialized to zero and updated with a moving average, and β controls the update rate of c_k.
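The attention regularization loss and the moving-average center update above can be sketched as follows; the feature dimension N, the number of maps M, and the value of β are illustrative assumptions:

```python
import numpy as np

# Sketch of step S23: L_A = sum_k ||f_k - c_k||_2^2, with centers updated as
# c_k <- c_k + beta * (f_k - c_k). Dimensions and beta are illustrative.
def attention_reg_loss(f, c):
    # f, c: (M, N) local features and their global feature centers
    return float(np.sum((f - c) ** 2))

def update_centers(f, c, beta=0.05):
    # moving-average update pulls each center toward its local feature
    return c + beta * (f - c)

M, N = 32, 2048
f = np.ones((M, N))
c = np.zeros((M, N))              # centers initialized from zero, as in the text
loss = attention_reg_loss(f, c)   # 32 * 2048 * 1.0 = 65536.0
c = update_centers(f, c, beta=0.1)
print(loss, c[0, 0])  # 65536.0 0.1
```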
In this embodiment, the step S3 specifically includes:
step S31: for each training image in the NSFW sensitive image data set, an attention map A_k of the image is randomly selected to guide the data enhancement process and is normalized to obtain the k-th augmentation map A*_k:

A*_k = (A_k - min(A_k)) / (max(A_k) - min(A_k))

wherein A_k is the selected attention map, A*_k is the normalized augmentation map, and min(A_k) and max(A_k) are its minimum and maximum values;
step S32: magnify the salient feature region through the augmentation map A*_k and extract local features;
the crop mask C_k is obtained from A*_k by setting each element larger than the threshold θ_c ∈ [0,1] to 1 and the others to 0:

C_k(i,j) = 1 if A*_k(i,j) > θ_c, otherwise 0

wherein A*_k(i,j) is the normalized attention value at position (i,j), C_k(i,j) is the crop mask at position (i,j), and θ_c is the threshold;
step S33: set a bounding box B_k covering the entire positive region of the crop mask, and use the image obtained by enlarging this region from the original image as the input data for data enhancement; as the local region of the object is enlarged, the object can be observed more clearly and finer-grained features can be extracted;
step S34: attention dropping, supervised by the attention regularization loss. In each attention map A_k ∈ R^{H×W}, which represents a part of the same k-th object, the elements of A*_k larger than the threshold θ_d ∈ [0,1] are set to 0 and the other elements to 1:

D_k(i,j) = 0 if A*_k(i,j) > θ_d, otherwise 1

wherein A*_k(i,j) is the normalized attention value at position (i,j), D_k(i,j) is the drop mask at position (i,j), and θ_d is the threshold. Because the k-th object part is erased from the sensitive image, the network is encouraged to extract features from other discriminative regions, so the object is perceived more completely, ultimately improving classification robustness and localization accuracy.
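The normalization, crop mask, drop mask, and bounding box of steps S31-S34 can be sketched as follows; the threshold values and the toy attention map are illustrative assumptions:

```python
import numpy as np

# Sketch of steps S31-S34: normalize one attention map to [0, 1], then build
# the crop mask C_k (1 where attention > theta_c) and the drop mask D_k
# (0 where attention > theta_d). Thresholds are illustrative.
def normalize_attention(Ak, eps=1e-8):
    # A*_k = (A_k - min(A_k)) / (max(A_k) - min(A_k))
    return (Ak - Ak.min()) / (Ak.max() - Ak.min() + eps)

def crop_mask(Ak_star, theta_c=0.5):
    return (Ak_star > theta_c).astype(np.uint8)

def drop_mask(Ak_star, theta_d=0.5):
    return 1 - (Ak_star > theta_d).astype(np.uint8)

def crop_bbox(Ck):
    # bounding box B_k covering the positive region of the crop mask
    ys, xs = np.nonzero(Ck)
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1

Ak = np.array([[0.0, 0.2, 0.1],
               [0.1, 1.0, 0.8],
               [0.0, 0.9, 0.2]])
star = normalize_attention(Ak)
Ck, Dk = crop_mask(star), drop_mask(star)
print(crop_bbox(Ck))  # (1, 1, 3, 3)
```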
In this embodiment, the step S4 specifically includes:
step S41: each attention map represents one part of the object; each attention map is multiplied element-wise with the feature map to generate a part feature map, and discriminative local features are then further extracted by an additional feature extraction function to obtain the k-th attention saliency feature;
step S42: the local features f_k are stacked to form the object feature, represented by a part feature matrix P ∈ R^{M×N}; Γ(A, F) denotes the bilinear attention pooling of the attention map A and the feature map F:

P = Γ(A, F) = (g(a_1 ∘ F), g(a_2 ∘ F), ..., g(a_M ∘ F)) = (f_1, f_2, ..., f_M)

wherein Γ(A, F) is the bilinear attention pooling function, g(·) is a feature extraction function, ∘ denotes element-wise multiplication, a_1, ..., a_M are the local attention maps, F is the feature map, and f_1, ..., f_M are the local features;
step S43: local features are extracted from the partial feature map generated in step S41 by convolution or pooling operation, and a final feature matrix is composed of all the partial features.
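The bilinear attention pooling of steps S41-S43 can be sketched as follows, taking g(·) to be global average pooling over each attended feature map, an assumption consistent with the convolution-and-pooling description; the map counts are illustrative:

```python
import numpy as np

# Sketch of steps S41-S43: f_k = g(a_k * F) with g = global average pooling
# (assumed). Stacking the M local features gives the part feature matrix
# P in R^{M x N}.
def bilinear_attention_pooling(A, F):
    # A: (H, W, M) attention maps, F: (H, W, N) feature map -> P: (M, N)
    H, W, _ = A.shape
    return np.einsum("hwm,hwn->mn", A, F) / (H * W)

rng = np.random.default_rng(0)
A = rng.random((14, 14, 32))     # 32 attention maps (assumed)
F = rng.random((14, 14, 2048))   # backbone feature map (assumed shape)
P = bilinear_attention_pooling(A, F)
print(P.shape)  # (32, 2048)
```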
Preferably, in this embodiment, there is also provided a fine-grained sensitive image detection system based on a bilinear attention pooling scheme, which includes a memory, a processor, and computer program instructions stored on the memory and capable of being executed by the processor, and when the computer program instructions are executed by the processor, the method steps as described in any one of the above are implemented.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing describes preferred embodiments of the present invention; further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change, or refinement of the above embodiments made according to the technical essence of the present invention falls within the protection scope of the present invention.
Claims (9)
1. A fine-grained sensitive image detection method based on a bilinear attention pooling mechanism is characterized by comprising the following steps:
step S1: acquiring a sensitive image, and performing data cleaning on the acquired sensitive image set to obtain an NSFW sensitive image training data set;
step S2: constructing a fine-grained sensitive image intelligent auditing network model, inputting an NSFW sensitive image training data set into the fine-grained sensitive image intelligent auditing network model, extracting features, and generating a feature map and an attention map;
step S3: performing attention-based data enhancement on the NSFW training set according to the obtained attention maps, and applying attention cropping and attention dropping to the images while preserving the salient discriminative regions in the sensitive images;
step S4: aggregating the feature map and the attention map through a bilinear attention pooling mechanism to generate a local feature map, extracting local features through convolution and pooling, and combining all the local features into a final feature;
step S5: and predicting the sensitive image category according to the final characteristics.
2. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein the step S1 specifically comprises:
step S11, acquiring five category images of Drawings, Neutral, Sexy, Hentai and Porn in batches through URL addresses;
step S12: classifying the images of the Sexy and Porn categories by using Yahoo's open-source Open_NSFW pornographic-content recognition model, and adjusting, screening, and filtering out sample images that do not belong to the corresponding category or are unavailable;
step S13: and dividing the sample set subjected to data cleaning into a training set and a testing set according to the proportion of 8:1, and constructing an NSFW data set.
3. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein the step S2 specifically comprises:
step S21: fine-tuning the pre-trained BiT-M model according to the obtained NSFW sensitive image training data set, and extracting features by using a ResNet50 network as a main network to obtain a feature map F;
step S22: performing 1 × 1 convolution operation on the obtained feature diagram F to obtain an attention diagram A;
step S23: an attention regularization loss mechanism is employed to weakly supervise the attention learning based process.
4. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 3, wherein the step S23 specifically comprises: the attention regularization loss balances the variance of local features belonging to the same object part, so that each local feature f_k approaches its global feature center c_k ∈ R^{1×N} and each attention map A_k is activated on the same k-th object part; the loss is applied only to the original image and is denoted L_A:

L_A = Σ_{k=1}^{M} ||f_k - c_k||₂²

c_k ← c_k + β(f_k - c_k)

wherein L_A is the loss function, M is the number of attention maps, f_k is the k-th local feature, c_k is its global feature center, initialized to zero and updated with a moving average, and β controls the update rate of c_k.
5. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein step S3 is specifically:
step S31: for each training image in the NSFW sensitive image data set, an attention map A_k of the image is randomly selected to guide the data enhancement process and is normalized to obtain the k-th augmentation map A*_k:

A*_k = (A_k - min(A_k)) / (max(A_k) - min(A_k))

wherein A_k is the selected attention map, A*_k is the normalized augmentation map, and min(A_k) and max(A_k) are its minimum and maximum values;
step S32: magnify the salient feature region through the augmentation map A*_k and extract local features;
step S33: setting a bounding box B_k covering the entire positive region of the crop mask, and using the image obtained by enlarging this region from the original image as the input data for data enhancement;
step S34: attention dropping, supervised by the attention regularization loss: in each attention map A_k ∈ R^{H×W}, which represents a part of the same k-th object, the elements of A*_k larger than the threshold θ_d ∈ [0,1] are set to 0 and the other elements to 1:

D_k(i,j) = 0 if A*_k(i,j) > θ_d, otherwise 1

wherein D_k(i,j) is the drop mask at position (i,j) and θ_d is the threshold.
6. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 5, wherein the step S32 specifically comprises: obtaining the crop mask C_k from A*_k by setting each element larger than the threshold θ_c ∈ [0,1] to 1 and the others to 0:

C_k(i,j) = 1 if A*_k(i,j) > θ_c, otherwise 0

wherein C_k(i,j) is the crop mask at position (i,j) and θ_c is the threshold.
7. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein the step S4 specifically comprises:
step S41: each attention diagram represents a part of a specific object, each attention diagram is multiplied by a feature diagram element by element to generate a partial feature diagram, and then discriminant local features are further extracted through an additional feature extraction function to obtain a kth attention saliency feature;
step S42: the local features f_k are stacked to generate the object feature, represented by the partial feature matrix P ∈ R^(M×N); Γ(A, F) denotes the bilinear attention pooling of the attention maps A and the feature map F:

P = Γ(A, F) = (g(a_1 ⊙ F), ..., g(a_M ⊙ F)) = (f_1, ..., f_M)

wherein Γ(A, F) is the bilinear attention pooling function, g(·) is a feature extraction function, a_1, ..., a_M are the local attention maps, F is the feature map, and f_1, ..., f_M are the local features;
step S43: local features are extracted from the partial feature map generated in step S41 by convolution or pooling operation, and a final feature matrix is composed of all the partial features.
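The bilinear attention pooling of steps S41-S43 can be sketched as follows (illustrative only; global average pooling is assumed for the extraction function g(·), which the claim leaves open as "convolution or pooling"):

```python
import numpy as np

def bilinear_attention_pooling(attn, feat):
    """Sketch of Gamma(A, F).

    attn: (M, H, W) attention maps a_1..a_M
    feat: (C, H, W) feature map F
    Returns P, an (M, C) matrix whose k-th row is the local feature f_k.
    """
    # Element-wise multiply each attention map with every feature channel
    parts = attn[:, None, :, :] * feat[None, :, :, :]  # (M, C, H, W)
    # g(.) chosen here as global average pooling over the spatial dims
    return parts.mean(axis=(2, 3))
```

Stacking the M rows gives the partial feature matrix P ∈ R^(M×N) of step S42, which is then flattened into the final classification feature.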
8. A fine-grained sensitive image detection system based on a bilinear attention pooling mechanism, comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor, wherein the computer program instructions, when executed by the processor, perform the method steps of any one of claims 1-7.
9. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, perform the method steps according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110031134.XA CN112699902A (en) | 2021-01-11 | 2021-01-11 | Fine-grained sensitive image detection method based on bilinear attention pooling mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112699902A true CN112699902A (en) | 2021-04-23 |
Family
ID=75513854
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112699902A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190362199A1 (en) * | 2018-05-25 | 2019-11-28 | Adobe Inc. | Joint blur map estimation and blur desirability classification from an image |
CN111489334A (en) * | 2020-04-02 | 2020-08-04 | 暖屋信息科技(苏州)有限公司 | Defect workpiece image identification method based on convolution attention neural network |
CN111539469A (en) * | 2020-04-20 | 2020-08-14 | 东南大学 | Weak supervision fine-grained image identification method based on vision self-attention mechanism |
CN112163465A (en) * | 2020-09-11 | 2021-01-01 | 华南理工大学 | Fine-grained image classification method, fine-grained image classification system, computer equipment and storage medium |
CN112183602A (en) * | 2020-09-22 | 2021-01-05 | 天津大学 | Multi-layer feature fusion fine-grained image classification method with parallel convolution blocks |
Non-Patent Citations (2)
Title |
---|
HUA WEI ET AL.: "Two-Level Progressive Attention Convolutional Network for Fine-Grained Image Recognition", IEEE Access * |
LI FENGLEI: "Fine-Grained Image Classification Based on Multi-Layer Weight-Adaptive Bilinear Pooling and Attention Mechanism", China Masters' Theses Full-text Database, Information Science and Technology * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610757A (en) * | 2021-07-02 | 2021-11-05 | 华中科技大学同济医学院附属同济医院 | Medical x-ray lung image detection method based on fine granularity |
CN113627377A (en) * | 2021-08-18 | 2021-11-09 | 福州大学 | Cognitive radio frequency spectrum sensing method and system Based on Attention-Based CNN |
CN113627377B (en) * | 2021-08-18 | 2024-07-02 | 福州大学 | Cognitive radio spectrum sensing method and system Based on Attention-Based CNN |
CN113936145A (en) * | 2021-10-08 | 2022-01-14 | 南京信息工程大学 | Fine-grained identification method based on attention map sorting |
CN113936145B (en) * | 2021-10-08 | 2024-06-11 | 南京信息工程大学 | Fine-grained identification method based on attention map sorting |
CN114708466A (en) * | 2022-06-08 | 2022-07-05 | 南京智莲森信息技术有限公司 | Part abnormal fine granularity classification method and system, storage medium and computing equipment |
CN114708466B (en) * | 2022-06-08 | 2022-09-09 | 南京智莲森信息技术有限公司 | Part abnormal fine granularity classification method and system, storage medium and computing equipment |
CN116458897A (en) * | 2023-04-18 | 2023-07-21 | 山东省人工智能研究院 | Electrocardiosignal quality assessment method based on two-dimensional image and attention mechanism |
CN116458897B (en) * | 2023-04-18 | 2024-01-26 | 山东省人工智能研究院 | Electrocardiosignal quality assessment method based on two-dimensional image and attention mechanism |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112699902A (en) | Fine-grained sensitive image detection method based on bilinear attention pooling mechanism | |
CN109154978A (en) | System and method for detecting plant disease | |
CN110728330A (en) | Object identification method, device, equipment and storage medium based on artificial intelligence | |
CN110837768B (en) | Online detection and identification method for rare animal protection | |
CN109800682B (en) | Driver attribute identification method and related product | |
CN111986183B (en) | Chromosome scattered image automatic segmentation and identification system and device | |
CN111179216B (en) | Crop disease identification method based on image processing and convolutional neural network | |
CN114092450B (en) | Real-time image segmentation method, system and device based on gastroscopy video | |
CN112417955A (en) | Patrol video stream processing method and device | |
CN112101352A (en) | Underwater floc state identification method and monitoring device, computer equipment and storage medium | |
Bai et al. | Robust texture-aware computer-generated image forensic: Benchmark and algorithm | |
CN114140663A (en) | Multi-scale attention and learning network-based pest identification method and system | |
CN117253071B (en) | Semi-supervised target detection method and system based on multistage pseudo tag enhancement | |
CN106780286A (en) | Particle swarm optimization watermarking method based on blind watermark extraction | |
CN113344935B (en) | Image segmentation method and system based on multi-scale difficulty perception | |
CN104899875A (en) | Rapid image cooperation salient region monitoring method based on integration matching | |
CN110751034B (en) | Pedestrian behavior recognition method and terminal equipment | |
CN113963178A (en) | Method, device, equipment and medium for detecting infrared dim and small target under ground-air background | |
Das et al. | Ayurvedic Medicinal Plant Identification System Using Embedded Image Processing Techniques | |
Selvy et al. | A proficient clustering technique to detect CSF level in MRI brain images using PSO algorithm | |
Nair et al. | Under water fish species recognition | |
Liu et al. | Microscopic image analysis and recognition on pathological cells | |
CN118279596B (en) | Underwater fish sunlight refraction image denoising method and system | |
Wu et al. | Tumor segmentation on whole slide images: training or prompting? | |
Bao et al. | An Improved Densenet-Cnn Model to Classify the Damage Caused by Cotton Aphid |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210423 |