CN112699902A - Fine-grained sensitive image detection method based on bilinear attention pooling mechanism - Google Patents

Info

Publication number
CN112699902A
CN112699902A
Authority
CN
China
Prior art keywords
attention
sensitive image
feature
fine
pooling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110031134.XA
Other languages
Chinese (zh)
Inventor
柯逍 (Ke Xiao)
王俊强 (Wang Junqiang)
林艳 (Lin Yan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202110031134.XA priority Critical patent/CN112699902A/en
Publication of CN112699902A publication Critical patent/CN112699902A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism, comprising the following steps. Step S1: acquire sensitive images and perform data cleaning on the collected image set to obtain an NSFW sensitive image training dataset. Step S2: input the NSFW sensitive image training dataset into a fine-grained sensitive image intelligent auditing network model, extract features, and generate a feature map and attention maps. Step S3: perform attention-based data augmentation on the NSFW training set according to the obtained attention maps, applying attention cropping and attention dropping to the images. Step S4: aggregate the feature map and the attention maps through a bilinear attention pooling mechanism to generate local feature maps, extract local features through convolution and pooling, and combine all local features into a final feature. Step S5: predict the sensitive image category from the final feature. The method can effectively improve detection accuracy on hard-sample sensitive images.

Description

Fine-grained sensitive image detection method based on bilinear attention pooling mechanism
Technical Field
The invention relates to the technical field of image recognition, in particular to a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism.
Background
In recent years, deep learning image classification models based on convolutional neural networks have been applied to the intelligent auditing of sensitive images, a typical example being Yahoo's Open_NSFW sensitive image auditing model. However, this model performs poorly in domestic (Chinese) sensitive image auditing scenarios. The main reason is that its training data differ from the practical application: the training set comes mainly from Western websites and the collected images depict predominantly white subjects, so such models generalize poorly to the domestic sensitive image auditing task.
Disclosure of Invention
In view of this, the present invention provides a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism, which can effectively improve the detection accuracy of sensitive images in hard-sample scenes.
In order to achieve the purpose, the invention adopts the following technical scheme:
a fine-grained sensitive image detection method based on a bilinear attention pooling mechanism comprises the following steps:
step S1: acquiring a sensitive image, and performing data cleaning on the acquired sensitive image set to obtain an NSFW sensitive image training data set;
step S2: constructing a fine-grained sensitive image intelligent auditing network model, inputting an NSFW sensitive image training data set into the fine-grained sensitive image intelligent auditing network model, extracting features, and generating a feature map and an attention map;
step S3: performing attention mechanism-based data enhancement on the NSFW training set according to the obtained attention diagram, and performing attention clipping and attention discarding on the images on the basis of reserving the saliency discrimination areas in the sensitive images;
step S4: aggregating the feature map and the attention maps through a bilinear attention pooling mechanism to generate local feature maps, extracting local features through convolution and pooling, and combining all the local features into a final feature;
step S5: and predicting the sensitive image category according to the final characteristics.
Further, the step S1 is specifically:
step S11: acquiring images of the five categories Drawings, Neutral, Sexy, Hentai and Porn in batches through URL addresses;
step S12: classifying the images of the Sexy and Porn categories using Yahoo's open-source Open_NSFW pornographic-content detection model, and adjusting, screening and filtering out sample images that do not belong to the corresponding category or are unavailable;
step S13: and dividing the sample set subjected to data cleaning into a training set and a testing set according to the proportion of 8:1, and constructing an NSFW data set.
Further, the step S2 is specifically:
step S21: fine-tuning the pre-trained BiT-M model on the obtained NSFW sensitive image training dataset, using a ResNet50 network as the backbone to extract features and obtain a feature map F;
step S22: performing a 1 × 1 convolution on the feature map F to obtain the attention maps A;
step S23: an attention regularization loss is employed to weakly supervise the attention learning process.
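As an illustration of steps S21–S22, the attention maps can be produced from the backbone feature map by a 1 × 1 convolution, which for a 1 × 1 kernel reduces to a per-pixel linear map over the channels. A minimal NumPy sketch follows; the toy shapes, the weight matrix and the ReLU activation are illustrative assumptions, not details taken from the patent:

```python
import numpy as np

def one_by_one_conv(feature_map, weights):
    """1x1 convolution: a per-pixel linear map over channels.

    feature_map: (C, H, W) backbone feature map F
    weights:     (M, C) kernel, one row per attention map
    returns:     (M, H, W) attention maps A
    """
    C, H, W = feature_map.shape
    flat = feature_map.reshape(C, H * W)          # (C, H*W)
    attn = weights @ flat                         # (M, H*W)
    return np.maximum(attn, 0).reshape(-1, H, W)  # ReLU keeps the maps non-negative

# toy example: C=4 channels, a 3x3 spatial grid, M=2 attention maps
rng = np.random.default_rng(0)
F = rng.random((4, 3, 3))
W_1x1 = rng.random((2, 4))
A = one_by_one_conv(F, W_1x1)
print(A.shape)  # (2, 3, 3)
```

Each of the M output channels is one attention map A_k, later used for augmentation and pooling.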
Further, step S23 is specifically: the variance of local features belonging to the same object part is constrained so that each local feature f_k approaches its global feature center c_k ∈ R^{1×N} and the attention map A_k is activated on the same k-th object part. The attention regularization loss is applied only to the original image and is denoted L_A:

L_A = Σ_{k=1}^{M} ||f_k − c_k||²

c_k ← c_k + β(f_k − c_k)

where L_A is the loss function, M is the number of attention maps, f_k is the k-th local feature, c_k is its global feature center (initialized to zero and updated with a moving average), and β controls the update rate of c_k.
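The loss L_A and the moving-average center update defined above can be sketched in a few lines of NumPy; the toy feature values and the value of β are illustrative assumptions:

```python
import numpy as np

def attention_regularization_loss(local_feats, centers):
    """L_A = sum over k of ||f_k - c_k||^2, summed over the M part features."""
    diff = local_feats - centers
    return float(np.sum(diff ** 2))

def update_centers(local_feats, centers, beta=0.05):
    """Moving-average update: c_k <- c_k + beta * (f_k - c_k)."""
    return centers + beta * (local_feats - centers)

# toy example: M=3 attention maps, N=4 feature dimensions, centers start at zero
f = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 3.0, 0.0]])
c = np.zeros_like(f)
loss = attention_regularization_loss(f, c)  # 1 + 4 + 9
c = update_centers(f, c, beta=0.1)          # each center moves 10% toward its f_k
print(loss)  # 14.0
```

Repeated updates pull each center toward the running mean of its part feature, which is what drives the attention map A_k to fire consistently on the same object part.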
Further, the step S3 is specifically:
step S31: for each training image in the NSFW sensitive image dataset, an attention map A_k of the image is randomly selected to guide the data augmentation process, and is normalized into the k-th augmentation map A*_k:

A*_k = (A_k − min(A_k)) / (max(A_k) − min(A_k))

where A_k is the attention map, A*_k is the augmentation map, min(A_k) is the minimum attention value, and max(A_k) is the maximum attention value;
step S32: enhancing maps by data
Figure BDA0002892114100000042
Amplifying the significant characteristic region and extracting local characteristics;
step S33: setting a bounding box B covering the entire selected positive field of the crop maskkAnd taking an image obtained by amplifying the area from the original image as input data of data enhancement;
step S34: attention regularization loss supervision Each attention map Ak∈RH×WThe part representing the same k-th object will be larger than the threshold value thetad∈[0,1]Of (2) element(s)
Figure BDA0002892114100000043
Set to 0 and the other elements to 1, the formula is as follows:
Figure BDA0002892114100000044
wherein
Figure BDA0002892114100000045
Is a discriminant element of the position (i, j), Dk(i, j) is the discard mask for position (i, j), θcIs a threshold value.
Further, step S32 is specifically: the crop mask C_k is obtained from A*_k by setting elements A*_k(i, j) greater than the threshold θ_c ∈ [0, 1] to 1 and the others to 0:

C_k(i, j) = 1 if A*_k(i, j) > θ_c, otherwise 0

where A*_k(i, j) is the discriminative element at position (i, j), C_k(i, j) is the crop mask at position (i, j), and θ_c is the threshold.
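The normalization, crop mask and drop mask of steps S31–S34 can be sketched as follows in NumPy. The threshold values and the toy attention map are illustrative assumptions, and `crop_bbox` is a hypothetical helper returning the bounding box B_k:

```python
import numpy as np

def normalize_attention(A_k):
    """A*_k = (A_k - min(A_k)) / (max(A_k) - min(A_k)), mapped into [0, 1]."""
    return (A_k - A_k.min()) / (A_k.max() - A_k.min() + 1e-12)

def crop_mask(A_star, theta_c=0.5):
    """C_k(i,j) = 1 where A*_k(i,j) > theta_c, else 0 (region kept and zoomed)."""
    return (A_star > theta_c).astype(np.uint8)

def drop_mask(A_star, theta_d=0.5):
    """D_k(i,j) = 0 where A*_k(i,j) > theta_d, else 1 (region erased)."""
    return (A_star <= theta_d).astype(np.uint8)

def crop_bbox(mask):
    """Smallest bounding box B_k covering the positive region of the crop mask."""
    ys, xs = np.nonzero(mask)
    return int(ys.min()), int(ys.max()), int(xs.min()), int(xs.max())

# toy 3x3 attention map with a high-attention blob at the lower right
A_k = np.array([[0.1, 0.2, 0.1],
                [0.2, 0.9, 0.8],
                [0.1, 0.7, 0.3]])
A_star = normalize_attention(A_k)
C = crop_mask(A_star, theta_c=0.5)   # attention cropping: zoom into this box
D = drop_mask(A_star, theta_d=0.5)   # attention dropping: erase this region
print(crop_bbox(C))  # (1, 2, 1, 2)
```

Cropping enlarges the box B_k from the original image for a finer-grained view; dropping multiplies the image by D_k so the network must find other discriminative regions.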
Further, the step S4 is specifically:
step S41: each attention map represents one part of the object; each attention map is multiplied element-wise with the feature map to generate a partial feature map, and discriminative local features are then further extracted through an additional feature extraction function to obtain the k-th attention-salient feature;
step S42: the local features f_k are stacked to form the object feature, represented by the part feature matrix P ∈ R^{M×N}; Γ(A, F) denotes the bilinear attention pooling of the attention maps A and the feature map F:

P = Γ(A, F) = [g(a_1 ⊙ F); g(a_2 ⊙ F); …; g(a_M ⊙ F)] = [f_1; f_2; …; f_M]

where Γ(A, F) is the bilinear attention pooling function, g(·) is a feature extraction function, a_1, …, a_M are the local attention maps, F is the feature map, and f_1, …, f_M are the local features;
step S43: local features are extracted from the partial feature map generated in step S41 by convolution or pooling operation, and a final feature matrix is composed of all the partial features.
A fine-grained sensitive image detection system based on the bilinear attention pooling mechanism comprises a memory, a processor and computer program instructions stored on the memory and executable by the processor; when the computer program instructions are executed by the processor, the method steps described above are implemented.
A computer-readable storage medium stores computer program instructions executable by a processor; when executed by the processor, the computer program instructions perform the method steps described above.
Compared with the prior art, the invention has the following beneficial effects:
1. The fine-grained sensitive image detection method based on the bilinear attention pooling mechanism can effectively solve the problem of extracting discriminative features from sensitive images in hard-sample scenes. It fully exploits the feature extraction strengths of deep learning: the network first learns simple features from a large dataset and then gradually learns more complex and abstract deep features, without relying on hand-crafted feature engineering, completing the intelligent auditing of fine-grained sensitive images under refined classification;
2. The method performs data cleaning based on the deep learning Open_NSFW model on the sensitive image dataset collected from the Internet, effectively screening and filtering out invalid data samples; compared with manual screening, this saves a large amount of time and labor cost;
3. Random data augmentation is inefficient on sensitive image datasets and, especially when the target is small, introduces a high proportion of background noise. The invention therefore proposes an attention-based sensitive image data augmentation method that performs attention cropping and attention dropping while preserving the salient discriminative regions of the sensitive image, effectively improving the usefulness of data augmentation;
4. Existing sensitive image auditing models offer few refined categories and perform poorly on hard-sample sensitive images. The invention therefore proposes an intelligent fine-grained sensitive image content auditing model based on a bilinear attention pooling mechanism. The model first generates a target feature map and attention maps representing the salient features of the target through weakly supervised learning, then aggregates the feature map and the attention maps through bilinear attention pooling to generate local feature maps, extracts local features through convolution and pooling, and finally combines all local features into a final feature to improve the model's discriminative power. This effectively enhances discrimination on hard-sample sensitive images and ultimately improves the accuracy of intelligent sensitive image content auditing.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to fig. 1, the present invention provides a fine-grained sensitive image detection method based on bilinear attention pooling, which includes the following steps:
step S1: acquiring a sensitive image, and performing data cleaning on the acquired sensitive image set to obtain an NSFW sensitive image training data set;
step S2: constructing a fine-grained sensitive image intelligent auditing network model, inputting an NSFW sensitive image training data set into the fine-grained sensitive image intelligent auditing network model, extracting features, and generating a feature map and an attention map;
step S3: performing attention mechanism-based data enhancement on the NSFW training set according to the obtained attention diagram, and performing attention clipping and attention discarding on the images on the basis of reserving the saliency discrimination areas in the sensitive images;
step S4: generating a local feature map by the aggregation feature map and the attention map through a bilinear attention pooling mechanism, extracting local features through convolution and pooling, and combining all the local features into a final feature;
step S5: and predicting the sensitive image category according to the final characteristics.
In this embodiment, the step S1 specifically includes:
step S11: acquiring 63486 images of the categories Drawings, Neutral, Sexy, Hentai and Porn in batches through URL addresses;
step S12: classifying the images of the Sexy and Porn categories using Yahoo's open-source Open_NSFW pornographic-content detection model, and adjusting, screening and filtering out sample images that do not belong to the corresponding category or are unavailable;
step S13: and dividing the sample set subjected to data cleaning into a training set and a testing set according to the proportion of 8:1, and constructing an NSFW data set.
In this embodiment, the step S2 specifically includes:
step S21: fine-tuning the pre-trained BiT-M model on the obtained NSFW sensitive image training dataset, using a ResNet50 network as the backbone to extract features and obtain a feature map F;
step S22: performing a 1 × 1 convolution on the feature map F to obtain the attention maps A;
step S23: an attention regularization loss is employed to weakly supervise the attention learning process.
The variance of local features belonging to the same object part is constrained so that each local feature f_k approaches its global feature center c_k ∈ R^{1×N} and the attention map A_k is activated on the same k-th object part. The attention regularization loss is applied only to the original image and is denoted L_A:

L_A = Σ_{k=1}^{M} ||f_k − c_k||²

c_k ← c_k + β(f_k − c_k)

where L_A is the loss function, M is the number of attention maps, f_k is the k-th local feature, c_k is its global feature center (initialized to zero and updated with a moving average), and β controls the update rate of c_k.
In this embodiment, the step S3 specifically includes:
step S31: for each training image in the NSFW sensitive image dataset, an attention map A_k of the image is randomly selected to guide the data augmentation process, and is normalized into the k-th augmentation map A*_k:

A*_k = (A_k − min(A_k)) / (max(A_k) − min(A_k))

where A_k is the attention map, A*_k is the augmentation map, min(A_k) is the minimum attention value, and max(A_k) is the maximum attention value;
step S32: enhancing maps by data
Figure BDA0002892114100000095
The significant characteristic region is enlarged, and local characteristics are extracted;
The crop mask C_k is obtained from A*_k by setting elements A*_k(i, j) greater than the threshold θ_c ∈ [0, 1] to 1 and the others to 0:

C_k(i, j) = 1 if A*_k(i, j) > θ_c, otherwise 0

where A*_k(i, j) is the discriminative element at position (i, j), C_k(i, j) is the crop mask at position (i, j), and θ_c is the threshold;
step S33: setting a bounding box B covering the entire selected positive field of the crop maskkAnd taking an image obtained by amplifying the area from the original image as input data of data enhancement; as the local scale of the object increases, the object can be observed more clearly to extract finer grained features;
step S34: attention regularization loss supervision Each attention map Ak∈RH×WThe part representing the same k-th object will be larger than the threshold value thetad∈[0,1]Of (2) element(s)
Figure BDA0002892114100000101
Set to 0 and the other elements to 1, the formula is as follows:
Figure BDA0002892114100000102
wherein
Figure BDA0002892114100000103
Is a discriminant element of the position (i, j), Dk(i, j) is the discard mask for position (i, j), θcIs a threshold value. Because the part of the kth target is removed from the sensitive image, the network supports the proposing of other distinguishing area characteristics, the target can be better seen, and the classification robustness and the positioning accuracy are finally improved.
In this embodiment, the step S4 specifically includes:
step S41: each attention map represents one part of the object; each attention map is multiplied element-wise with the feature map to generate a partial feature map, and discriminative local features are then further extracted through an additional feature extraction function to obtain the k-th attention-salient feature;
step S42: the local features f_k are stacked to form the object feature, represented by the part feature matrix P ∈ R^{M×N}; Γ(A, F) denotes the bilinear attention pooling of the attention maps A and the feature map F:

P = Γ(A, F) = [g(a_1 ⊙ F); g(a_2 ⊙ F); …; g(a_M ⊙ F)] = [f_1; f_2; …; f_M]

where Γ(A, F) is the bilinear attention pooling function, g(·) is a feature extraction function, a_1, …, a_M are the local attention maps, F is the feature map, and f_1, …, f_M are the local features;
step S43: local features are extracted from the partial feature map generated in step S41 by convolution or pooling operation, and a final feature matrix is composed of all the partial features.
Preferably, in this embodiment, there is also provided a fine-grained sensitive image detection system based on a bilinear attention pooling scheme, which includes a memory, a processor, and computer program instructions stored on the memory and capable of being executed by the processor, and when the computer program instructions are executed by the processor, the method steps as described in any one of the above are implemented.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the protection scope of the technical solution of the present invention.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (9)

1. A fine-grained sensitive image detection method based on a bilinear attention pooling mechanism is characterized by comprising the following steps:
step S1: acquiring a sensitive image, and performing data cleaning on the acquired sensitive image set to obtain an NSFW sensitive image training data set;
step S2: constructing a fine-grained sensitive image intelligent auditing network model, inputting an NSFW sensitive image training data set into the fine-grained sensitive image intelligent auditing network model, extracting features, and generating a feature map and an attention map;
step S3: performing attention mechanism-based data enhancement on the NSFW training set according to the obtained attention diagram, and performing attention clipping and attention discarding on the images on the basis of reserving the saliency discrimination areas in the sensitive images;
step S4: aggregating the feature map and the attention map through a bilinear attention pooling mechanism to generate a local feature map, extracting local features through convolution and pooling, and combining all the local features into a final feature;
step S5: and predicting the sensitive image category according to the final characteristics.
2. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein the step S1 specifically comprises:
step S11, acquiring five category images of Drawings, Neutral, Sexy, Hentai and Porn in batches through URL addresses;
step S12: classifying the images of the Sexy and Porn categories using Yahoo's open-source Open_NSFW pornographic-content detection model, and adjusting, screening and filtering out sample images that do not belong to the corresponding category or are unavailable;
step S13: and dividing the sample set subjected to data cleaning into a training set and a testing set according to the proportion of 8:1, and constructing an NSFW data set.
3. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein the step S2 specifically comprises:
step S21: fine-tuning the pre-trained BiT-M model according to the obtained NSFW sensitive image training data set, and extracting features by using a ResNet50 network as a main network to obtain a feature map F;
step S22: performing 1 × 1 convolution operation on the obtained feature diagram F to obtain an attention diagram A;
step S23: an attention regularization loss mechanism is employed to weakly supervise the attention learning based process.
4. The fine-grained sensitive image detection method based on the bilinear attention pooling mechanism of claim 3, wherein step S23 specifically comprises: the variance of local features belonging to the same object part is constrained so that each local feature f_k approaches its global feature center c_k ∈ R^{1×N} and the attention map A_k is activated on the same k-th object part; the attention regularization loss is applied only to the original image and is denoted L_A:

L_A = Σ_{k=1}^{M} ||f_k − c_k||²

c_k ← c_k + β(f_k − c_k)

where L_A is the loss function, M is the number of attention maps, f_k is the k-th local feature, c_k is its global feature center (initialized to zero and updated with a moving average), and β controls the update rate of c_k.
5. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein step S3 is specifically:
step S31: for each training image in the NSFW sensitive image dataset, an attention map A_k of the image is randomly selected to guide the data augmentation process, and is normalized into the k-th augmentation map A*_k:

A*_k = (A_k − min(A_k)) / (max(A_k) − min(A_k))

where A_k is the attention map, A*_k is the augmentation map, min(A_k) is the minimum attention value, and max(A_k) is the maximum attention value;
step S32: enhancing maps by data
Figure FDA0002892114090000035
Amplifying the significant characteristic region and extracting local characteristics;
step S33: setting a bounding box B covering the entire selected positive field of the crop maskkAnd enlarging the region from the original image as data-enhanced outputInputting data;
step S34: attention regularization loss supervision Each attention map Ak∈RH×WThe part representing the same k-th object will be larger than the threshold value thetad∈[0,1]Of (2) element(s)
Figure FDA0002892114090000036
Set to 0 and the other elements to 1, the formula is as follows:
Figure FDA0002892114090000037
wherein
Figure FDA0002892114090000041
Is a discriminant element of the position (i, j), Dk(i, j) is the discard mask for position (i, j), θcIs a threshold value.
6. The fine-grained sensitive image detection method based on the bilinear attention pooling mechanism of claim 5, wherein step S32 specifically comprises: the crop mask C_k is obtained from A*_k by setting elements A*_k(i, j) greater than the threshold θ_c ∈ [0, 1] to 1 and the others to 0:

C_k(i, j) = 1 if A*_k(i, j) > θ_c, otherwise 0

where A*_k(i, j) is the discriminative element at position (i, j), C_k(i, j) is the crop mask at position (i, j), and θ_c is the threshold.
7. The fine-grained sensitive image detection method based on bilinear attention pooling of claim 1, wherein the step S4 specifically comprises:
step S41: each attention map represents one part of a specific object; multiply each attention map element-wise with the feature map to generate a partial feature map, and then further extract discriminative local features through an additional feature extraction function to obtain the k-th attention saliency feature;
step S42: stack the local features f_k to generate the object feature, expressed as a part feature matrix P ∈ R^(M×N), where Γ(A, F) denotes the bilinear attention pooling process on the attention maps A and the feature map F:

P = Γ(A, F) = (g(a_1 ⊙ F), g(a_2 ⊙ F), …, g(a_M ⊙ F))^T = (f_1, f_2, …, f_M)^T

wherein Γ(A, F) is the bilinear attention pooling function, g(·) is the feature extraction function, ⊙ denotes element-wise multiplication, a_1, …, a_M are the local attention maps, F is the feature map, and f_1, …, f_M are the local features;
step S43: local features are extracted from the partial feature map generated in step S41 by convolution or pooling operation, and a final feature matrix is composed of all the partial features.
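The bilinear attention pooling of steps S41-S43 can be sketched as below, assuming g(·) is global average pooling (a common choice for the feature extraction function; the sizes M=32, N=768, H=W=14 are assumptions):

```python
import numpy as np

def bilinear_attention_pool(attn: np.ndarray, feat: np.ndarray) -> np.ndarray:
    """P = Gamma(A, F): f_k = g(a_k ⊙ F), here with g = global average pooling.

    attn: (M, H, W) attention maps; feat: (N, H, W) feature maps.
    Returns the part feature matrix P of shape (M, N).
    """
    # element-wise product a_k ⊙ F for every k via broadcasting -> (M, N, H, W)
    part_maps = attn[:, None, :, :] * feat[None, :, :, :]
    # spatial average pooling collapses H, W, stacking f_1..f_M into P
    return part_maps.mean(axis=(2, 3))

rng = np.random.default_rng(0)
A = rng.random((32, 14, 14))    # M = 32 attention maps
F = rng.random((768, 14, 14))   # N = 768 feature channels
P = bilinear_attention_pool(A, F)
```

Row k of P is the local feature f_k; the flattened matrix serves as the final feature for classification.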
8. A fine-grained sensitive image detection system based on a bilinear attention pooling mechanism, comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor, the computer program instructions, when executed by the processor, performing the method steps of any one of claims 1-7.
9. A computer-readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions, when executed by the processor, performing the method steps of any one of claims 1-7.
CN202110031134.XA 2021-01-11 2021-01-11 Fine-grained sensitive image detection method based on bilinear attention pooling mechanism Pending CN112699902A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110031134.XA CN112699902A (en) 2021-01-11 2021-01-11 Fine-grained sensitive image detection method based on bilinear attention pooling mechanism


Publications (1)

Publication Number Publication Date
CN112699902A true CN112699902A (en) 2021-04-23

Family

ID=75513854

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110031134.XA Pending CN112699902A (en) 2021-01-11 2021-01-11 Fine-grained sensitive image detection method based on bilinear attention pooling mechanism

Country Status (1)

Country Link
CN (1) CN112699902A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190362199A1 (en) * 2018-05-25 2019-11-28 Adobe Inc. Joint blur map estimation and blur desirability classification from an image
CN111489334A (en) * 2020-04-02 2020-08-04 Nuanwu Information Technology (Suzhou) Co., Ltd. Defective workpiece image identification method based on a convolutional attention neural network
CN111539469A (en) * 2020-04-20 2020-08-14 Southeast University Weakly supervised fine-grained image identification method based on a visual self-attention mechanism
CN112163465A (en) * 2020-09-11 2021-01-01 South China University of Technology Fine-grained image classification method, system, computer equipment and storage medium
CN112183602A (en) * 2020-09-22 2021-01-05 Tianjin University Multi-layer feature fusion fine-grained image classification method with parallel convolution blocks


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUA WEI ET AL.: "Two-Level Progressive Attention Convolutional Network for Fine-Grained Image Recognition", IEEE Access *
LI FENGLEI: "Fine-Grained Image Classification Based on Multi-Layer Weight-Adaptive Bilinear Pooling and Attention Mechanism", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610757A (en) * 2021-07-02 2021-11-05 Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology Fine-granularity-based medical X-ray lung image detection method
CN113627377A (en) * 2021-08-18 2021-11-09 Fuzhou University Cognitive radio spectrum sensing method and system based on attention-based CNN
CN113627377B (en) * 2021-08-18 2024-07-02 Fuzhou University Cognitive radio spectrum sensing method and system based on attention-based CNN
CN113936145A (en) * 2021-10-08 2022-01-14 Nanjing University of Information Science and Technology Fine-grained identification method based on attention map ranking
CN113936145B (en) * 2021-10-08 2024-06-11 Nanjing University of Information Science and Technology Fine-grained identification method based on attention map ranking
CN114708466A (en) * 2022-06-08 2022-07-05 Nanjing Zhiliansen Information Technology Co., Ltd. Fine-grained classification method and system for part anomalies, storage medium and computing equipment
CN114708466B (en) * 2022-06-08 2022-09-09 Nanjing Zhiliansen Information Technology Co., Ltd. Fine-grained classification method and system for part anomalies, storage medium and computing equipment
CN116458897A (en) * 2023-04-18 2023-07-21 Shandong Artificial Intelligence Institute Electrocardiosignal quality assessment method based on two-dimensional images and an attention mechanism
CN116458897B (en) * 2023-04-18 2024-01-26 Shandong Artificial Intelligence Institute Electrocardiosignal quality assessment method based on two-dimensional images and an attention mechanism

Similar Documents

Publication Publication Date Title
CN112699902A (en) Fine-grained sensitive image detection method based on bilinear attention pooling mechanism
CN109154978A (en) System and method for detecting plant disease
CN110728330A (en) Object identification method, device, equipment and storage medium based on artificial intelligence
CN110837768B (en) Online detection and identification method for rare animal protection
CN109800682B (en) Driver attribute identification method and related product
CN111986183B (en) Chromosome scattered image automatic segmentation and identification system and device
CN111179216B (en) Crop disease identification method based on image processing and convolutional neural network
CN114092450B (en) Real-time image segmentation method, system and device based on gastroscopy video
CN112417955A (en) Patrol video stream processing method and device
CN112101352A (en) Underwater alumen ustum state identification method and monitoring device, computer equipment and storage medium
Bai et al. Robust texture-aware computer-generated image forensic: Benchmark and algorithm
CN114140663A (en) Multi-scale attention and learning network-based pest identification method and system
CN117253071B (en) Semi-supervised target detection method and system based on multistage pseudo tag enhancement
CN106780286A Particle swarm optimization watermarking method based on blind watermark extraction
CN113344935B (en) Image segmentation method and system based on multi-scale difficulty perception
CN104899875A (en) Rapid image cooperation salient region monitoring method based on integration matching
CN110751034B (en) Pedestrian behavior recognition method and terminal equipment
CN113963178A (en) Method, device, equipment and medium for detecting infrared dim and small target under ground-air background
Das et al. Ayurvedic Medicinal Plant Identification System Using Embedded Image Processing Techniques
Selvy et al. A proficient clustering technique to detect CSF level in MRI brain images using PSO algorithm
Nair et al. Under water fish species recognition
Liu et al. Microscopic image analysis and recognition on pathological cells
CN118279596B (en) Underwater fish sunlight refraction image denoising method and system
Wu et al. Tumor segmentation on whole slide images: training or prompting?
Bao et al. An Improved Densenet-Cnn Model to Classify the Damage Caused by Cotton Aphid

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210423