CN110119688A - Image emotion classification method using a visual attention collaborative network - Google Patents
Image emotion classification method using a visual attention collaborative network
- Publication number
- CN110119688A CN110119688A CN201910311521.1A CN201910311521A CN110119688A CN 110119688 A CN110119688 A CN 110119688A CN 201910311521 A CN201910311521 A CN 201910311521A CN 110119688 A CN110119688 A CN 110119688A
- Authority
- CN
- China
- Prior art keywords
- emotion
- emotional semantic
- semantic classification
- visual attention
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image emotion classification method using a visual attention collaborative network, belonging to the technical field of computer vision. The method uses weakly supervised learning to detect the local regions of a picture that evoke emotion, extracts the local deep features corresponding to those emotion regions, and fuses them with global deep features to form the final feature vector used to classify emotion pictures. The visual attention collaborative convolutional neural network consists of shared shallow convolutional layers followed by two branches performing two tasks simultaneously: one generates the emotion region distribution map, the other generates a vector richer in semantic information, which is then fed into a classifier for recognition. The technique integrates image emotion region detection and image emotion classification into one unified deep network, achieves end-to-end training, and requires only picture-level emotion labels rather than pixel-level bounding-box annotations, thus greatly reducing the annotation burden.
Description
Technical field
The invention belongs to the technical field of computer vision and relates to a visual attention collaborative network for image emotion classification that can use global and local information simultaneously.
Background technique
Image emotion classification is currently attracting more and more attention in computer vision, and automatically assessing the emotion of pictures has important applications in fields such as education, environment, and business. With the development of deep learning, deep networks have been applied to predicting image emotion. However, this rather abstract task still faces many challenges: 1) because humans show strong subjectivity when recognizing emotions, image emotion classification is more challenging than traditional visual recognition tasks; 2) annotations more detailed than picture-level tags would yield better recognition performance, but accurately annotating picture regions is far more laborious than picture-level labeling, and different viewers feel different emotions for the same region, so detailed and accurate annotation is hard to achieve.
Many methods have been proposed for image emotion prediction. Early on, inspired by psychological theories and art principles, many methods designed combinations of hand-crafted features. In 2010, Machajdik et al. (document 1) used combinations of low-level features, such as color, texture, and composition, to represent affective content. In 2016, Zhao et al. (document 2) built a personalized emotion prediction system with a multi-task hypergraph model that jointly considers personal life background, social environment, location information, past mood, and so on, and released the IESN dataset, pioneering work toward the emotion subjectivity problem. To address the shortage of training data, You et al. (document 3) in 2015 collected a large amount of weakly labeled web data, iteratively screened the higher-confidence data according to the prediction results to train the network, and finally fine-tuned the model parameters on the target dataset, improving the model's capability. To address emotion ambiguity, Yang et al. (document 4) in 2017 proposed a multi-task convolutional neural network combining classification learning and label distribution learning to extract deep features. Existing convolutional-neural-network-based methods mostly extract deep features from the whole picture and seldom consider the contribution of local features to emotion prediction. In 2016, Sun et al. (document 5) found emotion regions with an object proposal algorithm and combined them with deep features for classification. However, this method is suboptimal, because the object proposal algorithm is separate from the prediction task, and regions without objects are likely to be discarded from the very start.
With the success of deep learning in object recognition, many weakly supervised convolutional neural network methods have been applied to object detection. In 2016, Zhou et al. (document 6) used a global average pooling layer after the last convolutional layer to process class-specific response information, while Durand et al. (document 7) proposed the WILDCAT method, which learns multiple local features related to different classes. Considering object information, Zhu et al. (document 8) proposed the "soft proposal network", which generates proposal regions and fuses them with the feature maps to enrich picture features.
Inspired by the developments in the above fields, we propose a visual attention collaborative network for image emotion classification that performs emotion region detection and emotion classification simultaneously, fuses global and local information, and is trained end to end.
Document:
1. Affective image classification using features inspired by psychology and art theory. In ACM MM, 2010.
2. Predicting personalized emotion perceptions of social images. In ACM MM, 2016.
3. Robust image sentiment analysis using progressively trained and domain transferred deep networks. In AAAI, 2015.
4. Joint image emotion classification and distribution learning via deep convolutional neural network. In IJCAI, 2017.
5. Discovering affective regions in deep convolutional neural networks for visual sentiment prediction. In ICME, 2016.
6. Learning deep features for discriminative localization. In CVPR, 2016.
7. Wildcat: Weakly supervised learning of deep ConvNets for image classification, pointwise localization and segmentation. In CVPR, 2017.
8. Soft proposal networks for weakly supervised object localization. In ICCV, 2017.
Summary of the invention
The technical problem to be solved by the invention is to detect, with only picture-level labels and using weakly supervised learning, the local regions of a picture that evoke emotion, to extract the local deep features corresponding to the emotion regions, and to fuse them with global deep features into a final feature vector for classifying emotion pictures.
To achieve the object of the present invention, it is realized by the following technical scheme:
A. Emotion picture data are input into a fully convolutional network after data-augmentation preprocessing such as flipping and cropping, generating convolutional feature response maps;
B. The feature response maps with suitable spatial resolution generated in A are fed into two network branches, where the detection branch uses a cross-spatial pooling strategy to compute a weighted sum of the information of each emotion class in the feature maps, generating the final emotion distribution map;
C. In the classification branch, the deep features generated in A and the emotion region distribution map generated in B interact element-wise, highlighting the main emotion-evoking regions by assigning weights;
D. Global and local information are fused to form a feature vector rich in semantic information;
E. The feature vector is fed into a classifier to perform emotion classification on the picture.
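The data flow of the five steps above can be sketched with toy numpy tensors. This is a hedged illustration only, not the patented implementation: the backbone features, detector maps, classifier weights, and all sizes below are randomly generated stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

C, k = 8, 4          # emotion classes and detectors per class (assumed values)
n, w, h = 16, 7, 7   # channels and spatial size of the feature response map F

# A. Fully convolutional backbone output (random stand-in for conv features)
F = rng.random((n, w, h))

# B. Detection branch: k maps per class are averaged within each class, then
#    the C class maps are combined by a weighted sum into the emotion map M.
class_maps = rng.random((C, k, w, h)).mean(axis=1)   # (C, w, h)
v = class_maps.reshape(C, -1).max(axis=1)            # weights via max pooling
M = np.tensordot(v, class_maps, axes=1)              # (w, h) emotion map

# C. Classification branch: element-wise interaction of M with every channel
F_local = F * M[None, :, :]                          # (n, w, h)

# D. Fuse global and local information into one semantic feature vector
feat = (F + F_local).mean(axis=(1, 2))               # global average pooling

# E. A linear classifier over the fused vector (placeholder weights)
scores = rng.random((C, n)) @ feat
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```

The sketch only fixes the tensor shapes and the order of operations; in the patent both branches share the shallow convolutional layers and are trained jointly.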
In step B, each emotion label has dedicated detectors that scan the feature response maps generated in A, so that related regions produce high responses. The detection results of all classes are then combined by a weighted sum, and the model parameters are updated according to the loss function of this branch until convergence.
Further, the invention integrates the network model used for region detection and the network model used for classification into one framework, achieving end-to-end training.
Further, the emotion region distribution map generated by the region detection branch fully considers the feature information corresponding to each emotion class: the per-class emotion distribution maps are combined by a weighted sum into the final emotion distribution map, and the weights are continually adjusted according to the classification results, so that the regions corresponding to the dominant emotion of the picture are effectively highlighted in the map.
Further, only the class label of the picture is needed during training; no annotation of the emotion regions is required, greatly reducing the burden of labeling data.
Further, the invention adds global and local information to obtain the final feature, highlighting the important information while avoiding information loss.
The invention has the following beneficial effects: with only picture-level labels, the method detects the emotion-evoking regions in a picture, extracts the corresponding local features, and fuses them with global deep features to obtain a feature vector merging global and local information, improving the final classification performance and surpassing existing state-of-the-art methods on mainstream datasets. The method transfers easily to emotion classification on a variety of convolutional neural networks: only the fully connected layers need to be replaced by fully convolutional layers, and the learning parameters and training batch size must meet the requirements of the selected network architecture. In summary, the method offers a completely new solution for the image emotion classification task, and we believe it can perform well on other convolutional neural networks and emotion datasets.
Detailed description of the invention
The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments:
Fig. 1 is the flow chart of the image emotion classification method using the visual attention collaborative network.
Fig. 2 is the schematic diagram of the image emotion classification method using the visual attention collaborative network, where: k is the number of feature maps per emotion class; C is the number of emotion classes; the size of a feature map is n*w*h; and MOF denotes the fusion of the emotion distribution map M with the feature F.
Fig. 3 is the schematic diagram of the emotion region distribution map generation process.
Specific embodiment
Referring to Fig. 1, which shows the flow chart of the image emotion classification method using the visual attention collaborative network, the steps indicated in the figure are:
A. The image data are resized, augmented, and fed into the model to generate the feature response map F. The original model obtains its initial parameters by pre-training on the large-scale ImageNet dataset; the last pooling layer and the fully connected layer of the model are replaced by two branches (a detection branch and a classification branch);
B. To detect the emotion regions in the picture, a 1×1 convolutional layer first produces class-specific information: with C emotion classes and k detectors per class, the processed feature map has kC channels. The k feature maps belonging to the same class are averaged to obtain that class's emotion distribution map, and the C class maps are then combined by a weighted sum with weights $v_c$, obtained by a max-pooling operation, to give the final emotion distribution map M. The formula is expressed as:

$$M_c=\frac{1}{k}\sum_{i=1}^{k}f_{c,i},\qquad v_c=\max_{(x,y)}M_c(x,y),\qquad M=\sum_{c=1}^{C}v_c\,M_c,$$

where $f_{c,i}$ denotes the emotion distribution map obtained by the i-th detector of emotion class c.
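The map construction just described can be sketched with toy numpy tensors; this is an illustrative sketch under assumed sizes, not the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
C, k, w, h = 8, 4, 7, 7                     # assumed toy sizes

# f[c, i] is the response map of the i-th detector for emotion class c,
# i.e. the kC channels produced by the 1x1 convolutional layer.
f = rng.random((C, k, w, h))

M_c = f.mean(axis=1)                        # average the k maps within a class
v = M_c.reshape(C, -1).max(axis=1)          # per-class weight via max pooling
M = (v[:, None, None] * M_c).sum(axis=0)    # weighted sum -> final map M
```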
C. The emotion distribution map M and the feature response map F are fused with a Hadamard product, encoding a feature richer in semantic information, which is fed into the classifier through global average pooling; the model parameters are updated by training with a softmax cross-entropy loss function.
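Step C can likewise be sketched in numpy. The linear classifier, the label, and all sizes below are placeholders, not the actual trained model.

```python
import numpy as np

rng = np.random.default_rng(2)
n, w, h, C = 16, 7, 7, 8                  # assumed toy sizes

F = rng.random((n, w, h))                 # feature response map from step A
M = rng.random((w, h))                    # emotion distribution map from step B

F_att = F * M[None, :, :]                 # Hadamard product couples M with F
feat = F_att.mean(axis=(1, 2))            # global average pooling -> vector

logits = rng.random((C, n)) @ feat        # placeholder linear classifier
label = 3                                 # placeholder picture-level label

# softmax cross-entropy, computed stably via log-sum-exp
m = logits.max()
log_probs = logits - (m + np.log(np.exp(logits - m).sum()))
loss = -log_probs[label]
```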
Fig. 2 illustrates the schematic diagram of the method, giving a vivid description of the key problems, the training process, and the inputs and outputs of the algorithm at each stage. Fig. 2 expresses the same content as Fig. 1 at a different level of abstraction and mainly helps to understand the various parts of Fig. 1.
Fig. 3 illustrates the generation of the emotion region distribution map: each class has k feature response maps, which are average-pooled into one feature map per class; the per-class maps are then combined by a weighted sum to obtain the final emotion region distribution map.
Claims (6)
1. An image emotion classification method using a visual attention collaborative network, characterized in that the method comprises the following steps:
A. inputting emotion picture data into a fully convolutional network after data-augmentation preprocessing such as flipping and cropping, generating convolutional feature response maps;
B. feeding the feature response maps with suitable spatial resolution generated in A into two network branches, where the detection branch uses a cross-spatial pooling strategy to compute a weighted sum of the information of each emotion class in the feature maps, generating the final emotion distribution map;
C. in the classification branch, letting the deep features generated in A and the emotion region distribution map generated in B interact element-wise, highlighting the main emotion-evoking regions by assigning weights;
D. fusing global and local information to form a feature vector rich in semantic information;
E. feeding the feature vector into a classifier to perform emotion classification on the picture.
2. The image emotion classification method using a visual attention collaborative network according to claim 1, characterized in that the network model used for region detection and the network model used for classification are integrated into one framework, achieving end-to-end training.
3. The image emotion classification method using a visual attention collaborative network according to claim 1, characterized in that the emotion region distribution map generated by the region detection branch fully considers the feature information corresponding to each emotion class: the per-class emotion distribution maps are combined by a weighted sum into the final emotion distribution map, and the weights are continually adjusted according to the classification results, so that the regions corresponding to the dominant emotion of the picture are effectively highlighted in the map.
4. The image emotion classification method using a visual attention collaborative network according to claim 1, characterized in that only the class label of the picture is needed during training; no annotation of the emotion regions is required, greatly reducing the burden of labeling data.
5. The image emotion classification method using a visual attention collaborative network according to claim 1, characterized in that global and local information are added to obtain the final feature, highlighting the important information while avoiding information loss.
6. The image emotion classification method using a visual attention collaborative network according to claim 1, characterized in that in step B, each emotion label has dedicated detectors that scan the feature response maps generated in A, so that related regions produce high responses; the detection results of all classes are then combined by a weighted sum, and the model parameters are updated according to the loss function of this branch until convergence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910311521.1A CN110119688A (en) | 2019-04-18 | 2019-04-18 | Image emotion classification method using a visual attention collaborative network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910311521.1A CN110119688A (en) | 2019-04-18 | 2019-04-18 | Image emotion classification method using a visual attention collaborative network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110119688A true CN110119688A (en) | 2019-08-13 |
Family
ID=67521142
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910311521.1A Pending CN110119688A (en) | 2019-04-18 | 2019-04-18 | Image emotion classification method using a visual attention collaborative network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110119688A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705490A (en) * | 2019-10-09 | 2020-01-17 | 中国科学技术大学 | Visual emotion recognition method |
CN110796150A (en) * | 2019-10-29 | 2020-02-14 | 中山大学 | Image emotion recognition method based on emotion significant region detection |
CN110827312A (en) * | 2019-11-12 | 2020-02-21 | 北京深境智能科技有限公司 | Learning method based on cooperative visual attention neural network |
CN111026898A (en) * | 2019-12-10 | 2020-04-17 | 云南大学 | Weak supervision image emotion classification and positioning method based on cross space pooling strategy |
CN111311364A (en) * | 2020-02-13 | 2020-06-19 | 山东大学 | Commodity recommendation method and system based on multi-mode commodity comment analysis |
CN111832573A (en) * | 2020-06-12 | 2020-10-27 | 桂林电子科技大学 | Image emotion classification method based on class activation mapping and visual saliency |
CN112836718A (en) * | 2020-12-08 | 2021-05-25 | 上海大学 | Fuzzy knowledge neural network-based image emotion recognition method |
CN114626454A (en) * | 2022-03-10 | 2022-06-14 | 华南理工大学 | Visual emotion recognition method integrating self-supervision learning and attention mechanism |
CN116719930A (en) * | 2023-04-28 | 2023-09-08 | 西安工程大学 | Multi-mode emotion analysis method based on visual attention |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107341506A (en) * | 2017-06-12 | 2017-11-10 | 华南理工大学 | A kind of Image emotional semantic classification method based on the expression of many-sided deep learning |
US20170344880A1 (en) * | 2016-05-24 | 2017-11-30 | Cavium, Inc. | Systems and methods for vectorized fft for multi-dimensional convolution operations |
CN108427740A (en) * | 2018-03-02 | 2018-08-21 | 南开大学 | A kind of Image emotional semantic classification and searching algorithm based on depth measure study |
US10140515B1 (en) * | 2016-06-24 | 2018-11-27 | A9.Com, Inc. | Image recognition and classification techniques for selecting image and audio data |
- 2019-04-18 CN CN201910311521.1A patent/CN110119688A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170344880A1 (en) * | 2016-05-24 | 2017-11-30 | Cavium, Inc. | Systems and methods for vectorized fft for multi-dimensional convolution operations |
US10140515B1 (en) * | 2016-06-24 | 2018-11-27 | A9.Com, Inc. | Image recognition and classification techniques for selecting image and audio data |
CN107341506A (en) * | 2017-06-12 | 2017-11-10 | 华南理工大学 | A kind of Image emotional semantic classification method based on the expression of many-sided deep learning |
CN108427740A (en) * | 2018-03-02 | 2018-08-21 | 南开大学 | A kind of Image emotional semantic classification and searching algorithm based on depth measure study |
Non-Patent Citations (1)
Title |
---|
JUFENG YANG: "Weakly Supervised Coupled Networks for Visual Sentiment Analysis", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705490B (en) * | 2019-10-09 | 2022-09-02 | 中国科学技术大学 | Visual emotion recognition method |
CN110705490A (en) * | 2019-10-09 | 2020-01-17 | 中国科学技术大学 | Visual emotion recognition method |
CN110796150A (en) * | 2019-10-29 | 2020-02-14 | 中山大学 | Image emotion recognition method based on emotion significant region detection |
CN110796150B (en) * | 2019-10-29 | 2022-09-16 | 中山大学 | Image emotion recognition method based on emotion significant region detection |
CN110827312A (en) * | 2019-11-12 | 2020-02-21 | 北京深境智能科技有限公司 | Learning method based on cooperative visual attention neural network |
CN110827312B (en) * | 2019-11-12 | 2023-04-28 | 北京深境智能科技有限公司 | Learning method based on cooperative visual attention neural network |
CN111026898A (en) * | 2019-12-10 | 2020-04-17 | 云南大学 | Weak supervision image emotion classification and positioning method based on cross space pooling strategy |
CN111311364A (en) * | 2020-02-13 | 2020-06-19 | 山东大学 | Commodity recommendation method and system based on multi-mode commodity comment analysis |
CN111832573B (en) * | 2020-06-12 | 2022-04-15 | 桂林电子科技大学 | Image emotion classification method based on class activation mapping and visual saliency |
CN111832573A (en) * | 2020-06-12 | 2020-10-27 | 桂林电子科技大学 | Image emotion classification method based on class activation mapping and visual saliency |
CN112836718A (en) * | 2020-12-08 | 2021-05-25 | 上海大学 | Fuzzy knowledge neural network-based image emotion recognition method |
CN114626454A (en) * | 2022-03-10 | 2022-06-14 | 华南理工大学 | Visual emotion recognition method integrating self-supervision learning and attention mechanism |
CN116719930A (en) * | 2023-04-28 | 2023-09-08 | 西安工程大学 | Multi-mode emotion analysis method based on visual attention |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110119688A (en) | Image emotion classification method using a visual attention collaborative network | |
Xiao et al. | A framework for quantitative analysis and differentiated marketing of tourism destination image based on visual content of photos | |
Shetty et al. | Not Using the Car to See the Sidewalk--Quantifying and Controlling the Effects of Context in Classification and Segmentation | |
CN107134144B (en) | A kind of vehicle checking method for traffic monitoring | |
CN107133955B (en) | A kind of collaboration conspicuousness detection method combined at many levels | |
CN108537269B (en) | Weak interactive object detection deep learning method and system thereof | |
CN109255334A (en) | Remote sensing image terrain classification method based on deep learning semantic segmentation network | |
CN106408030B (en) | SAR image classification method based on middle layer semantic attribute and convolutional neural networks | |
CN109002834A (en) | Fine granularity image classification method based on multi-modal characterization | |
CN109740686A (en) | A kind of deep learning image multiple labeling classification method based on pool area and Fusion Features | |
CN106537390B (en) | Identify the presentation style of education video | |
CN108427740B (en) | Image emotion classification and retrieval algorithm based on depth metric learning | |
CN111985532B (en) | Scene-level context-aware emotion recognition deep network method | |
CN113159826B (en) | Garment fashion element prediction system and method based on deep learning | |
CN106960176A (en) | A kind of pedestrian's gender identification method based on transfinite learning machine and color characteristic fusion | |
CN111832573A (en) | Image emotion classification method based on class activation mapping and visual saliency | |
CN102945372B (en) | Classifying method based on multi-label constraint support vector machine | |
Lin et al. | Two stream active query suggestion for active learning in connectomics | |
Fu et al. | Personality trait detection based on ASM localization and deep learning | |
CN102542590B (en) | High-resolution SAR (Synthetic Aperture Radar) image marking method based on supervised topic model | |
Kobayashi et al. | Aesthetic design based on the analysis of questionnaire results using deep learning techniques | |
CN113705301A (en) | Image processing method and device | |
Cao et al. | SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model | |
Wang et al. | Human interaction understanding with joint graph decomposition and node labeling | |
CN115115979A (en) | Identification and replacement method of component elements in video and video recommendation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 2019-08-13 |