CN111797936A - Image emotion classification method and device based on saliency detection and multi-level feature fusion

Image emotion classification method and device based on saliency detection and multi-level feature fusion

Info

Publication number
CN111797936A
Authority
CN
China
Prior art keywords
emotion
image
branch
feature map
images
Prior art date
Legal status
Granted
Application number
CN202010670001.2A
Other languages
Chinese (zh)
Other versions
CN111797936B (en)
Inventor
邓泽林
朱其然
Current Assignee
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Changsha University of Science and Technology
Priority to CN202010670001.2A
Publication of CN111797936A
Application granted
Publication of CN111797936B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an image emotion classification method based on saliency detection and multi-level feature fusion. First, a saliency detection network is used to extract a saliency map of an emotion image. The feature map of the saliency map is then modulated onto the feature map of the corresponding emotion image through a twin neural network, so that the Inception-v4 network pays more attention to the emotion-expressing regions of the emotion image, which effectively improves the image emotion classification precision. Finally, the Inception-v4 network classifies the modulated feature map produced by the twin neural network to accurately obtain the emotion category corresponding to the emotion image. The image emotion classification method provided by the invention can accurately locate the emotion-expressing regions in an emotion image and, through feature modulation, makes the classification network give higher attention to those regions, thereby effectively improving the classification accuracy.

Description

Image emotion classification method and device based on saliency detection and multi-level feature fusion
Technical Field
The invention relates to the technical field of image emotion classification, and in particular to an image emotion classification method and device based on saliency detection and multi-level feature fusion.
Background
With the development and popularization of photography and social networks, people have become accustomed to sharing experiences and expressing opinions on the Internet through images and videos. This creates a pressing need for processing and understanding image and video content. Compared with low-level visual appearance, humans perceive and understand images better at the level of high-level semantics and emotion. In recent years, emotion-level analysis of image content has received wide attention in psychology, affective computing, and the multimedia community. Analyzing images at the emotion level is also a key part of image content analysis and can be widely applied to human-computer interaction, public opinion analysis, image retrieval, and other areas.
The expression of image emotion is mainly determined by the emotion regions of the image, such as the objects it contains, but conventional methods do not pay particular attention to these emotion regions and therefore cannot acquire more discriminative emotion features.
Disclosure of Invention
The invention provides an image emotion classification method and device based on saliency detection and multi-level feature fusion, which overcome defects of the prior art such as insufficient attention to emotion regions.
In order to achieve the above object, the present invention provides an image emotion classification method based on saliency detection and multi-level feature fusion, the image emotion classification method includes:
constructing an emotion image set; the emotion image set comprises marked emotion images;
establishing a training set and a verification set according to the emotion image set, and extracting saliency maps of the emotion images in the training set and the verification set by using a saliency detection network;
inputting the emotion images and the saliency maps in the training set into a pre-constructed image emotion classification model; the image emotion classification model comprises a twin neural network and an Inception-v4 network;
training the twin neural network by using the emotion images and the saliency maps in the training set, and performing feature modulation on the corresponding emotion images by using the trained twin neural network to obtain a modulated feature map;
training the Inception-v4 network by using the modulated feature map to obtain a trained image emotion classification model;
verifying the trained image emotion classification model by using the emotion images and the saliency maps in the verification set;
and inputting the image to be classified and its saliency map into the verified image emotion classification model for classification to obtain the image emotion category.
In order to achieve the above object, the present invention further provides an image emotion classification apparatus based on saliency detection and multi-level feature fusion, including:
the image set construction module is used for constructing an emotion image set; the emotion image set comprises marked emotion images;
the saliency map acquisition module is used for establishing a training set and a verification set according to the emotion image set and extracting saliency maps of the emotion images in the training set and the verification set by using a saliency detection network;
the model training module is used for inputting the emotion images and the saliency maps in the training set into a pre-constructed image emotion classification model; the image emotion classification model comprises a twin neural network and an Inception-v4 network; training the twin neural network by using the emotion images and the saliency maps in the training set, and performing feature modulation on the corresponding emotion images by using the trained twin neural network to obtain a modulated feature map; training the Inception-v4 network by using the modulated feature map to obtain a trained image emotion classification model;
the model verification module is used for verifying the trained image emotion classification model by utilizing the emotion images and the saliency maps in the verification set;
and the classification module is used for inputting the images to be classified and the saliency maps of the images to be classified into the verified image emotion classification model for classification, so as to obtain the image emotion classes.
To achieve the above object, the present invention further provides a computer device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method when executing the computer program.
To achieve the above object, the present invention further proposes a computer-readable storage medium having a computer program stored thereon, which, when being executed by a processor, implements the steps of the above method.
Compared with the prior art, the invention has the beneficial effects that:
the image emotion classification method based on significance detection and multi-level feature fusion firstly extracts a significance map of an emotion image by utilizing a significance detection network; the feature map of the significance map is modulated to the feature map of the corresponding emotion image through the twin neural network, so that the inclusion-v 4 network has more attention to the emotion expression area of the emotion image, and the image emotion classification precision is effectively improved; and finally, classifying the modulation characteristic diagram modulated by the twin neural network by utilizing an increment-v 4 network so as to accurately obtain the emotion category corresponding to the emotion image. The image emotion classification method provided by the invention can accurately position the emotion expression area in the emotion image, and realizes that the classification network gives higher attention to the emotion expression area of the emotion image in a characteristic modulation mode, thereby effectively improving the accuracy of the image emotion classification method.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of an image emotion classification method based on saliency detection and multi-level feature fusion provided by the present invention;
FIG. 2 is an overall structure diagram of the image emotion classification method based on saliency detection and multi-level feature fusion provided by the present invention;
FIG. 3 is a diagram of a structure of an image emotion classification model provided by the present invention;
FIG. 4 is an emotion wheel and emotion distance map.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In addition, the technical solutions in the embodiments of the present invention may be combined with each other, provided that the combination can be realized by those skilled in the art; when the technical solutions are contradictory or cannot be realized, such a combination should be considered not to exist and falls outside the protection scope of the present invention.
The invention provides an image emotion classification method based on saliency detection and multi-level feature fusion, which comprises the following steps:
101: constructing an emotion image set; the emotion image set comprises marked emotion images;
the emotion images in the emotion image set are from an international emotion image system subset (iasa), Abstract data set (Abstract), art image set (ArtPhoto), and a plurality of weakly labeled emotion images collected from flickers and instagrams, and the like.
Marking means that each emotion image in the emotion image set has been labeled in advance with its emotion category.
The emotion categories include anger, disgust, fear, sadness, amusement, awe, contentment, and excitement.
102: establishing a training set and a verification set according to the emotion image set, and extracting saliency maps of the emotion images in the training set and the verification set by using a saliency detection network;
For the saliency detection network, see Liu J, Hou Q, Cheng M M, et al. A Simple Pooling-Based Design for Real-Time Salient Object Detection [J]. 2019.
103: inputting the emotion images and saliency maps in the training set into a pre-constructed image emotion classification model; the image emotion classification model comprises a twin neural network and an Inception-v4 network;
The twin neural network (a Siamese network) consists of two identical neural networks with shared weights.
The Inception-v4 network is a classification network with excellent performance.
104: training the twin neural network by using the emotion images and saliency maps in the training set, and performing feature modulation on the corresponding emotion images by passing the saliency maps in the training set through the trained twin neural network to obtain a modulated feature map;
The feature map of the saliency map is modulated onto the feature map of the corresponding emotion image through the twin neural network, so that the Inception-v4 network pays more attention to the emotion-expressing regions of the emotion image, which effectively improves the image emotion classification precision.
105: training the Inception-v4 network by using the modulated feature map to obtain a trained image emotion classification model;
106: verifying the trained image emotion classification model by using the emotion images and the saliency maps in the verification set;
107: inputting the image to be classified and its saliency map into the verified image emotion classification model for classification to obtain the image emotion category.
The image emotion classification method based on saliency detection and multi-level feature fusion first extracts a saliency map of an emotion image by using a saliency detection network; the feature map of the saliency map is then modulated onto the feature map of the corresponding emotion image through the twin neural network, so that the Inception-v4 network pays more attention to the emotion-expressing regions of the emotion image, which effectively improves the image emotion classification precision; finally, the Inception-v4 network classifies the modulated feature map produced by the twin neural network to accurately obtain the emotion category corresponding to the emotion image. The image emotion classification method provided by the invention can accurately locate the emotion-expressing regions in an emotion image and, through feature modulation, makes the classification network give higher attention to those regions, thereby effectively improving the classification accuracy.
In one embodiment, for step 103, the twin neural network, shown in fig. 3, comprises a first branch and a second branch, wherein the first branch is used for feature extraction from the emotion image and the second branch is used for feature extraction from the saliency map; the first branch and the second branch each consist of four convolutional layers (BasicConv2d) with the same convolution kernel size; the outputs of the second and fourth convolutional layers in the second branch are connected to the outputs of the second and fourth convolutional layers in the first branch.
In order to ensure that the features of the emotion image and the features of the saliency map keep similar feature spaces as they propagate through the network, both branches of the twin neural network consist of four convolutional layers with convolution kernels of the same size. For the first branch, the input and output channels of the four convolutional layers are all 3; for the second branch, the input and output channels of the four convolutional layers are all 1.
After the saliency map of the emotion image is obtained through the saliency detection network, a twin neural network is introduced so that the saliency map can successfully constrain the classification model to give higher attention to the emotion regions of the emotion image. In the twin neural network, the saliency map in the second branch is used to modulate the emotion image in the first branch, enhancing the attention paid to the emotion regions of the emotion image.
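For illustration, a minimal PyTorch sketch of this two-branch structure is given below. The composition of BasicConv2d (convolution, batch normalization, ReLU), the kernel size of 3 and the padding are assumptions; the patent only fixes four convolutional layers per branch, with 3 input/output channels in the image branch and 1 in the saliency branch.

```python
import torch
import torch.nn as nn

class BasicConv2d(nn.Module):
    """Assumed composition of BasicConv2d: convolution + batch norm + ReLU."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return torch.relu(self.bn(self.conv(x)))

class TwinBranches(nn.Module):
    """First branch: 3-channel emotion image; second branch: 1-channel saliency map.
    Each branch is a stack of four convolutional layers with identical kernel size."""
    def __init__(self):
        super().__init__()
        self.image_branch = nn.ModuleList([BasicConv2d(3, 3) for _ in range(4)])
        self.saliency_branch = nn.ModuleList([BasicConv2d(1, 1) for _ in range(4)])

branches = TwinBranches()
image = torch.randn(1, 3, 224, 224)        # emotion image a(x, y, z)
saliency = torch.randn(1, 1, 224, 224)     # saliency map b(x, y)
t1 = branches.image_branch[0](image)       # first-stage image feature
s1 = branches.saliency_branch[0](saliency) # first-stage saliency feature
```

The outputs of the second and fourth saliency-branch layers are later combined with the matching image-branch outputs, which is the feature modulation described for step 104 below.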
In a certain embodiment, for step 104, feature modulation is performed on the corresponding emotion images by passing the saliency maps in the training set through the trained twin neural network to obtain a modulated feature map, as follows:
401: inputting an emotion image a(x, y, z) from the training set into the first branch of the trained twin neural network and the corresponding saliency map b(x, y) into the second branch, where x and y are the spatial coordinates and z indexes the three color channels of the emotion image;
402: obtaining the feature map S output by the second convolutional layer in the second branch (S ∈ R^(w×h), where w is the width and h the height of the feature map), multiplying its corresponding elements with those of the feature map T output by the second convolutional layer in the first branch (T ∈ R^(c×w×h), where c is the number of channels) to obtain the feature map H (H ∈ R^(c×w×h)), and adding the corresponding elements of H and T to obtain the feature map G;
The multiplication modulates the feature map T with the feature map S;
The addition re-emphasizes the features of the original (input) emotion image, so that the features of the feature map T are not completely suppressed after the multiplication.
403: inputting the feature map G into the third convolutional layer of the first branch and the feature map S into the third convolutional layer of the second branch;
404: obtaining the feature map S' output by the fourth convolutional layer in the second branch, multiplying its corresponding elements with those of the feature map T' output by the fourth convolutional layer in the first branch to obtain the feature map H', and adding the corresponding elements of H' and T' to obtain the modulated feature map F.
The feature modulation makes the salient regions of the feature map T obtain larger response values, i.e., the Inception-v4 network pays more attention to the emotion-expressing regions of the emotion image. To make the effect pronounced, the modulation is performed twice in succession.
In another embodiment, because the feature maps S and T have different numbers of channels, only a single feature map S is considered and it is applied to every channel of T. The feature map G is calculated as:
G=f(T(w,h,c)*[S(w,h)+1]) (1)
where f denotes the Sigmoid activation function, which constrains the result to the range (0, 1) and thus keeps the feature modulation within a suitable range; T(w,h,c) denotes the feature map output by the second convolutional layer in the first branch; S(w,h) denotes the feature map output by the second convolutional layer in the second branch; and w, h and c denote respectively the width, height and number of channels of the feature maps.
The modulated feature map F is calculated as:
F=f(T′(w,h,c)*[S′(w,h)+1]) (2)
where T′(w,h,c) denotes the feature map output by the fourth convolutional layer in the first branch and S′(w,h) denotes the feature map output by the fourth convolutional layer in the second branch.
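A minimal sketch of the modulation in formulas (1) and (2), assuming PyTorch broadcasting of the single-channel map S over the channels of T (the tensor shapes below are illustrative rather than taken from the patent):

```python
import torch

def modulate(T: torch.Tensor, S: torch.Tensor) -> torch.Tensor:
    """Formulas (1)/(2): G = f(T * [S + 1]), with f the Sigmoid.
    T: image-branch feature map, shape (N, c, h, w).
    S: saliency-branch feature map, shape (N, 1, h, w); broadcast over the c channels of T.
    Expanding the product gives T*S + T, i.e. the element-wise product H plus T
    described in steps 402/404."""
    return torch.sigmoid(T * (S + 1.0))

T = torch.randn(2, 3, 64, 64)  # output of the 2nd (or 4th) conv layer in the first branch
S = torch.randn(2, 1, 64, 64)  # output of the matching conv layer in the second branch
G = modulate(T, S)             # modulated map; values lie in (0, 1) because of the Sigmoid
```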
In a further embodiment, the Inception-v4 network is a multi-branch structure as shown in fig. 3 and comprises, in sequence, three convolutional layers (BasicConv2d), one Mixed_3a module, one Mixed_4a module, one Mixed_5a module, four Inception_A modules, one Reduction_A module, seven Inception_B modules, one Reduction_B module, three Inception_C modules, one average pooling layer (AvgPooling) and one fully connected layer (FullyConnected); a side-branch structure consisting of one convolutional layer (BasicConv2d) is introduced after each of the Mixed_5a module, the Reduction_A module and the Reduction_B module; the outputs of the three side-branch structures are connected to a fully connected layer, which fuses the side-branch features output by the three side-branch structures and outputs the fused feature to the output of the average pooling layer.
The three side-branch structures each consist of a convolutional layer with 256 output channels, a kernel size of 1 and a convolution stride of 1. To fuse the features output by the three side-branch structures, this embodiment defines a fully connected layer with 256 neurons behind the three side-branch structures. Finally, the number of neurons in the last fully connected layer of the Inception-v4 network is set to 8, corresponding to the 8 emotion categories, and this layer serves as the final classifier. With this network structure, features L1, L2 and L3 from three different levels of the deep network can be obtained from the three side-branch structures; together with the top-level feature L4 of the deep network, feature maps from four levels of the Inception-v4 network are obtained.
After obtaining features at different levels, it is extremely important how to integrate these features at different levels in the model approach. It is observed that in image emotion recognition, semantic features of an image play a greater role in emotion recognition than style features of the image. Therefore, the embodiment gives higher attention to semantic features when the features are fused.
The feature fusion of the embodiment specifically includes:
step 1: first pair of features L1,L2,L3And performing concat operation, and inputting the concat operation into the full connection layer to obtain the characteristic L.
Step 2: for feature L and semantic feature L4And performing concat operation to obtain the final classification characteristic F.
The embodiment fully considers the influence of different types of features on emotion awakening, so that the features with expression power are formulated, the gap problem between emotion features and emotion expression is further solved, and the accuracy of emotion classification is improved.
In the embodiment, a multi-branch structure is introduced on the basis of an inclusion-v 4 network to acquire style features and semantic features of an image at a low level and a high level of the network respectively, then the multi-level features are fused, and emotion type prediction distribution of an image emotion classification model is obtained in a Softmax mode.
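A small PyTorch sketch of this two-step fusion is shown below. It assumes that the side-branch features L1, L2 and L3 have already been pooled into 256-dimensional vectors and that the top-level Inception-v4 feature L4 is 1536-dimensional; both assumptions go beyond what the patent states.

```python
import torch
import torch.nn as nn

class MultiLevelFusion(nn.Module):
    """Step 1: concatenate L1, L2, L3 and pass them through a 256-neuron fully
    connected layer to obtain L. Step 2: concatenate L with the semantic feature
    L4 to obtain the final classification feature F, then classify into 8 emotions."""
    def __init__(self, side_dim=256, top_dim=1536):
        super().__init__()
        self.fc = nn.Linear(3 * side_dim, 256)
        self.classifier = nn.Linear(256 + top_dim, 8)  # 8 emotion categories

    def forward(self, l1, l2, l3, l4):
        l = torch.relu(self.fc(torch.cat([l1, l2, l3], dim=1)))  # step 1
        f = torch.cat([l, l4], dim=1)                             # step 2: final feature F
        return self.classifier(f)                                 # logits for Softmax

fusion = MultiLevelFusion()
l1, l2, l3 = (torch.randn(2, 256) for _ in range(3))  # pooled side-branch features
l4 = torch.randn(2, 1536)                             # top-level semantic feature (assumed size)
logits = fusion(l1, l2, l3, l4)                       # shape (2, 8)
```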
In a certain embodiment, the probability that the emotion image belongs to the i-th emotion category is obtained from the last fully connected layer of the Inception-v4 network via Softmax:
y_i = exp(z_i) / Σ_{j=1}^{C} exp(z_j)   (3)
where y_i denotes the probability that the emotion image belongs to the i-th emotion category; z_i denotes the activation value for the i-th category; z_j denotes the activation value for the j-th category; and C denotes the number of emotion categories.
In another embodiment, for step 105, because the amount of data in the emotion data set is relatively small, a transfer learning strategy is adopted: the Inception-v4 network is first pre-trained on the ImageNet data set and then trained with the modulated feature maps.
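A minimal sketch of this pre-training step, assuming the timm library and its 'inception_v4' model name (the patent itself only states that Inception-v4 is pre-trained on ImageNet and then fine-tuned on the modulated feature maps):

```python
import timm

# Load an Inception-v4 backbone with ImageNet weights, then replace the
# classifier with an 8-way head for the eight emotion categories.
backbone = timm.create_model('inception_v4', pretrained=True)
backbone.reset_classifier(num_classes=8)
```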
In a further embodiment, the image emotion classification model adopts a multitask loss function based on an emotion-diversity constraint:
L_multi = L_cls + λ·L_ed   (4)
[Formulas (5)-(8), which define the classification loss L_cls, the accompanying-emotion loss L_ed and the accompanying-emotion probability f(i), are reproduced only as images in the original publication.]
where L_multi denotes the multitask loss function; L_cls denotes the conventional classification loss; L_ed denotes the loss of the other accompanying emotions; λ denotes a weight; q_i denotes the probability, given by the image emotion label, that the image belongs to the i-th emotion; p_i denotes the probability predicted by the model that the emotion image belongs to the i-th emotion category; f(i) denotes the probability of the other accompanying emotions; j denotes the dominant emotion; i denotes the other accompanying emotions; the probability of the dominant emotion is denoted by a symbol reproduced as an image in the original; p_j denotes the category probability of the dominant emotion j; dis_ij denotes the distance between emotion i and the dominant emotion j as defined on Mikels' Wheel; and i, j ∈ B indicates that emotion categories i and j have the same polarity.
In most image emotion data sets, a majority-voting strategy is widely adopted to obtain the emotion labels of the images. Starting from the diversity of emotion expression, the distribution of emotions is estimated in the form of label probabilities. Inspired by emotion theory, the relationship between two emotions determines their similarity. Two emotions can range from similar to completely opposite on Mikels' Wheel (see FIG. 4, in which (a) is the emotion wheel and (b) shows the emotion distances); the distance between two emotions can be calculated from the distance defined on Mikels' Wheel and represents their similarity, i.e., the smaller the distance d_ij between emotion i and emotion j, the more similar the two emotions. Therefore, using the definition of Mikels' Wheel, a probability label for the emotion image can be obtained.
In order to obtain more reasonable probability labels, this embodiment further draws on research in emotion theory and divides the emotions of emotion images into two polarities: negative N (anger, disgust, fear, sadness) and positive P (amusement, awe, contentment, excitement). Research in emotion theory shows that the diverse expression of emotion is in fact usually one dominant emotion accompanied by several other emotions of the same polarity, so the relationship between emotion polarities is introduced when the label probabilities are generated.
From the dominant emotion and the distance definition on Mikels' Wheel, the probability distribution over the other emotions with the same polarity as the dominant emotion is calculated, and the probability of the emotions of the opposite polarity is set to 0. The calculation is expressed by formulas (7) and (8).
The multitask loss function, i.e. formula (5), is introduced through the category label and the probability label of the emotion image.
A distribution label of the emotion image is generated according to Mikels' Wheel, the emotion distribution loss between the predicted distribution and the distribution label is calculated, and it is combined with the classification loss through a weight λ to form a new loss function, thereby constraining the diversity of the image's emotion expression.
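Since formulas (5)-(8) appear only as images in the publication, the sketch below takes L_cls as a cross-entropy against the single-label distribution q and L_ed as a cross-entropy against the Mikels'-Wheel-based distribution f; these exact forms, the function name and the value of λ are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F

def multitask_loss(logits, q, f_dist, lam=0.5):
    """L_multi = L_cls + lambda * L_ed (formula (4)).
    logits: (N, C) activations a_i of the last fully connected layer.
    q:      (N, C) one-hot label distribution (dominant emotion from majority voting).
    f_dist: (N, C) distribution f over the accompanying same-polarity emotions,
            built from the Mikels' Wheel distances; 0 for the opposite polarity.
    """
    log_p = F.log_softmax(logits, dim=1)        # log p_i, cf. formula (3)
    l_cls = -(q * log_p).sum(dim=1).mean()      # assumed form of L_cls
    l_ed = -(f_dist * log_p).sum(dim=1).mean()  # assumed form of L_ed
    return l_cls + lam * l_ed
```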
In this embodiment, stochastic gradient descent is used to optimize the multitask loss function L_multi. Let {a_i | i = 1, 2, ..., C} denote the activation values of the C emotion categories at the last fully connected layer.
The gradient can be calculated by equation (9):
[Equation (9) is reproduced only as an image in the original publication.]
the expression of image emotion has subjectivity and diversity, and most emotion data sets are collected by a single emotion tag generated by voting. In practice, however, different regions of an image may express different emotions, and it is difficult to divide an image into a single emotion type, and the emotion expression of the image is often a dominant emotion accompanied by one or more emotions of the same polarity. The single emotion label can cause inaccurate marking of the image emotion data set, thereby greatly influencing the accuracy of image emotion classification. Therefore, in the embodiment, the relation of emotion polarities is introduced, and the dominant emotion and other emotions with the same polarity as the dominant emotion are comprehensively considered by introducing a multitask loss function, so that the accuracy of image emotion classification is improved.
In this embodiment, a new loss function combining emotion distribution loss calculation is proposed based on the diversity of image emotion expressions, so as to complete the constraint on the diversity of image emotion expressions.
The invention also provides an image emotion classification device based on saliency detection and multi-level feature fusion, which comprises:
the image set construction module is used for constructing an emotion image set; the emotion image set comprises marked emotion images;
the saliency map acquisition module is used for establishing a training set and a verification set according to the emotion image set and extracting saliency maps of the emotion images in the training set and the verification set by using a saliency detection network;
the model training module is used for inputting the emotion images and the saliency maps in the training set into a pre-constructed image emotion classification model; the image emotion classification model comprises a twin neural network and an Inception-v4 network; training the twin neural network by using the emotion images and the saliency maps in the training set, and performing feature modulation on the corresponding emotion images by using the trained twin neural network to obtain a modulated feature map; training the Inception-v4 network by using the modulated feature map to obtain a trained image emotion classification model;
the model verification module is used for verifying the trained image emotion classification model by utilizing the emotion images and the saliency maps in the verification set;
and the classification module is used for inputting the images to be classified and the saliency maps of the images to be classified into the verified image emotion classification model for classification, so as to obtain the image emotion classes.
In one embodiment, for the model training module, the twin neural network, shown in fig. 3, comprises a first branch and a second branch, wherein the first branch is used for feature extraction from the emotion image and the second branch is used for feature extraction from the saliency map; the first branch and the second branch each consist of four convolutional layers (BasicConv2d) with the same convolution kernel size; the outputs of the second and fourth convolutional layers in the second branch are connected to the outputs of the second and fourth convolutional layers in the first branch.
In order to ensure that the features of the emotion image and the features of the saliency map keep similar feature spaces as they propagate through the network, both branches of the twin neural network consist of four convolutional layers with convolution kernels of the same size. For the first branch, the input and output channels of the four convolutional layers are all 3; for the second branch, the input and output channels of the four convolutional layers are all 1.
After the saliency map of the emotion image is obtained through the saliency detection network, a twin neural network is introduced so that the saliency map can successfully constrain the classification model to give higher attention to the emotion regions of the emotion image. In the twin neural network, the saliency map in the second branch is used to modulate the emotion image in the first branch, enhancing the attention paid to the emotion regions of the emotion image.
In a certain embodiment, the model training module further comprises:
401: inputting an emotion image a(x, y, z) from the training set into the first branch of the trained twin neural network and the corresponding saliency map b(x, y) into the second branch, where x and y are the spatial coordinates and z indexes the three color channels of the emotion image;
402: obtaining the feature map S output by the second convolutional layer in the second branch (S ∈ R^(w×h), where w is the width and h the height of the feature map), multiplying its corresponding elements with those of the feature map T output by the second convolutional layer in the first branch (T ∈ R^(c×w×h), where c is the number of channels) to obtain the feature map H (H ∈ R^(c×w×h)), and adding the corresponding elements of H and T to obtain the feature map G;
The multiplication modulates the feature map T with the feature map S;
The addition re-emphasizes the features of the original (input) emotion image, so that the features of the feature map T are not completely suppressed after the multiplication.
403: inputting the feature map G into the third convolutional layer of the first branch and the feature map S into the third convolutional layer of the second branch;
404: obtaining the feature map S' output by the fourth convolutional layer in the second branch, multiplying its corresponding elements with those of the feature map T' output by the fourth convolutional layer in the first branch to obtain the feature map H', and adding the corresponding elements of H' and T' to obtain the modulated feature map F.
The feature modulation makes the salient regions of the feature map T obtain larger response values, i.e., the Inception-v4 network pays more attention to the emotion-expressing regions of the emotion image. To make the effect pronounced, the modulation is performed twice in succession.
In another embodiment, because the feature maps S and T have different numbers of channels, only a single feature map S is considered and it is applied to every channel of T. The feature map G is calculated as:
G=f(T(w,h,c)*[S(w,h)+1]) (1)
where f denotes the Sigmoid activation function, which constrains the result to the range (0, 1) and thus keeps the feature modulation within a suitable range; T(w,h,c) denotes the feature map output by the second convolutional layer in the first branch; S(w,h) denotes the feature map output by the second convolutional layer in the second branch; and w, h and c denote respectively the width, height and number of channels of the feature maps.
The modulated feature map F is calculated as:
F=f(T′(w,h,c)*[S′(w,h)+1]) (2)
where T′(w,h,c) denotes the feature map output by the fourth convolutional layer in the first branch and S′(w,h) denotes the feature map output by the fourth convolutional layer in the second branch.
In a further embodiment, the Inception-v4 network is a multi-branch structure as shown in fig. 3 and comprises, in sequence, three convolutional layers (BasicConv2d), one Mixed_3a module, one Mixed_4a module, one Mixed_5a module, four Inception_A modules, one Reduction_A module, seven Inception_B modules, one Reduction_B module, three Inception_C modules, one average pooling layer (AvgPooling) and one fully connected layer (FullyConnected); a side-branch structure consisting of one convolutional layer (BasicConv2d) is introduced after each of the Mixed_5a module, the Reduction_A module and the Reduction_B module; the outputs of the three side-branch structures are connected to a fully connected layer, which fuses the side-branch features output by the three side-branch structures and outputs the fused feature to the output of the average pooling layer.
The three side-branch structures each consist of a convolutional layer with 256 output channels, a kernel size of 1 and a convolution stride of 1. To fuse the features output by the three side-branch structures, this embodiment defines a fully connected layer with 256 neurons behind the three side-branch structures. Finally, the number of neurons in the last fully connected layer of the Inception-v4 network is set to 8, corresponding to the 8 emotion categories, and this layer serves as the final classifier. With this network structure, features L1, L2 and L3 from three different levels of the deep network can be obtained from the three side-branch structures; together with the top-level feature L4 of the deep network, feature maps from four levels of the Inception-v4 network are obtained.
After obtaining features at different levels, it is extremely important how to integrate these features at different levels in the model approach. It is observed that in image emotion recognition, semantic features of an image play a greater role in emotion recognition than style features of the image. Therefore, the embodiment gives higher attention to semantic features when the features are fused.
The feature fusion of the embodiment specifically includes:
step 1: first pair of features L1,L2,L3And performing concat operation, and inputting the concat operation into the full connection layer to obtain the characteristic L.
Step 2: for feature L and semantic feature L4And performing concat operation to obtain the final classification characteristic F.
The embodiment fully considers the influence of different types of features on emotion awakening, so that the features with expression power are formulated, the gap problem between emotion features and emotion expression is further solved, and the accuracy of emotion classification is improved.
In the embodiment, a multi-branch structure is introduced on the basis of an inclusion-v 4 network to acquire style features and semantic features of an image at a low level and a high level of the network respectively, then the multi-level features are fused, and emotion type prediction distribution of an image emotion classification model is obtained in a Softmax mode.
In a certain embodiment, the probability that the emotion image belongs to the i-th emotion category is obtained from the last fully connected layer of the Inception-v4 network via Softmax:
y_i = exp(z_i) / Σ_{j=1}^{C} exp(z_j)   (3)
where y_i denotes the probability that the emotion image belongs to the i-th emotion category; z_i denotes the activation value for the i-th category; z_j denotes the activation value for the j-th category; and C denotes the number of emotion categories.
In another embodiment, in the model training module, because the amount of data in the emotion data set is relatively small, a transfer learning strategy is adopted: the Inception-v4 network is first pre-trained on the ImageNet data set and then trained with the modulated feature maps.
In a further embodiment, in the model training module, the image emotion classification model adopts a multitask loss function based on an emotion-diversity constraint:
L_multi = L_cls + λ·L_ed   (4)
[Formulas (5)-(8), which define the classification loss L_cls, the accompanying-emotion loss L_ed and the accompanying-emotion probability f(i), are reproduced only as images in the original publication.]
where L_multi denotes the multitask loss function; L_cls denotes the conventional classification loss; L_ed denotes the loss of the other accompanying emotions; λ denotes a weight; q_i denotes the probability, given by the image emotion label, that the image belongs to the i-th emotion; p_i denotes the probability predicted by the model that the emotion image belongs to the i-th emotion category; f(i) denotes the probability of the other accompanying emotions; j denotes the dominant emotion; i denotes the other accompanying emotions; the probability of the dominant emotion is denoted by a symbol reproduced as an image in the original; p_j denotes the category probability of the dominant emotion j; dis_ij denotes the distance between emotion i and the dominant emotion j as defined on Mikels' Wheel; and i, j ∈ B indicates that emotion categories i and j have the same polarity.
In most image emotion data sets, a majority-voting strategy is widely adopted to obtain the emotion labels of the images. Starting from the diversity of emotion expression, the distribution of emotions is estimated in the form of label probabilities. Inspired by emotion theory, the relationship between two emotions determines their similarity. Two emotions can range from similar to completely opposite on Mikels' Wheel (see FIG. 4, in which (a) is the emotion wheel and (b) shows the emotion distances); the distance between two emotions can be calculated from the distance defined on Mikels' Wheel and represents their similarity, i.e., the smaller the distance d_ij between emotion i and emotion j, the more similar the two emotions. Therefore, using the definition of Mikels' Wheel, a probability label for the emotion image can be obtained.
In order to obtain more reasonable probability labels, this embodiment further draws on research in emotion theory and divides the emotions of emotion images into two polarities: negative N (anger, disgust, fear, sadness) and positive P (amusement, awe, contentment, excitement). Research in emotion theory shows that the diverse expression of emotion is in fact usually one dominant emotion accompanied by several other emotions of the same polarity, so the relationship between emotion polarities is introduced when the label probabilities are generated.
From the dominant emotion and the distance definition on Mikels' Wheel, the probability distribution over the other emotions with the same polarity as the dominant emotion is calculated, and the probability of the emotions of the opposite polarity is set to 0. The calculation is expressed by formulas (7) and (8).
The multitask loss function, i.e. formula (5), is introduced through the category label and the probability label of the emotion image.
A distribution label of the emotion image is generated according to Mikels' Wheel, the emotion distribution loss between the predicted distribution and the distribution label is calculated, and it is combined with the classification loss through a weight λ to form a new loss function, thereby constraining the diversity of the image's emotion expression.
In this embodiment, stochastic gradient descent is used to optimize the multitask loss function L_multi. Let {a_i | i = 1, 2, ..., C} denote the activation values of the C emotion categories at the last fully connected layer.
The gradient can be calculated by equation (9):
[Equation (9) is reproduced only as an image in the original publication.]
the expression of image emotion has subjectivity and diversity, and most emotion data sets are collected by a single emotion tag generated by voting. In practice, however, different regions of an image may express different emotions, and it is difficult to divide an image into a single emotion type, and the emotion expression of the image is often a dominant emotion accompanied by one or more emotions of the same polarity. The single emotion label can cause inaccurate marking of the image emotion data set, thereby greatly influencing the accuracy of image emotion classification. Therefore, in the embodiment, the relation of emotion polarities is introduced, and the dominant emotion and other emotions with the same polarity as the dominant emotion are comprehensively considered by introducing a multitask loss function, so that the accuracy of image emotion classification is improved.
In this embodiment, a new loss function combining emotion distribution loss calculation is proposed based on the diversity of image emotion expressions, so as to complete the constraint on the diversity of image emotion expressions.
The invention further provides a computer device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method when executing the computer program.
The invention also proposes a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method described above.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An image emotion classification method based on saliency detection and multi-level feature fusion, characterized by comprising the following steps:
constructing an emotion image set; the emotion image set comprises marked emotion images;
establishing a training set and a verification set according to the emotion image set, and extracting saliency maps of the emotion images in the training set and the verification set by using a saliency detection network;
inputting the emotion images and the saliency maps in the training set into a pre-constructed image emotion classification model; the image emotion classification model comprises a twin neural network and an Inception-v4 network;
training the twin neural network by using the emotion images and the saliency maps in the training set, and performing feature modulation on the corresponding emotion images by using the trained twin neural network to obtain a modulated feature map;
training the Inception-v4 network by using the modulated feature map to obtain a trained image emotion classification model;
verifying the trained image emotion classification model by using the emotion images and the saliency maps in the verification set;
and inputting the image to be classified and its saliency map into the verified image emotion classification model for classification to obtain the image emotion category.
2. The image emotion classification method of claim 1, wherein the twin neural network comprises a first branch and a second branch, the first branch being used for feature extraction from the emotion image and the second branch being used for feature extraction from the saliency map; the first branch and the second branch each consist of four convolutional layers with the same convolution kernel size; the outputs of the second and fourth convolutional layers in the second branch are connected to the outputs of the second and fourth convolutional layers in the first branch.
3. The image emotion classification method of claim 2, wherein performing feature modulation on the corresponding emotion images by passing the saliency maps in the training set through the trained twin neural network to obtain a modulated feature map comprises:
inputting the emotion images in the training set into a first branch of a trained twin neural network, and inputting a saliency map into a second branch;
obtaining the feature map S output by the second convolutional layer in the second branch, multiplying the corresponding elements of the feature map S and the feature map T output by the second convolutional layer in the first branch to obtain the feature map H, and adding the corresponding elements of the feature map H and the feature map T to obtain the feature map G;
inputting the feature map G into the third convolutional layer of the first branch, and inputting the feature map S into the third convolutional layer of the second branch;
and obtaining the feature map S' output by the fourth convolutional layer in the second branch, multiplying the corresponding elements of the feature map S' and the feature map T' output by the fourth convolutional layer in the first branch to obtain the feature map H', and adding the corresponding elements of the feature map H' and the feature map T' to obtain the modulated feature map F.
4. The image emotion classification method of claim 3, wherein the calculation formula of the feature map G is as follows:
G=f(T(w,h,c)*[S(w,h)+1]) (1)
wherein f denotes the Sigmoid activation function; T(w,h,c) denotes the feature map output by the second convolutional layer in the first branch; S(w,h) denotes the feature map output by the second convolutional layer in the second branch; and w, h and c denote respectively the width, height and number of channels of the feature maps;
the modulated feature map F is calculated as:
F=f(T′(w,h,c)*[S′(w,h)+1]) (2)
wherein T′(w,h,c) denotes the feature map output by the fourth convolutional layer in the first branch, and S′(w,h) denotes the feature map output by the fourth convolutional layer in the second branch.
5. The image emotion classification method of claim 1, wherein the Inception-v4 network is a multi-branch structure and sequentially comprises three convolutional layers, one Mixed_3a module, one Mixed_4a module, one Mixed_5a module, four Inception_A modules, one Reduction_A module, seven Inception_B modules, one Reduction_B module, three Inception_C modules, one average pooling layer and one fully connected layer; a side-branch structure consisting of a convolutional layer is introduced after each of the Mixed_5a module, the Reduction_A module and the Reduction_B module; and the outputs of the three side-branch structures are connected to a fully connected layer, which fuses the side-branch features output by the three side-branch structures and outputs the fused feature to the output of the average pooling layer.
6. The image emotion classification method of claim 5, wherein the probability that the emotion image belongs to the i-th emotion category is obtained from the last fully connected layer of the Inception-v4 network via Softmax:
y_i = exp(z_i) / Σ_{j=1}^{C} exp(z_j)   (3)
where y_i denotes the probability that the emotion image belongs to the i-th emotion category; z_i denotes the activation value for the i-th category; z_j denotes the activation value for the j-th category; and C denotes the number of emotion categories.
7. The image emotion classification method of claim 1, wherein the image emotion classification model adopts a multitask loss function based on an emotion-diversity constraint:
L_multi = L_cls + λ·L_ed   (4)
[Formulas (5)-(8), which define the classification loss L_cls, the accompanying-emotion loss L_ed and the accompanying-emotion probability f(i), are reproduced only as images in the original publication.]
wherein L_multi denotes the multitask loss function; L_cls denotes the conventional classification loss; L_ed denotes the loss of the other accompanying emotions; λ denotes a weight; q_i denotes the probability, given by the image emotion label, that the image belongs to the i-th emotion; p_i denotes the probability predicted by the model that the emotion image belongs to the i-th emotion category; j denotes the dominant emotion; i denotes the other accompanying emotions; the probability of the dominant emotion is denoted by a symbol reproduced as an image in the original; p_j denotes the category probability of the dominant emotion j; f(i) denotes the probability of the other accompanying emotions; dis_ij denotes the distance between emotion i and the dominant emotion j as defined on Mikels' Wheel; and i, j ∈ B indicates that emotion categories i and j have the same polarity.
8. An image emotion classification device based on saliency detection and multi-level feature fusion, characterized by comprising:
the image set construction module is used for constructing an emotion image set; the emotion image set comprises marked emotion images;
the saliency map acquisition module is used for establishing a training set and a verification set according to the emotion image set and extracting saliency maps of the emotion images in the training set and the verification set by using a saliency detection network;
the model training module is used for inputting the emotion images and the saliency maps in the training set into a pre-constructed image emotion classification model; the image emotion classification model comprises a twin neural network and an Inception-v4 network; training the twin neural network by using the emotion images and the saliency maps in the training set, and performing feature modulation on the corresponding emotion images by using the trained twin neural network to obtain a modulated feature map; training the Inception-v4 network by using the modulated feature map to obtain a trained image emotion classification model;
the model verification module is used for verifying the trained image emotion classification model by utilizing the emotion images and the saliency maps in the verification set;
and the classification module is used for inputting the images to be classified and the saliency maps of the images to be classified into the verified image emotion classification model for classification, so as to obtain the image emotion classes.
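Purely to illustrate how the modules of this claim chain together at inference time, the sketch below wires three hypothetical components; SaliencyNet-style placeholders (saliency_net, twin_modulator, inception_v4) stand in for the saliency detection network, the twin modulation network and the Inception-v4 classifier, and none of their interfaces come from the patent.

import torch
import torch.nn as nn

class EmotionClassificationPipeline(nn.Module):
    # Classification module: image + saliency map -> predicted emotion category.
    def __init__(self, saliency_net: nn.Module, twin_modulator: nn.Module,
                 inception_v4: nn.Module):
        super().__init__()
        self.saliency_net = saliency_net      # saliency map acquisition
        self.twin_modulator = twin_modulator  # twin-network feature modulation
        self.inception_v4 = inception_v4      # multi-level fusion classifier

    @torch.no_grad()
    def forward(self, image: torch.Tensor) -> torch.Tensor:
        saliency = self.saliency_net(image)               # saliency map of the input
        modulated = self.twin_modulator(image, saliency)  # modulated feature map
        logits = self.inception_v4(modulated)             # fused multi-level features -> logits
        return logits.softmax(dim=1).argmax(dim=1)        # predicted emotion class index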
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202010670001.2A 2020-07-13 2020-07-13 Image emotion classification method and device based on saliency detection and multi-level feature fusion Active CN111797936B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010670001.2A CN111797936B (en) 2020-07-13 2020-07-13 Image emotion classification method and device based on saliency detection and multi-level feature fusion

Publications (2)

Publication Number Publication Date
CN111797936A (en) 2020-10-20
CN111797936B CN111797936B (en) 2023-08-08

Family

ID=72808462

Country Status (1)

Country Link
CN (1) CN111797936B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368798A (en) * 2017-07-07 2017-11-21 四川大学 A kind of crowd's Emotion identification method based on deep learning
CN109165551A (en) * 2018-07-13 2019-01-08 广东工业大学 A kind of expression recognition method of adaptive weighted fusion conspicuousness structure tensor and LBP feature
US20200035259A1 (en) * 2018-07-27 2020-01-30 Microsoft Technology Licensing, Llc Systems, methods, and computer-readable media for improved audio feature discovery using a neural network
CN110796150A (en) * 2019-10-29 2020-02-14 中山大学 Image emotion recognition method based on emotion significant region detection
CN111026898A (en) * 2019-12-10 2020-04-17 云南大学 Weak supervision image emotion classification and positioning method based on cross space pooling strategy

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
SUSHAMA TELRANDHE et al.: "Automatic Fetal Facial Expression Recognition by Hybridizing Saliency Maps with Recurrent Neural Network", 2019 IEEE Bombay Section Signature Conference, pages 1-6 *
ZELIN DENG et al.: "A Saliency Detection and Gram Matrix Transform-Based Convolutional Neural Network for Image Emotion Classification", Security and Communication Networks, pages 1-12 *
ZHENYUE QIN et al.: "Visual Saliency Maps Can Apply to Facial Expression Recognition", arXiv, pages 1-8 *
QING Linbo; XIONG Wenshi; ZHOU Wenjun; XIONG Shanshan; WU Xiaohong: "Group emotion recognition based on multi-stream CNN-LSTM networks", Application Research of Computers, vol. 35, no. 12, pages 3828-3831 *
WANG Jinhua: "Research on deep learning speech emotion recognition algorithms based on IAM", China Master's Theses Full-text Database, Information Science and Technology, pages 136-317 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861978A (en) * 2021-02-20 2021-05-28 齐齐哈尔大学 Multi-branch feature fusion remote sensing scene image classification method based on attention mechanism
CN112861978B (en) * 2021-02-20 2022-09-02 齐齐哈尔大学 Multi-branch feature fusion remote sensing scene image classification method based on attention mechanism
CN113017630A (en) * 2021-03-02 2021-06-25 贵阳像树岭科技有限公司 Visual perception emotion recognition method
CN114937182A (en) * 2022-04-18 2022-08-23 江西师范大学 Image emotion distribution prediction method based on emotion wheel and convolutional neural network
CN114937182B (en) * 2022-04-18 2024-04-09 江西师范大学 Image emotion distribution prediction method based on emotion wheel and convolutional neural network

Also Published As

Publication number Publication date
CN111797936B (en) 2023-08-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant