CN112488214A - Image emotion analysis method and related device - Google Patents

Image emotion analysis method and related device

Info

Publication number
CN112488214A
CN112488214A (application CN202011401899.XA)
Authority
CN
China
Prior art keywords
emotion
image
loss function
function value
recognition model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011401899.XA
Other languages
Chinese (zh)
Inventor
薛罗阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202011401899.XA priority Critical patent/CN112488214A/en
Publication of CN112488214A publication Critical patent/CN112488214A/en
Priority to PCT/CN2021/128682 priority patent/WO2022116771A1/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 - Facial expression recognition
    • G06V40/176 - Dynamic expression
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image emotion analysis method and a related device. The method comprises the following steps: obtaining a plurality of emotion images related to the development process of a target event; inputting the plurality of emotion images into a trained image emotion recognition model to obtain a plurality of emotion category ratios at each of a plurality of key time points of the target event in its development process; and obtaining the social public opinion trend in the development process of the target event according to the plurality of key time points and the corresponding emotion category ratios. Through this design, the image emotion recognition model can be trained directly on sample images with noisy labels and can learn the subtle differences between emotion types, so that the accuracy of image emotion recognition is improved, image emotion recognition at a plurality of key time points in the development process of the target event can be effectively realized, and real-time analysis of the social public opinion trend is facilitated.

Description

Image emotion analysis method and related device
Technical Field
The present application relates to the field of public sentiment analysis technologies, and in particular, to an image sentiment analysis method and a related apparatus.
Background
In daily life, rich emotional expression greatly helps to convey ideas. Meanwhile, emotion analysis during the occurrence of an event is also one of the important research fields of human-computer interaction. By establishing a corresponding public opinion analysis system that identifies and analyzes the emotions expressed about an event on social network media, real-time social public opinion analysis can be provided for organizations, enterprises and individuals, thereby improving their dynamic response capability, allowing the social public opinion trend to be grasped in a timely manner, and helping to avoid public opinion crises.
With the continuous development of social network media, a large number of emotion images exist on social network media. However, most existing public opinion analysis systems analyze the social public opinion trend only with respect to text emotions in social network media and cannot effectively analyze this large number of emotion images, which is not conducive to real-time analysis of the social public opinion trend for an event. Therefore, it is necessary to provide a new image emotion analysis method to solve the above problems.
Disclosure of Invention
The present application mainly solves the technical problem of providing an image emotion analysis method and a related device, so as to effectively realize image emotion recognition of a plurality of key time points of a target event in a development process and analyze social public opinion trends in real time.
In order to solve the technical problem, the application adopts a technical scheme that: provided is an image emotion analysis method, comprising the following steps: obtaining a plurality of emotion images related to the development process of the target event; inputting the plurality of emotion images into a trained image emotion recognition model to obtain a plurality of emotion category ratios of a plurality of key time points of the target event in a development process; and obtaining social public opinion trends in the target event development process according to the plurality of key time points and the corresponding plurality of emotion category ratios.
Wherein the obtaining a plurality of emotion images related to the target event development history comprises: obtaining three groups of emotional images related to the development process of the target event, wherein the three groups of emotional images are respectively related to the target event before, after and during the development process; the step of inputting the plurality of emotion images into the trained image emotion recognition model comprises the following steps: and respectively inputting the three groups of emotion images into the trained image emotion recognition model to respectively obtain multiple emotion category ratios before, after and during the occurrence of the target event.
Wherein, before the step of obtaining a plurality of emotion images related to the development process of the target event, the method further comprises the following steps: obtaining a plurality of sample images according to emotion types, wherein the emotion types comprise anger, fear, happiness, disgust, surprise and sadness, the positive emotion comprises happiness, the negative emotion comprises anger, fear, disgust and sadness, and the neutral emotion comprises surprise; respectively inputting the plurality of sample images into an initial image emotion recognition model for training until the number of times of training of the image emotion recognition model reaches a threshold value, so as to obtain a current image emotion recognition model after current training; obtaining a total loss function value of the current image emotion recognition model, and judging whether the total loss function value of the current image emotion recognition model is smaller than the total loss function value of the initial image emotion recognition model or not; if yes, outputting the current image emotion recognition model; and otherwise, taking the current image emotion recognition model as an initial image emotion recognition model, and returning to the step of respectively inputting the plurality of sample images into the initial image emotion recognition model for training.
Wherein the total loss function value is related to at least some of a cross-entropy loss function value, a two-classification loss function value, a center loss function value and a triplet loss function value.
Wherein the total loss function value is equal to the sum of the cross-entropy loss function value, the two-classification loss function value, the product of a first coefficient and the center loss function value, and the product of a second coefficient and the triplet loss function value; wherein the first coefficient and the second coefficient are each greater than or equal to 0 and less than or equal to 1.
Wherein the step of obtaining the total loss function value of the current image emotion recognition model comprises the following steps: obtaining an attention feature matrix corresponding to each sample image; obtaining an emotion activation matrix according to the input vector of each two-classification detector and the attention feature matrix, wherein the number of the two-classification detectors is the same as that of the emotion types; performing fusion processing on the attention feature matrix and the emotion activation matrix to obtain an emotion recognition image; carrying out global average pooling layer operation on the emotion recognition image to obtain an image emotion vector; obtaining the cross entropy loss function value, the center loss function value and the triplet loss function value according to the image emotion vector, and obtaining the two-classification loss function value according to the emotion type to which the current sample image belongs and the type of the current two-classification detector; and obtaining the total loss function value according to the cross entropy loss function value, the center loss function value, the triplet loss function value and the two-classification loss function value.
Wherein the step of obtaining the attention feature matrix corresponding to each sample image comprises: obtaining a characteristic matrix of each sample image extracted by the last layer of convolution layer of the image emotion recognition model; converting the feature matrix into an attention distribution value by using a nonlinear activation function; and carrying out normalization processing on the attention distribution values, and multiplying the normalized attention distribution values by the corresponding feature matrix to obtain an attention feature matrix.
Before the step of inputting the plurality of sample images into the initial image emotion recognition model for training, the method includes: preprocessing the plurality of sample images, wherein the preprocessing comprises a centralization processing, a normalization processing and a sample expansion processing.
In order to solve the above technical problem, another technical solution adopted by the present application is: there is provided an image emotion analyzing apparatus, comprising a memory and a processor coupled to each other, wherein the memory stores program instructions, and the processor is configured to execute the program instructions to implement the image emotion analyzing method in any of the above embodiments.
In order to solve the above technical problem, the present application adopts another technical solution: there is provided a storage device storing program instructions executable by a processor for implementing the image emotion analysis method as described in any of the above embodiments.
Different from the prior art, the beneficial effects of the application are that: according to the method, a plurality of emotion images related to the development process of the target event are obtained, the emotion images are input into a trained image emotion recognition model, so that a plurality of emotion category ratios of the target event at each key time point in a plurality of key time points in the development process are obtained, and the social public opinion trend in the development process of the target event is obtained according to the key time points and the corresponding emotion category ratios. Through the design scheme, the accuracy of image emotion recognition is improved, so that the image emotion recognition of a plurality of key time points of a target event in a development process can be effectively realized, and the real-time analysis of social public opinion trends is facilitated.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts. Wherein:
FIG. 1 is a schematic flow chart diagram of an embodiment of an image emotion analysis method according to the present application;
FIG. 2 is a schematic flow chart illustrating an embodiment of steps preceding step S101 in FIG. 1;
FIG. 3 is a schematic flowchart of an embodiment corresponding to step S203 in FIG. 2;
FIG. 4 is a schematic diagram of a framework of an embodiment of an image emotion analysis apparatus according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment of an image emotion analyzing apparatus according to the present application;
FIG. 6 is a block diagram of an embodiment of a storage device according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a schematic flow chart of an embodiment of an image emotion analysis method according to the present application, where the analysis method includes:
s101: a plurality of emotion images related to the development history of the target event are obtained.
Specifically, in this embodiment, a plurality of emotion images related to the development history of the target event may be obtained from social media through crawler technology; of course, in other embodiments, the emotion images required by the user may be obtained in real time from other information sources in other manners, which is not limited herein. The social media may be Twitter, Flickr, Instagram, news sites, forums, microblogs, and the like, which is not limited in this application. The plurality of emotion images may be three groups of emotion images respectively related to the period before the occurrence of the target event, the period after the occurrence of the target event, and the occurrence process itself. In this way, the emotion images related to the target event before, after and during its occurrence can be acquired in real time, which is beneficial to the subsequent analysis of the social public opinion trend concerning the target event.
S102: and inputting the plurality of emotion images into the trained image emotion recognition model to obtain a plurality of emotion category ratios of each key time point of a plurality of key time points of the target event in the development process.
Specifically, in this embodiment, the emotion categories include positive emotion, negative emotion and neutral emotion. The three groups of emotion images obtained in step S101 are respectively input into the trained image emotion recognition model to obtain the proportions of positive, negative and neutral emotions before, after and during the occurrence of the target event, so as to implement emotion analysis of the target event on social media at each of the plurality of key time points in its development process.
S103: and obtaining social public opinion trend in the development process of the target event according to the plurality of key time points and the corresponding plurality of emotion category ratios.
In this embodiment, the plurality of key time points may be before, after and during the occurrence of the target event, or may be other time points related to the target event, which is not limited herein. The tendency of social public opinion around each key time point of the target event is analyzed by comparing the emotion category ratios before and after that key time point.
Through the above design, image emotion recognition at a plurality of key time points in the development process of the target event can be effectively realized, and the social public opinion trend can be analyzed in real time so as to predict subsequent events similar to the target event. In addition, when a similar event occurs later, the existing emotion image data set can be augmented by acquiring the emotion images of the current event, so as to support subsequent emotion image recognition.
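As an illustrative, non-limiting sketch (the names CATEGORY, emotion_ratios, opinion_trend and model.predict below are assumptions introduced for illustration only), the aggregation performed in steps S102-S103 may be expressed in Python roughly as follows:

    from collections import Counter

    CATEGORY = {"happiness": "positive", "anger": "negative", "fear": "negative",
                "disgust": "negative", "sadness": "negative", "surprise": "neutral"}

    def emotion_ratios(model, images):
        """Positive/negative/neutral ratios among the emotion images of one key time point."""
        counts = Counter(CATEGORY[model.predict(img)] for img in images)
        total = sum(counts.values()) or 1
        return {cat: counts.get(cat, 0) / total for cat in ("positive", "negative", "neutral")}

    def opinion_trend(model, images_by_time_point):
        """images_by_time_point, e.g. {"before": [...], "during": [...], "after": [...]}."""
        return {t: emotion_ratios(model, imgs) for t, imgs in images_by_time_point.items()}

Comparing the ratios returned for adjacent key time points then gives the public opinion tendency around each time point, as described above.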
In one embodiment, referring to fig. 2, fig. 2 is a flowchart illustrating an embodiment of a step before step S101 in fig. 1. The method comprises the following steps before the step S101:
s201: a plurality of sample images are obtained according to emotion types including anger, fear, happiness, disgust, surprise, and sadness.
Specifically, in this embodiment, crawler technology may be used to obtain a plurality of sample images from social media according to emotion types; of course, in other embodiments, the plurality of sample images may also be obtained from other information sources in other manners according to emotion types, which is not limited herein. The social media may be Twitter, Flickr, Instagram, news sites, forums, microblogs, and the like, which is not limited in this application. In this way, the plurality of sample images can be directly used as the training set of the image emotion recognition model. The plurality of sample images are divided into a plurality of emotion data sets according to emotion type; for example, the sample images whose emotion types are anger, fear, happiness, disgust, surprise and sadness are set as six emotion data sets respectively, and the number of sample images in one emotion data set ranges from 0 to 50000, for example 500, 1000, 1500, 2000, 2500, 3000, 3500, 5000, 8000, 10000, 20000, 30000, 40000 or 50000. The emotion types are further grouped into a plurality of emotion categories: the positive emotion category includes happiness, the negative emotion category includes anger, fear, disgust and sadness, and the neutral emotion category includes surprise, so as to analyze the proportions of positive, negative and neutral emotions toward the target event in social public opinion.
S202: and respectively inputting the plurality of sample images into the initial image emotion recognition model for training until the training times of the image emotion recognition model reach a threshold value, so as to obtain the current image emotion recognition model after current training.
Specifically, in this embodiment, before the step of inputting the plurality of sample images into the initial image emotion recognition model for training, the method includes: preprocessing the plurality of sample images, where the preprocessing includes centering, normalization and sample expansion. Specifically, the pixel values of each sample image are scaled into the range 0-1, for example 0.2, 0.5 or 1, so that the dimension of the sample image is compressed and the sample image is normalized; at the same time, the sample image can be flipped vertically and horizontally according to a set standard, so that the plurality of sample images are input into the initial image emotion recognition model for training according to a unified standard. When the number of sample images is too small, the emotion data set to be trained can be expanded through sample expansion. Through centering, normalization and sample expansion, the feature matrices of the sample images are compressed so that the plurality of sample images have features of the same dimension, which facilitates the subsequent training of the plurality of sample images in the initial image emotion recognition model.
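A minimal sketch of the described preprocessing (the function names and the use of NumPy are assumptions; the flips illustrate one possible form of sample expansion):

    import numpy as np

    def preprocess(image: np.ndarray) -> np.ndarray:
        x = image.astype(np.float32) / 255.0   # normalization: pixel values scaled into the 0-1 range
        x = x - x.mean()                       # centering
        return x

    def expand_samples(image: np.ndarray) -> list:
        # sample expansion: the original image plus its vertical and horizontal flips
        return [image, np.flipud(image), np.fliplr(image)]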
The threshold of the number of training iterations ranges from 50 to 500, for example 50, 100, 200, 300, 400 or 500. For example, if the threshold is set to 50, the image emotion recognition model continues to be trained while the number of training iterations has not reached 50; when the number of training iterations reaches 50, training of the image emotion recognition model is stopped, and the currently trained image emotion recognition model is obtained for the subsequent steps.
S203: and obtaining a total loss function value of the current image emotion recognition model, and judging whether the total loss function value of the current image emotion recognition model is smaller than the total loss function value of the initial image emotion recognition model.
In one embodiment, please refer to fig. 3, wherein fig. 3 is a flowchart illustrating an embodiment corresponding to step S203 in fig. 2. The step of obtaining the total loss function value of the current image emotion recognition model specifically comprises the following steps:
s301: and obtaining an attention feature matrix corresponding to each sample image.
Specifically, the specific implementation process of step S301 may be as follows: the deep features of each sample image are extracted through a preset convolutional neural network model, and the feature matrix X^n_{i,j,m} corresponding to the n-th sample is obtained from the last convolutional layer of the convolutional neural network model, where i = 1,2,3,...,d and j = 1,2,3,...,d are the coordinates of the feature matrix, d is the length and width of the feature matrix, m is the number of convolution kernels of the convolutional layer (i.e., the height of the feature matrix), and C is the number of emotion data sets. A nonlinear activation function, the sigmoid function f(x) = 1/(1 + e^(-x)), is used to convert the feature matrix X_{i,j,m} into the one-dimensional attention distribution value S_m, with the formula:

S_m = f_sigmoid(W^T · X_{i,j,m} + b)    (1)
where W and b are the weight and bias of the attention mechanism, respectively.
Further, the above attention distribution value S_m is normalized to reduce the magnitude of the feature values of noisy-labeled samples, with the formula:

a_m = S_m / Σ_m S_m    (2)

where a_m is the final attention mechanism distribution value, and Σ_m a_m = 1.
The normalized attention distribution value a_m is then multiplied by the corresponding feature matrix X_{i,j,m} to obtain the attention feature matrix A_{i,j,m}, with the formula:

A_{i,j,m} = a_m ∘ X_{i,j,m}    (3)

where "∘" denotes multiplication of corresponding elements of the matrices.
Through this attention mechanism, the attention values of erroneous information are reduced and the attention values of correct information are increased, so that the negative influence caused by erroneous information is suppressed, and the image emotion recognition model can be trained directly on sample image data sets with noisy labels.
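As an illustrative, non-limiting sketch (the function name, the assumed tensor shapes and the einsum-based implementation are choices made for illustration), formulas (1)-(3) may be written in Python/PyTorch roughly as follows:

    import torch

    def attention_feature_matrix(X, W, b):
        """X: feature maps of shape (m, d, d); W: attention weight of shape (d, d); b: scalar bias."""
        S = torch.sigmoid(torch.einsum('ij,mij->m', W, X) + b)  # (1) one attention value per feature map
        a = S / S.sum()                                         # (2) normalized distribution summing to 1
        return a.view(-1, 1, 1) * X                             # (3) element-wise re-weighting of X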
S302: and obtaining an emotion activation matrix according to the input vector and the attention feature matrix of each binary detector.
Specifically, in order to enable the image emotion recognition model to recognize multiple emotions in one sample image, a plurality of binary detectors are set up to convert the multi-emotion problem of one sample image into a plurality of binary detection problems. Meanwhile, a class activation map mechanism is added to each binary detector to obtain the salient features of each emotion type in a sample image. The input vector of each binary detector is obtained as v_{c_k} ∈ R^m, where m is the dimension of the specific emotion vector, c_k is the emotion data set of the corresponding binary detector, and K is the number of emotion classifications. From the input vector v_{c_k} of each binary detector and the attention feature matrix A_{i,j,m}, the emotion activation matrix M^{c_k}_{i,j} is obtained by weighting the channels of the attention feature matrix with the input vector, with the formula:

M^{c_k}_{i,j} = Σ_m v_{c_k,m} · A_{i,j,m}    (4)

The number of binary detectors is the same as the number of emotion types. Specifically, for the given number C of emotion types, C binary detectors are allocated to obtain the C emotion activation matrices M^{c_k}_{i,j} corresponding to the emotions, and each corresponding class activation map yields the salient features of the corresponding emotion.
S303: and carrying out fusion processing on the attention feature matrix and the emotion activation matrix to obtain an emotion recognition image.
Specifically, the emotion activation matrix M^{c_k}_{i,j} obtained for a specific emotion is taken as a local feature, and the attention feature matrix A_{i,j,m} is taken as a global feature; the two features are fused through the concatenation operation ":" to obtain the emotion recognition image C_{i,j,m}, so as to improve the emotion recognition effect, with the formula:

C_{i,j,m} = [A_{i,j,m} : M^{c_k}_{i,j}]    (5)
By the above method, the problem of multiple emotions in one sample image can be converted into a plurality of simple emotion detection problems; in the process of detecting each type of emotion, the class activation map is used to obtain the salient features of the corresponding emotion, and the final image emotion recognition effect is improved through feature fusion, thereby improving the accuracy of image emotion recognition.
S304: and carrying out global average pooling layer operation on the emotion recognition image to obtain an image emotion vector.
Specifically, the emotion classification of the whole image emotion recognition model is supervised with the labeled data set D = {(I_i, Y_i)}, i = 1, ..., N, where N is the number of pictures in the emotion data set, I_i is the input sample image, and Y_i is the corresponding emotion classification; Softmax is used as the loss function for the final emotion classification. A global average pooling operation is performed on the emotion recognition image C_{i,j,m} obtained in step S303 above to obtain the image emotion vector c_i = AP(C_{i,j,m}).
S305: and obtaining a cross entropy loss function value, a center loss function value and a triplet loss function value according to the image emotion vector, and obtaining a two-classification loss function value according to the emotion type to which the current sample image belongs and the type of the current two-classification detector.
Specifically, the cross-entropy loss function value is obtained from the image emotion vector c_i, with the formula:

L_class(I, Y) = -(1/N) Σ_{i=1..N} Y_i · log(softmax(W · c_i))    (6)

where W is the parameter of the whole image emotion recognition model and Y_i is in one-hot encoded form.

The binary classification loss function value is obtained according to the emotion type to which the current sample image belongs and the type of the current binary detector; it takes the binary cross-entropy form:

L_{c_k}(I, y) = -(1/N) Σ_{i=1..N} [ y_i · log(p_i) + (1 - y_i) · log(1 - p_i) ]    (7)

L_SCAM(I, y) = Σ_{k=1..C} L_{c_k}(I, y)    (8)

where p_i is the prediction of the current binary detector for the i-th sample image, and y_i is the binary emotion label of the i-th sample image: y_i equals 1 when the sample image belongs to the emotion type of the current binary detector and 0 when it does not. For example, with the emotion types anger, fear, happiness, disgust, surprise and sadness denoted y_1, y_2, y_3, y_4, y_5 and y_6, when the current binary detector corresponds to anger and the sample image is an anger image, then y_1 = 1 and y_2 = y_3 = y_4 = y_5 = y_6 = 0.
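As an illustrative sketch only (the classifier head and the detector outputs shown here, including the names detector_logits and detector_targets, are assumptions made for illustration), losses (6)-(8) may be computed on a batch as:

    import torch
    import torch.nn.functional as F

    def classification_losses(c, labels, W, detector_logits, detector_targets):
        """c: (N, dim) image emotion vectors; labels: (N,) emotion class indices;
        W: (num_classes, dim) classifier weights; detector_logits/targets: (N, C) scores and 0/1 labels."""
        L_class = F.cross_entropy(c @ W.t(), labels)          # (6) softmax cross-entropy
        L_scam = F.binary_cross_entropy_with_logits(          # (7)-(8) binary loss over the C detectors
            detector_logits, detector_targets.float())
        return L_class, L_scam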
In order to further improve the effect of emotion classification, a center loss function value and a triplet loss function value are introduced as regularization terms for the final emotion classification. Image emotion recognition is treated approximately as a linear problem, and the center loss function value is introduced to reduce the distance between samples of the same emotion type:

L_Center(c) = (1/2) Σ_{i=1..N} || c_i - c_{y_i} ||^2    (9)

where c_{y_i} is the category center of the image emotion vector c_i; and the relation between c_{y_i} and c_i is established to update the category center c_{y_i} of the image emotion vector c_i:

Δc_j = ( Σ_{i=1..N} δ(y_i = j) · (c_j - c_i) ) / ( 1 + Σ_{i=1..N} δ(y_i = j) )    (10)

where the center c_j is updated according to c_i only when the sample belongs to category j, i.e., only when y_i = j.
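A minimal sketch of the center loss (9) and the center update (10), assuming a buffer `centers` of per-class centers maintained outside the model and an update rate alpha (both are assumptions for illustration):

    import torch

    def center_loss(c, labels, centers):
        """c: (N, dim) emotion vectors; labels: (N,) class indices; centers: (num_classes, dim)."""
        return 0.5 * ((c - centers[labels]) ** 2).sum(dim=1).mean()   # (9)

    def update_centers(c, labels, centers, alpha=0.5):
        # (10) move each class center toward the emotion vectors assigned to that class
        for j in range(centers.size(0)):
            mask = labels == j
            if mask.any():
                delta = (centers[j] - c[mask]).sum(dim=0) / (1 + mask.sum())
                centers[j] = centers[j] - alpha * delta
        return centers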
Further, the distance between different emotion types is enlarged by introducing the triplet loss function value:

L_Triple(c) = Σ_{i=1..N} max( || c_i^a - c_i^p ||^2 - || c_i^a - c_i^n ||^2 + θ, 0 )    (11)

where c_i^a, c_i^p and c_i^n are the anchor, positive and negative image emotion vectors respectively, and θ is set as a hyper-parameter; c_i^a and c_i^p are of the same emotion type, while c_i^a and c_i^n are of different emotion types.
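One possible sketch of the triplet loss (11), assuming pre-mined anchor/positive/negative emotion vectors of shape (N, dim) and a margin theta (the margin value is an arbitrary example):

    import torch

    def triplet_loss(anchor, positive, negative, theta=0.2):
        d_pos = ((anchor - positive) ** 2).sum(dim=1)   # distance within the same emotion type
        d_neg = ((anchor - negative) ** 2).sum(dim=1)   # distance across different emotion types
        return torch.clamp(d_pos - d_neg + theta, min=0).mean()   # (11) hinge with margin theta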
S306: and obtaining a total loss function value according to the cross entropy loss function value, the center loss function value, the triple loss function value and the binary loss function value.
In this embodiment, the total loss function value is related to at least some of the cross-entropy loss function value, the two-classification loss function value, the center loss function value and the triplet loss function value. Specifically, the total loss function value is equal to the sum of the cross-entropy loss function value, the two-classification loss function value, the product of a first coefficient and the center loss function value, and the product of a second coefficient and the triplet loss function value:

L = L_class(I, Y) + L_SCAM(I, y) + λ·L_Center(c) + β·L_Triple(c)    (12)

where the first coefficient λ and the second coefficient β are the weights of L_Center and L_Triple respectively, and both are greater than or equal to 0 and less than or equal to 1, for example 0, 0.2, 0.5, 0.7 or 1. When λ equals 0, the total loss function value is unrelated to the center loss function value; when β equals 0, the total loss function value is unrelated to the triplet loss function value; when both λ and β equal 0, the total loss function value is unrelated to both the center loss function value and the triplet loss function value.
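As a small illustrative helper (all names are assumptions, and the default coefficient values are arbitrary examples), formula (12) may be combined as:

    def total_loss(L_class, L_scam, L_center, L_triplet, lam=0.5, beta=0.5):
        # (12) lam and beta are the first and second coefficients, each in [0, 1]
        return L_class + L_scam + lam * L_center + beta * L_triplet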
By introducing the center loss function value and the triplet loss function value as regularization terms for the final emotion classification, the distance between samples of the same emotion type is reduced and the distance between different emotion types is enlarged, so that the image emotion recognition model can learn the subtle differences between emotion types, thereby improving the accuracy of image emotion recognition.
The total loss function value L of the current image emotion recognition model is obtained in the above manner. In addition, step S203 further includes: judging whether the total loss function value of the current image emotion recognition model is smaller than the total loss function value of the initial image emotion recognition model.
S204: and if so, outputting the current image emotion recognition model.
When the total loss function value of the current image emotion recognition model is smaller than that of the initial image emotion recognition model, the current image emotion recognition model is output, and the current image emotion recognition model and the model parameters thereof are stored.
S205: and otherwise, taking the current image emotion recognition model as an initial image emotion recognition model, and returning to the step of respectively inputting the plurality of sample images into the initial image emotion recognition model for training.
That is, when the total loss function value of the current image emotion recognition model is greater than or equal to the total loss function value of the initial image emotion recognition model, the current image emotion recognition model is used as the initial image emotion recognition model, and the process returns to step S202.
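The control flow of steps S202-S205 may be sketched as follows (train_epochs and compute_total_loss are hypothetical caller-supplied helpers; only the loop structure described above is illustrated):

    import copy

    def train_until_improved(model, loader, train_epochs, compute_total_loss, threshold=50):
        best_loss = compute_total_loss(model, loader)          # total loss of the initial model
        while True:
            candidate = copy.deepcopy(model)
            train_epochs(candidate, loader, epochs=threshold)  # S202: train until the threshold is reached
            loss = compute_total_loss(candidate, loader)       # S203: total loss of the current model
            if loss < best_loss:
                return candidate                               # S204: output the current model
            model, best_loss = candidate, loss                 # S205: treat it as the new initial model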
Through the design scheme, the image emotion recognition model can directly train sample images with noisy labels and learn subtle differences among emotion types, so that the accuracy of image emotion recognition is improved, image emotion recognition of a plurality of key time points of a target event in a development process can be effectively realized, and real-time analysis of social public opinion trends is facilitated.
Referring to fig. 4, fig. 4 is a schematic diagram of a framework of an embodiment of an image emotion analysis apparatus according to the present application. The image emotion analysis apparatus includes an obtaining module 10, an input module 12 and a processing module 14. The obtaining module 10 is configured to obtain a plurality of emotion images to be detected that are related to the development process of a target event. The input module 12 is configured to input the emotion images to be detected into the trained image emotion recognition model to obtain a plurality of emotion category ratios at each of a plurality of key time points of the target event in its development process. The processing module 14 is configured to load the trained optimal image emotion recognition model and its parameters, read and preprocess the emotion images to be detected, display the emotion categories of the obtained emotion images to be detected, and display the emotion category ratios corresponding to the plurality of key time points of the whole target event in the form of a histogram, so as to analyze the social public opinion trend in the development process of the target event; the social public opinion trend is displayed in the form of a line chart, and major turning points in the occurrence process of the target event are highlighted, so that the user can make a correct public opinion decision.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an embodiment of an image emotion analyzing apparatus according to the present application, the image emotion analyzing apparatus includes a memory 200 and a processor 202 coupled to each other, the memory 200 stores program instructions, and the processor 202 is configured to execute the program instructions to implement the image emotion analyzing method mentioned in any of the above embodiments.
Specifically, the processor 202 may also be referred to as a CPU (Central Processing Unit). The processor 202 may be an integrated circuit chip having signal processing capabilities. The Processor 202 may also be a general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. In addition, processor 202 may be implemented collectively by a plurality of integrated circuit chips.
Referring to fig. 6, fig. 6 is a schematic diagram of a storage device according to an embodiment of the present application. The storage device 30 stores program instructions 300 executable by a processor, and the program instructions 300 are used for implementing the image emotion analysis method mentioned in any of the above embodiments. The program instructions 300 may be stored in the storage device in the form of a software product, and include several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage device includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices such as a computer, a server, a mobile phone or a tablet.
In summary, different from the situation of the prior art, the image emotion recognition model constructed by the end-to-end image emotion recognition method in the application can directly train the sample image with the noisy label and learn the subtle difference between emotion types, so that the accuracy of image emotion recognition is improved, the image emotion recognition of a plurality of key time points of a target event in a development process can be effectively realized, and the social public opinion trend can be analyzed according to the target event. In addition, when similar events occur later, real-time public opinion analysis can be provided at the time of the event, and the social public opinion trend of the event can be predicted.
The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application or are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (10)

1. An image emotion analysis method, comprising:
obtaining a plurality of emotion images related to the development process of the target event;
inputting the plurality of emotion images into a trained image emotion recognition model to obtain a plurality of emotion category ratios of each key time point of a plurality of key time points of the target event in a development process;
and obtaining social public opinion trends in the target event development process according to the plurality of key time points and the corresponding plurality of emotion category ratios.
2. The method for emotion analysis of an image according to claim 1,
the obtaining a plurality of emotion images related to the development history of the target event comprises the following steps: obtaining three groups of emotional images related to the development process of the target event, wherein the three groups of emotional images are respectively related to the target event before, after and during the development process;
the step of inputting the plurality of emotion images into the trained image emotion recognition model comprises the following steps: and respectively inputting the three groups of emotion images into the trained image emotion recognition model to respectively obtain multiple emotion category ratios before, after and during the occurrence of the target event.
3. The method for emotion analysis of images according to claim 1, wherein said step of obtaining a plurality of emotion images related to the development history of the target event is preceded by the steps of:
obtaining a plurality of sample images according to emotion types, wherein the emotion types comprise anger, fear, happiness, disgust, surprise and sadness, the positive emotion comprises happiness, the negative emotion comprises anger, fear, disgust and sadness, and the neutral emotion comprises surprise;
respectively inputting the plurality of sample images into an initial image emotion recognition model for training until the number of times of training of the image emotion recognition model reaches a threshold value, so as to obtain a current image emotion recognition model after current training;
obtaining a total loss function value of the current image emotion recognition model, and judging whether the total loss function value of the current image emotion recognition model is smaller than the total loss function value of the initial image emotion recognition model or not;
if yes, outputting the current image emotion recognition model; and otherwise, taking the current image emotion recognition model as an initial image emotion recognition model, and returning to the step of respectively inputting the plurality of sample images into the initial image emotion recognition model for training.
4. The image emotion analysis method of claim 3,
the total loss function value is related to at least some of a cross-entropy loss function value, a two-classification loss function value, a center loss function value, and a triplet loss function value.
5. The image emotion analysis method of claim 4,
the total loss function value is equal to the sum of the cross-entropy loss function value, the two-classification loss function value, the product of a first coefficient and the center loss function value, and the product of a second coefficient and the triplet loss function value;
wherein the first coefficient and the second coefficient are equal to or greater than 0 and equal to or less than 1.
6. The image emotion analysis method of claim 5, wherein the step of obtaining the total loss function value of the current image emotion recognition model comprises:
obtaining an attention feature matrix corresponding to each sample image;
obtaining an emotion activation matrix according to the input vector of each two-classification detector and the attention feature matrix, wherein the number of the two-classification detectors is the same as that of the emotion types;
performing fusion processing on the attention feature matrix and the emotion activation matrix to obtain an emotion recognition image;
carrying out global average pooling layer operation on the emotion recognition image to obtain an image emotion vector;
obtaining the cross entropy loss function value, the center loss function value and the triplet loss function value according to the image emotion vector, and obtaining the two-classification loss function value according to the emotion type to which the current sample image belongs and the type of the current two-classification detector;
and obtaining the total loss function value according to the cross entropy loss function value, the center loss function value, the triplet loss function value and the two-classification loss function value.
7. The image emotion analysis method of claim 6, wherein the step of obtaining the attention feature matrix corresponding to each sample image comprises:
obtaining a characteristic matrix of each sample image extracted by the last layer of convolution layer of the image emotion recognition model;
converting the feature matrix into an attention distribution value by using a nonlinear activation function;
and carrying out normalization processing on the attention distribution values, and multiplying the normalized attention distribution values by the corresponding feature matrix to obtain an attention feature matrix.
8. The method for image emotion analysis according to claim 3, wherein before the step of inputting the sample images into the initial image emotion recognition model for training, the method comprises:
preprocessing the plurality of sample images, wherein the preprocessing comprises a centralization processing, a normalization processing and a sample expansion processing.
9. An image emotion analysis apparatus, comprising a memory and a processor coupled to each other, wherein the memory stores program instructions, and the processor is configured to execute the program instructions to implement the image emotion analysis method according to any one of claims 1 to 8.
10. A storage device storing program instructions executable by a processor to implement the image emotion analysis method as claimed in any one of claims 1 to 8.
CN202011401899.XA 2020-12-02 2020-12-02 Image emotion analysis method and related device Pending CN112488214A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011401899.XA CN112488214A (en) 2020-12-02 2020-12-02 Image emotion analysis method and related device
PCT/CN2021/128682 WO2022116771A1 (en) 2020-12-02 2021-11-04 Method for analyzing emotion shown in image and related devices

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011401899.XA CN112488214A (en) 2020-12-02 2020-12-02 Image emotion analysis method and related device

Publications (1)

Publication Number Publication Date
CN112488214A true CN112488214A (en) 2021-03-12

Family

ID=74939763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011401899.XA Pending CN112488214A (en) 2020-12-02 2020-12-02 Image emotion analysis method and related device

Country Status (2)

Country Link
CN (1) CN112488214A (en)
WO (1) WO2022116771A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688985A (en) * 2021-07-26 2021-11-23 浙江大华技术股份有限公司 Training method of heart rate estimation model, heart rate estimation method and device
CN114064969A (en) * 2021-11-19 2022-02-18 浙江大学 Dynamic picture linkage display device based on emotional curve
WO2022116771A1 (en) * 2020-12-02 2022-06-09 Zhejiang Dahua Technology Co., Ltd. Method for analyzing emotion shown in image and related devices

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115409855B (en) * 2022-09-20 2023-07-07 北京百度网讯科技有限公司 Image processing method, device, electronic equipment and storage medium
CN115496113B (en) * 2022-11-17 2023-04-07 深圳市中大信通科技有限公司 Emotional behavior analysis method based on intelligent algorithm
CN117851588A (en) * 2023-06-19 2024-04-09 合肥奕谦信息科技有限公司 Service information processing method and device based on big data and computer equipment
CN117058405B (en) * 2023-07-04 2024-05-17 首都医科大学附属北京朝阳医院 Image-based emotion recognition method, system, storage medium and terminal
CN117494068B (en) * 2023-11-17 2024-04-19 之江实验室 Network public opinion analysis method and device combining deep learning and causal inference

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764268A (en) * 2018-04-02 2018-11-06 华南理工大学 A kind of multi-modal emotion identification method of picture and text based on deep learning
CN110263822A (en) * 2019-05-29 2019-09-20 广东工业大学 A kind of Image emotional semantic analysis method based on multi-task learning mode
CN110852360A (en) * 2019-10-30 2020-02-28 腾讯科技(深圳)有限公司 Image emotion recognition method, device, equipment and storage medium
CN111813894A (en) * 2020-06-30 2020-10-23 郑州信大先进技术研究院 Natural language emotion recognition method based on deep learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2588747B (en) * 2019-06-28 2021-12-08 Huawei Tech Co Ltd Facial behaviour analysis
CN110556129B (en) * 2019-09-09 2022-04-19 北京大学深圳研究生院 Bimodal emotion recognition model training method and bimodal emotion recognition method
CN112488214A (en) * 2020-12-02 2021-03-12 浙江大华技术股份有限公司 Image emotion analysis method and related device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764268A (en) * 2018-04-02 2018-11-06 华南理工大学 A kind of multi-modal emotion identification method of picture and text based on deep learning
CN110263822A (en) * 2019-05-29 2019-09-20 广东工业大学 A kind of Image emotional semantic analysis method based on multi-task learning mode
CN110852360A (en) * 2019-10-30 2020-02-28 腾讯科技(深圳)有限公司 Image emotion recognition method, device, equipment and storage medium
CN111813894A (en) * 2020-06-30 2020-10-23 郑州信大先进技术研究院 Natural language emotion recognition method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Haitao et al.: "Research on emotion classification of microblog public opinion based on convolutional neural networks", 《情报学报》 (Journal of the China Society for Scientific and Technical Information) *
Xue Luoyang et al.: "NLWSNet: noisy-label Web image sentiment analysis based on weakly supervised learning (in English)", 《FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022116771A1 (en) * 2020-12-02 2022-06-09 Zhejiang Dahua Technology Co., Ltd. Method for analyzing emotion shown in image and related devices
CN113688985A (en) * 2021-07-26 2021-11-23 浙江大华技术股份有限公司 Training method of heart rate estimation model, heart rate estimation method and device
CN114064969A (en) * 2021-11-19 2022-02-18 浙江大学 Dynamic picture linkage display device based on emotional curve

Also Published As

Publication number Publication date
WO2022116771A1 (en) 2022-06-09

Similar Documents

Publication Publication Date Title
CN112488214A (en) Image emotion analysis method and related device
CN110377740B (en) Emotion polarity analysis method and device, electronic equipment and storage medium
CN111680159B (en) Data processing method and device and electronic equipment
CN109471944B (en) Training method and device of text classification model and readable storage medium
CN112164391A (en) Statement processing method and device, electronic equipment and storage medium
CN112732911A (en) Semantic recognition-based conversational recommendation method, device, equipment and storage medium
CN109284371B (en) Anti-fraud method, electronic device, and computer-readable storage medium
WO2022095376A1 (en) Aspect-based sentiment classification method and apparatus, device, and readable storage medium
WO2020147409A1 (en) Text classification method and apparatus, computer device, and storage medium
CN112995414B (en) Behavior quality inspection method, device, equipment and storage medium based on voice call
CN111159409A (en) Text classification method, device, equipment and medium based on artificial intelligence
CN113553510A (en) Text information recommendation method and device and readable medium
Kumar et al. An intelligent model based on integrated inverse document frequency and multinomial Naive Bayes for current affairs news categorisation
CN110826327A (en) Emotion analysis method and device, computer readable medium and electronic equipment
CN111190967A (en) User multi-dimensional data processing method and device and electronic equipment
CN114223012A (en) Push object determination method and device, terminal equipment and storage medium
CN112131506B (en) Webpage classification method, terminal equipment and storage medium
CN116680401A (en) Document processing method, document processing device, apparatus and storage medium
CN110888983A (en) Positive and negative emotion analysis method, terminal device and storage medium
CN116029760A (en) Message pushing method, device, computer equipment and storage medium
CN112633394B (en) Intelligent user label determination method, terminal equipment and storage medium
CN110597985A (en) Data classification method, device, terminal and medium based on data analysis
CN115795344A (en) Graph convolution network document classification method and system based on mixed diffusion
CN113255824B (en) Method and apparatus for training classification model and data classification
CN116089605A (en) Text emotion analysis method based on transfer learning and improved word bag model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20210312