CN109508640A - Crowd sentiment analysis method, apparatus and storage medium - Google Patents

Crowd sentiment analysis method, apparatus and storage medium

Info

Publication number
CN109508640A
Authority
CN
China
Prior art keywords
image
affective
sample image
sample
scene characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811191726.2A
Other languages
Chinese (zh)
Inventor
徐嵚嵛
李琳
周冰
周效军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201811191726.2A priority Critical patent/CN109508640A/en
Publication of CN109508640A publication Critical patent/CN109508640A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174 Facial expression recognition
    • G06V 40/175 Static expression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a crowd sentiment analysis method, comprising: obtaining a first image, the first image containing at least two faces; determining a scene feature of the first image based on a preset scene feature model, and determining at least one affective feature of the first image based on at least one preset facial affective feature model; constructing at least two classifiers from the scene feature and the at least one affective feature, and obtaining at least two prediction results from the at least two classifiers, the prediction results characterizing the probability that the predicted crowd emotion belongs to each affective type; and determining the crowd affective type of the first image according to the at least two prediction results. The invention also discloses a crowd sentiment analysis apparatus and a computer-readable storage medium.

Description

Crowd sentiment analysis method, apparatus and storage medium
Technical field
The present invention relates to multimedia communication technologies, and more particularly to a crowd sentiment analysis method, apparatus and computer-readable storage medium.
Background art
With the continuous development of networks, hardware and social software, hundreds of millions of users upload pictures to the network every day. Most of these pictures are taken at weddings, parties, performances and similar gatherings of relatives and friends, so the uploaded pictures often contain several people or even a crowd.
At present there is little work on recognizing the emotional intensity of a crowd: research largely focuses on individual facial expression recognition or on the perceived intensity of an individual's emotion, and good methods for recognizing the emotional intensity of a group are lacking.
Summary of the invention
In view of this, the main purpose of the present invention is to provide a crowd sentiment analysis method, apparatus and computer-readable storage medium.
To achieve the above objective, the technical scheme of the present invention is realized as follows:
An embodiment of the invention provides a crowd sentiment analysis method, the method comprising:
obtaining a first image, the first image containing at least two faces;
determining a scene feature of the first image based on a preset scene feature model, and determining at least one affective feature of the first image based on at least one preset facial affective feature model;
constructing at least two classifiers from the scene feature and the at least one affective feature, and obtaining at least two prediction results from the at least two classifiers, the prediction results characterizing the probability corresponding to at least one affective type exhibited by the first image;
determining the crowd affective type of the first image according to the at least two prediction results.
An embodiment of the invention provides a crowd sentiment analysis apparatus, the apparatus comprising a first processing module, a second processing module, a third processing module and a fourth processing module, wherein:
the first processing module is configured to obtain a first image, the first image containing at least two faces;
the second processing module is configured to determine a scene feature of the first image based on a preset scene feature model, and to determine at least one affective feature of the first image based on at least one preset facial affective feature model;
the third processing module is configured to construct at least two classifiers from the scene feature and the at least one affective feature, and to obtain at least two prediction results from the at least two classifiers, the prediction results characterizing the probability corresponding to at least one affective type exhibited by the first image;
the fourth processing module is configured to determine the crowd affective type of the first image according to the at least two prediction results.
An embodiment of the invention provides a crowd sentiment analysis apparatus comprising a processor and a memory for storing a computer program runnable on the processor, wherein
the processor is configured to execute the steps of the above crowd sentiment analysis method when running the computer program.
An embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above crowd sentiment analysis method are realized.
The crowd sentiment analysis method, apparatus and computer-readable storage medium provided by the embodiments of the present invention obtain a first image containing at least two faces; determine a scene feature of the first image based on a preset scene feature model, and at least one affective feature of the first image based on at least one preset facial affective feature model; construct at least two classifiers from the scene feature and the at least one affective feature and obtain at least two prediction results from them, the prediction results characterizing the probability that the predicted crowd emotion belongs to each affective type; and determine the crowd affective type of the first image according to the at least two prediction results. In the embodiments of the present invention, both the affective type shown by the faces in an image and the affective type conveyed by the environment can be recognized, and combining the two allows the crowd affective type to be determined more accurately.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a crowd sentiment analysis method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another crowd sentiment analysis method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a face calibration method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a convolutional neural network provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an Inception module provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a down-sampling module provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of a convolution process provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of a max-pooling process provided by an embodiment of the present invention;
Fig. 9 is a schematic diagram of the network structures of ResNet34 and ResNet50 provided by an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a ResNet module provided by an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a crowd sentiment analysis apparatus provided by an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of another crowd sentiment analysis apparatus provided by an embodiment of the present invention.
Detailed description of the embodiments
In various embodiments of the present invention, a first image containing at least two faces is obtained; a scene feature of the first image is determined based on a preset scene feature model, and at least one affective feature of the first image is determined based on at least one preset facial affective feature model; at least two classifiers are constructed from the scene feature and the at least one affective feature, and at least two prediction results are obtained from the at least two classifiers, the prediction results characterizing the probability that the predicted crowd emotion belongs to each affective type; the crowd affective type of the first image is determined according to the at least two prediction results.
The present invention is described in further detail below with reference to embodiments.
Fig. 1 is a schematic flowchart of a crowd sentiment analysis method provided by an embodiment of the present invention. The method may be applied to a smart device, such as a server. As shown in Fig. 1, the method comprises:
Step 101: obtain a first image, the first image containing at least two faces.
Here, the first image is the image on which crowd sentiment analysis is to be performed.
Step 102: determine a scene feature of the first image based on a preset scene feature model, and determine at least one affective feature of the first image based on at least one preset facial affective feature model.
Specifically, before step 102, the method further comprises: determining the scene feature model and the at least one facial affective feature model.
Specifically, determining the scene feature model and the at least one facial affective feature model comprises:
obtaining a preset quantity of sample images and determining the affective tag of each sample image, the affective tag characterizing the affective type of the sample image;
performing learning and training according to the preset quantity of sample images and the affective tag of each sample image, to obtain the scene feature model and the at least one facial affective feature model.
In one embodiment, the affective types may comprise: positive emotion, calm and negative emotion. In another embodiment, the affective types may be divided in more detail, for example into: ecstatic, jubilant, joyful, happy, calm, unhappy, sad, sorrowful and grieved. These affective types can be denoted by different numbers or letters.
In this embodiment, the method further comprises: dividing the preset quantity of sample images into a first image set and a second image set, the first image set containing at least one first sample image and the second image set containing at least one second sample image.
For example, at least 100,000 sample images are obtained, of which 80% are taken as the first image set and 20% as the second image set.
Here, performing learning and training according to the preset quantity of sample images and the affective tag of each sample image to obtain the at least one facial affective feature model comprises:
extracting at least two first facial images from the first sample images and at least two second facial images from the second sample images; performing forward propagation through a first convolutional neural network according to the at least two first facial images extracted from the first sample images and the affective tags of the first sample images, and performing back-propagation through the first convolutional neural network according to the at least two second facial images extracted from the second sample images and the affective tags of the second sample images, to obtain a first facial affective feature model;
and/or adjusting the at least two first facial images extracted from the first sample images and the at least two second facial images extracted from the second sample images; performing forward propagation through a second convolutional neural network according to the adjusted at least two first facial images and the affective tags of the first sample images, and performing back-propagation through the second convolutional neural network according to the adjusted at least two second facial images and the affective tags of the second sample images, to obtain a second facial affective feature model.
Here, extracting a facial image may comprise: recognizing the sample image by face recognition technology, calibrating the position of each face, and extracting the facial image according to the calibrated position; the extracted facial image is generally a rectangular image.
The above adjustment of the extracted facial images may comprise: converting each facial image into a grayscale image of a preset size, and the like.
Here, performing learning and training according to the preset quantity of sample images and the affective tag of each sample image to obtain the scene feature model comprises:
extracting the scene images of the first sample images and the scene images of the second sample images; performing forward propagation through a third convolutional neural network according to the scene images of the first sample images and the affective tags of the first sample images, and performing back-propagation through the third convolutional neural network according to the scene images of the second sample images and the affective tags of the second sample images, to obtain the scene feature model.
The first convolutional neural network, the second convolutional neural network and the third convolutional neural network described above may each adopt any neural network such as GoogLeNet, residual network (ResNet) 34 or ResNet50, or a neural network obtained after structural adjustment of any of these networks; no limitation is imposed here.
Step 103: construct at least two classifiers from the scene feature and the at least one affective feature, and obtain at least two prediction results from the at least two classifiers; each prediction result characterizes the probability corresponding to at least one affective type exhibited by the first image.
Specifically, constructing at least two classifiers from the scene feature and the at least one affective feature and obtaining at least two prediction results from them comprises:
constructing a first classifier from the scene feature, and at least one second classifier from the at least one affective feature;
classifying the affective type exhibited by the first image with the first classifier, and determining from the classification result the probability corresponding to at least one affective type exhibited by the first image, as a first prediction result;
classifying the affective type exhibited by the first image with the at least one second classifier, and determining from at least one classification result the probability corresponding to at least one affective type of at least one group exhibited by the first image, as at least one second prediction result.
Step 104: determine the crowd affective type of the first image according to the at least two prediction results.
Specifically, determining the crowd affective type of the first image according to the at least two prediction results comprises:
determining the preset weight corresponding to the scene feature and to each of the at least one affective feature;
determining the target probability corresponding to at least one affective type exhibited by the first image according to the preset weights and the at least two prediction results;
determining the crowd affective type of the first image according to the target probability corresponding to the at least one affective type.
Here, determining the target probability corresponding to at least one affective type exhibited by the first image according to the preset weights and the at least two prediction results comprises:
multiplying the probability of each affective type in the prediction result of each classifier by the preset weight of the affective feature corresponding to that classifier;
adding the weighted probabilities of the same affective type, to obtain the target probability corresponding to each affective type.
In this embodiment, the method further comprises: fusing any two of the scene feature and the at least one affective feature at the feature level.
Correspondingly, constructing at least two classifiers from the scene feature and the at least one affective feature and obtaining at least two prediction results from them comprises:
constructing at least three classifiers from the fused feature, the scene feature and the at least one affective feature, and obtaining at least three prediction results from the at least three classifiers, the prediction results characterizing the probability corresponding to at least one affective type exhibited by the first image.
Here, determining the crowd affective type of the first image according to the at least two prediction results comprises:
determining the preset weights corresponding to the fused feature, the scene feature and the at least one affective feature;
determining the target probability corresponding to at least one affective type exhibited by the first image according to the preset weights and the at least three prediction results;
determining the crowd affective type of the first image according to the target probability corresponding to the at least one affective type.
Here, the affective type corresponding to the highest or the second-highest target probability may be taken as the crowd affective type of the first image.
Fig. 2 is a schematic flowchart of another crowd sentiment analysis method provided by an embodiment of the present invention. As shown in Fig. 2, the method comprises:
Step 201: obtain a preset quantity of sample images.
Here, step 201 may comprise: using web crawler technology to obtain group photos containing multiple persons from the network as sample images. The preset quantity may be 100,000.
Step 201 may also comprise: removing irrelevant images by manual screening, which further improves the quality of the subsequently trained models.
Step 202: classify the obtained sample images and determine the affective tag of each sample image.
Here, the affective tag characterizes the affective type of the sample image.
Step 202 may comprise: recognizing the preset quantity of sample images with a face recognition method and dividing them into positive emotion samples, calm samples and negative emotion samples. Here, to guarantee recognition accuracy, manual identification may also be added when dividing the sample sets.
The positive emotion samples may comprise images in which the persons show positive emotions such as happiness or surprise; the calm samples may comprise images in which the persons show no special expression; the negative emotion samples may comprise images in which the persons show negative emotions such as sadness, anger or fear.
Further, step 202 may also comprise: dividing the sample images further in combination with the atmosphere of each sample image. Specifically, starting from the positive emotion, calm and negative emotion samples divided above, the three sample sets can be further divided into 9 grades according to the background environment of each sample image. For example, environments related to "wedding" or "dinner party" can be marked with the atmosphere "jubilant", while pleasant environments such as "park" or "flowers" can be marked with the atmosphere "happy". Specifically, the environment can be represented by numbers; for example, the digital labels "-4" to "+4" (or other labels such as 0 to 8) represent 9 environment atmospheres in total: "4" denotes "ecstatic", "3" denotes "jubilant", "2" denotes "joyful", "1" denotes "happy", "0" denotes "calm", "-1" denotes "unhappy", "-2" denotes "sad", "-3" denotes "sorrowful" and "-4" denotes "grieved". The sample images can be marked by labelling, so that they can later be retrieved quickly by searching.
In this embodiment, positive emotion, calm or negative emotion may be used as the affective tag; alternatively, the environment grade, namely ecstatic, jubilant, joyful, happy, calm, unhappy, sad, sorrowful or grieved, may be used as the affective tag.
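As an illustrative sketch of this labelling scheme (Python; the English grade names and the helper function are assumptions for illustration, not part of the patent):

```python
# Hypothetical label table for the 9 atmosphere grades described above;
# the integer keys follow the patent's "-4" to "+4" scheme.
ATMOSPHERE_LABELS = {
    4: "ecstatic", 3: "jubilant", 2: "joyful", 1: "happy", 0: "calm",
    -1: "unhappy", -2: "sad", -3: "sorrowful", -4: "grieved",
}

def coarse_label(grade: int) -> str:
    """Collapse a 9-level grade into the 3-class scheme (illustrative helper)."""
    return "positive" if grade > 0 else "negative" if grade < 0 else "calm"

assert coarse_label(3) == "positive" and coarse_label(0) == "calm"
```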
Step 203: train at least two convolutional neural networks according to the preset quantity of sample images, and obtain a scene feature model and at least one facial affective feature model from the trained networks.
Specifically, training at least one convolutional neural network according to the preset quantity of sample images to obtain at least one facial affective feature model comprises: extracting the facial images in the sample images, and training at least one convolutional neural network according to the extracted facial images and the affective tags of the corresponding sample images, to obtain at least one facial affective feature model.
Extracting the facial images in the sample images may comprise: detecting faces with a multi-task cascaded convolutional network (MTCNN, Multi-task Cascaded Convolutional Networks), calibrating the positions of the faces, and extracting the facial images according to the calibrated positions. Here, each recognized face can be a rectangular picture containing the face. After recognition, the facial images can also be screened: suitably sized, clear, frontal facial images are kept, while undersized or blurred faces are discarded (for example, faces whose short side is less than 24 pixels are deleted; if faces remain, at most the three largest faces in the picture are kept). This further improves the quality of the subsequently trained models.
MTCNN is a cascaded convolutional neural network framework that integrates the two tasks of face detection and feature point localization through multi-task learning. Its network structure comprises three stages, each consisting of a convolutional neural network (CNN). First, a shallow convolutional neural network (P-Net, Proposal Network) quickly generates a large number of candidate windows. Second, a relatively more complex convolutional neural network (R-Net, Refine Network) refines the candidate windows by excluding a large number of non-face windows. Finally, a still more complex convolutional neural network (O-Net, Output Network) refines the output windows again while outputting the coordinates of five facial feature points. Taking the input picture of Fig. 3 as an example, the input image is first resized to different scales to construct its image pyramid (Image Pyramid); the resulting image pyramid serves as the input of the three-stage MTCNN structure.
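A minimal sketch of this detection-and-screening step, assuming the facenet-pytorch implementation of MTCNN (the patent does not name a specific implementation):

```python
from PIL import Image
from facenet_pytorch import MTCNN  # one possible MTCNN implementation

mtcnn = MTCNN(keep_all=True)  # P-Net -> R-Net -> O-Net cascade

def extract_faces(path, min_side=24, max_faces=3):
    img = Image.open(path).convert("RGB")
    boxes, probs = mtcnn.detect(img)  # candidate face boxes + confidences
    if boxes is None:
        return []
    # Discard faces whose short side is under 24 pixels, as described above.
    boxes = [b for b in boxes if min(b[2] - b[0], b[3] - b[1]) >= min_side]
    # Keep at most the three largest faces by area.
    boxes = sorted(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]), reverse=True)
    return [img.crop(tuple(int(v) for v in b)) for b in boxes[:max_faces]]
```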
In this embodiment, extracting the facial images in the sample images and training a convolutional neural network according to the extracted facial images and the affective tags of the corresponding sample images comprises:
preprocessing the size of the facial images; the preprocessing may comprise uniformly resizing the facial images to 32 × 32 × 3, where 32 × 32 is the facial image size and 3 is the number of image channels (a color image has 3 channels, i.e. the RGB color channels);
determining a training set and a test set (here, 80% of the sample images may be used as the training set and 20% as the test set);
inputting the facial images extracted from the preprocessed training-set sample images and the affective tags of the corresponding sample images into the first convolutional neural network and performing forward propagation through the first convolutional neural network; then inputting the facial images extracted from the preprocessed test-set sample images and the affective tags of the corresponding sample images into the forward-propagated network and performing back-propagation through the first convolutional neural network, to obtain the first facial affective feature model.
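A minimal sketch of one epoch of this forward/back-propagation training, with PyTorch assumed; `model` stands in for the Fig. 4 network, which the patent defines only schematically:

```python
import torch.nn as nn

def train_epoch(model, loader, optimizer, device="cpu"):
    """One pass of forward propagation + back-propagation over 32x32x3 faces."""
    criterion = nn.CrossEntropyLoss()
    model.train()
    for faces, labels in loader:           # faces: (B, 3, 32, 32) tensors
        faces, labels = faces.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(faces), labels)  # forward propagation
        loss.backward()                         # back-propagation
        optimizer.step()
```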
In this embodiment, extracting the facial images in the sample images and training a convolutional neural network according to the extracted facial images and the affective tags of the corresponding sample images may also comprise:
adjusting the size of the extracted facial images; the adjustment may comprise converting each facial image into a grayscale image of a preset size (which may be 224 × 224);
determining a training set and a test set (here, 80% of the sample images may be used as the training set and 20% as the test set);
inputting the facial images extracted from the adjusted training-set sample images and the affective tags of the corresponding sample images into the second convolutional neural network and performing forward propagation through the second convolutional neural network; then inputting the facial images extracted from the adjusted test-set sample images and the affective tags of the corresponding sample images into the forward-propagated network and performing back-propagation through the second convolutional neural network, to obtain the second facial affective feature model.
In this embodiment, training a convolutional neural network according to the preset quantity of sample images and obtaining the scene feature model from the trained network comprises:
adjusting the sample images, which may comprise resizing the sample images to 224 × 224;
determining a training set and a test set (here, 80% of the sample images may be used as the training set and 20% as the test set);
extracting the scene images (for example, wedding, dinner-party or park images) of the adjusted sample images;
inputting the scene images of the adjusted training-set sample images and the affective tags of the corresponding sample images into the third convolutional neural network and performing forward propagation through the third convolutional neural network; then inputting the scene images of the adjusted test-set sample images and the affective tags of the corresponding sample images into the forward-propagated network and performing back-propagation through the third convolutional neural network, to obtain the scene feature model.
The affective tags required by the three trainings above can use a certain group of the affective tags determined in step 202.
Step 204: obtain a first image to be recognized; determine the scene feature of the first image based on the scene feature model; determine the first facial affective feature of the first image based on the first facial affective feature model; determine the second facial affective feature of the first image based on the second facial affective feature model.
Step 205: construct three classifiers from the scene feature, the first facial affective feature and the second facial affective feature, and obtain three prediction results from the three classifiers; each prediction result characterizes the probability corresponding to at least one affective type exhibited by the first image.
Here, the type of each classifier can be one or more of SVM, KNN, the fully-connected network commonly used in neural networks, and the like; the types of the three classifiers can be the same or different.
Specifically, the constructed classifiers may comprise: a first facial affective feature classifier, a second facial affective feature classifier and a scene feature classifier.
Each classifier uses a weight distribution algorithm to obtain the prediction result corresponding to that classifier; the weight distribution algorithm can be preset according to criteria such as area and clarity. For example, suppose the affective types comprise three kinds: positive emotion, calm and negative emotion.
For example, in the prediction result of the first facial affective feature classifier, 80% of the faces in the first image show positive emotion and 20% show calm; the first prediction result corresponding to the first facial affective feature classifier is then: the probability of positive emotion is 80% and the probability of calm is 20%.
In the prediction result of the second facial affective feature classifier, 70% of the faces in the first image show positive emotion and 30% show calm; the second prediction result corresponding to the second facial affective feature classifier is then: the probability of positive emotion is 70% and the probability of calm is 30%.
In the prediction result of the scene feature classifier, analysis of the scene images in the first image determines that 60% of the scene representation shows positive emotion, 20% shows calm and 20% shows negative emotion; the third prediction result corresponding to the scene feature classifier is then: the probability of positive emotion is 60%, the probability of calm is 20% and the probability of negative emotion is 20%.
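A minimal sketch of one such classifier, assuming scikit-learn's SVC as the SVM (the feature dimension and data here are illustrative placeholders):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
train_features = rng.normal(size=(300, 2048))  # e.g. pooled CNN features
train_labels = rng.integers(0, 3, size=300)    # 0/1/2 = negative/calm/positive

clf = SVC(probability=True)                    # probability=True -> predict_proba
clf.fit(train_features, train_labels)

image_features = rng.normal(size=(1, 2048))    # features of the first image
probs = clf.predict_proba(image_features)[0]
print(dict(zip(["negative", "calm", "positive"], probs.round(2))))
```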
Step 206: determine the crowd affective type of the first image according to the three prediction results.
In step 206, existing decision-level information fusion methods can be used to assign different weights to the prediction results of the multiple classifiers obtained in step 205 and fuse them, to obtain the crowd affective type of the first image.
Specifically, combining the embodiment of step 205, the process of step 206 is further illustrated; step 206 comprises:
Step 2061: determine the preset weights corresponding to the three kinds of features.
For example: the weight of the first facial affective feature is w0, the weight of the second facial affective feature is w1, the weight of the scene feature is w2, and w0 + w1 + w2 = 1.
Step 2062: determine the target probabilities corresponding to the affective types exhibited by the first image according to the preset weights and the three prediction results.
Here, step 2062 comprises: multiplying the probability of each affective type in the prediction result of each classifier by the preset weight of the affective feature corresponding to that classifier;
adding the weighted probabilities of the same affective type, to obtain the target probability corresponding to each affective type, specifically according to the following formula (1):

$S = \omega_0 S_{face1} + \omega_1 S_{face2} + \omega_2 S_{scene}$ (1)

where $\omega_0$, $\omega_1$ and $\omega_2$ are the weights of the respective affective features and satisfy $\omega_0 + \omega_1 + \omega_2 = 1$, and $S_{face1}$, $S_{face2}$ and $S_{scene}$ are the probabilities of affective type S in the first facial affective feature classifier, the second facial affective feature classifier and the scene feature classifier, respectively.
Combining the above embodiment, formula (1) gives:
the target probability of positive emotion A = 80%·w0 + 70%·w1 + 60%·w2;
the target probability of calm B = 20%·w0 + 30%·w1 + 20%·w2;
the target probability of negative emotion C = 0·w0 + 0·w1 + 20%·w2.
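The arithmetic of formula (1) applied to this example, as a sketch (the weight values are illustrative; the patent leaves them as preset parameters):

```python
# Per-classifier prediction results from the example above.
predictions = {
    "face1": {"positive": 0.80, "calm": 0.20, "negative": 0.00},
    "face2": {"positive": 0.70, "calm": 0.30, "negative": 0.00},
    "scene": {"positive": 0.60, "calm": 0.20, "negative": 0.20},
}
weights = {"face1": 0.4, "face2": 0.4, "scene": 0.2}  # w0 + w1 + w2 = 1

# Formula (1): weight each classifier's probability, sum per affective type.
target = {t: sum(weights[c] * p[t] for c, p in predictions.items())
          for t in ("positive", "calm", "negative")}
crowd_type = max(target, key=target.get)
print(target, "->", crowd_type)  # {'positive': 0.72, ...} -> 'positive'
```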
Step 2063: determine the crowd affective type of the first image according to the target probability corresponding to each affective type.
Here, the target probabilities corresponding to the affective types are compared; for example, A, B and C are compared, and the affective type corresponding to the maximum value is determined as the crowd affective type of the first image.
In one embodiment, the crowd sentiment analysis method may also comprise: fusing any two of the scene feature and the at least one affective feature at the feature level.
Correspondingly, constructing at least two classifiers from the scene feature and the at least one affective feature and obtaining at least two prediction results from them comprises:
constructing at least three classifiers from the fused feature, the scene feature and the at least one affective feature, and obtaining at least three prediction results from the at least three classifiers, the prediction results characterizing the probability corresponding to at least one affective type exhibited by the first image.
Correspondingly, determining the crowd affective type of the first image according to the at least two prediction results comprises:
determining the preset weights corresponding to the fused feature, the scene feature and the at least one affective feature;
determining the target probability corresponding to at least one affective type exhibited by the first image according to the preset weights and the at least three prediction results;
determining the crowd affective type of the first image according to the target probability corresponding to the at least one affective type.
For example, fusing any two of the scene feature and the at least one affective feature at the feature level may comprise: fusing any two of the first facial affective feature, the second facial affective feature and the scene feature at the feature level into a new feature.
Fig. 4 is a schematic diagram of the network structure of a convolutional neural network provided by an embodiment of the present invention; the first convolutional neural network described above may adopt the network structure shown in Fig. 4. In Fig. 4, the convolution module consists in turn of: convolutional layer → batch normalization (BN, Batch Normalization) layer → activation layer. The difficulty of deep networks lies in the fact that, as training progresses, the parameters of the earlier layers change, which immediately affects the input distribution of all subsequent layers. To mitigate this effect during training, a small learning rate and careful parameter initialization are often chosen, but this greatly reduces training speed and makes it harder to obtain saturated nonlinear models. To alleviate this problem, batch normalization is used as an optimization method to improve the training of deep networks. The basic idea of batch normalization is to make it part of the neural network structure: each layer's batch input is first normalized, and then the subsequent training is performed.
The best way to preprocess the data would be whitening, but whitening is computationally expensive and not everywhere differentiable. Therefore, unlike whitening, each input feature is preprocessed individually so that its mean is 0 and its variance is 1. Suppose a d-dimensional input of a certain layer of the network is $x = (x^{(1)}, \ldots, x^{(d)})$; each dimension is normalized as follows:

$\hat{x}^{(k)} = \dfrac{x^{(k)} - E[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}$

Considering that normalizing every layer's input with the above formula alone may, to some extent, change the features extracted by the preceding layer, a transform-reconstruct step is performed on this basis by introducing learnable parameters β and γ, and the expression becomes:

$y^{(k)} = \gamma^{(k)} \hat{x}^{(k)} + \beta^{(k)}$

Thus, when $\gamma^{(k)} = \sqrt{\mathrm{Var}[x^{(k)}]}$ and $\beta^{(k)} = E[x^{(k)}]$, the original feature distribution obtained by the previous layer of the network can be recovered. The reason it is called batch normalization is, furthermore, that during training the data are fed in units of batches, and the selected stochastic gradient descent algorithm also computes in units of batches; considering training efficiency, the mean and variance in the normalization above are computed per batch. Suppose the input of a batch normalization layer is $B = \{x_{1 \ldots m}\}$ and the output is $\{y_i = BN_{\gamma,\beta}(x_i)\}$; the forward propagation process can be expressed as follows, where m denotes the number of samples in the current batch:

$\mu_B = \dfrac{1}{m}\sum_{i=1}^{m} x_i$, $\quad \sigma_B^2 = \dfrac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$, $\quad \hat{x}_i = \dfrac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$, $\quad y_i = \gamma \hat{x}_i + \beta$

This method relaxes the requirements on training parameter settings and simplifies the training of deep networks; it can also serve as a regularizer of the network parameters, partially substituting for Dropout.
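A minimal NumPy sketch of the batch-normalization forward pass just described (the epsilon term is the usual numerical-stability addition):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """x: (m, d) batch; gamma, beta: (d,) learnable scale and shift."""
    mu = x.mean(axis=0)                    # per-dimension batch mean
    var = x.var(axis=0)                    # per-dimension batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize to mean 0, variance 1
    return gamma * x_hat + beta            # transform-reconstruct step

x = np.random.randn(32, 4) * 3.0 + 5.0     # a batch far from zero mean
y = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))  # ~0 and ~1
```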
Here, the activation function used is the exponential linear unit (ELU, Exponential Linear Unit) function, whose form is as follows, with α a positive hyperparameter:

$f(x) = \begin{cases} x, & x > 0 \\ \alpha(e^{x} - 1), & x \le 0 \end{cases}$
In this embodiment, the Inception module can be as shown in Fig. 5 and the down-sampling module as shown in Fig. 6, where C is the number of filters.
The convolutional layer is used to extract image features by performing convolution operations on the image. In a convolutional neural network, each convolutional layer usually contains multiple trainable convolution templates (i.e. convolution kernels), and different convolution templates correspond to different image features. After a convolution kernel is convolved with the input image, the result is processed by a nonlinear activation function, such as the sigmoid function, the rectified linear unit (ReLU, Rectified Linear Unit) function or the ELU function, and mapped to the corresponding feature map (Feature Map). The parameters of the convolution kernels are usually calculated with a specific learning algorithm (for example, stochastic gradient descent). Convolution refers to the operation of computing the weighted sum of the template parameters and the pixel values at the corresponding positions of the image. A typical convolution process is shown in Fig. 7: by sliding the template window over all positions of the input image and performing the convolution operation, the corresponding feature map is obtained.
In this embodiment, the advantage of using convolutional neural networks is that they abandon the "full connection" design between adjacent layers of traditional neural networks and instead use local connection and weight sharing, which greatly reduces the number of model parameters to be trained and the amount of computation. Local connection means that each neuron in a convolutional neural network is connected only to a local region of the input image rather than to all neurons. Weight sharing means that the connection parameters (i.e. the convolution kernel parameters) are shared across different regions of the input image. In addition, the local-connection and weight-sharing design makes the features extracted by the network highly stable and insensitive to translation, scaling, deformation and the like.
A pooling layer usually appears in pairs with a convolutional layer, after the convolutional layer, and is used to down-sample the input feature maps. After convolution, an input image usually yields a large number of feature maps, and an excessively high feature dimension causes the computation of the network to surge. By reducing the dimension of the feature maps, the pooling layer greatly reduces the number of model parameters. On the one hand this reduces the computation of the network; on the other hand it also reduces the risk of over-fitting. The feature maps obtained by pooling correspond one-to-one to the feature maps of the convolutional layer, so pooling only reduces the feature map dimension; the number of feature maps does not change.
At present, the common pooling methods in convolutional neural networks are max pooling (Max Pooling), mean pooling (Mean Pooling) and stochastic pooling (Stochastic Pooling). For a sampling subregion, max pooling selects the point with the largest pixel value as the output of the region; mean pooling computes the mean of all pixels in the region and uses the mean as the output of the sampling region; stochastic pooling randomly selects one pixel value from the sampling region as the output, where usually the larger the pixel value, the higher the probability of being selected. The max pooling process is shown in Fig. 8.
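A minimal NumPy sketch of the 2 × 2 max pooling of Fig. 8 (window size and stride of 2 assumed):

```python
import numpy as np

def max_pool_2x2(fm):
    """fm: (H, W) feature map with even H and W; returns (H/2, W/2)."""
    h, w = fm.shape
    blocks = fm.reshape(h // 2, 2, w // 2, 2)  # split into 2x2 windows
    return blocks.max(axis=(1, 3))             # keep the max of each window

fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 0],
               [7, 2, 9, 8],
               [0, 1, 3, 4]])
print(max_pool_2x2(fm))  # [[6 4]
                         #  [7 9]]
```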
Fig. 9 is a schematic diagram of the network structures of ResNet34 and ResNet50 provided by an embodiment of the present invention. The ResNet module in step 203 can use the structure shown in Fig. 10, where BN is Batch Normalization and ReLU is the rectified linear unit (Rectified Linear Unit) function. The ReLU function can take the form of formula (9):

θ(x) = max(0, x)    (9)
Specifically, the second convolutional neural network described above can be the residual network ResNet34. Before carrying out learning and training for the second convolutional neural network, the ImageNet-pretrained residual network ResNet34 is fine-tuned (Finetune) with the existing FER2013 database, comprising:
Step 001: modify the network structure on the basis of the trained ResNet34 network; for example, since the number of classes in FER2013 is 7, set the number of nodes of the last fully-connected layer of the ResNet34 network to 7.
Step 002: freeze the network weights before the average pooling layer, i.e. the coefficients of the convolution kernels do not change during training; train the modified part of the network (such as the above fully-connected layer).
Step 003: unfreeze ResNet module 4.
Step 004: jointly train ResNet module 4 and the part of the network modified in step 001 (such as the above fully-connected layer).
Here, after completing the Finetune on ResNet34 with the FER2013 database, the acquired facial images are input into the network and the output matrix of the average pooling layer is extracted as the second facial affective feature. The low-level encodings in the convolutional layers are general-purpose, reusable features, while the higher-level encodings are more abstract and closely tied to the dataset on which the network was trained. Fine-tuning the higher levels of the network structure is therefore more useful, because the features obtained by these higher layers are closely connected to the new problem to be handled, whereas fine-tuning the well-pretrained lower layers can greatly reduce the effect of the experiment. Since the ImageNet database contains 1000 classes of common objects, animals and so on, with various and complex scenes, and since this step is a facial expression-related task, the FER2013 facial expression database is used to finetune the ResNet34 network pretrained on the ImageNet database; the finetuned network is then used to extract facial expression features, taking the features output by the average pooling layer as the second facial affective feature.
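A sketch of the Finetune procedure of steps 001-004, assuming PyTorch/torchvision, where torchvision's layer4 is taken to correspond to "ResNet module 4":

```python
import torch.nn as nn
from torchvision import models

model = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 7)  # step 001: 7 FER2013 classes

for p in model.parameters():                   # step 002: freeze all weights...
    p.requires_grad = False
for p in model.fc.parameters():                # ...except the modified head
    p.requires_grad = True
# (train the head on FER2013 here)

for p in model.layer4.parameters():            # step 003: unfreeze module 4
    p.requires_grad = True
# step 004: train layer4 and the head jointly here.

# After Finetune, drop the head and use the average-pooling output
# as the second facial affective feature.
feature_extractor = nn.Sequential(*list(model.children())[:-1])
```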
Specifically, the third convolutional neural network described above can be the residual network ResNet50.
Before carrying out learning and training for the third convolutional neural network, the ImageNet-pretrained residual network ResNet50 is finetuned, comprising:
Step 011: modify the network structure on the basis of the trained ResNet50 network; for example, set the number of nodes of the last fully-connected layer of the ResNet50 network to 3 or 9 (i.e. the number of affective types).
Step 012: freeze the network weights before the average pooling layer, i.e. the coefficients of the convolution kernels do not change during training; train the modified part of the network (such as the above fully-connected layer).
Step 013: unfreeze ResNet module 4.
Step 014: jointly train ResNet module 4 and the part of the network modified in step 011 (such as the above fully-connected layer).
The finetuned ResNet50 can be used as the scene feature model: the scene image extracted from the first image is input into the scene feature model, and the feature output by the average pooling layer of the scene feature model is the scene feature.
In the embodiments of the present invention, the following three fusion methods are provided; feature fusion can be performed with any one of them.
(1) Feature fusion based on kernel canonical correlation analysis (KCCA, Kernel Canonical Correlation Analysis)
Suppose $X = (x_1, x_2, \ldots, x_n)^T \in \mathbb{R}^{n \times p}$ and $Y = (y_1, y_2, \ldots, y_n)^T \in \mathbb{R}^{n \times q}$ are two zero-mean feature matrices; each row $x_i$ and $y_i$ denotes a feature vector, the row count n denotes the size of the dataset characterized by the feature matrices, and p and q denote the dimensions of the two kinds of features. The row-wise feature vectors of the matrices form feature pairs $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, each pair coming from two different modalities. The purpose of canonical correlation analysis is to find mapping matrices $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_d)$ and $\beta = (\beta_1, \beta_2, \ldots, \beta_d)$, $d \le \min(p, q)$, such that the following formula (10) holds:

$\rho = \max_{\alpha, \beta} \dfrac{\alpha^T C_{xy} \beta}{\sqrt{\alpha^T C_{xx} \alpha}\sqrt{\beta^T C_{yy} \beta}}$ (10)

where $C_{xx} = X^T X$, $C_{yy} = Y^T Y$ and $C_{xy} = X^T Y$ denote the auto-covariance matrices of the two feature matrices and their cross-covariance matrix, respectively. The above optimization problem can be converted into solving an eigenvalue problem:

$C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx} \alpha = \lambda^2 \alpha$, $\quad C_{yy}^{-1} C_{yx} C_{xx}^{-1} C_{xy} \beta = \lambda^2 \beta$

Canonical correlation analysis is based on linear spaces and cannot capture nonlinear relationships between the features of different modalities. A kernel method is therefore introduced on the basis of canonical correlation analysis, yielding the kernel canonical correlation analysis (KCCA) method, which adds nonlinearity to the original canonical correlation analysis algorithm. The basic idea of KCCA is similar to the nonlinear support vector machine: the original feature matrices X and Y are mapped to a higher-dimensional space, i.e. the kernel spaces X′ and Y′, and the correlation analysis is performed in the kernel space. The optimization function of kernel canonical correlation analysis is:

$\rho = \max_{\alpha, \beta} \dfrac{\alpha^T K_x K_y \beta}{\sqrt{\alpha^T K_x^2 \alpha}\sqrt{\beta^T K_y^2 \beta}}$ (13)

where $K_x$ and $K_y$ are kernel matrices satisfying $K_x = X'^T X'$ and $K_y = Y'^T Y'$. Similarly to CCA, the solution of the above optimization function can also be converted into an eigenvalue problem. Since matrix inversion is involved in the eigenvalue solution and the invertibility of the kernel matrices cannot be guaranteed, formula (13) is regularized:

$\rho = \max_{\alpha, \beta} \dfrac{\alpha^T K_x K_y \beta}{\sqrt{\alpha^T (K_x + \tau I)^2 \alpha}\sqrt{\beta^T (K_y + \tau I)^2 \beta}}$ (14)

where 0 ≤ τ ≤ 1 is the regularization coefficient.
(2) Feature fusion based on kernel matrix fusion (KMF, Kernel Matrix Fusion)
The idea of kernel matrix fusion is to find, for two different modalities, a common subspace that characterizes the features of the two modalities to the greatest extent.
Suppose $X = (x_1, x_2, \ldots, x_n)^T$ and $Y = (y_1, y_2, \ldots, y_n)^T$ are the features extracted from the sample data of the two modalities, and $K_x$ and $K_y$ are the kernel matrices of the two modalities, satisfying $K_x(i, j) = \kappa(x_i, x_j)$ and $K_y(i, j) = \kappa(y_i, y_j)$, where $\kappa(\cdot, \cdot)$ denotes the kernel function. Kernel matrix fusion combines the two kernel matrices by algebraic operations; the weighted sum or the element-wise product of the kernel matrices can be chosen as the elements of the combined kernel matrix. The fused matrix can be expressed as:

$K_f = a K_x + b K_y$ (6)

where a + b = 1. After obtaining the fused matrix, dimensionality reduction can be performed on it by traditional kernel methods to obtain the compressed feature values.
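A sketch of the kernel matrix fusion of formula (6), assuming RBF kernels and scikit-learn; the weights a and b are illustrative:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 512))  # modality 1 features (e.g. face)
Y = rng.normal(size=(100, 512))  # modality 2 features (e.g. scene)

Kx, Ky = rbf_kernel(X), rbf_kernel(Y)  # two n x n kernel matrices
a, b = 0.5, 0.5                        # a + b = 1
Kf = a * Kx + b * Ky                   # fused kernel matrix, formula (6)

# Dimensionality reduction on the fused kernel, e.g. kernel PCA:
Z = KernelPCA(n_components=64, kernel="precomputed").fit_transform(Kf)
print(Z.shape)  # (100, 64) compressed fused features
```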
(3) Feature fusion based on kernel cross-modal factor analysis (KCFA, Kernel Cross-Modal Factor Analysis)
The solution of the projection matrices in canonical correlation analysis (CCA, Canonical Correlation Analysis) can be converted into solving an eigenvalue problem, but this correspondingly requires the covariance matrices to be invertible, which limits the application of canonical correlation analysis to some extent.
Cross-modal factor analysis improves on canonical correlation analysis; its goal is to find the mapping matrices that minimize the Frobenius norm in the projection space. Suppose the feature matrices of the two modalities are $X \in \mathbb{R}^{n \times p}$ and $Y \in \mathbb{R}^{n \times q}$, and the corresponding linear transformation matrices are $U = (u_1, u_2, \ldots, u_d)$ and $V = (v_1, v_2, \ldots, v_d)$ with $d \le \min(p, q)$; the optimization function of cross-modal factor analysis is then:

$\min_{U, V} \|X U - Y V\|_F^2 \quad \text{s.t. } U^T U = I,\ V^T V = I$ (16)

where $\|\cdot\|_F$ denotes the Frobenius norm of the input matrix and I denotes the identity matrix. From the properties of the Frobenius norm it is known that

$\|X U - Y V\|_F^2 = \mathrm{tr}(X X^T) + \mathrm{tr}(Y Y^T) - 2\,\mathrm{tr}(X U V^T Y^T)$

where tr(·) denotes the trace of a matrix. Since X and Y are given feature matrices, $\mathrm{tr}(X X^T)$ and $\mathrm{tr}(Y Y^T)$ are constants, so formula (16) can be simplified as:

$\max_{U, V}\ \mathrm{tr}(X U V^T Y^T) \quad \text{s.t. } U^T U = I,\ V^T V = I$ (17)

The solution of the above problem can be converted into a singular value decomposition: let $X^T Y = S_{xy} \Lambda_{xy} D_{xy}^T$; the corresponding transformation matrices are then $U = S_{xy}$ and $V = D_{xy}$.
Basic cross-modal factor analysis can only learn the linear characteristics of the two modalities; by the kernel method it can likewise be extended to kernel cross-modal factor analysis (KCFA). Suppose X′ and Y′ are the feature matrices after X and Y are nonlinearly mapped to a higher-dimensional space, and $K_x$, $K_y$ are the corresponding kernel matrices. Similarly to cross-modal factor analysis, kernel cross-modal factor analysis can also be converted into solving the singular value decomposition problem of $X'^T Y'$.
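A sketch of the linear cross-modal factor analysis solution of formula (17) via singular value decomposition (NumPy; dimensions illustrative):

```python
import numpy as np

def cfa(X, Y, d):
    """Orthogonal U, V maximizing tr(X U V' Y'), via the SVD of X'Y."""
    S, _, Vt = np.linalg.svd(X.T @ Y)  # X'Y = S diag(sigma) Vt
    return S[:, :d], Vt.T[:, :d]       # U = left, V = right singular vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 128))  # modality 1 feature matrix
Y = rng.normal(size=(200, 64))   # modality 2 feature matrix
U, V = cfa(X, Y, d=32)           # d <= min(p, q)
Zx, Zy = X @ U, Y @ V            # aligned projections of the two modalities
```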
The new features obtained in this way can still undergo decision-level fusion, to improve the accuracy of the final discrimination result.
Figure 11 is a kind of structural schematic diagram of crowd's sentiment analysis device provided in an embodiment of the present invention;As shown in figure 11, Described device includes: first processing module 301, Second processing module 302, third processing module 303 and fourth processing module 304。
Wherein, the first processing module 301, for obtaining the first image;It include at least two in the first image Face.
The Second processing module 302, for determining the scene of the first image based on preset scene characteristic model Feature determines at least one affective characteristics of the first image based at least one preset face affective characteristics model.
The third processing module 303, for according to the scene characteristic and at least one described affective characteristics construct to Few two classifiers obtain at least two prediction results according at least two classifier;Described in the prediction result characterization The corresponding probability of at least one affective style of first image appearance.
The fourth processing module 304, for determining the people of the first image according at least two prediction result Group's affective style.
Specifically, described device further include: preprocessing module, for determining scene characteristic model and at least one face Affective characteristics model.
The preprocessing module determines the sample of the preset quantity specifically for obtaining the sample image of preset quantity The affective tag of each sample image in image;The affective style of the affective tag characterization sample image;According to described default The sample image of quantity and the corresponding affective tag of each sample image carry out learning training, obtain the scene characteristic model And at least one face affective characteristics model.
Specifically, the preprocessing module, be also used to for the sample image of the preset quantity being divided into the first image set and Second image set;The first image collection includes at least one first sample image, and second image set includes at least one Second sample image.
The preprocessing module, specifically for extract at least two first facial images in the first sample image and At least two second facial images in second sample image, according at least two extracted in the first sample image The affective tag of first facial image and the first sample image carry out propagated forward based on the first convolutional neural networks, with And the emotion mark according at least two second facial images and second sample image that are extracted in second sample image Label carry out the back-propagating based on first convolutional neural networks, obtain the first face affective characteristics model;And/or
To being mentioned at least two first facial images and second sample image extracted in the first sample image At least two second facial images taken are adjusted, according at least two extracted in the first sample image adjusted The affective tag of first facial image and the first sample image carry out propagated forward based on the second convolutional neural networks, with And according at least two second facial images and second sample image extracted in second sample image adjusted Affective tag carry out the back-propagating based on second convolutional neural networks, obtain the second face affective characteristics model.
Specifically, the preprocessing module is specifically configured to extract the scene image of the first sample image and the scene image of the second sample image; to carry out forward propagation based on a third convolutional neural network according to the scene image of the first sample image and the affective tag of the first sample image, and to carry out back-propagation based on the third convolutional neural network according to the scene image of the second sample image and the affective tag of the second sample image, thereby obtaining the scene feature model.
Specifically, the third processing module 303 is specifically configured to construct a first classifier according to the scene feature and to construct at least one second classifier according to the at least one affective feature; to classify the emotion types exhibited by the first image with the first classifier and determine, from the classification result, the probability corresponding to at least one emotion type exhibited by the first image, as a first prediction result; and to classify the emotion types exhibited by the first image with the at least one second classifier and determine, from the at least one classification result, at least one set of probabilities corresponding to at least one emotion type exhibited by the first image, as at least one second prediction result.
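In other words, each classifier maps one extracted feature to a probability for every emotion type; the scene classifier yields the first prediction result and each face-emotion classifier yields a second prediction result. A sketch of such a classifier head follows, assuming fixed-length feature vectors and softmax outputs; the classifier form and the number of emotion types are assumptions, not fixed by the text.

import torch
import torch.nn as nn

NUM_EMOTION_TYPES = 5        # illustrative only; the text does not fix it

class EmotionClassifier(nn.Module):
    # A linear-plus-softmax head on top of one extracted feature vector.
    def __init__(self, feature_dim):
        super().__init__()
        self.head = nn.Linear(feature_dim, NUM_EMOTION_TYPES)

    def forward(self, feature):
        # Probability corresponding to each emotion type for the first image.
        return torch.softmax(self.head(feature), dim=-1)

# first_pred   = EmotionClassifier(scene_dim)(scene_feat)
# second_preds = [EmotionClassifier(dim)(feat) for dim, feat in affective]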
Specifically, the fourth processing module 304 is specifically configured to determine the preset weights corresponding to the scene feature and the at least one affective feature; to determine, according to the preset weights and the at least two prediction results, the target probability corresponding to at least one emotion type exhibited by the first image; and to determine the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
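Read this way, decision-level fusion is a weighted sum of the per-classifier probabilities followed by an argmax over emotion types. A sketch under that reading follows; the example weights are assumptions, not values from the text.

import torch

def fuse_predictions(predictions, weights):
    # predictions: one probability tensor per classifier (same shape);
    # weights: the preset weight assigned to each feature's classifier.
    target_prob = sum(w * p for w, p in zip(weights, predictions))
    crowd_emotion_type = int(target_prob.argmax())
    return target_prob, crowd_emotion_type

# e.g. scene classifier weighted 0.4 and two face classifiers 0.3 each:
# probs, emotion = fuse_predictions([p_scene, p_face1, p_face2],
#                                   [0.4, 0.3, 0.3])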
Specifically, the second processing module 302 is further configured to perform feature-level fusion of any two of the scene feature and the at least one affective feature.
Correspondingly, the third processing module 303 is further configured to construct at least three classifiers according to the fused feature, the scene feature and the at least one affective feature, and to obtain at least three prediction results according to the at least three classifiers, where the prediction results characterize the probability corresponding to at least one emotion type exhibited by the first image.
Specifically, the fourth processing module 304 is specifically configured to determine the preset weights corresponding to the fused feature, the scene feature and the at least one affective feature; to determine, according to the preset weights and the at least three prediction results, the target probability corresponding to at least one emotion type exhibited by the first image; and to determine the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
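One simple realization of the feature-level fusion described for these modules is concatenation of two feature vectors; the fused feature then gets its own classifier and enters the weighted decision fusion as an additional prediction result. The sketch below reuses the illustrative helpers introduced earlier; concatenation is an assumption, since the text does not fix the fusion operator.

import torch

def feature_level_fusion(feature_a, feature_b):
    # Fuse any two of {scene feature, affective features} into a new feature.
    return torch.cat([feature_a, feature_b], dim=-1)

# fused      = feature_level_fusion(scene_feat, affective_feat)
# fused_pred = EmotionClassifier(fused.shape[-1])(fused)
# probs, emotion = fuse_predictions([scene_pred, face_pred, fused_pred],
#                                   [0.3, 0.3, 0.4])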
It should be noted that, when the crowd sentiment analysis device provided in the above embodiment performs crowd sentiment analysis, the division into the above program modules is merely illustrative. In practical applications, the above processing can be allocated to different program modules as needed; that is, the internal structure of the device can be divided into different program modules to complete all or part of the processing described above. In addition, the crowd sentiment analysis device provided in the above embodiment and the crowd sentiment analysis method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which will not be repeated here.
Figure 12 is a structural schematic diagram of another crowd sentiment analysis device provided in an embodiment of the present invention; the crowd sentiment analysis device can be applied to a server. As shown in Figure 12, the device 40 includes a processor 401 and a memory 402 for storing a computer program that can run on the processor. The processor 401 is configured, when running the computer program, to execute: obtaining a first image, where the first image includes at least two faces; determining a scene feature of the first image based on a preset scene feature model, and determining at least one affective feature of the first image based on at least one preset face affective feature model; constructing at least two classifiers according to the scene feature and the at least one affective feature, and obtaining at least two prediction results according to the at least two classifiers, where the prediction results characterize the probability corresponding to at least one emotion type exhibited by the first image; and determining the crowd emotion type of the first image according to the at least two prediction results.
In an embodiment, the processor 401 is further configured, when running the computer program, to execute: determining the scene feature model and the at least one face affective feature model. The determining includes: obtaining a preset number of sample images and determining the affective tag of each sample image, where the affective tag characterizes the emotion type of the sample image; and performing learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the scene feature model and the at least one face affective feature model.
In an embodiment, the processor 401 is further configured, when running the computer program, to execute: dividing the preset number of sample images into a first image set and a second image set, where the first image set includes at least one first sample image and the second image set includes at least one second sample image. Performing learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the at least one face affective feature model, includes: extracting at least two first face images from the first sample image and at least two second face images from the second sample image; carrying out forward propagation based on a first convolutional neural network according to the at least two first face images extracted from the first sample image and the affective tag of the first sample image, and carrying out back-propagation based on the first convolutional neural network according to the at least two second face images extracted from the second sample image and the affective tag of the second sample image, to obtain a first face affective feature model; and/or adjusting the at least two first face images extracted from the first sample image and the at least two second face images extracted from the second sample image; carrying out forward propagation based on a second convolutional neural network according to the adjusted at least two first face images and the affective tag of the first sample image, and carrying out back-propagation based on the second convolutional neural network according to the adjusted at least two second face images and the affective tag of the second sample image, to obtain a second face affective feature model.
In an embodiment, the processor 401 is further configured, when running the computer program, to execute: extracting the scene image of the first sample image and the scene image of the second sample image; carrying out forward propagation based on a third convolutional neural network according to the scene image of the first sample image and the affective tag of the first sample image, and carrying out back-propagation based on the third convolutional neural network according to the scene image of the second sample image and the affective tag of the second sample image, to obtain the scene feature model.
In an embodiment, the processor 401 is further configured, when running the computer program, to execute: constructing a first classifier according to the scene feature, and constructing at least one second classifier according to the at least one affective feature; classifying the emotion types exhibited by the first image with the first classifier, and determining, from the classification result, the probability corresponding to at least one emotion type exhibited by the first image, as a first prediction result; and classifying the emotion types exhibited by the first image with the at least one second classifier, and determining, from the at least one classification result, at least one set of probabilities corresponding to at least one emotion type exhibited by the first image, as at least one second prediction result.
In an embodiment, the processor 401 is further configured, when running the computer program, to execute: determining the preset weights corresponding to the scene feature and the at least one affective feature; determining, according to the preset weights and the at least two prediction results, the target probability corresponding to at least one emotion type exhibited by the first image; and determining the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
In an embodiment, the processor 401 is further configured, when running the computer program, to execute: performing feature-level fusion of any two of the scene feature and the at least one affective feature. Correspondingly, constructing at least two classifiers according to the scene feature and the at least one affective feature and obtaining at least two prediction results according to the at least two classifiers includes: constructing at least three classifiers according to the fused feature, the scene feature and the at least one affective feature, and obtaining at least three prediction results according to the at least three classifiers, where the prediction results characterize the probability corresponding to at least one emotion type exhibited by the first image.
In an embodiment, the processor 401 is further configured, when running the computer program, to execute: determining the preset weights corresponding to the fused feature, the scene feature and the at least one affective feature; determining, according to the preset weights and the at least three prediction results, the target probability corresponding to at least one emotion type exhibited by the first image; and determining the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
It should be noted that the crowd sentiment analysis device provided in the above embodiment and the crowd sentiment analysis method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which will not be repeated here.
In practical applications, the device 40 may further include at least one network interface 403. The various components of the crowd sentiment analysis device 40 are coupled through a bus system 404. It can be understood that the bus system 404 is used to realize connection and communication between these components. In addition to a data bus, the bus system 404 also includes a power bus, a control bus and a status signal bus. For clarity of description, however, the various buses are all designated as the bus system 404 in Figure 12. The number of processors 401 may be at least one. The network interface 403 is used for wired or wireless communication between the crowd sentiment analysis device 40 and other devices.
The memory 402 in the embodiment of the present invention is used to store various types of data to support the operation of the crowd sentiment analysis device 40.
The methods disclosed in the above embodiments of the present invention may be applied to, or realized by, the processor 401. The processor 401 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above methods can be completed by an integrated logic circuit of hardware in the processor 401 or by instructions in the form of software. The processor 401 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc. The processor 401 may implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present invention may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium, the storage medium being located in the memory 402; the processor 401 reads the information in the memory 402 and completes the steps of the foregoing methods in combination with its hardware.
In an exemplary embodiment, the crowd sentiment analysis device 40 may be realized by one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors (Microprocessor) or other electronic elements, for executing the foregoing methods.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, it executes: obtaining a first image, where the first image includes at least two faces; determining a scene feature of the first image based on a preset scene feature model, and determining at least one affective feature of the first image based on at least one preset face affective feature model; constructing at least two classifiers according to the scene feature and the at least one affective feature, and obtaining at least two prediction results according to the at least two classifiers, where the prediction results characterize the probability corresponding to at least one emotion type exhibited by the first image; and determining the crowd emotion type of the first image according to the at least two prediction results.
In an embodiment, when the computer program is run by a processor, it executes: determining the scene feature model and the at least one face affective feature model, including: obtaining a preset number of sample images and determining the affective tag of each sample image, where the affective tag characterizes the emotion type of the sample image; and performing learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the scene feature model and the at least one face affective feature model.
In an embodiment, when the computer program is run by a processor, it executes: dividing the preset number of sample images into a first image set and a second image set, where the first image set includes at least one first sample image and the second image set includes at least one second sample image. Performing learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the at least one face affective feature model, includes: extracting at least two first face images from the first sample image and at least two second face images from the second sample image; carrying out forward propagation based on a first convolutional neural network according to the at least two first face images extracted from the first sample image and the affective tag of the first sample image, and carrying out back-propagation based on the first convolutional neural network according to the at least two second face images extracted from the second sample image and the affective tag of the second sample image, to obtain a first face affective feature model; and/or adjusting the at least two first face images extracted from the first sample image and the at least two second face images extracted from the second sample image; carrying out forward propagation based on a second convolutional neural network according to the adjusted at least two first face images and the affective tag of the first sample image, and carrying out back-propagation based on the second convolutional neural network according to the adjusted at least two second face images and the affective tag of the second sample image, to obtain a second face affective feature model.
In an embodiment, when the computer program is run by a processor, it executes: extracting the scene image of the first sample image and the scene image of the second sample image; carrying out forward propagation based on a third convolutional neural network according to the scene image of the first sample image and the affective tag of the first sample image, and carrying out back-propagation based on the third convolutional neural network according to the scene image of the second sample image and the affective tag of the second sample image, to obtain the scene feature model.
In an embodiment, when the computer program is run by a processor, it executes: constructing a first classifier according to the scene feature, and constructing at least one second classifier according to the at least one affective feature; classifying the emotion types exhibited by the first image with the first classifier, and determining, from the classification result, the probability corresponding to at least one emotion type exhibited by the first image, as a first prediction result; and classifying the emotion types exhibited by the first image with the at least one second classifier, and determining, from the at least one classification result, at least one set of probabilities corresponding to at least one emotion type exhibited by the first image, as at least one second prediction result.
In an embodiment, when the computer program is run by a processor, it executes: determining the preset weights corresponding to the scene feature and the at least one affective feature; determining, according to the preset weights and the at least two prediction results, the target probability corresponding to at least one emotion type exhibited by the first image; and determining the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
In an embodiment, when the computer program is run by a processor, it executes: performing feature-level fusion of any two of the scene feature and the at least one affective feature. Correspondingly, constructing at least two classifiers according to the scene feature and the at least one affective feature and obtaining at least two prediction results according to the at least two classifiers includes: constructing at least three classifiers according to the fused feature, the scene feature and the at least one affective feature, and obtaining at least three prediction results according to the at least three classifiers, where the prediction results characterize the probability corresponding to at least one emotion type exhibited by the first image.
In an embodiment, when the computer program is run by a processor, it executes: determining the preset weights corresponding to the fused feature, the scene feature and the at least one affective feature; determining, according to the preset weights and the at least three prediction results, the target probability corresponding to at least one emotion type exhibited by the first image; and determining the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be realized in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division; there may be other division manners in actual implementation, for example: multiple units or components may be combined, or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be realized in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be completed by hardware related to program instructions. The foregoing program can be stored in a computer-readable storage medium; when the program is executed, the steps of the above method embodiments are executed. The foregoing storage medium includes: a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, and other media that can store program code.
Alternatively, if the above integrated unit of the present invention is realized in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence, or the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods of the embodiments of the present invention. The foregoing storage medium includes: a removable storage device, a ROM, a RAM, a magnetic disk, an optical disc, and other media that can store program code.
The above is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims (18)

1. A crowd sentiment analysis method, characterized in that the method comprises:
obtaining a first image, the first image including at least two faces;
determining a scene feature of the first image based on a preset scene feature model, and determining at least one affective feature of the first image based on at least one preset face affective feature model;
constructing at least two classifiers according to the scene feature and the at least one affective feature, and obtaining at least two prediction results according to the at least two classifiers, the prediction results characterizing the probability corresponding to at least one emotion type exhibited by the first image;
determining the crowd emotion type of the first image according to the at least two prediction results.
2. The method according to claim 1, characterized in that, before determining the scene feature of the first image based on the preset scene feature model and determining the at least one affective feature of the first image based on the at least one preset face affective feature model, the method further comprises: determining the scene feature model and the at least one face affective feature model;
the determining the scene feature model and the at least one face affective feature model comprising:
obtaining a preset number of sample images, and determining the affective tag of each sample image, the affective tag characterizing the emotion type of the sample image;
performing learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the scene feature model and the at least one face affective feature model.
3. The method according to claim 2, characterized in that the method further comprises: dividing the preset number of sample images into a first image set and a second image set, the first image set including at least one first sample image and the second image set including at least one second sample image;
the performing learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the at least one face affective feature model, comprising:
extracting at least two first face images from the first sample image and at least two second face images from the second sample image; carrying out forward propagation based on a first convolutional neural network according to the at least two first face images extracted from the first sample image and the affective tag of the first sample image, and carrying out back-propagation based on the first convolutional neural network according to the at least two second face images extracted from the second sample image and the affective tag of the second sample image, to obtain a first face affective feature model; and/or
adjusting the at least two first face images extracted from the first sample image and the at least two second face images extracted from the second sample image; carrying out forward propagation based on a second convolutional neural network according to the adjusted at least two first face images and the affective tag of the first sample image, and carrying out back-propagation based on the second convolutional neural network according to the adjusted at least two second face images and the affective tag of the second sample image, to obtain a second face affective feature model.
4. The method according to claim 3, characterized in that performing learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the scene feature model, comprises:
extracting the scene image of the first sample image and the scene image of the second sample image; carrying out forward propagation based on a third convolutional neural network according to the scene image of the first sample image and the affective tag of the first sample image, and carrying out back-propagation based on the third convolutional neural network according to the scene image of the second sample image and the affective tag of the second sample image, to obtain the scene feature model.
5. The method according to claim 1, characterized in that constructing at least two classifiers according to the scene feature and the at least one affective feature, and obtaining at least two prediction results according to the at least two classifiers, comprises:
constructing a first classifier according to the scene feature, and constructing at least one second classifier according to the at least one affective feature;
classifying the emotion types exhibited by the first image with the first classifier, and determining, from the classification result, the probability corresponding to at least one emotion type exhibited by the first image, as a first prediction result;
classifying the emotion types exhibited by the first image with the at least one second classifier, and determining, from the at least one classification result, at least one set of probabilities corresponding to at least one emotion type exhibited by the first image, as at least one second prediction result.
6. The method according to claim 1, characterized in that determining the crowd emotion type of the first image according to the at least two prediction results comprises:
determining the preset weights corresponding to the scene feature and the at least one affective feature;
determining, according to the preset weights and the at least two prediction results, the target probability corresponding to at least one emotion type exhibited by the first image;
determining the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
7. The method according to claim 1, characterized in that the method further comprises: performing feature-level fusion of any two of the scene feature and the at least one affective feature;
correspondingly, constructing at least two classifiers according to the scene feature and the at least one affective feature, and obtaining at least two prediction results according to the at least two classifiers, comprising:
constructing at least three classifiers according to the fused feature, the scene feature and the at least one affective feature, and obtaining at least three prediction results according to the at least three classifiers, the prediction results characterizing the probability corresponding to at least one emotion type exhibited by the first image.
8. The method according to claim 7, characterized in that determining the crowd emotion type of the first image according to the at least two prediction results comprises:
determining the preset weights corresponding to the fused feature, the scene feature and the at least one affective feature;
determining, according to the preset weights and the at least three prediction results, the target probability corresponding to at least one emotion type exhibited by the first image;
determining the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
9. A crowd sentiment analysis device, characterized in that the device comprises: a first processing module, a second processing module, a third processing module and a fourth processing module; wherein,
the first processing module is configured to obtain a first image, the first image including at least two faces;
the second processing module is configured to determine a scene feature of the first image based on a preset scene feature model, and to determine at least one affective feature of the first image based on at least one preset face affective feature model;
the third processing module is configured to construct at least two classifiers according to the scene feature and the at least one affective feature, and to obtain at least two prediction results according to the at least two classifiers, the prediction results characterizing the probability corresponding to at least one emotion type exhibited by the first image;
the fourth processing module is configured to determine the crowd emotion type of the first image according to the at least two prediction results.
10. The device according to claim 9, characterized in that the device further comprises a preprocessing module configured to determine the scene feature model and the at least one face affective feature model;
the preprocessing module being specifically configured to obtain a preset number of sample images, and determine the affective tag of each sample image, the affective tag characterizing the emotion type of the sample image;
and to perform learning and training according to the preset number of sample images and the affective tag corresponding to each sample image, to obtain the scene feature model and the at least one face affective feature model.
11. The device according to claim 10, characterized in that the preprocessing module is further configured to divide the preset number of sample images into a first image set and a second image set, the first image set including at least one first sample image and the second image set including at least one second sample image;
the preprocessing module being specifically configured to extract at least two first face images from the first sample image and at least two second face images from the second sample image; to carry out forward propagation based on a first convolutional neural network according to the at least two first face images extracted from the first sample image and the affective tag of the first sample image, and to carry out back-propagation based on the first convolutional neural network according to the at least two second face images extracted from the second sample image and the affective tag of the second sample image, to obtain a first face affective feature model; and/or
to adjust the at least two first face images extracted from the first sample image and the at least two second face images extracted from the second sample image; to carry out forward propagation based on a second convolutional neural network according to the adjusted at least two first face images and the affective tag of the first sample image, and to carry out back-propagation based on the second convolutional neural network according to the adjusted at least two second face images and the affective tag of the second sample image, to obtain a second face affective feature model.
12. The device according to claim 11, characterized in that the preprocessing module is specifically configured to extract the scene image of the first sample image and the scene image of the second sample image; to carry out forward propagation based on a third convolutional neural network according to the scene image of the first sample image and the affective tag of the first sample image, and to carry out back-propagation based on the third convolutional neural network according to the scene image of the second sample image and the affective tag of the second sample image, to obtain the scene feature model.
13. The device according to claim 9, characterized in that the third processing module is specifically configured to construct a first classifier according to the scene feature, and to construct at least one second classifier according to the at least one affective feature;
to classify the emotion types exhibited by the first image with the first classifier, and determine, from the classification result, the probability corresponding to at least one emotion type exhibited by the first image, as a first prediction result;
and to classify the emotion types exhibited by the first image with the at least one second classifier, and determine, from the at least one classification result, at least one set of probabilities corresponding to at least one emotion type exhibited by the first image, as at least one second prediction result.
14. The device according to claim 9, characterized in that the fourth processing module is specifically configured to determine the preset weights corresponding to the scene feature and the at least one affective feature;
to determine, according to the preset weights and the at least two prediction results, the target probability corresponding to at least one emotion type exhibited by the first image;
and to determine the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
15. The device according to claim 9, characterized in that the second processing module is further configured to perform feature-level fusion of any two of the scene feature and the at least one affective feature;
correspondingly, the third processing module being further configured to construct at least three classifiers according to the fused feature, the scene feature and the at least one affective feature, and to obtain at least three prediction results according to the at least three classifiers, the prediction results characterizing the probability corresponding to at least one emotion type exhibited by the first image.
16. The device according to claim 15, characterized in that the fourth processing module is configured to determine the preset weights corresponding to the fused feature, the scene feature and the at least one affective feature;
to determine, according to the preset weights and the at least three prediction results, the target probability corresponding to at least one emotion type exhibited by the first image;
and to determine the crowd emotion type of the first image according to the target probability corresponding to the at least one emotion type.
17. A crowd sentiment analysis device, characterized in that the device comprises: a processor and a memory for storing a computer program that can run on the processor; wherein,
the processor is configured, when running the computer program, to perform the steps of the method according to any one of claims 1 to 8.
18. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 8 are realized.
CN201811191726.2A 2018-10-12 2018-10-12 A kind of crowd's sentiment analysis method, apparatus and storage medium Pending CN109508640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811191726.2A CN109508640A (en) 2018-10-12 2018-10-12 A kind of crowd's sentiment analysis method, apparatus and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811191726.2A CN109508640A (en) 2018-10-12 2018-10-12 A kind of crowd's sentiment analysis method, apparatus and storage medium

Publications (1)

Publication Number Publication Date
CN109508640A true CN109508640A (en) 2019-03-22

Family

ID=65746460

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811191726.2A Pending CN109508640A (en) 2018-10-12 2018-10-12 A kind of crowd's sentiment analysis method, apparatus and storage medium

Country Status (1)

Country Link
CN (1) CN109508640A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1514399A * 2002-11-25 2004-07-21 Eastman Kodak Co Imaging method and system for healthy monitoring and personal safety
US20120308971A1 (en) * 2011-05-31 2012-12-06 Hyun Soon Shin Emotion recognition-based bodyguard system, emotion recognition device, image and sensor control apparatus, personal protection management apparatus, and control methods thereof
CN106803069A (en) * 2016-12-29 2017-06-06 南京邮电大学 Crowd's level of happiness recognition methods based on deep learning
CN107368798A (en) * 2017-07-07 2017-11-21 四川大学 A kind of crowd's Emotion identification method based on deep learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ABBAS A et al.: "Group emotion recognition in the wild by combining deep neural networks for facial expression classification and scene-context analysis", Proceedings of the 19th ACM International Conference on Multimodal Interaction *
LI J et al.: "Happiness level prediction with sequential inputs via multiple regressions", Proceedings of the 18th ACM International Conference on Multimodal Interaction *
XIN GUO et al.: "Group-level emotion recognition using deep models on image scene, faces, and skeletons", Proceedings of the 19th ACM International Conference on Multimodal Interaction *
RAO YUAN: "Research Progress on Affective Computing Based on Semantic Analysis", Journal of Software *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712671A (en) * 2020-12-18 2021-04-27 济南浪潮高新科技投资发展有限公司 Intelligent alarm system and method based on 5G
CN113326723A (en) * 2020-12-24 2021-08-31 杭州海康威视数字技术股份有限公司 Emotion recognition method, device, equipment and system
CN113326723B (en) * 2020-12-24 2024-04-05 杭州海康威视数字技术股份有限公司 Emotion recognition method, device, equipment and system
CN112800875A (en) * 2021-01-14 2021-05-14 北京理工大学 Multi-mode emotion recognition method based on mixed feature fusion and decision fusion
CN112784776A (en) * 2021-01-26 2021-05-11 山西三友和智慧信息技术股份有限公司 BPD facial emotion recognition method based on improved residual error network
CN112784776B (en) * 2021-01-26 2022-07-08 山西三友和智慧信息技术股份有限公司 BPD facial emotion recognition method based on improved residual error network

Similar Documents

Publication Publication Date Title
Rahman et al. A new benchmark on american sign language recognition using convolutional neural network
Mascarenhas et al. A comparison between VGG16, VGG19 and ResNet50 architecture frameworks for Image Classification
CN109409222B (en) Multi-view facial expression recognition method based on mobile terminal
Rao et al. Deep convolutional neural networks for sign language recognition
CN110532920B (en) Face recognition method for small-quantity data set based on FaceNet method
CN109685819B (en) Three-dimensional medical image segmentation method based on feature enhancement
CN111160350B (en) Portrait segmentation method, model training method, device, medium and electronic equipment
CN109508640A (en) A kind of crowd's sentiment analysis method, apparatus and storage medium
CN112464865A (en) Facial expression recognition method based on pixel and geometric mixed features
CN109522925A (en) A kind of image-recognizing method, device and storage medium
CN111832573B (en) Image emotion classification method based on class activation mapping and visual saliency
CN113221663B (en) Real-time sign language intelligent identification method, device and system
CN109886161A (en) A kind of road traffic index identification method based on possibility cluster and convolutional neural networks
Ali et al. Facial emotion detection using neural network
CN109508625A (en) A kind of analysis method and device of affection data
CN110110724A (en) The text authentication code recognition methods of function drive capsule neural network is squeezed based on exponential type
Borgalli et al. Deep learning for facial emotion recognition using custom CNN architecture
Pratama et al. Deep convolutional neural network for hand sign language recognition using model E
Hoque et al. Bdsl36: A dataset for bangladeshi sign letters recognition
Liu et al. Lightweight ViT model for micro-expression recognition enhanced by transfer learning
CN109460485A (en) A kind of image library method for building up, device and storage medium
CN113343773B (en) Facial expression recognition system based on shallow convolutional neural network
Srininvas et al. A framework to recognize the sign language system for deaf and dumb using mining techniques
Kousalya et al. Prediction of Best Optimizer for Facial Expression Detection using Convolutional Neural Network
Dembani et al. UNSUPERVISED FACIAL EXPRESSION DETECTION USING GENETIC ALGORITHM.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190322

RJ01 Rejection of invention patent application after publication