CN109784154A - Emotion recognition method, apparatus, device and medium based on a deep neural network - Google Patents

Emotion recognition method, apparatus, device and medium based on a deep neural network

Info

Publication number
CN109784154A
Authority
CN
China
Prior art keywords
picture
face
identified
data
obtains
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811503089.8A
Other languages
Chinese (zh)
Other versions
CN109784154B (en)
Inventor
盛建达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811503089.8A
Publication of CN109784154A
Application granted
Publication of CN109784154B
Legal status: Active


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Analysis
  • Image Processing

Abstract

The invention discloses an emotion recognition method, apparatus, device and medium based on a deep neural network. The method comprises: receiving a face picture to be recognized; preprocessing the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized; inputting the picture to be recognized into a preset deep residual network model; performing multi-channel data extraction on the picture to be recognized using an input layer to obtain face picture data corresponding to the picture to be recognized; performing feature extraction on the face picture data using a convolutional layer to obtain the face features of the picture to be recognized; and performing regression-based classification on the face features using a fully connected layer to obtain the emotional state of the face in the picture to be recognized. Because the preset deep residual network model predicts face pictures with high accuracy, embodiments of the present invention improve the accuracy of the model in face emotion recognition.

Description

Emotion recognition method, apparatus, device and medium based on a deep neural network
Technical field
The present invention relates to the technical field of biometric recognition, and more particularly to an emotion recognition method, apparatus, device and medium based on a deep neural network.
Background technique
With the development of science and technology, artificial intelligence has also developed rapidly. Face emotion recognition is a key technology in the field of artificial intelligence: it studies how a computer can extract and distinguish human facial expressions from a still image or a video sequence, and it is of great significance to research on human-computer interaction and affective computing. In general, the basic emotions of a human face can be divided into seven kinds: happy, sad, fearful, angry, surprised, disgusted and calm.
Current face emotion recognition methods mainly use a random forest algorithm or an SVM (Support Vector Machine) classifier to perform learning and training on face sample pictures to obtain an emotion recognition model. However, because the classification of expressions in face pictures is complex and governed by intricate rules, existing face emotion recognition methods cannot learn the deeper data features in face pictures, so the recognition accuracy of existing emotion recognition models for a person's emotion is low.
Summary of the invention
Embodiments of the present invention provide an emotion recognition method, apparatus, device and medium based on a deep neural network, to solve the problem that the accuracy of current person emotion recognition is low.
An emotion recognition method based on a deep neural network comprises:
receiving a face picture to be recognized;
preprocessing the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized;
inputting the picture to be recognized into a preset deep residual network model, wherein the preset deep residual network model comprises an input layer, a convolutional layer and a fully connected layer;
performing multi-channel data extraction on the picture to be recognized using the input layer to obtain face picture data corresponding to the picture to be recognized;
performing feature extraction on the face picture data using the convolutional layer to obtain the face features of the picture to be recognized;
performing regression-based classification on the face features using the fully connected layer to obtain a recognition result of the picture to be recognized, wherein the recognition result comprises the emotional state of the face in the picture to be recognized.
An emotion recognition apparatus based on a deep neural network comprises:
a picture receiving module, configured to receive a face picture to be recognized;
a picture processing module, configured to preprocess the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized;
a picture recognition module, configured to input the picture to be recognized into a preset deep residual network model, wherein the preset deep residual network model comprises an input layer, a convolutional layer and a fully connected layer;
a data extraction module, configured to perform multi-channel data extraction on the picture to be recognized using the input layer to obtain face picture data corresponding to the picture to be recognized;
a feature extraction module, configured to perform feature extraction on the face picture data using the convolutional layer to obtain the face features of the picture to be recognized;
an emotion output module, configured to perform regression-based classification on the face features using the fully connected layer to obtain a recognition result of the picture to be recognized, wherein the recognition result comprises the emotional state of the face in the picture to be recognized.
A computer device comprises a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the above emotion recognition method based on a deep neural network.
A computer readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements the above emotion recognition method based on a deep neural network.
In the above emotion recognition method, apparatus, device and medium based on a deep neural network, a face picture to be recognized is received and preprocessed according to a preset processing mode to obtain a picture to be recognized; the preprocessed picture to be recognized is input into a preset deep residual network model; the input layer of the deep residual network model performs multi-channel data extraction on the picture to be recognized to obtain the corresponding face picture data; the convolutional layer then performs feature extraction on the face picture data to obtain the face features of the picture to be recognized; finally, the fully connected layer performs regression-based classification on the face features to obtain the recognition result of the picture to be recognized, thereby determining the emotional state of the face in the picture to be recognized. Preprocessing the picture to be recognized removes noise data from the picture, which improves the recognition speed and accuracy of the model, and the convolutional layer extracts deep feature data from the face in the picture to be recognized, so that the deep residual network model predicts face pictures more accurately and the accuracy of the model in face emotion recognition is improved.
Detailed description of the invention
To explain the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application environment of the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 2 is a flow chart of the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 3 is a detailed flow chart of step S5 of the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 4 is a detailed flow chart of convolutional layer processing in the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 5 is a detailed flow chart of extracting feature data of the picture to be recognized in the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 6 is another detailed flow chart of convolutional layer processing in the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 7 is a detailed flow chart of step S2 of the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 8 is a detailed flow chart of step S6 of the emotion recognition method based on a deep neural network in an embodiment of the present invention;
Fig. 9 is a functional block diagram of the emotion recognition apparatus based on a deep neural network in an embodiment of the present invention;
Fig. 10 is a schematic diagram of a computer device in an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The emotion recognition method based on a deep neural network provided by the embodiments of the present application can be applied in the application environment shown in Fig. 1. The application environment includes a server and a client connected through a network. A user submits a picture to be recognized through the client; the server receives the face picture to be recognized and recognizes it using a preset deep residual network model, obtaining the emotion of the person in the picture. The client may specifically be, but is not limited to, a personal computer, a laptop, a smartphone, a tablet computer or a portable wearable device, and the server may be implemented as an independent server or as a server cluster composed of multiple servers. The emotion recognition method based on a deep neural network provided by the embodiments of the present invention is applied to the server.
In one embodiment, Fig. 2 shows a flow chart of the emotion recognition method based on a deep neural network in this embodiment. The method is applied to the server in Fig. 1 to recognize the emotion of the person in a face picture and improve the accuracy of face emotion recognition. As shown in Fig. 2, the emotion recognition method based on a deep neural network includes steps S1 to S6, detailed as follows:
S1: Receive a face picture to be recognized.
In this embodiment, the server receives, over the network, the face picture to be recognized sent by the client. The face picture to be recognized is a face picture for which the emotion of the person in the picture needs to be recognized. Its picture format includes, but is not limited to, jpg, png and jpeg. It may specifically be a face picture that the client obtained from the Internet, a face picture taken by the user through the client, a single-frame face picture captured by a camera, and so on.
S2: Preprocess the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized.
Specifically, the preset processing mode is a preset way of converting the size, gray level, shape and the like of the face picture to be recognized, used to convert the face picture to be recognized into a picture to be recognized of a default specification. The default specification includes, but is not limited to, a preset size, a preset gray level and a preset shape, so that subsequent picture processing is more efficient and the data processing efficiency for pictures is improved.
The default specification of the picture to be recognized can be configured according to the needs of the practical application and is not restricted here. For example, the pixel size of the picture to be recognized may be set to 168*168, 256*256, and so on.
Specifically, the server first obtains the face region in the face picture to be recognized using a preset face detection algorithm, which can detect the face region according to the facial features in the picture. The region where the face is located is cropped out of the face picture to be recognized to obtain a cropped face picture; the pixel size of the cropped face picture is then converted to the preset size; graying, denoising and similar processing are then applied to the picture of the preset size to eliminate the noise information in the face picture to be recognized, enhance the detectability of the information relevant to the face and simplify the image data. The preprocessed picture is taken as the picture to be recognized, which completes the preprocessing of the face picture to be recognized.
For example, the pixel size of the picture to be recognized may be set in advance to 168*168. For a face picture to be recognized of size [1280, 720], the region of the face is detected by the preset face detection algorithm and cropped out of the picture; the cropped face picture is then converted to a picture of size [168, 168], and graying, denoising and similar processing are applied to the picture of the preset size, yielding a picture to be recognized of the default specification.
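As a rough illustration of the preprocessing just described (face detection, cropping, resizing to a preset size such as 168*168, graying and denoising), the following sketch uses OpenCV and its bundled Haar cascade detector; both are illustrative assumptions, since the disclosure does not fix a particular face detection algorithm:

```python
import cv2

def preprocess(path, size=168):
    """Detect the face region, crop it, resize to the preset size, gray and denoise.

    A minimal sketch of the preset processing mode; the Haar cascade detector
    and the 168x168 size are illustrative assumptions, not mandated here.
    """
    image = cv2.imread(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        raise ValueError("no face detected")
    x, y, w, h = faces[0]                    # region where the face is located
    face = gray[y:y + h, x:x + w]            # crop out the face region
    face = cv2.resize(face, (size, size))    # convert to the preset pixel size
    return cv2.medianBlur(face, 3)           # eliminate isolated noise points
```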
S3: Input the picture to be recognized into a preset deep residual network model, wherein the preset deep residual network model comprises an input layer, a convolutional layer and a fully connected layer.
In this embodiment, the preset deep residual network model is a neural network model built on ResNet (Residual Network). ResNet is a deep learning neural network; a deep residual network model introduces a deep residual learning framework into the ResNet network structure for training the machine model, producing a deep learning model that solves the degradation problem.
It should be understood that, for a deep learning neural network, learning ability is enhanced as depth increases, so a deep network can perform better than a shallower one. However, as the network deepens, gradients vanish, which leads to the degradation problem (degradation: a deeper network performs worse than a shallower one) and harms the learning effect of the deep learning neural network. Therefore, the deep residual network model introduces deep residual connections into the ResNet network structure to solve the degradation problem and obtain a better prediction effect.
Specifically, the preset deep residual network model comprises an input layer, a convolutional layer and a fully connected layer: the input layer is the network layer used to extract the channel data of the picture, the convolutional layer is the network layer used to extract the feature information of the picture, and the fully connected layer is the network layer used to perform regression analysis on the extracted feature information.
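Purely as an illustrative aid, the three-layer structure just described could be sketched as follows in PyTorch; the framework choice, the channel width of 64, the count of 16 units and the simplified residual unit are all assumptions, not part of the disclosure:

```python
import torch
import torch.nn as nn

class BasicResidualUnit(nn.Module):
    """A simplified convolution unit whose output is superposed with its input."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x) + x)  # jump structure: add the input back

class EmotionResNet(nn.Module):
    """Input layer -> convolutional layer -> fully connected layer."""
    def __init__(self, num_emotions=8):
        super().__init__()
        # input layer: takes the 3-channel (R, G, B) picture data
        self.input_layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
        # convolutional layer: stacked convolution units extract the face features
        self.conv_layer = nn.Sequential(*[BasicResidualUnit(64) for _ in range(16)])
        # fully connected layer: regression analysis over the extracted features
        self.fc = nn.Linear(64, num_emotions)

    def forward(self, x):
        x = self.input_layer(x)
        x = self.conv_layer(x)
        x = x.mean(dim=(2, 3))               # pool over spatial positions
        return torch.softmax(self.fc(x), dim=1)
```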
S4: Perform multi-channel data extraction on the picture to be recognized using the input layer to obtain face picture data corresponding to the picture to be recognized.
Specifically, in the preset deep residual network model, multi-channel data is extracted from the picture to be recognized through the channels preset in the input layer, yielding the face picture data that describes the picture to be recognized. In this way the picture information is digitized, which facilitates recognition and analysis by the machine model.
Here, multi-channel data means that if multiple numerical values are needed to describe a single pixel in the picture, the vector composed of those values is the multi-channel data of the picture.
Optionally, in the embodiment of the present invention the number of channels of the input layer can be set to 3, and each pixel in the picture to be recognized is described by the three components R (red), G (green) and B (blue); that is, the vector (R, G, B) can be used to represent a pixel in the picture to be recognized, where the value range of each channel component is [0, 255], 0 denoting pure black and 255 denoting pure white.
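For instance, a minimal NumPy sketch (an illustrative choice) of this three-channel representation:

```python
import numpy as np

picture = np.random.randint(0, 256, size=(168, 168, 3), dtype=np.uint8)  # stand-in picture
r, g, b = picture[..., 0], picture[..., 1], picture[..., 2]  # per-channel components
pixel = picture[0, 0]   # one pixel described by the vector (R, G, B), each in [0, 255]
print(pixel)            # e.g. [ 17 203  96]
```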
S5: Perform feature extraction on the face picture data using the convolutional layer to obtain the face features of the picture to be recognized.
In this embodiment, the convolutional layer may specifically include multiple convolution units, and its basic structure is a jump (skip) structure in which the input data is superposed with the output of three stacked convolution units. The convolutional layer of the preset deep residual network model performs convolutional calculation on the picture to be recognized, that is, a convolution operation that extracts the feature information representing the face in the picture to be recognized, thereby obtaining the face features of the picture to be recognized.
The number of convolution units in the convolutional layer can be configured in advance, and the preset number of convolution units is arranged in a preset order. For example, the convolutional layer may include 16*3 convolution units, 48 convolution units in total, but it is not limited thereto; the specific number of convolution units can be configured according to the needs of the practical application and is not restricted here.
It is worth noting that the first convolution unit extracts low-level feature information from the picture to be recognized, such as edges, lines and corners, while deeper convolution units iteratively compute more complex feature information of the picture to be recognized from the low-level feature information.
S6: Perform regression-based classification on the face features using the fully connected layer to obtain a recognition result of the picture to be recognized, wherein the recognition result comprises the emotional state of the face in the picture to be recognized.
Specifically, in the fully connected layer of the preset deep residual network model, the server performs regression analysis on the face features using the activation function preset in the fully connected layer, obtaining the probability value that the face features of the picture to be recognized belong to each preset emotional state, thereby classifying the face features; the emotional state with the largest probability value is output as the recognition result of the picture to be recognized, giving the emotional state of the person in the picture to be recognized. The preset emotional states of a person include, but are not limited to, happy, sad, fearful, angry, surprised, impatient, disgusted and calm; the classification of emotions can be set according to the needs of the practical application.
Further, the activation function is used to perform regression analysis on the face features of the picture to be recognized, yielding a function of the correlation between the face features of the picture to be recognized and the preset emotional states. The activation function may specifically be sigmoid, ReLU, Softmax and the like. In this embodiment, the Softmax activation function can be used to perform regression-based classification on the face features input to the fully connected layer, so that the probability values with which the deep residual network model predicts that the face in the picture to be recognized belongs to each emotional state can be compared intuitively.
In the embodiment corresponding to Fig. 2, a picture to be recognized is received and preprocessed according to the preset processing mode; the preprocessed picture to be recognized is input into the preset deep residual network model; the input layer of the deep residual network model performs multi-channel data extraction on the picture to be recognized to obtain the corresponding face picture data; the convolutional layer then performs feature extraction on the face picture data to obtain the face features of the picture to be recognized; finally, the fully connected layer performs regression-based classification on the face features to obtain the recognition result of the picture to be recognized, thereby determining the emotional state of the face in the picture. Preprocessing the picture to be recognized removes noise data from the picture, which improves the recognition speed and accuracy of the model, and the convolutional layer extracts deep feature data from the face in the picture to be recognized, so that the deep residual network model predicts face pictures more accurately and the accuracy of the model in face emotion recognition is improved.
In one embodiment, the concrete implementation, mentioned in step S5, of performing feature extraction on the face picture data using the convolutional layer to obtain the face features of the picture to be recognized is described in detail.
Referring to Fig. 3, which shows a detailed flow chart of step S5, the details are as follows:
S51: Perform convolutional calculation on the face picture data using the convolutional layer to obtain convolved data.
In this embodiment, a convolution kernel with a preset visual perception range is preset in each convolution unit of the convolutional layer. The convolutional calculation function corresponding to each convolution unit is determined by its convolution kernel, and convolutional calculation is performed on the face picture data according to the preset convolution kernel, multiplying the face picture data with the preset convolution kernel to obtain the convolved data, which describes the feature information of the face in the picture to be recognized. The size of the convolution kernel can be configured according to the needs of the practical application; for example, it may be preset to 1×1, 3×3 or 5×5, the unit of the convolution kernel being pixels.
S52: Perform superposition calculation on the convolved data and the face picture data to obtain the feature data.
Specifically, superposition calculation is performed on the convolved data and the face picture data, merging the face picture data with the feature information extracted in the convolved data to obtain the face features of the picture to be recognized.
The convolution operation of the convolutional layer is described in detail below by way of a specific embodiment:
For example, in one embodiment, as shown in Fig. 4, which shows the convolution operation flow of a convolutional layer, the convolutional layer includes a first convolution unit, a second convolution unit and a third convolution unit, and the specific processing flow is as follows (a code sketch follows these steps):
S401: Perform feature extraction on the face picture data using the first convolution unit to obtain first convolved data;
S402: Perform feature extraction on the first convolved data using the second convolution unit to obtain second convolved data;
S403: Perform feature extraction on the second convolved data using the third convolution unit to obtain third convolved data;
S404: Perform superposition calculation on the face picture data and the third convolved data to obtain the face features of the picture to be recognized.
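By way of illustration, the jump structure of steps S401 to S404 can be sketched as follows; PyTorch and the 1×1/3×3/1×1 kernel sizes are assumptions consistent with common ResNet practice, not requirements of the disclosure:

```python
import torch
import torch.nn as nn

class IdentityResidualBlock(nn.Module):
    """First/second/third convolution units plus a skip from the input (Fig. 4)."""
    def __init__(self, channels):
        super().__init__()
        self.unit1 = nn.Conv2d(channels, channels, kernel_size=1)             # first convolution unit
        self.unit2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # second convolution unit
        self.unit3 = nn.Conv2d(channels, channels, kernel_size=1)             # third convolution unit
        self.relu = nn.ReLU()

    def forward(self, face_picture_data):
        d1 = self.relu(self.unit1(face_picture_data))   # S401: first convolved data
        d2 = self.relu(self.unit2(d1))                  # S402: second convolved data
        d3 = self.unit3(d2)                             # S403: third convolved data
        # S404: superpose the input data with the third convolved data
        return self.relu(face_picture_data + d3)
```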
In the embodiment corresponding to Fig. 3, convolutional calculation is performed on the face picture data by the convolutional layer to obtain convolved data, and superposition calculation is performed on the convolved data and the face picture data to obtain the face features of the picture to be recognized. The feature information describing the face in the picture to be recognized is thus obtained, and the propagation of that feature information through the neural network model is reinforced, so that the emotion of the person in the picture to be recognized can be recognized and the prediction accuracy of the preset deep residual network model is effectively improved.
In one embodiment, as shown in Fig. 5, the concrete implementation, mentioned in step S52, of performing superposition calculation on the convolved data and the face picture data to obtain the face features is described in detail.
S521: If the dimension of the convolved data is consistent with the dimension of the face picture data, perform summation on the convolved data and the face picture data.
Specifically, the dimension refers to the storage form of the data; for example, (x, y) is two-dimensional data and (x, y, z) is three-dimensional data. The dimensions of the convolved data and of the face picture data are detected. If the dimension of the convolved data is consistent with the dimension of the face picture data, summation is performed on the convolved data and the face picture data: on each common dimension, the convolved data and the face picture data are added element-wise, so that the convolved data and the face picture data are superposed and the propagation of the feature information of the face in the picture to be recognized through the neural network model is reinforced.
S522: If the dimension of the convolved data is inconsistent with the dimension of the face picture data, perform dimension conversion on the face picture data to obtain face picture data identical in dimension to the convolved data, and perform summation on the dimension-converted face picture data and the convolved data.
Specifically, after the dimensions of the convolved data and of the face picture data are detected, if the dimension of the convolved data is inconsistent with the dimension of the face picture data, dimension conversion is performed on the face picture data: the face picture data is down- or up-dimensioned according to a preset convolution unit to obtain face picture data identical in dimension to the convolved data, and summation is then performed on the dimension-converted face picture data and the convolved data, adding the convolved data and the dimension-converted face picture data element-wise on each common dimension.
For example, in one embodiment, as shown in Fig. 6, which shows the convolution operation flow in a convolutional layer when the dimension of the convolved data is inconsistent with the dimension of the face picture data, the convolutional layer includes a fourth convolution unit, a fifth convolution unit and a sixth convolution unit, and the specific processing flow is as follows:
S601: Perform feature extraction on the face picture data using the fourth convolution unit to obtain fourth convolved data;
S602: Perform feature extraction on the fourth convolved data using the fifth convolution unit to obtain fifth convolved data;
S603: Perform feature extraction on the fifth convolved data using the sixth convolution unit to obtain sixth convolved data;
S604: Perform dimension conversion on the face picture data to obtain face picture data with the same dimension as the sixth convolved data;
S605: Perform summation on the dimension-converted face picture data and the sixth convolved data to obtain the face features of the picture to be recognized.
It should be noted that each convolution unit in the convolutional layer performs convolutional calculation on its input data according to a different preset convolution kernel, and the resulting convolved data may differ in dimension from the input data. This is because, during the feature extraction of the convolutional calculation, the input data may be down- or up-dimensioned, and this down- or up-dimensioning is the operation that extracts the deep features of the picture to be recognized.
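When the dimensions differ as in steps S601 to S605, the dimension conversion of S604 is commonly realized by a 1×1 convolution on the skip path; the following PyTorch sketch assumes that common practice, which the disclosure does not spell out:

```python
import torch.nn as nn

class ProjectionResidualBlock(nn.Module):
    """Fourth/fifth/sixth convolution units with a dimension-converting skip (Fig. 6)."""
    def __init__(self, in_channels, out_channels, stride=2):
        super().__init__()
        self.unit4 = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride)
        self.unit5 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        self.unit6 = nn.Conv2d(out_channels, out_channels, kernel_size=1)
        # S604: convert the face picture data to the dimension of the sixth convolved data
        self.project = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride)
        self.relu = nn.ReLU()

    def forward(self, x):
        d = self.relu(self.unit4(x))   # S601: fourth convolved data
        d = self.relu(self.unit5(d))   # S602: fifth convolved data
        d = self.unit6(d)              # S603: sixth convolved data
        return self.relu(self.project(x) + d)  # S605: summation after dimension conversion
```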
S523: Take the result of the summation as the face features.
Specifically, the convolved data is added to the face picture data element-wise in the corresponding dimensions, and the result of the summation is taken as the face features, reinforcing the propagation of the feature information of the face in the picture to be recognized through the neural network model.
In the embodiment corresponding to Fig. 5, the dimensions of the convolved data and of the face picture data are detected. When the dimensions are consistent, summation is performed directly on the convolved data and the face picture data; when they are inconsistent, dimension conversion is first performed on the face picture data to obtain face picture data identical in dimension to the convolved data, and summation is then performed on the dimension-converted face picture data and the convolved data. The result of the summation is taken as the face features, which reinforces the propagation of the feature information of the face in the picture to be recognized through the neural network model, allows the emotional state of the person in the picture to be recognized to be derived from the analysis of the face feature information, and improves the recognition accuracy of the preset deep residual network model.
In one embodiment, the concrete implementation, mentioned in step S2, of preprocessing the face picture to be recognized according to a preset processing mode to obtain the picture to be recognized is described in detail.
Referring to Fig. 7, which shows a detailed flow chart of step S2, the details are as follows:
S21: Perform gray-scale transformation on the face picture to be recognized to obtain a gray-scale picture.
Specifically, a gray-scale picture is a picture that contains only luminance information and no color information. Gray-scale transformation is the process of transforming a picture containing luminance and color into a gray-scale picture; it is performed on the face picture to be recognized according to formula (1), using a preset gray-value transformation function:
g(x, y) = T(f(x, y))    Formula (1)
where f is the face picture to be recognized, T is the preset gray-value transformation function, g is the gray-scale picture, x and y denote the abscissa and ordinate in the face picture to be recognized respectively, f(x, y) is the pixel value corresponding to the coordinate point (x, y) in the face picture to be recognized, and g(x, y) is the pixel value corresponding to the coordinate point (x, y) in the gray-scale picture.
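Formula (1) leaves the concrete transformation function T open. As a hedged illustration, a common instance of T is the weighted luminance combination used below; the BT.601 weights are an assumption, not part of the disclosure:

```python
import numpy as np

def gray_transform(f):
    """g(x, y) = T(f(x, y)); the BT.601 luminance weights below are an assumed
    instance of T, since formula (1) does not fix a concrete transform."""
    r = f[..., 0].astype(np.float64)
    g_ch = f[..., 1].astype(np.float64)
    b = f[..., 2].astype(np.float64)
    return (0.299 * r + 0.587 * g_ch + 0.114 * b).astype(np.uint8)
```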
S22: Perform denoising on the gray-scale picture to obtain the preprocessed picture to be recognized.
Specifically, picture noise refers to unnecessary or redundant interference information present in the image data, such as Gaussian noise, Rayleigh noise, gamma noise and salt-and-pepper noise. Noise affects the recognition of face pictures; therefore, methods such as mean filtering, median filtering or Wiener filtering can be used to remove the noise from the gray-scale picture.
Optionally, the server can use median filtering to remove noise from the gray-scale picture. Median filtering is a nonlinear signal-processing technique that replaces the gray value of a noise point with the median of the gray values of all pixels within a neighborhood window of the point, so that the gray value approaches the true values of the surrounding pixels, eliminating isolated noise points.
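A minimal sketch of this median-filter denoising, using SciPy's median filter as an illustrative choice:

```python
import numpy as np
from scipy.ndimage import median_filter

gray_picture = np.random.randint(0, 256, size=(168, 168), dtype=np.uint8)  # stand-in gray picture
# Replace each pixel with the median of the gray values of all pixels in its
# 3x3 neighborhood window, so isolated noise points take on surrounding values.
denoised = median_filter(gray_picture, size=3)
```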
In the embodiment corresponding to Fig. 7, gray-scale transformation is performed on the face picture to be recognized to obtain a gray-scale picture, and denoising is performed on the gray-scale picture to obtain the picture to be recognized. This makes the picture to be recognized more standardized and its details clearer and easier to recognize, so that subsequent processing of the picture to be recognized during model training is more efficient, reducing the complexity of the picture to be recognized and the amount of information to be processed, thereby improving the training speed and recognition accuracy of the machine learning model.
In one embodiment, the fully connected layer includes M classifiers, where M is a positive integer. The concrete implementation, mentioned in step S6, of performing regression-based classification on the face features using the fully connected layer to obtain the recognition result of the picture to be recognized is described in detail.
Referring to Fig. 8, which shows a detailed flow chart of step S6, the details are as follows:
S61: Perform regression calculation on the face features using the M classifiers of the fully connected layer to obtain the probabilities of M emotional states for the picture to be recognized, where each classifier corresponds to one emotional state.
Specifically, the fully connected layer of the preset deep residual network model contains M trained classifiers, which may specifically be Softmax regression classifiers, used to perform regression calculation on the face features input to the fully connected layer and obtain the similarity between the face features and the emotional state corresponding to each classifier. This similarity can be expressed as a probability value, yielding the probability values of M emotional states of the face in the picture to be recognized, i.e. the probabilities that the face in the picture belongs to each emotional state. Each classifier corresponds to one emotional state, the specific types of which can be set according to the needs of the practical application; the larger the probability value of a classifier, the higher the similarity between the face features and the emotional state corresponding to that classifier.
S62: From the M probabilities, select the emotional state with the largest probability as the emotional state of the face in the picture to be recognized, obtaining the recognition result of the picture to be recognized.
Specifically, according to the probability values, obtained in step S61, that the face in the picture to be recognized belongs to each emotional state, the emotional state with the largest probability value among the M emotional states is selected as the emotional state of the face in the picture to be recognized and is output as the recognition result of the picture to be recognized.
For example, as shown in Table 1, the fully connected layer contains 8 trained classifiers, and the emotional states corresponding to classifiers 1 through 8 are happy, sad, fearful, angry, surprised, impatient, disgusted and calm, respectively. Table 1 shows the prediction result of the preset deep residual network model for one picture to be recognized: the probability value that the face in the picture belongs to each emotional state. According to Table 1, the probability value that the face in the picture belongs to the emotional state "surprised" corresponding to classifier 5 is the largest, so it can be determined that the emotional state of the person in the picture is surprised.
Table 1: Prediction result for the picture to be recognized.
In the embodiment corresponding to Fig. 8, regression calculation is performed on the face features by the classifiers of the fully connected layer, yielding the probability values that the face in the picture to be recognized belongs to each emotional state. The probability values for the emotional states can thus be compared intuitively, and the emotional state with the largest probability is selected as the recognition result of the picture to be recognized, determining the emotional state of the person in the picture and realizing the recognition of that person's emotional state.
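Continuing the Table 1 example, selecting the recognition result from the M probabilities reduces to taking the largest one; the probability values below are placeholders (the actual Table 1 figures are not reproduced here), chosen only so that classifier 5, "surprised", is the largest:

```python
import numpy as np

EMOTIONS = ["happy", "sad", "fearful", "angry", "surprised", "impatient", "disgusted", "calm"]
# Placeholder probability values, NOT the actual Table 1 figures.
probs = np.array([0.05, 0.03, 0.02, 0.04, 0.70, 0.06, 0.04, 0.06])
recognition_result = EMOTIONS[int(np.argmax(probs))]  # emotional state with the largest probability
print(recognition_result)  # -> surprised
```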
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In one embodiment, an emotion recognition apparatus based on a deep neural network is provided, which corresponds one-to-one to the emotion recognition method based on a deep neural network in the above embodiments. As shown in Fig. 9, the emotion recognition apparatus based on a deep neural network includes a picture receiving module 91, a picture processing module 92, a picture recognition module 93, a data extraction module 94, a feature extraction module 95 and an emotion output module 96. The functional modules are described in detail as follows:
The picture receiving module 91 is configured to receive a face picture to be recognized;
the picture processing module 92 is configured to preprocess the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized;
the picture recognition module 93 is configured to input the picture to be recognized into a preset deep residual network model, wherein the preset deep residual network model comprises an input layer, a convolutional layer and a fully connected layer;
the data extraction module 94 is configured to perform multi-channel data extraction on the picture to be recognized using the input layer to obtain face picture data corresponding to the picture to be recognized;
the feature extraction module 95 is configured to perform feature extraction on the face picture data using the convolutional layer to obtain the face features of the picture to be recognized;
the emotion output module 96 is configured to perform regression-based classification on the face features using the fully connected layer to obtain a recognition result of the picture to be recognized, wherein the recognition result comprises the emotional state of the face in the picture to be recognized.
Further, the feature extraction module 95 includes:
a convolutional calculation submodule 951, configured to perform convolutional calculation on the face picture data using the convolutional layer to obtain convolved data;
a feature acquisition submodule 952, configured to perform superposition calculation on the convolved data and the face picture data to obtain the face features.
Further, the feature acquisition submodule 952 includes:
a first computing unit 9521, configured to perform summation on the convolved data and the face picture data if the dimension of the convolved data is consistent with the dimension of the face picture data;
a second computing unit 9522, configured to, if the dimension of the convolved data is inconsistent with the dimension of the face picture data, perform dimension conversion on the face picture data to obtain face picture data identical in dimension to the convolved data, and perform summation on the dimension-converted face picture data and the convolved data;
a result acquiring unit 9523, configured to take the result of the summation as the face features.
Further, the picture processing module 92 includes:
a first processing submodule 921, configured to perform gray-scale transformation on the face picture to be recognized to obtain a gray-scale picture;
a second processing submodule 922, configured to perform denoising on the gray-scale picture to obtain the preprocessed picture to be recognized.
Further, the fully connected layer includes M classifiers, where M is a positive integer, and the emotion output module 96 includes:
a regression analysis submodule 961, configured to perform regression calculation on the face features using the M classifiers of the fully connected layer to obtain the probabilities of M emotional states for the picture to be recognized, where each classifier corresponds to one emotional state;
an emotion determination submodule 962, configured to select, from the M probabilities, the emotional state with the largest probability as the emotional state of the face in the picture to be recognized, obtaining the recognition result of the picture to be recognized.
For the specific limitations of the emotion recognition apparatus based on a deep neural network, reference may be made to the limitations of the emotion recognition method based on a deep neural network above, which are not repeated here. The modules in the above emotion recognition apparatus based on a deep neural network may be implemented in whole or in part by software, hardware or a combination thereof. The modules may be embedded in or independent of a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them to execute the operations corresponding to the modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 10. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements an emotion recognition method based on a deep neural network.
In one embodiment, a computer device is provided, including a memory, a processor and a computer program stored in the memory and runnable on the processor. The processor, when executing the computer program, implements the steps of the emotion recognition method based on a deep neural network in the above embodiments, such as steps S1 to S6 shown in Fig. 2; alternatively, the processor, when executing the computer program, implements the functions of the modules/units of the emotion recognition apparatus based on a deep neural network in the above embodiments, such as the functions of modules 91 to 96 shown in Fig. 9. To avoid repetition, they are not described again here.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored. The computer program, when executed by a processor, implements the steps of the emotion recognition method based on a deep neural network in the above embodiments, such as steps S1 to S6 shown in Fig. 2; alternatively, the computer program, when executed by a processor, implements the functions of the modules/units of the emotion recognition apparatus based on a deep neural network in the above embodiments, such as the functions of modules 91 to 96 shown in Fig. 9. To avoid repetition, they are not described again here.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be accomplished by instructing relevant hardware through a computer program. The computer program can be stored in a non-volatile computer readable storage medium, and when executed may include the processes of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
It is clear to those skilled in the art that, for convenience and brevity of description, the division of the above functional units and modules is merely illustrative. In practical applications, the above functions can be assigned to different functional units and modules as needed; that is, the internal structure of the apparatus can be divided into different functional units or modules to accomplish all or part of the functions described above.
The above embodiments are merely illustrative of the technical solutions of the present invention rather than limiting them. Although the invention has been explained in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and they shall all be included within the protection scope of the present invention.

Claims (10)

1. An emotion recognition method based on a deep neural network, characterized in that the emotion recognition method based on a deep neural network comprises:
receiving a face picture to be recognized;
preprocessing the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized;
inputting the picture to be recognized into a preset deep residual network model, wherein the preset deep residual network model comprises an input layer, a convolutional layer and a fully connected layer;
performing multi-channel data extraction on the picture to be recognized using the input layer to obtain face picture data corresponding to the picture to be recognized;
performing feature extraction on the face picture data using the convolutional layer to obtain face features of the picture to be recognized;
performing regression-based classification on the face features using the fully connected layer to obtain a recognition result of the picture to be recognized, wherein the recognition result comprises the emotional state of the face in the picture to be recognized.
2. The emotion recognition method based on a deep neural network according to claim 1, characterized in that the performing feature extraction on the face picture data using the convolutional layer to obtain the face features of the picture to be recognized comprises:
performing convolutional calculation on the face picture data using the convolutional layer to obtain convolved data;
performing superposition calculation on the convolved data and the face picture data to obtain the face features.
3. The emotion recognition method based on a deep neural network according to claim 2, characterized in that the performing superposition calculation on the convolved data and the face picture data to obtain the face features comprises:
if the dimension of the convolved data is consistent with the dimension of the face picture data, performing summation on the convolved data and the face picture data;
if the dimension of the convolved data is inconsistent with the dimension of the face picture data, performing dimension conversion on the face picture data to obtain face picture data identical in dimension to the convolved data, and performing summation on the dimension-converted face picture data and the convolved data;
taking the result of the summation as the face features.
4. The emotion recognition method based on a deep neural network according to claim 1, characterized in that the preprocessing the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized comprises:
performing gray-scale transformation on the face picture to be recognized to obtain a gray-scale picture;
performing denoising on the gray-scale picture to obtain the preprocessed picture to be recognized.
5. The emotion recognition method based on a deep neural network according to any one of claims 1 to 4, characterized in that the fully connected layer comprises M classifiers, where M is a positive integer, and the performing regression-based classification on the face features using the fully connected layer to obtain the recognition result of the picture to be recognized comprises:
performing regression calculation on the face features using the M classifiers of the fully connected layer to obtain the probabilities of M emotional states for the picture to be recognized, wherein each classifier corresponds to one emotional state;
selecting, from the M probabilities, the emotional state with the largest probability as the emotional state of the face in the picture to be recognized, to obtain the recognition result of the picture to be recognized.
6. An emotion recognition apparatus based on a deep neural network, characterized in that the emotion recognition apparatus based on a deep neural network comprises:
a picture receiving module, configured to receive a face picture to be recognized;
a picture processing module, configured to preprocess the face picture to be recognized according to a preset processing mode to obtain a picture to be recognized;
a picture recognition module, configured to input the picture to be recognized into a preset deep residual network model, wherein the preset deep residual network model comprises an input layer, a convolutional layer and a fully connected layer;
a data extraction module, configured to perform multi-channel data extraction on the picture to be recognized using the input layer to obtain face picture data corresponding to the picture to be recognized;
a feature extraction module, configured to perform feature extraction on the face picture data using the convolutional layer to obtain face features of the picture to be recognized;
an emotion output module, configured to perform regression-based classification on the face features using the fully connected layer to obtain a recognition result of the picture to be recognized, wherein the recognition result comprises the emotional state of the face in the picture to be recognized.
7. The emotion recognition apparatus based on a deep neural network according to claim 6, characterized in that the feature extraction module comprises:
a convolutional calculation submodule, configured to perform convolutional calculation on the face picture data using the convolutional layer to obtain convolved data;
a feature acquisition submodule, configured to perform superposition calculation on the convolved data and the face picture data to obtain the face features.
8. The emotion recognition apparatus based on a deep neural network according to claim 7, characterized in that the feature acquisition submodule comprises:
a first computing unit, configured to perform summation on the convolved data and the face picture data if the dimension of the convolved data is consistent with the dimension of the face picture data;
a second computing unit, configured to, if the dimension of the convolved data is inconsistent with the dimension of the face picture data, perform dimension conversion on the face picture data to obtain face picture data identical in dimension to the convolved data, and perform summation on the dimension-converted face picture data and the convolved data;
a result acquiring unit, configured to take the result of the summation as the face features.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the emotion identification method based on a deep neural network according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the emotion identification method based on a deep neural network according to any one of claims 1 to 5.
CN201811503089.8A 2018-12-10 2018-12-10 Emotion recognition method, device, equipment and medium based on deep neural network Active CN109784154B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811503089.8A CN109784154B (en) 2018-12-10 2018-12-10 Emotion recognition method, device, equipment and medium based on deep neural network

Publications (2)

Publication Number Publication Date
CN109784154A true CN109784154A (en) 2019-05-21
CN109784154B CN109784154B (en) 2023-11-24

Family

ID=66496812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811503089.8A Active CN109784154B (en) 2018-12-10 2018-12-10 Emotion recognition method, device, equipment and medium based on deep neural network

Country Status (1)

Country Link
CN (1) CN109784154B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247947A (en) * 2017-07-07 2017-10-13 北京智慧眼科技股份有限公司 Face attribute recognition method and device
CN107784324A (en) * 2017-10-17 2018-03-09 杭州电子科技大学 Multi-class white blood cell recognition method based on a deep residual network
CN108921061A (en) * 2018-06-20 2018-11-30 腾讯科技(深圳)有限公司 Expression recognition method, device and equipment
CN108932495A (en) * 2018-07-02 2018-12-04 大连理工大学 Automatic generation method for parameterized automobile front-face models
CN108830262A (en) * 2018-07-25 2018-11-16 上海电力学院 Multi-angle facial expression recognition method under natural conditions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
梁瑞奇 (Liang Ruiqi): "Facial Expression Recognition Based on Neural Networks" (基于神经网络的人脸表情识别), 电子制作 (Electronic Production), no. 20, pages 46-48 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110610131A (en) * 2019-08-06 2019-12-24 平安科技(深圳)有限公司 Method and device for detecting face motion unit, electronic equipment and storage medium
CN110610131B (en) * 2019-08-06 2024-04-09 平安科技(深圳)有限公司 Face movement unit detection method and device, electronic equipment and storage medium
CN110693508A (en) * 2019-09-02 2020-01-17 中国航天员科研训练中心 Multi-channel cooperative psychophysiological active sensing method and service robot
CN110866469A (en) * 2019-10-30 2020-03-06 腾讯科技(深圳)有限公司 Facial features recognition method, device, equipment and medium

Also Published As

Publication number Publication date
CN109784154B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
CN110689036B (en) Method and system for automatic chromosome classification
CN110532920B (en) Face recognition method for small-quantity data set based on FaceNet method
CN109657582A Face emotion recognition method, device, computer equipment and storage medium
CN109409222A Multi-view facial expression recognition method based on a mobile terminal
CN109784153A (en) Emotion identification method, apparatus, computer equipment and storage medium
JP2022058173A (en) Image classification method and apparatus
CN110020582B (en) Face emotion recognition method, device, equipment and medium based on deep learning
KR101882704B1 (en) Electronic apparatus and control method thereof
CN109034206A (en) Image classification recognition methods, device, electronic equipment and computer-readable medium
CN107784316A Image recognition method, device, system and computing device
CN113762138B (en) Identification method, device, computer equipment and storage medium for fake face pictures
CN105139004A Facial expression recognition method based on video sequences
CN111738344A (en) Rapid target detection method based on multi-scale fusion
JP2023501820A (en) Face parsing methods and related devices
CN108509833A Face recognition method, device and equipment based on a structured analysis dictionary
CN113780249B (en) Expression recognition model processing method, device, equipment, medium and program product
CN109784154A (en) Emotion identification method, apparatus, equipment and medium based on deep neural network
Li et al. Multi-attention guided feature fusion network for salient object detection
CN115294563A (en) 3D point cloud analysis method and device based on Transformer and capable of enhancing local semantic learning ability
CN117079098A (en) Space small target detection method based on position coding
CN109508640A (en) Crowd emotion analysis method and device and storage medium
CN116311451A (en) Multi-mode fusion human face living body detection model generation method and device and electronic equipment
Kar Mastering Computer Vision with TensorFlow 2.x: Build advanced computer vision applications using machine learning and deep learning techniques
Abbas et al. Improving deep learning-based image super-resolution with residual learning and perceptual loss using SRGAN model
CN117576248A (en) Image generation method and device based on gesture guidance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant