CN107590457A

CN107590457A - Emotion identification method and system based on the cascade change network architecture

Info

Publication number: CN107590457A
Application number: CN201710801842.0A
Authority: CN
Inventors: 简仁贤; 杨闵淳; 许世焕
Original assignee: Intelligent Technology (shanghai) Co Ltd
Current assignee: Intelligent Technology (shanghai) Co Ltd
Priority date: 2017-09-07
Filing date: 2017-09-07
Publication date: 2018-01-16

Abstract

The invention belongs to the technical field of computer machine learning, and in particular relates to an emotion identification method and system based on the cascade change network architecture. The method includes: obtaining image information; inputting the image information into a face detection model, which outputs face image information; inputting the face image information into an emotion recognition model, which outputs a probability value for each emotion; and selecting the maximum probability value from the probability values of the emotions and outputting the emotion corresponding to that maximum probability value. The cascade change architecture of the invention improves on the stacked arrangement architecture by adding shortcut connections, thereby remedying the shortcomings of existing models, so that its emotion identification achieves better results in both speed and accuracy.

Description

Emotion identification method and system based on the cascade change network architecture

Technical field

The invention belongs to the technical field of computer machine learning, and in particular relates to an emotion identification method and system based on the cascade change network architecture.

Background technology

With the introduction of deep learning technology, image recognition has made significant progress, and the core technology in this field is the convolutional neural network. The architecture of a traditional neural network is formed by stacking multiple convolutional layers (see Fig. 1). Among existing state-of-the-art network building blocks, the Inception module proposed by Google uses parallel, multi-scale convolutional layers (see Fig. 2), while the residual neural network (ResNet) proposed by Microsoft Research Asia adds shortcut connections to speed up network training.

In existing research, three schools of thought address the design of efficient, high-performance network structures:

1. Emphasizing the depth of the network structure (e.g. Residual Networks, ResNet): a deeper network structure can improve recognition accuracy. With limited data, however, simply deepening the network increases the number of model parameters and slows recognition, while accuracy does not improve linearly.

2. Emphasizing the width of the network structure (e.g. Wider Residual Networks): several different types of building blocks (that is, different ways of connecting convolutional layers) are added to each layer, so that the network grows wider rather than deeper and can still reach a similar recognition performance without an excessively deep network.

3. Taking as the main design axis a distributed structure in which the building blocks of each layer (e.g. Inception modules or simple convolutional layers) are combined in a parallel architecture: this approach pursues the best trade-off between model size and performance, and the architecture can be designed according to the constraints of the target device. The network architecture of the present invention is developed from this model design principle and concept.

The parallel arrangement architecture (see Fig. 2) can perform multiple feature transformations at the same time, but because all branches share the same input, the transformed features remain at the same feature level. The stacked arrangement architecture (see Fig. 1) can in theory obtain higher-level features, but applying many nonlinear transformations in succession makes training hard to converge, prolongs training time, and makes the model sensitive to initial parameter values. As a result, models that use the stacked arrangement architecture achieve unsatisfactory speed and accuracy in face image recognition and emotion identification.
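The difference between the two arrangements described above can be sketched in a few lines of Python. This is only an illustration: the helper names `stacked` and `parallel`, and the toy layers `double` and `add_one` that stand in for convolutions, are assumptions and do not come from the patent.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def stacked(x: Vector, layers: Sequence[Layer]) -> Vector:
    """Stacked arrangement: each layer consumes the previous layer's output."""
    for layer in layers:
        x = layer(x)
    return x


def parallel(x: Vector, branches: Sequence[Layer]) -> List[Vector]:
    """Parallel arrangement: every branch transforms the same input."""
    return [branch(x) for branch in branches]


# Hypothetical toy "layers" standing in for convolutions.
def double(v: Vector) -> Vector:
    return [2.0 * a for a in v]


def add_one(v: Vector) -> Vector:
    return [a + 1.0 for a in v]


x = [1.0, 2.0]
stacked_out = stacked(x, [double, add_one])    # add_one(double(x))
parallel_out = parallel(x, [double, add_one])  # [double(x), add_one(x)]
```

In the stacked case only the deepest result survives, while in the parallel case every result is one transformation away from the same input, which mirrors the trade-off the paragraph above describes.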

Summary of the invention

In view of the above deficiencies, the invention provides an emotion identification method and system based on the cascade change network architecture. The cascade change architecture of the invention improves on the stacked arrangement architecture by adding shortcut connections, thereby remedying the shortcomings of existing models, so that its emotion identification achieves better results in both speed and accuracy.

To achieve the above object, the invention provides an emotion identification method based on the cascade change network architecture, including:

Obtaining image information;

Inputting the image information into a face detection model, which outputs face image information;

Inputting the face image information into an emotion recognition model, which outputs a probability value for each emotion;

Selecting the maximum probability value from the probability values of the emotions, and outputting the emotion corresponding to the maximum probability value.
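The four steps above can be sketched as a small pipeline. This is a minimal sketch rather than the patented implementation: the two models are passed in as plain callables, the `Image` type, the stub models and the emotion labels are hypothetical, and returning `None` when no face is found is an assumption the text does not address.

```python
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[float]]  # hypothetical image type: rows of pixel values


def recognize_emotion(
    image: Image,
    detect_face: Callable[[Image], Optional[Image]],
    emotion_probabilities: Callable[[Image], Dict[str, float]],
) -> Optional[Tuple[str, float]]:
    """Run face detection, then emotion recognition, then pick the top emotion."""
    face = detect_face(image)            # step 2: face image information
    if face is None:                     # assumption: no face means no result
        return None
    probs = emotion_probabilities(face)  # step 3: one probability per emotion
    label = max(probs, key=probs.__getitem__)  # step 4: maximum probability
    return label, probs[label]


# Hypothetical stand-ins for the trained models.
def stub_detector(image: Image) -> Optional[Image]:
    return image


def stub_classifier(face: Image) -> Dict[str, float]:
    return {"happy": 0.7, "sad": 0.2, "angry": 0.1}


result = recognize_emotion([[0.0]], stub_detector, stub_classifier)
```

Passing the models in as callables keeps the selection logic independent of how the face detection and emotion recognition models are built.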

Preferably, the face detection model and the emotion recognition model use the cascade change network architecture.

Preferably, the cascade change network architecture includes an input layer, a hidden layer and an output layer. The hidden layer includes a plurality of convolutional layers, from a first convolutional layer to an N-th convolutional layer, where N is a positive integer greater than or equal to 2. The input value of the first convolutional layer is the initial input value of the input layer, and the input value of each remaining convolutional layer is the output value of the preceding convolutional layer. The multiple input values of the output layer are the initial input value of the input layer and the output value of each convolutional layer.

Preferably, let the initial input value of the input layer be x, the output value of the output layer be Y, and the output values of the convolutional layers be f(x)₁, f(x)₂, …, f(x)_N respectively; then the output value of the output layer is Y = x + f(x)₁ + f(x)₂ + … + f(x)_N.
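The output formula can be illustrated with a short forward pass. The sketch below assumes that every layer preserves the shape of its input, so that the terms can be summed element-wise; the text does not state how shapes are matched. The function name `cascade_forward` and the toy 1-D convolution `smooth` are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def cascade_forward(x: Vector, layers: Sequence[Layer]) -> Vector:
    """Compute Y = x + f(x)_1 + ... + f(x)_N for a chain of N layers."""
    if len(layers) < 2:
        raise ValueError("the text requires N >= 2 convolutional layers")
    total = list(x)   # shortcut from the input layer to the output layer
    current = x
    for layer in layers:
        current = layer(current)  # f(x)_k: output of the k-th layer
        if len(current) != len(total):
            raise ValueError("layer outputs must keep the input shape")
        total = [t + c for t, c in zip(total, current)]
    return total


def smooth(v: Vector) -> Vector:
    """Hypothetical 1-D convolution: kernel [0.25, 0.5, 0.25], zero padding."""
    padded = [0.0] + list(v) + [0.0]
    return [
        0.25 * padded[i] + 0.5 * padded[i + 1] + 0.25 * padded[i + 2]
        for i in range(len(v))
    ]


y = cascade_forward([0.0, 4.0, 0.0], [smooth, smooth])
```

Here f(x)₁ is `[1.0, 2.0, 1.0]` and f(x)₂ is `[1.0, 1.5, 1.0]`, so `y` is `[2.0, 7.5, 2.0]`: the input and every intermediate result reach the output directly.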

An emotion identification system based on the cascade change network architecture, including:

An input module, for obtaining image information;

A face detection module, for inputting the image information into a face detection model, which outputs face image information;

An emotion recognition module, for inputting the face image information into an emotion recognition model, which outputs a probability value for each emotion;

An output module, for selecting the maximum probability value from the probability values of the emotions and outputting the emotion corresponding to the maximum probability value.

From the above scheme, the beneficial effects of the invention are as follows: the cascade change architecture of the invention improves on the stacked arrangement architecture by adding shortcut connections, thereby remedying the shortcomings of existing models, so that its emotion identification achieves better results in both speed and accuracy.

Brief description of the drawings

In order to illustrate the specific embodiments of the invention or the technical solutions of the prior art more clearly, the accompanying drawings needed in the description of the specific embodiments or of the prior art are briefly introduced below. In all of the drawings, similar elements or parts are generally identified by similar reference numerals. In the drawings, elements or parts are not necessarily drawn to actual scale.

Fig. 1 is a flow chart of the emotion identification method based on the cascade change network architecture in the present embodiment;

Fig. 2 is a structural diagram of the stacked arrangement architecture in the background art;

Fig. 3 is a structural diagram of the parallel arrangement architecture in the background art;

Fig. 4 is a structural diagram of the cascade change network architecture of the present embodiment;

Fig. 5 is a structural diagram of the emotion identification system based on the cascade change network architecture in the present embodiment.

Embodiment

Embodiments of the invention are described in detail below with reference to the accompanying drawings. The following embodiments serve only to illustrate the invention more clearly; they are therefore given only as examples and cannot be used to limit the scope of protection of the invention.

Embodiment:

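This embodiment of the invention provides an emotion identification method based on the cascade change network architecture which, as shown in Fig. 1, includes:

Obtaining image information;

Inputting the image information into a face detection model, which outputs face image information;

Inputting the face image information into an emotion recognition model, which outputs a probability value for each emotion;

Selecting the maximum probability value from the probability values of the emotions, and outputting the emotion corresponding to the maximum probability value.

The face detection model and the emotion recognition model use the cascade change network architecture.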

The cascade change network architecture includes an input layer, a hidden layer and an output layer. The hidden layer includes a plurality of convolutional layers, from a first convolutional layer to an N-th convolutional layer, where N is a positive integer greater than or equal to 2. The input value of the first convolutional layer is the initial input value of the input layer, and the input value of each remaining convolutional layer is the output value of the preceding convolutional layer. The multiple input values of the output layer are the initial input value of the input layer and the output value of each convolutional layer.

Let the initial input value of the input layer be x, the output value of the output layer be Y, and the output values of the convolutional layers be f(x)₁, f(x)₂, …, f(x)_N respectively; then the output value of the output layer is Y = x + f(x)₁ + f(x)₂ + … + f(x)_N.

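An emotion identification system based on the cascade change network architecture, as shown in Fig. 5, includes:

An input module, for obtaining image information;

A face detection module, for inputting the image information into a face detection model, which outputs face image information;

An emotion recognition module, for inputting the face image information into an emotion recognition model, which outputs a probability value for each emotion;

An output module, for selecting the maximum probability value from the probability values of the emotions and outputting the emotion corresponding to the maximum probability value.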

As shown in Fig. 4, the present embodiment is an example of the cascade change network architecture with two convolutional layers, that is, shortcut connections are introduced into a multi-layer stacked architecture. The last convolutional layer connects its output directly to the output layer only; every other layer sends one output to the output layer and, at the same time, another output to the next convolutional layer. The relationships in the network architecture can be expressed more clearly with symbols: let the initial input value of the input layer be x, the output value of the first convolutional layer be f(x)₁, the output value of the second convolutional layer be f(x)₂, and the output value of the final output layer be Y; then Y = x + f(x)₁ + f(x)₂. The cascade change network architecture can therefore express the correspondence between input and output as a polynomial formula, and this cascade change architecture can be extended to a polynomial representation with three layers, four layers, and so on up to any number of layers. Unlike the PolyNet architecture, the cascade change network architecture builds the network purely from convolutional layers, whereas PolyNet uses Inception modules as its connected blocks; in both model size and speed, PolyNet falls short of the compact cascade change network architecture in practicality and execution efficiency when deployed on hardware. Furthermore, because the result of each convolutional transformation contributes to the output, the architecture also provides the effect of an Inception module, namely making feature-map details at different scales observable.
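One way to read the "polynomial" description of the two-layer example is to write each layer as an operator. In the sketch below, F₁ and F₂ denote the transformations applied by the first and second convolutional layers; these symbols, the identity operator I, and the simplifying case F₁ = F₂ = F with F linear are assumptions introduced for illustration and do not appear in the patent.

```latex
\begin{aligned}
f(x)_1 &= F_1(x), \\
f(x)_2 &= F_2\bigl(f(x)_1\bigr) = F_2\bigl(F_1(x)\bigr), \\
Y &= x + f(x)_1 + f(x)_2 = x + F_1(x) + F_2\bigl(F_1(x)\bigr), \\
Y &= \bigl(I + F + F^{2}\bigr)\,x
  \quad \text{if } F_1 = F_2 = F \text{ and } F \text{ is linear.}
\end{aligned}
```

Under that simplifying assumption, extending the cascade to N layers adds the terms F³, …, F^N, which gives a polynomial of degree N in F.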

In addition, from an architectural point of view, the longest path in the cascade change network architecture represents the learning of high-level features through deep stacking, while the connections from the different convolutional layers represent the learning of features at different scales, as in a parallel architecture. The architecture therefore learns features along different dimensions at the same time (deep network learning and parallel learning at different scales), and can thus provide a recognition performance close to that of mainstream network architectures (the aforementioned Google Inception module, ResNet, PolyNet and Wider Residual Networks). Because it is a fully convolutional network architecture, however, it trains and tests faster than the aforementioned network architectures, and its model is also smaller and executes faster.
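The claim that different branches see different scales can be made concrete with the receptive field of each branch. The sketch below assumes stride-1, undilated convolutions with a 3-wide kernel; the patent does not specify kernel sizes, and the names `receptive_field` and `branch_scales` are hypothetical.

```python
def receptive_field(num_layers: int, kernel_size: int = 3) -> int:
    """Receptive field width after stacking stride-1, undilated convolutions."""
    return 1 + num_layers * (kernel_size - 1)


# Branch k feeds f(x)_k to the output layer; k = 0 is the shortcut from x.
branch_scales = {k: receptive_field(k) for k in range(3)}
```

For the two-layer example, the three inputs of the output layer then cover widths 1, 3 and 5, so the longest path carries the widest context while the shorter paths keep finer detail.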

The cascade change network architecture of the present embodiment has structural diversity in its architectural design, which can improve learning efficiency and recognition performance; this structural diversity strengthens the network's ability to learn and recognize different image details. At the same time, the cascade change network architecture has the property of a residual network in deep network learning, in that it preserves the update information of the network weights. With both of these advantages, the design of the cascade change network architecture is more flexible and can be applied to real-world problems and on physical devices (such as mobile phones, IoT devices and robot development hardware). Experiments show that, for the same memory usage, the cascade change network architecture provides more accurate and more efficient recognition results, and this research result also makes the face emotion recognition system of the invention easier to deploy in practical applications.

A practical operating scenario can be illustrated with security in public areas. For example, multiple cameras are installed in a public place (such as an airport) with the main goal of clearly capturing the face images of pedestrians. The system architecture of the invention can then analyze, in the background, the abnormal emotional expressions of a specific pedestrian (who might, for example, be a terrorist or a fugitive), and the back end can send a signal to alert on-site security personnel to that person's situation. Because this analysis requires real-time computation and feedback, both the computation speed and the accuracy of the model must reach a certain standard.

Finally, it should be noted that the above embodiments serve only to illustrate the technical solution of the invention and not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the invention, and all of them should be covered by the scope of the claims and the specification of the invention.

Claims

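1. An emotion identification method based on the cascade change network architecture, characterized by including:

Obtaining image information;

Inputting the image information into a face detection model, which outputs face image information;

Inputting the face image information into an emotion recognition model, which outputs a probability value for each emotion;

Selecting the maximum probability value from the probability values of the emotions, and outputting the emotion corresponding to the maximum probability value.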

2. The emotion identification method based on the cascade change network architecture according to claim 1, characterized in that the face detection model and the emotion recognition model use the cascade change network architecture.

3. The emotion identification method based on the cascade change network architecture according to claim 2, characterized in that the cascade change network architecture includes an input layer, a hidden layer and an output layer; the hidden layer includes a plurality of convolutional layers, from a first convolutional layer to an N-th convolutional layer, where N is a positive integer greater than or equal to 2; the input value of the first convolutional layer is the initial input value of the input layer, and the input value of each remaining convolutional layer is the output value of the preceding convolutional layer; and the multiple input values of the output layer are the initial input value of the input layer and the output value of each convolutional layer.

4. The emotion identification method based on the cascade change network architecture according to claim 3, characterized in that, letting the initial input value of the input layer be x, the output value of the output layer be Y, and the output values of the convolutional layers be f(x)₁, f(x)₂, …, f(x)_N respectively, the output value of the output layer is Y = x + f(x)₁ + f(x)₂ + … + f(x)_N.

5. An emotion identification system based on the cascade change network architecture, based on the method of claim 1, characterized by including:

An input module, for obtaining image information;

A face detection module, for inputting the image information into a face detection model, which outputs face image information;

An emotion recognition module, for inputting the face image information into an emotion recognition model, which outputs a probability value for each emotion;

An output module, for selecting the maximum probability value from the probability values of the emotions and outputting the emotion corresponding to the maximum probability value.

6. The emotion identification system based on the cascade change network architecture according to claim 5, characterized in that the face detection model and the emotion recognition model use the cascade change network architecture.

7. The emotion identification system based on the cascade change network architecture according to claim 6, characterized in that the cascade change network architecture includes an input layer, a hidden layer and an output layer; the hidden layer includes a plurality of convolutional layers, from a first convolutional layer to an N-th convolutional layer, where N is a positive integer greater than or equal to 2; the input value of the first convolutional layer is the initial input value of the input layer, and the input value of each remaining convolutional layer is the output value of the preceding convolutional layer; and the multiple input values of the output layer are the initial input value of the input layer and the output value of each convolutional layer.

8. The emotion identification system based on the cascade change network architecture according to claim 7, characterized in that, letting the initial input value of the input layer be x, the output value of the output layer be Y, and the output values of the convolutional layers be f(x)₁, f(x)₂, …, f(x)_N respectively, the output value of the output layer is Y = x + f(x)₁ + f(x)₂ + … + f(x)_N.