CN112036281A - Facial expression recognition method based on improved capsule network

Facial expression recognition method based on improved capsule network

Info

Publication number
CN112036281A
Authority
CN
China
Prior art keywords
capsule
layer
expression
network
picture
Prior art date
Legal status
Granted
Application number
CN202010860025.4A
Other languages
Chinese (zh)
Other versions
CN112036281B (en)
Inventor
张会焱
敖文刚
刘宗敏
Current Assignee
Chongqing Technology and Business University
Original Assignee
Chongqing Technology and Business University
Priority date
Filing date
Publication date
Application filed by Chongqing Technology and Business University filed Critical Chongqing Technology and Business University
Publication of CN112036281A
Application granted
Publication of CN112036281B
Active (legal status)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides a facial expression recognition method based on an improved capsule network, which comprises the following steps: inputting sample pictures into the improved capsule network for training; and inputting a live-action picture into the improved capsule network for recognition, and extracting the facial expression in the live-action picture. Training the improved capsule network on the sample pictures specifically comprises: S1, extracting a face region from a picture through a multi-task convolutional neural network; S2, labelling the extracted face region to obtain the expression and the head pose of the face region; S3, inputting the expression and the head pose of the face region into a generative adversarial network, which generates a face region with the expression; and S4, inputting the face region with the expression into the improved capsule network to train it. The method can accurately recognize facial expressions under different poses without considering the pose of the subject, so that recognition accuracy is ensured and recognition efficiency is effectively improved.

Description

Facial expression recognition method based on improved capsule network
Technical Field
The invention relates to a facial expression recognition method, in particular to a facial expression recognition method based on an improved capsule network.
Background
Facial expression recognition is widely used in modern production and daily life. Existing facial expression recognition methods mainly rely on deep convolutional neural network frameworks. Although deep neural networks can recognize faces to a certain extent, they struggle to recognize facial expressions under different poses, and the face angle must be adjusted during recognition, which reduces recognition efficiency. On the other hand, facial expression recognition depends on the individual features of organs such as the eyes, nose and mouth; existing methods cannot capture the relative positions between these organs, so recognition accuracy is low.
Therefore, a technical means for solving the above problems is needed.
Disclosure of Invention
In view of the above, the present invention aims to provide a facial expression recognition method based on an improved capsule network that can accurately recognize facial expressions under different poses without considering the pose of the subject, thereby ensuring recognition accuracy while effectively improving recognition efficiency.
The invention provides a facial expression recognition method based on an improved capsule network, which comprises the following steps:
inputting sample pictures into the improved capsule network for training;
inputting a live-action picture into the improved capsule network for recognition, and extracting the facial expression in the live-action picture;
wherein training the improved capsule network on the sample pictures specifically comprises the following steps:
S1, extracting a face region from a picture through a multi-task convolutional neural network;
S2, labelling the extracted face region to obtain the expression and the head pose of the face region;
S3, inputting the expression and the head pose of the face region into a generative adversarial network, the generative adversarial network generating a face region with the expression;
and S4, inputting the face region with the expression into the improved capsule network to train the improved capsule network.
Further, in step S3, generating the face region with the expression by the generative adversarial network specifically comprises:
the generative adversarial network comprises an encoder and a decoder;
inputting the expression and the head pose of the face region into the encoder for processing, the encoder outputting the face picture features, the expression and the pose;
inputting the face picture features, the expression and the pose into the decoder for processing, the decoder outputting the face picture with the expression;
constructing the objective function of the generative adversarial network:

$$\min_G\max_D\;\mathbb{E}_{x,y\sim p_d(x,y)}\left[\log D(x,y)\right]+\mathbb{E}_{x,y\sim p_d(x,y)}\left[\log\left(1-D(G(x,y),y)\right)\right]\qquad(1)$$

wherein x is the face region, y denotes the expression label and the pose label, D(x, y) is the output of the discriminator, which is either true or false; G(x, y) is the output of the generator, namely a generated face picture; D(G(x, y), y) is the result of the discriminator D judging the face picture generated by the generator G; p_d(x, y) is the joint probability of x and y; and E_{x,y~p_d(x,y)} denotes the expectation with respect to p_d(x, y);
and judging the expressive face picture output by the decoder with the objective function of the generative adversarial network, and outputting the expressive face picture whose judgment result is real.
Further, step S4 specifically comprises:
the improved capsule network has a relu convolution layer, an initial capsule layer prim_cap, a first convolutional capsule layer conv_cap1, a second convolutional capsule layer conv_cap2 and a classification capsule layer class_cap;
inputting the expressive face picture generated by the generative adversarial network into the relu convolution layer for processing, and outputting the local features of the face picture;
the initial capsule layer prim_cap processes the local features of the face picture output by the relu convolution layer and outputs 32 capsules;
the first convolutional capsule layer conv_cap1 processes the 32 capsules output by the initial capsule layer prim_cap and outputs 32 capsules;
the second convolutional capsule layer conv_cap2 processes the 32 capsules output by the first convolutional capsule layer conv_cap1 and outputs 32 capsules;
the classification capsule layer class_cap processes the 32 capsules output by the second convolutional capsule layer conv_cap2 and outputs 7 capsules, the 7 capsules corresponding to the 7 facial expression categories.
Further, the 32 capsules output by the initial capsule layer prim_cap are passed in turn through the first convolutional capsule layer conv_cap1, the second convolutional capsule layer conv_cap2 and the classification capsule layer class_cap by means of T-EM routing, which specifically comprises the following steps:

determining the voting matrix V_ij from a lower-layer capsule i to a higher-layer capsule j:

$$V_{ij}=P_i\cdot W_{ij}$$

wherein P_i is the pose matrix of the lower-layer capsule i, and W_ij is the viewpoint-invariant matrix from the lower-layer capsule i to the higher-layer capsule j; the k-th element of the voting matrix V_ij is denoted V_ij^k;

whether the element V_ij^k belongs to the higher-layer capsule j is determined by the T distribution:

$$p_j^k\left(V_{ij}^k\right)=\frac{\Gamma\left(\frac{\nu_j^k+1}{2}\right)}{\Gamma\left(\frac{\nu_j^k}{2}\right)\sqrt{\pi\nu_j^k}\,\sigma_j^k}\left(1+\frac{\left(d_{ij}^k\right)^2}{\nu_j^k}\right)^{-\frac{\nu_j^k+1}{2}}\qquad(2)$$

wherein Γ(·) is the gamma function, d_ij^k is the Mahalanobis distance from the element V_ij^k to the mean μ_j^k, μ_j^k is the expectation of the T distribution, ν_j^k is the degree of freedom of the T distribution, (σ_j^k)^2 is the variance of the T distribution, and π is the circumference ratio; wherein

$$d_{ij}^k=\frac{\left|V_{ij}^k-\mu_j^k\right|}{\sigma_j^k}$$

the loss function C for classifying the I lower-layer capsules into the J higher-layer capsules is:

$$C=-\sum_{i=1}^{I}\sum_{j=1}^{J}R_{ij}\,\ln p_j\left(V_{ij}\right)\qquad(3)$$

wherein R_ij is the weight with which the i-th lower-layer capsule belongs to the j-th higher-layer capsule;

the pose matrix P_j and the activation matrix a_j of a higher-layer capsule are obtained from the pose matrices P_i and the activation matrices a_i of the lower-layer capsules by minimizing formula (3) through the T-EM routing process, which specifically comprises:

initializing the parameters:

$$R_{ij}=\frac{1}{J}$$

wherein J is the number of higher-layer capsules, and the parameters of the T distributions are given initial values;

M step:

$$R_{ij}=R_{ij}\times a_i,\quad i=1,\dots,I;$$

$$\mu_j^k=\frac{\sum_i R_{ij}V_{ij}^k}{\sum_i R_{ij}};\qquad\left(\sigma_j^k\right)^2=\frac{\sum_i R_{ij}\left(V_{ij}^k-\mu_j^k\right)^2}{\sum_i R_{ij}};$$

$$cost^k=\left(\beta_v+\ln\sigma_j^k\right)\sum_i R_{ij};\qquad a_j=\mathrm{logistic}\left(\lambda\left(\beta_a-\sum_k cost^k\right)\right)$$

wherein β_a and β_v are trainable variables, and λ is a temperature coefficient with the value 0.01; the degree of freedom ν_j^k of the T distribution is updated by solving an equation that involves the degree of freedom of the T distribution from the previous calculation;

E step: the routes are determined on the basis of the t distribution with the parameters calculated in the M step:

$$p_{ij}=\prod_k p_j^k\left(V_{ij}^k\right);\qquad R_{ij}=\frac{a_j\,p_{ij}}{\sum_{j'}a_{j'}\,p_{ij'}}$$

wherein ∏_k denotes the product over the elements k;

after iterating the M step and the E step a set number of times, the pose matrix P_j of the higher-layer capsule is obtained, the pose matrix P_j being composed of the means μ_j^k of the elements V_ij^k of the voting matrices V_ij.
Further, the capsule network is trained through a propagation loss function, wherein the propagation loss function when the t-th higher-layer capsule is the one to be activated by the lower-layer capsules is:

$$L=\sum_{i\neq t}\left(\max\left(0,\,m-\left(a_t-a_i\right)\right)\right)^2$$

wherein m is a variable margin with an initial value of 0.2 and a maximum value of 0.9, a_t is the activation value of the activated parent capsule, and a_i is the activation value of an inactive parent capsule.
The invention has the following beneficial effects: the expressions of human faces under different poses can be accurately recognized without considering the pose of the subject, so that recognition accuracy is ensured and recognition efficiency is effectively improved.
Drawings
The invention is further described below with reference to the following figures and examples:
FIG. 1 is a flow chart of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings of the specification:
the invention provides a facial expression recognition method based on an improved capsule network, which comprises the following steps:
inputting the sample picture into an improved capsule network for training;
inputting the live-action picture into an improved capsule network for recognition, and extracting the facial expression in the live-action picture;
the training of inputting the sample picture into the improved capsule network specifically comprises the following steps:
s1, extracting a face region from a picture through a multitask convolutional neural network;
s2, marking the extracted face area to obtain the expression and the head posture of the face area;
s3, inputting the expression and the head posture of the face area into a generation countermeasure network, and generating the face area with the expression for the countermeasure network;
s4, inputting the face area with the expression into an improved capsule network to train the improved capsule network; according to the invention, the expressions of the human faces under different postures can be accurately recognized without considering the posture condition of the human body, so that the recognition accuracy can be ensured, and the recognition efficiency can be effectively improved.
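The four steps above amount to a short training loop. The sketch below is only an illustration: every component (the multi-task CNN face extractor, the labelling step, the conditional GAN and the improved capsule network) is passed in as a callable with a placeholder name, and a PyTorch-style optimizer and loss are assumed; none of these names come from the patent.

```python
def train_improved_capsnet(sample_pictures, extract_face, label_face,
                           gan_generate, capsnet, loss_fn, optimizer):
    """One pass over the sample pictures following steps S1-S4.

    All arguments are placeholders for the components described in the text:
    extract_face is the multi-task CNN (S1), label_face returns the expression
    and head-pose labels (S2), gan_generate is the conditional GAN (S3), and
    capsnet / loss_fn / optimizer train the improved capsule network (S4).
    """
    for picture in sample_pictures:
        face_region = extract_face(picture)                                # S1
        expression, head_pose = label_face(face_region)                    # S2
        synthetic_face = gan_generate(face_region, expression, head_pose)  # S3
        logits = capsnet(synthetic_face)                                   # S4
        loss = loss_fn(logits, expression)          # propagation (spread) loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return capsnet
```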
In this embodiment, in step S3, generating the face region with the expression by the generative adversarial network specifically comprises:
the generative adversarial network comprises an encoder and a decoder;
inputting the expression and the head pose of the face region into the encoder for processing, the encoder outputting the face picture features, the expression and the pose; the encoder takes a 224 × 224 × 3 face picture as input and outputs a face picture feature f of length 50, and is composed of five convolution layers and one fully connected layer, wherein each convolution layer has a 5 × 5 kernel and a relu activation function, and the fully connected layer has a tanh activation function.
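As a concrete illustration, a minimal PyTorch sketch of such an encoder follows: five 5 × 5 convolutions with relu and a tanh fully connected layer mapping a 224 × 224 × 3 face image to a 50-dimensional feature f. The channel counts and the stride of 2 are assumptions, since the source does not specify them.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encoder sketch: five 5x5 convolutions with ReLU, then one fully
    connected layer with tanh, mapping a 224x224x3 face image to a feature
    vector f of length 50. Channel counts and the stride of 2 are assumptions."""

    def __init__(self, feature_len: int = 50):
        super().__init__()
        chans = [3, 32, 64, 128, 256, 256]            # assumed channel progression
        layers = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            layers += [nn.Conv2d(c_in, c_out, kernel_size=5, stride=2, padding=2),
                       nn.ReLU(inplace=True)]
        self.conv = nn.Sequential(*layers)            # 224 -> 112 -> 56 -> 28 -> 14 -> 7
        self.fc = nn.Linear(256 * 7 * 7, feature_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.conv(x).flatten(1)
        return torch.tanh(self.fc(h))                 # fully connected layer uses tanh

if __name__ == "__main__":
    f = Encoder()(torch.randn(1, 3, 224, 224))
    print(f.shape)   # torch.Size([1, 50])
```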
Inputting the face picture features, the expression and the pose into the decoder for processing, the decoder outputting the face picture with the expression; the decoder is composed of seven deconvolution layers with 5 × 5 kernels, the first six deconvolution layers having relu activation functions and the last deconvolution layer having a tanh activation function.
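A matching sketch of the decoder follows. The seven 5 × 5 transposed convolutions, the relu activations on the first six layers, the tanh on the last layer and the 224 × 224 × 3 output are taken from the text; the label sizes, channel counts, strides and the 7 × 7 tiling of the conditioned feature vector are assumptions.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Decoder sketch: seven 5x5 transposed convolutions, ReLU on the first six
    and tanh on the last, turning the 50-d feature plus expression and pose
    labels into a 224x224x3 expressive face image. Label sizes, channel counts,
    strides and the 7x7 tiling of the input are assumptions."""

    def __init__(self, feature_len=50, n_expr=7, n_pose=5):
        super().__init__()
        d = feature_len + n_expr + n_pose
        chans = [d, 256, 256, 128, 64, 32, 16, 3]     # assumed channel progression
        strides = [1, 1, 2, 2, 2, 2, 2]               # 7x7 -> 7 -> 7 -> 14 -> ... -> 224
        layers = []
        for i, (c_in, c_out) in enumerate(zip(chans[:-1], chans[1:])):
            s = strides[i]
            layers.append(nn.ConvTranspose2d(c_in, c_out, kernel_size=5, stride=s,
                                             padding=2, output_padding=s - 1))
            layers.append(nn.Tanh() if i == 6 else nn.ReLU(inplace=True))
        self.deconv = nn.Sequential(*layers)

    def forward(self, f, expr_onehot, pose):
        z = torch.cat([f, expr_onehot, pose], dim=1)          # condition the feature on labels
        z = z.view(z.size(0), -1, 1, 1).repeat(1, 1, 7, 7)    # tile to a 7x7 map (assumption)
        return self.deconv(z)                                  # (B, 3, 224, 224)

if __name__ == "__main__":
    img = Decoder()(torch.randn(2, 50), torch.randn(2, 7), torch.randn(2, 5))
    print(img.shape)   # torch.Size([2, 3, 224, 224])
```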
constructing the objective function of the generative adversarial network:

$$\min_G\max_D\;\mathbb{E}_{x,y\sim p_d(x,y)}\left[\log D(x,y)\right]+\mathbb{E}_{x,y\sim p_d(x,y)}\left[\log\left(1-D(G(x,y),y)\right)\right]\qquad(1)$$

wherein x is the face region, y denotes the expression label and the pose label, D(x, y) is the output of the discriminator, which is either true or false; G(x, y) is the output of the generator, namely a generated face picture; D(G(x, y), y) is the result of the discriminator D judging the face picture generated by the generator G; p_d(x, y) is the joint probability of x and y; and E_{x,y~p_d(x,y)} denotes the expectation with respect to p_d(x, y);
judging the expressive face picture output by the decoder with the objective function of the generative adversarial network, and outputting the expressive face picture whose judgment result is real; in this way, the facial expressions in the sample pictures can be accurately extracted, which facilitates subsequent training and final recognition.
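To make the roles of D(x, y) and D(G(x, y), y) in the objective function concrete, the sketch below computes the usual discriminator loss and a non-saturating generator loss for such a conditional GAN. D and G are assumed to be callables returning a probability in (0, 1) and an image respectively; this is an illustration, not the patent's exact training procedure.

```python
import torch

def gan_losses(D, G, x, y):
    """Adversarial losses for the conditional GAN described above.

    The discriminator is pushed towards 'real' on genuine face regions D(x, y)
    and towards 'fake' on generated ones D(G(x, y), y); the generator is
    trained so that its output is judged real.
    """
    eps = 1e-8
    real_score = D(x, y)                       # D(x, y)
    fake_img = G(x, y)                         # G(x, y)
    fake_score = D(fake_img.detach(), y)       # D(G(x, y), y), no gradient into G
    d_loss = -(torch.log(real_score + eps) + torch.log(1.0 - fake_score + eps)).mean()
    g_loss = -torch.log(D(fake_img, y) + eps).mean()   # non-saturating generator loss
    return d_loss, g_loss
```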
In this embodiment, step S4 specifically includes:
the improved capsule network has a relu convolution layer, an initial capsule layer prim_cap, a first convolutional capsule layer conv_cap1, a second convolutional capsule layer conv_cap2 and a classification capsule layer class_cap; the relu convolution layer conv_relu takes a 28 × 28 × 3 facial expression picture as input and outputs 14 × 14 × 32 local features, and consists of a 5 × 5 convolution layer, a BatchNorm layer and a relu layer.
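A sketch of this front end; stride 2 and padding 2 are assumptions chosen so that a single 5 × 5 convolution halves the spatial size from 28 to 14, as the text states.

```python
import torch
import torch.nn as nn

# conv_relu front end: one 5x5 convolution + BatchNorm + ReLU mapping a
# 28x28x3 expression image to 14x14x32 local features. Stride 2 / padding 2
# are assumed so that the spatial size is halved as described.
conv_relu = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2),
    nn.BatchNorm2d(32),
    nn.ReLU(inplace=True),
)

if __name__ == "__main__":
    out = conv_relu(torch.randn(1, 3, 28, 28))
    print(out.shape)   # torch.Size([1, 32, 14, 14])
```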
Inputting the expressive face picture generated by the generative adversarial network into the relu convolution layer for processing, and outputting the local features of the face picture;
the initial capsule layer prim_cap processes the local features of the face picture output by the relu convolution layer and outputs 32 capsules; it is composed of two 1 × 1 convolution layers with stride 1, which respectively form the pose matrices and the activation matrices of the output capsules.
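A sketch of such a primary capsule layer. The 4 × 4 pose-matrix size follows the usual EM-routing capsule convention and is an assumption; the source only states that the two 1 × 1, stride-1 convolutions form the pose matrices and the activation matrices of the 32 capsules.

```python
import torch
import torch.nn as nn

class PrimaryCaps(nn.Module):
    """prim_cap sketch: two 1x1, stride-1 convolutions over the 14x14x32 local
    features, one producing the pose matrices and one the activations of 32
    capsule types. The 4x4 pose size is an assumed convention."""

    def __init__(self, in_ch=32, n_caps=32, pose_hw=4):
        super().__init__()
        self.n_caps, self.pose_hw = n_caps, pose_hw
        self.pose_conv = nn.Conv2d(in_ch, n_caps * pose_hw * pose_hw, kernel_size=1, stride=1)
        self.act_conv = nn.Conv2d(in_ch, n_caps, kernel_size=1, stride=1)

    def forward(self, x):
        b, _, h, w = x.shape
        pose = self.pose_conv(x).view(b, self.n_caps, self.pose_hw, self.pose_hw, h, w)
        act = torch.sigmoid(self.act_conv(x))        # activation matrix, values in (0, 1)
        return pose, act

if __name__ == "__main__":
    pose, act = PrimaryCaps()(torch.randn(1, 32, 14, 14))
    print(pose.shape, act.shape)   # (1, 32, 4, 4, 14, 14) and (1, 32, 14, 14)
```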
The first convolutional capsule layer conv_cap1 processes the 32 capsules output by the initial capsule layer prim_cap and outputs 32 capsules;
the second convolutional capsule layer conv_cap2 processes the 32 capsules output by the first convolutional capsule layer conv_cap1 and outputs 32 capsules;
the classification capsule layer class_cap processes the 32 capsules output by the second convolutional capsule layer conv_cap2 and outputs 7 capsules, the 7 capsules corresponding to the 7 facial expression categories.
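Once the classification capsule layer has produced its 7 activations, the recognized expression is simply the category of the most strongly activated capsule. The snippet below illustrates this readout; the seven category names are the common basic-expression set and are an assumption, since the source only states that the 7 capsules correspond to 7 facial expressions.

```python
import torch

# Assumed basic-expression categories for the 7 class_cap capsules.
EXPRESSIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def predict_expression(class_cap_activations: torch.Tensor) -> str:
    """class_cap_activations: (7,) activation values of the classification capsules."""
    return EXPRESSIONS[int(torch.argmax(class_cap_activations))]

if __name__ == "__main__":
    print(predict_expression(torch.rand(7)))
```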
Specifically, the 32 capsules output by the initial capsule layer prim_cap are passed in turn through the first convolutional capsule layer conv_cap1, the second convolutional capsule layer conv_cap2 and the classification capsule layer class_cap by means of T-EM routing, which specifically comprises the following steps:

determining the voting matrix V_ij from a lower-layer capsule i to a higher-layer capsule j:

$$V_{ij}=P_i\cdot W_{ij}$$

wherein P_i is the pose matrix of the lower-layer capsule i, and W_ij is the viewpoint-invariant matrix from the lower-layer capsule i to the higher-layer capsule j; the k-th element of the voting matrix V_ij is denoted V_ij^k;

whether the element V_ij^k belongs to the higher-layer capsule j is determined by the T distribution:

$$p_j^k\left(V_{ij}^k\right)=\frac{\Gamma\left(\frac{\nu_j^k+1}{2}\right)}{\Gamma\left(\frac{\nu_j^k}{2}\right)\sqrt{\pi\nu_j^k}\,\sigma_j^k}\left(1+\frac{\left(d_{ij}^k\right)^2}{\nu_j^k}\right)^{-\frac{\nu_j^k+1}{2}}\qquad(2)$$

wherein Γ(·) is the gamma function, d_ij^k is the Mahalanobis distance from the element V_ij^k to the mean μ_j^k, μ_j^k is the expectation of the T distribution, ν_j^k is the degree of freedom of the T distribution, (σ_j^k)^2 is the variance of the T distribution, and π is the circumference ratio; wherein

$$d_{ij}^k=\frac{\left|V_{ij}^k-\mu_j^k\right|}{\sigma_j^k}$$

the loss function C for classifying the I lower-layer capsules into the J higher-layer capsules is:

$$C=-\sum_{i=1}^{I}\sum_{j=1}^{J}R_{ij}\,\ln p_j\left(V_{ij}\right)\qquad(3)$$

wherein R_ij is the weight with which the i-th lower-layer capsule belongs to the j-th higher-layer capsule;

the pose matrix P_j and the activation matrix a_j of a higher-layer capsule are obtained from the pose matrices P_i and the activation matrices a_i of the lower-layer capsules by minimizing formula (3) through the T-EM routing process, which specifically comprises:

initializing the parameters:

$$R_{ij}=\frac{1}{J}$$

wherein J is the number of higher-layer capsules, and the parameters of the T distributions are given initial values;

M step:

$$R_{ij}=R_{ij}\times a_i,\quad i=1,\dots,I;$$

$$\mu_j^k=\frac{\sum_i R_{ij}V_{ij}^k}{\sum_i R_{ij}};\qquad\left(\sigma_j^k\right)^2=\frac{\sum_i R_{ij}\left(V_{ij}^k-\mu_j^k\right)^2}{\sum_i R_{ij}};$$

$$cost^k=\left(\beta_v+\ln\sigma_j^k\right)\sum_i R_{ij};\qquad a_j=\mathrm{logistic}\left(\lambda\left(\beta_a-\sum_k cost^k\right)\right)$$

wherein β_a and β_v are trainable variables, and λ is a temperature coefficient with the value 0.01; the degree of freedom ν_j^k of the T distribution is updated by solving an equation that involves the degree of freedom of the T distribution from the previous calculation;

E step: the routes are determined on the basis of the t distribution with the parameters calculated in the M step:

$$p_{ij}=\prod_k p_j^k\left(V_{ij}^k\right);\qquad R_{ij}=\frac{a_j\,p_{ij}}{\sum_{j'}a_{j'}\,p_{ij'}}$$

wherein ∏_k denotes the product over the elements k;

after iterating the M step and the E step a set number of times, the pose matrix P_j of the higher-layer capsule is obtained, the pose matrix P_j being composed of the means μ_j^k of the elements V_ij^k of the voting matrices V_ij. In this way each capsule can be trained, so that the accuracy of the final recognition is ensured.
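The sketch below runs one pass of such T-EM routing over a set of votes: the E step weights each vote with a Student-t density and the M step recomputes means, variances and activations. It deliberately simplifies the procedure described above in two ways: the degree of freedom ν is held fixed instead of being re-solved at each iteration, and the cost/activation update uses the standard EM-routing form suggested by the trainable β_a and β_v. Treat it as an illustration under those assumptions, not as the patent's exact method.

```python
import math
import torch

def t_log_pdf(v, mu, sigma, nu):
    """Log density of a Student-t with mean mu, scale sigma and degrees of
    freedom nu, evaluated element-wise at the votes v."""
    d2 = ((v - mu) / sigma) ** 2                       # squared Mahalanobis distance
    return (torch.lgamma((nu + 1) / 2) - torch.lgamma(nu / 2)
            - 0.5 * torch.log(math.pi * nu) - torch.log(sigma)
            - (nu + 1) / 2 * torch.log1p(d2 / nu))

def t_em_routing(V, a_in, beta_a, beta_v, iters=3, lam=0.01, nu=5.0):
    """Route I lower-layer capsules to J higher-layer capsules.

    V: (I, J, K) votes, a_in: (I,) lower-capsule activations,
    beta_a / beta_v: trainable scalars, lam: temperature, nu: fixed d.o.f.
    Returns the higher-capsule poses (means) and activations.
    """
    I, J, K = V.shape
    R = torch.full((I, J), 1.0 / J)                    # initialise R_ij = 1/J
    nu_t = torch.as_tensor(nu)
    for _ in range(iters):
        # ---- M step ----
        Rw = R * a_in.unsqueeze(1)                     # R_ij <- R_ij * a_i
        s = Rw.sum(dim=0, keepdim=True) + 1e-8         # (1, J)
        mu = (Rw.unsqueeze(2) * V).sum(0, keepdim=True) / s.unsqueeze(2)           # means
        var = (Rw.unsqueeze(2) * (V - mu) ** 2).sum(0, keepdim=True) / s.unsqueeze(2)
        sigma = var.sqrt() + 1e-8
        cost = (beta_v + torch.log(sigma)) * s.unsqueeze(2)
        a_out = torch.sigmoid(lam * (beta_a - cost.sum(2))).squeeze(0)              # (J,)
        # ---- E step ----
        log_p = t_log_pdf(V, mu, sigma, nu_t).sum(2)   # product over k as a sum of logs
        R = a_out.unsqueeze(0) * torch.exp(log_p)
        R = R / (R.sum(dim=1, keepdim=True) + 1e-8)    # R_ij = a_j p_ij / sum_j' a_j' p_ij'
    return mu.squeeze(0), a_out

if __name__ == "__main__":
    poses, acts = t_em_routing(torch.randn(32, 7, 16), torch.rand(32),
                               beta_a=torch.zeros(()), beta_v=torch.zeros(()))
    print(poses.shape, acts.shape)   # torch.Size([7, 16]) torch.Size([7])
```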
In this embodiment, the capsule network is trained through a propagation loss function, and the trainable variables β_a and β_v are obtained during this training; the propagation loss function when the t-th higher-layer capsule is the one to be activated by the lower-layer capsules is:

$$L=\sum_{i\neq t}\left(\max\left(0,\,m-\left(a_t-a_i\right)\right)\right)^2$$

wherein m is a variable margin with an initial value of 0.2 and a maximum value of 0.9, a_t is the activation value of the activated parent capsule, and a_i is the activation value of an inactive parent capsule.
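A sketch of this propagation (spread) loss for a batch of class-capsule activations. The schedule that anneals the margin m from 0.2 towards 0.9 during training is not specified here, so m is simply passed in as a parameter.

```python
import torch

def spread_loss(a, target, m=0.2):
    """L = sum_{i != t} max(0, m - (a_t - a_i))^2 for each sample in the batch.

    a: (B, 7) activations of the classification capsules,
    target: (B,) index t of the capsule that should be activated,
    m: margin, annealed from 0.2 towards 0.9 over training.
    """
    a_t = a.gather(1, target.unsqueeze(1))                   # activation of the target capsule
    margin = torch.clamp(m - (a_t - a), min=0.0) ** 2        # (B, 7), includes the i == t column
    margin = margin.scatter(1, target.unsqueeze(1), 0.0)     # drop the i == t term
    return margin.sum(dim=1).mean()

if __name__ == "__main__":
    activations = torch.rand(4, 7)
    labels = torch.randint(0, 7, (4,))
    print(spread_loss(activations, labels))
```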
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions are intended to be covered by the claims of the present invention.

Claims (5)

1. A facial expression recognition method based on an improved capsule network, characterized by comprising the following steps:
inputting sample pictures into the improved capsule network for training;
inputting a live-action picture into the improved capsule network for recognition, and extracting the facial expression in the live-action picture;
wherein training the improved capsule network on the sample pictures specifically comprises the following steps:
S1, extracting a face region from a picture through a multi-task convolutional neural network;
S2, labelling the extracted face region to obtain the expression and the head pose of the face region;
S3, inputting the expression and the head pose of the face region into a generative adversarial network, the generative adversarial network generating a face region with the expression;
and S4, inputting the face region with the expression into the improved capsule network to train the improved capsule network.
2. The facial expression recognition method based on the improved capsule network as claimed in claim 1, wherein in step S3, generating the face region with the expression by the generative adversarial network specifically comprises:
the generative adversarial network comprises an encoder and a decoder;
inputting the expression and the head pose of the face region into the encoder for processing, the encoder outputting the face picture features, the expression and the pose;
inputting the face picture features, the expression and the pose into the decoder for processing, the decoder outputting the face picture with the expression;
constructing the objective function of the generative adversarial network:

$$\min_G\max_D\;\mathbb{E}_{x,y\sim p_d(x,y)}\left[\log D(x,y)\right]+\mathbb{E}_{x,y\sim p_d(x,y)}\left[\log\left(1-D(G(x,y),y)\right)\right]\qquad(1)$$

wherein x is the face region, y denotes the expression label and the pose label, D(x, y) is the output of the discriminator, which is either true or false; G(x, y) is the output of the generator, namely a generated face picture; D(G(x, y), y) is the result of the discriminator D judging the face picture generated by the generator G; p_d(x, y) is the joint probability of x and y; and E_{x,y~p_d(x,y)} denotes the expectation with respect to p_d(x, y);
and judging the expressive face picture output by the decoder with the objective function of the generative adversarial network, and outputting the expressive face picture whose judgment result is real.
3. The facial expression recognition method based on the improved capsule network as claimed in claim 2, wherein step S4 specifically comprises:
the improved capsule network has a relu convolution layer, an initial capsule layer prim_cap, a first convolutional capsule layer conv_cap1, a second convolutional capsule layer conv_cap2 and a classification capsule layer class_cap;
inputting the expressive face picture generated by the generative adversarial network into the relu convolution layer for processing, and outputting the local features of the face picture;
the initial capsule layer prim_cap processes the local features of the face picture output by the relu convolution layer and outputs 32 capsules;
the first convolutional capsule layer conv_cap1 processes the 32 capsules output by the initial capsule layer prim_cap and outputs 32 capsules;
the second convolutional capsule layer conv_cap2 processes the 32 capsules output by the first convolutional capsule layer conv_cap1 and outputs 32 capsules;
the classification capsule layer class_cap processes the 32 capsules output by the second convolutional capsule layer conv_cap2 and outputs 7 capsules, the 7 capsules corresponding to the 7 facial expression categories.
4. The facial expression recognition method based on the improved capsule network as claimed in claim 3, wherein the 32 capsules output by the initial capsule layer prim_cap are passed in turn through the first convolutional capsule layer conv_cap1, the second convolutional capsule layer conv_cap2 and the classification capsule layer class_cap by means of T-EM routing, which specifically comprises the following steps:

determining the voting matrix V_ij from a lower-layer capsule i to a higher-layer capsule j:

$$V_{ij}=P_i\cdot W_{ij}$$

wherein P_i is the pose matrix of the lower-layer capsule i, and W_ij is the viewpoint-invariant matrix from the lower-layer capsule i to the higher-layer capsule j; the k-th element of the voting matrix V_ij is denoted V_ij^k;

whether the element V_ij^k belongs to the higher-layer capsule j is determined by the T distribution:

$$p_j^k\left(V_{ij}^k\right)=\frac{\Gamma\left(\frac{\nu_j^k+1}{2}\right)}{\Gamma\left(\frac{\nu_j^k}{2}\right)\sqrt{\pi\nu_j^k}\,\sigma_j^k}\left(1+\frac{\left(d_{ij}^k\right)^2}{\nu_j^k}\right)^{-\frac{\nu_j^k+1}{2}}\qquad(2)$$

wherein Γ(·) is the gamma function, d_ij^k is the Mahalanobis distance from the element V_ij^k to the mean μ_j^k, μ_j^k is the expectation of the T distribution, ν_j^k is the degree of freedom of the T distribution, (σ_j^k)^2 is the variance of the T distribution, and π is the circumference ratio; wherein

$$d_{ij}^k=\frac{\left|V_{ij}^k-\mu_j^k\right|}{\sigma_j^k}$$

the loss function C for classifying the I lower-layer capsules into the J higher-layer capsules is:

$$C=-\sum_{i=1}^{I}\sum_{j=1}^{J}R_{ij}\,\ln p_j\left(V_{ij}\right)\qquad(3)$$

wherein R_ij is the weight with which the i-th lower-layer capsule belongs to the j-th higher-layer capsule;

the pose matrix P_j and the activation matrix a_j of a higher-layer capsule are obtained from the pose matrices P_i and the activation matrices a_i of the lower-layer capsules by minimizing formula (3) through the T-EM routing process, which specifically comprises:

initializing the parameters:

$$R_{ij}=\frac{1}{J}$$

wherein J is the number of higher-layer capsules, and the parameters of the T distributions are given initial values;

M step:

$$R_{ij}=R_{ij}\times a_i,\quad i=1,\dots,I;$$

$$\mu_j^k=\frac{\sum_i R_{ij}V_{ij}^k}{\sum_i R_{ij}};\qquad\left(\sigma_j^k\right)^2=\frac{\sum_i R_{ij}\left(V_{ij}^k-\mu_j^k\right)^2}{\sum_i R_{ij}};$$

$$cost^k=\left(\beta_v+\ln\sigma_j^k\right)\sum_i R_{ij};\qquad a_j=\mathrm{logistic}\left(\lambda\left(\beta_a-\sum_k cost^k\right)\right)$$

wherein β_a and β_v are trainable variables and λ is a temperature coefficient; the degree of freedom ν_j^k of the T distribution is updated by solving an equation that involves the degree of freedom of the T distribution from the previous calculation;

E step: the routes are determined on the basis of the t distribution with the parameters calculated in the M step:

$$p_{ij}=\prod_k p_j^k\left(V_{ij}^k\right);\qquad R_{ij}=\frac{a_j\,p_{ij}}{\sum_{j'}a_{j'}\,p_{ij'}}$$

wherein ∏_k denotes the product over the elements k;

after iterating the M step and the E step a set number of times, the pose matrix P_j of the higher-layer capsule is obtained, the pose matrix P_j being composed of the means μ_j^k of the elements V_ij^k of the voting matrices V_ij.
5. The facial expression recognition method based on the improved capsule network as claimed in claim 4, wherein the capsule network is trained through a propagation loss function, and the propagation loss function when the t-th higher-layer capsule is the one to be activated by the lower-layer capsules is:

$$L=\sum_{i\neq t}\left(\max\left(0,\,m-\left(a_t-a_i\right)\right)\right)^2$$

wherein m is a variable margin with an initial value of 0.2 and a maximum value of 0.9, a_t is the activation value of the activated parent capsule, and a_i is the activation value of an inactive parent capsule.
CN202010860025.4A 2020-07-29 2020-08-24 Facial expression recognition method based on improved capsule network Active CN112036281B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020107460730 2020-07-29
CN202010746073 2020-07-29

Publications (2)

Publication Number Publication Date
CN112036281A true CN112036281A (en) 2020-12-04
CN112036281B CN112036281B (en) 2023-06-09

Family

ID=73581053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010860025.4A Active CN112036281B (en) 2020-07-29 2020-08-24 Facial expression recognition method based on improved capsule network

Country Status (1)

Country Link
CN (1) CN112036281B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446609A (en) * 2018-03-02 2018-08-24 南京邮电大学 A kind of multi-angle human facial expression recognition method based on generation confrontation network
US20190303742A1 (en) * 2018-04-02 2019-10-03 Ca, Inc. Extension of the capsule network
CN108764031A (en) * 2018-04-17 2018-11-06 平安科技(深圳)有限公司 Identify method, apparatus, computer equipment and the storage medium of face
CN109063724A (en) * 2018-06-12 2018-12-21 中国科学院深圳先进技术研究院 A kind of enhanced production confrontation network and target sample recognition methods
CN109934116A (en) * 2019-02-19 2019-06-25 华南理工大学 A kind of standard faces generation method based on generation confrontation mechanism and attention mechanism
CN110197125A (en) * 2019-05-05 2019-09-03 上海资汇信息科技有限公司 Face identification method under unconfined condition
CN110533004A (en) * 2019-09-07 2019-12-03 哈尔滨理工大学 A kind of complex scene face identification system based on deep learning
CN111241958A (en) * 2020-01-06 2020-06-05 电子科技大学 Video image identification method based on residual error-capsule network

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
Geoffrey E. Hinton et al.: "Matrix Capsules with EM Routing", ICLR 2018 *
Inyoung Paik et al.: "Capsule Networks Need an Improved Routing Algorithm", arXiv *
Kanako Marusaki et al.: "Capsule GAN Using Capsule Network for Generator Architecture", arXiv *
Sara Sabour et al.: "Dynamic Routing Between Capsules", arXiv *
Yao Naiming et al.: "Robust Facial Expression Recognition Based on Generative Adversarial Networks", Acta Automatica Sinica *
Yao Yuqian: "Research on Facial Expression Feature Extraction and Recognition Algorithms Based on Capsule Networks", China Master's Theses Full-text Database, Information Science and Technology *
Yang Jucheng et al.: "A Survey of Capsule Network Models", Journal of Shandong University (Engineering Science) *
Luo Jia et al.: "A Survey of Generative Adversarial Networks", Chinese Journal of Scientific Instrument *
Chen Lin et al.: "Facial Expression Recognition Based on GAN", Electronic Technology & Software Engineering *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507916A (en) * 2020-12-16 2021-03-16 苏州金瑞阳信息科技有限责任公司 Face detection method and system based on facial expression
CN112507916B (en) * 2020-12-16 2021-07-27 苏州金瑞阳信息科技有限责任公司 Face detection method and system based on facial expression

Also Published As

Publication number Publication date
CN112036281B (en) 2023-06-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant