CN111160327A - Expression recognition method based on lightweight convolutional neural network
- Publication number: CN111160327A (application number CN202010252867.1A)
- Authority: CN (China)
- Prior art keywords: neural network, convolutional neural, lightweight, expression, convolution
- Prior art date: 2020-04-02
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V40/174 - Facial expression recognition
- G06F18/214 - Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/2411 - Classification techniques based on the proximity to a decision surface, e.g. support vector machines
- G06N3/045 - Combinations of networks
- G06N3/08 - Learning methods
- G06T5/80
- G06V10/50 - Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; projection analysis
- G06V40/168 - Feature extraction; face representation
- G06T2207/10004 - Still image; photographic image
- G06T2207/20081 - Training; learning
- G06T2207/20084 - Artificial neural networks [ANN]
- G06T2207/30201 - Face
Abstract
The invention relates to the field of artificial intelligence and in particular provides an expression recognition method based on a lightweight convolutional neural network, characterized by comprising the following steps. S1: building and training a lightweight convolutional network model, wherein the number of convolution layers of the model ranges from 36 to 58, the number of grouped-convolution groups ranges from 2 to 4, and the compression factor of the compression layers ranges from 0.3 to 0.5; S2: building a face corrector; S3: detecting and correcting an input image with the face corrector to obtain a preprocessed image; S4: classifying the facial expressions in the preprocessed image with the lightweight convolutional neural network model. The invention addresses the low recognition accuracy and low recognition speed of the prior art and achieves high real-time performance while maintaining accuracy.
Description
Technical Field
The invention relates to the field of computer vision, in particular to an expression recognition method based on a lightweight convolutional neural network.
Background
Emotion is a cognitive experience produced by human beings during intense psychological activity and is an important element guiding communication in social environments. Emotion arises from a variety of sources, including mood, character, and motivation. Facial expressions, as a unique signal transmission system, convey a person's psychological state and are one of the effective means of analyzing emotion. Expression recognition mainly comprises four stages: face localization, face correction, feature extraction, and expression classification. Feature extraction and expression classification are the key parts of this pipeline and the core difficulties of expression recognition. Conventional methods extract facial information using manually designed geometric features based on the geometric attributes of the image and appearance features based on its grayscale information. These methods achieve high recognition accuracy on specific data distributions, but they handle large pose changes poorly and generalize badly to other data sets. In recent years, data-driven methods have attracted attention. For example, convolutional neural network models learn features directly from data through weight sharing and downsampling and are robust to changes such as pose, occlusion and lighting. However, to obtain higher accuracy, researchers have kept deepening the models, leading to excessive numbers of parameters, which hinders both model training and practical deployment.
Disclosure of Invention
In order to solve the technical problems of low recognition accuracy and low recognition speed in the prior art, the invention provides an expression recognition method based on a lightweight convolutional neural network, which uses a computation-amount reduction parameter N to determine the parameters of the lightweight convolutional network model, and comprises the following steps:
s1: building and training a lightweight convolution network model, and collecting input image information by adopting the lightweight convolution network model; the range of the convolution layer number of the lightweight convolution network model is 36-58, and the range of the compression factor of the compression layer is 0.3-0.5;
s2: building a human face corrector;
s3: detecting and correcting the input image information by adopting a face corrector to obtain a preprocessed image;
s4: classifying the facial expressions in the preprocessed image by adopting the lightweight convolutional neural network model;
the building and training of the lightweight convolutional network model comprises the following steps:
s1.1: building a network model, and transmitting the output of each convolutional layer into a subsequent convolutional layer as an additional input, wherein the number of initial grouped convolutional groups is 2-4, and the number of convolutional layers of a single dense block is not less than 12;
s1.2: determining the structural parameters of the lightweight convolutional neural network: the growth rate k, the convolution filter length H_k, the convolution filter width W_k, and the number of convolution layers L;
determining the growth rate k, the convolution filter length H_k, the convolution filter width W_k, and the number of convolution layers L comprises the following steps:
calculating a computation-amount reduction parameter N based on the lightweight convolutional neural network model, and using the growth rate k, convolution filter length H_k, convolution filter width W_k, and number of convolution layers L at which N is minimal as the parameters of the lightweight convolutional network model,
where k is the growth rate of the structural parameters, H_k is the length of the convolution filter, W_k is the width of the convolution filter, and L is the number of convolution layers.
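The formula defining the computation-amount reduction parameter N appears only as an image in the source text and is not reproduced above. Purely as a hedged illustration, assuming N compares the cost of a grouped, depthwise-separable dense layer with that of a standard dense convolution (this specific form is an editorial assumption, not the patent's published formula), one consistent candidate is:

```latex
% Candidate form only: ratio of lightweight to standard convolution cost,
% summed over the L layers of a dense block (C_l = input channels of layer l, g = number of groups).
N \;=\; \frac{\sum_{l=1}^{L}\bigl(H_k W_k\, C_l \;+\; C_l\, k / g\bigr)}
             {\sum_{l=1}^{L} H_k W_k\, C_l\, k}
  \;=\; \frac{1}{k} \;+\; \frac{1}{g\, H_k W_k}
```

Minimizing such a ratio over k, H_k, W_k and L, subject to the structural constraints of step S1.1, would then yield the parameters selected in step S1.2.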
Preferably, S1 includes: training the lightweight convolutional neural network model according to a FERPLUS expression recognition database.
Preferably, the face corrector is built by adopting HOG characteristics and SVM algorithm.
Preferably, the S3 includes: detecting at least four reference points of a face in an input image through a logistic regression tree, matching the at least four reference points through the face corrector, and segmenting the input image according to the at least four reference points to obtain a preprocessed image.
Preferably, the step of training the lightweight convolutional neural network comprises:
acquiring a training sample, wherein the training sample comprises at least 1000 first expression images;
flipping, rotating, cropping, scaling and deforming each first expression image by using a data augmentation method to obtain at least 10 corresponding second expression images;
randomly intercepting at least one picture block in the second expression image to obtain a third expression image with a blank area;
training a lightweight convolutional neural network model using the third expression image.
Preferably, the step of building the face corrector by using the HOG feature and the SVM algorithm comprises:
acquiring a training sample, wherein the training sample comprises at least 1000 standard face images;
the gradient value and gradient direction of the HOG feature of the standard face image are calculated as follows:
G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}, \quad \theta(x,y) = \arctan\bigl(G_y(x,y) / G_x(x,y)\bigr)
where x is the abscissa of the specified pixel point, y is the ordinate of the specified pixel point, G_x(x,y) and G_y(x,y) are the gradient values in the horizontal and vertical directions at point (x,y) of the image and take values in the range 0 to 255, G(x,y) is the gradient value of the pixel point, and θ(x,y) is the gradient direction of the pixel point, limited to between 0 and 180 degrees;
building a human face SVM model according to the support vector machine principle;
training a human face SVM model by using the gradient value and the gradient direction of the obtained HOG characteristic of the standard human face image to obtain a training result;
the training results are used to form a face detector.
Preferably, the step of training the lightweight convolutional neural network model according to the FERPLUS expression recognition database includes calculating an updated gradient value from the update direction of the current gradient, the update direction of the gradient of the previous step, the current gradient calculated from the second derivative of the gradient, and two decay weights.
Preferably, an expression classifier for at least one class of expressions, constructed with the lightweight convolutional neural network model, obtains the predicted probability of each expression from the extracted features fed into the Softmax layer, and the calculation formula is:
p_i = p(y_i \mid x_i) = \frac{e^{w_i^T x_i}}{\sum_j e^{w_j^T x_i}}
where y_i is the label of the i-th class of expression, x_i is the input feature of the i-th class, w denotes in a generalized way all the weights of the dense network, p is the vector composed of the predicted probabilities of all expressions, T denotes the transpose, and i, j, k are integer variables.
According to the technical scheme provided by the invention, on the basis of face localization and face correction, a lightweight convolution scheme is realized by using the computation-amount reduction parameter N, so that the amount of computation is reduced while the accuracy of dense convolution is retained, giving the advantages of high accuracy and low computation. The invention combines facial feature extraction and expression classification in a single lightweight convolutional neural network model to realize facial expression recognition; it recognizes facial expressions with a single camera and image processing in a laboratory environment, offers high real-time performance while maintaining accuracy, and effectively analyzes facial expression information.
Drawings
Fig. 1 is a flowchart of an expression recognition method based on a lightweight convolutional neural network according to an embodiment of the present invention.
Fig. 2 is a schematic detection diagram of a face detector according to an embodiment of the present invention.
Fig. 3 is a schematic correction diagram of a face corrector according to an embodiment of the present invention.
Fig. 4a is a result of accuracy of the identification method of the lightweight convolutional neural network according to the first embodiment of the present invention on a verification data set.
Fig. 4b is a result of accuracy of the identification method of the lightweight convolutional neural network according to the first embodiment of the present invention on the verification data set.
Fig. 4c is a result of accuracy of the identification method of the lightweight convolutional neural network according to the first embodiment of the present invention on the verification data set.
Fig. 4d is a result of accuracy of the identification method of the lightweight convolutional neural network according to the first embodiment of the present invention on the verification data set.
Fig. 4e is a result of accuracy of the identification method of the lightweight convolutional neural network according to the first embodiment of the present invention on the verification data set.
Fig. 5a is a comparison graph of model parameters required by the identification method of the lightweight convolutional neural network according to the first embodiment of the present invention compared with other models.
Fig. 5b is a comparison graph of the model calculation amount compared with other models in the identification method of the lightweight convolutional neural network according to the first embodiment of the present invention.
Fig. 6a is a learning curve diagram of a lightweight convolutional neural network identification method in data set FER2013 according to an embodiment of the present invention.
Fig. 6b is a learning curve diagram of the lightweight convolutional neural network identification method in the data set FERPLUS according to the first embodiment of the present invention.
Fig. 6c is a learning curve diagram of the lightweight convolutional neural network identification method in the data set FERFIN according to the first embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the scope of the present invention.
Example one
The embodiment provides an expression recognition method based on a lightweight convolutional neural network.
As shown in fig. 1, the expression recognition method based on the lightweight convolutional neural network provided in this embodiment includes four parts: inputting a single-frame image, detecting the face, correcting the face, and recognizing the expression. Starting from the original input image, the class of the facial expression is predicted after two image-processing stages (face detection and face correction).
The term lightweight in this embodiment refers to a convolutional computation scheme that is more efficient and requires less computation than standard convolution, thereby reducing the computational complexity of the convolutional network and improving its efficiency. A dense block refers to a densely connected group of convolutional layers; within a certain range, the more such layers there are, the higher the model accuracy. This embodiment makes the computation of the convolutional network lightweight without changing its dense connectivity: the parameters of the convolutional network are optimized to obtain a lightweight neural network structure, improving computational efficiency while maintaining the accuracy of the dense design.
Recognition accuracy and computation time are the two criteria for detecting and localizing faces in a human-computer interaction environment; however, in view of the real-time requirement of the expression recognition system, features and learning algorithms with higher computational speed need to be selected, on the premise of guaranteeing a certain accuracy, to optimize the parameters of the lightweight convolution model.
Considering that dense connections make the computation of the model excessively large and training slow, the lightweight convolutional neural network used in this embodiment optimizes the convolutional layers. A network model is built in which the output of each convolutional layer is passed to the subsequent convolutional layers as an additional input, the number of initial grouped-convolution groups is 2 to 4, and the number of convolutional layers of a single dense block is not less than 12. Here k is the growth rate of the structural parameters, H_k is the length of the convolution filter, W_k is the width of the convolution filter, and L is the number of convolution layers. The growth rate k, filter length H_k, filter width W_k, and layer count L at which the computation-amount reduction parameter N is minimal are used as the parameters of the lightweight convolution model.
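The patent describes this structure only in prose. As an illustrative sketch only, assuming a PyTorch implementation (module names, channel counts and defaults are editorial assumptions, not the patented design), a dense block whose layers use grouped pointwise plus depthwise convolutions could look like this:

```python
import torch
import torch.nn as nn

class LightDenseLayer(nn.Module):
    """One densely connected layer: grouped 1x1 conv followed by a depthwise 3x3 conv.

    Hypothetical sketch of the grouped, depthwise-separable convolution described
    in the text; names and defaults are assumptions, not the patented design.
    """
    def __init__(self, in_channels: int, growth_rate: int, groups: int = 2,
                 kernel_size: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            # grouped pointwise convolution divides the channel-mixing cost by the group count
            nn.Conv2d(in_channels, growth_rate, kernel_size=1, groups=groups, bias=False),
            nn.BatchNorm2d(growth_rate),
            nn.ReLU(inplace=True),
            # depthwise convolution: one Hk x Wk filter per channel
            nn.Conv2d(growth_rate, growth_rate, kernel_size=kernel_size,
                      padding=kernel_size // 2, groups=growth_rate, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # dense connectivity: the layer output is concatenated onto its input
        return torch.cat([x, self.body(x)], dim=1)

class LightDenseBlock(nn.Module):
    """A dense block of at least 12 such layers, as the embodiment requires."""
    def __init__(self, in_channels: int, growth_rate: int = 12,
                 num_layers: int = 12, groups: int = 2):
        super().__init__()
        layers, channels = [], in_channels
        for _ in range(num_layers):
            layers.append(LightDenseLayer(channels, growth_rate, groups))
            channels += growth_rate
        self.block = nn.Sequential(*layers)
        self.out_channels = channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)

if __name__ == "__main__":
    block = LightDenseBlock(in_channels=24, growth_rate=12, num_layers=12, groups=2)
    y = block(torch.randn(1, 24, 48, 48))   # feature map sized like a 48x48 grayscale input
    print(y.shape)                           # torch.Size([1, 168, 48, 48])
```

Concatenating each layer's output onto its input reproduces the "output of each convolutional layer passed to the subsequent layers as an additional input" behaviour, while the grouped 1x1 and depthwise Hk x Wk convolutions are what make the block lightweight.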
The present embodiment forms a face detector based on the HOG features and the SVM algorithm for detecting the face position in a single-frame image. Specifically, this embodiment obtains training samples comprising 3000 face images from the LFW database; computes the HOG features of the face images according to the construction method of the histogram of oriented gradients; trains a face-detection SVM model with the extracted HOG features; and forms the face detector from the training result. The HOG gradient calculation formula is:
G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}, \quad \theta(x,y) = \arctan\bigl(G_y(x,y) / G_x(x,y)\bigr)
where x is the abscissa of the specified pixel point, y is the ordinate of the specified pixel point, G_x(x,y) and G_y(x,y) are the gradient values in the horizontal and vertical directions at point (x,y) of the image and take values in the range 0 to 255, G(x,y) is the gradient value of the pixel point, and θ(x,y) is the gradient direction of the pixel point, limited to between 0 and 180 degrees.
As shown in fig. 2, after an original image is input, the HOG feature representation of the image is first computed, then the HOG features of the trained standard face are compared with it, and finally the position of the face in the original image is found and output.
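As a minimal numeric sketch only (assuming NumPy; the function and variable names are editorial, not from the patent), the per-pixel gradient value and direction defined above can be computed as:

```python
import numpy as np

def hog_gradients(image: np.ndarray):
    """Per-pixel gradient magnitude and direction for a grayscale image (values 0-255).

    Illustrative sketch of the formula above; a full HOG descriptor would further
    bin these directions into cell histograms and normalize over blocks.
    """
    img = image.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # central differences for the horizontal and vertical gradients Gx, Gy
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    magnitude = np.sqrt(gx ** 2 + gy ** 2)              # G(x, y)
    direction = np.degrees(np.arctan2(gy, gx)) % 180.0  # theta(x, y), limited to [0, 180)
    return magnitude, direction

if __name__ == "__main__":
    toy = (np.random.rand(48, 48) * 255).astype(np.uint8)
    mag, ang = hog_gradients(toy)
    print(mag.shape, ang.min(), ang.max())
```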
This embodiment uses an ensemble of regression trees to detect reference points in the face image block so as to correct the face in a single-frame image. Specifically, training samples are obtained, comprising 2000 training face images and 330 test face images; a regression-tree ensemble with shape-invariant split features is trained on these samples; and the training result is used to construct the face corrector. Fig. 3 is a schematic correction diagram of the face corrector according to the first embodiment. As shown in fig. 3, after a face image block is input, the 68 feature points of the face are first computed, then compared with the 68 feature points of the standard face, and finally the face image block is corrected.
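As an illustrative sketch only (assuming OpenCV and that the 68 landmarks have already been obtained from the regression-tree ensemble; the eye-index convention, template positions and helper names are editorial assumptions), the correction step can be realized as a similarity transform that maps the detected eye centers onto the eye centers of a standard face:

```python
import cv2
import numpy as np

# indices of the left/right eye landmarks in the common 68-point convention (assumed)
LEFT_EYE = list(range(36, 42))
RIGHT_EYE = list(range(42, 48))

def align_face(image: np.ndarray, landmarks: np.ndarray,
               std_left=(30.0, 30.0), std_right=(66.0, 30.0), size=(96, 96)):
    """Warp a face image block so its eye centers match a standard face template.

    `landmarks` is a (68, 2) array of detected reference points; the standard
    eye positions and output size are illustrative values, not from the patent.
    """
    left = landmarks[LEFT_EYE].mean(axis=0)
    right = landmarks[RIGHT_EYE].mean(axis=0)
    # rotation angle and scale that map the detected eye line onto the template eye line
    dx, dy = right[0] - left[0], right[1] - left[1]
    angle = np.degrees(np.arctan2(dy, dx))
    scale = np.hypot(std_right[0] - std_left[0], std_right[1] - std_left[1]) / np.hypot(dx, dy)
    center = ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)
    matrix = cv2.getRotationMatrix2D(center, angle, scale)
    # shift so the midpoint of the eyes lands on the template midpoint
    matrix[0, 2] += (std_left[0] + std_right[0]) / 2.0 - center[0]
    matrix[1, 2] += (std_left[1] + std_right[1]) / 2.0 - center[1]
    return cv2.warpAffine(image, matrix, size)
```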
This embodiment uses the lightweight convolutional neural network to extract features from the corrected face and predict its expression class. After the lightweight convolutional network model is built, the model parameters are optimized through extremum calculation; the facial expression recognition database FERFIN is acquired; the expression data are preprocessed, for example by data augmentation; the model is trained on the preprocessed expression image data set; and the training result is taken as the final expression classification model.
Considering that applications in real environments require high real-time performance and that an excessively large neural network architecture increases the amount of computation, the number of convolution layers of the lightweight convolutional network model is set in the range 36 to 58 and the compression factor of the compression layers in the range 0.3 to 0.5. Using the growth rate k, convolution filter length H_k, convolution filter width W_k, and number of convolution layers L at which the computation-amount reduction parameter N is minimal as the parameters of the lightweight convolutional network model helps reduce the number of model parameters while learning more distinctive features.
Between the dense blocks there are transition layers to accomplish parameter compression and adjust the computation variables. After the 3 dense blocks, the feature tensor computed by the model is fed into a fully connected network layer, which maps the features extracted from the image into a 1 × 7 vector in which the value at each position represents the confidence of the corresponding expression class. The vector composed of the prediction probabilities of all expressions is calculated as:
p_i = p(y_i \mid x_i) = \frac{e^{w_i^T x_i}}{\sum_j e^{w_j^T x_i}}
where y_i is the label of the i-th class of expression, x_i is the input feature of the i-th class, w denotes in a generalized way all the weights of the dense network, p is the vector composed of the predicted probabilities of all expressions, T denotes the transpose, and i, j, k are integer variables.
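As an illustrative sketch only (assuming PyTorch; the layer names and channel count are editorial assumptions), the fully connected head that maps the pooled feature tensor into a 7-class probability vector could look like this:

```python
import torch
import torch.nn as nn

class ExpressionHead(nn.Module):
    """Maps the feature tensor of the last dense block to 7 expression probabilities."""
    def __init__(self, in_channels: int, num_classes: int = 7):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # collapse the spatial dimensions
        self.fc = nn.Linear(in_channels, num_classes)  # w_i^T x for every class i

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        x = self.pool(features).flatten(1)
        logits = self.fc(x)
        return torch.softmax(logits, dim=1)            # p: vector of class probabilities

if __name__ == "__main__":
    head = ExpressionHead(in_channels=168)
    probs = head(torch.randn(2, 168, 12, 12))
    print(probs.shape, float(probs[0].sum()))          # (2, 7), ~1.0
```

During training one would normally keep the logits and apply a cross-entropy loss; the explicit softmax here simply mirrors the probability vector p described in the text.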
This embodiment trains the expression classifier with the lightweight convolutional neural network. A deep convolutional neural network model requires a large amount of training data to achieve high accuracy, so this embodiment adopts the FERFIN data set as the training data set. The FERFIN data set, improved from the FER2013 data set, comprises 12858 "neutral" images, 9354 "happy" images, 4462 "sad" images, 4351 "angry" images, 3082 "disgust" images, and 575 and 816 "afraid" images, totalling 35498 grayscale facial expression images of 48 by 48 pixels.
Considering the variations in pose, lighting, occlusion and so on present in the expression recognition task, this embodiment processes the FERFIN database in two steps. First, using data augmentation, each original picture is flipped, rotated, cropped, scaled and deformed to obtain twelve new pictures; then a 16-by-16-pixel block is randomly blanked out in each new picture to obtain pictures with blank areas.
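A minimal sketch of such a pipeline, assuming torchvision transforms plus a small custom cutout step (the specific parameter values and helper names are assumptions, not the patent's settings):

```python
import random
import torch
from torchvision import transforms

class RandomCutout:
    """Blank out one randomly placed square block, as in the embodiment's second step."""
    def __init__(self, block: int = 16):
        self.block = block

    def __call__(self, img: torch.Tensor) -> torch.Tensor:
        _, h, w = img.shape
        y = random.randint(0, h - self.block)
        x = random.randint(0, w - self.block)
        img = img.clone()
        img[:, y:y + self.block, x:x + self.block] = 0.0
        return img

# flip / rotate / crop / scale-deform, then convert to tensor and cut out a block
augment = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=15),
    transforms.RandomResizedCrop(48, scale=(0.8, 1.0)),
    transforms.RandomAffine(degrees=0, shear=10),   # mild deformation
    transforms.ToTensor(),
    RandomCutout(block=16),
])
```

The `augment` pipeline expects a PIL image (as produced by torchvision datasets) and returns a 1×48×48 tensor with one randomly blanked 16×16 block.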
In order to accelerate the convergence of the lightweight convolutional network model, this embodiment obtains the current gradient with the following momentum method instead of the conventional gradient descent method: the update directions of the current and previous gradient steps, the current gradient calculated from the second derivative of the gradient, and two decay weights together give the updated gradient value.
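The update rule itself is reproduced in the source only as an image. As a hedged reading, a two-decay-weight momentum update of the form v_t = β1·v_{t-1} + β2·g_t, followed by w ← w − η·v_t, is consistent with the description; a minimal sketch under that assumption (the learning rate η and the function name are editorial, not from the patent):

```python
import torch

def momentum_step(params, velocities, lr: float = 0.01,
                  beta1: float = 0.9, beta2: float = 0.1):
    """One hand-rolled momentum update: v_t = beta1 * v_{t-1} + beta2 * g_t.

    Assumed reading of the two-decay-weight update described above; in practice
    the built-in torch.optim.SGD(momentum=...) or Adam would normally be used.
    """
    with torch.no_grad():
        for p, v in zip(params, velocities):
            if p.grad is None:
                continue
            v.mul_(beta1).add_(p.grad, alpha=beta2)  # accumulate the update direction
            p.add_(v, alpha=-lr)                     # apply the updated gradient value
```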
As shown in FIGS. 4a, 4b, 4c, 4d and 4e, the PGC-DenseNet model used here to optimize DenseNet comprises 3 dense blocks of 12 convolutional layers each; before each input enters a dense block, all convolutional layers are converted to grouped, depthwise-separable convolutions, and the computation-amount reduction parameter N is used to optimize the parameters. The model with optimized parameters is compared with other popular lightweight networks, and the results show that the method exceeds the other models in accuracy, converges earlier, reaches 80% accuracy faster, and attains a higher final accuracy.
As shown in FIG. 5a, the lightweight models compared on model parameters are, from left to right, PGC-DenseNet, SqueezeNet, ShuffleNet1, ShuffleNet2, MobileNet1, MobileNet2 and MobileNet3; as shown in FIG. 5b, the same models, in the same left-to-right order, are compared on model computation amount. As can be seen from FIGS. 5a and 5b, the PGC-DenseNet model has the fewest parameters, only about 250,000, while its computation amount is of the same order as that of the other models; compared with the other lightweight models, the parameter count is reduced by up to a factor of six.
Fig. 6a, fig. 6b, and fig. 6c are the learning curves of the PGC-DenseNet model on the data sets FER2013, FERPLUS and FERFIN respectively; in each graph, the smoother, continuously rising curve that ends up on top is the training set, and the curve that fluctuates more and tends to converge is the test set. As can be clearly seen from figs. 6a, 6b and 6c, the learning curves of the model on the training and verification sets exhibit sufficient robustness against overfitting, and the training-set and verification-set curves fit closely within 150 epochs.
The invention provides an expression recognition method based on a lightweight convolutional neural network, comprising: preprocessing the facial expression data, training the lightweight convolutional neural network model to obtain an expression classification model, training the face corrector, detecting the face in a single-frame image, correcting and recording the face in the single-frame image, and recognizing the expression in the single-frame image to obtain the expression class. The invention uses the computation-amount reduction parameter N to optimize the model parameters and adopts a lightweight convolution scheme, so that, while retaining the accuracy of dense convolution, the amount of computation is reduced, giving the advantages of high accuracy and low computation; online expression recognition is realized with a single camera and a network transmission scheme in a laboratory environment, and the method offers high real-time performance while maintaining accuracy.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only examples of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (8)
1. An expression recognition method based on a lightweight convolutional neural network, characterized in that a computation-amount reduction parameter N is used to determine the parameters of the lightweight convolutional network model, comprising the following steps:
s1: building and training a lightweight convolution network model, and collecting input image information by adopting the lightweight convolution network model; the range of the convolution layer number of the lightweight convolution network model is 36-58, and the range of the compression factor of the compression layer is 0.3-0.5;
s2: building a human face corrector;
s3: detecting and correcting the input image information by adopting a face corrector to obtain a preprocessed image;
s4: classifying the facial expressions in the preprocessed image by adopting the lightweight convolutional neural network model;
the building and training of the lightweight convolutional network model comprises the following steps:
s1.1: building a network model, and transmitting the output of each convolutional layer into a subsequent convolutional layer as an additional input, wherein the number of initial grouped convolutional groups is 2-4, and the number of convolutional layers of a single dense block is not less than 12;
s1.2: determining the structural parameters of the lightweight convolutional neural network: the growth rate k, the convolution filter length H_k, the convolution filter width W_k, and the number of convolution layers L;
determining the growth rate k, the convolution filter length H_k, the convolution filter width W_k, and the number of convolution layers L comprises the following steps:
calculating a computation-amount reduction parameter N based on the lightweight convolutional neural network model, and using the growth rate k, convolution filter length H_k, convolution filter width W_k, and number of convolution layers L at which N is minimal as the parameters of the lightweight convolutional network model.
2. The expression recognition method based on the lightweight convolutional neural network of claim 1, wherein S1 includes: training the lightweight convolutional neural network model according to a FERPLUS expression recognition database.
3. The expression recognition method based on the lightweight convolutional neural network as claimed in claim 1, wherein the face corrector is built by adopting HOG features and an SVM algorithm.
4. The expression recognition method based on the lightweight convolutional neural network of claim 1, wherein S3 includes: detecting at least four reference points of a face in the input image through a logistic regression tree, matching the at least four reference points through the face corrector, and segmenting the input image according to the at least four reference points to obtain a preprocessed image.
5. The expression recognition method based on the lightweight convolutional neural network of claim 2, wherein the step of training the lightweight convolutional neural network comprises:
acquiring a training sample, wherein the training sample comprises at least 1000 first expression images;
flipping, rotating, cropping, scaling and deforming each first expression image by using a data augmentation method to obtain at least 10 corresponding second expression images;
randomly intercepting at least one picture block in the second expression image to obtain a third expression image with a blank area;
training a lightweight convolutional neural network model using the third expression image.
6. The expression recognition method based on the light-weight convolutional neural network as claimed in claim 3, wherein the step of constructing the face corrector by adopting the HOG feature and the SVM algorithm comprises the following steps:
acquiring a training sample, wherein the training sample comprises at least 1000 standard face images;
the gradient value and gradient direction of the HOG feature of the standard face image are calculated as follows:
G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}, \quad \theta(x,y) = \arctan\bigl(G_y(x,y) / G_x(x,y)\bigr)
where x is the abscissa of the specified pixel point, y is the ordinate of the specified pixel point, G_x(x,y) and G_y(x,y) are the gradient values in the horizontal and vertical directions at point (x,y) of the image and take values in the range 0 to 255, G(x,y) is the gradient value of the pixel point, and θ(x,y) is the gradient direction of the pixel point, limited to between 0 and 180 degrees;
building a human face SVM model according to the support vector machine principle;
training a human face SVM model by using the gradient value and the gradient direction of the obtained HOG characteristic of the standard human face image to obtain a training result;
the training results are used to form a face detector.
7. The expression recognition method based on the lightweight convolutional neural network as claimed in claim 1, wherein the step of training the lightweight convolutional neural network model according to the FERPLUS expression recognition database comprises: calculating an updated gradient value:
8. The expression recognition method based on the lightweight convolutional neural network as claimed in claim 1, wherein an expression classifier for at least one class of expressions, constructed with the lightweight convolutional neural network model, obtains the predicted probability of each expression from the extracted features fed into the Softmax layer, and the calculation formula is:
p_i = p(y_i \mid x_i) = \frac{e^{w_i^T x_i}}{\sum_j e^{w_j^T x_i}}
where y_i is the label of the i-th class of expression, x_i is the input feature of the i-th class, w denotes in a generalized way all the weights of the dense network, p is the vector composed of the predicted probabilities of all expressions, T denotes the transpose, and i, j, k are integer variables.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010252867.1A (granted as CN111160327B) | 2020-04-02 | 2020-04-02 | Expression recognition method based on lightweight convolutional neural network |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN111160327A | 2020-05-15 |
| CN111160327B | 2020-06-30 |
Family
ID=70567689
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010252867.1A (active, granted as CN111160327B) | Expression recognition method based on lightweight convolutional neural network | 2020-04-02 | 2020-04-02 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN111160327B |
Patent Citations (4)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2020051776A1 | 2018-09-11 | 2020-03-19 | Intel Corporation | Method and system of deep supervision object detection for reducing resource usage |
| CN109829923A | 2018-12-24 | 2019-05-31 | 五邑大学 | Base station antenna downtilt angle measurement system and method based on a deep neural network |
| CN109753922A | 2018-12-29 | 2019-05-14 | 北京建筑大学 | Anthropomorphic robot expression recognition method based on dense convolutional neural networks |
| CN110853630A | 2019-10-30 | 2020-02-28 | 华南师范大学 | Lightweight speech recognition method for edge computing |
Non-Patent Citations (1)

| Title |
|---|
| Sun Ruofan et al., "VansNet: a lightweight convolutional neural network", Journal of Guizhou University (Natural Science Edition) |
Cited By (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113642477A | 2021-08-17 | 2021-11-12 | 苏州大学 | Character recognition method, device and equipment, and readable storage medium |
| CN116958703A | 2023-08-02 | 2023-10-27 | 德智鸿(上海)机器人有限责任公司 | Identification method and device based on acetabulum fracture |
Also Published As

| Publication number | Publication date |
|---|---|
| CN111160327B | 2020-06-30 |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |