CN112084944A - Method and system for identifying dynamically evolved expressions - Google Patents


Info

Publication number
CN112084944A
Authority
CN
China
Prior art keywords
expression
picture
classification
unit
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010940835.0A
Other languages
Chinese (zh)
Inventor
赵曦滨
朱俊杰
骆炳君
高跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202010940835.0A priority Critical patent/CN112084944A/en
Publication of CN112084944A publication Critical patent/CN112084944A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G06V40/176 Dynamic expression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/245 Classification techniques relating to the decision surface
    • G06F18/2451 Classification techniques relating to the decision surface: linear, e.g. hyperplane
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Abstract

The invention provides a method for identifying dynamically evolving expressions, comprising the following steps: step 1, constructing and optimizing a past-expression memory module as expression categories dynamically increase; step 2, dynamically upgrading the expression recognition system on the new-category expression data by minimizing a "compact classification-distillation" loss function with the aid of the past-expression memory module; and step 3, outputting the expression category label of an input expression picture to be recognized using the expression recognition system in its current state. The facial expression recognition system provided by the invention supports training and recognition at any stage of the dynamic increase of categories and achieves stable, reliable performance. Compared with conventional facial expression recognition systems, it significantly reduces memory and time consumption during training.

Description

Method and system for identifying dynamically evolved expressions
Technical Field
The application relates to the technical field of expression recognition, and in particular to a method and a system for identifying dynamically evolved expressions.
Background
Expression recognition technology is widely applied in fields such as human-computer interaction, industrial control, and medical education, and is an important tool for tasks such as analyzing user behavior, evaluating psychological states, and monitoring disease progression. As emotion analysis in psychology is continually refined, more and more complex expression categories are being created, attended to, and used by the public. Under these circumstances, recognizing dynamically evolving expressions becomes an urgent and fundamental challenge for expression recognition technology.
In the prior art, expression recognition methods assume a fixed set of categories. When data of a new expression category is introduced, the original expression recognition model must be discarded and a new model retrained on all data covering both the original and the new expression categories. The resulting memory and time costs are too high in practical applications, which limits the use of expression recognition technology on complex expression categories. The root of the problem is that, when training a recognition model whose expression categories dynamically increase, it is difficult to preserve the model's knowledge of the original expression categories, and the recognition accuracy on those categories may drop sharply. As a result, expression recognition technology cannot be generalized to application scenarios in which expression categories dynamically increase.
Disclosure of Invention
The purpose of this application is to enhance the memory capacity of the expression recognition model under dynamic category increase, improve the recognition accuracy on the original expression categories, reduce the memory and time costs of expression recognition under dynamic category increase, and thereby improve the feasibility of expression recognition technology in this setting.
The technical solution of the first aspect of the present application provides a method for identifying dynamically evolved expressions, the identification method comprising:
step 1, constructing and optimizing a past-expression memory module as expression categories dynamically increase;
step 2, dynamically upgrading the expression recognition system on the new-category expression data by minimizing a "compact classification-distillation" loss function with the aid of the past-expression memory module;
and step 3, outputting the expression category label of an input expression picture to be recognized using the expression recognition system in its current state.
Further, the step 1 specifically includes:
step 11, setting the total scale K of the past-expression memory module for storing representative expression pictures, according to the memory limit of the computing device and the completeness of the data acquisition permission;
step 12, in the system initialization stage, according to the s_0 classes of expression data present at initialization, setting for the past-expression memory module a memory capable of storing K pictures;
step 13, for each of the s_0 classes of expression data, extracting a feature vector for every picture of the class through the feature extraction module of a neural network;
step 14, computing the arithmetic-mean feature vector c of the feature vectors of the pictures of each expression class;
step 15, according to the arithmetic-mean vector, sequentially generating K/s_0 pictures to form an ordered sequence; the k-th picture, for k = 1 to K/s_0, is obtained by the formula:
p_k = argmin_x ‖ c − (1/k) ( f(x) + Σ_{j=1}^{k−1} f(p_j) ) ‖
where x is an expression picture, f(·) is the feature extraction module of the network model, p_j is a picture already generated in the ordered picture sequence, and c is the arithmetic-mean feature vector of the expression class corresponding to the k-th picture;
step 16, jointly constructing the initialized past-expression memory module from the ordered picture sequences generated from the picture data of each expression class;
step 17, when s_k − s_{k−1} new classes of expression data appear, optimizing and updating the past-expression memory module: traverse the existing s_k expression classes and judge whether the current class is one of the s_k − s_{k−1} new classes; if it is, execute steps 13 to 15 for it, finally generating an ordered sequence of K/s_k pictures; if it is not, go to step 18;
step 18, for the expression data of the non-new classes, each class already has an ordered sequence of K/s_{k−1} pictures in the memory module; deleting (K/s_{k−1}) − (K/s_k) pictures at the end of the sequence;
and step 19, storing the ordered picture sequences of the new-category expressions in the memory module.
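As an illustration, the ordered-sequence generation of steps 13 to 15 amounts to a greedy selection that keeps the running mean of the chosen pictures' features close to the class mean c. A minimal NumPy sketch, assuming the feature vectors have already been extracted by the network (the function and variable names are hypothetical):

```python
import numpy as np

def select_exemplars(features, class_mean, num_exemplars):
    """Greedy selection of an ordered exemplar sequence for one class.

    features      : (n, d) array, f(x) for every candidate picture of the class
    class_mean    : (d,) arithmetic-mean feature vector c of the class
    num_exemplars : K / s_0, the length of the ordered sequence to produce
    Returns the indices of the chosen pictures, in selection order.
    """
    chosen, chosen_sum = [], np.zeros_like(class_mean)
    for k in range(1, num_exemplars + 1):
        # p_k = argmin_x || c - (f(x) + sum_j f(p_j)) / k ||
        dists = np.linalg.norm(class_mean - (features + chosen_sum) / k, axis=1)
        dists[chosen] = np.inf          # each picture is used at most once
        idx = int(np.argmin(dists))
        chosen.append(idx)
        chosen_sum += features[idx]
    return chosen
```

Because the first iteration reduces to argmin_x ‖c − f(x)‖, the sequence starts with the picture closest to the class mean, and later picks keep the partial average of selected features near c.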
Further, the step 2 specifically includes:
step 21, in the system initialization stage, constructing an available expression picture database from the s_0 classes of expression data present at initialization;
step 22, acquiring at least one expression picture in the available expression picture database;
step 23, extracting the features of the expression picture through a feature extraction module of a neural network according to the expression picture, and recording the features as feature vectors;
step 24, generating classification features from the feature vectors using a linear classification algorithm, and calculating the "compact classification" loss function;
step 25, constructing the expression recognition model according to the classification characteristics, and calculating the performance index of the expression recognition model;
step 26, judging whether the performance index of the expression recognition model is converged, and if not, executing step 23;
step 27, when the s_k − s_{k−1} new classes of expression data appear, constructing an available expression picture database from the new classes' expression data and the past-expression memory module;
step 28, recording the expression recognition model in the last state as an original expression recognition model, and constructing a current expression recognition model according to the original expression recognition model;
step 29, acquiring at least one expression picture in the available expression picture database, and recording the number of the acquired expression pictures as m;
step 210, according to the expression pictures, extracting the corresponding features through the feature extraction module of the original expression recognition model, generating classification features using a linear classification algorithm, and recording them as the original classification features o_0^(i);
step 211, according to the expression pictures, extracting the corresponding features through the feature extraction module of the current expression recognition model, and recording them as the current feature vectors feat^(i);
step 212, generating classification features from the current feature vectors using a linear classification algorithm, and recording them as the current classification features o^(i);
Step 213, calculating the "compact classification-distillation" loss function, constructing the expression recognition model according to the current classification features, and calculating the performance index of the expression recognition model, wherein the calculation formula of the "compact classification-distillation" loss function is as follows:
L_c = −(1/m) Σ_{i=1}^{m} Σ_{y=1}^{s_k} [ 1(y = y^(i)) log σ(o_y^(i)) + 1(y ≠ y^(i)) log(1 − σ(o_y^(i))) ]
L_d = −(1/m) Σ_{i=1}^{m} Σ_{y=1}^{s_{k−1}} [ σ(o_{0,y}^(i)) log σ(o_y^(i)) + (1 − σ(o_{0,y}^(i))) log(1 − σ(o_y^(i))) ]
L_cpt = (1/m) Σ_{i=1}^{m} ‖ feat^(i) − c_{y^(i)} ‖²
loss = L_c + L_d + λ · L_cpt
in the formulas, o^(i) represents the current classification feature of the i-th expression picture, o_0^(i) represents the original classification feature of the i-th expression picture, σ(·) is the Logistic sigmoid function, feat^(i) represents the feature vector of the i-th expression picture, y^(i) represents the class label of the i-th expression picture, c_y is the arithmetic-mean feature vector of all expression pictures labeled y, and λ is a balance parameter;
step 214, determining whether the performance index of the expression recognition model is converged, if not, executing step 210.
Further, the step 3 specifically includes:
step 31, inputting the expression picture to be identified according to a data modality;
step 32, extracting a feature vector corresponding to the expression picture to be identified through a feature extraction module of the expression identification model in the current state;
step 33, calculating the classification feature score of the expression picture to be identified by utilizing a linear classification algorithm according to the feature vector;
and step 34, outputting the expression category corresponding to the maximum classification feature score.
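The recognition flow of steps 31 to 34 reduces to one feature extraction followed by linear scoring and an argmax. A minimal sketch, assuming the feature vector has already been extracted and using hypothetical names for the classifier parameters:

```python
import numpy as np

def predict_label(feature, weight, bias, labels):
    """Linear classification over an extracted feature vector.

    feature : (d,) feature vector from the current feature-extraction module
    weight  : (num_classes, d) linear-classifier weights
    bias    : (num_classes,) linear-classifier biases
    labels  : list of expression-category names, one per class
    """
    scores = weight @ feature + bias       # classification feature scores
    return labels[int(np.argmax(scores))]  # category with the maximum score
```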
The invention also provides a system for identifying dynamically evolved expressions, wherein the identification system comprises: a past-expression memory module, a network model construction module, and a recognition result output module;
the past expression memory module is configured to extract the characteristics corresponding to all expression objects of new expression according to the dynamic increase of the expression types and the network model of expression recognition, construct a characteristic database and finally optimize and update the past expression memory module;
the network model construction module is configured to update and upgrade the network model according to the network model of the previous stage, the expression objects of the new categories, and the past-expression memory module, by minimizing the "compact classification-distillation" loss function;
the recognition result output module is configured to extract corresponding feature vectors according to the input expression pictures needing to be recognized, calculate classification feature scores of the expression pictures needing to be recognized, and output the expression categories with the maximum corresponding scores.
Further, the past expression memory module specifically comprises: the device comprises an image acquisition unit, a feature extraction unit, an arithmetic mean unit, a sequence generation unit and a sequence updating unit;
the image acquisition unit is used for acquiring picture objects belonging to the same class among the expression classes appearing for the first time, wherein the number of picture objects in the same class is not less than two;
the feature extraction unit is used for extracting a group of features corresponding to the picture objects according to the feature extraction part of the expression recognition network and the picture objects of the same class and recording the group of features as a preliminary feature vector group;
the arithmetic mean unit is used for respectively carrying out arithmetic mean on the characteristic components in the preliminary characteristic vector group according to each dimension of the vector to obtain arithmetic mean vectors of the expression pictures;
the sequence generating unit is used for sequentially generating K/s_0 pictures to form an ordered sequence, according to each class of picture objects, the preliminary feature vector group, and the arithmetic-mean vector; the k-th picture of the sequence, for k = 1 to K/s_0, is obtained by the formula:
p_k = argmin_x ‖ c − (1/k) ( f(x) + Σ_{j=1}^{k−1} f(p_j) ) ‖
where x is an expression picture, f(·) is the feature extraction part of the network model, p_j is a picture already generated in the ordered picture sequence, and c is the arithmetic-mean feature vector of the expression class corresponding to the k-th picture;
and the sequence updating unit is used for deleting the picture objects at the tail of each expression-class sequence already in the memory module when new-category expression data appears; the remaining picture objects in each sequence are kept in the memory module.
Further, the network model building module further includes: the system comprises a database construction unit, an index calculation unit and a judgment unit;
the database construction unit is used for constructing an expression picture database which can be used for model upgrading;
the index calculation unit is used for calculating the performance index of the expression recognition model;
the judging unit is used for judging whether the performance index of the expression recognition model is converged or not, and if not, the expression picture is obtained again.
Further, the identification result output module specifically includes: the device comprises an input unit, an extraction unit, a calculation unit and an output unit;
the input unit is used for inputting the expression picture to be identified;
the extraction unit is used for extracting the feature vector corresponding to the expression picture to be identified;
the calculating unit is used for calculating the classification characteristic score of the expression picture to be identified by utilizing a maximum pooling algorithm and a linear classification algorithm;
the output unit is used for outputting the expression category with the maximum corresponding score as the predicted label of the expression picture, according to the classification feature scores.
The beneficial effects of this application are: by screening the most representative pictures of each expression class, generating ordered picture sequences, and dynamically optimizing the past-expression memory module, the expression recognition system can be constructed and upgraded while expression categories continually increase. This alleviates the catastrophic forgetting of past expressions by the expression recognition system, allows the expression recognition network to support any dynamic increase in expression categories, greatly reduces the memory and time consumption of training, and achieves stable, reliable performance, making the present method for recognizing dynamically evolving expressions practically applicable.
Drawings
The advantages of the above and/or additional aspects of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flow diagram of a method of identifying dynamically evolving expressions according to an embodiment of the present application;
FIG. 2 is a graph comparing identification performance according to an embodiment of the present application;
fig. 3 is a schematic block diagram of a recognition system of dynamically evolving expressions according to an embodiment of the present application.
Detailed Description
In order that the above objects, features and advantages of the present application can be more clearly understood, the present application will be described in further detail with reference to the accompanying drawings and detailed description. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced in other ways than those described herein, and therefore the scope of the present application is not limited by the specific embodiments disclosed below.
The first embodiment is as follows:
the first embodiment of the present application will be described below with reference to fig. 1 to 2.
As shown in fig. 1, the present embodiment provides a method for identifying a dynamically evolving expression, including:
step 1, constructing and optimizing a past-expression memory module as expression categories dynamically increase;
in the step 1, the method specifically comprises the following steps:
and step 11, setting the total scale K of the past-expression memory module for storing representative expression pictures, according to the memory limit of the computing device and the completeness of the data acquisition permission. Generally, the memory occupied by the K pictures should be no more than 10% of the memory of the current device, and K is moderately increased as the completeness of the data acquisition permission grows;
step 12, in the system initialization stage, according to the s_0 classes of expression data present at initialization, setting for the past-expression memory module a memory capable of storing K pictures;
step 13, for each of the s_0 classes of expression data, extracting a feature vector for every picture of the class through the feature extraction module of a neural network;
specifically, m expression pictures are read in by using a gray scale format, the pixel size of the picture is set to be 100 multiplied by 100, and each expression picture v is recorded as v e R100×100×1. Performing feature extraction on the read expression picture by using a trained feature extraction module in the current system to obtain a feature vector group { feat }(i)Wherein, feat ∈ R512,i=1,2,…,m。
Step 14, computing the arithmetic-mean feature vector c of the feature vectors of the pictures of each expression class. Specifically, the arithmetic-mean feature vector c_i corresponding to the i-th expression class is calculated by the formula
c_i = (1/n_i) Σ_{j: y^(j) = i} feat^(j)
where n_i is the number of pictures labeled with class i, y^(j) represents the class label of the j-th expression picture (given by the training data set), and feat^(j) represents the feature vector of the j-th expression picture.
Step 15, according to the arithmetic-mean vector, sequentially generating K/s_0 pictures to form an ordered sequence. The k-th picture, for k = 1 to K/s_0, is obtained by the formula:
p_k = argmin_x ‖ c − (1/k) ( f(x) + Σ_{j=1}^{k−1} f(p_j) ) ‖
where x is an expression picture, f(·) is the feature extraction module of the network model, p_j is a picture already generated in the ordered picture sequence, and c is the arithmetic-mean feature vector of the expression class corresponding to the k-th picture.
And step 16, according to the ordered picture sequence generated by the picture data of each type of expression, jointly constructing an initialized past expression memory module.
Step 17, when s_k − s_{k−1} new classes of expression data appear, optimizing and updating the past-expression memory module: traverse the existing s_k expression classes and judge whether the current class is one of the s_k − s_{k−1} new classes; if it is, execute steps 13 to 15 for the class, finally generating an ordered sequence of K/s_k pictures; if it is not, go to step 18;
step 18, for the expression data of the non-new classes, each class already has an ordered sequence of K/s_{k−1} pictures in the memory module; deleting (K/s_{k−1}) − (K/s_k) pictures at the end of the sequence;
and step 19, storing the ordered picture sequence of the new category expression in a memory module.
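Steps 17 to 19 can be illustrated with a small sketch of the memory bookkeeping: once s_k classes exist, every class keeps only the first K/s_k pictures of its ordered sequence, so old classes drop (K/s_{k−1}) − (K/s_k) pictures from the tail. The function and the dictionary layout below are hypothetical:

```python
def update_memory(memory, new_sequences, K):
    """Rebalance the exemplar memory when new expression classes arrive.

    memory        : dict class_name -> ordered exemplar list for old classes
                    (each currently K // s_{k-1} pictures long)
    new_sequences : dict class_name -> ordered exemplar list for new classes
    K             : total memory budget, in pictures
    """
    s_k = len(memory) + len(new_sequences)
    per_class = K // s_k                 # new per-class quota K // s_k
    # old classes: tail pictures beyond the quota are deleted, head is kept
    updated = {c: seq[:per_class] for c, seq in memory.items()}
    # new classes: store the head of each freshly generated ordered sequence
    updated.update({c: seq[:per_class] for c, seq in new_sequences.items()})
    return updated
```

Because the sequences are ordered by representativeness, truncating at the tail always discards the least representative exemplars first.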
Step 2, dynamically upgrading the expression recognition system on the new-category expression data by minimizing the "compact classification-distillation" loss function with the aid of the past-expression memory module;
in the step 2, the method specifically comprises the following steps:
step 21, in the system initialization stage, constructing an available expression picture database from the s_0 classes of expression data present at initialization;
specifically, the currently available expression picture database is constructed from the expression pictures of the s_0 classes present at initialization and their corresponding classification labels.
Step 22, acquiring at least one expression picture in the available expression picture database;
specifically, m different expression pictures are acquired from an available expression picture database and used as basic data for constructing an expression recognition model, wherein m is a positive integer greater than or equal to 1, and the specific value of m is determined by the precision required by the model.
Step 23, extracting the features of the expression picture through a feature extraction module of a neural network according to the expression picture, and recording the features as feature vectors;
in particular, ash is usedReading m expression pictures in a format, setting the pixel size of the pictures to be 100 multiplied by 100, and recording each expression picture v as v ∈ R100×100×1. Feature extraction is carried out on the read-in expression picture by utilizing a feature extraction module to obtain a feature vector feat(i)∈R512,i=1,2,…,m。
Step 24, generating classification features from the feature vectors using a linear classification algorithm, and calculating the "compact classification" loss function;
specifically, feature vectors feat(i)As input, the linear classification algorithm is used for operation, and the classification characteristic o of the expression picture predicted by the neural network is output(i)The method comprises the steps of calculating 'compact classification' loss (loss) by using a classification result of an object predicted by the neural network and a classification label of a standard object, and performing Gradient return by using a Stochastic Gradient Descent (SGD) algorithm to construct a neural network model. Using the superscript c to represent the picture index corresponding to the vector, and according to the expression picture number i acquired in step 22 being 1,2, …, m, the calculation formula corresponding to the "compact classification" loss function of one iteration is:
L_c = −(1/m) Σ_{i=1}^{m} Σ_{y=1}^{s_0} [ 1(y = y^(i)) log σ(o_y^(i)) + 1(y ≠ y^(i)) log(1 − σ(o_y^(i))) ]
L_cpt = (1/m) Σ_{i=1}^{m} ‖ feat^(i) − c_{y^(i)} ‖²
loss = L_c + λ · L_cpt
in the formulas, o^(i) represents the output of the linear classification algorithm, σ(·) is the Logistic sigmoid function, feat^(i) represents the feature vector of the i-th expression picture, y^(i) represents the class label of the i-th expression picture, c_y is the arithmetic-mean feature vector corresponding to the y-th expression class, and λ is a balance parameter, taken here as λ = 0.1.
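A minimal NumPy sketch of a "compact classification" loss of this kind: a per-class sigmoid cross-entropy term against one-hot labels, plus a compactness term pulling each feature vector toward its class mean c_y, weighted by λ. The exact weighting of the terms is an assumption, and all names are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compact_classification_loss(o, feat, y, class_means, lam=0.1):
    """Sigmoid cross-entropy plus a feature-compactness penalty.

    o           : (m, s) classification features (linear-classifier outputs)
    feat        : (m, d) feature vectors
    y           : (m,) integer class labels
    class_means : (s, d) arithmetic-mean feature vector c_y per class
    lam         : balance parameter lambda (0.1 in the text)
    """
    m, s = o.shape
    p = sigmoid(o)
    targets = np.eye(s)[y]               # one-hot class labels
    eps = 1e-12                          # numerical guard for log(0)
    loss_cls = -np.mean(np.sum(
        targets * np.log(p + eps) + (1 - targets) * np.log(1 - p + eps),
        axis=1))
    # compactness: squared distance of each feature to its class mean
    loss_cpt = np.mean(np.sum((feat - class_means[y]) ** 2, axis=1))
    return loss_cls + lam * loss_cpt
```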
Step 25, constructing the expression recognition model according to the classification characteristics, and calculating the performance index of the expression recognition model;
step 26, judging whether the performance index of the expression recognition model is converged, and if not, executing step 23;
specifically, it is determined whether the value of the "compact classification" loss function loss is stable and converged or whether the cumulative training number has reached the maximum limit, where the determination criterion is that the cumulative training number has reached 32 times. If the judgment criterion is met, step 27 is executed, otherwise, step 23 is executed.
Further, after step 26, the method further includes:
step 27, when the s_k − s_{k−1} new classes of expression data appear, constructing an available expression picture database from the new classes' expression data and the past-expression memory module;
specifically, a current available expression picture database is constructed by using the expression pictures of the new category, the expression pictures in the past expression memory module and the classification marks corresponding to the expression pictures.
Step 28, recording the expression recognition model in the last state as an original expression recognition model, and constructing a current expression recognition model according to the original expression recognition model;
step 29, acquiring at least one expression picture in the available expression picture database, and recording the number of the acquired expression pictures as m;
specifically, m different expression pictures are acquired from an available expression picture database and used as basic data for constructing an expression recognition model, wherein m is a positive integer greater than or equal to 1, and the specific value of m is determined by the precision required by the model.
Step 210, according to the expression pictures, extracting the corresponding features through the feature extraction module of the original expression recognition model, generating classification features using the linear classification algorithm, and recording them as the original classification features o_0^(i).
Specifically, m expression pictures are read in grayscale format, the pixel size is set to 100 × 100, and each expression picture v satisfies v ∈ R^{100×100×1}. Feature extraction is performed on the read-in expression pictures with the feature extraction module of the original expression recognition model to obtain feature vectors feat_0^(i) ∈ R^512. Then feat_0^(i) is taken as input, the linear classification algorithm is applied, and the classification features o_0^(i) of the expression pictures predicted by the neural network are output.
Step 211, according to the expression pictures, extracting the corresponding features through the feature extraction module of the current expression recognition model, and recording them as the current feature vectors feat^(i).
Specifically, m expression pictures are read in grayscale format, the pixel size is set to 100 × 100, and each expression picture v satisfies v ∈ R^{100×100×1}. Feature extraction is performed with the feature extraction module of the current expression recognition model to obtain feature vectors feat^(i) ∈ R^512, i = 1, 2, …, m.
Step 212, generating classification features from the current feature vectors using a linear classification algorithm, and recording them as the current classification features o^(i).
Specifically, the feature vector feat^(i) is taken as input, the linear classification algorithm of the current expression recognition model is applied, and the classification feature o^(i) of the expression picture predicted by the neural network is output.
Step 213, calculating the "compact classification-distillation" loss function, constructing the expression recognition model according to the current classification features, and calculating the performance index of the expression recognition model, wherein the calculation formula of the "compact classification-distillation" loss function is as follows:
L = L_cls + L_dist + λ · L_cpt

L_cls = −(1/m) · Σ_{i=1}^{m} Σ_{y=s_{k−1}+1}^{s_k} [ 1{y = y^(i)} · log σ(o_y^(i)) + 1{y ≠ y^(i)} · log(1 − σ(o_y^(i))) ]

L_dist = −(1/m) · Σ_{i=1}^{m} Σ_{y=1}^{s_{k−1}} [ σ(o_{0,y}^(i)) · log σ(o_y^(i)) + (1 − σ(o_{0,y}^(i))) · log(1 − σ(o_y^(i))) ]

L_cpt = (1/m) · Σ_{i=1}^{m} ‖ feat^(i) − c_{y^(i)} ‖²

in the formula, o^(i) represents the current classification feature of the i-th expression picture, o_0^(i) represents the original classification feature of the i-th expression picture, σ(·) is the logistic sigmoid function, feat^(i) represents the feature vector of the i-th expression picture, y^(i) represents the class label of the i-th expression picture, c_y is the arithmetic mean feature vector corresponding to the y-th class of expressions, and λ is a balance parameter, taken as 0.1;
Specifically, the "compact classification-distillation" loss is calculated by using the classification results of the objects predicted by the neural network and the classification labels of the standard objects, and gradient back-propagation is performed by using a stochastic gradient descent (SGD) algorithm to construct the neural network model.
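A hedged sketch of the loss computation follows. The exact decomposition into a sigmoid classification term, a distillation term, and a λ-weighted compactness term is an assumption inferred from the symbols defined above (σ, o^(i), o_0^(i), feat^(i), c_y, λ), not the patent's published formula; all array shapes are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compact_classification_distillation_loss(o, o0, feat, labels, class_means, lam=0.1):
    """Sketch of a 'compact classification-distillation' style loss.

    o           : (m, s)      current classification features (logits)
    o0          : (m, s_old)  classification features of the original model
    feat        : (m, d)      current feature vectors
    labels      : (m,)        integer class labels in [0, s)
    class_means : (s, d)      arithmetic-mean feature vector per class
    lam         : balance parameter (0.1 in the text)
    """
    m, s = o.shape
    s_old = o0.shape[1]
    p = sigmoid(o)
    eps = 1e-12  # numerical guard for the logarithms

    # one-vs-rest sigmoid cross-entropy on the new classes
    onehot = np.eye(s)[labels]
    new = slice(s_old, s)
    l_cls = -np.mean(np.sum(
        onehot[:, new] * np.log(p[:, new] + eps)
        + (1 - onehot[:, new]) * np.log(1 - p[:, new] + eps), axis=1))

    # distillation on the old classes: match the original model's sigmoid outputs
    q = sigmoid(o0)
    l_dist = -np.mean(np.sum(
        q * np.log(p[:, :s_old] + eps)
        + (1 - q) * np.log(1 - p[:, :s_old] + eps), axis=1))

    # compactness: pull each feature vector toward its class-mean vector
    l_cpt = np.mean(np.sum((feat - class_means[labels]) ** 2, axis=1))

    return l_cls + l_dist + lam * l_cpt
```

In training, this scalar would be minimized with SGD as the text describes.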
Step 214, determining whether the performance index of the expression recognition model is converged, if not, executing step 210.
And 3, outputting the expression type label of the picture according to the input expression picture to be identified and the expression identification system in the current state.
In the step 3, the method specifically comprises the following steps:
step 31, inputting the expression picture to be identified according to a data modality;
step 32, extracting a feature vector corresponding to the expression picture to be identified through a feature extraction module of the expression identification model in the current state;
specifically, m expression pictures are read in grayscale format, the pixel size of each picture is set to 100 × 100, and each expression picture v is recorded as v ∈ R^(100×100×1). Feature extraction is performed on the read-in expression pictures by the feature extraction module of the current expression recognition model to obtain feature vectors feat^(i) ∈ R^512, i = 1, 2, …, m.
Step 33, calculating the classification feature score of the expression picture to be identified by utilizing a linear classification algorithm according to the feature vector;
specifically, the feature vectors feat^(i) are taken as input, the linear classification algorithm is applied, and the classification features o^(i) of the expression pictures predicted by the neural network are output.
And step 34, outputting the expression category corresponding to the maximum classification characteristic according to the classification characteristic.
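Steps 31 to 34 reduce to a forward pass followed by an argmax. A minimal sketch, with W, b, and labels as hypothetical stand-ins for the trained classifier:

```python
import numpy as np

def predict_expression(feat, W, b, labels):
    """Score a feature vector with the linear classifier (step 33) and
    output the expression category with the largest classification
    feature (step 34)."""
    scores = W @ feat + b
    return labels[int(np.argmax(scores))]

# toy usage on a 3-d feature with an identity classifier
labels = ["neutral", "happy", "sad"]
W = np.eye(3)
b = np.zeros(3)
pred = predict_expression(np.array([0.1, 2.0, 0.3]), W, b, labels)
```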
In the present application, as shown in fig. 2, the recognition accuracy for dynamically evolving expressions is compared using iCaRL, an existing generic image recognition method for dynamically increasing categories, as the comparison method. The feature extraction modules of both recognition methods use the ResNet-18 deep neural network; curve 201 is the accuracy curve of the adaptive recognition method of the present application, and curve 202 is the accuracy curve of the comparison method. The comparison shows that the accuracy of the adaptive recognition method is improved, especially when the expression classes increase over more stages.
When the expression categories increase dynamically, expression pictures from the RAF-DB expression database are recognized with both the adaptive recognition method and the existing generic recognition method, and the accuracy comparison results obtained at each stage are shown in Table 2.
TABLE 2
[Table 2: stage-by-stage recognition accuracy of the adaptive method and the comparison method on RAF-DB; the table is reproduced as an image in the original publication.]
As can be seen from Table 2, the accuracy of the adaptive method in the present application is significantly improved, especially when the expression categories increase over more stages.
Example two:
as shown in fig. 3, the present embodiment provides a system 10 for identifying dynamically evolving expressions, comprising: past expression memory module 100, network model construction module 200 and recognition result output module 300;
the past expression memory module 100 is configured to extract the features corresponding to all objects of the new classes of expressions according to the dynamic increase of the expression classes and the neural network model for expression recognition, construct a feature database, and finally optimize and update the past expression memory module;
further, the past expression memory module 100 specifically includes: an image acquisition unit 101, a feature extraction unit 102, an arithmetic average unit 103, a sequence generation unit 104, and a sequence adjustment unit 105; the image obtaining unit 101 is configured to obtain picture objects belonging to the same class in the expression classes appearing for the first time, where the picture objects of the same class should be no less than two pictures;
specifically, m expression pictures are read in grayscale format, the pixel size of each picture is set to 100 × 100, and each expression picture v is recorded as v ∈ R^(100×100×1).
The feature extraction unit 102 is configured to extract a group of features corresponding to the picture object according to a feature extraction part of the expression recognition network and the picture object of the same class, and record the group of features as a preliminary feature vector group;
specifically, feature extraction is performed on the read-in expression pictures by using the trained feature extraction module in the current system to obtain a feature vector group {feat^(i)}, wherein feat^(i) ∈ R^512, i = 1, 2, …, m.
The arithmetic mean unit 103 is configured to perform arithmetic mean on the feature components in the preliminary feature vector group according to each dimension of the vector, so as to obtain an arithmetic mean vector of the expression picture;
specifically, the arithmetic mean feature vector c_i corresponding to the i-th class of expressions is calculated by the formula

c_i = (1/m_i) · Σ_{j: y^(j) = i} feat^(j)

wherein m_i is the number of pictures belonging to class i, and y^(j) represents the classification label (given by the training data set) of the j-th expression picture.
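The arithmetic mean computation of unit 103 can be sketched as follows; feature dimensions and class counts are illustrative:

```python
import numpy as np

def class_mean_vectors(feats, labels, num_classes):
    """For each expression class i, average the feature vectors of all
    pictures labelled i, giving the arithmetic-mean feature vector c_i."""
    d = feats.shape[1]
    means = np.zeros((num_classes, d))
    for i in range(num_classes):
        means[i] = feats[labels == i].mean(axis=0)
    return means
```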
The sequence generating unit 104 is configured to sequentially generate K/s_0 pictures according to the picture objects of each class, the preliminary feature vector group, and the arithmetic mean vector, and to form an ordered sequence; the k-th picture of the sequence, k = 1 to K/s_0, is calculated according to the formula:

p_k = argmin_x ‖ c − (1/k) · [ f(x) + Σ_{j=1}^{k−1} f(p_j) ] ‖

wherein x is an expression picture, f(·) is the feature extraction part of the network model, p_j are the pictures already generated in the ordered sequence, and c is the arithmetic mean feature vector of the expression class corresponding to the k-th picture.
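The selection rule above is iCaRL-style herding: greedily pick the picture whose inclusion keeps the running feature mean closest to the class mean c. A minimal sketch, assuming features f(x) are precomputed for all candidate pictures:

```python
import numpy as np

def herding_selection(feats, class_mean, k):
    """Greedily choose k exemplar indices so that the running mean of the
    chosen features best approximates class_mean at every step."""
    chosen = []
    running_sum = np.zeros_like(class_mean)
    remaining = list(range(len(feats)))
    for step in range(1, k + 1):
        # distance of each candidate's running mean to the class mean
        dists = [np.linalg.norm(class_mean - (running_sum + feats[idx]) / step)
                 for idx in remaining]
        best = remaining.pop(int(np.argmin(dists)))
        chosen.append(best)
        running_sum += feats[best]
    return chosen
```

The returned indices form the ordered sequence stored in the memory module, so truncating the tail later keeps the most representative exemplars.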
The sequence adjusting unit 105 is configured to, when new classes of expression data appear, delete the picture objects at the end of each expression class sequence already present in the memory module; the remaining picture objects in the sequence stay in the memory module.
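The truncation performed by the sequence adjusting unit amounts to keeping only the first K/s exemplars per class once the class count grows to s. A minimal sketch with a dict of per-class ordered sequences (the data layout is an assumption for illustration):

```python
def shrink_memory(memory, K, s_new):
    """When the number of expression classes grows to s_new, truncate each
    per-class ordered sequence to its first K // s_new exemplars so the
    total stays within the memory budget K."""
    per_class = K // s_new
    return {label: sequence[:per_class] for label, sequence in memory.items()}
```

Because the sequences are herding-ordered, dropping the tail removes the least representative exemplars first.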
The network model construction module 200 is configured to update and upgrade the network model by minimizing the "compact classification-distillation" loss function, according to the network model of the previous stage, the expression objects of the new classes, and the past expression memory module;
further, the network model construction module 200 specifically includes: a database construction unit 201, an index calculation unit 202, and a judging unit 203;
the database construction unit 201 is configured to construct an expression picture database that can be used for model upgrade;
specifically, a currently available expression picture database is constructed by using the expression pictures of the new category and the corresponding classification marks, and the number of all the current expression categories is recorded as s.
The index calculation unit 202 is configured to calculate a performance index of the expression recognition model;
specifically, the feature vectors feat^(i) are taken as input, the linear classification algorithm is applied, and the classification features o^(i) of the expression pictures predicted by the neural network are output. The "compact classification" loss is calculated by using the classification results of the objects predicted by the neural network and the classification labels of the standard objects, and gradient back-propagation is performed by using a stochastic gradient descent (SGD) algorithm to construct the neural network model. With the superscript (i) denoting the picture index, i = 1, 2, …, m for the expression pictures acquired in step 22, the calculation formula of the "compact classification" loss function for one iteration is:
L = L_cls + λ · L_cpt

L_cls = −(1/m) · Σ_{i=1}^{m} Σ_{y=1}^{s} [ 1{y = y^(i)} · log σ(o_y^(i)) + 1{y ≠ y^(i)} · log(1 − σ(o_y^(i))) ]

L_cpt = (1/m) · Σ_{i=1}^{m} ‖ feat^(i) − c_{y^(i)} ‖²

in the formula, o^(i) represents the classification feature of the i-th expression picture, σ(·) is the logistic sigmoid function, feat^(i) represents the feature vector of the i-th expression picture, y^(i) represents the class label of the i-th expression picture, c_y is the arithmetic mean feature vector corresponding to the y-th class of expressions, and λ is a balance parameter, where λ = 0.1.
The judging unit 203 is used for judging whether the performance index of the expression recognition model has converged; if not, the expression pictures are acquired again.
specifically, it is determined whether the value of the "compact classification" loss function is stable and converged or whether the cumulative number of training iterations has reached the maximum limit, the criterion here being 32 cumulative training iterations. If the criterion is reached, execution continues; if not, this unit is executed again.
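The convergence check of the judging unit can be sketched as a loop that stops when the loss stabilises or the iteration budget (32 in the text) is exhausted; `step_fn` is a hypothetical stand-in for one SGD training iteration returning the current loss:

```python
def train_until_converged(step_fn, max_iters=32, tol=1e-4):
    """Run training steps until the loss change falls below tol (stable and
    converged) or the cumulative iteration count reaches max_iters.
    Returns the number of iterations run and the final loss."""
    prev = float("inf")
    for it in range(1, max_iters + 1):
        loss = step_fn()
        if abs(prev - loss) < tol:
            return it, loss
        prev = loss
    return max_iters, loss
```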
The recognition result output module 300 is configured to extract corresponding feature vectors according to the input expression pictures to be recognized, calculate classification feature scores of the expression pictures to be recognized, and output expression categories with the maximum corresponding scores;
further, the recognition result output module 300 specifically includes: an input unit 301, an extraction unit 302, a calculation unit 303, and an output unit 304;
the input unit 301 is configured to input the expression picture to be identified;
the extracting unit 302 is configured to extract a feature vector corresponding to the expression picture to be identified;
specifically, m expression pictures are read in grayscale format, the pixel size of each picture is set to 100 × 100, and each expression picture v is recorded as v ∈ R^(100×100×1). Feature extraction is performed on the read-in expression pictures by the feature extraction module of the current expression recognition model to obtain feature vectors feat^(i) ∈ R^512, i = 1, 2, …, m.
The calculating unit 303 is configured to calculate a classification feature score of the expression picture to be identified by using a maximum pooling algorithm and a linear classification algorithm;
specifically, the feature vectors feat^(i) are taken as input, the linear classification algorithm is applied, and the classification features o^(i) of the expression pictures predicted by the neural network are output.
The output unit 304 is configured to output, according to the classification feature score, the expression category with the largest corresponding score as a prediction label of the expression picture.
The steps in the present application may be reordered, combined, or deleted according to actual requirements.
The units in the device may be combined, divided, or deleted according to actual requirements.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and not restrictive of the application of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.

Claims (8)

1. A method for identifying dynamically evolved expressions is characterized in that the self-adaptive identification method comprises the following steps:
step 1, constructing and optimizing past expression memory modules according to dynamic increase of expression categories;
step 2, dynamically upgrading the expression recognition system by utilizing a minimum 'compact classification-distillation' loss function of a past expression memory module according to the expression data of the new category;
and 3, outputting the expression type label of the picture according to the input expression picture to be identified and the expression identification system in the current state.
2. The method for identifying dynamically evolving expressions according to claim 1, wherein the step 1 specifically includes:
step 11, setting a total scale K of a past expression memory module for storing a representative expression picture according to the memory limit of the computing equipment and the completeness of the data acquisition permission;
step 12, in the system initialization stage, setting for the past expression memory module, according to the s_0 classes of expression data in initialization, a memory that can store K pictures;
step 13, for the s_0 classes of expression data, extracting a feature vector of each picture of each class of expressions through a feature extraction module of a neural network;
step 14, solving an arithmetic mean characteristic vector c from the characteristic vectors of the pictures with each type of expressions;
step 15, sequentially generating K/s_0 pictures according to the arithmetic mean vector and forming an ordered sequence; the k-th picture, k = 1 to K/s_0, is calculated according to the formula:

p_k = argmin_x ‖ c − (1/k) · [ f(x) + Σ_{j=1}^{k−1} f(p_j) ] ‖

wherein x is an expression picture, f(·) is the feature extraction module of the network model, p_j are the pictures already generated in the ordered picture sequence, and c is the arithmetic mean feature vector of the expression class corresponding to the k-th picture;
step 16, an initialized past expression memory module is jointly constructed according to the ordered picture sequence generated by the picture data of each type of expression;
step 17, when s_k − s_{k−1} new classes of expression data appear, optimizing and updating the past expression representative data module; traversing the existing s_k expression classes and judging whether the currently processed expression class is one of the s_k − s_{k−1} new classes; if it is a new class, executing steps 13 to 15 for it and finally generating an ordered sequence of K/s_k pictures; if it is not a new class, proceeding to step 18;
step 18, for the expression data of non-new classes, each class of expression already has an ordered sequence of K/s_{k−1} pictures in the memory module, and the (K/s_{k−1}) − (K/s_k) pictures at the end of the sequence are deleted;
and step 19, storing the ordered picture sequences of the new classes of expressions in the memory module.
3. The method for identifying dynamically evolving expressions according to claim 1, wherein the step 2 specifically includes:
step 21, in the system initialization stage, constructing an available expression picture database according to the s_0 classes of expression data in initialization;
step 22, acquiring at least one expression picture in the available expression picture database;
step 23, extracting the features of the expression picture through a feature extraction module of a neural network according to the expression picture, and recording the features as feature vectors;
step 24, generating classification features by utilizing a linear classification algorithm according to the feature vectors, and calculating the "compact classification" loss function;
step 25, constructing the expression recognition model according to the classification characteristics, and calculating the performance index of the expression recognition model;
step 26, judging whether the performance index of the expression recognition model is converged, and if not, executing step 23;
step 27, when s_k − s_{k−1} new classes of expression data appear, constructing an available expression picture database from the new classes of expression data and the past expression memory module;
step 28, recording the expression recognition model in the last state as an original expression recognition model, and constructing a current expression recognition model according to the original expression recognition model;
step 29, acquiring at least one expression picture in the available expression picture database, and recording the number of the acquired expression pictures as m;
step 210, extracting the corresponding features of the expression picture through a feature extraction module of the original expression recognition model according to the expression picture, generating classification features by utilizing a linear classification algorithm, and recording them as original classification features o_0^(i);
step 211, extracting the corresponding features of the expression picture through a feature extraction module of the current expression recognition model according to the expression picture, and recording the features as current feature vectors feat^(i);
step 212, generating classification features according to the current feature vectors by using a linear classification algorithm, and recording them as current classification features o^(i);
Step 213, calculating the "compact classification-distillation" loss function, constructing the expression recognition model according to the current classification features, and calculating the performance index of the expression recognition model, wherein the calculation formula of the "compact classification-distillation" loss function is as follows:
L = L_cls + L_dist + λ · L_cpt

L_cls = −(1/m) · Σ_{i=1}^{m} Σ_{y=s_{k−1}+1}^{s_k} [ 1{y = y^(i)} · log σ(o_y^(i)) + 1{y ≠ y^(i)} · log(1 − σ(o_y^(i))) ]

L_dist = −(1/m) · Σ_{i=1}^{m} Σ_{y=1}^{s_{k−1}} [ σ(o_{0,y}^(i)) · log σ(o_y^(i)) + (1 − σ(o_{0,y}^(i))) · log(1 − σ(o_y^(i))) ]

L_cpt = (1/m) · Σ_{i=1}^{m} ‖ feat^(i) − c_{y^(i)} ‖²

in the formula, o^(i) represents the current classification feature of the i-th expression picture, o_0^(i) represents the original classification feature of the i-th expression picture, σ(·) is the logistic sigmoid function, feat^(i) represents the feature vector of the i-th expression picture, y^(i) represents the class label of the i-th expression picture, c_y is the arithmetic mean of the feature vectors of all expression pictures labelled y, and λ is a balance parameter;
step 214, determining whether the performance index of the expression recognition model is converged, if not, executing step 210.
4. The method for identifying dynamically evolving expressions according to claim 1, wherein in step 3, the method specifically includes:
step 31, inputting the expression picture to be identified according to a data modality;
step 32, extracting a feature vector corresponding to the expression picture to be identified through a feature extraction module of the expression identification model in the current state;
step 33, calculating the classification feature score of the expression picture to be identified by utilizing a linear classification algorithm according to the feature vector;
and step 34, outputting the expression category corresponding to the maximum classification characteristic according to the classification characteristic.
5. A recognition system for dynamically evolving expressions, the adaptive recognition system comprising: the system comprises a past expression memory module, a network model construction module and an identification result output module;
the past expression memory module is configured to extract the characteristics corresponding to all expression objects of new expression according to the dynamic increase of the expression types and the network model of expression recognition, construct a characteristic database and finally optimize and update the past expression memory module;
the network model building module is configured to update and upgrade the network model according to the network model of the previous stage, the expression objects of the new category and past expression memory modules, and minimizing a 'compact classification-distillation' loss function;
the recognition result output module is configured to extract corresponding feature vectors according to the input expression pictures needing to be recognized, calculate classification feature scores of the expression pictures needing to be recognized, and output the expression categories with the maximum corresponding scores.
6. The system for identifying dynamically evolving expressions according to claim 5, wherein the past expression memory module specifically comprises: the device comprises an image acquisition unit, a feature extraction unit, an arithmetic mean unit, a sequence generation unit and a sequence updating unit;
the image acquisition unit is used for acquiring picture objects belonging to the same class in expression classes appearing for the first time, wherein the number of the picture objects in the same class is not less than two pictures;
the feature extraction unit is used for extracting a group of features corresponding to the picture objects according to the feature extraction part of the expression recognition network and the picture objects of the same class and recording the group of features as a preliminary feature vector group;
the arithmetic mean unit is used for respectively carrying out arithmetic mean on the characteristic components in the preliminary characteristic vector group according to each dimension of the vector to obtain arithmetic mean vectors of the expression pictures;
the sequence generating unit is used for sequentially generating K/s_0 pictures according to the picture objects of each class, the preliminary feature vector group, and the arithmetic mean vector, and forming an ordered sequence; the k-th picture of the sequence, k = 1 to K/s_0, is calculated according to the formula:

p_k = argmin_x ‖ c − (1/k) · [ f(x) + Σ_{j=1}^{k−1} f(p_j) ] ‖

wherein x is an expression picture, f(·) is the feature extraction part of the network model, p_j are the pictures already generated in the ordered picture sequence, and c is the arithmetic mean feature vector of the expression class corresponding to the k-th picture;
and the sequence adjusting unit is used for deleting the picture objects at the tail of the list for the expression type sequence of the existing memory module when the new type expression data appears, and remaining picture objects in the sequence are still kept in the memory module.
7. The system for identifying dynamically evolving expressions according to claim 6, wherein the network model building module further comprises: the system comprises a database construction unit, an index calculation unit and a judgment unit;
the database construction unit is used for constructing an expression picture database which can be used for model upgrading;
the index calculation unit is used for calculating the performance index of the expression recognition model;
the judging unit is used for judging whether the performance index of the expression recognition model is converged or not, and if not, the expression picture is obtained again.
8. The system for identifying dynamically evolving expressions according to claim 5, wherein the recognition result output module specifically includes: the device comprises an input unit, an extraction unit, a calculation unit and an output unit;
the input unit is used for inputting the expression picture to be identified;
the extraction unit is used for extracting the feature vector corresponding to the expression picture to be identified;
the calculating unit is used for calculating the classification characteristic score of the expression picture to be identified by utilizing a maximum pooling algorithm and a linear classification algorithm;
and the output unit is used for outputting the expression category with the maximum corresponding score as a prediction label of the expression picture according to the classification characteristic score.
CN202010940835.0A 2020-09-09 2020-09-09 Method and system for identifying dynamically evolved expressions Pending CN112084944A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010940835.0A CN112084944A (en) 2020-09-09 2020-09-09 Method and system for identifying dynamically evolved expressions


Publications (1)

Publication Number Publication Date
CN112084944A true CN112084944A (en) 2020-12-15

Family

ID=73731705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010940835.0A Pending CN112084944A (en) 2020-09-09 2020-09-09 Method and system for identifying dynamically evolved expressions

Country Status (1)

Country Link
CN (1) CN112084944A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112766145A (en) * 2021-01-15 2021-05-07 深圳信息职业技术学院 Method and device for identifying dynamic facial expressions of artificial neural network
CN112766145B (en) * 2021-01-15 2021-11-26 深圳信息职业技术学院 Method and device for identifying dynamic facial expressions of artificial neural network
CN114092649A (en) * 2021-11-25 2022-02-25 马上消费金融股份有限公司 Picture generation method and device based on neural network
CN114092649B (en) * 2021-11-25 2022-10-18 马上消费金融股份有限公司 Picture generation method and device based on neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination