CN115050075B - Cross-granularity interactive learning micro-expression image labeling method and device


Info

Publication number
CN115050075B
Authority
CN
China
Prior art keywords: expression, micro, image, feature extractor, marked
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210736803.8A
Other languages
Chinese (zh)
Other versions
CN115050075A (en)
Inventor
刘海
张昭理
周启云
石佛波
朱俊艳
宋云霄
刘婷婷
杨兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University
Central China Normal University
Original Assignee
Hubei University
Central China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University and Central China Normal University
Priority to CN202210736803.8A
Publication of CN115050075A
Application granted
Publication of CN115050075B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/172: Classification, e.g. identification
    • G06V40/174: Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a cross-granularity interactive learning micro-expression image labeling method and device, relating to the technical field of image processing, comprising the following steps: acquiring a micro-expression image sequence to be marked; acquiring a preset number of marked micro-expression images, and inputting the marked micro-expression images together with the micro-expression image sequence to be marked into a pre-trained feature extractor model; labeling the category of each micro-expression to be marked; obtaining the standard confidence score corresponding to each micro-expression category; acquiring the confidence score of each identified micro-expression, comparing it with the standard confidence score of the corresponding category, and outputting the micro-expressions whose scores are greater than or equal to the standard confidence score; and updating the micro-expression image sequence to be marked until all micro-expressions are marked, then outputting the marked micro-expression image set. The method achieves efficient and accurate automatic labeling of student micro-expressions in teaching scenes, avoids the subjectivity of manual labeling and the ambiguity of the collected micro-expressions, and saves a great deal of manpower and material resources.

Description

Cross-granularity interactive learning micro-expression image labeling method and device
Technical Field
The invention relates to the technical field of image processing, in particular to a micro-expression image labeling method and device for cross-granularity interactive learning.
Background
In order to study students' classroom performance states in teaching scenes, such as concentration, engagement, and activeness, current practice is to collect video data of students in the teaching scene for analysis, so that their classroom performance states can be inferred from their expressions. A student's micro-expression in class is a weak, extremely short-lived facial movement that is produced unconsciously when the student tries to hide his or her true inner emotion; compared with other data, micro-expressions can therefore reflect the student's true emotional state in class.
At present, micro-expression data of students in teaching scenes are usually labeled and analyzed manually. Manual labeling introduces subjectivity into the judgment of expression categories, and different people may recognize the same micro-expression data differently, which easily produces ambiguity in data processing; the subjectivity of labels and the ambiguity of expressions seriously affect the accuracy of analysis of students' learning states. In addition, with the rapid development of information technology, daily teaching activities generate massive educational big data; labeling it with the traditional manual method would consume a great deal of labor at an extremely high time cost.
Disclosure of Invention
The invention provides a micro-expression image labeling method for cross-granularity interactive learning, addressing the defect in the prior art that manual labeling cannot accurately identify the category of each student micro-expression in a large amount of emotion data. The method achieves efficient and accurate automatic labeling of student micro-expressions in teaching scenes, effectively avoids the subjectivity of manual labeling and the ambiguity of the collected micro-expressions, and saves a great deal of manpower and material resources.
The invention provides a micro-expression image labeling method for cross-granularity interactive learning, which comprises the following steps:
S1, acquiring a micro-expression image sequence to be marked;
S2, acquiring a preset number of marked micro-expression images, and inputting the marked micro-expression images and the micro-expression image sequence to be marked into a pre-trained feature extractor model;
labeling a category corresponding to each micro-expression in the micro-expression image sequence to be labeled through the feature extractor model; acquiring standard confidence scores corresponding to each micro-expression category based on the marked micro-expression images;
S3, obtaining the confidence score of each identified micro-expression in the micro-expression image sequence to be marked and comparing it with the standard confidence score of the micro-expression of the corresponding category; if the confidence score is greater than or equal to the standard confidence score, adding the micro-expression to the marked micro-expression image set;
S4, updating the micro-expression image sequence to be marked until all the micro-expressions are marked, then outputting the marked micro-expression image set.
According to the micro-expression image labeling method provided by the invention, before step S1, the method comprises the following steps:
collecting video data of students in a classroom;
Preprocessing the video data, and converting the video data into image data of each frame;
performing face detection on the image data to generate a plurality of face images;
and storing a plurality of face images as the sequence of micro-expression images to be annotated.
According to the micro-expression image labeling method provided by the invention, training the feature extractor model comprises the following steps:
applying a data enhancement strategy to the marked micro-expression images and the micro-expression image sequence to be marked, and then inputting the enhanced images into the feature extractor model;
And obtaining the category corresponding to the marked micro-expression image, obtaining the standard confidence score corresponding to each micro-expression category, and outputting the trained feature extractor model.
According to the micro-expression image labeling method provided by the invention, training the feature extractor model comprises the following steps:
Selecting any image from the micro-expression image sequence to be annotated and applying a strong enhancement strategy and a weak enhancement strategy to obtain a corresponding strong enhanced image u_s and weak enhanced image u_w; selecting any image from the marked micro-expression images and applying a weak enhancement strategy to obtain a weak enhanced image s_w;
inputting the strong enhanced image u_s and the weak enhanced images u_w and s_w into the feature extractor respectively, acquiring the corresponding image features, and judging the category of each micro-expression;
based on the micro-expression categories of the weak enhanced images u_w, outputting the category probability distribution corresponding to each micro-expression category through the feature extractor, comparing the maximum value of the category probability distribution with a preset threshold, and taking the micro-expression category whose probability is greater than or equal to the preset threshold as the pseudo category of the weak enhanced image; acquiring the cross entropy between the micro-expression categories of the strong enhanced images u_s and the pseudo categories;
based on the micro-expression categories of the weak enhanced images s_w, outputting the category probability distribution corresponding to each micro-expression category through the feature extractor, and acquiring the cross entropy between the output category probability distribution and the original category probability distribution of the marked micro-expression images.
According to the micro-expression image labeling method provided by the invention, comparing the confidence score of an identified micro-expression in the micro-expression image sequence to be labeled with the standard confidence score of the micro-expression of the corresponding category comprises the following steps:
acquiring the first category probability distribution p_u of the image data in the micro-expression image sequence to be annotated, and acquiring the second category probability distribution p_l of the marked micro-expression data set;
for the same micro-expression category c, obtaining the average probability distribution of the first and second category probability distributions, \bar{p}_c = (p_u^c + p_l^c) / 2;
comparing \bar{p}_c with a preset adaptive score T_c; if \bar{p}_c ≥ T_c, judging that the confidence score of the corresponding micro-expression is greater than or equal to the standard confidence score, updating the micro-expression image sequence to be marked, and adding the micro-expressions whose scores are greater than or equal to the standard confidence score to the marked micro-expression image set;
if \bar{p}_c < T_c, judging that the confidence score of the corresponding micro-expression is smaller than the standard confidence score, adding the corresponding micro-expression to a retraining image set, and using the retraining image set to retrain the feature extractor model.
According to the micro-expression image labeling method provided by the invention, the feature extractor model comprises a fine-granularity feature extractor model and a coarse-granularity feature extractor model, and the training of the feature extractor model comprises the following steps:
Acquiring a neutral expression image and other expression images of the same target to be identified in the micro-expression image data; acquiring identity features of the neutral expression image, identity features of other expression images and expression category features through an encoder;
Carrying out expression reconstruction by combining the identity characteristics of the neutral expression image and the expression category characteristics of the other expression images through a decoder;
the encoder and the decoder perform adversarial learning, and the encoder classifies the micro-expression images of different targets to be identified according to identity features, so that the difference between the micro-expression category distributions of different targets to be identified is minimized;
and analyzing the reconstructed expression image through the expression classifier to obtain corresponding expression category characteristics and outputting category probability distribution.
According to the micro-expression image labeling method provided by the invention, training the feature extractor model comprises the following steps:
forming a triplet (x_anchor, x_nega, x_posi) from a micro-expression anchor x_anchor, a macro-expression positive example x_posi of the same expression category as the micro-expression anchor, and a micro-expression negative example x_nega of a different expression category from the micro-expression anchor, and inputting the triplet into the feature extractor model;
wherein the micro-expression anchor and the micro-expression negative example are input into the fine-granularity feature extractor to acquire the expression embeddings f_anchor and f_nega respectively, and the macro-expression positive example is input into the coarse-granularity feature extractor model to acquire the expression embedding f_posi;
the triplet loss between the expression embeddings is obtained:
L_tri = max{ d(f_anchor, f_posi) - d(f_anchor, f_nega) + m, 0 };
performing adversarial learning between the fine-granularity and coarse-granularity feature extractor models, where the coarse-granularity feature learning module provides embedded representations of macro-expressions whose image classifications are marked as the correct category, and the fine-granularity feature extractor provides embedded representations of micro-expressions whose image classifications are marked as the incorrect category;
distinguishing the two embedded representations through a discriminator, and adjusting the common expression features between macro-expressions and micro-expressions through the fine-granularity feature extractor model, so that the gradient between macro-expressions and micro-expressions is greater than or equal to a minimum threshold;
wherein m is a hyper-parameter.
On the other hand, the invention also provides a micro-expression image labeling device for cross-granularity interactive learning, comprising:
the image processing module is used for acquiring a micro-expression image sequence to be marked, acquiring a preset number of marked micro-expression images, and inputting the marked micro-expression images and the micro-expression image sequence to be marked into the feature extractor module;
The feature extractor module marks the category corresponding to each micro expression in the micro expression image sequence to be marked through a feature extractor model; acquiring standard confidence scores corresponding to each micro-expression category based on the marked micro-expression images;
the image labeling module is used for acquiring the confidence score of each identified micro-expression in the micro-expression image sequence to be labeled and comparing it with the standard confidence score of the micro-expression of the corresponding category; if the confidence score is greater than or equal to the standard confidence score, the micro-expression is added to the labeled micro-expression image set;
The image labeling module is also used for updating the micro-expression image sequence to be labeled until all the micro-expressions are labeled and then outputting a labeled micro-expression image set.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the micro-expression image labeling method as described in any of the above.
According to the micro-expression image labeling method and device for cross-granularity interactive learning, the labeled micro-expression images and the micro-expression image sequence to be labeled are input into the pre-trained feature extractor model, and the standard confidence score corresponding to each micro-expression category is obtained from the labeled micro-expression images, so that the category of each micro-expression in the sequence to be labeled can be identified. The confidence score of each identified micro-expression is then compared with the standard confidence score of the corresponding category, which improves the accuracy of micro-expression recognition. Micro-expression images that do not meet the confidence score requirement are fed back into the feature extractor model for further training, which effectively improves the accuracy of the model. The method can identify the micro-expressions of students in a teaching scene, yielding the students' feedback on the teaching content and their learning emotions, so that teachers can effectively grasp the learning states of different students, provide targeted guidance, and improve the teaching effect in the teaching scene.
Drawings
In order to illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are obviously some embodiments of the invention; for a person skilled in the art, other drawings can be obtained from them without inventive effort.
FIG. 1 is a schematic flow chart of a method for labeling a micro-expression image according to the present invention;
FIG. 2 is a schematic image acquisition diagram of the micro-expression image labeling method provided by the invention;
FIG. 3 is a second flowchart of the micro-expression image labeling method according to the present invention;
fig. 4 is a third flowchart of the micro-expression image labeling method provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the terms "first" and "second" in the present invention merely distinguish similar objects and do not imply any specific order among them; where permitted, "first" and "second" may be interchanged, so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein.
In one embodiment, as shown in fig. 1, the invention provides a micro-expression image labeling method for cross-granularity interactive learning, which comprises the following steps:
S1, acquiring a micro-expression image sequence to be marked;
S2, acquiring a preset number of marked micro-expression images, and inputting the marked micro-expression images and the micro-expression image sequence to be marked into a pre-trained feature extractor model;
labeling a category corresponding to each micro-expression in the micro-expression image sequence to be labeled through the feature extractor model; acquiring standard confidence scores corresponding to each micro-expression category based on the marked micro-expression images;
It should be noted that the marked micro-expression images and their marked categories can be selected from currently published micro-expression databases;
S3, obtaining the confidence score of each identified micro-expression in the micro-expression image sequence to be marked and comparing it with the standard confidence score of the micro-expression of the corresponding category; if the confidence score is greater than or equal to the standard confidence score, adding the micro-expression to the marked micro-expression image set;
further, if the confidence score is smaller than the standard confidence score, the corresponding micro-expression image is input into the feature extractor model again for training;
it should be noted that learning the confidence score corresponding to each micro-expression category provides a reference for judging whether the recognition result of the micro-expression data to be marked is qualified; the confidence score represents the probability that a micro-expression category is identified correctly, and a greater score indicates a higher probability that the image contains that category;
S4, updating the micro-expression image sequence to be marked until all the micro-expressions are marked, then outputting the marked micro-expression image set.
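To make the S1 to S4 loop concrete, the following Python sketch iterates the labeling process; the `extractor` interface (`predict_proba`, `retrain`) and the stopping rule are illustrative assumptions, not part of the patent:

```python
def label_micro_expressions(unlabeled, labeled, extractor, standard_scores):
    """Iteratively move confidently recognized micro-expressions from the
    to-be-marked sequence into the marked image set (steps S1-S4)."""
    annotated = list(labeled)              # marked micro-expression image set
    pending = list(unlabeled)              # sequence to be marked (S1)
    while pending:
        retrain_set = []
        for image in pending:
            probs = extractor.predict_proba(image)        # category scores (S2)
            category = max(probs, key=probs.get)
            if probs[category] >= standard_scores[category]:
                annotated.append((image, category))       # S3: meets the standard
            else:
                retrain_set.append(image)                 # S3: below the standard
        if len(retrain_set) == len(pending):
            break                          # no progress; avoid an infinite loop
        if retrain_set:
            extractor.retrain(retrain_set, annotated)     # hypothetical hook
        pending = retrain_set              # S4: update the sequence
    return annotated
```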
Optionally, before step S1, the method includes the steps of:
collecting video data of students in a classroom;
Preprocessing the video data, and converting the video data into image data of each frame;
performing face detection on the image data to generate a plurality of face images;
storing a plurality of face images as the micro-expression image sequence to be annotated;
Specifically, video data of students in a classroom are collected through front RGB intelligent cameras and an intelligent tracking camera;
as shown in fig. 2, the cameras include an intelligent tracking camera facing the teaching scene, and two front intelligent RGB cameras CL and CR arranged so that every student in the teaching scene is covered, guaranteeing omnidirectional, multi-angle acquisition of the students' facial micro-expression data; the lenses can zoom in and out for students at different distances.
Specifically, after the video data of students in class are collected with the above camera devices, the data are preprocessed as follows: the video data are converted into image data frame by frame, face detection is performed on the students, the face image of each student in each frame is cropped to the same size, and finally the preprocessed data set is stored as the micro-expression image sequence to be annotated.
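A minimal sketch of this preprocessing pipeline using OpenCV; the Haar cascade face detector and the 224x224 crop size are illustrative assumptions, since the patent names neither a specific detector nor a crop size:

```python
import cv2

def video_to_face_sequence(video_path, size=(224, 224)):
    """Split a classroom video into frames, detect student faces, and crop
    them to a uniform size as the micro-expression sequence to be annotated."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(video_path)
    faces = []
    while True:
        ok, frame = capture.read()
        if not ok:                        # end of video
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
            faces.append(cv2.resize(frame[y:y + h, x:x + w], size))
    capture.release()
    return faces
```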
In one embodiment, as shown in FIG. 3, training the feature extractor model includes:
applying a data enhancement strategy to the marked micro-expression images and the micro-expression image sequence to be marked, and then inputting the enhanced images into the feature extractor model;
And obtaining the category corresponding to the marked micro-expression image, obtaining the standard confidence score corresponding to each micro-expression category, and outputting the trained feature extractor model.
The feature extractor model is a cross-granularity interactive learning feature extractor; the trained feature extractor model is saved after performing model training on the marked micro-expression data set and learning the confidence score corresponding to each micro-expression category. Macro-expressions and micro-expressions share some features, but macro-expression features are coarse-grained while micro-expression features are finer-grained, so guiding the training of the fine-grained micro-expression feature extractor with coarse-grained macro-expression features helps improve the accuracy of identifying the micro-expressions to be marked;
it should be noted that micro-expressions are short in duration (less than 0.5 s), small in amplitude, imperceptible to the naked eye, and are unconscious spontaneous behaviors; they appear quickly after an emotion-arousing event and are difficult to suppress, and can therefore reveal a person's true emotional state. Macro-expressions, by contrast, last longer (1 s to 5 s), have large amplitude, and are easy to observe;
here, macro-expressions are coarse-grained and micro-expressions are fine-grained, where granularity refers to the fineness of the expression features. Micro-expressions usually last a short time and involve small facial muscle movements that are difficult to capture, such as an instantaneous lifting of the mouth corners or a slight contraction or stretching of the eyebrows and eyes, and are therefore fine-grained; macro-expressions last longer and involve larger facial muscle movements, so their features are easy to capture and extract, and are therefore coarse-grained;
specifically, training the feature extractor model includes two phases:
the first stage comprises:
dividing the acquired video data into a micro-expression data set and a macro-expression data set; acquiring a neutral expression image x_N (Neutral) and an other-expression image x_O (Other Expression) from the same video, together with the corresponding expression category y_i;
pre-training a fine-granularity feature extractor model on the micro-expression data set {(x_N^i, x_O^i, y_i)}_{i=1}^{M_1}, where M_1 represents the number of videos; at the same time, pre-training a coarse-granularity feature extractor model on the macro-expression data set;
The inputs of the fine-granularity and coarse-granularity feature extractor models are paired images from the same video, including a neutral expression image x_N and an other-expression image x_O; the training of both feature extractor models includes four steps: expression feature extraction, expression reconstruction, identity classification, and expression classification:
Expression feature extraction: the fine-granularity feature learning module encodes the neutral expression image x_N and the other-expression image x_O from the micro-expression video through the encoder E, acquiring their embedded representations; from these, it obtains the identity feature f_s^N of the neutral expression image, and the identity feature f_s^O and the expression category feature f_e^O of the other-expression image. The expression category feature of a neutral expression image can be regarded as a fixed value. A neutral expression image shows no expression and can generally be regarded as containing only identity features; other-expression images show expressions generated by facial muscle movements, such as the basic expressions happy, sad, angry, surprised, and disgusted;
it should be noted that the neutral expression image x_N and the other-expression image x_O come from the same target, so their identity-related features f_s^N and f_s^O should be similar; the difference between the two can be expressed as a loss function, for example the squared L2 distance:
L_sim = || f_s^N - f_s^O ||_2^2;
Further, the other-expression-related feature f_e^O carries enough category feature information of the original expression, so expression reconstruction includes: reconstructing the expression by combining the identity feature of the neutral expression image with the expression category feature of the other-expression image through the decoder D_r, and outputting a reconstruction loss function, for example
L_rec = || x_O - D_r(f_s^N, f_e^O) ||;
Further, the identity classification step includes: the encoder E and the identity classifier D_s perform adversarial learning; the encoder classifies the micro-expression images of different targets to be identified according to identity features, obtaining the micro-expression category distributions of several different targets and minimizing the difference between them, i.e., the difference between the expression category probability distributions recognized from the images of different targets is minimal; as a result, it is difficult for D_s to classify from the expression-related feature f_e^O. The goal of the adversarial training is the cross-entropy loss between the distribution predicted by D_s and the true label s of the recognition target:
L_d = H(D_s(f_e^O), s);
it should be noted that the cross-entropy loss measures the similarity between the recognized category distribution and the true category distribution, which helps improve the learning rate of the recognition model;
Further, the expression classification step includes: generating expression-related features through the expression classifier D_e, that is, generating multiple expression features based on micro-expressions and/or macro-expressions, and introducing the cross-entropy loss L_c, defined as
L_c = H(\hat{y}, y),
where y is the expression category of x_O and \hat{y} is the expression category distribution obtained by recognition;
Finally, the total loss function is obtained; for the fine-granularity feature extractor model, the total loss function L_micro is defined as
L_micro = L_c + λ·L_sim + β·L_rec + γ·L_d,
where λ, β, γ are hyper-parameters controlling the weights of the loss terms;
it should be noted that the cross entropy loss is the difference between the true probability distribution and the predicted probability distribution, and the smaller the cross entropy, the better the model's predictions; the smaller the value of the total loss function, the better the model training.
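A PyTorch-style sketch of how the four first-stage losses could be combined; the module interfaces (`enc.identity`, `enc.expression`), the L1/L2 loss choices, and the default weights are assumptions that follow the reconstructions above:

```python
import torch.nn.functional as F

def first_stage_loss(enc, dec, id_clf, exp_clf, x_n, x_o, y, s,
                     lam=1.0, beta=1.0, gamma=0.1):
    """L_micro = L_c + lam*L_sim + beta*L_rec + gamma*L_d (weights assumed)."""
    f_s_n = enc.identity(x_n)                      # identity feature of x_N
    f_s_o, f_e_o = enc.identity(x_o), enc.expression(x_o)
    l_sim = F.mse_loss(f_s_n, f_s_o)               # same target: identities match
    l_rec = F.l1_loss(dec(f_s_n, f_e_o), x_o)      # expression reconstruction
    l_d = F.cross_entropy(id_clf(f_e_o), s)        # adversarial identity branch
    l_c = F.cross_entropy(exp_clf(f_e_o), y)       # expression classification
    return l_c + lam * l_sim + beta * l_rec + gamma * l_d
```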
Further, the second stage comprises two steps: feature space guidance and category space guidance:
the feature space guidance includes: fixing the coarse-granularity feature extractor model and training the fine-granularity feature extractor model, so as to improve the recognition accuracy of the fine-granularity feature extractor model;
forming a triplet (x_anchor, x_nega, x_posi) from a micro-expression anchor x_anchor, a macro-expression positive example x_posi of the same expression category as the micro-expression anchor, and a micro-expression negative example x_nega of a different expression category from the micro-expression anchor, and inputting the triplet into the fine-granularity feature extractor model;
for each input triplet, the micro-expression anchor and the micro-expression negative example are input into the fine-granularity feature extractor to acquire the expression embeddings f_anchor and f_nega respectively, and the macro-expression positive example is input into the coarse-granularity feature extractor model to acquire the expression embedding f_posi;
the triplet loss between the expression embeddings is obtained:
L_tri = max{ d(f_anchor, f_posi) - d(f_anchor, f_nega) + m, 0 };
Specifically, the triplet loss helps distinguish the details of expressions and improves the accuracy of expression recognition,
where m is a hyper-parameter (margin);
it should be noted that the micro-expression anchor is a micro-expression reconstructed image from the fine-granularity feature extractor, the macro-expression positive example is a macro-expression reconstructed image from the coarse-granularity feature extractor, and the micro-expression negative example is an expression image with the same identity feature as the micro-expression anchor;
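A PyTorch-style sketch of the triplet loss in the form reconstructed above; the Euclidean distance is an assumed choice for d(·,·):

```python
import torch
import torch.nn.functional as F

def triplet_loss(f_anchor, f_posi, f_nega, m=0.5):
    """L_tri = max{d(f_anchor, f_posi) - d(f_anchor, f_nega) + m, 0}: pull the
    micro-expression anchor toward the same-category macro-expression embedding
    and push it away from the different-category micro-expression embedding."""
    d_pos = F.pairwise_distance(f_anchor, f_posi)   # anchor vs. macro positive
    d_neg = F.pairwise_distance(f_anchor, f_nega)   # anchor vs. micro negative
    return torch.clamp(d_pos - d_neg + m, min=0).mean()
```

torch.nn.TripletMarginLoss computes the same quantity for a given margin.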
Further, adversarial training is performed between the coarse-granularity and fine-granularity feature extractor models; the coarse-granularity feature learning module provides embedded representations of macro-expressions, whose image classifications are marked as the correct category, while the fine-granularity feature extractor provides embedded representations of micro-expressions, whose image classifications are marked as the incorrect category;
the two embedded representations are distinguished through a discriminator D, and the fine-granularity feature extractor model adjusts the common expression features between macro-expressions and micro-expressions so that the discrimination gradient between them is greater than or equal to a minimum threshold; this is the goal of the adversarial learning.
The discriminator loss is defined as
L_D = -E[log D(f_posi)] - E[log(1 - D(f_anchor))];
to avoid vanishing gradients, the fine-granularity feature learning module minimizes -log D(f_anchor) instead of log(1 - D(f_anchor)), giving its adversarial loss
L_adv = -E[log D(f_anchor)];
It should be noted that the common expression features of macro-expressions and micro-expressions refer to the following: for example, a macro-expression smile may be a broad smile while a micro-expression smile may be a faint one; the shared features between the two include the stretching of the cheek muscles, the lifting of the mouth corners, and the change of the eye corners, which are feature points common to macro- and micro-expressions and differ only in degree.
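A sketch of the discriminator and extractor losses under the non-saturating formulation assumed in the reconstruction above; `disc` is assumed to output a probability in (0, 1):

```python
import torch
import torch.nn.functional as F

def discriminator_loss(disc, f_posi, f_anchor):
    """Train D to label macro-expression embeddings as real (1)
    and micro-expression embeddings as fake (0)."""
    pred_real, pred_fake = disc(f_posi), disc(f_anchor)
    loss_real = F.binary_cross_entropy(pred_real, torch.ones_like(pred_real))
    loss_fake = F.binary_cross_entropy(pred_fake, torch.zeros_like(pred_fake))
    return loss_real + loss_fake

def adversarial_loss(disc, f_anchor):
    """Non-saturating loss for the fine-granularity extractor: minimize
    -log D(f_anchor) so the micro-expression embedding looks macro-like,
    avoiding the vanishing gradient of minimizing log(1 - D(f_anchor))."""
    pred = disc(f_anchor)
    return F.binary_cross_entropy(pred, torch.ones_like(pred))
```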
Further, the category space guidance includes:
The expression classification loss controls the recognition accuracy of the expressions; the classification loss of the fine-granularity encoder branch is introduced as
L_cls = H(\hat{y}_anchor, y),
where y represents the expression category of x_anchor and \hat{y}_anchor is the expression category distribution obtained by recognition;
During training, the fine-granularity and coarse-granularity feature learning modules are assumed to produce similar outputs, so the two networks are trained jointly, and the difference between them is penalized by adding a regularization term to the loss function;
it should be noted that fine-grained features are more important than coarse-grained ones and contain richer feature information, so the regularization constraint makes the network learn finer features.
The regularization loss function is defined as:
L_LIR = max{ L_ds - L_cls′, 0 },
where L_cls′ is the classification loss of the coarse-granularity encoder branch, i.e., the cross-entropy loss between the classification result of the positive-example feature f_posi and the true expression category y of the positive image;
The total loss function is defined as:
L_MM = L_cls + λ_2·L_tri + λ_3·L_adv + λ_4·L_LIR,
where λ_2, λ_3, λ_4 are hyper-parameters controlling the loss weights;
During training, the total loss function should be made as small as possible, which improves the effect of training the fine-granularity recognition model; model parameters are updated in the process, optimizing the model and improving its performance;
The feature extractor model obtained after the above training produces category recognition results for all the micro-expression images, and the recognition results are compared with the true categories of the micro-expression images:
selecting the correctly identified samples S_C = {(s_i, \hat{y}_i)}_{i=1}^{N}, where s_i represents the confidence score of the i-th labeled datum, \hat{y}_i represents the i-th recognized label, and N represents the number of samples in S_C;
constructing an adaptive confidence interval T = {(T_1, ..., T_C) | T_c ∈ R, c = 1, ..., C} with the specific formula
T_c = (1 / N_c) · Σ_{(s_i, \hat{y}_i) ∈ S_C, \hat{y}_i = c} s_i,
where N_c represents the number of samples in S_C marked with the c-th expression;
after the complete training process, the feature extractor model with the best effect and highest prediction accuracy, together with the confidence score corresponding to each expression category, is saved;
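A small Python sketch of the adaptive confidence scores under the mean-confidence form of T_c assumed above; the per-sample tuple layout is an illustrative assumption:

```python
from collections import defaultdict

def adaptive_thresholds(samples):
    """T_c = mean confidence of the correctly identified samples of category c.
    `samples` is a list of (confidence, predicted_category, true_category)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for score, predicted, true in samples:
        if predicted == true:             # keep only correct identifications (S_C)
            sums[predicted] += score
            counts[predicted] += 1        # N_c
    return {c: sums[c] / counts[c] for c in counts}
```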
In one embodiment, training the feature extractor model as shown in FIG. 4 further comprises:
Applying a data enhancement strategy to the micro-expression images, including: selecting any image from the micro-expression image sequence to be annotated and applying a strong enhancement strategy and a weak enhancement strategy to obtain a corresponding strong enhanced image u_s and weak enhanced image u_w; selecting any image from the marked micro-expression images and applying a weak enhancement strategy to obtain a weak enhanced image s_w;
It should be noted that the strong enhancement strategy performs enhancement using RandAugment and then applies Cutout; the weak enhancement strategy applies random horizontal flipping (Random Horizontal Flip) with 50% probability and random horizontal and vertical translation (Random Translation) with 13.5% probability to obtain the image u_w; more image samples are obtained through the strong and weak enhancement strategies;
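A torchvision sketch of the two strategies; torchvision ships no Cutout transform, so RandomErasing stands in for it here, and the 13.5% figure from the text is interpreted as the translation fraction, both of which are assumptions:

```python
from torchvision import transforms

# Weak strategy: 50% horizontal flip plus small random translation.
weak_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomAffine(degrees=0, translate=(0.135, 0.135)),
    transforms.ToTensor(),
])

# Strong strategy: RandAugment followed by a Cutout-style occlusion
# (RandomErasing is used here as a stand-in for Cutout).
strong_augment = transforms.Compose([
    transforms.RandAugment(),
    transforms.ToTensor(),
    transforms.RandomErasing(p=1.0),
])
```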
Further, the strong enhanced image u_s and the weak enhanced images u_w and s_w are input into the feature extractor respectively, the corresponding image features are acquired, and the category of each micro-expression is judged;
based on the micro-expression categories of the weak enhanced images u_w, the feature extractor outputs the category probability distribution corresponding to each micro-expression category; the maximum value of the category probability distribution is compared with a preset threshold τ, and the micro-expression category whose probability is greater than or equal to τ is taken as the pseudo category of the weak enhanced image; the cross entropy between the micro-expression categories of the strong enhanced images u_s and the pseudo categories is acquired:
L_u = (1 / M) · Σ_{i=1}^{M} 1(max(p_w^i) ≥ τ) · H(\hat{q}_i, p_s^i),
where M represents the number of unlabeled images, p_w represents the category probability distribution recognized after the weak enhancement strategy, \hat{q} represents the pseudo category of the weakly enhanced image, p_s represents the category probability distribution of the strong enhanced image, and H(·) is the cross entropy loss function;
the pseudo categories are the image categories whose probability in the category probability distribution recognized after the weak enhancement strategy is greater than or equal to the preset threshold;
Optionally, the weak enhanced image s_w obtained by applying the weak enhancement strategy to a small number of identified micro-expression images is recognized to obtain the corresponding category probability distribution, and the cross entropy loss L_s between it and the true category distribution of the image s is acquired, defined as
L_s = -(1 / N) · Σ_{i=1}^{N} Σ_{c=1}^{C} y_{i,c} · log p_c(s_i; θ),
where N represents the number of data, C represents the number of categories, y_{i,c} is the true category distribution, p_c(s_i; θ) is the category probability distribution, s_i is the i-th datum, θ represents the network parameters, and p_c(·) represents the recognition function;
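A PyTorch-style sketch of the two loss terms matching the reconstructed formulas above; the threshold value of 0.95 is an assumption:

```python
import torch
import torch.nn.functional as F

def unlabeled_loss(logits_weak, logits_strong, tau=0.95):
    """L_u: cross entropy between strong-view predictions and pseudo categories
    taken from confident weak-view predictions (maximum probability >= tau)."""
    probs_weak = torch.softmax(logits_weak.detach(), dim=-1)
    max_probs, pseudo = probs_weak.max(dim=-1)
    mask = (max_probs >= tau).float()        # keep only confident pseudo labels
    per_image = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (mask * per_image).mean()

def labeled_loss(logits_weak, targets):
    """L_s: standard cross entropy on the weakly enhanced marked images."""
    return F.cross_entropy(logits_weak, targets)
```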
Further, comparing the confidence score of an identified micro-expression in the micro-expression image sequence to be annotated with the standard confidence score of the micro-expression of the corresponding category comprises the steps of:
acquiring the first category probability distribution p_u of the image data in the micro-expression image sequence to be annotated, and acquiring the second category probability distribution p_l of the marked micro-expression data set;
for the same micro-expression category, obtaining the average probability distribution of the first and second category probability distributions,
\bar{p}_c = (p_u^c + p_l^c) / 2,
where c is the corresponding micro-expression category;
comparing \bar{p}_c with the preset adaptive score T_c; if \bar{p}_c ≥ T_c, judging that the confidence score of the corresponding micro-expression is greater than or equal to the standard confidence score, updating the micro-expression image sequence to be marked, and adding the micro-expressions whose scores are greater than or equal to the standard confidence score to the marked micro-expression image set;
if \bar{p}_c < T_c, judging that the confidence score of the corresponding micro-expression is smaller than the standard confidence score, adding the corresponding micro-expression to a retraining image set, and using the retraining image set to retrain the feature extractor model.
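A small Python sketch of this comparison step; the dictionary layout of the per-category distributions is an illustrative assumption:

```python
def route_by_confidence(p_unlabeled, p_labeled, thresholds):
    """Average the two per-category distributions, p_bar_c = (p_u^c + p_l^c) / 2,
    and compare with the adaptive score T_c; categories meeting the score are
    accepted for labeling, the rest are routed to the retraining set."""
    accepted, retrain = [], []
    for c, t_c in thresholds.items():
        p_bar = 0.5 * (p_unlabeled[c] + p_labeled[c])
        (accepted if p_bar >= t_c else retrain).append(c)
    return accepted, retrain
```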
In another embodiment, the invention further provides a micro-expression image labeling device for cross-granularity interactive learning, which comprises:
the image processing module is used for acquiring a micro-expression image sequence to be marked, acquiring a preset number of marked micro-expression images, and inputting the marked micro-expression images and the micro-expression image sequence to be marked into the feature extractor module;
The feature extractor module marks the category corresponding to each micro expression in the micro expression image sequence to be marked through a feature extractor model; acquiring standard confidence scores corresponding to each micro-expression category based on the marked micro-expression images;
the image labeling module is used for acquiring the confidence score of each identified micro-expression in the micro-expression image sequence to be labeled and comparing it with the standard confidence score of the micro-expression of the corresponding category; if the confidence score is greater than or equal to the standard confidence score, the micro-expression is added to the labeled micro-expression image set;
The image labeling module is also used for updating the micro-expression image sequence to be labeled until all the micro-expressions are labeled and then outputting a labeled micro-expression image set;
Specifically, the micro-expression image labeling device further comprises image acquisition devices arranged at various positions in the teaching scene, and video data of students in class are acquired through devices including RGB intelligent cameras and an intelligent tracking camera; the invention is not limited in this regard, since how images are acquired, and with which devices classroom image information is captured, does not affect the implementation of the invention;
Specifically, student expression categories are divided into three classes: positive, neutral, and negative; a neutral expression means the face shows no obvious expression features, i.e., a calm facial expression; positive expressions include smiling and the like, while negative expressions include anger, sadness, and the like;
Specifically, a person's facial expression can be further judged from facial features including the eye corners, the mouth corners, and the stretching of the cheek muscles, by assessing the degree of change of the eye corners, the degree of upturn of the mouth corners, the degree of stretching of the facial muscles, and so on;
The device provided by the invention corresponds to the micro-expression labeling method described above, so that, based on student images acquired in the teaching scene, the pre-trained feature extractor can identify and label student micro-expressions, accurately recognizing the micro-expressions and their corresponding categories from the student image data captured in the teaching scene;
With a simple arrangement, the emotional state and classroom state of each student in class can be identified while the subjectivity of micro-expression recognition is avoided; only the collected classroom images need to be input, so that, based on the device and method provided by the invention, the micro-expressions of students in teaching scenes can be recognized and large-scale expression data can be labeled in batches, allowing teaching staff to concentrate on improving the teaching effect and raising their teaching efficiency.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, are capable of performing the method of labeling a microexpressive image provided by the methods described above.
In yet another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the method for labeling a microexpressive image provided by the above methods.
The apparatus embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. A micro-expression image labeling method, characterized by comprising the following steps:
S1, acquiring a micro-expression image sequence to be marked;
S2, acquiring a preset number of marked micro-expression images, and inputting the marked micro-expression images and the micro-expression image sequence to be marked into a pre-trained feature extractor model;
labeling a category corresponding to each micro-expression in the micro-expression image sequence to be labeled through the feature extractor model; acquiring standard confidence scores corresponding to each micro-expression category based on the marked micro-expression images;
S3, obtaining the confidence score of each identified micro-expression in the micro-expression image sequence to be marked and comparing it with the standard confidence score of the micro-expression of the corresponding category; if the confidence score is greater than or equal to the standard confidence score, adding the micro-expression to the marked micro-expression image set;
S4, updating the micro-expression image sequence to be marked until all the micro-expressions are marked and then outputting a marked micro-expression image set;
The feature extractor model comprises a fine granularity feature extractor model and a coarse granularity feature extractor model, and training the feature extractor model comprises the following steps:
Acquiring a neutral expression image and other expression images of the same target to be identified in the micro-expression image data; acquiring identity features of the neutral expression image, identity features of other expression images and expression category features through an encoder;
Carrying out expression reconstruction by combining the identity characteristics of the neutral expression image and the expression category characteristics of the other expression images through a decoder;
the encoder and the decoder perform adversarial learning, and the encoder classifies the micro-expression images of different targets to be identified according to identity features, so that the difference between the micro-expression category distributions of different targets to be identified is minimized;
analyzing the reconstructed expression image through an expression classifier to obtain corresponding expression category characteristics and outputting category probability distribution;
Training the feature extractor model includes the steps of:
forming a triplet (x_anchor, x_nega, x_posi) from a micro-expression anchor x_anchor, a macro-expression positive example x_posi of the same expression category as the micro-expression anchor, and a micro-expression negative example x_nega of a different expression category from the micro-expression anchor, and inputting the triplet into the feature extractor model;
wherein the micro-expression anchor and the micro-expression negative example are input into the fine-granularity feature extractor to acquire the expression embeddings f_anchor and f_nega respectively, and the macro-expression positive example is input into the coarse-granularity feature extractor model to acquire the expression embedding f_posi;
the triplet loss between the expression embeddings is obtained:
L_tri = max{ d(f_anchor, f_posi) - d(f_anchor, f_nega) + m, 0 };
performing adversarial learning between the fine-granularity and coarse-granularity feature extractor models, where the coarse-granularity feature learning module provides embedded representations of macro-expressions whose image classifications are marked as the correct category, and the fine-granularity feature extractor provides embedded representations of micro-expressions whose image classifications are marked as the incorrect category;
distinguishing the two embedded representations through a discriminator, and adjusting the common expression features between macro-expressions and micro-expressions through the fine-granularity feature extractor model, so that the gradient between macro-expressions and micro-expressions is greater than or equal to a minimum threshold;
wherein m is a hyper-parameter.
2. The micro-expression image labeling method according to claim 1, comprising, before step S1, the steps of:
collecting video data of students in a classroom;
Preprocessing the video data, and converting the video data into image data of each frame;
performing face detection on the image data to generate a plurality of face images;
and storing a plurality of face images as the sequence of micro-expression images to be annotated.
3. The method of claim 2, wherein training the feature extractor model comprises:
applying a data enhancement strategy to the marked micro-expression images and the micro-expression image sequence to be marked, and then inputting the enhanced images into the feature extractor model;
And obtaining the category corresponding to the marked micro-expression image, obtaining the standard confidence score corresponding to each micro-expression category, and outputting the trained feature extractor model.
4. The micro-expression image labeling method according to claim 3, wherein training the feature extractor model comprises:
selecting any image from the micro-expression image sequence to be annotated and applying a strong enhancement strategy and a weak enhancement strategy to obtain a corresponding strong enhanced image u_s and weak enhanced image u_w; selecting any image from the marked micro-expression images and applying a weak enhancement strategy to obtain a weak enhanced image s_w;
inputting the strong enhanced image u_s and the weak enhanced images u_w and s_w into the feature extractor respectively, acquiring the corresponding image features, and judging the category of each micro-expression;
based on the micro-expression categories of several weak enhanced images u_w, outputting the category probability distribution corresponding to each micro-expression category through the feature extractor, comparing the maximum value of the category probability distribution with a preset threshold, and taking the micro-expression category whose probability is greater than or equal to the preset threshold as the pseudo category of the weak enhanced image; acquiring the cross entropy between the micro-expression categories of several strong enhanced images u_s and the pseudo categories;
based on several weak enhanced images s_w, outputting the category probability distribution corresponding to each micro-expression category through the feature extractor, and acquiring the cross entropy between the output category probability distribution and the original category probability distribution of the marked micro-expression images.
5. The method according to claim 4, wherein comparing the confidence score of the recognized micro-expression in the micro-expression image sequence to be annotated with the standard confidence score of the micro-expression of the corresponding category comprises the steps of (sketched after this claim):
acquiring the first class probability distribution of the image data in the micro-expression image sequence to be annotated, and acquiring the second class probability distribution of the annotated micro-expression data set;
for the same micro-expression category, obtaining the average probability distribution of the first class probability distribution and the second class probability distribution;
comparing the average probability distribution with a preset adaptive score; if the average is greater than or equal to the preset adaptive score, judging that the confidence score of the corresponding micro-expression is greater than or equal to the standard confidence score, updating the micro-expression image sequence to be annotated, and adding the micro-expressions whose scores are greater than or equal to the standard confidence score into the annotated micro-expression image set;
if the average is smaller than the preset adaptive score, judging that the confidence score of the corresponding micro-expression is smaller than the standard confidence score, adding the corresponding micro-expression into a retraining image set, and using the retraining image set to retrain the feature extractor model.
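A small NumPy sketch of this routing rule; the variable names are assumptions, since the patent gives the comparison but not an implementation:

```python
import numpy as np

def route_by_confidence(p_unlabeled, p_labeled, tau):
    """p_unlabeled: class probabilities of one recognized image from the
    sequence to be annotated; p_labeled: class probabilities of the same
    category from the annotated set; tau: the preset adaptive score."""
    mu = (np.asarray(p_unlabeled) + np.asarray(p_labeled)) / 2.0
    if mu.max() >= tau:
        return "annotated_set"    # confidence >= standard confidence score
    return "retraining_set"       # below standard: retrain the extractor on it
```

Images routed to the retraining set feed another round of feature-extractor training, so the adaptive score effectively controls how aggressively the unlabeled pool is promoted into the annotated set.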
6. A micro-expression image annotation device, comprising:
an image processing module, used for acquiring a micro-expression image sequence to be annotated, acquiring a preset number of annotated micro-expression images, and inputting the annotated micro-expression images and the micro-expression image sequence to be annotated into the feature extractor module;
a feature extractor module, used for annotating, through a feature extractor model, the category corresponding to each micro-expression in the micro-expression image sequence to be annotated, and for acquiring the standard confidence score corresponding to each micro-expression category based on the annotated micro-expression images;
wherein the feature extractor model comprises a fine-granularity feature extractor model and a coarse-granularity feature extractor model, and training the feature extractor model comprises the following steps:
acquiring a neutral expression image and other expression images of the same target to be identified in the micro-expression image data; acquiring, through an encoder, the identity features of the neutral expression image as well as the identity features and expression category features of the other expression images;
carrying out expression reconstruction through a decoder by combining the identity features of the neutral expression image with the expression category features of the other expression images;
the encoder and the decoder perform adversarial learning, and the encoder classifies the micro-expression images of different targets to be identified according to their identity features, so that the difference between the micro-expression category distributions of different targets to be identified is minimized;
analyzing the reconstructed expression image through an expression classifier to obtain the corresponding expression category features and output the class probability distribution (a sketch of this disentangling reconstruction is given after this claim);
training the feature extractor model further comprises the steps of:
forming a triplet from a micro-expression anchor, a macro-expression positive example of the same expression category as the micro-expression anchor, and a micro-expression counterexample of a different expression category from the micro-expression anchor, and inputting the triplet into the feature extractor model;
wherein the micro-expression anchor and the micro-expression counterexample are input into the fine-granularity feature extractor to acquire their respective expression embeddings, and the macro-expression positive example is input into the coarse-granularity feature extractor model to acquire its expression embedding;
obtaining the triplet loss between the expression embeddings: L_triplet = max(0, d(e_a, e_p) − d(e_a, e_n) + m), where e_a, e_p and e_n denote the embeddings of the anchor, the positive example and the counterexample, and d denotes the distance between embeddings;
performing adversarial learning between the fine-granularity feature extractor and the coarse-granularity feature extractor model; the coarse-granularity feature learning module provides the embedded representation of macro-expressions, and the classification of the corresponding macro-expression images is marked as the correct class; the fine-grained feature extractor provides the embedded representation of micro-expressions, and the classification of the corresponding micro-expression images is marked as the error class;
distinguishing the two embedded representations through a discriminator, and adjusting the common expression features between macro-expressions and micro-expressions through the fine-granularity feature extractor model, so that the gradient between the macro-expression and the micro-expression is greater than or equal to a minimum threshold (this triplet and adversarial training is also sketched after this claim);
wherein m is a hyperparameter;
an image annotation module, used for acquiring the confidence score of the recognized micro-expression in the micro-expression image sequence to be annotated and comparing it with the standard confidence score of the micro-expression of the corresponding category, and, if the confidence score is greater than or equal to the standard confidence score, adding the micro-expression into the annotated micro-expression image set;
the image annotation module is further used for updating the micro-expression image sequence to be annotated until all micro-expressions are annotated, and then outputting the annotated micro-expression image set.
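Two minimal PyTorch sketches of the claim-6 training steps referenced above. First, the identity/expression disentangling reconstruction; the layer shapes and the 64×64 input size are illustrative assumptions, not taken from the patent:

```python
import torch
import torch.nn as nn

class DisentangleAE(nn.Module):
    """Encoder splits an image into an identity code and an expression code;
    the decoder rebuilds a face from the neutral image's identity code and
    another image's expression code."""
    def __init__(self, dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2 * dim))
        self.dec = nn.Linear(2 * dim, 64 * 64)
        self.dim = dim

    def encode(self, x):
        code = self.enc(x)
        return code[:, :self.dim], code[:, self.dim:]   # identity, expression

    def forward(self, neutral, other):
        id_neutral, _ = self.encode(neutral)    # identity of the neutral face
        _, expr_other = self.encode(other)      # expression category feature
        return self.dec(torch.cat([id_neutral, expr_other], dim=1))
```

Second, the cross-granularity triplet loss and the discriminator-based adversarial step; the standard triplet form with margin m is an assumption consistent with the formula given in the claim:

```python
import torch
import torch.nn.functional as F

def triplet_loss(e_a, e_p, e_n, m=0.5):
    """e_a/e_n: micro-expression anchor/counterexample embeddings from the
    fine-grained extractor; e_p: macro-expression positive embedding from
    the coarse-grained extractor; m: margin hyperparameter."""
    d_ap = (e_a - e_p).pow(2).sum(dim=-1)   # anchor-positive distance
    d_an = (e_a - e_n).pow(2).sum(dim=-1)   # anchor-counterexample distance
    return F.relu(d_ap - d_an + m).mean()

def adversarial_losses(disc, e_macro, e_micro):
    """Discriminator learns macro = 1 ("correct") vs micro = 0 ("error");
    the fine-grained extractor is then updated to fool it, aligning the
    expression features shared across granularities."""
    ones = torch.ones(e_macro.size(0), 1)
    zeros = torch.zeros(e_micro.size(0), 1)
    loss_disc = (F.binary_cross_entropy_with_logits(disc(e_macro), ones) +
                 F.binary_cross_entropy_with_logits(disc(e_micro.detach()), zeros))
    loss_gen = F.binary_cross_entropy_with_logits(
        disc(e_micro), torch.ones(e_micro.size(0), 1))
    return loss_disc, loss_gen
```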
7. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the micro-expression image annotation method according to any one of claims 1 to 5.
CN202210736803.8A 2022-06-27 2022-06-27 Cross-granularity interactive learning micro-expression image labeling method and device Active CN115050075B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210736803.8A CN115050075B (en) 2022-06-27 2022-06-27 Cross-granularity interactive learning micro-expression image labeling method and device

Publications (2)

Publication Number Publication Date
CN115050075A (en) 2022-09-13
CN115050075B (en) 2024-07-02

Family

ID=83162792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210736803.8A Active CN115050075B (en) 2022-06-27 2022-06-27 Cross-granularity interactive learning micro-expression image labeling method and device

Country Status (1)

Country Link
CN (1) CN115050075B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116894985B (en) * 2023-09-08 2023-12-15 Jilin University Semi-supervised image classification method and semi-supervised image classification system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776774A (en) * 2018-05-04 2018-11-09 South China University of Technology A kind of human facial expression recognition method based on complexity categorization of perception algorithm
CN113221639A (en) * 2021-04-01 2021-08-06 Shandong University Micro-expression recognition method for representative AU region extraction based on multi-task learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674684A (en) * 2019-08-15 2020-01-10 Shenzhen OneConnect Smart Technology Co., Ltd. Micro-expression classification model generation method and device, and image recognition method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN110889672B (en) Student card punching and class taking state detection system based on deep learning
CN109145871B (en) Psychological behavior recognition method, device and storage medium
CN109255289B (en) Cross-aging face recognition method based on unified generation model
CN116311483B (en) Micro-expression recognition method based on local facial area reconstruction and memory contrast learning
Chen et al. IBM Research Australia at LifeCLEF2014: Plant identification task.
CN109325472B (en) Face living body detection method based on depth information
Shrivastava et al. Conceptual model for proficient automated attendance system based on face recognition and gender classification using Haar-Cascade, LBPH algorithm along with LDA model
CN115050075B (en) Cross-granularity interactive learning micro-expression image labeling method and device
Diyasa et al. Multi-face Recognition for the Detection of Prisoners in Jail using a Modified Cascade Classifier and CNN
Agarwal et al. Face recognition based smart and robust attendance monitoring using deep CNN
Tang et al. Automatic facial expression analysis of students in teaching environments
CN116434311A (en) Facial expression recognition method and system based on mixed domain consistency constraint
Ashwinkumar et al. Deep learning based approach for facilitating online proctoring using transfer learning
CN109886251A (en) A kind of recognition methods again of pedestrian end to end guiding confrontation study based on posture
Shukla et al. Deep Learning Model to Identify Hide Images using CNN Algorithm
CN115719497A (en) Student concentration degree identification method and system
Goyal et al. Online Attendance Management System Based on Face Recognition Using CNN
Vivek et al. A Way to Mark Attentance using Face Recognition using PL
Katti et al. Character and word level gesture recognition of Indian Sign language
Shi et al. Temporal attentive network for action recognition
CN114898137A (en) Face recognition-oriented black box sample attack resisting method, device, equipment and medium
Sharma et al. Automated Attendance System based on Facial Recognition using Adaboost Algorithm
Gong et al. Application research of face recognition algorithm based on MATLAB
Ma et al. Is a picture worth 1000 votes? Analyzing the sentiment of election related social photos
Ghosh Real-Time Attendance System Using Face Recognition Technique

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant